Lugaru's Epsilon Programmer's Editor 14.06
Regular Expressions
Most of Epsilon's searching commands, described in Searching, take a
simple string to search for. Epsilon also provides a more powerful
regular expression search facility and a regular expression replace
facility.
Instead of a simple search string, you provide a pattern, which
describes a set of strings. Epsilon searches the buffer for an
occurrence of one of the strings contained in the set. You can think
of the pattern as generating a (possibly infinite) set of strings,
and the regex search commands as looking in the buffer for the first
occurrence of one of those strings.
The following characters have special meaning in a regex search:
vertical bar, parentheses, plus, star, question mark, square
brackets, period, dollar, percent sign, left angle bracket
("<"), and caret ("^"). To match one of them literally, you
must quote it; see Entering Special Characters. See the following
sections for syntax details and additional examples.
| abc|def | Finds either abc or def. |
| (abc) | Finds abc. |
| abc+ | Finds abc or abcc or abccc or ... |
| abc* | Finds ab or abc or abcc or abccc or ... |
| abc? | Finds ab or abc. |
| [abcx-z] | Finds any single character of a, b, c, x, y, or z. |
| [^abcx-z] | Finds any single character except a, b, c, x, y, or z. |
| . | Finds any single character except <Newline>. |
| abc$ | Finds abc that occurs at the end of a line. |
| ^abc | Finds abc that occurs at the beginning of a line. |
| %^abc | Finds a literal ^abc. |
| <Tab> | Finds a <Tab> character. |
| <#123> | Finds the character with ASCII code 123. |
| <p:cyrillic> | Finds any character with that Unicode property. |
| <alpha|1-5&!x-z> | Finds any alpha character except x, y or z or digit 1-5. |
| <^c:*comment>printf | Finds uses of printf that aren't commented out. |
| <h:0d 0a 45> | Finds char sequence with those hexadecimal codes. |
Plain Patterns
In a regular expression, a string that does not contain any of the above
characters denotes the set that contains precisely that one string. For
example, the regular expression abc denotes the set that contains,
as its only member, the string "abc". If you search for this
regular expression, Epsilon will search for the string "abc", just
as in a normal search.
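As a rough illustration of the point above, here is a minimal sketch in Python. Python's `re` module is only a stand-in chosen for this example; it is not Epsilon's regex engine, and its syntax differs from Epsilon's in several places. For a pattern with no special characters, a regex search and a plain string search land on the same position.

```python
import re

text = "xxabcxx"

# A plain string search for "abc".
plain_pos = text.find("abc")

# A regex search for the pattern abc, which denotes the one-element set {"abc"}.
regex_pos = re.search(r"abc", text).start()
```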
Alternation
To include more than one string in the set, you can use the
vertical bar character. For example, the regular expression abc|xyz
denotes the set that contains the strings "abc" and "xyz". If you
search for that pattern, Epsilon will find the first occurrence of either
"abc" or "xyz". The alternation operator (|) always applies
as widely as possible, limited only by grouping parentheses.
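The two properties of alternation described above can be sketched in Python. This assumes Python's `re` module as a stand-in; it is not Epsilon's engine, though `|` and grouping parentheses behave the same way for these simple patterns. The sample text and candidate strings are made up for the example.

```python
import re

text = "one xyz two abc three"

# The pattern denotes the set {"abc", "xyz"}; the search reports whichever
# member occurs first in the text.
first = re.search(r"abc|xyz", text)
first_match = (first.group(0), first.start())

candidates = ("ab", "cd", "abd", "acd")

# Alternation applies as widely as possible: ab|cd means (ab)|(cd).
wide = [bool(re.fullmatch(r"ab|cd", s)) for s in candidates]

# Parentheses limit the scope of the alternation to b|c.
narrow = [bool(re.fullmatch(r"a(b|c)d", s)) for s in candidates]
```

Here `wide` accepts only "ab" and "cd", while `narrow` accepts only "abd" and "acd".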
Grouping
You can enclose any regular expression in parentheses, and the
resulting expression refers to the same set. So searching for
(abc|xyz) has the same effect as searching for abc|xyz,
which works as described under Alternation. You would use parentheses
for grouping purposes in conjunction with some of the operators
described below.
Parentheses are also used for retrieving specific portions of the
match. A regular expression replacement uses the syntax #3 to
refer to the third parenthesized group, for instance. The
find_group() function gives EEL programmers similar access. The
special syntax (?: ) provides grouping just like ( ), but isn't
counted as a group when retrieving parts of the match in these ways.
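The idea of numbered groups and uncounted groups can be sketched in Python. This is an analogy under stated assumptions, not Epsilon syntax: Python's `re` module writes a group reference in a replacement as `\2` where Epsilon uses `#2`, and the pattern and sample text are invented for the example. The `(?: )` form happens to be spelled the same way in both.

```python
import re

# Two counted groups separated by an uncounted (?: ) group.
pattern = re.compile(r"(abc|xyz)(?:-|_)(def|ghi)")
text = "say xyz_ghi now"

# Only the two plain parenthesized groups are numbered.
groups = pattern.search(text).groups()

# Swap the two counted groups in a replacement (Python spelling: \2 and \1).
swapped = pattern.sub(r"\2-\1", text)
```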
Concatenation
You can concatenate two regular expressions to form a new
regular expression. Suppose the regular expressions p and
q denote sets P and Q, respectively. Then
the regular expression pq denotes the set of strings that
you can make by taking a string from P and appending a string
from Q. For example, suppose you concatenate the
regular expressions (abc|xyz) and (def|ghi) to yield
(abc|xyz)(def|ghi). From the Grouping section, we know that
(abc|xyz) denotes the set that contains "abc" and "xyz"; the
expression (def|ghi) denotes the set that contains "def" and
"ghi". Applying the rule, we see that (abc|xyz)(def|ghi)
denotes the set that contains the following four strings:
"abcdef", "abcghi", "xyzdef", "xyzghi".
Closure
Clearly, any regular expression must have finite length;
otherwise you couldn't type it in. But because of the closure
operators, the set to which the regular expression refers
may contain an infinite number of strings. If you append plus to a
parenthesized regular expression, the resulting expression denotes
the set of one or more repetitions of strings from that expression's
set. For example, the regular expression (ab)+ refers to the set that
contains "ab", "abab", "ababab", "abababab", and so on. Star works
similarly, except it denotes the set of zero or more repetitions, so
its set also contains the empty string.
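The difference between plus and star can be seen by testing a few candidate strings for membership in each set. This is a minimal sketch that assumes Python's `re` module as a stand-in for Epsilon's engine; the candidate list is made up for the example.

```python
import re

plus = re.compile(r"(ab)+")   # one or more repetitions of "ab"
star = re.compile(r"(ab)*")   # zero or more repetitions of "ab"

candidates = ["", "ab", "abab", "ababab", "aba"]

# fullmatch tests whether the whole string is a member of the set.
plus_members = [s for s in candidates if plus.fullmatch(s)]
star_members = [s for s in candidates if star.fullmatch(s)]
```

The only difference between the two results is the empty string, which star accepts and plus rejects.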
Optionality
You can specify the question operator in the same place you might
put a star or a plus. If you append a question mark to a
parenthesized regular expression, the resulting expression denotes
the set that contains every string of that expression's set, plus the
empty string. You would typically use the question operator to
specify an optional subpart of the search string.
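An optional subpart can be sketched as follows, again assuming Python's `re` module as a stand-in for Epsilon's engine. The pattern (abc)?def and the candidate strings are invented for the example.

```python
import re

# (abc)? contributes either "abc" or the empty string, so the whole
# pattern denotes the set {"def", "abcdef"}.
pattern = re.compile(r"(abc)?def")

candidates = ["def", "abcdef", "abcabcdef", "abc"]
members = [s for s in candidates if pattern.fullmatch(s)]
```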
You can also use the plus, star, and question-mark operators with
subexpressions, and with non-parenthesized things. These operators
always apply to the smallest possible substring to their left. For
example, the regular expression abc+ refers to the set that
contains "abc", "abcc", "abccc", "abcccc", and so on. The
expression a(bc)*d refers to the set that contains
"ad", "abcd", "abcbcd", "abcbcbcd", and so on.
The expression a(b?c)*d denotes the set that contains all
strings that start with "a" and end with "d", with the inside
consisting of any number of the letter "c", each optionally
preceded by "b". The set includes such strings
as "ad", "acd", "abcd", "abccccbcd".
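The three expressions above can be checked against candidate strings to see how the operators bind to the smallest possible unit on their left. This sketch assumes Python's `re` module as a stand-in for Epsilon's engine; the helper `members` and the candidate lists are hypothetical, introduced only for this example.

```python
import re

def members(pattern, candidates):
    """Return the candidates that belong to the set denoted by pattern."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

# Plus applies only to the final "c", not to the whole of "abc".
plus_c = members(r"abc+", ["ab", "abc", "abcc", "abcabc"])

# Star applies to the parenthesized group (bc).
star_bc = members(r"a(bc)*d", ["ad", "abcd", "abcbcd", "abd"])

# Question mark applies to "b" alone; star applies to the group (b?c).
opt_b = members(r"a(b?c)*d",
                ["ad", "acd", "abcd", "abccccbcd", "abbcd", "abd"])
```

The rejected candidates show the binding: "abcabc" fails because plus repeats only "c", and "abbcd" fails because each "c" may be preceded by at most one "b".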
Subtopics:
Entering Special Characters
Character Classes
Regular Expression Examples
Searching Rules
Regular Expression Assertions
Regular Expression Commands
Epsilon Programmer's Editor 14.06 manual. Copyright (C) 1984, 2024 by Lugaru Software Ltd. All rights reserved.