Character Types
int isspace(int ch)
int isdigit(int ch)
int isalpha(int ch)
int islower(int ch)
int isupper(int ch)
int iscntrl(int ch)
int isgraph(int ch)
int ispunct(int ch)
int isprint(int ch)
int isxdigit(int ch)
int isalnum(int ch) /* basic.e */
int isident(int ch) /* basic.e */
int any_uppercase(char *p)
Epsilon provides several primitives for determining whether a
character belongs to a certain class. The isspace( ) primitive
tells whether its character argument is a space, tab, or newline character.
It returns 1 if so, and 0 otherwise.
In the same way, the isdigit( ) primitive tells if a
character is a digit (one of the characters 0 through 9), and
the isalpha( ) primitive tells if the character is a letter.
The islower( ) and isupper( ) primitives tell if the
character is a lower case letter or upper case letter, respectively.
The iscntrl( ) primitive tells if a character is a control
character, isgraph( ) if a character is a graphical character
(has a printed representation, not a space or a control character),
ispunct( ) if a character is a punctuation character,
isprint( ) if a character is printable (not a control
character), and isxdigit( ) if a character is a hex digit.
The isalnum( ) subroutine returns nonzero if the specified character
is alphanumeric: either a letter or a digit. The isident( )
subroutine returns nonzero if the specified character is an identifier
character: a letter, a digit, or the _ character.
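Because isalnum( ) and isident( ) are subroutines rather than primitives, they can be expressed in terms of the primitives described above. The following is a minimal sketch in plain C, using the standard <ctype.h> classifiers as stand-ins for the Epsilon primitives of the same names. The sketch_ function names are hypothetical, and the actual definitions in basic.e may differ.

```c
#include <ctype.h>

/* Hypothetical stand-in for isalnum( ): a letter or a digit.
   In standard C, ch must be representable as an unsigned char;
   the text says Epsilon's primitives also handle Unicode characters. */
int sketch_isalnum(int ch)
{
    return isalpha(ch) || isdigit(ch);
}

/* Hypothetical stand-in for isident( ): a letter, a digit, or '_'. */
int sketch_isident(int ch)
{
    return sketch_isalnum(ch) || ch == '_';
}
```

With these definitions, sketch_isident('_') is nonzero while sketch_isalnum('_') is zero, which is the only difference between the two classes.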
All functions in this section also handle Unicode characters
appropriately: the isspace( ) primitive, for instance, also
returns 1 for characters with a Unicode category of Z, meaning
separators, and primitives that report on lowercase letters understand
any Unicode character that's a lowercase letter.
The any_uppercase( ) subroutine returns nonzero if there are
any upper case characters in its string argument p.
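The behavior described for any_uppercase( ) amounts to a scan that stops at the first upper case character. The sketch below shows one way to write such a scan in plain C. It assumes the standard C isupper( ) as a stand-in for the Epsilon primitive, and sketch_any_uppercase is a hypothetical name, not Epsilon's own implementation.

```c
#include <ctype.h>

/* Returns 1 if the NUL-terminated string p contains at least one
   upper case character, and 0 otherwise (including for ""). */
int sketch_any_uppercase(const char *p)
{
    for (; *p; p++)
        if (isupper((unsigned char) *p))
            return 1;   /* found one; no need to look further */
    return 0;
}
```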
int tolower(int ch)
int toupper(int ch)
The tolower( ) primitive converts an upper case letter to the
corresponding lower case letter. It returns any character that is not
an upper case letter unchanged. The toupper( ) primitive
converts a lower case letter to its upper case equivalent, and leaves
other characters unchanged.
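As an illustration of the conversion primitives, here is a minimal C sketch that converts a string to lower case in place. It uses the standard C library's tolower( ) as a stand-in for the Epsilon primitive of the same name, and sketch_lowercase_string is a hypothetical helper, not part of Epsilon.

```c
#include <ctype.h>

/* Replace each upper case letter in s with its lower case
   equivalent. Digits, punctuation, and letters that are already
   lower case pass through tolower( ) unchanged. */
void sketch_lowercase_string(char *s)
{
    for (; *s; s++)
        *s = (char) tolower((unsigned char) *s);
}
```

Because tolower( ) leaves non-letters alone, the loop needs no separate isupper( ) check before converting each character.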
int set_character_property(int ch, int propcode, int value)
You can alter the rules Epsilon uses for
determining if a particular character is alphabetic, uppercase, or
lowercase, and how Epsilon case-folds when searching, sorting or
otherwise comparing text, using the set_character_property( )
primitive. It takes the numeric code of the character whose
properties you want to modify, a property code indicating which of its
properties to access, and a new value for that property.
The property code CPROP_CTYPE sets whether the
isalpha( ), isupper( ), islower( ), and isdigit( )
primitives consider a character alphabetic, uppercase, lowercase, or a
digit, respectively. These attributes are independent, though there
are conventions for their use. (For instance, only alpha characters
generally have a case, no character is both uppercase and lowercase,
and so forth.) The bits C_ALPHA, C_LOWER,
C_UPPER, and C_DIGIT represent these attributes.
The bits also control whether the regular expressions <digit>,
<alpha>, <alphanum>, and <word> match these
characters; see Character Classes.
Similarly, the bits C_CNTRL, C_GRAPH,
C_PUNCT, and C_XDIGIT may be used to modify the
iscntrl( ), isgraph( ), ispunct( ), and isxdigit( )
primitives, respectively, using CPROP_CTYPE.
The property code CPROP_TOLOWER controls what value the
tolower( ) primitive returns for the specified character, and the
property code CPROP_TOUPPER controls what value the
toupper( ) primitive returns for it.
The property code CPROP_FOLD controls how Epsilon case-folds
that character during searching, sorting, and similar functions,
whenever case folding is in use. It specifies a replacement character
to be used in place of the original during comparisons. The complete
set of case-folding properties must follow two rules: if some
character X folds to Y, then Y must fold to itself, and character
codes below 256 must never fold to a value greater than or equal to
256. (If a particular group of characters should be treated as equal
when searching, setting the case folding property of each to the code
of the lowest-numbered one is sufficient to comply with these rules.)
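The two case-folding rules can be checked mechanically. The sketch below models the folding properties as a plain C array indexed by character code; it is a model for illustration only and does not call Epsilon. The names sketch_fold_group and sketch_fold_table_ok are hypothetical. The first function applies the "fold every member to the lowest-numbered one" strategy, and the second verifies both rules over a table of n entries.

```c
/* Make every character in group[0..count-1] fold to the
   lowest-numbered member of the group. */
void sketch_fold_group(int *fold, const int *group, int count)
{
    int i, lowest = group[0];

    for (i = 1; i < count; i++)
        if (group[i] < lowest)
            lowest = group[i];
    for (i = 0; i < count; i++)
        fold[group[i]] = lowest;
}

/* Return 1 if the table obeys both rules, 0 otherwise. */
int sketch_fold_table_ok(const int *fold, int n)
{
    int ch;

    for (ch = 0; ch < n; ch++) {
        int f = fold[ch];

        if (f < 0 || f >= n)
            return 0;   /* target lies outside this model's table */
        if (fold[f] != f)
            return 0;   /* rule 1: a fold target must fold to itself */
        if (ch < 256 && f >= 256)
            return 0;   /* rule 2: codes below 256 stay below 256 */
    }
    return 1;
}
```

Folding to the lowest-numbered member satisfies rule 1 because that member folds to itself, and satisfies rule 2 because any group containing a code below 256 necessarily has its lowest member below 256.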
The primitive returns the previous value of the specified property of
that character. If the new value is out of range for the property
(such as a negative value), it will be ignored, and the primitive will
just return the current value. You can use this to retrieve the
current properties of a character without changing them.
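The return-value contract described above (return the previous value, ignore an out-of-range new value) can be modeled in a few lines of C. This is only a model of the documented behavior for a single property, not Epsilon's implementation: the table, the sketch_ names, and the bit values are all hypothetical, and the real C_ALPHA and C_UPPER values come from Epsilon's own headers. In EEL itself, the corresponding query would pass a negative value to set_character_property( ) with the desired property code.

```c
#define SKETCH_NCHARS  256
#define SKETCH_C_ALPHA 1    /* hypothetical bit value */
#define SKETCH_C_UPPER 2    /* hypothetical bit value */

/* One property value per character; all start at 0 in this model. */
static int sketch_ctype[SKETCH_NCHARS];

/* Store value as the property of ch and return the previous value.
   A negative value is out of range, so it is ignored and the
   current value is returned. ch must be in 0..SKETCH_NCHARS-1. */
int sketch_set_ctype(int ch, int value)
{
    int previous = sketch_ctype[ch];

    if (value >= 0)
        sketch_ctype[ch] = value;
    return previous;
}

/* Query without modifying, by passing an out-of-range value. */
int sketch_get_ctype(int ch)
{
    return sketch_set_ctype(ch, -1);
}
```

Attribute bits combine with bitwise OR, so marking a character as both alphabetic and upper case in this model would be sketch_set_ctype(ch, SKETCH_C_ALPHA | SKETCH_C_UPPER).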
The special property code CPROP_DISP_WIDTH may be used to
retrieve the width in columns of any character in the current font in
Epsilon for Windows. It returns 1 for normal characters, 0 for
zero-width characters, and 2 for those characters treated as
doublewidth (if Epsilon's experimental doublewidth feature to better
support Chinese, Japanese and Korean text has been enabled; it's
disabled by default). It returns -1 if called during startup before
Epsilon has selected a font, and in non-Windows versions (where
all characters have a font display width of 1).
The special property code CPROP_DEF_REPL may be used to
retrieve replacement Unicode characters for those in the range
128-255 in Epsilon for Windows using the current language settings.
Some Extended ASCII characters in this range have different character
numbers in Unicode, so Epsilon uses this function to map undisplayable
characters in that range to displayable characters (for display
purposes only). The function returns the original character for any
values outside its range.
Epsilon doesn't store current character properties in its state file.
If you want to use non-default properties all the time, write a
startup function that calls this primitive. See Starting and Finishing.
Epsilon always starts with character classifications based on standard
Unicode properties, except for the Windows Console version. That
version, when running with a DOS/OEM character set (see the
console-ansi-font variable), begins with its classifications
for 8-bit characters set to match the current OEM font.
int get_direction() /* window.e */
The get_direction( ) subroutine converts the last key pressed
into a direction. It understands arrow keys, as well as the
equivalent control characters. It returns BTOP, BBOTTOM,
BLEFT, BRIGHT, or -1 if the key doesn't correspond to
any direction.
Epsilon Programmer's Editor 14.00 manual. Copyright (C) 1984, 2020 by Lugaru Software Ltd. All rights reserved.