Character Encoding Conversions
char *encoding_to_name(int enc)
int encoding_from_name(char *name)
When Epsilon reads or writes a file, it converts text between the
Unicode character representation it uses internally and one of various
file encodings. Epsilon represents each possible encoding with a
number.
These numbers may change from one version of Epsilon to the next, so
if an encoding setting must be recorded somehow, it should be recorded
by name, not by number. Certain specific encodings will not change
their codes: the encoding "auto-detect" is always numbered 0,
and the encoding "raw" is always numbered 1.
The encoding_from_name( ) primitive returns the number of an
encoding given its name. It returns -1 if the encoding
name is unknown.
The encoding_to_name( ) primitive returns the name of an
encoding given its number. It returns a NULL pointer if the encoding
number is unknown. Many encodings have more than one name, but this
primitive treats each name as a separate encoding, even if it's an
alias of another encoding.
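
The advice to record an encoding by name rather than by number can be sketched in plain C as follows. This is only an illustration under stated assumptions, not EEL code: encoding_from_name( ) and encoding_to_name( ) are replaced by mock stand-ins backed by a small table, the "utf-8" entry and its number 7 are made up (only 0 for "auto-detect" and 1 for "raw" are documented as fixed), and record_encoding( ) and restore_encoding( ) are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/* Mock encoding table. Only 0 ("auto-detect") and 1 ("raw") are
   documented as fixed; the "utf-8" entry and its number are made up. */
struct enc_entry { const char *name; int num; };

static const struct enc_entry mock_encodings[] = {
    { "auto-detect", 0 },
    { "raw", 1 },
    { "utf-8", 7 },   /* hypothetical name and number */
};
#define MOCK_COUNT (sizeof mock_encodings / sizeof mock_encodings[0])

/* Stand-in for the primitive: returns -1 if the name is unknown. */
int encoding_from_name(const char *name)
{
    size_t i;
    for (i = 0; i < MOCK_COUNT; i++)
        if (strcmp(mock_encodings[i].name, name) == 0)
            return mock_encodings[i].num;
    return -1;
}

/* Stand-in for the primitive: returns NULL if the number is unknown. */
const char *encoding_to_name(int enc)
{
    size_t i;
    for (i = 0; i < MOCK_COUNT; i++)
        if (mock_encodings[i].num == enc)
            return mock_encodings[i].name;
    return NULL;
}

/* Hypothetical helper: store the setting as a name, never as a number.
   Returns 0 on success, -1 if the number has no name. */
int record_encoding(int enc, char *out, size_t outsize)
{
    const char *name = encoding_to_name(enc);
    if (name == NULL)
        return -1;
    snprintf(out, outsize, "%s", name);
    return 0;
}

/* Hypothetical helper: map a stored name back to a number, falling
   back to auto-detect (always 0) when the name is no longer known. */
int restore_encoding(const char *saved)
{
    int enc = encoding_from_name(saved);
    return enc < 0 ? 0 : enc;
}
```

Falling back to auto-detect is a design choice of this sketch; a caller could equally report the unknown name to the user.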
int file_convert_write(char *file, int trans,
struct file_info *f_info)
int save_remote_file(char *fname, int trans,
struct file_info *finfo)
buffer char *(*file_io_converter)();
char *oem_file_converter(int func)
zeroed char *(*new_file_io_converter)();
zeroed buffer char file_write_newfile;
The do_save_file( ) subroutine uses the
file_convert_write( ) subroutine to actually write the file.
Like new_file_write( ), it takes a file name, a line translation
code as described under translation-type, and a
structure which Epsilon will fill with information on the file's
write date, file type, and so forth. See do_save_file( ) above
for details.
Unlike primitives such as new_file_write( ), the
file_convert_write( ) subroutine knows how to handle URL files by
calling the save_remote_file( ) subroutine.
In addition to the built-in conversion codes described above, Epsilon
also supports user-defined EEL conversion routines. These are
currently used only for DOS/OEM files read using the
find-oem-file command and similar. The
file_convert_write( ) subroutine handles writing these. It looks
for a buffer-specific variable file_io_converter. This
variable can be null, for no special translation, or it can contain a
function pointer. For OEM files, for example, it points to the
subroutine oem_file_converter( ).
Any such subroutine will be called with a code indicating the desired
action. The codes are defined in eel.h. The code
FILE_CONVERT_READ tells the subroutine to translate the
text in the current buffer as appropriate when reading a file. The
code FILE_CONVERT_WRITE tells the subroutine to translate
the buffer as appropriate when writing a file.
Before actually performing a conversion, Epsilon will call the
subroutine to ask if the conversion is safe (reversible), by passing
the FILE_CONVERT_ASK flag in addition to one of the above codes.
A conversion is reversible, and therefore safe, if the conversion
followed by the opposite conversion (for instance, ANSI => OEM
=> ANSI) yields the original text. If the conversion isn't safe,
the subroutine should ask the user for permission to proceed.
The converter should then return a null pointer to cancel the read or
write operation, or any other value to let it proceed. If the caller
also passes the FILE_CONVERT_QUIET flag, the converter won't ask the
user for confirmation; it merely returns a value indicating whether the
conversion would be safe.
Whenever the FILE_CONVERT_ASK flag isn't present, the subroutine
should return the name of its minor mode, which Epsilon will display
in the mode line. The OEM converter returns " OEM".
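
The calling convention described above can be sketched as a converter skeleton. This is a plain C illustration under stated assumptions, not the actual oem_file_converter( ): the numeric flag values are placeholders (the real ones are defined in eel.h), sample_converter( ) and the mock_ variables are hypothetical stand-ins for the reversibility check, the user prompt, and the buffer translation, and returning a null pointer for an unsafe conversion under FILE_CONVERT_QUIET is one reading of "a value indicating whether the conversion would be safe".

```c
#include <stddef.h>

/* Placeholder values; the real codes are defined in eel.h. */
#define FILE_CONVERT_READ   1
#define FILE_CONVERT_WRITE  2
#define FILE_CONVERT_ASK    4
#define FILE_CONVERT_QUIET  8

/* Mock state standing in for real work (all hypothetical). */
int mock_reversible = 1;    /* would the round trip restore the text? */
int mock_user_agrees = 0;   /* what the user answers when asked */
int mock_prompts = 0;       /* how many times the user was asked */
int mock_reads = 0;         /* translations done for reading */
int mock_writes = 0;        /* translations done for writing */

char *sample_converter(int func)
{
    static char mode_name[] = " Sample";    /* shown in the mode line */

    if (func & FILE_CONVERT_ASK) {
        /* Safety query only: no translation is performed here. */
        if (mock_reversible)
            return mode_name;               /* non-null: proceed */
        if (func & FILE_CONVERT_QUIET)
            return NULL;                    /* unsafe; report, don't ask */
        mock_prompts++;                     /* ask the user for permission */
        return mock_user_agrees ? mode_name : NULL;
    }

    if (func & FILE_CONVERT_READ)
        mock_reads++;                       /* translate buffer after reading */
    else if (func & FILE_CONVERT_WRITE)
        mock_writes++;                      /* translate buffer before writing */

    return mode_name;                       /* minor mode name */
}
```

In this sketch a caller would first pass FILE_CONVERT_WRITE | FILE_CONVERT_ASK, and only if the result is non-null call again with FILE_CONVERT_WRITE alone to perform the translation.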
When creating a new buffer, file-reading subroutines initialize the
file_io_converter variable by copying the value of
new_file_io_converter. Commands like find-oem-file
temporarily set this variable to effect reading a file with OEM
translation.
The file_convert_write( ) subroutine performs one more function.
It checks the variable file_write_newfile. If this variable
is nonzero, it arranges things so the attempt to write a file will
fail with an error code if the file already exists, by passing the
FILE_IO_NEWFILE code to new_file_write( ).
int perform_unicode_conversion(int buf, int from, int to,
int flags, char *encoding)
The perform_unicode_conversion( ) primitive converts between
16-bit Unicode characters and various 8-bit encodings such as UTF-8.
It converts the characters in the range from...to in the specified
buffer buf, in place.
By default, the primitive converts from 16-bit Unicode characters to
the named 8-bit encoding. The CONV_TO_16 flag makes it
convert in the opposite direction, from the specified 8-bit encoding
to 16-bit characters.
The primitive returns the code EBADENCODE if it doesn't
recognize the encoding name. It returns ETOOBIG when
converting from 8-bit characters if one of the characters is outside
the range 0-255. It returns 0 on success. The primitive moves
point to the end of the buffer.
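
One way to picture the ETOOBIG condition is as a range check on the text that is supposed to hold 8-bit data. The following plain C sketch illustrates that check only; it is not how Epsilon implements the primitive. The value given to ETOOBIG is a placeholder, and check_fits_in_bytes( ) is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define ETOOBIG (-2)    /* placeholder; the real code comes from Epsilon */

/* Returns 0 if every character could be a single byte (0-255),
   or ETOOBIG as soon as one character falls outside that range. */
int check_fits_in_bytes(const uint16_t *text, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        if (text[i] > 255)
            return ETOOBIG;
    return 0;
}
```

Text that passes this check can be treated as a sequence of bytes in some 8-bit encoding; text that fails it already contains characters that cannot be such bytes.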
If the specified encoding has a defined signature (a byte order mark),
and an entire buffer was converted, not just part of one, Epsilon adds
the signature when converting to the encoding, and removes the
signature, if there is one, when converting from the encoding.
int buffer_flags(int buf)
Internally, Epsilon stores the text of a buffer with 8 bits for each
character, unless it contains some characters outside the range
0-255. In that case it uses 16 bits for each character. A buffer
that once contained such characters but no longer does may still be
stored as 16 bits per character. Epsilon transparently handles all
needed translations between the two formats (for instance, when you
copy text from one buffer to another), but it's occasionally useful to
tell which format Epsilon is using.
The buffer_flags( ) primitive returns a bit mask. Check the
bit represented by the BF_UNICODE macro; if it's set,
the specified buffer buf is stored in 16-bit format internally.
If buf is omitted or zero, the primitive checks the current
buffer.
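
Testing the bit mask might look like the following plain C sketch. It rests on assumptions: the value of BF_UNICODE is a placeholder (the real macro comes from Epsilon's headers), buffer_flags( ) is a mock backed by a small table, and bits_per_char( ) is a hypothetical helper. Unlike EEL, C does not let the argument be omitted, so the sketch passes 0 to mean the current buffer.

```c
#define BF_UNICODE 0x1      /* placeholder value */

/* Mock buffer table (hypothetical): the index is the buffer number;
   entry 2 has the Unicode bit set along with an unrelated bit. */
int mock_current_buffer = 1;
int mock_buffer_flag_table[4] = { 0, 0, BF_UNICODE | 0x10, 0 };

/* Stand-in for the primitive: 0 means the current buffer. */
int buffer_flags(int buf)
{
    if (buf == 0)
        buf = mock_current_buffer;
    return mock_buffer_flag_table[buf];
}

/* Hypothetical helper: storage width of one character in the buffer. */
int bits_per_char(int buf)
{
    /* Mask with & rather than comparing with ==, since other
       bits in the mask may also be set. */
    return (buffer_flags(buf) & BF_UNICODE) ? 16 : 8;
}
```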
Epsilon Programmer's Editor 14.04 manual. Copyright (C) 1984, 2021 by Lugaru Software Ltd. All rights reserved.