unicode-convert-encoding
Convert buffer to another Unicode encoding.
This command converts a buffer between various Unicode 8-bit and
16-bit encodings.
In the UTF-8 8-bit encoding, characters in the range 0-127 represent
themselves. Sequences of two to four bytes in the range 128-255
represent each character outside the range 0-127. In the Latin 1
encoding, characters in the range 0-255 represent themselves, and no
characters outside that range may be represented.
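
As a rough illustration of the byte counts described above, the
following sketch uses Python's built-in codecs. It is not Epsilon's own
conversion code, and the helper names utf8_length and
latin1_representable are hypothetical, chosen only for this example.

```python
# Illustration only: Python's standard codecs, not Epsilon's converter.
# utf8_length and latin1_representable are hypothetical helper names.

def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

def latin1_representable(ch: str) -> bool:
    """Return True if the character fits in Latin 1 (range 0-255)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}: {utf8_length(ch)} UTF-8 byte(s), "
              f"Latin 1: {latin1_representable(ch)}")
```

A character in the range 0-127 takes one byte in UTF-8, while a
character above 255 takes two or more bytes and has no Latin 1 form.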
In the 16-bit UTF-16 encoding, a two- or four-byte sequence represents
each character, regardless of its value. (There are two variants,
UTF-16 LE and UTF-16 BE, identical except for byte order.)
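
The byte-order difference between the two UTF-16 variations can be
sketched with Python's built-in codecs. This is an illustration only,
not Epsilon's implementation; utf16_bytes is a hypothetical helper
name.

```python
# Illustration only: Python's standard codecs, not Epsilon's converter.
# utf16_bytes is a hypothetical helper name.

def utf16_bytes(ch: str, byte_order: str = "le") -> bytes:
    """Encode text as UTF-16 LE or UTF-16 BE, without a byte order mark."""
    codec = "utf-16-le" if byte_order == "le" else "utf-16-be"
    return ch.encode(codec)

if __name__ == "__main__":
    for ch in ["A", "\u20ac", "\U0001F600"]:
        le = utf16_bytes(ch, "le").hex(" ")
        be = utf16_bytes(ch, "be").hex(" ")
        print(f"U+{ord(ch):04X}: LE = {le} | BE = {be}")
```

Each 16-bit unit holds the same value in both variations; only the
order of its two bytes differs. Characters above U+FFFF occupy two
such units, for four bytes in total.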
The command prompts for the type of conversion desired. It warns if
any characters in the buffer cannot be represented in the new format
(or if the buffer contains encoding errors). If you choose not to
perform the conversion, it moves to the first such problem.
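
The kind of check described above can be approximated outside Epsilon
as follows. This is a minimal sketch using Python's codecs, not a
description of how Epsilon performs the check; the function names
first_unrepresentable and first_decoding_error are hypothetical.

```python
# Illustration only: a stand-in for the check, not Epsilon's own logic.
# Both function names are hypothetical.
from typing import Optional

def first_unrepresentable(text: str, encoding: str) -> Optional[int]:
    """Return the index of the first character the target encoding
    cannot represent, or None if every character can be encoded."""
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            return index
    return None

def first_decoding_error(data: bytes, encoding: str) -> Optional[int]:
    """Return the byte offset of the first encoding error in data,
    or None if the data decodes cleanly."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        return err.start
    return None

if __name__ == "__main__":
    # The euro sign has no Latin 1 form, so its index is reported.
    print(first_unrepresentable("caf\u00e9 \u20ac5", "latin-1"))
    # The byte 0xFF can never appear in valid UTF-8.
    print(first_decoding_error(b"ab\xffcd", "utf-8"))
```

A caller could use the returned position to show the user where the
first problem lies before deciding whether to go ahead.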
Under Windows, Epsilon performs DOS/Windows line translation before
converting to UTF-16, unless the buffer contains non-text binary data
(nulls or Return characters): it converts each Newline character to a
Return, Newline sequence. It performs the opposite line translation
when converting from UTF-16. Under Unix, Epsilon doesn't perform any
translation by default. Provide a zero prefix argument to disable
line terminator conversion; provide a nonzero prefix argument to
force it.
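
The rules above for the conversion to UTF-16 can be modeled roughly as
shown here. This sketch is an assumption-laden illustration, not
Epsilon's code: the names should_translate and to_utf16_le, the
on_windows flag, and the use of None for "no prefix argument" are all
hypothetical. It also assumes that a prefix argument takes precedence
over the binary-data check, which the text above does not state
explicitly.

```python
# Illustration only: a model of the documented rules, not Epsilon's code.
# All names and the None-means-no-prefix convention are hypothetical.
from typing import Optional

def should_translate(text: str, on_windows: bool,
                     prefix_arg: Optional[int] = None) -> bool:
    """Decide whether Newline should become Return, Newline."""
    if prefix_arg is not None:
        # Assumed precedence: zero disables, nonzero forces.
        return prefix_arg != 0
    if not on_windows:
        # Under Unix there is no translation by default.
        return False
    # Under Windows, skip buffers that look like binary data.
    return "\0" not in text and "\r" not in text

def to_utf16_le(text: str, on_windows: bool,
                prefix_arg: Optional[int] = None) -> bytes:
    """Apply line translation if called for, then encode as UTF-16 LE."""
    if should_translate(text, on_windows, prefix_arg):
        text = text.replace("\n", "\r\n")
    return text.encode("utf-16-le")

if __name__ == "__main__":
    print(to_utf16_le("a\nb", on_windows=True).hex(" "))
    print(to_utf16_le("a\nb", on_windows=True, prefix_arg=0).hex(" "))
```

The reverse direction, from UTF-16, would replace each Return, Newline
sequence with a single Newline after decoding.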
Copyright (C) 1984, 2020 by Lugaru Software Ltd. All rights reserved.