Unicode support
===============

-----------------------------

[The Rexx Parser](../../) includes optional
support for
[TUTOR-flavored Unicode](https://github.com/RexxLA/rexx-repository/tree/master/ARB/standards/work-in-progress/unicode/UnicodeTools).

API changes
-----------

### New element categories

- [The Element API](../guide/elementapi/) defines five new element categories,
  `.EL.BYTES_STRING`, `.EL.CODEPOINTS_STRING`, `.EL.GRAPHEMES_STRING`,
  `.EL.TEXT_STRING` and `.EL.UNICODE_STRING`, for the new Unicode strings. These
  are defined in [the Globals.cls package](/rexx-parser/doc/ref/categories/),
  and they will be generated by the parser only when Unicode support has been requested.

In that case, the Parser will examine the contents of string literals
and raise the appropriate syntax errors when the strings do not follow
the Unicode or TUTOR conventions. P-, G- and T-strings will be checked
for UTF-8 well-formedness. In the case of U-strings,
codepoints will be range checked, names will be checked against
`UnicodeData.txt` and `NameAliases.txt`, and labels will be checked against their
respective definitions.
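
To make these checks concrete, here is a minimal sketch in Python of the three kinds of validation mentioned above. It is only an illustration and not the Parser's actual implementation: the function names are hypothetical, the range check assumes the standard Unicode codespace (0 to 10FFFF), and the name lookup relies on Python's bundled `unicodedata` database rather than on the `UnicodeData.txt` and `NameAliases.txt` files themselves.

```python
import unicodedata


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True when `data` is well-formed UTF-8.

    Python's strict decoder rejects overlong forms, encoded
    surrogates and truncated sequences.
    """
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def in_unicode_codespace(codepoint: int) -> bool:
    """Range check: the Unicode codespace is 0..0x10FFFF (assumed bound)."""
    return 0 <= codepoint <= 0x10FFFF


def is_known_name(name: str) -> bool:
    """Return True when `name` is a character name or alias known to Python.

    Hypothetical stand-in for a lookup in UnicodeData.txt and
    NameAliases.txt; unicodedata.lookup raises KeyError on failure.
    """
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False
```

For instance, `is_well_formed_utf8(b"\xc0\x80")` is false because the sequence is an overlong encoding, and `in_unicode_codespace(0x110000)` is false because the value lies just past the end of the codespace.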

### New HTML classes

- The [HTMLClasses](/rexx-parser/doc/highlighter/HTMLClasses/) routine
  assigns new HTML classes to the newly defined string classes.

### New Parser options

- The [Rexx.Parser](/rexx-parser/doc/ref/classes/rexx.parser/) class
  accepts a new boolean `Unicode` option, which activates Unicode support.


Updated programs and utilities
------------------------------

- The [FencedCode](/rexx-parser/doc/highlighter/FencedCode/)
  routine accepts `TUTOR` and `Unicode` as new attributes in Rexx fenced code blocks.
  They both activate Unicode support.
- The [Highlighter](/rexx-parser/doc/ref/classes/highlighter/) class
  accepts `TUTOR` and `Unicode` options. They both activate Unicode support.
- Both the [highlight.rex](/rexx-parser/doc/utilities/highlight/)
  and [elements.rex](/rexx-parser/doc/utilities/elements/) utilities accept new
  `-u`, `--tutor` and `--unicode` options, which enable Unicode support.
