Private Use (Unicode)

In Unicode, the Private Use Areas are three ranges of code points (U+E000–U+F8FF in the BMP, and planes 15 and 16) that, by definition, will not be assigned characters by the Unicode Consortium. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. Under the Unicode Stability Policy,^[1] the Private Use Areas will remain allocated for that purpose in all future Unicode versions.

Assignments to Private Use Area characters need not be "private" in the sense of being strictly internal to an organisation; several organisations have published their assignment schemes. Such publication may include a font that supports the definition (showing the glyphs) and software that makes use of the private-use characters (e.g. a graphics character for a "print document" function). For example, Apple Inc. has assigned the Apple logo to code point U+F8FF <private-use-F8FF> and uses this assignment in its fonts and software. Because multiple private parties may assign different characters to the same code point, a user may see one private character from an installed font where a different one was intended.

1 Definition
2 Private Use Areas
3 Private-use characters in other character sets
4 Usage
- 4.1 Coordinated private use, and publishings into Unicode
- 4.2 Example code point U+F8FF
5 Notes
6 References

Definition

Under the Unicode definition, code points in the Private Use Areas are assigned characters—they are not noncharacters, reserved, or unassigned. Their category is "Other, private use (Co)", and no character names are specified. No representative glyphs are provided, and character semantics are left to private agreement.

Private-use characters are assigned Unicode code points whose interpretation is not specified by this standard and whose use may be determined by private agreement among cooperating users. These characters are designated for private use and do not have defined, interpretable semantics except by private agreement. … No charts are provided for private-use characters, as any such characters are, by their very nature, defined only outside the context of this standard.^[2]
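The properties described above can be checked with a short sketch that uses Python's standard `unicodedata` module. The helper names `is_private_use` and `character_name_or_none` are illustrative assumptions, not part of any library or of the standard.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch has General Category Co."""
    return unicodedata.category(ch) == "Co"


def character_name_or_none(ch: str):
    """Return the character name, or None when the database has no name."""
    # Private-use characters have no character names, so the default is used.
    return unicodedata.name(ch, None)


if __name__ == "__main__":
    for ch in ("\ue000", "\uf8ff", "\U000F0000", "A"):
        print(f"U+{ord(ch):04X}", is_private_use(ch), character_name_or_none(ch))
```

Because the semantics are left to private agreement, a check like this can only say that a code point is private-use; it cannot say what the character is meant to represent.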

Private Use Areas

In the Basic Multilingual Plane (plane 0), the block titled Private Use Area has 6400 code points. Planes 15 and 16 are almost^[note 1] entirely assigned to two further Private Use Areas, Supplementary Private Use Area-A and Supplementary Private Use Area-B respectively.

In order to encode characters from planes 15 and 16 in UTF-16, a further block of the BMP is assigned to High Private Use Surrogates (U+DB80..U+DBFF, 128 code points).
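The relationship between planes 15 and 16 and the High Private Use Surrogates block follows from the standard UTF-16 surrogate arithmetic. The following is a minimal sketch; the function names `utf16_surrogates` and `is_high_private_use_surrogate` are assumptions made for illustration.

```python
from typing import Tuple


def utf16_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only supplementary code points use surrogate pairs")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def is_high_private_use_surrogate(unit: int) -> bool:
    """True if a UTF-16 code unit lies in U+DB80..U+DBFF."""
    return 0xDB80 <= unit <= 0xDBFF


if __name__ == "__main__":
    for cp in (0xF0000, 0xFFFFD, 0x100000, 0x10FFFD):
        high, low = utf16_surrogates(cp)
        print(f"U+{cp:X} -> {high:04X} {low:04X}")
```

The first code point of plane 15 yields the high surrogate DB80 and the last private-use code point of plane 16 yields DBFF, which is why exactly 128 high surrogates are set aside for these two planes.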

Unicode: Private Use Areas
Definition: `General Category=Co`^[a]^[b]
Range	Plane	Block name	Number of code points	Note
U+E000..U+F8FF	BMP (0)	Private Use Area	6400
U+F0000..U+FFFFD	PUP (15)^[c]	Supplementary Private Use Area-A	65534	Encoded in UTF-16 using the High Private Use Surrogates block (U+DB80..U+DBFF) in the BMP.
U+100000..U+10FFFD	PUP (16)^[c]	Supplementary Private Use Area-B	65534	Encoded in UTF-16 using the High Private Use Surrogates block (U+DB80..U+DBFF) in the BMP.
Notes
a. ^ Unicode Standard chapter 2
b. ^ Unicode Standard chapter 16.5
c. ^ Private Use Plane: Unicode has not published identifying names for planes 15 and 16. Chapter 2.8 says "The two Private Use Planes (Planes 15 and 16)", while the block names used are Supplementary Private Use Area-A and Supplementary Private Use Area-B. The final code points U+xxFFFE and U+xxFFFF in these blocks are not private-use characters.
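The code point counts in the table can be reproduced directly from the range boundaries. This is a small sketch; the names `PRIVATE_USE_RANGES`, `in_private_use_range` and `range_sizes` are chosen here for illustration.

```python
# Inclusive (first, last) boundaries of the three private-use ranges.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (BMP)
    (0xF0000, 0xFFFFD),    # plane 15, excluding U+FFFFE and U+FFFFF
    (0x100000, 0x10FFFD),  # plane 16, excluding U+10FFFE and U+10FFFF
)


def in_private_use_range(code_point: int) -> bool:
    """True if the code point falls inside one of the three ranges."""
    return any(first <= code_point <= last for first, last in PRIVATE_USE_RANGES)


def range_sizes():
    """Number of code points in each range, in table order."""
    return [last - first + 1 for first, last in PRIVATE_USE_RANGES]


if __name__ == "__main__":
    print(range_sizes())       # per-range counts
    print(sum(range_sizes()))  # total number of private-use code points
```

A range test of this kind needs no character database, which can be convenient in environments where one is not available.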

Private-use characters in other character sets

The Unicode Private Use Area concept was based on similar earlier usage in other character sets. In particular, many otherwise obsolete characters in East Asian scripts continue to be used in specific names or other situations, and so some character sets for those scripts made allowance for private-use characters (such as the user-defined planes of CNS 11643, or gaiji in certain Japanese encodings). The Unicode standard references these uses under the name "End User Character Definition" (EUCD).^[2]

Additionally, the C1 control block contains two codes intended for private use "control functions" by ECMA-48: 0x91 private use one (PU1) and 0x92 private use two (PU2).^[3]^[4] Unicode includes these at U+0091 <control-0091> and U+0092 <control-0092> but defines them as control characters (category Cc), not private-use characters (category Co).^[5]^[6]
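The distinction between these control codes and true private-use characters can be seen in the General Category values reported by Python's standard `unicodedata` module. This is only a sketch; the dictionary `C1_PRIVATE_USE_CONTROLS` and the helper `general_category` are illustrative names.

```python
import unicodedata

# ECMA-48 abbreviations for the two C1 codes discussed above.
C1_PRIVATE_USE_CONTROLS = {0x0091: "PU1", 0x0092: "PU2"}


def general_category(code_point: int) -> str:
    """Return the two-letter General Category of a code point."""
    return unicodedata.category(chr(code_point))


if __name__ == "__main__":
    for cp, label in C1_PRIVATE_USE_CONTROLS.items():
        print(f"U+{cp:04X} {label}: {general_category(cp)}")
    print(f"U+E000: {general_category(0xE000)}")
```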

Usage

Coordinated private use, and publishings into Unicode

Many people and institutions have created character collections for the PUA. Some of these private use agreements are published, so other PUA implementers can aim for unused or less used code points to prevent overlaps. Several characters and scripts previously encoded in private use agreements have actually been fully encoded in Unicode, necessitating mappings from the PUA to other Unicode code points.
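When characters move from a private-use agreement into Unicode proper, existing text has to be remapped. The sketch below shows one way this might be done with `str.translate`. The mapping table is a hypothetical placeholder: its code point pairs are invented for illustration and do not reproduce any published agreement.

```python
import unicodedata

# HYPOTHETICAL mapping: these pairs are made up for this example only.
LEGACY_PUA_TO_STANDARD = {
    0xE000: 0x10450,
    0xE001: 0x10451,
}


def migrate_private_use(text: str, mapping: dict) -> str:
    """Replace mapped private-use characters; leave everything else as-is."""
    return text.translate(mapping)


def remaining_private_use(text: str):
    """List the distinct private-use characters still present in text."""
    return sorted({ch for ch in text if unicodedata.category(ch) == "Co"})


if __name__ == "__main__":
    legacy = "a\ue000b\ue001c\uf8ff"
    migrated = migrate_private_use(legacy, LEGACY_PUA_TO_STANDARD)
    print([f"U+{ord(ch):04X}" for ch in migrated])
    print([f"U+{ord(ch):04X}" for ch in remaining_private_use(migrated)])
```

Reporting the private-use characters that remain after migration helps identify text that still depends on a private agreement or a specific font.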

One of the better-known and more broadly implemented PUA agreements is maintained by the ConScript Unicode Registry (CSUR). The CSUR, which is neither officially endorsed by nor associated with the Unicode Consortium, provides a mapping for constructed scripts, such as Klingon pIqaD and Ferengi script (Star Trek), Tengwar and Cirth (J.R.R. Tolkien's cursive and runic scripts), Alexander Melville Bell's Visible Speech, and Dr. Seuss' alphabet from On Beyond Zebra. The CSUR previously encoded the undeciphered Phaistos characters, as well as the Shavian and Deseret alphabets, which have all been accepted for official encoding in Unicode.

Another common PUA agreement is maintained by the Medieval Unicode Font Initiative (MUFI). This project is attempting to support all of the scribal abbreviations, ligatures, and alternate letterforms found in medieval texts written in the Latin alphabet. The express purpose of MUFI is to experimentally determine which characters are necessary to represent these texts, and to have those characters officially encoded in Unicode. As of Unicode version 5.1, 152 MUFI characters have been incorporated into the official Unicode encoding.

Publishing organisation	Short	Topic	PUA area claimed	Font
ConScript Unicode Registry	CSUR	Artificial scripts	PUA (BMP) and Plane 15	Code2000
Medieval Unicode Font Initiative	MUFI	Medieval scripts	PUA (BMP)	Charis SIL
SIL International	SIL	Phonetics and languages	PUA (BMP)

Example code point U+F8FF

Unicode code point U+F8FF is the last code point in the Private Use Area of the BMP. Its meaning and appearance vary depending on the font in use, but its usage in several fonts makes it the most notable code point in the Private Use Area.

- Some font makers place a copyright statement or other creator's mark at that code point. For example, the dingbats font "DavysDingbats" uses it to display a face, presumably that of the font's creator.
- In most Apple-supplied fonts, it represents the Apple logo, or an early version of the command key.
- Some early Tengwar fonts map Elvish characters to it.
- The Imitari font draws it as a capital eth.
- The font Luxi draws it as the euro sign.
- The font "Standard Symbols L" uses it as one of the box drawing characters.
- The official PRC standard on precomposed Tibetan uses the code point for the Tibetan syllable "hwo".
- The ConScript Unicode Registry suggests it be used for the Klingon glyph "KLINGON MUMMIFICATION GLYPH".
- In Wingdings 1, it is the Microsoft Windows logo. (Some other fonts place this logo at U+F000 instead.)

Notes

1. ^ The last two code points of every plane are defined to be noncharacters. The remaining 65,534 code points of each of planes 15 and 16 are assigned as private-use characters.

References

1. ^ "Unicode Character Encoding Stability Policy". 2012-05-29. http://unicode.org/policies/stability_policy.html. Retrieved 2012-08-15.
2. ^ a b Unicode Standard chapter 16.5 Private Use characters
3. ^ Standard ECMA-48, Fifth Edition - June 1991 §8.2.14 Miscellaneous control functions, §8.3.100, §8.3.101
4. ^ ISO C1 Control Character Set of ISO 6429 (1983)
5. ^ UnicodeData.txt
6. ^ Unicode 6.1.0, Chapter 4, Table 4-9
