Private Use (Unicode)

In Unicode, Private Use is a concept to allow characters to be defined and used by private agreement between parties (that is, not involving the Unicode Consortium), using unspecified code points in a Private Use Area or range. The private agreement may be published, and often is. Such publication may include a font that supports the definition (showing the glyphs), and processes to support privately-defined graphic or even control effects (e.g. a clickable <do print> character). As a stability rule, the Unicode Standard guarantees these Private Use code points will never be assigned regular characters, so Unicode will never interfere with the private agreement.

For example, Apple Inc. has published the Apple logo to be encoded at Private-use code point U+F8FF <private-use-F8FF>, and maintains this in its fonts and systems.

By definition, multiple private parties may assign different characters to the same code point, with the consequence that a user who has the wrong font installed may see a character from another party's definition set.

1 Definition
2 Private Use Area
3 Private Use in other encodings
4 Usage
- 4.1 Coordinated private use, and publishings into Unicode
- 4.2 Example code point U+F8FF
5 References

Definition

Unicode treats private-use code points as assigned characters (as opposed to, say, reserved code points), but it defines no specifics for them, and their default properties can be overridden by the private agreement. Part of the stability of the standard is that these code points will never be assigned a regular Unicode character:

Characters in these [Private Use] areas will never be defined by the Unicode Standard. These code points can be freely used for characters of any purpose, but successful interchange requires an agreement between sender and receiver on their interpretation.^[1]^[2]

All private-use characters have General Category Co (Other, private use).
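
The General Category test can be sketched in Python with the standard `unicodedata` module. This is a minimal illustration, not part of the Unicode Standard; the helper name `is_private_use` is an assumption made for this example.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch has General Category Co."""
    return unicodedata.category(ch) == "Co"


# U+E000 is the first private-use code point in the BMP;
# "A" is an ordinary letter (category Lu).
print(is_private_use("\uE000"))
print(is_private_use("A"))
```

The result depends on the Unicode Character Database version bundled with the Python interpreter, although the private-use ranges themselves are stable.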

Private Use Area

There are three blocks of private-use code points, each called a Private Use Area (PUA). The Basic Multilingual Plane (BMP, plane 0) contains the block Private Use Area with 6,400 code points, and planes 15 and 16 contain the blocks Supplemental Private Use Area-A and Supplemental Private Use Area-B respectively, with 65,534 code points each. In UTF-16, code points in the two private-use planes are represented by surrogate pairs of BMP code units. The high surrogates are those in the BMP block High Private Use Surrogates (U+DB80..U+DBFF, 128 code points), each combined with any of the low surrogates (U+DC00..U+DFFF, 1,024 code points). The one-to-one mapping between a surrogate pair and its code point is defined by UTF-16.
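
The surrogate-pair mapping can be worked through in a few lines of Python. This is a sketch of the standard UTF-16 arithmetic; the function name `to_surrogate_pair` is an assumption made for this example.

```python
from typing import Tuple


def to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above the BMP use surrogate pairs")
    offset = cp - 0x10000               # 20-bit value
    high = 0xD800 + (offset >> 10)      # upper 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)     # lower 10 bits select the low surrogate
    return high, low


# First code point of plane 15 and last private-use code point of plane 16.
for cp in (0xF0000, 0x10FFFD):
    high, low = to_surrogate_pair(cp)
    print(f"U+{cp:04X} -> {high:04X} {low:04X}")
```

For U+F0000 this yields the pair DB80 DC00, and for U+10FFFD it yields DBFF DFFD, so both private-use planes fall entirely within the high-surrogate range U+DB80..U+DBFF. The 128 high surrogates times 1,024 low surrogates give 131,072 combinations, exactly two planes of 65,536 code points.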

Unicode: Private Use Areas
Definition: `General Category=Co`^[a]^[b]
Range	Plane	Block name	Number of code points	Note
U+E000..U+F8FF	BMP (0)	Private Use Area	6400
U+F0000..U+FFFFD	PUP (15)^[c]	Supplemental Private Use Area-A	65534	Based on block High Private Use Surrogates (U+DB80..U+DBFF) in BMP, using UTF-16.
U+100000..U+10FFFD	PUP (16)^[c]	Supplemental Private Use Area-B	65534
Notes
^[a] Unicode Standard chapter 2
^[b] Unicode Standard chapter 16.5
^[c] Private Use Plane: Unicode has not published identifying names for planes 15 and 16. Chapter 2.8 says "The two Private Use Planes (Planes 15 and 16)", while the PUA block names used are Supplemental PUA-A and Supplemental PUA-B. The final code points U+xxFFFE and U+xxFFFF of each plane are not private-use characters.
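
The ranges in the table can also be checked directly, without consulting a character database. The following is a minimal sketch; the names `PRIVATE_USE_RANGES` and `in_private_use_area` are assumptions made for this example.

```python
# Inclusive (first, last) code points of the three Private Use Areas,
# copied from the table above.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area (BMP)
    (0xF0000, 0xFFFFD),    # Supplemental Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplemental Private Use Area-B (plane 16)
]


def in_private_use_area(cp: int) -> bool:
    """Return True if code point cp lies in one of the three PUA blocks."""
    return any(first <= cp <= last for first, last in PRIVATE_USE_RANGES)


# Block sizes follow from the inclusive ranges.
sizes = [last - first + 1 for first, last in PRIVATE_USE_RANGES]
print(sizes, sum(sizes))
```

The computed sizes match the "Number of code points" column: 6400, 65534 and 65534.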

Private Use in other encodings

The concept of private use was already present in earlier encodings. East Asian systems used end-user-defined characters (EUDC)^[1].

In ISO-8859-1 (and many other ASCII-compatible character encodings), the C1 control block contains two codes intended for private use "control functions" by ECMA-48: 0x91 private use one (PU1) and 0x92 private use two (PU2).^[3]^[4] Unicode includes these at U+0091 <control-0091> and U+0092 <control-0092> but defines them as control characters (category Cc), not private use characters (category Co).^[5]^[6]
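
The distinction between the C1 private-use controls and true private-use characters can be observed in Python. This is a small sketch assuming Python's built-in `latin-1` codec and the `unicodedata` module; the helper name `c1_private_use_categories` is hypothetical.

```python
import unicodedata


def c1_private_use_categories() -> dict:
    """Decode bytes 0x91 and 0x92 as ISO-8859-1 and report their categories."""
    result = {}
    for byte in (0x91, 0x92):
        ch = bytes([byte]).decode("latin-1")  # ISO-8859-1 maps byte N to U+00NN
        result[f"U+{ord(ch):04X}"] = unicodedata.category(ch)
    return result


print(c1_private_use_categories())
```

Both code points report the control category (Cc) rather than Co, so a check based on General Category will not treat PU1 and PU2 as private-use characters.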

The Chinese National Standard 11643 (CNS 11643) is an encoding independent of Unicode. Within this standard, planes 12 to 15 are designated for user-defined characters.

Usage

Coordinated private use, and publishings into Unicode

Many people and institutions have created character collections for the PUA. Some of these private use agreements are published, so other PUA implementers can aim for unused or less used code points to prevent overlaps. Several characters and scripts previously encoded in private use agreements have since been fully encoded in Unicode, necessitating mappings from the PUA to the newly assigned Unicode code points.
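
Such a mapping from PUA code points to officially encoded ones can be sketched with `str.translate`. The PUA side of the table below is invented purely for illustration and does not reproduce any real private agreement; only the target, the start of the Shavian block at U+10450, is an actual Unicode assignment. The names `HYPOTHETICAL_PUA_TO_STANDARD` and `migrate` are likewise assumptions.

```python
# Hypothetical legacy assignment: three letters that a private agreement
# placed at U+E000..U+E002 are remapped to U+10450..U+10452.
# The PUA code points here are illustrative, not a real registry allocation.
HYPOTHETICAL_PUA_TO_STANDARD = {0xE000 + i: 0x10450 + i for i in range(3)}


def migrate(text: str) -> str:
    """Replace mapped private-use characters; leave everything else as-is."""
    return text.translate(HYPOTHETICAL_PUA_TO_STANDARD)


legacy = "\uE000 and \uE002"
print([f"U+{ord(c):04X}" for c in migrate(legacy)])
```

Characters outside the table, including unmapped private-use characters, pass through unchanged, which keeps text from other private agreements intact.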

One of the more well-known and broadly implemented PUA agreements is maintained by the ConScript Unicode Registry (CSUR). The CSUR, which is not officially endorsed or associated with the Unicode Consortium, provides a mapping for constructed scripts, such as Klingon pIqaD and Ferengi script (Star Trek), Tengwar and Cirth (J.R.R. Tolkien's cursive and runic scripts), Alexander Melville Bell's Visible Speech, and Dr. Seuss' alphabet from On Beyond Zebra. The CSUR previously encoded the undeciphered or constructed scripts Phaistos, Shavian, and Deseret, which have all been accepted for official encoding in Unicode.

Another common PUA agreement is maintained by the Medieval Unicode Font Initiative (MUFI). This project is attempting to support all of the scribal abbreviations, ligatures, and alternate letterforms found in medieval texts written in the Latin alphabet. The express purpose of MUFI is to experimentally determine which characters are necessary to represent these texts, and to have those characters officially encoded in Unicode. As of Unicode version 5.1, 152 MUFI characters have been incorporated into the official Unicode encoding.

Example code point U+F8FF

Unicode code point U+F8FF is the last code point of the Private Use Area in the BMP. Its meaning and appearance vary depending on the font in use, but its usage in several fonts makes it the most notable code point in the private use area.

Some early Tengwar fonts map Elvish characters to it.
The Imitari font draws it as a capital eth.
The font Luxi draws it as the euro sign.
The font "Standard Symbols L" uses it as one of the box drawing characters.
The official PRC standard on precomposed Tibetan uses the codepoint for the Tibetan syllable "hwo".
Some font makers place a copyright statement or other creator's mark at that code point.
- For example, the dingbats font "DavysDingbats" uses it to display a face, presumably that of the font's creator.
- In most Apple-supplied fonts, it represents the Apple logo, or an early version of the command key.
The ConScript Unicode Registry suggests it be used for the Klingon glyph "KLINGON MUMMIFICATION GLYPH". Fonts such as Code2000 follow this suggestion.
In Wingdings 1, U+F8FF is the Windows logo. On some computers, however, the logo is at U+F000 instead of U+F8FF.
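
Because the standard assigns U+F8FF no identity beyond its private-use category, software sees the same data regardless of which of the glyphs listed above a font supplies. A minimal sketch, assuming Python's `unicodedata` module; the helper name `describe` and the placeholder string are hypothetical.

```python
import unicodedata


def describe(cp: int) -> dict:
    """Collect what the character database and UTF-8 say about a code point."""
    ch = chr(cp)
    return {
        "category": unicodedata.category(ch),
        # Private-use characters carry no character name, so the
        # default value passed here is returned instead.
        "name": unicodedata.name(ch, "<no name>"),
        "utf8": ch.encode("utf-8").hex(),
    }


print(describe(0xF8FF))
```

Whether the character is drawn as a logo, a letter, or a box-drawing symbol is decided entirely by the font, which is why the interpretations in the list above can coexist.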

References

^[1] Unicode Standard chapter 16.5 Private Use characters
^[2] Unicode Standard chapter 2: General Structure
^[3] Standard ECMA-48, Fifth Edition - June 1991 §8.2.14 Miscellaneous control functions, §8.3.100, §8.3.101
^[4] ISO C1 Control Character Set of ISO 6429 (1983)
^[5] UnicodeData.txt
^[6] Unicode 6.1.0, Chapter 4, Table 4-9
