Fonts in XFree86
5. Appendix: background and terminology
5.1. Characters and glyphs
A computer text-processing system inputs keystrokes and outputs
glyphs, small pictures that are assembled on paper or on a
computer screen. Keystrokes and glyphs do not, in general, coincide:
for example, if the system generates ligatures, then the
sequence of two keystrokes <f><i> will typically
correspond to a single glyph. Similarly, if the system shapes Arabic
glyphs in a vaguely reasonable manner, then multiple different glyphs
may correspond to a single keystroke.
The complex transformation rules from keystrokes to glyphs are usually
factored into two simpler transformations, from keystrokes to
characters and from characters to glyphs. You may want to think
of characters as the basic unit of text that is stored e.g. in
the buffer of your text editor. While the definition of a character
is intrinsically application-specific, a number of standardised
collections of characters have been defined.
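As a rough illustration of the second transformation (characters to glyphs), the following Python sketch maps a string of characters to a list of glyph names, substituting a single ligature glyph for the sequence f, i. The `LIGATURES` table, the glyph names and the `shape` function are hypothetical, chosen only for this example; a real shaping engine is considerably more involved.

```python
# Toy character-to-glyph mapping with ligature substitution.
# The table and the glyph names are illustrative assumptions,
# not taken from any real font or rendering system.
LIGATURES = {"fi": "f_i", "fl": "f_l"}


def shape(text):
    """Return a list of glyph names for the characters in text."""
    glyphs = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LIGATURES:
            # Two characters are rendered by one ligature glyph.
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            # Otherwise one character maps to one glyph.
            glyphs.append(text[i])
            i += 1
    return glyphs


print(shape("fine"))  # four characters, three glyphs
```

Here the stored text still contains four characters; only the glyph sequence is shorter.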
A coded character set is a set of characters together with a
mapping from integer codes --- known as codepoints --- to
characters. Examples of coded character sets include US-ASCII,
ISO 8859-1, KOI8-R, and JIS X 0208(1990).
A coded character set need not use 8-bit integers to index characters.
Many early systems used 6-bit character sets, while 16-bit (or larger)
character sets are necessary for ideographic writing systems.
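The dependence of a codepoint's meaning on the coded character set can be seen with a short Python sketch. It relies on the codecs shipped with Python's standard library, which is an assumption of this example and not something the X11 font system uses.

```python
# The same 8-bit code denotes different characters in different
# coded character sets.
code = bytes([0xE9])

latin1_char = code.decode("iso-8859-1")
koi8r_char = code.decode("koi8-r")

# Print the Unicode codepoints of the two resulting characters.
print(hex(ord(latin1_char)), hex(ord(koi8r_char)))

# US-ASCII only defines codes 0 to 127, so 0xE9 has no meaning there.
try:
    code.decode("ascii")
    in_ascii = True
except UnicodeDecodeError:
    in_ascii = False

print(in_ascii)
```

In ISO 8859-1 the code 0xE9 is a Latin letter, in KOI8-R it is a Cyrillic letter, and in US-ASCII it is not assigned at all.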
5.2. Font files, fonts, and XLFD
Traditionally, typographers speak about typefaces and
founts. A typeface is a particular style or design, such as
Times Italic, while a fount is a molten-lead incarnation of a given
typeface at a given size.
Digital fonts come in font files. A font file contains the
information necessary for generating glyphs of a given typeface, and
applications using font files may access glyph information in an
arbitrary order.
Digital fonts may consist of bitmap data, in which case they are said
to be bitmap fonts. They may also consist of a mathematical
description of glyph shapes, in which case they are said to be
scalable fonts. Common formats for scalable font files are
Type 1 (sometimes incorrectly called ATM fonts or
PostScript fonts), TrueType and Speedo.
The glyph data in a digital font needs to be indexed somehow. How
this is done depends on the font file format. In the case of
Type 1 fonts, glyphs are identified by glyph names. In the
case of TrueType fonts, glyphs are indexed by integers corresponding
to one of a number of indexing schemes (usually Unicode --- see below).
The X11 core fonts system uses the data in a font file to generate
font instances, which are collections of glyphs at a given size
indexed according to a given encoding.
X11 core font instances are usually specified using a notation known
as the X Logical Font Description (XLFD). An XLFD starts with a
dash `-', and consists of fourteen fields separated by dashes,
for example:
-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1
Of particular interest are the last two fields, `iso8859-1', which
specify the font instance's encoding.
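The fourteen-field structure can be made concrete with a small Python sketch that splits an XLFD into named fields. The field names follow the usual XLFD conventions; the `parse_xlfd` helper is hypothetical and assumes that no field contains a dash itself, which holds for the names shown here.

```python
# Conventional names of the fourteen XLFD fields, in order.
XLFD_FIELDS = (
    "FOUNDRY", "FAMILY_NAME", "WEIGHT_NAME", "SLANT",
    "SETWIDTH_NAME", "ADD_STYLE_NAME", "PIXEL_SIZE", "POINT_SIZE",
    "RESOLUTION_X", "RESOLUTION_Y", "SPACING", "AVERAGE_WIDTH",
    "CHARSET_REGISTRY", "CHARSET_ENCODING",
)


def parse_xlfd(name):
    """Split a fully specified XLFD into a dict of its fourteen fields."""
    if not name.startswith("-"):
        raise ValueError("an XLFD must start with a dash")
    # Drop the leading dash, then split on the remaining separators.
    parts = name[1:].split("-")
    if len(parts) != len(XLFD_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(XLFD_FIELDS), len(parts)))
    return dict(zip(XLFD_FIELDS, parts))


fields = parse_xlfd(
    "-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1")
# The encoding is formed by the last two fields.
encoding = fields["CHARSET_REGISTRY"] + "-" + fields["CHARSET_ENCODING"]
print(encoding)
```

Note that the empty field between `normal' and `12' (the additional style name) is preserved by the split, which is why the count comes out at fourteen.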
A scalable font is specified by an XLFD which contains zeroes in place
of the size-related fields (pixel size, point size, resolutions and
average width):
-adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1
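A minimal Python sketch of how a program might recognise such a name follows. It assumes the convention that a scalable XLFD has zero in the pixel size, point size and average width fields; the `is_scalable_xlfd` function is a hypothetical helper, not part of any X11 library.

```python
def is_scalable_xlfd(name):
    """Return True if pixel size, point size and average width are all "0"."""
    parts = name.split("-")
    # parts[0] is the empty string before the leading dash, so a
    # well-formed XLFD yields 15 parts: one empty string plus 14 fields.
    if len(parts) != 15 or parts[0] != "":
        return False
    # Positions 7, 8 and 12 hold pixel size, point size and average width.
    return parts[7] == "0" and parts[8] == "0" and parts[12] == "0"


print(is_scalable_xlfd(
    "-adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1"))
print(is_scalable_xlfd(
    "-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1"))
```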
X11 font instances may also be specified by short name. Unlike an
XLFD, a short name has no structure and is simply a conventional name
for a font instance. Two short names are of particular interest, as
the server will not start if font instances with these names cannot be
opened. These are `fixed', which specifies the fallback font to
use when the requested font cannot be opened, and `cursor', which
specifies the set of glyphs to be used by the mouse pointer.
Short names are usually implemented as aliases to XLFDs; the
standard `fixed' and `cursor' aliases are defined in
/usr/X11R6/lib/X11/fonts/misc/fonts.alias.
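To show what resolving a short name amounts to, here is a Python sketch that reads alias definitions into a dictionary. It assumes that each non-comment line of an alias file holds a short name followed by white space and a font name, and that lines starting with `!' are comments. The sample text and the `parse_aliases` helper are illustrative and not copied from a real installation.

```python
# Illustrative alias definitions; a real fonts.alias file may differ.
SAMPLE = """\
! sample entries for illustration only
fixed    -misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso8859-1
"""


def parse_aliases(text):
    """Map each short name to the font name it stands for."""
    aliases = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("!"):
            continue
        fields = line.split(None, 1)
        if len(fields) == 2:
            aliases[fields[0]] = fields[1].strip()
    return aliases


aliases = parse_aliases(SAMPLE)
print(aliases["fixed"])
```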
5.3. Unicode
Unicode (http://www.unicode.org) is a coded character
set with the goal of uniquely identifying all characters for all
scripts, current and historical. While Unicode was explicitly not
designed as a glyph encoding scheme, it is often possible to use it as
such.
Unicode is an open character set, meaning that codepoint
assignments may be added to Unicode at any time (once specified,
though, an assignment can never be changed). For this reason, a
Unicode font will be sparse, meaning that it only defines glyphs
for a subset of the character registry of Unicode.
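Sparseness can be pictured as a mapping from codepoints to glyph data with many codepoints absent. The Python sketch below models a tiny font in this way and reports which characters of a string it cannot render. The glyph payloads are placeholder strings and `missing_codepoints` is a hypothetical helper; a real font stores bitmaps or outlines.

```python
# A sparse font modelled as a mapping from Unicode codepoints to glyph data.
font_glyphs = {
    0x0041: "glyph for LATIN CAPITAL LETTER A",
    0x00E9: "glyph for LATIN SMALL LETTER E WITH ACUTE",
    0x0416: "glyph for CYRILLIC CAPITAL LETTER ZHE",
}


def missing_codepoints(text, glyphs):
    """Return the sorted codepoints of text that have no glyph in the font."""
    return sorted({ord(ch) for ch in text} - set(glyphs))


# U+4E2D is a CJK ideograph that this toy font does not cover.
missing = missing_codepoints("A\u00e9\u4e2d", font_glyphs)
print([hex(cp) for cp in missing])
```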
The Unicode standard is defined in parallel with the international
standard ISO 10646. Assignments in the two standards are always
equivalent, and we often use the terms Unicode and
ISO 10646 interchangeably.
When used in the X11 core fonts system, Unicode-encoded fonts should
have the last two fields of their XLFD set to `iso10646-1'.