


On 12/6/2012 6:12 AM, Dirk Laurie wrote:
2012/12/5 Jay Carlson <nop@nop.com>:

Here's a nickel. Get yourself a real operating system
(or perhaps just a real MUA).

You're the second poster to make snide remarks at my OS.
Adam called it "crappy".

Actually unnecessary decomposed characters cannot arise
on my system without great inconvenience, so I can't blame
the authors for failing to provide an output mechanism that
uncraps crappy input.

Typographic issues are a bit beyond this list, but here is how it works (a bit simplified, as more is involved):

- input can consist of either a sequence of characters that gets turned into one (u + diaeresis = udiaeresis) or a direct code point (udiaeresis); from the linguistic point of view the two dots can represent something different per language, e.g. an umlaut in German

- a font can provide a composed character as a precomposed glyph or in decomposed form, and most modern (TrueType/OpenType) fonts provide for this; some fonts have composed glyphs but at the same time carry the information on how to compose them from other glyphs

- the way composition happens depends on the font logic: it can be done via substitution (resulting in a precomposed glyph) or relative positioning; fonts may also require a decomposition step and start from the individual characters

- in most cases collapsing already takes place at the input stage, i.e. decomposed sequences get turned into composed ones (but a font might demand decomposition later on)

- characters get represented by glyphs and there is a one-to-many relationship: think of small caps, oldstyle and other renderings; a font can have rulesets that are to be applied in sequence

- in a precomposed glyph the (for instance) accent is part of the package and the graphic definition might provide clues for rendering (hinting)

- in the decomposed case the base character and the accent (officially called a mark) get positioned relative to each other using so-called anchors; in that case you can run into rounding errors and hinting can be less optimal

- if none of this works, which is the case if no entry for the composed glyph is provided, i.e. no information is available on how to deal with the situation, the characters get overlaid because an accent has either width zero or some fixed width (fonts are not consistent in this)

- of course a font renderer can apply some heuristics, e.g. centering the accent over the base character

- in addition, operating systems often use technologies where, if a font has no entry, a glyph from another font is taken

- in situations where ligaturing is involved (nb: an accented character is not a ligature) things can be more complex, as each component of the ligature can get its own marks (for instance in Arabic scripts)

- some languages have stacked marks, for example Vietnamese, so there we run into base to mark and mark to mark situations (given that no precomposed glyph is present)
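
The composed versus decomposed input forms described above can be inspected with a few lines of code. This is only a sketch of the character level (it says nothing about what a font or renderer does with the result), and it uses Python's standard unicodedata module rather than anything from this thread; the helper name describe is made up for the illustration.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

decomposed = "u\u0308"   # u followed by COMBINING DIAERESIS
precomposed = "\u00fc"   # LATIN SMALL LETTER U WITH DIAERESIS

# NFC collapses the sequence into the single code point,
# NFD splits the single code point back into base + mark.
composed = unicodedata.normalize("NFC", decomposed)
split = unicodedata.normalize("NFD", precomposed)

# A Vietnamese letter with stacked marks: e + dot below + circumflex.
viet = "\u1ec7"
viet_parts = unicodedata.normalize("NFD", viet)
# Combining marks have a non-zero canonical combining class.
marks = [ch for ch in viet_parts if unicodedata.combining(ch) != 0]

for label, text in [("decomposed", decomposed), ("composed", composed),
                    ("vietnamese", viet_parts)]:
    print(label, describe(text))
```

The Vietnamese case yields one base character plus two marks, which is the base to mark and mark to mark situation a font has to handle when it has no precomposed glyph.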

Now to operating systems (just some personal observations):

- Windows: the font rendering technology (VOLT, ClearType, etc.) is quite good given that a decent font is used; in XP one had to turn on ClearType explicitly

- OS X: no issues (apart from occasional issues in the built-in PDF renderer); there is some Apple font technology but I think it's being phased out in favor of generic OpenType

- Linux: the technology is okay, but not always applied / configured right; one of the things I like about (X)ubuntu is that they got this right from the start, i.e. enabled anti-aliasing and other features as well as chose fonts that render okay (so, in case of doubt about the quality, just check the settings)

Microsoft and Apple have some advantage here as they are behind the current font technologies (TrueType and OpenType)

so: rendering is not so much OS-related, but more a matter of using the right fonts and setting up the machinery right; of course a high-res screen helps too

My system composes at keyboard entry level. I hit Compose,
`a`, and `^`, and a genuine `â` appears, no matter which
program is asking for input.

it might be less optimal for Chinese, Korean or Arabic (more font dependent as well as renderer dependent); for Arabic one can often see the font machinery in action in real time when one keys in characters, because sequences of characters are turned into combined shapes that need relative (vertical and horizontal) positioning as well as mark anchoring

To produce the second, decomposed, one in my post I had
to remind myself of the Unicode for combining circumflex
by consulting a document I wrote in August 2011 (revised
thanks to the present discussion and appended, helpful
comments welcome).

such documents are actually good tests for checking support of characters in an editor
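
A small test list of that kind can also be generated instead of typed. The following is a rough sketch assuming Python's standard unicodedata module; the helper both_forms and the choice of vowels are just for illustration. It looks up the combining circumflex by name, so one does not have to remember the code point, and pairs each precomposed character with its decomposed sequence.

```python
import unicodedata

# Look the mark up by its Unicode name instead of remembering U+0302.
circumflex = unicodedata.lookup("COMBINING CIRCUMFLEX ACCENT")

def both_forms(base, mark):
    """Return (precomposed, decomposed) strings for base + combining mark."""
    decomposed = base + mark
    precomposed = unicodedata.normalize("NFC", decomposed)
    return precomposed, decomposed

rows = [both_forms(base, circumflex) for base in "aeiou"]

# One line per letter: both forms side by side plus their code points,
# so an editor or terminal can be checked by eye.
for precomposed, decomposed in rows:
    codes = " ".join(f"U+{ord(ch):04X}" for ch in decomposed)
    print(f"{precomposed}  {decomposed}  U+{ord(precomposed):04X}  {codes}")
```

If the two columns look different in an editor, the difference comes from the font or renderer, since both strings are canonically equivalent.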

Hans



-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
    tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------