You are probably correct about the term "character". However, there are also combining characters: you must parse one base character followed by one or more combining characters, which modify it to form a single, distinct typographic character.
Under the Unicode definition, good Unicode-aware string comparers/searches, like the ones in SQL databases, should treat the following two cases as equal:
- "Á" character (single code point: \u00C1)
- "Á" character in decomposed form (combination of 2 code points: \u0041 + \u0301)
The second one is still a single "Á" character. It is the result of two other characters (a base character and a combining character), so its length in Unicode code points is 2.
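As a rough illustration of the two cases above, here is a minimal Python sketch. It assumes normalizing both strings to NFC before comparing is an acceptable stand-in for what a Unicode-aware comparer does. The helper name `canonically_equal` is made up for this example, and real database collations may do considerably more than this.

```python
import unicodedata

composed = "\u00C1"          # "Á" as a single precomposed code point
decomposed = "\u0041\u0301"  # "A" followed by COMBINING ACUTE ACCENT

def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A naive comparison looks at raw code points, so the two forms differ.
print(composed == decomposed)                   # False
print(len(composed), len(decomposed))           # 1 2
# After normalization both forms collapse to the same code point sequence.
print(canonically_equal(composed, decomposed))  # True
```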
Many programmers associate a typographic character with the programming data type "char" or "wchar_t", but that is not entirely true with Unicode, because:
* With wchar_t/UTF-16, there are surrogate pairs.
* There are combining characters.
* There are well-known, defined code point sequences.
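To make the three points above concrete, the following Python sketch counts the same text in three different ways. The function names are invented for this example, and `approx_characters` is only a crude approximation that skips combining marks. It is not the full grapheme cluster segmentation defined by Unicode, so it miscounts multi-code-point sequences such as flags.

```python
import unicodedata

def utf16_code_units(s: str) -> int:
    """Number of 16-bit units the string occupies in UTF-16."""
    return len(s.encode("utf-16-le")) // 2

def code_points(s: str) -> int:
    """Number of Unicode code points (what len() counts in Python 3)."""
    return len(s)

def approx_characters(s: str) -> int:
    """Crude estimate: count code points that are not combining marks."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)

samples = {
    "surrogate pair": "\U0001F600",            # one code point outside the BMP
    "combining mark": "A\u0301",               # base letter + combining accent
    "flag sequence": "\U0001F1E7\U0001F1F7",   # two regional indicator symbols
}

for name, text in samples.items():
    print(name, utf16_code_units(text), code_points(text), approx_characters(text))
# surrogate pair 2 1 1
# combining mark 2 2 1
# flag sequence 4 2 2
```

The last row shows the limitation: the flag is displayed as one character, but this simple approach still reports 2, which is why a proper text segmentation library is needed for the third bullet.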