From Surf Wiki (app.surf) — the open knowledge base

Z-variant

Glyphs with minor typographical differences


In Unicode, two glyphs are said to be Z-variants (often spelled zVariants) if they share the same etymology but have slightly different appearances and different Unicode code points. For example, the Unicode characters 說 and 説 are Z-variants. The notion of Z-variance applies only to the "CJKV scripts" (Chinese, Japanese, Korean and Vietnamese) and is a subtopic of Han unification.
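As a rough illustration, the pair above can be inspected in Python. This is a minimal sketch: the code points U+8AAA and U+8AAC are stated here as the assumed identities of the two characters shown, and `code_point` is a helper named for this example only.

```python
import unicodedata

def code_point(ch: str) -> str:
    """Format a single character as a U+XXXX label."""
    return f"U+{ord(ch):04X}"

a = "\u8AAA"  # 說
b = "\u8AAC"  # 説

# Two separate code points, so a plain string comparison sees two different characters.
distinct = a != b

# NFC normalization does not merge the pair either; the two stay different.
still_distinct = unicodedata.normalize("NFC", a) != unicodedata.normalize("NFC", b)

print(code_point(a), code_point(b), distinct, still_distinct)
```

Software that wants to treat the two as equivalent (for search, say) therefore needs its own variant table; ordinary comparison and normalization will not do it.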

Differences on the Z-axis

The Unicode philosophy of code point allocation for CJK languages is organized along three "axes." The X-axis represents differences in semantics; for example, the Latin capital A (A) and the Greek capital alpha (Α) are represented by two distinct code points in Unicode, and might be termed "X-variants" (though this term is not common). The Y-axis represents significant differences in appearance though not in semantics; for example, the traditional Chinese character māo "cat" (貓) and the simplified Chinese character (猫) are Y-variants.
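The X-axis and Y-axis examples can be checked with Python's standard `unicodedata` module. This is only a sketch; the escape sequences are the code points assumed for the four characters mentioned above.

```python
import unicodedata

# X-axis: letters that look alike but differ in meaning get different code points.
latin_a = "\u0041"      # A
greek_alpha = "\u0391"  # Α
names = [unicodedata.name(ch) for ch in (latin_a, greek_alpha)]

# Y-axis: traditional and simplified "cat" are also encoded separately.
traditional_cat = "\u8C93"  # 貓
simplified_cat = "\u732B"   # 猫

print(names)
print(latin_a == greek_alpha, traditional_cat == simplified_cat)
```

Both comparisons print `False`: in each pair the characters are distinct as far as string equality is concerned, whichever axis separates them.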

The Z-axis represents minor typographical differences. For example, the Chinese characters 莊 and 荘 are Z-variants, as are 說 and 説. The glossary at Unicode.org defines "Z-variant" as "Two CJK unified ideographs with identical semantics and unifiable shapes," where "unifiable" is taken in the sense of Han unification.

Thus, were Han unification perfectly successful, Z-variants would not exist. They exist in Unicode because it was deemed useful to be able to "round-trip" documents between Unicode and other CJK encodings such as Big5 and CCCII. For example, the character 莊 has CCCII encoding 21552D, while its Z-variant 荘 has CCCII encoding 2D552D. Therefore, these two variants were given distinct Unicode code points, so that converting a CCCII document to Unicode and back would be a lossless operation.
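The round-trip argument can be sketched with a toy converter. Everything here is hypothetical apart from the two CCCII values quoted above: the tables contain only those two entries, `round_trip` is a helper invented for this example, and the Unicode code points U+838A and U+8358 are the assumed identities of 莊 and 荘. Python's standard codecs do not include CCCII, so a real converter would need a full mapping table.

```python
# Toy mapping with separate code points for the two variants.
CCCII_TO_UNICODE = {0x21552D: "\u838A", 0x2D552D: "\u8358"}  # 莊, 荘
UNICODE_TO_CCCII = {ch: code for code, ch in CCCII_TO_UNICODE.items()}

# Hypothetical alternative in which both variants were unified onto one code point.
UNIFIED_TO_UNICODE = {0x21552D: "\u838A", 0x2D552D: "\u838A"}
UNIFIED_TO_CCCII = {"\u838A": 0x21552D}

def round_trip(codes, to_unicode, to_cccii):
    """Convert CCCII codes to a Unicode string and back to CCCII codes."""
    text = "".join(to_unicode[code] for code in codes)
    return [to_cccii[ch] for ch in text]

document = [0x21552D, 0x2D552D]

separate_ok = round_trip(document, CCCII_TO_UNICODE, UNICODE_TO_CCCII) == document
unified_ok = round_trip(document, UNIFIED_TO_UNICODE, UNIFIED_TO_CCCII) == document

print(separate_ok, unified_ok)  # True False
```

With separate code points the document survives the trip unchanged; under the unified table the second character comes back as 21552D, so the original distinction is lost.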

Confusion

There is some confusion over the exact definition of "Z-variant." For example, in an Internet Draft dated 2002, one finds "no" (不) and (不︀) described as "font variants," the term "Z-variant" being apparently reserved for interlanguage pairs such as the Mandarin Chinese "rabbit" (兔) and the Japanese "rabbit" (兎). However, the Unicode Consortium's Unihan database treats both pairs as Z-variants.
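For readers who want to see how the Unihan database records such pairs, the following is a minimal sketch of reading its `kZVariant` field. It assumes the tab-separated layout of the Unihan data files (code point, field name, value). The `SAMPLE` lines were written for this example to mirror pairs mentioned in this article and are not copied from the database, and `parse_z_variants` is a hypothetical helper.

```python
# Illustrative lines in the assumed Unihan layout; not real database content.
SAMPLE = (
    "# sample data for this sketch\n"
    "U+5154\tkZVariant\tU+514E\n"          # 兔 -> 兎
    "U+8AAA\tkZVariant\tU+8AAC\n"          # 說 -> 説
    "U+8C93\tkSimplifiedVariant\tU+732B\n"  # 貓 -> 猫 (not a Z-variant entry)
)

def parse_z_variants(text: str) -> dict[str, set[str]]:
    """Collect kZVariant entries as {character: {variant characters}}."""
    variants: dict[str, set[str]] = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, field, value = line.split("\t", 2)
        if field != "kZVariant":
            continue
        source = chr(int(code[2:], 16))
        for target in value.split():
            # Drop any "<source" annotation that may follow a code point.
            target = target.split("<", 1)[0]
            variants.setdefault(source, set()).add(chr(int(target[2:], 16)))
    return variants

z_variants = parse_z_variants(SAMPLE)
print(z_variants)
```

The simplified-variant line is skipped, so only the two Z-variant pairs end up in the result.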

References

  1. "Glossary".
  2. "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean". April 2004.
  3. "Unihan Database Lookup".
Info: Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
