CJK Unified Ideographs Extension B

Field	Value
Range	U+20000..U+2A6DF
Script	Han
Added in Unicode 3.1	42,711
Added in Unicode 13.0	7
Added in Unicode 14.0	2

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese that were submitted to the Ideographic Research Group between 1998 and 2000. It also contains seven gongche characters for kunqu, added in Unicode 13.0, and two characters for the Macao Supplementary Character Set, added in Unicode 14.0.
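As a rough illustration of the block boundaries and per-version counts given above, the following Python sketch tests whether a character falls in the block and checks that the three additions account for every code point in the range. The names `in_extension_b`, `EXT_B_START`, `EXT_B_END`, and `ADDED_PER_VERSION` are illustrative and not part of any Unicode API.

```python
# Block boundaries and per-version counts as stated in this article.
EXT_B_START = 0x20000
EXT_B_END = 0x2A6DF

ADDED_PER_VERSION = {"3.1": 42711, "13.0": 7, "14.0": 2}


def in_extension_b(ch: str) -> bool:
    """Return True if the single character ch lies in the Extension B range."""
    return EXT_B_START <= ord(ch) <= EXT_B_END


# Number of code points covered by the range (both ends inclusive).
block_size = EXT_B_END - EXT_B_START + 1

# Number of characters added across the three versions listed above.
total_assigned = sum(ADDED_PER_VERSION.values())
```

Both `block_size` and `total_assigned` come to 42,720, which is consistent with the range being fully assigned as of Unicode 14.0.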

The block has dozens of variation sequences defined for standardized variants.

It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD). These sequences specify the desired glyph variant for a given Unicode character.
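Both kinds of sequence consist of a base character followed by a variation selector. The sketch below classifies a two-code-point string by shape only. The selector ranges are not stated in this article; they are taken from the Unicode Standard (VS1 to VS16 at U+FE00..U+FE0F, VS17 to VS256 at U+E0100..U+E01EF). The function name `classify_sequence` and its return strings are illustrative. The check does not consult the actual registries (the StandardizedVariants.txt and IVD_Sequences.txt data files), so a sequence with a valid shape is not necessarily defined or registered.

```python
EXT_B = range(0x20000, 0x2A6DF + 1)
STANDARDIZED_SELECTORS = range(0xFE00, 0xFE0F + 1)    # VS1..VS16
IDEOGRAPHIC_SELECTORS = range(0xE0100, 0xE01EF + 1)   # VS17..VS256


def classify_sequence(seq: str) -> str:
    """Classify a string as a variation sequence on an Extension B base, by shape only."""
    if len(seq) != 2:
        return "not a variation sequence"
    base, selector = ord(seq[0]), ord(seq[1])
    if base not in EXT_B:
        return "base outside Extension B"
    if selector in STANDARDIZED_SELECTORS:
        return "standardized-variant shape"
    if selector in IDEOGRAPHIC_SELECTORS:
        return "ideographic-variation shape"
    return "not a variation sequence"
```

For example, `classify_sequence("\U00020000\U000E0100")` returns `"ideographic-variation shape"`; whether that particular sequence is registered in the IVD is not asserted here.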

It was the only CJK Unified Ideographs Extension block with a UCS2003 source identifier. Because Extension B contained too many characters, the original code charts were produced with a single glyph per character for all regions. The glyphs were designed by Beijing Zhongyi Electronic Ltd. After multi-column code charts were introduced in Unicode 5.2, the original glyphs were retained under the UCS2003 source identifier; they were removed in Unicode 14.0 as both redundant and misleading. The glyphs are packaged in the "SimSun-ExtB" font distributed with the Simplified Chinese versions of Windows, and they do not match the regional glyphs for Mainland China.

Known issues

Unifiable variants and exact duplicates in Extension B

Hundreds of glyph variants were encoded in CJK Unified Ideographs Extension B. In addition to this deliberate encoding of close glyph variants, seven exact duplicates (where the same character has inadvertently been encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:

U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
U+21018 𡀘 = U+2103C 𡀼 : same glyph shapes
U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23")
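One possible use of the list above is to treat each pair as equivalent when searching or comparing text. The following Python sketch builds a translation table from the nine pairs and folds the second member of each pair onto the first. The direction of folding and the names `DUPLICATE_PAIRS`, `FOLD_TABLE`, and `fold_duplicates` are illustrative assumptions, not a mapping defined by Unicode. For the two semi-duplicates, folding may be undesirable where the regional glyph difference matters.

```python
# Pairs copied from the list above, in the order (first member, second member).
# Folding the second member onto the first is an illustrative choice only.
DUPLICATE_PAIRS = [
    (0x34A8, 0x20457),   # semi-duplicate
    (0x3DB7, 0x2420E),
    (0x8641, 0x27144),   # semi-duplicate
    (0x204F2, 0x23515),
    (0x21018, 0x2103C),
    (0x249BC, 0x249E9),
    (0x24BD2, 0x2A415),
    (0x26842, 0x26866),
    (0xFA23, 0x27EAF),
]

# str.translate accepts a dict mapping code points to code points.
FOLD_TABLE = {second: first for first, second in DUPLICATE_PAIRS}


def fold_duplicates(text: str) -> str:
    """Replace the second member of each listed pair with the first."""
    return text.translate(FOLD_TABLE)
```

Two strings can then be compared with `fold_duplicates(a) == fold_duplicates(b)`; characters outside the nine pairs are left unchanged.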

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension B block:

Version	Final code points	Count	WG2 ID	IRG ID
3.1	U+20000..2A6D6	42,711	N2109	N674
			N2105	N675
			N2144	N713R
			N2103	
			N2298	N758
			N2347	N785
				N787
			N2427	
			N2448	N924
			N2518	
			N2695	N1026
			N2774R	N1064
			N2830	
			N3285	
				N1406
			N4111	
			N4103	
			N4621	
				N2202
			N4974	N2301
			N4987	
			N4988	
				N2336
			N5016	N2349
			N5086	N2379
			N5068	
			N5107	
			N5083	N2391
			N5082	
				N2508
				N2512R
				N2520
				N2580R
				N2556R2
				N2588R
				N2609
				N2642
				N2651
				N2779R2
13.0	U+2A6D7..2A6DD	7		N2296
				N2299
			N4967	
			N5122	
			N5106	
14.0	U+2A6DE..2A6DF	2	N5140	N2437

Proposed code points and character names may differ from the final code points and names.

Info: Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
