Unicode subscripts and superscripts

Unicode denominator & numerator glyphs



Unicode has subscripted and superscripted versions of a number of characters, including a full set of Arabic numerals. These characters allow any polynomial, chemical formulas, and certain other equations to be represented in plain text without markup such as HTML or TeX.

The World Wide Web Consortium and the Unicode Consortium have made recommendations on the choice between using markup and using superscript and subscript characters:

When used in mathematical context (MathML) it is recommended to consistently use style markup for superscripts and subscripts [...] However, when super and sub-scripts are to reflect semantic distinctions, it is easier to work with these meanings encoded in text rather than markup, for example, in phonetic or phonemic transcription.

Uses

The intended use when these characters were added to Unicode was to produce true superscripts and subscripts so that chemical and algebraic formulas could be written without markup. Thus H₂O (using a subscript 2 character) is supposed to be identical to H<sub>2</sub>O (with subscript markup).
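The relationship between a subscript character and the ordinary digit can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the formula string H₂O is only an illustrative example. It shows that SUBSCRIPT TWO carries a compatibility mapping to the digit 2, so compatibility normalization (NFKC) folds the plain-text formula back to unmarked letters and digits.

```python
import unicodedata

formula = "H\u2082O"  # H, SUBSCRIPT TWO, O

# Compatibility normalization replaces the subscript with an ordinary digit,
# discarding the subscript distinction that markup would otherwise carry.
folded = unicodedata.normalize("NFKC", formula)

print(unicodedata.name("\u2082"))           # SUBSCRIPT TWO
print(unicodedata.decomposition("\u2082"))  # <sub> 0032
print(folded)                               # H2O
```

Search and comparison code that should treat both spellings alike would typically normalize first; code that must preserve the subscript should not.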

In reality, many fonts that include these characters ignore the Unicode definition and instead design the digits as mathematical numerator and denominator glyphs, which are aligned with the cap line and the baseline, respectively. When used with the solidus or the fraction slash, they produce an almost typographically correct diagonal fraction, such as ³⁄₄ for the ¾ glyph. Superscript and subscript markup does not produce a correct fraction (compare markup <sup>3</sup>/<sub>4</sub> with precomposed ¾). The change also makes the superscript letters useful for ordinal indicators, more closely matching the ª and º characters.

Unicode intended that diagonal fractions be rendered by a different mechanism: the fraction slash U+2044 is visually similar to the solidus, but when used with the ordinary digits (not the superscripts and subscripts), it instructs the layout system that a fraction such as 3⁄4 is to be rendered using automatic glyph substitution. (For a general overview and technical information on glyph substitution, though not specifically for fractions, see the GSUB (Glyph Substitution Table) chapter of the OpenType specification on the Microsoft Typography site.) User-end support was quite poor for a number of years, but fonts (such as Andika, Arno Pro, Brill, Brioso Pro, Calibri, Candara, Carlito, Cantarell, FiraGO, EB Garamond, Gentium, Lato, Linux Libertine, Noto Sans, Noto Serif, Open Sans and Yrsa), browsers (such as Chrome, Firefox and Falkon), word processors (such as LibreOffice Writer), desktop publishing software (such as Adobe InDesign and Scribus), and others increasingly support the intended Unicode behavior. How such a sequence actually renders depends on the browser and font in use. (See Slash (punctuation)#Fractions for rendering in various other fonts.)
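The two ways of writing a diagonal fraction described above can be sketched in Python. The function names here are hypothetical, chosen for this illustration only. The first function builds the sequence Unicode intended (ordinary digits around U+2044), and the second builds the numerator and denominator workaround from superscript and subscript digits.

```python
FRACTION_SLASH = "\u2044"

# Translation tables from ASCII digits to superscript and subscript digits.
SUPERSCRIPT_DIGITS = str.maketrans(
    "0123456789",
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079",
)
SUBSCRIPT_DIGITS = str.maketrans(
    "0123456789",
    "".join(chr(0x2080 + i) for i in range(10)),
)


def fraction_with_slash(numerator: int, denominator: int) -> str:
    """Ordinary digits around U+2044; shaping is left to the font."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


def fraction_with_scripts(numerator: int, denominator: int) -> str:
    """Superscript numerator and subscript denominator around U+2044."""
    top = str(numerator).translate(SUPERSCRIPT_DIGITS)
    bottom = str(denominator).translate(SUBSCRIPT_DIGITS)
    return f"{top}{FRACTION_SLASH}{bottom}"


print(fraction_with_slash(3, 4))    # 3, U+2044, 4
print(fraction_with_scripts(3, 4))  # U+00B3, U+2044, U+2084
```

Whether the first form is displayed as a stacked diagonal fraction depends entirely on the font and layout engine; the second form looks fraction-like even without such support.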

Superscripts and subscripts block

Main article: Superscripts and Subscripts (Unicode block)

The most common superscript digits (1, 2, and 3) were included in ISO-8859-1 and were therefore carried over into those code points in the Latin-1 range of Unicode. The remainder were placed, along with basic arithmetical symbols and later some Latin subscripts, in a dedicated block at U+2070 to U+209F. The table below shows these characters together. Each superscript or subscript character is preceded by a baseline x to show the height of subscripting/superscripting.
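Because the superscript digits are split between the Latin-1 range and the dedicated block, a digit cannot be converted with a single offset. A minimal Python sketch of the lookup follows (the function name is an assumption for this example); it uses the digit value stored in the Unicode Character Database to confirm each mapping.

```python
import unicodedata


def superscript_digit(d: int) -> str:
    """Return the superscript character for a digit 0-9."""
    if not 0 <= d <= 9:
        raise ValueError("expected a single digit 0-9")
    # Superscript 1, 2 and 3 were inherited from ISO-8859-1.
    latin1 = {1: "\u00b9", 2: "\u00b2", 3: "\u00b3"}
    if d in latin1:
        return latin1[d]
    # The rest sit at U+2070 and U+2074..U+2079.
    return chr(0x2070 + d)


for d in range(10):
    ch = superscript_digit(d)
    # unicodedata.digit reads the digit value recorded for the character.
    assert unicodedata.digit(ch) == d
    print(f"{d} -> U+{ord(ch):04X} {unicodedata.name(ch)}")
```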

Six code points in the "Superscripts and Subscripts" block are unassigned and remain available for future characters. Three of these (209D, 209E, and 209F) were provisionally assigned to new subscript characters, namely Latin lowercase w, y, and z.

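        0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
U+00Bx  -   -   x²  x³  -   -   -   -   -   x¹  -   -   -   -   -   -
U+207x  x⁰  xⁱ  -   -   x⁴  x⁵  x⁶  x⁷  x⁸  x⁹  x⁺  x⁻  x⁼  x⁽  x⁾  xⁿ
U+208x  x₀  x₁  x₂  x₃  x₄  x₅  x₆  x₇  x₈  x₉  x₊  x₋  x₌  x₍  x₎  -
U+209x  xₐ  xₑ  xₒ  xₓ  xₔ  xₕ  xₖ  xₗ  xₘ  xₙ  xₚ  xₛ  xₜ  -   -   -

(A dash marks a code point that is unassigned or that is not a superscript or subscript character.)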
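The unassigned code points in the block can be listed programmatically. The sketch below is an illustration, not a definitive inventory: it relies on the copy of the Unicode Character Database bundled with the running Python interpreter, so the result changes once a newer Unicode version assigns more of the block. The function name is hypothetical.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2070, 0x209F


def unassigned_in_block(start: int = BLOCK_START, end: int = BLOCK_END) -> list[str]:
    """List code points for which the local Unicode database has no name."""
    return [
        f"U+{cp:04X}"
        for cp in range(start, end + 1)
        if unicodedata.name(chr(cp), None) is None
    ]


# Print the database version alongside the result, since the two go together.
print(unicodedata.unidata_version, unassigned_in_block())
```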

Other superscript and subscript characters

Unicode also includes code points for subscript and superscript characters that are intended for semantic usage, in the following blocks:

Superscript

  • The Latin-1 Supplement block contains the feminine and masculine ordinal indicators ª and º.
  • The Latin Extended-C block contains one superscript.
  • The Latin Extended-D block contains seven superscripts.
  • The Latin Extended-E block contains five superscripts.
  • The Latin Extended-F block is entirely superscript IPA letters.
  • The Spacing Modifier Letters block has superscripted letters and symbols used for phonetic transcription.
  • The Phonetic Extensions block has several superscripted letters and symbols (Latin/IPA, Greek, Cyrillic, and other). These are intended to indicate secondary articulation.
  • The Phonetic Extensions Supplement block has several more (Latin/IPA and Greek).
  • The Cyrillic Extended-B block contains two Cyrillic superscripts.
  • The Cyrillic Extended-D block contains many Cyrillic superscripts.
  • The Georgian block contains one superscripted Mkhedruli letter.
  • The Kanbun block has superscripted annotation characters used in Japanese copies of Classical Chinese texts.
  • The Tifinagh block has one superscript letter.
  • The Unified Canadian Aboriginal Syllabics block and its Extended blocks contain several mostly consonant-only letters, called Finals, that indicate a syllable coda, along with some characters, known as Medials, that indicate a syllable medial.

Combining superscript

  • The Combining Diacritical Marks block contains medieval superscript letter diacritics. These letters are written directly above other letters appearing in medieval Germanic manuscripts, and so these glyphs do not include spacing, for example uͤ.
  • The Combining Diacritical Marks Extended block contains three combining insular letters for the Middle English Ormulum.
  • The Combining Diacritical Marks Supplement block contains additional medieval superscript letter diacritics, enough to complete most of the basic lowercase Latin alphabet, along with a few small capitals and ligatures and additional Latin and Greek letters.
  • The Cyrillic Extended-A and -B blocks contain multiple medieval superscript letter diacritics, enough to complete the basic lowercase Cyrillic alphabet used in Church Slavonic texts, as well as an additional ligature (ст).
  • The Cyrillic Extended-D block has one additional combining character, for і.
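The combining superscript letters behave like any other nonspacing mark: they follow their base letter in the text and are drawn above it. A short Python sketch using the uͤ example from the list above shows the properties recorded for the combining letter.

```python
import unicodedata

COMBINING_SMALL_E = "\u0364"  # COMBINING LATIN SMALL LETTER E

# Base letter first, then the combining letter that is drawn above it.
stacked = "u" + COMBINING_SMALL_E

print(stacked, len(stacked))                     # one visible cluster, two code points
print(unicodedata.name(COMBINING_SMALL_E))       # COMBINING LATIN SMALL LETTER E
print(unicodedata.category(COMBINING_SMALL_E))   # Mn (nonspacing mark)
print(unicodedata.combining(COMBINING_SMALL_E))  # 230 (attaches above the base)
```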

Subscript

  • The Latin Extended-C block contains one subscript.
  • The Phonetic Extensions block has several subscripted letters and symbols (Latin/IPA and Greek).
  • The Cyrillic Extended-D block also contains many Cyrillic subscripts.

Combining subscript

  • The Combining Diacritical Marks Supplement block contains a combining subscript.
  • The Combining Diacritical Marks Extended block contains two combining letters for linguistic transcriptions of Scots.

Latin, Greek, Cyrillic, and IPA tables

A superscript small-cap W may be distinct from a superscript lowercase w in italic typeface, as in some phonetic notation.

Consolidated, the Unicode standard contains superscript and subscript versions of a subset of Latin, Greek and Cyrillic letters. Here they are arranged in alphabetical order for comparison (or for copy and paste convenience). Since these characters appear in different Unicode ranges, they may not appear to be the same size or position due to font substitution by the browser. Shaded cells mark petite capitals that are not very distinct from minuscules in Roman typeface, but they may be distinct in italic typeface, as is used in some phonetic notation.

Little punctuation is encoded. Parentheses are shown in the basic superscript block above, and the exclamation mark is shown in the IPA table below. In a supporting font, a question mark may be created with a superscript gelded question mark and a combining dot below.

Latin superscript and subscript letters

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Superscript capital
Superscript small capital
Superscript minuscule
Overscript small capital
Overscript minuscule
Subscript minuscule
Underscript minuscule
  • Superscript versions of petite capital A, D, E and P, of ƀ, and subscript versions of w, y and z are scheduled to be released with version 18 of the Unicode Standard.

§ Cyrillic 𞀹 𞀻 𞁀, ◌ⷡ ◌ⷩ ◌ⷦ ◌ⷮ ◌ꙷ and 𞁞 might be substituted for these letters.

Æ Ä Ƀ Ǝ Ə Ħ Ŋ Ƞ Ö Ü
Superscript capital: (ᴬ̈)
Superscript minuscule: 𐞃 (ᵃ̈)
Overscript minuscule: ◌ᷔ ◌ᷲ ◌ᷪ
Subscript minuscule

Some of these superscript capitals are small caps in the source documents in the Unicode proposals. Superscript Ä, Ö, Ü (in parentheses) are composed of the base letter and a combining tréma.
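The composition described above can be written out explicitly. The sketch below assumes the base is U+1D2C (MODIFIER LETTER CAPITAL A) followed by U+0308 (COMBINING DIAERESIS); since no precomposed superscript Ä exists, canonical composition (NFC) leaves the pair as two code points.

```python
import unicodedata

MODIFIER_CAPITAL_A = "\u1d2c"   # MODIFIER LETTER CAPITAL A
COMBINING_DIAERESIS = "\u0308"  # COMBINING DIAERESIS (trema)

superscript_a_umlaut = MODIFIER_CAPITAL_A + COMBINING_DIAERESIS

# NFC has nothing to compose this pair into, so the sequence is unchanged.
composed = unicodedata.normalize("NFC", superscript_a_umlaut)

print(superscript_a_umlaut)
print([f"U+{ord(c):04X}" for c in composed])  # ['U+1D2C', 'U+0308']
```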

Except for the iota subscript, which has use in Greek text, the modifier Greek letters are intended as phonetic characters in Latin-script text. Shaded cells are indistinguishable from Latin letters, and so would not be expected to have distinctive use in Latin text or to be supported by Unicode.

Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω
Superscript minuscule
Overscript minuscule
Subscript minuscule
Underscript minuscule
  • Superscript versions of Greek psi and omega are scheduled for version 18 of the Unicode Standard.

Cyrillic modifier characters are intended for use in Cyrillic text.

А Б В Г Д Е Ж З И К Л М Н О П Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я
Superscript: 𞀰
Overscript: ◌ⷶ
Subscript: 𞁑
Ә Ґ Є Ѕ І Ї Ј Ө Ҫ Ү Ұ Џ Ӏ
Superscript
Overscript
Subscript
Ѡ Ѣ Ѥ Ѧ Ѫ Ѭ Ѳ
Superscript
Overscript: ◌ⷹ

Superscript and subscript ё, ї, й, ў, etc. are handled with diacritics. Many of the Cyrillic characters were added to the Cyrillic Extended-D block, which was added to the free Gentium and Andika fonts with version 6.2 in February 2023.

See also Unicode Small caps, Fullwidths, and Mathematical alphanumerics.

Superscript IPA

The Latin Extended-F block was created for the remaining superscript IPA letters. They are supported by the free Gentium and Andika fonts. Additional superscript characters for historical and para-IPA letters are scheduled to be released with version 18 of the Unicode Standard in 2026.
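One practical consequence of the Latin Extended-F block is that its characters lie outside the Basic Multilingual Plane. The Python sketch below assumes the block spans U+10780 to U+107BF (the range is taken from the Unicode block listing, not from this article) and uses hypothetical helper names. It shows that such a character needs two UTF-16 code units, which software that counts UTF-16 units has to account for.

```python
# Assumed block range: U+10780..U+107BF.
LATIN_EXTENDED_F = range(0x10780, 0x107C0)


def in_latin_extended_f(ch: str) -> bool:
    """True if the single character falls in the assumed block range."""
    return ord(ch) in LATIN_EXTENDED_F


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode the character."""
    return len(ch.encode("utf-16-le")) // 2


supplementary = chr(0x107AF)  # a code point listed in the consonant table
bmp = "\u1d50"                # a superscript letter from the Basic Multilingual Plane

print(in_latin_extended_f(supplementary), utf16_units(supplementary))  # True 2
print(in_latin_extended_f(bmp), utf16_units(bmp))                      # False 1
```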

Consonant letters

The Unicode characters for superscript (modifier) IPA and extIPA consonant letters are as follows. The entire Latin Extended-F block is dedicated to superscript IPA. Characters for sounds with secondary articulation are set off in parentheses and placed below the letter for the primary articulation. Asterisks mark superscript characters scheduled for release with Unicode 18 in September 2026.

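Columns (place of articulation): Bilabial, Labiodental, Dental, Alveolar, Postalveolar, Retroflex, Palatal, Velar, Uvular, Pharyngeal, Glottal
Rows (manner of articulation): Nasal, Plosive, Affricate, Fricative, Approximant, Tap/flap, Trill, Lateral fricative, Lateral approximant, Lateral tap/flap, Implosive, Click release, Lateral click release, Percussive
Code points, in table reading order: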
1D50
1DAC
207F
()
()
1DAF
1DAE
1D51
1DB0
1D56
1D47
1D57
()
1DB5
1D48
()
()
()
107AF
1078B
1D9C
1DA1
1D4F/
1DA2/1D4D
107A5
10792
107B3
02C0
107AC
10787
107AE
()
107AB
1078A
()
10789
107AD
()
10788
()
1DB2
1D5D
1DA0
1D5B
1DBF
1D9E
02E2
()
1DBB
()
1DB4
()
1D9D
1DBE
()
1DBD
1DB3
()
1DBC
()
1D9C + 0327 (Note: superscript ç is composed of superscript c and a combining cedilla, which should display properly in a good font. Superscript c was specifically requested for this purpose in Unicode proposal L2/03-180.)
1DA8
02E3
()
10797
02E0
1D61
02B6
10795
()
10790
02E4 (Note: U+02E4 is the superscript variant of ʕ and is defined for IPA use. The similar character U+02C1 is a reversed U+02C0, perhaps a gelded reversed question mark. Fonts are inconsistent in whether they look different and what the difference is.)
02B0
()
02B1
1DB9
02B4
02B5
02B2
()
1DA3
()
AB69
1DAD
()
02B7
107B0
107A9
107A8
10784
02B3
107AA
10796
107B4
1079B
()
10799
1079E
()
1079A
1079D
1079F
107A1
1079C
02E1
()
1DAA
()
1DA9
107A0
1DAB
() (Note: in Microsoft fonts, one of these superscripts was erroneously designed as a different superscript letter.)
AB5E
107A6
107A7
10785
1078C
1078D
10798
10793
10794
107B5
107B6
A71D
107B9
107B8()
107B7
A71E (Note: U+A71D and U+A71E were adopted as the Africanist equivalents of the IPA characters downstep and upstep. The correspondence of U+A71D to the IPA click letter ǃ is thus accidental. Coincidentally, U+A71E serves as the superscript variant of the extIPA percussive consonant ¡; the other percussive letters, ʬ and ʭ, do not have superscript support in Unicode.)

The spacing diacritic for ejective consonants, U+02BC, works with superscript letters despite not being superscript itself. If a distinction needs to be made, the combining apostrophe U+0315 may be used. The spacing diacritic should be used for a baseline letter with a superscript release, where the scope of the apostrophe includes the non-superscript letter. The combining apostrophe U+0315 might instead be used to indicate a weakly articulated ejective consonant, where the whole consonant is written as a superscript, or together with U+02BC when separate apostrophes have scope over the base and modifier letters.
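The difference between the two apostrophes is visible in their character properties: U+02BC is a spacing modifier letter, while U+0315 is a nonspacing combining mark. The Python sketch below prints those properties and builds two sequences. The transcriptions are hypothetical, chosen only to show the encoding, and are not taken from the article.

```python
import unicodedata

MODIFIER_APOSTROPHE = "\u02bc"   # spacing ejective mark
COMBINING_APOSTROPHE = "\u0315"  # combining apostrophe (COMBINING COMMA ABOVE RIGHT)

for ch in (MODIFIER_APOSTROPHE, COMBINING_APOSTROPHE):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Hypothetical example 1: baseline t, superscript s release, spacing apostrophe
# whose scope covers the whole sequence.
baseline_with_release = "t" + "\u02e2" + MODIFIER_APOSTROPHE

# Hypothetical example 2: a consonant written wholly as a superscript (k),
# marked with the combining apostrophe.
weak_ejective = "\u1d4f" + COMBINING_APOSTROPHE

print(baseline_with_release, weak_ejective)
```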

Spacing diacritics, as in tʲ, cannot be secondarily superscripted in plain text. (In this instance, the old IPA letter for tʲ, ƫ, has a superscript variant in Unicode, U+1DB5 ᶵ, but that is not generally the case.)

Among older letters, the most common letters with palatal hook are supported; they are displayed in the table above. IPA once had an idiosyncratic curl on some of the palatalized fricative letters; their superscript forms have been accepted for version 18 of the Unicode Standard. Old-style click letters and two retired letters have also been accepted for version 18 of the Unicode Standard. The Teuthonista letter ꜧ (U+A727) is also an old graphic variant of ɧ. Its superscript is supported at ꭜ (U+AB5C).

Among para-IPA letters, superscript variants of Sinological letters, of the Bantuist labio-dental plosives, and of central semivowels have been accepted for version 18 of the Unicode Standard.

Vowel letters

The Unicode characters for superscript (modifier) IPA vowel letters, plus a pair of extended letters found in English dictionaries, are as follows. Recently retired alternative letters are also supported; they are set off in parentheses and placed below the modern IPA letters. Asterisks mark superscript letters scheduled for release with Unicode 18 in September 2026.

Columns: Front, Central, Back
Rows: Close, Near-close, Close-mid, Mid, Open-mid, Near-open, Open
Code points, in table reading order:
2071
02B8
1DA4
1DB6
1D5A
1D58
1DA6
()
1DA5
107B2
()
1DA7
()
()
1DB7
()
107A4
1D49
107A2
1078E
1DB1
10791
1D52
1D4A
1D4B
A7F9
1D9F
() and are mistakenly both defined as modifier in their Unicode properties, though the former is named as the modifier variant of .
1D4C
1078F
1DBA
1D53
10783
107A3
1D44
1D45
1D9B
1D43

The precomposed Unicode rhotic vowel letters are not directly supported as superscripts. The rhotic diacritic U+02DE should be used instead.
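A superscript rhotic vowel is therefore a two-character sequence. The Python sketch below assumes superscript schwa (U+1D4A, listed in the vowel table) as the base, followed by the rhotic hook U+02DE.

```python
import unicodedata

SUPERSCRIPT_SCHWA = "\u1d4a"  # MODIFIER LETTER SMALL SCHWA
RHOTIC_HOOK = "\u02de"        # MODIFIER LETTER RHOTIC HOOK

# Superscript vowel first, then the spacing rhotic hook.
superscript_rhotic_schwa = SUPERSCRIPT_SCHWA + RHOTIC_HOOK

print(superscript_rhotic_schwa)
print([unicodedata.name(c) for c in superscript_rhotic_schwa])
```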

Among older letters, ᴜ (U+1D1C), a graphic variant of ʊ, is supported at ᶸ (U+1DB8). The briefly resurrected vowel letter ʚ (U+029A) is not supported as a superscript; only its reversed replacement is.

Among para-IPA letters, Sinological superscript vowel letters have been accepted for version 18 of the Unicode Standard.

Length marks

The two length marks are also supported:

Long: 10781
Half-long: 10782

These are used to add length to another superscript, for example to mark long aspiration.
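As a minimal Python sketch of the usage just described, a superscript length mark is simply appended to the superscript letter it lengthens. The example assumes superscript h (U+02B0) as the aspiration letter; the variable names are illustrative.

```python
SUPERSCRIPT_H = "\u02b0"             # aspiration
SUPERSCRIPT_LONG = chr(0x10781)      # superscript long mark
SUPERSCRIPT_HALF_LONG = chr(0x10782) # superscript half-long mark

# The length mark follows the superscript it modifies.
long_aspiration = SUPERSCRIPT_H + SUPERSCRIPT_LONG
half_long_aspiration = SUPERSCRIPT_H + SUPERSCRIPT_HALF_LONG

for text in (long_aspiration, half_long_aspiration):
    print(text, [f"U+{ord(c):04X}" for c in text])
```

Both length marks lie outside the Basic Multilingual Plane, so they display only in fonts that cover those code points.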

Wildcards

Superscript wildcards (full caps) are largely supported, covering for example a prenasalized consonant, a prestopped nasal, fricative release, sibilant release (added to Unicode in 2025), an epenthetic plosive, a tone-bearing syllable, liquid or lateral release, rhotic or resonant release, an off-glide/diphthong, and a fleeting vowel. A superscript wildcard for a fleeting/epenthetic click is not included in the Unicode Standard. Other basic Latin superscript wildcards for tone and weak indeterminate sounds are mostly supported. (See table in the Latin section.)

Combining marks and subscripts

In addition to superscripts, a very few IPA letters beyond the basic Latin alphabet have combining forms or are supported as subscripts:

ɑ æ β ç ð ə ɣ ʃ ʍ χ ʔ ʼ
Overscript
Subscript
Underscript

Composite characters

Primarily for compatibility with earlier character sets, Unicode contains a number of characters that compose super- and subscripts with other symbols. In most fonts these render much better than attempts to construct these symbols from the above characters or by using markup.

  • The Latin-1 Supplement block contains the precomposed fractions ¼, ½, and ¾. The copyright and registered trademark signs (© and ®) are also in this block; they are set as superscript in some fonts.
  • The General Punctuation block contains the permille sign ‰ and the per-ten-thousand sign ‱, and Basic Latin has the percent sign %.
  • The Number Forms block contains several precomposed fractions: ⅐ ⅑ ⅒ ⅓ ⅔ ⅕ ⅖ ⅗ ⅘ ⅙ ⅚ ⅛ ⅜ ⅝ ⅞ ⅟ ↉.
  • The Letterlike Symbols block contains a few symbols composed of subscript and superscript characters: ℀ ℁ ℅ ℆ № ℠ ™ ⅍.
  • The Enclosed Alphanumeric Supplement block contains three superscript abbreviations: MC for marque de commerce (trademark) and MD for marque déposée (registered trademark), both used in Canada, and MR for marca registrada (registered trademark) in Spanish- and Portuguese-speaking countries.
  • The Miscellaneous Technical block has one additional subscript, a subscript 10 (⏨), for the purpose of scientific notation.
  • The Unified Canadian Aboriginal Syllabics block and its Extended blocks contain several letters composed with superscripted letters to indicate extended sound values.
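The compatibility origin of many of these composite characters shows up in their normalization behavior. The Python sketch below applies NFKC to a few of them: the precomposed fraction unfolds to digits around the fraction slash U+2044, the letterlike symbols unfold to plain letters, and the permille sign, which has no compatibility mapping, is left unchanged. The selection of sample characters is only illustrative.

```python
import unicodedata

samples = {
    "\u00bd": "VULGAR FRACTION ONE HALF",
    "\u2122": "TRADE MARK SIGN",
    "\u2120": "SERVICE MARK",
    "\u2116": "NUMERO SIGN",
    "\u2030": "PER MILLE SIGN",
}

# Map each sample character to its NFKC-folded form.
folded = {ch: unicodedata.normalize("NFKC", ch) for ch in samples}

for ch, name in samples.items():
    code_points = " ".join(f"U+{ord(c):04X}" for c in folded[ch])
    print(f"{name}: {ch} -> {folded[ch]} ({code_points})")
```

Folding of this kind is lossy, which is one reason the precomposed forms are usually kept as they are in running text.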


References


  1. "UCD: UnicodeData.txt". The Unicode Standard.
  2. (16 May 2007). "Unicode in XML and other Markup Languages". W3C.
  3. (December 27, 2021). "fraction | Dart Package".
  4. (March 30, 2021). "MathML | General layout elements | Fractions".
  5. (May 16, 2007). "Fraction Slash". W3C.
  6. (2025-01-27). "Approved Minutes of UTC Meeting 181". Unicode Consortium.
  7. "UCD: Scripts.txt". The Unicode Standard.
  8. (October 5, 2020). "L2/20-268: Revised proposal to add ten characters for Middle English to the UCS".
  9. (2024-11-26). "Additional draft repertoire for provisionally assigned code points for Unicode". Unicode Consortium.
  10. Kirk Miller & Michael Ashby, L2/20-253R: Unicode request for IPA modifier letters (b), non-pulmonic. https://www.unicode.org/L2/L2020/20253r-mod-ipa-b.pdf
  11. Kirk Miller & Michael Ashby, L2/20-252R: Unicode request for IPA modifier-letters (a), pulmonic. https://www.unicode.org/L2/L2020/20252r-mod-ipa-a.pdf
  12. Kirk Miller. (January 30, 2024). "L2/24-081: Latin Phonetic The for Middle Tilde".
  13. Silva, Eduardo Marín. (March 1, 2017). "L2/17-066R: Proposal to encode the Marca Registrada sign".
Info: Wikipedia Source

This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
