How much room is left in Unicode's blocks?

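Researched from www.unicode.org/Public/UCD/latest/ucd/Blocks.txt. The version is Blocks-7.0.0.txt, Date: 2014-04-03, 23:23:00 GMT [RP, KW]. All the information is right there, but it's wall-to-wall hexadecimal and hard to make sense of, so all I did was convert it to decimal. Each row of the list below reads: decimal start~end (end minus start) / hex start...hex end; block name.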

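Unsurprisingly, the low code points are almost completely filled. CJK Unified Ideographs takes up way too much space: most other blocks are around 100 code points, and this one is 20,000, lol. No wonder 65,536 characters wasn't enough. Hangul grabs a fair amount too. I thought that was a script you build by combining pieces, so how much of it is actually encoded? And CJK Unified Ideographs Extension B is something like 40,000 characters, lol. It feels like "We have surrogate pairs now, so no need to hold back anymore. These are all different characters, so please add them ^^".

Counting everything reachable with surrogate pairs, the whole Unicode code space is 1,112,064 characters. The blocks allocated so far add up to 256,084 by the script's count. (It sums end minus start for each block, so each of the 252 blocks is counted one short and the exact figure is 256,336. That figure also includes the surrogate and private use blocks.) That's not IPv6-level headroom, but it looks like there's enough room to keep piling on emoji.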
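
The 1,112,064 figure is easy to re-derive with a bit of arithmetic. Here is a minimal sketch that should run in Node.js or a browser console; the variable names are my own, and the block total is simply the "total=" value printed at the end of the list in this post.

```javascript
// Code space arithmetic. All variable names here are illustrative.
const bmp = 0x10000;                        // U+0000..U+FFFF: 65,536 code points
const highSurrogates = 0xDBFF - 0xD800 + 1; // 1,024
const lowSurrogates = 0xDFFF - 0xDC00 + 1;  // 1,024

// Each (high, low) surrogate pair addresses one supplementary code point.
const supplementary = highSurrogates * lowSurrogates; // 1,048,576
const allCodePoints = bmp + supplementary;            // 1,114,112 = 0x110000

// Surrogates never stand for characters on their own, so subtract them.
const usable = allCodePoints - (highSurrogates + lowSurrogates); // 1,112,064

// The "total=" value from the list in this post (sum of end minus start).
const blockSpanTotal = 256084;

console.log("usable code points: " + usable);
console.log("share inside a block: " + (blockSpanTotal / usable * 100).toFixed(1) + "%");
```

Either way you count the block total, a bit under a quarter of the space has been handed out to blocks.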

      0~    127(    127)/0000000...000007F;Basic Latin
    128~    255(    127)/0000080...00000FF;Latin-1 Supplement
    256~    383(    127)/0000100...000017F;Latin Extended-A
    384~    591(    207)/0000180...000024F;Latin Extended-B
    592~    687(     95)/0000250...00002AF;IPA Extensions
    688~    767(     79)/00002B0...00002FF;Spacing Modifier Letters
    768~    879(    111)/0000300...000036F;Combining Diacritical Marks
    880~   1023(    143)/0000370...00003FF;Greek and Coptic
   1024~   1279(    255)/0000400...00004FF;Cyrillic
   1280~   1327(     47)/0000500...000052F;Cyrillic Supplement
   1328~   1423(     95)/0000530...000058F;Armenian
   1424~   1535(    111)/0000590...00005FF;Hebrew
   1536~   1791(    255)/0000600...00006FF;Arabic
   1792~   1871(     79)/0000700...000074F;Syriac
   1872~   1919(     47)/0000750...000077F;Arabic Supplement
   1920~   1983(     63)/0000780...00007BF;Thaana
   1984~   2047(     63)/00007C0...00007FF;NKo
   2048~   2111(     63)/0000800...000083F;Samaritan
   2112~   2143(     31)/0000840...000085F;Mandaic
   2208~   2303(     95)/00008A0...00008FF;Arabic Extended-A
   2304~   2431(    127)/0000900...000097F;Devanagari
   2432~   2559(    127)/0000980...00009FF;Bengali
   2560~   2687(    127)/0000A00...0000A7F;Gurmukhi
   2688~   2815(    127)/0000A80...0000AFF;Gujarati
   2816~   2943(    127)/0000B00...0000B7F;Oriya
   2944~   3071(    127)/0000B80...0000BFF;Tamil
   3072~   3199(    127)/0000C00...0000C7F;Telugu
   3200~   3327(    127)/0000C80...0000CFF;Kannada
   3328~   3455(    127)/0000D00...0000D7F;Malayalam
   3456~   3583(    127)/0000D80...0000DFF;Sinhala
   3584~   3711(    127)/0000E00...0000E7F;Thai
   3712~   3839(    127)/0000E80...0000EFF;Lao
   3840~   4095(    255)/0000F00...0000FFF;Tibetan
   4096~   4255(    159)/0001000...000109F;Myanmar
   4256~   4351(     95)/00010A0...00010FF;Georgian
   4352~   4607(    255)/0001100...00011FF;Hangul Jamo
   4608~   4991(    383)/0001200...000137F;Ethiopic
   4992~   5023(     31)/0001380...000139F;Ethiopic Supplement
   5024~   5119(     95)/00013A0...00013FF;Cherokee
   5120~   5759(    639)/0001400...000167F;Unified Canadian Aboriginal Syllabics
   5760~   5791(     31)/0001680...000169F;Ogham
   5792~   5887(     95)/00016A0...00016FF;Runic
   5888~   5919(     31)/0001700...000171F;Tagalog
   5920~   5951(     31)/0001720...000173F;Hanunoo
   5952~   5983(     31)/0001740...000175F;Buhid
   5984~   6015(     31)/0001760...000177F;Tagbanwa
   6016~   6143(    127)/0001780...00017FF;Khmer
   6144~   6319(    175)/0001800...00018AF;Mongolian
   6320~   6399(     79)/00018B0...00018FF;Unified Canadian Aboriginal Syllabics Extended
   6400~   6479(     79)/0001900...000194F;Limbu
   6480~   6527(     47)/0001950...000197F;Tai Le
   6528~   6623(     95)/0001980...00019DF;New Tai Lue
   6624~   6655(     31)/00019E0...00019FF;Khmer Symbols
   6656~   6687(     31)/0001A00...0001A1F;Buginese
   6688~   6831(    143)/0001A20...0001AAF;Tai Tham
   6832~   6911(     79)/0001AB0...0001AFF;Combining Diacritical Marks Extended
   6912~   7039(    127)/0001B00...0001B7F;Balinese
   7040~   7103(     63)/0001B80...0001BBF;Sundanese
   7104~   7167(     63)/0001BC0...0001BFF;Batak
   7168~   7247(     79)/0001C00...0001C4F;Lepcha
   7248~   7295(     47)/0001C50...0001C7F;Ol Chiki
   7360~   7375(     15)/0001CC0...0001CCF;Sundanese Supplement
   7376~   7423(     47)/0001CD0...0001CFF;Vedic Extensions
   7424~   7551(    127)/0001D00...0001D7F;Phonetic Extensions
   7552~   7615(     63)/0001D80...0001DBF;Phonetic Extensions Supplement
   7616~   7679(     63)/0001DC0...0001DFF;Combining Diacritical Marks Supplement
   7680~   7935(    255)/0001E00...0001EFF;Latin Extended Additional
   7936~   8191(    255)/0001F00...0001FFF;Greek Extended
   8192~   8303(    111)/0002000...000206F;General Punctuation
   8304~   8351(     47)/0002070...000209F;Superscripts and Subscripts
   8352~   8399(     47)/00020A0...00020CF;Currency Symbols
   8400~   8447(     47)/00020D0...00020FF;Combining Diacritical Marks for Symbols
   8448~   8527(     79)/0002100...000214F;Letterlike Symbols
   8528~   8591(     63)/0002150...000218F;Number Forms
   8592~   8703(    111)/0002190...00021FF;Arrows
   8704~   8959(    255)/0002200...00022FF;Mathematical Operators
   8960~   9215(    255)/0002300...00023FF;Miscellaneous Technical
   9216~   9279(     63)/0002400...000243F;Control Pictures
   9280~   9311(     31)/0002440...000245F;Optical Character Recognition
   9312~   9471(    159)/0002460...00024FF;Enclosed Alphanumerics
   9472~   9599(    127)/0002500...000257F;Box Drawing
   9600~   9631(     31)/0002580...000259F;Block Elements
   9632~   9727(     95)/00025A0...00025FF;Geometric Shapes
   9728~   9983(    255)/0002600...00026FF;Miscellaneous Symbols
   9984~  10175(    191)/0002700...00027BF;Dingbats
  10176~  10223(     47)/00027C0...00027EF;Miscellaneous Mathematical Symbols-A
  10224~  10239(     15)/00027F0...00027FF;Supplemental Arrows-A
  10240~  10495(    255)/0002800...00028FF;Braille Patterns
  10496~  10623(    127)/0002900...000297F;Supplemental Arrows-B
  10624~  10751(    127)/0002980...00029FF;Miscellaneous Mathematical Symbols-B
  10752~  11007(    255)/0002A00...0002AFF;Supplemental Mathematical Operators
  11008~  11263(    255)/0002B00...0002BFF;Miscellaneous Symbols and Arrows
  11264~  11359(     95)/0002C00...0002C5F;Glagolitic
  11360~  11391(     31)/0002C60...0002C7F;Latin Extended-C
  11392~  11519(    127)/0002C80...0002CFF;Coptic
  11520~  11567(     47)/0002D00...0002D2F;Georgian Supplement
  11568~  11647(     79)/0002D30...0002D7F;Tifinagh
  11648~  11743(     95)/0002D80...0002DDF;Ethiopic Extended
  11744~  11775(     31)/0002DE0...0002DFF;Cyrillic Extended-A
  11776~  11903(    127)/0002E00...0002E7F;Supplemental Punctuation
  11904~  12031(    127)/0002E80...0002EFF;CJK Radicals Supplement
  12032~  12255(    223)/0002F00...0002FDF;Kangxi Radicals
  12272~  12287(     15)/0002FF0...0002FFF;Ideographic Description Characters
  12288~  12351(     63)/0003000...000303F;CJK Symbols and Punctuation
  12352~  12447(     95)/0003040...000309F;Hiragana
  12448~  12543(     95)/00030A0...00030FF;Katakana
  12544~  12591(     47)/0003100...000312F;Bopomofo
  12592~  12687(     95)/0003130...000318F;Hangul Compatibility Jamo
  12688~  12703(     15)/0003190...000319F;Kanbun
  12704~  12735(     31)/00031A0...00031BF;Bopomofo Extended
  12736~  12783(     47)/00031C0...00031EF;CJK Strokes
  12784~  12799(     15)/00031F0...00031FF;Katakana Phonetic Extensions
  12800~  13055(    255)/0003200...00032FF;Enclosed CJK Letters and Months
  13056~  13311(    255)/0003300...00033FF;CJK Compatibility
  13312~  19903(   6591)/0003400...0004DBF;CJK Unified Ideographs Extension A
  19904~  19967(     63)/0004DC0...0004DFF;Yijing Hexagram Symbols
  19968~  40959(  20991)/0004E00...0009FFF;CJK Unified Ideographs
  40960~  42127(   1167)/000A000...000A48F;Yi Syllables
  42128~  42191(     63)/000A490...000A4CF;Yi Radicals
  42192~  42239(     47)/000A4D0...000A4FF;Lisu
  42240~  42559(    319)/000A500...000A63F;Vai
  42560~  42655(     95)/000A640...000A69F;Cyrillic Extended-B
  42656~  42751(     95)/000A6A0...000A6FF;Bamum
  42752~  42783(     31)/000A700...000A71F;Modifier Tone Letters
  42784~  43007(    223)/000A720...000A7FF;Latin Extended-D
  43008~  43055(     47)/000A800...000A82F;Syloti Nagri
  43056~  43071(     15)/000A830...000A83F;Common Indic Number Forms
  43072~  43135(     63)/000A840...000A87F;Phags-pa
  43136~  43231(     95)/000A880...000A8DF;Saurashtra
  43232~  43263(     31)/000A8E0...000A8FF;Devanagari Extended
  43264~  43311(     47)/000A900...000A92F;Kayah Li
  43312~  43359(     47)/000A930...000A95F;Rejang
  43360~  43391(     31)/000A960...000A97F;Hangul Jamo Extended-A
  43392~  43487(     95)/000A980...000A9DF;Javanese
  43488~  43519(     31)/000A9E0...000A9FF;Myanmar Extended-B
  43520~  43615(     95)/000AA00...000AA5F;Cham
  43616~  43647(     31)/000AA60...000AA7F;Myanmar Extended-A
  43648~  43743(     95)/000AA80...000AADF;Tai Viet
  43744~  43775(     31)/000AAE0...000AAFF;Meetei Mayek Extensions
  43776~  43823(     47)/000AB00...000AB2F;Ethiopic Extended-A
  43824~  43887(     63)/000AB30...000AB6F;Latin Extended-E
  43968~  44031(     63)/000ABC0...000ABFF;Meetei Mayek
  44032~  55215(  11183)/000AC00...000D7AF;Hangul Syllables
  55216~  55295(     79)/000D7B0...000D7FF;Hangul Jamo Extended-B
  55296~  56191(    895)/000D800...000DB7F;High Surrogates
  56192~  56319(    127)/000DB80...000DBFF;High Private Use Surrogates
  56320~  57343(   1023)/000DC00...000DFFF;Low Surrogates
  57344~  63743(   6399)/000E000...000F8FF;Private Use Area
  63744~  64255(    511)/000F900...000FAFF;CJK Compatibility Ideographs
  64256~  64335(     79)/000FB00...000FB4F;Alphabetic Presentation Forms
  64336~  65023(    687)/000FB50...000FDFF;Arabic Presentation Forms-A
  65024~  65039(     15)/000FE00...000FE0F;Variation Selectors
  65040~  65055(     15)/000FE10...000FE1F;Vertical Forms
  65056~  65071(     15)/000FE20...000FE2F;Combining Half Marks
  65072~  65103(     31)/000FE30...000FE4F;CJK Compatibility Forms
  65104~  65135(     31)/000FE50...000FE6F;Small Form Variants
  65136~  65279(    143)/000FE70...000FEFF;Arabic Presentation Forms-B
  65280~  65519(    239)/000FF00...000FFEF;Halfwidth and Fullwidth Forms
  65520~  65535(     15)/000FFF0...000FFFF;Specials
  65536~  65663(    127)/0010000...001007F;Linear B Syllabary
  65664~  65791(    127)/0010080...00100FF;Linear B Ideograms
  65792~  65855(     63)/0010100...001013F;Aegean Numbers
  65856~  65935(     79)/0010140...001018F;Ancient Greek Numbers
  65936~  65999(     63)/0010190...00101CF;Ancient Symbols
  66000~  66047(     47)/00101D0...00101FF;Phaistos Disc
  66176~  66207(     31)/0010280...001029F;Lycian
  66208~  66271(     63)/00102A0...00102DF;Carian
  66272~  66303(     31)/00102E0...00102FF;Coptic Epact Numbers
  66304~  66351(     47)/0010300...001032F;Old Italic
  66352~  66383(     31)/0010330...001034F;Gothic
  66384~  66431(     47)/0010350...001037F;Old Permic
  66432~  66463(     31)/0010380...001039F;Ugaritic
  66464~  66527(     63)/00103A0...00103DF;Old Persian
  66560~  66639(     79)/0010400...001044F;Deseret
  66640~  66687(     47)/0010450...001047F;Shavian
  66688~  66735(     47)/0010480...00104AF;Osmanya
  66816~  66863(     47)/0010500...001052F;Elbasan
  66864~  66927(     63)/0010530...001056F;Caucasian Albanian
  67072~  67455(    383)/0010600...001077F;Linear A
  67584~  67647(     63)/0010800...001083F;Cypriot Syllabary
  67648~  67679(     31)/0010840...001085F;Imperial Aramaic
  67680~  67711(     31)/0010860...001087F;Palmyrene
  67712~  67759(     47)/0010880...00108AF;Nabataean
  67840~  67871(     31)/0010900...001091F;Phoenician
  67872~  67903(     31)/0010920...001093F;Lydian
  67968~  67999(     31)/0010980...001099F;Meroitic Hieroglyphs
  68000~  68095(     95)/00109A0...00109FF;Meroitic Cursive
  68096~  68191(     95)/0010A00...0010A5F;Kharoshthi
  68192~  68223(     31)/0010A60...0010A7F;Old South Arabian
  68224~  68255(     31)/0010A80...0010A9F;Old North Arabian
  68288~  68351(     63)/0010AC0...0010AFF;Manichaean
  68352~  68415(     63)/0010B00...0010B3F;Avestan
  68416~  68447(     31)/0010B40...0010B5F;Inscriptional Parthian
  68448~  68479(     31)/0010B60...0010B7F;Inscriptional Pahlavi
  68480~  68527(     47)/0010B80...0010BAF;Psalter Pahlavi
  68608~  68687(     79)/0010C00...0010C4F;Old Turkic
  69216~  69247(     31)/0010E60...0010E7F;Rumi Numeral Symbols
  69632~  69759(    127)/0011000...001107F;Brahmi
  69760~  69839(     79)/0011080...00110CF;Kaithi
  69840~  69887(     47)/00110D0...00110FF;Sora Sompeng
  69888~  69967(     79)/0011100...001114F;Chakma
  69968~  70015(     47)/0011150...001117F;Mahajani
  70016~  70111(     95)/0011180...00111DF;Sharada
  70112~  70143(     31)/00111E0...00111FF;Sinhala Archaic Numbers
  70144~  70223(     79)/0011200...001124F;Khojki
  70320~  70399(     79)/00112B0...00112FF;Khudawadi
  70400~  70527(    127)/0011300...001137F;Grantha
  70784~  70879(     95)/0011480...00114DF;Tirhuta
  71040~  71167(    127)/0011580...00115FF;Siddham
  71168~  71263(     95)/0011600...001165F;Modi
  71296~  71375(     79)/0011680...00116CF;Takri
  71840~  71935(     95)/00118A0...00118FF;Warang Citi
  72384~  72447(     63)/0011AC0...0011AFF;Pau Cin Hau
  73728~  74751(   1023)/0012000...00123FF;Cuneiform
  74752~  74879(    127)/0012400...001247F;Cuneiform Numbers and Punctuation
  77824~  78895(   1071)/0013000...001342F;Egyptian Hieroglyphs
  92160~  92735(    575)/0016800...0016A3F;Bamum Supplement
  92736~  92783(     47)/0016A40...0016A6F;Mro
  92880~  92927(     47)/0016AD0...0016AFF;Bassa Vah
  92928~  93071(    143)/0016B00...0016B8F;Pahawh Hmong
  93952~  94111(    159)/0016F00...0016F9F;Miao
 110592~ 110847(    255)/001B000...001B0FF;Kana Supplement
 113664~ 113823(    159)/001BC00...001BC9F;Duployan
 113824~ 113839(     15)/001BCA0...001BCAF;Shorthand Format Controls
 118784~ 119039(    255)/001D000...001D0FF;Byzantine Musical Symbols
 119040~ 119295(    255)/001D100...001D1FF;Musical Symbols
 119296~ 119375(     79)/001D200...001D24F;Ancient Greek Musical Notation
 119552~ 119647(     95)/001D300...001D35F;Tai Xuan Jing Symbols
 119648~ 119679(     31)/001D360...001D37F;Counting Rod Numerals
 119808~ 120831(   1023)/001D400...001D7FF;Mathematical Alphanumeric Symbols
 124928~ 125151(    223)/001E800...001E8DF;Mende Kikakui
 126464~ 126719(    255)/001EE00...001EEFF;Arabic Mathematical Alphabetic Symbols
 126976~ 127023(     47)/001F000...001F02F;Mahjong Tiles
 127024~ 127135(    111)/001F030...001F09F;Domino Tiles
 127136~ 127231(     95)/001F0A0...001F0FF;Playing Cards
 127232~ 127487(    255)/001F100...001F1FF;Enclosed Alphanumeric Supplement
 127488~ 127743(    255)/001F200...001F2FF;Enclosed Ideographic Supplement
 127744~ 128511(    767)/001F300...001F5FF;Miscellaneous Symbols and Pictographs
 128512~ 128591(     79)/001F600...001F64F;Emoticons
 128592~ 128639(     47)/001F650...001F67F;Ornamental Dingbats
 128640~ 128767(    127)/001F680...001F6FF;Transport and Map Symbols
 128768~ 128895(    127)/001F700...001F77F;Alchemical Symbols
 128896~ 129023(    127)/001F780...001F7FF;Geometric Shapes Extended
 129024~ 129279(    255)/001F800...001F8FF;Supplemental Arrows-C
 131072~ 173791(  42719)/0020000...002A6DF;CJK Unified Ideographs Extension B
 173824~ 177983(   4159)/002A700...002B73F;CJK Unified Ideographs Extension C
 177984~ 178207(    223)/002B740...002B81F;CJK Unified Ideographs Extension D
 194560~ 195103(    543)/002F800...002FA1F;CJK Compatibility Ideographs Supplement
 917504~ 917631(    127)/00E0000...00E007F;Tags
 917760~ 917999(    239)/00E0100...00E01EF;Variation Selectors Supplement
 983040~1048575(  65535)/00F0000...00FFFFF;Supplementary Private Use Area-A
1048576~1114111(  65535)/0100000...010FFFF;Supplementary Private Use Area-B
total=256084

Diffing this against other versions might be fun, so I'm pasting the raw text here too.

# Blocks-7.0.0.txt
# Date: 2014-04-03, 23:23:00 GMT [RP, KW]
#
# Unicode Character Database
# Copyright (c) 1991-2014 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
# For documentation, see http://www.unicode.org/reports/tr44/
#
# Note:   The casing of block names is not normative.
#         For example, "Basic Latin" and "BASIC LATIN" are equivalent.
#
# Format:
# Start Code..End Code; Block Name

# ================================================

# Note:   When comparing block names, casing, whitespace, hyphens,
#         and underbars are ignored.
#         For example, "Latin Extended-A" and "latin extended a" are equivalent.
#         For more information on the comparison of property values, 
#            see UAX #44: http://www.unicode.org/reports/tr44/
#
#  All code points not explicitly listed for Block
#  have the value No_Block.

# Property: Block
#
# @missing: 0000..10FFFF; No_Block

0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0100..017F; Latin Extended-A
0180..024F; Latin Extended-B
0250..02AF; IPA Extensions
02B0..02FF; Spacing Modifier Letters
0300..036F; Combining Diacritical Marks
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
0500..052F; Cyrillic Supplement
0530..058F; Armenian
0590..05FF; Hebrew
0600..06FF; Arabic
0700..074F; Syriac
0750..077F; Arabic Supplement
0780..07BF; Thaana
07C0..07FF; NKo
0800..083F; Samaritan
0840..085F; Mandaic
08A0..08FF; Arabic Extended-A
0900..097F; Devanagari
0980..09FF; Bengali
0A00..0A7F; Gurmukhi
0A80..0AFF; Gujarati
0B00..0B7F; Oriya
0B80..0BFF; Tamil
0C00..0C7F; Telugu
0C80..0CFF; Kannada
0D00..0D7F; Malayalam
0D80..0DFF; Sinhala
0E00..0E7F; Thai
0E80..0EFF; Lao
0F00..0FFF; Tibetan
1000..109F; Myanmar
10A0..10FF; Georgian
1100..11FF; Hangul Jamo
1200..137F; Ethiopic
1380..139F; Ethiopic Supplement
13A0..13FF; Cherokee
1400..167F; Unified Canadian Aboriginal Syllabics
1680..169F; Ogham
16A0..16FF; Runic
1700..171F; Tagalog
1720..173F; Hanunoo
1740..175F; Buhid
1760..177F; Tagbanwa
1780..17FF; Khmer
1800..18AF; Mongolian
18B0..18FF; Unified Canadian Aboriginal Syllabics Extended
1900..194F; Limbu
1950..197F; Tai Le
1980..19DF; New Tai Lue
19E0..19FF; Khmer Symbols
1A00..1A1F; Buginese
1A20..1AAF; Tai Tham
1AB0..1AFF; Combining Diacritical Marks Extended
1B00..1B7F; Balinese
1B80..1BBF; Sundanese
1BC0..1BFF; Batak
1C00..1C4F; Lepcha
1C50..1C7F; Ol Chiki
1CC0..1CCF; Sundanese Supplement
1CD0..1CFF; Vedic Extensions
1D00..1D7F; Phonetic Extensions
1D80..1DBF; Phonetic Extensions Supplement
1DC0..1DFF; Combining Diacritical Marks Supplement
1E00..1EFF; Latin Extended Additional
1F00..1FFF; Greek Extended
2000..206F; General Punctuation
2070..209F; Superscripts and Subscripts
20A0..20CF; Currency Symbols
20D0..20FF; Combining Diacritical Marks for Symbols
2100..214F; Letterlike Symbols
2150..218F; Number Forms
2190..21FF; Arrows
2200..22FF; Mathematical Operators
2300..23FF; Miscellaneous Technical
2400..243F; Control Pictures
2440..245F; Optical Character Recognition
2460..24FF; Enclosed Alphanumerics
2500..257F; Box Drawing
2580..259F; Block Elements
25A0..25FF; Geometric Shapes
2600..26FF; Miscellaneous Symbols
2700..27BF; Dingbats
27C0..27EF; Miscellaneous Mathematical Symbols-A
27F0..27FF; Supplemental Arrows-A
2800..28FF; Braille Patterns
2900..297F; Supplemental Arrows-B
2980..29FF; Miscellaneous Mathematical Symbols-B
2A00..2AFF; Supplemental Mathematical Operators
2B00..2BFF; Miscellaneous Symbols and Arrows
2C00..2C5F; Glagolitic
2C60..2C7F; Latin Extended-C
2C80..2CFF; Coptic
2D00..2D2F; Georgian Supplement
2D30..2D7F; Tifinagh
2D80..2DDF; Ethiopic Extended
2DE0..2DFF; Cyrillic Extended-A
2E00..2E7F; Supplemental Punctuation
2E80..2EFF; CJK Radicals Supplement
2F00..2FDF; Kangxi Radicals
2FF0..2FFF; Ideographic Description Characters
3000..303F; CJK Symbols and Punctuation
3040..309F; Hiragana
30A0..30FF; Katakana
3100..312F; Bopomofo
3130..318F; Hangul Compatibility Jamo
3190..319F; Kanbun
31A0..31BF; Bopomofo Extended
31C0..31EF; CJK Strokes
31F0..31FF; Katakana Phonetic Extensions
3200..32FF; Enclosed CJK Letters and Months
3300..33FF; CJK Compatibility
3400..4DBF; CJK Unified Ideographs Extension A
4DC0..4DFF; Yijing Hexagram Symbols
4E00..9FFF; CJK Unified Ideographs
A000..A48F; Yi Syllables
A490..A4CF; Yi Radicals
A4D0..A4FF; Lisu
A500..A63F; Vai
A640..A69F; Cyrillic Extended-B
A6A0..A6FF; Bamum
A700..A71F; Modifier Tone Letters
A720..A7FF; Latin Extended-D
A800..A82F; Syloti Nagri
A830..A83F; Common Indic Number Forms
A840..A87F; Phags-pa
A880..A8DF; Saurashtra
A8E0..A8FF; Devanagari Extended
A900..A92F; Kayah Li
A930..A95F; Rejang
A960..A97F; Hangul Jamo Extended-A
A980..A9DF; Javanese
A9E0..A9FF; Myanmar Extended-B
AA00..AA5F; Cham
AA60..AA7F; Myanmar Extended-A
AA80..AADF; Tai Viet
AAE0..AAFF; Meetei Mayek Extensions
AB00..AB2F; Ethiopic Extended-A
AB30..AB6F; Latin Extended-E
ABC0..ABFF; Meetei Mayek
AC00..D7AF; Hangul Syllables
D7B0..D7FF; Hangul Jamo Extended-B
D800..DB7F; High Surrogates
DB80..DBFF; High Private Use Surrogates
DC00..DFFF; Low Surrogates
E000..F8FF; Private Use Area
F900..FAFF; CJK Compatibility Ideographs
FB00..FB4F; Alphabetic Presentation Forms
FB50..FDFF; Arabic Presentation Forms-A
FE00..FE0F; Variation Selectors
FE10..FE1F; Vertical Forms
FE20..FE2F; Combining Half Marks
FE30..FE4F; CJK Compatibility Forms
FE50..FE6F; Small Form Variants
FE70..FEFF; Arabic Presentation Forms-B
FF00..FFEF; Halfwidth and Fullwidth Forms
FFF0..FFFF; Specials
10000..1007F; Linear B Syllabary
10080..100FF; Linear B Ideograms
10100..1013F; Aegean Numbers
10140..1018F; Ancient Greek Numbers
10190..101CF; Ancient Symbols
101D0..101FF; Phaistos Disc
10280..1029F; Lycian
102A0..102DF; Carian
102E0..102FF; Coptic Epact Numbers
10300..1032F; Old Italic
10330..1034F; Gothic
10350..1037F; Old Permic
10380..1039F; Ugaritic
103A0..103DF; Old Persian
10400..1044F; Deseret
10450..1047F; Shavian
10480..104AF; Osmanya
10500..1052F; Elbasan
10530..1056F; Caucasian Albanian
10600..1077F; Linear A
10800..1083F; Cypriot Syllabary
10840..1085F; Imperial Aramaic
10860..1087F; Palmyrene
10880..108AF; Nabataean
10900..1091F; Phoenician
10920..1093F; Lydian
10980..1099F; Meroitic Hieroglyphs
109A0..109FF; Meroitic Cursive
10A00..10A5F; Kharoshthi
10A60..10A7F; Old South Arabian
10A80..10A9F; Old North Arabian
10AC0..10AFF; Manichaean
10B00..10B3F; Avestan
10B40..10B5F; Inscriptional Parthian
10B60..10B7F; Inscriptional Pahlavi
10B80..10BAF; Psalter Pahlavi
10C00..10C4F; Old Turkic
10E60..10E7F; Rumi Numeral Symbols
11000..1107F; Brahmi
11080..110CF; Kaithi
110D0..110FF; Sora Sompeng
11100..1114F; Chakma
11150..1117F; Mahajani
11180..111DF; Sharada
111E0..111FF; Sinhala Archaic Numbers
11200..1124F; Khojki
112B0..112FF; Khudawadi
11300..1137F; Grantha
11480..114DF; Tirhuta
11580..115FF; Siddham
11600..1165F; Modi
11680..116CF; Takri
118A0..118FF; Warang Citi
11AC0..11AFF; Pau Cin Hau
12000..123FF; Cuneiform
12400..1247F; Cuneiform Numbers and Punctuation
13000..1342F; Egyptian Hieroglyphs
16800..16A3F; Bamum Supplement
16A40..16A6F; Mro
16AD0..16AFF; Bassa Vah
16B00..16B8F; Pahawh Hmong
16F00..16F9F; Miao
1B000..1B0FF; Kana Supplement
1BC00..1BC9F; Duployan
1BCA0..1BCAF; Shorthand Format Controls
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; Musical Symbols
1D200..1D24F; Ancient Greek Musical Notation
1D300..1D35F; Tai Xuan Jing Symbols
1D360..1D37F; Counting Rod Numerals
1D400..1D7FF; Mathematical Alphanumeric Symbols
1E800..1E8DF; Mende Kikakui
1EE00..1EEFF; Arabic Mathematical Alphabetic Symbols
1F000..1F02F; Mahjong Tiles
1F030..1F09F; Domino Tiles
1F0A0..1F0FF; Playing Cards
1F100..1F1FF; Enclosed Alphanumeric Supplement
1F200..1F2FF; Enclosed Ideographic Supplement
1F300..1F5FF; Miscellaneous Symbols and Pictographs
1F600..1F64F; Emoticons
1F650..1F67F; Ornamental Dingbats
1F680..1F6FF; Transport and Map Symbols
1F700..1F77F; Alchemical Symbols
1F780..1F7FF; Geometric Shapes Extended
1F800..1F8FF; Supplemental Arrows-C
20000..2A6DF; CJK Unified Ideographs Extension B
2A700..2B73F; CJK Unified Ideographs Extension C
2B740..2B81F; CJK Unified Ideographs Extension D
2F800..2FA1F; CJK Compatibility Ideographs Supplement
E0000..E007F; Tags
E0100..E01EF; Variation Selectors Supplement
F0000..FFFFF; Supplementary Private Use Area-A
100000..10FFFF; Supplementary Private Use Area-B

# EOF
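
As an aside, the header note about comparing block names (casing, whitespace, hyphens, and underbars are ignored) is easy to try out. This is a rough sketch that implements only what that note says. The full matching rules are in UAX #44 and may cover more cases than this, and both function names are my own, not any Unicode API.

```javascript
// Loose block-name comparison as described in the Blocks.txt header:
// casing, whitespace, hyphens and underscores are ignored.
// normalizeBlockName / sameBlockName are illustrative names only.
function normalizeBlockName(name) {
  return name.toLowerCase().replace(/[\s_-]/g, "");
}

function sameBlockName(a, b) {
  return normalizeBlockName(a) === normalizeBlockName(b);
}

console.log(sameBlockName("Latin Extended-A", "latin extended a")); // true
console.log(sameBlockName("Basic Latin", "BASIC_LATIN"));           // true
console.log(sameBlockName("Hiragana", "Katakana"));                 // false
```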

Here is the JS, which does nothing more than reformat the text. Hopefully I can reuse it next time.

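// Paste into the browser console while viewing the Blocks.txt page.
// Right-align a value in 7 columns with spaces.
var ssr=function(v){return ("      "+v).slice(-7);};
// Left-pad a hex string to 7 digits with zeros.
var sss=function(v){return ("000000"+v).slice(-7);};
var total=0;
document.body.innerText.split("\n").forEach(function(line){
    // Block lines look like "0000..007F; Basic Latin"; comments and blanks are skipped.
    var m=line.match(/^([0-9A-F]+)\.\.([0-9A-F]+); (.+)/);
    if(m==null){
        return;
    }
    var b={};
    b.hexFrom=m[1];
    b.hexTo=m[2];
    b.decFrom=parseInt(b.hexFrom,16);
    b.decTo=parseInt(b.hexTo,16);
    b.name=m[3];
    // Note: decTo-decFrom is end minus start, one less than the number of code points in the block.
    b.log=ssr(b.decFrom)+"~"+ssr(b.decTo)+"("+ssr(b.decTo-b.decFrom)+")"+"/"+sss(b.hexFrom)+"..."+sss(b.hexTo)+";"+b.name+"\n";
    total+=(b.decTo-b.decFrom);
    console.log(b.log);
});
console.log("total="+total);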
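
The script above reads document.body.innerText, so it only works when pasted into the browser console on the Blocks.txt page, and it adds up end minus start for each block. As a rough sketch of a standalone variant, assuming the file contents are already in a string, something like the following would also count each block inclusively (end minus start plus one). The `parseBlocks` name and the three-line sample are my own, picked from the list above.

```javascript
// Parse Blocks.txt-style text into objects.
// parseBlocks is a hypothetical helper, not part of any library.
function parseBlocks(text) {
  const blocks = [];
  for (const line of text.split("\n")) {
    const m = line.match(/^([0-9A-F]+)\.\.([0-9A-F]+); (.+)/);
    if (m === null) continue; // skip comments and blank lines
    const from = parseInt(m[1], 16);
    const to = parseInt(m[2], 16);
    // Inclusive size: both the first and the last code point belong to the block.
    blocks.push({ name: m[3].trim(), from: from, to: to, size: to - from + 1 });
  }
  return blocks;
}

// A tiny sample taken from the list in this post.
const sample = [
  "# comment lines are ignored",
  "0000..007F; Basic Latin",
  "0080..00FF; Latin-1 Supplement",
  "4E00..9FFF; CJK Unified Ideographs",
].join("\n");

const blocks = parseBlocks(sample);
for (const b of blocks) {
  console.log(b.name + ": " + b.size);
}
const total = blocks.reduce(function (sum, b) { return sum + b.size; }, 0);
console.log("total=" + total);
```

With this counting, Basic Latin comes out as 128 rather than 127, which matches the usual description of ASCII.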

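Come to think of it, the older versions can be viewed on the Internet Archive (Internet Archive Wayback Machine). Hmm, I skimmed the diffs, but there isn't much in there that's interesting. Things have changed, and "huh, okay" is about all I got out of it.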
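
For anyone who would rather do that diff mechanically than by eye, here is a minimal sketch. It assumes two versions of Blocks.txt are already available as strings. The `blockMap` and `diffBlocks` helpers are hypothetical names of mine, and the two tiny inputs are made up for illustration rather than being real excerpts of any particular version.

```javascript
// Map "START..END" -> block name for every block line in a Blocks.txt-style string.
function blockMap(text) {
  const map = new Map();
  for (const line of text.split("\n")) {
    const m = line.match(/^([0-9A-F]+\.\.[0-9A-F]+); (.+)/);
    if (m !== null) map.set(m[1], m[2].trim());
  }
  return map;
}

// Hypothetical helper: list ranges added (+), renamed (~), or removed (-).
function diffBlocks(oldText, newText) {
  const oldMap = blockMap(oldText);
  const newMap = blockMap(newText);
  const changes = [];
  for (const [range, name] of newMap) {
    if (!oldMap.has(range)) {
      changes.push("+ " + range + "; " + name);
    } else if (oldMap.get(range) !== name) {
      changes.push("~ " + range + "; " + oldMap.get(range) + " -> " + name);
    }
  }
  for (const [range, name] of oldMap) {
    if (!newMap.has(range)) changes.push("- " + range + "; " + name);
  }
  return changes;
}

// Made-up inputs for illustration only.
const older = "1A20..1AAF; Tai Tham\n1B00..1B7F; Balinese";
const newer = [
  "1A20..1AAF; Tai Tham",
  "1AB0..1AFF; Combining Diacritical Marks Extended",
  "1B00..1B7F; Balinese",
].join("\n");

console.log(diffBlocks(older, newer).join("\n"));
```

Keying on the range string is the simplest choice. A block whose range was resized would show up as one removal plus one addition, which seems good enough for a quick look.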