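Source: www.unicode.org/Public/UCD/latest/ucd/Blocks.txt. The version I looked at is Blocks-7.0.0.txt (Date: 2014-04-03, 23:23:00 GMT [RP, KW]). Everything you could want to know is in that file, but it is wall-to-wall hexadecimal and hard to get a feel for, so all I did was convert it to decimal.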
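Unsurprisingly, the low code points are almost completely filled. CJK Unified Ideographs hogs way too much space: most other blocks are around 100 code points and this one takes 20,000, lol. No wonder 65,536 characters (16 bits) fell apart. Hangul takes a fair chunk too. I thought Hangul was written by combining letters, so how much of it is actually encoded? And CJK Unified Ideographs Extension B is 40,000 characters, lol. It feels like "we have surrogate pairs now, so no need to hold back, right? These are all different characters, so please encode every one of them ^^".
The whole Unicode space, counting what surrogate pairs make reachable, is 1,112,064 characters (the 1,114,112 code points from U+0000 to U+10FFFF minus the 2,048 surrogate code points). The blocks allocated so far cover roughly 256,000 of those. The total printed below is 256,084, but the script sums end minus start, so the figure in parentheses on each row, and therefore the total, is short by one per block. That is not IPv6-level headroom, but there is room to keep piling on emoji.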
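The arithmetic behind those figures can be checked in a few lines. This is only a sketch; `blockSize` is a helper name made up for this example, and it counts a range inclusively (end minus start plus one), unlike the table below.

```javascript
// Size of the Unicode code space, worked out from the published limits.
const BMP_SIZE = 0x10000;               // 16-bit range: U+0000..U+FFFF
const CODE_SPACE = 0x10FFFF + 1;        // all code points: U+0000..U+10FFFF
const SURROGATES = 0xDFFF - 0xD800 + 1; // U+D800..U+DFFF, reserved for UTF-16 pairs
const SCALAR_VALUES = CODE_SPACE - SURROGATES;

// Hypothetical helper: an inclusive range holds (end - start + 1) code points.
function blockSize(startHex, endHex) {
  return parseInt(endHex, 16) - parseInt(startHex, 16) + 1;
}

console.log(BMP_SIZE);                    // 65536
console.log(CODE_SPACE);                  // 1114112
console.log(SURROGATES);                  // 2048
console.log(SCALAR_VALUES);               // 1112064
console.log(blockSize("4E00", "9FFF"));   // 20992 (CJK Unified Ideographs)
console.log(blockSize("20000", "2A6DF")); // 42720 (CJK Unified Ideographs Extension B)
console.log(blockSize("AC00", "D7AF"));   // 11184 (Hangul Syllables)
```

Note that these are block sizes, meaning the range reserved for a block, not the number of characters actually assigned inside it.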
      0~    127(    127)/0000000...000007F;Basic Latin
    128~    255(    127)/0000080...00000FF;Latin-1 Supplement
    256~    383(    127)/0000100...000017F;Latin Extended-A
    384~    591(    207)/0000180...000024F;Latin Extended-B
    592~    687(     95)/0000250...00002AF;IPA Extensions
    688~    767(     79)/00002B0...00002FF;Spacing Modifier Letters
    768~    879(    111)/0000300...000036F;Combining Diacritical Marks
    880~   1023(    143)/0000370...00003FF;Greek and Coptic
   1024~   1279(    255)/0000400...00004FF;Cyrillic
   1280~   1327(     47)/0000500...000052F;Cyrillic Supplement
   1328~   1423(     95)/0000530...000058F;Armenian
   1424~   1535(    111)/0000590...00005FF;Hebrew
   1536~   1791(    255)/0000600...00006FF;Arabic
   1792~   1871(     79)/0000700...000074F;Syriac
   1872~   1919(     47)/0000750...000077F;Arabic Supplement
   1920~   1983(     63)/0000780...00007BF;Thaana
   1984~   2047(     63)/00007C0...00007FF;NKo
   2048~   2111(     63)/0000800...000083F;Samaritan
   2112~   2143(     31)/0000840...000085F;Mandaic
   2208~   2303(     95)/00008A0...00008FF;Arabic Extended-A
   2304~   2431(    127)/0000900...000097F;Devanagari
   2432~   2559(    127)/0000980...00009FF;Bengali
   2560~   2687(    127)/0000A00...0000A7F;Gurmukhi
   2688~   2815(    127)/0000A80...0000AFF;Gujarati
   2816~   2943(    127)/0000B00...0000B7F;Oriya
   2944~   3071(    127)/0000B80...0000BFF;Tamil
   3072~   3199(    127)/0000C00...0000C7F;Telugu
   3200~   3327(    127)/0000C80...0000CFF;Kannada
   3328~   3455(    127)/0000D00...0000D7F;Malayalam
   3456~   3583(    127)/0000D80...0000DFF;Sinhala
   3584~   3711(    127)/0000E00...0000E7F;Thai
   3712~   3839(    127)/0000E80...0000EFF;Lao
   3840~   4095(    255)/0000F00...0000FFF;Tibetan
   4096~   4255(    159)/0001000...000109F;Myanmar
   4256~   4351(     95)/00010A0...00010FF;Georgian
   4352~   4607(    255)/0001100...00011FF;Hangul Jamo
   4608~   4991(    383)/0001200...000137F;Ethiopic
   4992~   5023(     31)/0001380...000139F;Ethiopic Supplement
   5024~   5119(     95)/00013A0...00013FF;Cherokee
   5120~   5759(    639)/0001400...000167F;Unified Canadian Aboriginal Syllabics
   5760~   5791(     31)/0001680...000169F;Ogham
   5792~   5887(     95)/00016A0...00016FF;Runic
   5888~   5919(     31)/0001700...000171F;Tagalog
   5920~   5951(     31)/0001720...000173F;Hanunoo
   5952~   5983(     31)/0001740...000175F;Buhid
   5984~   6015(     31)/0001760...000177F;Tagbanwa
   6016~   6143(    127)/0001780...00017FF;Khmer
   6144~   6319(    175)/0001800...00018AF;Mongolian
   6320~   6399(     79)/00018B0...00018FF;Unified Canadian Aboriginal Syllabics Extended
   6400~   6479(     79)/0001900...000194F;Limbu
   6480~   6527(     47)/0001950...000197F;Tai Le
   6528~   6623(     95)/0001980...00019DF;New Tai Lue
   6624~   6655(     31)/00019E0...00019FF;Khmer Symbols
   6656~   6687(     31)/0001A00...0001A1F;Buginese
   6688~   6831(    143)/0001A20...0001AAF;Tai Tham
   6832~   6911(     79)/0001AB0...0001AFF;Combining Diacritical Marks Extended
   6912~   7039(    127)/0001B00...0001B7F;Balinese
   7040~   7103(     63)/0001B80...0001BBF;Sundanese
   7104~   7167(     63)/0001BC0...0001BFF;Batak
   7168~   7247(     79)/0001C00...0001C4F;Lepcha
   7248~   7295(     47)/0001C50...0001C7F;Ol Chiki
   7360~   7375(     15)/0001CC0...0001CCF;Sundanese Supplement
   7376~   7423(     47)/0001CD0...0001CFF;Vedic Extensions
   7424~   7551(    127)/0001D00...0001D7F;Phonetic Extensions
   7552~   7615(     63)/0001D80...0001DBF;Phonetic Extensions Supplement
   7616~   7679(     63)/0001DC0...0001DFF;Combining Diacritical Marks Supplement
   7680~   7935(    255)/0001E00...0001EFF;Latin Extended Additional
   7936~   8191(    255)/0001F00...0001FFF;Greek Extended
   8192~   8303(    111)/0002000...000206F;General Punctuation
   8304~   8351(     47)/0002070...000209F;Superscripts and Subscripts
   8352~   8399(     47)/00020A0...00020CF;Currency Symbols
   8400~   8447(     47)/00020D0...00020FF;Combining Diacritical Marks for Symbols
   8448~   8527(     79)/0002100...000214F;Letterlike Symbols
   8528~   8591(     63)/0002150...000218F;Number Forms
   8592~   8703(    111)/0002190...00021FF;Arrows
   8704~   8959(    255)/0002200...00022FF;Mathematical Operators
   8960~   9215(    255)/0002300...00023FF;Miscellaneous Technical
   9216~   9279(     63)/0002400...000243F;Control Pictures
   9280~   9311(     31)/0002440...000245F;Optical Character Recognition
   9312~   9471(    159)/0002460...00024FF;Enclosed Alphanumerics
   9472~   9599(    127)/0002500...000257F;Box Drawing
   9600~   9631(     31)/0002580...000259F;Block Elements
   9632~   9727(     95)/00025A0...00025FF;Geometric Shapes
   9728~   9983(    255)/0002600...00026FF;Miscellaneous Symbols
   9984~  10175(    191)/0002700...00027BF;Dingbats
  10176~  10223(     47)/00027C0...00027EF;Miscellaneous Mathematical Symbols-A
  10224~  10239(     15)/00027F0...00027FF;Supplemental Arrows-A
  10240~  10495(    255)/0002800...00028FF;Braille Patterns
  10496~  10623(    127)/0002900...000297F;Supplemental Arrows-B
  10624~  10751(    127)/0002980...00029FF;Miscellaneous Mathematical Symbols-B
  10752~  11007(    255)/0002A00...0002AFF;Supplemental Mathematical Operators
  11008~  11263(    255)/0002B00...0002BFF;Miscellaneous Symbols and Arrows
  11264~  11359(     95)/0002C00...0002C5F;Glagolitic
  11360~  11391(     31)/0002C60...0002C7F;Latin Extended-C
  11392~  11519(    127)/0002C80...0002CFF;Coptic
  11520~  11567(     47)/0002D00...0002D2F;Georgian Supplement
  11568~  11647(     79)/0002D30...0002D7F;Tifinagh
  11648~  11743(     95)/0002D80...0002DDF;Ethiopic Extended
  11744~  11775(     31)/0002DE0...0002DFF;Cyrillic Extended-A
  11776~  11903(    127)/0002E00...0002E7F;Supplemental Punctuation
  11904~  12031(    127)/0002E80...0002EFF;CJK Radicals Supplement
  12032~  12255(    223)/0002F00...0002FDF;Kangxi Radicals
  12272~  12287(     15)/0002FF0...0002FFF;Ideographic Description Characters
  12288~  12351(     63)/0003000...000303F;CJK Symbols and Punctuation
  12352~  12447(     95)/0003040...000309F;Hiragana
  12448~  12543(     95)/00030A0...00030FF;Katakana
  12544~  12591(     47)/0003100...000312F;Bopomofo
  12592~  12687(     95)/0003130...000318F;Hangul Compatibility Jamo
  12688~  12703(     15)/0003190...000319F;Kanbun
  12704~  12735(     31)/00031A0...00031BF;Bopomofo Extended
  12736~  12783(     47)/00031C0...00031EF;CJK Strokes
  12784~  12799(     15)/00031F0...00031FF;Katakana Phonetic Extensions
  12800~  13055(    255)/0003200...00032FF;Enclosed CJK Letters and Months
  13056~  13311(    255)/0003300...00033FF;CJK Compatibility
  13312~  19903(   6591)/0003400...0004DBF;CJK Unified Ideographs Extension A
  19904~  19967(     63)/0004DC0...0004DFF;Yijing Hexagram Symbols
  19968~  40959(  20991)/0004E00...0009FFF;CJK Unified Ideographs
  40960~  42127(   1167)/000A000...000A48F;Yi Syllables
  42128~  42191(     63)/000A490...000A4CF;Yi Radicals
  42192~  42239(     47)/000A4D0...000A4FF;Lisu
  42240~  42559(    319)/000A500...000A63F;Vai
  42560~  42655(     95)/000A640...000A69F;Cyrillic Extended-B
  42656~  42751(     95)/000A6A0...000A6FF;Bamum
  42752~  42783(     31)/000A700...000A71F;Modifier Tone Letters
  42784~  43007(    223)/000A720...000A7FF;Latin Extended-D
  43008~  43055(     47)/000A800...000A82F;Syloti Nagri
  43056~  43071(     15)/000A830...000A83F;Common Indic Number Forms
  43072~  43135(     63)/000A840...000A87F;Phags-pa
  43136~  43231(     95)/000A880...000A8DF;Saurashtra
  43232~  43263(     31)/000A8E0...000A8FF;Devanagari Extended
  43264~  43311(     47)/000A900...000A92F;Kayah Li
  43312~  43359(     47)/000A930...000A95F;Rejang
  43360~  43391(     31)/000A960...000A97F;Hangul Jamo Extended-A
  43392~  43487(     95)/000A980...000A9DF;Javanese
  43488~  43519(     31)/000A9E0...000A9FF;Myanmar Extended-B
  43520~  43615(     95)/000AA00...000AA5F;Cham
  43616~  43647(     31)/000AA60...000AA7F;Myanmar Extended-A
  43648~  43743(     95)/000AA80...000AADF;Tai Viet
  43744~  43775(     31)/000AAE0...000AAFF;Meetei Mayek Extensions
  43776~  43823(     47)/000AB00...000AB2F;Ethiopic Extended-A
  43824~  43887(     63)/000AB30...000AB6F;Latin Extended-E
  43968~  44031(     63)/000ABC0...000ABFF;Meetei Mayek
  44032~  55215(  11183)/000AC00...000D7AF;Hangul Syllables
  55216~  55295(     79)/000D7B0...000D7FF;Hangul Jamo Extended-B
  55296~  56191(    895)/000D800...000DB7F;High Surrogates
  56192~  56319(    127)/000DB80...000DBFF;High Private Use Surrogates
  56320~  57343(   1023)/000DC00...000DFFF;Low Surrogates
  57344~  63743(   6399)/000E000...000F8FF;Private Use Area
  63744~  64255(    511)/000F900...000FAFF;CJK Compatibility Ideographs
  64256~  64335(     79)/000FB00...000FB4F;Alphabetic Presentation Forms
  64336~  65023(    687)/000FB50...000FDFF;Arabic Presentation Forms-A
  65024~  65039(     15)/000FE00...000FE0F;Variation Selectors
  65040~  65055(     15)/000FE10...000FE1F;Vertical Forms
  65056~  65071(     15)/000FE20...000FE2F;Combining Half Marks
  65072~  65103(     31)/000FE30...000FE4F;CJK Compatibility Forms
  65104~  65135(     31)/000FE50...000FE6F;Small Form Variants
  65136~  65279(    143)/000FE70...000FEFF;Arabic Presentation Forms-B
  65280~  65519(    239)/000FF00...000FFEF;Halfwidth and Fullwidth Forms
  65520~  65535(     15)/000FFF0...000FFFF;Specials
  65536~  65663(    127)/0010000...001007F;Linear B Syllabary
  65664~  65791(    127)/0010080...00100FF;Linear B Ideograms
  65792~  65855(     63)/0010100...001013F;Aegean Numbers
  65856~  65935(     79)/0010140...001018F;Ancient Greek Numbers
  65936~  65999(     63)/0010190...00101CF;Ancient Symbols
  66000~  66047(     47)/00101D0...00101FF;Phaistos Disc
  66176~  66207(     31)/0010280...001029F;Lycian
  66208~  66271(     63)/00102A0...00102DF;Carian
  66272~  66303(     31)/00102E0...00102FF;Coptic Epact Numbers
  66304~  66351(     47)/0010300...001032F;Old Italic
  66352~  66383(     31)/0010330...001034F;Gothic
  66384~  66431(     47)/0010350...001037F;Old Permic
  66432~  66463(     31)/0010380...001039F;Ugaritic
  66464~  66527(     63)/00103A0...00103DF;Old Persian
  66560~  66639(     79)/0010400...001044F;Deseret
  66640~  66687(     47)/0010450...001047F;Shavian
  66688~  66735(     47)/0010480...00104AF;Osmanya
  66816~  66863(     47)/0010500...001052F;Elbasan
  66864~  66927(     63)/0010530...001056F;Caucasian Albanian
  67072~  67455(    383)/0010600...001077F;Linear A
  67584~  67647(     63)/0010800...001083F;Cypriot Syllabary
  67648~  67679(     31)/0010840...001085F;Imperial Aramaic
  67680~  67711(     31)/0010860...001087F;Palmyrene
  67712~  67759(     47)/0010880...00108AF;Nabataean
  67840~  67871(     31)/0010900...001091F;Phoenician
  67872~  67903(     31)/0010920...001093F;Lydian
  67968~  67999(     31)/0010980...001099F;Meroitic Hieroglyphs
  68000~  68095(     95)/00109A0...00109FF;Meroitic Cursive
  68096~  68191(     95)/0010A00...0010A5F;Kharoshthi
  68192~  68223(     31)/0010A60...0010A7F;Old South Arabian
  68224~  68255(     31)/0010A80...0010A9F;Old North Arabian
  68288~  68351(     63)/0010AC0...0010AFF;Manichaean
  68352~  68415(     63)/0010B00...0010B3F;Avestan
  68416~  68447(     31)/0010B40...0010B5F;Inscriptional Parthian
  68448~  68479(     31)/0010B60...0010B7F;Inscriptional Pahlavi
  68480~  68527(     47)/0010B80...0010BAF;Psalter Pahlavi
  68608~  68687(     79)/0010C00...0010C4F;Old Turkic
  69216~  69247(     31)/0010E60...0010E7F;Rumi Numeral Symbols
  69632~  69759(    127)/0011000...001107F;Brahmi
  69760~  69839(     79)/0011080...00110CF;Kaithi
  69840~  69887(     47)/00110D0...00110FF;Sora Sompeng
  69888~  69967(     79)/0011100...001114F;Chakma
  69968~  70015(     47)/0011150...001117F;Mahajani
  70016~  70111(     95)/0011180...00111DF;Sharada
  70112~  70143(     31)/00111E0...00111FF;Sinhala Archaic Numbers
  70144~  70223(     79)/0011200...001124F;Khojki
  70320~  70399(     79)/00112B0...00112FF;Khudawadi
  70400~  70527(    127)/0011300...001137F;Grantha
  70784~  70879(     95)/0011480...00114DF;Tirhuta
  71040~  71167(    127)/0011580...00115FF;Siddham
  71168~  71263(     95)/0011600...001165F;Modi
  71296~  71375(     79)/0011680...00116CF;Takri
  71840~  71935(     95)/00118A0...00118FF;Warang Citi
  72384~  72447(     63)/0011AC0...0011AFF;Pau Cin Hau
  73728~  74751(   1023)/0012000...00123FF;Cuneiform
  74752~  74879(    127)/0012400...001247F;Cuneiform Numbers and Punctuation
  77824~  78895(   1071)/0013000...001342F;Egyptian Hieroglyphs
  92160~  92735(    575)/0016800...0016A3F;Bamum Supplement
  92736~  92783(     47)/0016A40...0016A6F;Mro
  92880~  92927(     47)/0016AD0...0016AFF;Bassa Vah
  92928~  93071(    143)/0016B00...0016B8F;Pahawh Hmong
  93952~  94111(    159)/0016F00...0016F9F;Miao
 110592~ 110847(    255)/001B000...001B0FF;Kana Supplement
 113664~ 113823(    159)/001BC00...001BC9F;Duployan
 113824~ 113839(     15)/001BCA0...001BCAF;Shorthand Format Controls
 118784~ 119039(    255)/001D000...001D0FF;Byzantine Musical Symbols
 119040~ 119295(    255)/001D100...001D1FF;Musical Symbols
 119296~ 119375(     79)/001D200...001D24F;Ancient Greek Musical Notation
 119552~ 119647(     95)/001D300...001D35F;Tai Xuan Jing Symbols
 119648~ 119679(     31)/001D360...001D37F;Counting Rod Numerals
 119808~ 120831(   1023)/001D400...001D7FF;Mathematical Alphanumeric Symbols
 124928~ 125151(    223)/001E800...001E8DF;Mende Kikakui
 126464~ 126719(    255)/001EE00...001EEFF;Arabic Mathematical Alphabetic Symbols
 126976~ 127023(     47)/001F000...001F02F;Mahjong Tiles
 127024~ 127135(    111)/001F030...001F09F;Domino Tiles
 127136~ 127231(     95)/001F0A0...001F0FF;Playing Cards
 127232~ 127487(    255)/001F100...001F1FF;Enclosed Alphanumeric Supplement
 127488~ 127743(    255)/001F200...001F2FF;Enclosed Ideographic Supplement
 127744~ 128511(    767)/001F300...001F5FF;Miscellaneous Symbols and Pictographs
 128512~ 128591(     79)/001F600...001F64F;Emoticons
 128592~ 128639(     47)/001F650...001F67F;Ornamental Dingbats
 128640~ 128767(    127)/001F680...001F6FF;Transport and Map Symbols
 128768~ 128895(    127)/001F700...001F77F;Alchemical Symbols
 128896~ 129023(    127)/001F780...001F7FF;Geometric Shapes Extended
 129024~ 129279(    255)/001F800...001F8FF;Supplemental Arrows-C
 131072~ 173791(  42719)/0020000...002A6DF;CJK Unified Ideographs Extension B
 173824~ 177983(   4159)/002A700...002B73F;CJK Unified Ideographs Extension C
 177984~ 178207(    223)/002B740...002B81F;CJK Unified Ideographs Extension D
 194560~ 195103(    543)/002F800...002FA1F;CJK Compatibility Ideographs Supplement
 917504~ 917631(    127)/00E0000...00E007F;Tags
 917760~ 917999(    239)/00E0100...00E01EF;Variation Selectors Supplement
 983040~1048575(  65535)/00F0000...00FFFFF;Supplementary Private Use Area-A
1048576~1114111(  65535)/0100000...010FFFF;Supplementary Private Use Area-B
total=256084
Diffing against other versions might be fun, so here is the raw text pasted as well.
# Blocks-7.0.0.txt
# Date: 2014-04-03, 23:23:00 GMT [RP, KW]
#
# Unicode Character Database
# Copyright (c) 1991-2014 Unicode, Inc.
# For terms of use, see http://www.unicode.org/terms_of_use.html
# For documentation, see http://www.unicode.org/reports/tr44/
#
# Note: The casing of block names is not normative.
# For example, "Basic Latin" and "BASIC LATIN" are equivalent.
#
# Format:
# Start Code..End Code; Block Name
# ================================================
# Note: When comparing block names, casing, whitespace, hyphens,
# and underbars are ignored.
# For example, "Latin Extended-A" and "latin extended a" are equivalent.
# For more information on the comparison of property values,
# see UAX #44: http://www.unicode.org/reports/tr44/
#
# All code points not explicitly listed for Block
# have the value No_Block.
# Property: Block
#
# @missing: 0000..10FFFF; No_Block
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0100..017F; Latin Extended-A
0180..024F; Latin Extended-B
0250..02AF; IPA Extensions
02B0..02FF; Spacing Modifier Letters
0300..036F; Combining Diacritical Marks
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
0500..052F; Cyrillic Supplement
0530..058F; Armenian
0590..05FF; Hebrew
0600..06FF; Arabic
0700..074F; Syriac
0750..077F; Arabic Supplement
0780..07BF; Thaana
07C0..07FF; NKo
0800..083F; Samaritan
0840..085F; Mandaic
08A0..08FF; Arabic Extended-A
0900..097F; Devanagari
0980..09FF; Bengali
0A00..0A7F; Gurmukhi
0A80..0AFF; Gujarati
0B00..0B7F; Oriya
0B80..0BFF; Tamil
0C00..0C7F; Telugu
0C80..0CFF; Kannada
0D00..0D7F; Malayalam
0D80..0DFF; Sinhala
0E00..0E7F; Thai
0E80..0EFF; Lao
0F00..0FFF; Tibetan
1000..109F; Myanmar
10A0..10FF; Georgian
1100..11FF; Hangul Jamo
1200..137F; Ethiopic
1380..139F; Ethiopic Supplement
13A0..13FF; Cherokee
1400..167F; Unified Canadian Aboriginal Syllabics
1680..169F; Ogham
16A0..16FF; Runic
1700..171F; Tagalog
1720..173F; Hanunoo
1740..175F; Buhid
1760..177F; Tagbanwa
1780..17FF; Khmer
1800..18AF; Mongolian
18B0..18FF; Unified Canadian Aboriginal Syllabics Extended
1900..194F; Limbu
1950..197F; Tai Le
1980..19DF; New Tai Lue
19E0..19FF; Khmer Symbols
1A00..1A1F; Buginese
1A20..1AAF; Tai Tham
1AB0..1AFF; Combining Diacritical Marks Extended
1B00..1B7F; Balinese
1B80..1BBF; Sundanese
1BC0..1BFF; Batak
1C00..1C4F; Lepcha
1C50..1C7F; Ol Chiki
1CC0..1CCF; Sundanese Supplement
1CD0..1CFF; Vedic Extensions
1D00..1D7F; Phonetic Extensions
1D80..1DBF; Phonetic Extensions Supplement
1DC0..1DFF; Combining Diacritical Marks Supplement
1E00..1EFF; Latin Extended Additional
1F00..1FFF; Greek Extended
2000..206F; General Punctuation
2070..209F; Superscripts and Subscripts
20A0..20CF; Currency Symbols
20D0..20FF; Combining Diacritical Marks for Symbols
2100..214F; Letterlike Symbols
2150..218F; Number Forms
2190..21FF; Arrows
2200..22FF; Mathematical Operators
2300..23FF; Miscellaneous Technical
2400..243F; Control Pictures
2440..245F; Optical Character Recognition
2460..24FF; Enclosed Alphanumerics
2500..257F; Box Drawing
2580..259F; Block Elements
25A0..25FF; Geometric Shapes
2600..26FF; Miscellaneous Symbols
2700..27BF; Dingbats
27C0..27EF; Miscellaneous Mathematical Symbols-A
27F0..27FF; Supplemental Arrows-A
2800..28FF; Braille Patterns
2900..297F; Supplemental Arrows-B
2980..29FF; Miscellaneous Mathematical Symbols-B
2A00..2AFF; Supplemental Mathematical Operators
2B00..2BFF; Miscellaneous Symbols and Arrows
2C00..2C5F; Glagolitic
2C60..2C7F; Latin Extended-C
2C80..2CFF; Coptic
2D00..2D2F; Georgian Supplement
2D30..2D7F; Tifinagh
2D80..2DDF; Ethiopic Extended
2DE0..2DFF; Cyrillic Extended-A
2E00..2E7F; Supplemental Punctuation
2E80..2EFF; CJK Radicals Supplement
2F00..2FDF; Kangxi Radicals
2FF0..2FFF; Ideographic Description Characters
3000..303F; CJK Symbols and Punctuation
3040..309F; Hiragana
30A0..30FF; Katakana
3100..312F; Bopomofo
3130..318F; Hangul Compatibility Jamo
3190..319F; Kanbun
31A0..31BF; Bopomofo Extended
31C0..31EF; CJK Strokes
31F0..31FF; Katakana Phonetic Extensions
3200..32FF; Enclosed CJK Letters and Months
3300..33FF; CJK Compatibility
3400..4DBF; CJK Unified Ideographs Extension A
4DC0..4DFF; Yijing Hexagram Symbols
4E00..9FFF; CJK Unified Ideographs
A000..A48F; Yi Syllables
A490..A4CF; Yi Radicals
A4D0..A4FF; Lisu
A500..A63F; Vai
A640..A69F; Cyrillic Extended-B
A6A0..A6FF; Bamum
A700..A71F; Modifier Tone Letters
A720..A7FF; Latin Extended-D
A800..A82F; Syloti Nagri
A830..A83F; Common Indic Number Forms
A840..A87F; Phags-pa
A880..A8DF; Saurashtra
A8E0..A8FF; Devanagari Extended
A900..A92F; Kayah Li
A930..A95F; Rejang
A960..A97F; Hangul Jamo Extended-A
A980..A9DF; Javanese
A9E0..A9FF; Myanmar Extended-B
AA00..AA5F; Cham
AA60..AA7F; Myanmar Extended-A
AA80..AADF; Tai Viet
AAE0..AAFF; Meetei Mayek Extensions
AB00..AB2F; Ethiopic Extended-A
AB30..AB6F; Latin Extended-E
ABC0..ABFF; Meetei Mayek
AC00..D7AF; Hangul Syllables
D7B0..D7FF; Hangul Jamo Extended-B
D800..DB7F; High Surrogates
DB80..DBFF; High Private Use Surrogates
DC00..DFFF; Low Surrogates
E000..F8FF; Private Use Area
F900..FAFF; CJK Compatibility Ideographs
FB00..FB4F; Alphabetic Presentation Forms
FB50..FDFF; Arabic Presentation Forms-A
FE00..FE0F; Variation Selectors
FE10..FE1F; Vertical Forms
FE20..FE2F; Combining Half Marks
FE30..FE4F; CJK Compatibility Forms
FE50..FE6F; Small Form Variants
FE70..FEFF; Arabic Presentation Forms-B
FF00..FFEF; Halfwidth and Fullwidth Forms
FFF0..FFFF; Specials
10000..1007F; Linear B Syllabary
10080..100FF; Linear B Ideograms
10100..1013F; Aegean Numbers
10140..1018F; Ancient Greek Numbers
10190..101CF; Ancient Symbols
101D0..101FF; Phaistos Disc
10280..1029F; Lycian
102A0..102DF; Carian
102E0..102FF; Coptic Epact Numbers
10300..1032F; Old Italic
10330..1034F; Gothic
10350..1037F; Old Permic
10380..1039F; Ugaritic
103A0..103DF; Old Persian
10400..1044F; Deseret
10450..1047F; Shavian
10480..104AF; Osmanya
10500..1052F; Elbasan
10530..1056F; Caucasian Albanian
10600..1077F; Linear A
10800..1083F; Cypriot Syllabary
10840..1085F; Imperial Aramaic
10860..1087F; Palmyrene
10880..108AF; Nabataean
10900..1091F; Phoenician
10920..1093F; Lydian
10980..1099F; Meroitic Hieroglyphs
109A0..109FF; Meroitic Cursive
10A00..10A5F; Kharoshthi
10A60..10A7F; Old South Arabian
10A80..10A9F; Old North Arabian
10AC0..10AFF; Manichaean
10B00..10B3F; Avestan
10B40..10B5F; Inscriptional Parthian
10B60..10B7F; Inscriptional Pahlavi
10B80..10BAF; Psalter Pahlavi
10C00..10C4F; Old Turkic
10E60..10E7F; Rumi Numeral Symbols
11000..1107F; Brahmi
11080..110CF; Kaithi
110D0..110FF; Sora Sompeng
11100..1114F; Chakma
11150..1117F; Mahajani
11180..111DF; Sharada
111E0..111FF; Sinhala Archaic Numbers
11200..1124F; Khojki
112B0..112FF; Khudawadi
11300..1137F; Grantha
11480..114DF; Tirhuta
11580..115FF; Siddham
11600..1165F; Modi
11680..116CF; Takri
118A0..118FF; Warang Citi
11AC0..11AFF; Pau Cin Hau
12000..123FF; Cuneiform
12400..1247F; Cuneiform Numbers and Punctuation
13000..1342F; Egyptian Hieroglyphs
16800..16A3F; Bamum Supplement
16A40..16A6F; Mro
16AD0..16AFF; Bassa Vah
16B00..16B8F; Pahawh Hmong
16F00..16F9F; Miao
1B000..1B0FF; Kana Supplement
1BC00..1BC9F; Duployan
1BCA0..1BCAF; Shorthand Format Controls
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; Musical Symbols
1D200..1D24F; Ancient Greek Musical Notation
1D300..1D35F; Tai Xuan Jing Symbols
1D360..1D37F; Counting Rod Numerals
1D400..1D7FF; Mathematical Alphanumeric Symbols
1E800..1E8DF; Mende Kikakui
1EE00..1EEFF; Arabic Mathematical Alphabetic Symbols
1F000..1F02F; Mahjong Tiles
1F030..1F09F; Domino Tiles
1F0A0..1F0FF; Playing Cards
1F100..1F1FF; Enclosed Alphanumeric Supplement
1F200..1F2FF; Enclosed Ideographic Supplement
1F300..1F5FF; Miscellaneous Symbols and Pictographs
1F600..1F64F; Emoticons
1F650..1F67F; Ornamental Dingbats
1F680..1F6FF; Transport and Map Symbols
1F700..1F77F; Alchemical Symbols
1F780..1F7FF; Geometric Shapes Extended
1F800..1F8FF; Supplemental Arrows-C
20000..2A6DF; CJK Unified Ideographs Extension B
2A700..2B73F; CJK Unified Ideographs Extension C
2B740..2B81F; CJK Unified Ideographs Extension D
2F800..2FA1F; CJK Compatibility Ideographs Supplement
E0000..E007F; Tags
E0100..E01EF; Variation Selectors Supplement
F0000..FFFFF; Supplementary Private Use Area-A
100000..10FFFF; Supplementary Private Use Area-B
# EOF
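The header of the raw file says block names are compared with casing, whitespace, hyphens, and underbars ignored. A minimal sketch of that comparison follows. `normalizeBlockName` and `sameBlockName` are names I made up, and this covers only what the header states, not whatever else UAX #44 specifies.

```javascript
// Loose comparison of block names, following the note in the Blocks.txt header:
// casing, whitespace, hyphens and underbars are ignored.
// Both function names are hypothetical helpers for this sketch.
function normalizeBlockName(name) {
  return name.toLowerCase().replace(/[\s\-_]/g, "");
}

function sameBlockName(a, b) {
  return normalizeBlockName(a) === normalizeBlockName(b);
}

// The two pairs given as examples in the file header:
console.log(sameBlockName("Basic Latin", "BASIC LATIN"));           // true
console.log(sameBlockName("Latin Extended-A", "latin extended a")); // true
// Genuinely different names stay different:
console.log(sameBlockName("Latin Extended-A", "Latin Extended-B")); // false
```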
Here is the JS, which does nothing more than reformat the text. Hopefully it will still work next time.
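// Run in the browser console with Blocks.txt open in the tab.
// Right-align a value in a 7-character column, padded with spaces.
var ssr = function (v) { return ("       " + v).slice(-7); };
// Left-pad a hex string with zeros to 7 characters.
var sss = function (v) { return ("000000" + v).slice(-7); };
var total = 0;
document.body.innerText.split("\n").forEach(function (line) {
  // Data lines look like "0000..007F; Basic Latin"; comment lines do not match.
  var m = line.match(/^([0-9A-F]+)\.\.([0-9A-F]+); (.+)/);
  if (m == null) {
    return;
  }
  var v = {};
  v.hexFrom = m[1];
  v.hexTo = m[2];
  v.decFrom = parseInt(v.hexFrom, 16);
  v.decTo = parseInt(v.hexTo, 16);
  v.name = m[3];
  // Note: decTo - decFrom is the distance between the two endpoints.
  // The number of code points in the block is one more than that.
  v.log = ssr(v.decFrom) + "~" + ssr(v.decTo) + "(" + ssr(v.decTo - v.decFrom) + ")" +
    "/" + sss(v.hexFrom) + "..." + sss(v.hexTo) + ";" + v.name + "\n";
  total += (v.decTo - v.decFrom);
  console.log(v.log);
});
console.log("total=" + total);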
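The script above prints end minus start for each block, which is one less than the number of code points in the range. Here is a sketch of a variant that counts inclusively and takes the file contents as a string, so it also runs outside the browser. `summarizeBlocks` is a made-up name, and the sample input is just a few lines copied from the file.

```javascript
// Hypothetical variant of the formatter: parses Blocks.txt text passed in
// as a string and counts each range inclusively (end - start + 1).
function summarizeBlocks(text) {
  const blocks = [];
  let total = 0;
  for (const line of text.split("\n")) {
    // Same pattern as the original script; comment and blank lines are skipped.
    const m = line.match(/^([0-9A-F]+)\.\.([0-9A-F]+); (.+)/);
    if (m === null) continue;
    const from = parseInt(m[1], 16);
    const to = parseInt(m[2], 16);
    const size = to - from + 1;
    blocks.push({ name: m[3].trim(), from, to, size });
    total += size;
  }
  return { blocks, total };
}

// A few lines taken from the file, plus a comment line that must be ignored.
const sample = [
  "# comment lines are ignored",
  "0000..007F; Basic Latin",
  "0080..00FF; Latin-1 Supplement",
  "4E00..9FFF; CJK Unified Ideographs",
].join("\n");

const summary = summarizeBlocks(sample);
console.log(summary.blocks.length); // 3
console.log(summary.total);         // 21248 (128 + 128 + 20992)
```

Run over the whole file, the inclusive total would be the 256,084 printed above plus one for every block listed.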
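Come to think of it, older versions can be viewed on the Internet Archive (Internet Archive Wayback Machine). Hmm, I had a quick look at the diff, but there is nothing very interesting in it. "Oh, it changed. Huh." is about all I got out of it.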
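For next time, a diff by block name might be easier to read than a line diff. The sketch below assumes you already have two versions of the file as strings. `parseBlocks` and `diffBlocks` are made-up names, and the two inputs are toy data, not a real pair of releases.

```javascript
// Parse "START..END; Name" lines into a Map of block name -> "START..END".
function parseBlocks(text) {
  const map = new Map();
  for (const line of text.split("\n")) {
    const m = line.match(/^([0-9A-F]+)\.\.([0-9A-F]+); (.+)/);
    if (m !== null) map.set(m[3].trim(), m[1] + ".." + m[2]);
  }
  return map;
}

// Hypothetical helper: report blocks that were added, removed, or whose range changed.
function diffBlocks(oldText, newText) {
  const oldMap = parseBlocks(oldText);
  const newMap = parseBlocks(newText);
  const added = [];
  const removed = [];
  const changed = [];
  for (const [name, range] of newMap) {
    if (!oldMap.has(name)) added.push(name);
    else if (oldMap.get(name) !== range) changed.push(name);
  }
  for (const name of oldMap.keys()) {
    if (!newMap.has(name)) removed.push(name);
  }
  return { added, removed, changed };
}

// Toy inputs for illustration only (not real release data).
const oldToy = "0000..007F; Basic Latin\n1000..100F; Toy Block\n2000..200F; Dropped Toy";
const newToy = "0000..007F; Basic Latin\n1000..101F; Toy Block\n3000..300F; New Toy";

const diff = diffBlocks(oldToy, newToy);
console.log(JSON.stringify(diff.added));   // ["New Toy"]
console.log(JSON.stringify(diff.removed)); // ["Dropped Toy"]
console.log(JSON.stringify(diff.changed)); // ["Toy Block"]
```

This matches names exactly. If a block's name changed only in casing or hyphenation between versions, it would show up as one removal plus one addition.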