<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <META http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>UTF-8 Sampler</title> <META http-equiv="Content-Style-Type" content="text/css"> <META name="viewport" content="width=device-width, initial-scale=1.0"> <LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css"> <LINK REL="shortcut icon" href="/favicon.ico" > <LINK REL="icon" href="/favicon.ico" type="image/x-icon"> <LINK REL="icon" type="image/ico" href="/favicon.ico"> <style type="text/css"> blockquote { margin-left:8px; margin-right:8px; font-size:90% } body { font-size:15px; font-family:calibri, arial narrow, arial, sans-serif, times; color:black; background:white; margin:16px; } tt { font-size:94% } </style> </head> <body> <h1><tt>UTF-8 SAMPLER</tt></h1> <big><big> ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯ · ₹</big></big> <p> <blockquote> Frank da Cruz<br> <a href="index.html">The Kermit Project</a><br> New York City<br> <a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a> <p> <i>Last update:</i> Thu Sep 15 14:00:00 2016 </blockquote> <p> <hr> [ <a href="http://www.columbia.edu/~fdc/pace/">PEACE</a> ] [ <a href="#poetry">Poetry</a> ] [ <a href="#glass">I Can Eat Glass</a> ] [ <a href="#quickbrownfox">Pangrams</a> ] [ <a href="#html">HTML Features</a> ] [ <a href="#credits">Credits, Tools, Commentary</a> ] <p> <big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for <a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages and scripts within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8. 
<p> As shown <a href="glass.html">HERE</a>, Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista, or Windows 7/8/10 when using a monospace Unicode font like <a href="http://www.monotype.com">Andale Mono WT J</a> or <a href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less fully populated Courier New, Lucida Console, or Andale Mono. <a href="ckermit.html">C-Kermit</a> can handle it too, <a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode display</a>. As many languages as are representable in your font can be seen on the screen at the same time. <p> This, however, is a Web page. It started out as a kind of stress test for UTF-8 support in Web browsers, which was spotty when this page was first created in the 1990s but has since become standard in all modern browsers. The problem now is mainly the fonts, and the browser's (or font's) support for less common blocks and for the nonzero Unicode planes (as in, e.g., the <a href="#braille">Braille</a> and <a href="#gothic">Gothic</a> examples below, respectively), and to some extent the rendition of combining sequences, right-to-left rendition (<a href="#arabic">Arabic</a>, <a href="#hebrew">Hebrew</a>), and so on. <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for a survey of Unicode fonts for Windows. <p> The subtitle above shows currency symbols of many lands. If they don't appear as blobs, we're off to a good start! (The one on the end is the <a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee sign</a>, which won't show up in fonts for a while.) 
<h3><a name="poetry">Poetry</a></h3> From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version): <p><blockquote> ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br> ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br> ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br> </blockquote> <p> From Laȝamon's<i> <a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i> (<i>The Chronicles of England</i>, Middle English, West Midlands): <p> <blockquote> An preost wes on leoden, Laȝamon was ihoten<br> He wes Leovenaðes sone -- liðe him be Drihten.<br> He wonede at Ernleȝe at æðelen are chirechen,<br> Uppen Sevarne staþe, sel þar him þuhte,<br> Onfest Radestone, þer he bock radde. </blockquote> <p> (The third letter in the author's name is Yogh, missing from many fonts; <a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample with some explanation of letters and encoding). <p> From the <cite>Tagelied</cite> of <a href="http://gutenberg.spiegel.de/autoren/eschenba.htm"> <b>Wolfram von Eschenbach</b></a> (Middle High German): <p><blockquote> Sîne klâwen durh die wolken sint geslagen,<br> er stîget ûf mit grôzer kraft,<br> ich sih in grâwen tägelîch als er wil tagen,<br> den tac, der im geselleschaft<br> erwenden wil, dem werden man,<br> den ich mit sorgen în verliez.<br> ich bringe in hinnen, ob ich kan.<br> sîn vil manegiu tugent michz leisten hiez.<br> </blockquote><p> Some lines of <a href="http://users.hol.gr/~artemis/odysseas_elytis.htm"> <b>Odysseus Elytis</b></a> (Greek): <blockquote> <table cellspacing=0 cellpadding=0> <tr> <td valign="top" style="padding-right:16"> Monotonic: <p> Τη γλώσσα μου έδωσαν ελληνική<br> το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br> Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br> <p> από το Άξιον Εστί<br> του Οδυσσέα Ελύτη <td valign="top"> Polytonic: <p> Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ<br/> τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> 
<p> ἀπὸ τὸ Ἄξιον ἐστί<br/> τοῦ Ὀδυσσέα Ἐλύτη<br/> </table> </blockquote> <p> The first stanza of <a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br> <p><blockquote> На берегу пустынных волн<br> Стоял он, дум великих полн,<br> И вдаль глядел. Пред ним широко<br> Река неслася; бедный чёлн<br> По ней стремился одиноко.<br> По мшистым, топким берегам<br> Чернели избы здесь и там,<br> Приют убогого чухонца;<br> И лес, неведомый лучам<br> В тумане спрятанного солнца,<br> Кругом шумел.<br> </blockquote><p> <a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Šota Rustaveli</b></a>'s Veṗxis Ṭq̇aosani, <cite>The Knight in the Tiger's Skin</cite> (Georgian):<p> <blockquote> ვეპხის ტყაოსანი შოთა რუსთაველი <p> ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა; მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა, დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა. </blockquote> <p> Tamil poetry of Subramaniya Bharathiyar: சுப்ரமணிய பாரதியார் (1882-1921): <p> <blockquote> யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம், <br> பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு, <br> நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!<br> தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும். </blockquote> <p> Kannada poetry by Kuvempu — ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು <p> <blockquote> ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು ಇಂದೆನ್ನ ಹೃದಯದಲಿ <br> ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ <p> ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗೀ... 
<br> ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗಿ <br> ಭವ ಭವದಿ ಭತಿಸಿಹೇ ಭವತಿ ದೂರ <br> ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ || ಬಾ ಇಲ್ಲಿ || </blockquote> <h3><a name="glass">I Can Eat Glass</a></h3> And from the sublime to the ridiculous, here is a <a href="#notes">certain phrase¹</a> in an assortment of languages: <p> <ol> <li><b>Sanskrit</b>: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥ <li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām. <li><b>Classical Greek</b>: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει. <li><b>Greek</b> (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα. <li><b>Greek</b> (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα. <br><b>Etruscan</b>: (NEEDED) <li><b>Latin</b>: Vitrum edere possum; mihi non nocet. <li><b>Old French</b>: Je puis mangier del voirre. Ne me nuit. <li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal. <li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas. <li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal. <li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må. <br><b>Champenois</b>: (NEEDED) <br><b>Lorrain</b>: (NEEDED) <li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma. <br><b>Corsican/Corsu</b>: (NEEDED) <br><b>Jèrriais</b>: (NEEDED) <li><b>Kreyòl Ayisyen</b> (Haïti): Mwen kap manje vè, li pa blese'm. <li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen. <li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal. <li><b>Spanish</b>: Puedo comer vidrio, no me hace daño. <li><b>Aragonés</b>: Puedo minchar beire, no me'n fa mal. <br><b>Aranés</b>: (NEEDED) <br><b>Mallorquín</b>: (NEEDED) <li><b>Galician</b>: Eu podo xantar cristais e non cortarme. <li><b>European Portuguese</b>: Posso comer vidro, não me faz mal. <li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>): Posso comer vidro, não me machuca. <li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'. 
<li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño. <li><b>Italian</b>: Posso mangiare il vetro e non mi fa male. <li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal. <li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male. <li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal. <li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae. <li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ. <li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali. <br><b>Campidanese</b> (Sardinia): (NEEDED) <br><b>Logudorese</b> (Sardinia): (NEEDED) <li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai. <br><b>Romany / Tsigane</b>: (NEEDED) <li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește. <li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min. <br><b>Pictish</b>: (NEEDED) <br><b>Breton</b>: (NEEDED) <li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya. <li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi. <li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee. <li><b>Old Irish</b> <i>(Ogham):</i> ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜ <li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna. <li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom. <li><b>Ulster Gaelic</b>: Ithim-sa gloine agus ní miste damh é. <li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi. <li><b>Anglo-Saxon</b> <i>(Runes):</i> ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬ <li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me. <li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt. <li><b>English</b>: I can eat glass and it doesn't hurt me. 
<li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation) <li id="braille"><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑ <li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi. <li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us. <br><b>Glaswegian</b>: (NEEDED) <li id="gothic"><b>Gothic</b> (<a href="#notes">4</a>):