<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <META http-equiv="Content-Type" content="text/html; charset=utf-8"> <title>UTF-8 Sampler</title> <META http-equiv="Content-Style-Type" content="text/css"> <META name="viewport" content="width=device-width, initial-scale=1.0"> <LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css"> <LINK REL="shortcut icon" href="/favicon.ico" > <LINK REL="icon" href="/favicon.ico" type="image/x-icon"> <LINK REL="icon" type="image/ico" href="/favicon.ico"> <style type="text/css"> blockquote { margin-left:8px; margin-right:8px; font-size:90% } body { font-size:15px; font-family:calibri, arial narrow, arial, sans-serif, times; color:black; background:white; margin:16px; } tt { font-size:94% } </style> </head> <body> <h1><tt>UTF-8 SAMPLER</tt></h1> <big><big> ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯ · ₹</big></big> <p> <blockquote> Frank da Cruz<br> <a href="index.html">The Kermit Project</a><br> New York City<br> <a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a> <p> <i>Last update:</i> Thu Sep 15 14:00:00 2016 </blockquote> <p> <hr> [ <a href="http://www.columbia.edu/~fdc/pace/">PEACE</a> ] [ <a href="#poetry">Poetry</a> ] [ <a href="#glass">I Can Eat Glass</a> ] [ <a href="#quickbrownfox">Pangrams</a> ] [ <a href="#html">HTML Features</a> ] [ <a href="#credits">Credits, Tools, Commentary</a> ] <p> <big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for <a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages and scripts within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8.
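<p> To make "ASCII-preserving" concrete, here is a minimal Python sketch (not part of the original page; the helper name <tt>utf8_hex</tt> and the sample characters are chosen purely for illustration). It prints the UTF-8 bytes of a few characters and checks that plain ASCII text comes out byte-for-byte unchanged:

```python
# Illustrative sample characters (an assumption of this sketch, not from the page):
# ASCII letter, Latin-1 letter, Euro sign, and a Gothic letter from Plane 1.
samples = ["A", "\u00e9", "\u20ac", "\U00010330"]


def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex byte pairs."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


for ch in samples:
    # One line per character: code point, then its UTF-8 byte sequence.
    print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")

# ASCII text is identical whether encoded as ASCII or as UTF-8.
title = "UTF-8 SAMPLER"
ascii_preserved = title.encode("utf-8") == title.encode("ascii")
print("ASCII preserved:", ascii_preserved)
```

<p> Characters outside ASCII take two, three, or four bytes, which is why a file like this one can mix scripts freely while remaining readable as ASCII wherever it contains only ASCII.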
<p> As shown <a href="glass.html">HERE</a>, Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista, or Windows 7/8/10 when using a monospace Unicode font like <a href="http://www.monotype.com">Andale Mono WT J</a> or <a href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less fully populated Courier New, Lucida Console, or Andale Mono. <a href="ckermit.html">C-Kermit</a> can handle it too, <a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode display</a>. As many languages as are representable in your font can be seen on the screen at the same time. <p> This, however, is a Web page. It started out as a kind of stress test for UTF-8 support in Web browsers, which was spotty when this page was first created in the 1990s but has since become standard in all modern browsers. The problem now is mainly the fonts (as in, e.g., the <a href="#braille">Braille</a> example below) and the browser's (or font's) support for the nonzero Unicode planes (as in the <a href="#gothic">Gothic</a> example below), and to some extent the rendition of combining sequences, right-to-left rendition (<a href="#arabic">Arabic</a>, <a href="#hebrew">Hebrew</a>), and so on. <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for a survey of Unicode fonts for Windows. <p> The subtitle above shows currency symbols of many lands. If they don't appear as blobs, we're off to a good start! (The one on the end is the <a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee sign</a>, which won't show up in fonts for a while.)
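<p> The "planes" mentioned above are simply 65,536-character slices of the Unicode code space. As a small Python sketch (the function name <tt>plane</tt> and the three sample characters are assumptions made here for illustration), the plane of a character is its code point divided by 0x10000, and anything outside Plane 0 needs four bytes in UTF-8:

```python
def plane(ch):
    """Return the Unicode plane number (0 to 16) of a single character."""
    return ord(ch) // 0x10000


# Hypothetical sample set: a Latin letter, the Indian Rupee sign (U+20B9),
# and the Gothic letter ahsa (U+10330).
examples = [
    ("Latin A", "A"),
    ("Indian Rupee sign", "\u20b9"),
    ("Gothic ahsa", "\U00010330"),
]

for name, ch in examples:
    size = len(ch.encode("utf-8"))  # number of bytes in the UTF-8 form
    print(f"{name}: U+{ord(ch):04X}, plane {plane(ch)}, {size} UTF-8 byte(s)")
```

<p> A browser or font that only handles Plane 0 will typically show a blob or box for the last of these.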
<h3><a name="poetry">Poetry</a></h3> From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version): <p><blockquote> á áá»á«áá¦á¦á«á á±á©á á¢á±á«á áá±áªá«á·áá»á¹á¦áá³á¢á<br> áá³ááªáá«á¦ááªá»á«ááªá¾á¾áªá«á·áá»á¹á¦áá³á«ááá³áá¢á¾á«á»á¦áá«áá«ááªá¾<br> á·áá á«á»áá«á¹áááá«á á©á±á«áá±áá»áá¾áá«áá©áááá«á»ááááªá¾á¬<br> </blockquote> <p> From LaÈamon's<i> <a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i> (<i>The Chronicles of England</i>, Middle English, West Midlands): <p> <blockquote> An preost wes on leoden, LaÈamon was ihoten<br> He wes Leovenaðes sone -- liðe him be Drihten.<br> He wonede at ErnleÈe at æðelen are chirechen,<br> Uppen Sevarne staþe, sel þar him þuhte,<br> Onfest Radestone, þer he bock radde. </blockquote> <p> (The third letter in the author's name is Yogh, missing from many fonts; <a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample with some explanation of letters and encoding). <p> From the <cite>Tagelied</cite> of <a href="http://gutenberg.spiegel.de/autoren/eschenba.htm"> <b>Wolfram von Eschenbach</b></a> (Middle High German): <p><blockquote> Sîne klâwen durh die wolken sint geslagen,<br> er stîget ûf mit grôzer kraft,<br> ich sih in grâwen tägelîch als er wil tagen,<br> den tac, der im geselleschaft<br> erwenden wil, dem werden man,<br> den ich mit sorgen în verliez.<br> ich bringe in hinnen, ob ich kan.<br> sîn vil manegiu tugent michz leisten hiez.<br> </blockquote><p> Some lines of <a href="http://users.hol.gr/~artemis/odysseas_elytis.htm"> <b>Odysseus Elytis</b></a> (Greek): <blockquote> <table cellspacing=0 cellpadding=0> <tr> <td valign="top" style="padding-right:16"> Monotonic: <p> Τη γλÏÏÏα Î¼Î¿Ï ÎδÏÏαν ελληνική<br> Ïο ÏÏίÏι ÏÏÏÏÎ¹ÎºÏ ÏÏÎ¹Ï Î±Î¼Î¼Î¿Ï Î´Î¹ÎÏ ÏÎ¿Ï ÎμήÏÎ¿Ï .<br> ÎονάÏη Îγνοια η γλÏÏÏα Î¼Î¿Ï ÏÏÎ¹Ï Î±Î¼Î¼Î¿Ï Î´Î¹ÎÏ ÏÎ¿Ï ÎμήÏÎ¿Ï .<br> <p> αÏÏ Ïο Îξιον ÎÏÏί<br> ÏÎ¿Ï ÎÎ´Ï ÏÏÎα ÎλÏÏη <td valign="top"> Polytonic: <p> Τὴ γλῶÏÏα μοῦ á¼Î´ÏÏαν 
á¼Î»Î»Î·Î½Î¹Îºá½´<br/> Ïὸ ÏÏίÏι ÏÏÏÏικὸ ÏÏá½¶Ï á¼Î¼Î¼Î¿Ï Î´Î¹á½²Ï Ïοῦ á½Î¼Î®ÏÎ¿Ï .<br/> ÎονάÏη á¼Î³Î½Î¿Î¹Î± ἡ γλῶÏÏα Î¼Î¿Ï ÏÏá½¶Ï á¼Î¼Î¼Î¿Ï Î´Î¹á½²Ï Ïοῦ á½Î¼Î®ÏÎ¿Ï .<br/> <p> á¼Ïὸ Ïὸ á¼Î¾Î¹Î¿Î½ á¼ÏÏί<br/> Ïοῦ á½Î´Ï ÏÏÎα á¼Î»ÏÏη<br/> </table> </blockquote> <p> The first stanza of <a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br> <p><blockquote> Ðа беÑÐµÐ³Ñ Ð¿ÑÑÑÑннÑÑ Ð²Ð¾Ð»Ð½<br> СÑоÑл он, дÑм Ð²ÐµÐ»Ð¸ÐºÐ¸Ñ Ð¿Ð¾Ð»Ð½,<br> Ð Ð²Ð´Ð°Ð»Ñ Ð³Ð»Ñдел. ÐÑед ним ÑиÑоко<br> Река неÑлаÑÑ; беднÑй ÑÑлн<br> Ðо ней ÑÑÑемилÑÑ Ð¾Ð´Ð¸Ð½Ð¾ÐºÐ¾.<br> Ðо мÑиÑÑÑм, Ñопким беÑегам<br> ЧеÑнели Ð¸Ð·Ð±Ñ Ð·Ð´ÐµÑÑ Ð¸ Ñам,<br> ÐÑиÑÑ Ñбогого ÑÑÑ Ð¾Ð½Ñа;<br> РлеÑ, неведомÑй лÑÑам<br> Ð ÑÑмане ÑпÑÑÑанного ÑолнÑа,<br> ÐÑÑгом ÑÑмел.<br> </blockquote><p> <a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Å ota Rustaveli</b></a>'s VepÌxis TÌ£qÌaosani, ̣︡Th, <cite>The Knight in the Tiger's Skin</cite> (Georgian):<p> <blockquote> áááá®áá¡ á¢á§ááá¡ááá á¨ááá á á£á¡áááááá <p> á¦ááá áá¡á á¨áááááá á, áá£áᣠáááá áááá®á¡ááá¡ á¡áá¤ááá¡á á¨á áááá¡á, áªááªá®áá¡, á¬á§ááá¡á áá ááá¬áá¡á, á°ááá áá áááá áá áááá¡á; ááááªááá¡ á¤á áááá áá áá¦áá¤á áááá, áááá°á®ááá ááá¡ á©ááá¡á áááááá¡á, áá¦áá¡áá áá á¦áááá áá°á®áááááá áááá¡á áááááá áá áááááá¡á. </blockquote> <p> Tamil poetry of Subramaniya Bharathiyar: à®à¯à®ªà¯à®°à®®à®£à®¿à®¯ பாரதியார௠(1882-1921): <p> <blockquote> யாமறிநà¯à®¤ à®®à¯à®´à®¿à®à®³à®¿à®²à¯ தமிழà¯à®®à¯à®´à®¿ பà¯à®²à¯ à®à®©à®¿à®¤à®¾à®µà®¤à¯ à®à®à¯à®à¯à®®à¯ à®à®¾à®£à¯à®®à¯, <br> பாமரராய௠விலà®à¯à®à¯à®à®³à®¾à®¯à¯, à®à®²à®à®©à¯à®¤à¯à®¤à¯à®®à¯ à®à®à®´à¯à®à¯à®à®¿à®à¯à®²à®ªà¯ பானà¯à®®à¯ à®à¯à®à¯à®à¯, <br> நாமமத௠தமிழரà¯à®©à®à¯ à®à¯à®£à¯à®à¯ à®à®à¯à®à¯ வாழà¯à®¨à¯à®¤à®¿à®à¯à®¤à®²à¯ நனà¯à®±à¯? à®à¯à®²à¯à®²à¯à®°à¯!<br> தà¯à®®à®¤à¯à®°à®¤à¯ தமிழà¯à®à¯ à®à®²à®à®®à¯à®²à®¾à®®à¯ பரவà¯à®®à¯à®µà®à¯ à®à¯à®¯à¯à®¤à®²à¯ வà¯à®£à¯à®à¯à®®à¯. 
</blockquote> <p> Kannada poetry by Kuvempu — ಬಾ à²à²²à³à²²à²¿ ಸà²à²à²µà²¿à²¸à³ <p> <blockquote> ಬಾ à²à²²à³à²²à²¿ ಸà²à²à²µà²¿à²¸à³ à²à²à²¦à³à²¨à³à²¨ ಹà³à²¦à²¯à²¦à²²à²¿ <br> ನಿತà³à²¯à²µà³ ಠವತರಿಪ ಸತà³à²¯à²¾à²µà²¤à²¾à²° <p> ಮಣà³à²£à²¾à²à²¿ ಮರವಾà²à²¿ ಮಿà²à²µà²¾à²à²¿ à²à²à²µà²¾à²à³... <br> ಮಣà³à²£à²¾à²à²¿ ಮರವಾà²à²¿ ಮಿà²à²µà²¾à²à²¿ à²à²à²µà²¾à²à²¿ <br> à²à²µ à²à²µà²¦à²¿ à²à²¤à²¿à²¸à²¿à²¹à³ à²à²µà²¤à²¿ ದà³à²° <br> ನಿತà³à²¯à²µà³ ಠವತರಿಪ ಸತà³à²¯à²¾à²µà²¤à²¾à²° || ಬಾ à²à²²à³à²²à²¿ || </blockquote> <h3><a name="glass">I Can Eat Glass</a></h3> And from the sublime to the ridiculous, here is a <a href="#notes">certain phrase¹</a> in an assortment of languages: <p> <ol> <li><b>Sanskrit</b>: à¤à¤¾à¤à¤ शà¤à¥à¤¨à¥à¤®à¥à¤¯à¤¤à¥à¤¤à¥à¤®à¥ । नà¥à¤ªà¤¹à¤¿à¤¨à¤¸à¥à¤¤à¤¿ मामॠ॥ <li><b>Sanskrit</b> <i>(standard transcription):</i> kÄcaá¹ Åaknomyattum; nopahinasti mÄm. <li><b>Classical Greek</b>: á½Î±Î»Î¿Î½ Ïαγεá¿Î½ δύναμαιΠÏοῦÏο οὠμε βλάÏÏει. <li><b>Greek</b> (monotonic): ÎÏοÏÏ Î½Î± ÏÎ¬Ï ÏÏαÏμÎνα Î³Ï Î±Î»Î¹Î¬ ÏÏÏÎ¯Ï Î½Î± ÏÎ¬Î¸Ï ÏίÏοÏα. <li><b>Greek</b> (polytonic): ÎÏοÏῶ νὰ ÏÎ¬Ï ÏÏαÏμÎνα Î³Ï Î±Î»Î¹á½° ÏÏÏá½¶Ï Î½á½° ÏÎ¬Î¸Ï ÏίÏοÏα. <br><b>Etruscan</b>: (NEEDED) <li><b>Latin</b>: Vitrum edere possum; mihi non nocet. <li><b>Old French</b>: Je puis mangier del voirre. Ne me nuit. <li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal. <li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas. <li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal. <li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén mÃ¥. <br><b>Champenois</b>: (NEEDED) <br><b>Lorrain</b>: (NEEDED) <li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma. <br><b>Corsican/Corsu</b>: (NEEDED) <br><b>Jèrriais</b>: (NEEDED) <li><b>Kreyòl Ayisyen</b> (Haitï): Mwen kap manje vè, li pa blese'm. <li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen. <li><b>Catalan / Català </b>: Puc menjar vidre, que no em fa mal. <li><b>Spanish</b>: Puedo comer vidrio, no me hace daño. 
<li><b>Aragonés</b>: Puedo minchar beire, no me'n fa mal . <br><b>Aranés</b>: (NEEDED) <br><b>MallorquÃn</b>: (NEEDED) <li><b>Galician</b>: Eu podo xantar cristais e non cortarme. <li><b>European Portuguese</b>: Posso comer vidro, não me faz mal. <li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>): Posso comer vidro, não me machuca. <li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'. <li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño. <li><b>Italian</b>: Posso mangiare il vetro e non mi fa male. <li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal. <li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male. <li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal. <li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae. <li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ. <li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali. <br><b>Campinadese</b> (Sardinia): (NEEDED) <br><b>Lugudorese</b> (Sardinia): (NEEDED) <li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai. <br><b>Romany / Tsigane</b>: (NEEDED) <li><b>Romanian</b>: Pot sÄ mÄnânc sticlÄ Èi ea nu mÄ rÄneÈte. <li><b>Esperanto</b>: Mi povas manÄi vitron, Äi ne damaÄas min. <br><b>Pictish</b>: (NEEDED) <br><b>Breton</b>: (NEEDED) <li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya. <li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi. <li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee. <li><b>Old Irish</b> <i>(Ogham):</i> ááááá áááááááááááááá ááá ááááá áá <li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. NÃm·géna. <li><b>Irish</b>: Is féidir liom gloinne a ithe. Nà dhéanann sà dochar ar bith dom. <li><b>Ulster Gaelic</b>: Ithim-sa gloine agus nà miste damh é. <li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi. 
<li><b>Anglo-Saxon</b> <i>(Runes):</i> áá³á«áá¨á·á«á·áá¨áá«áá©ááªá¾á«á©á¾áá«á»ááá«á¾áá«á»ááªá±áááªá§á«ááᬠ<li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me. <li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouÈt. <li><b>English</b>: I can eat glass and it doesn't hurt me. <li><b>English</b> <i>(IPA):</i> [aɪ kæn iËt glÉËs ænd ɪt dÉz nÉt hÉËt miË] (Received Pronunciation) <li id="braille"><b>English</b> <i>(Braille):</i> â â â â â â â â â â â â â â â â â â â â â â â â â â â â â â â â ¥â â â â â <li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi. <li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us. <br><b>Glaswegian</b>: (NEEDED) <li id="gothic"><b>Gothic</b> (<a href="#notes">4</a>): ð¼ð°ð² ð²ð»ð´ð ð¹Ìðð°ð½, ð½ð¹ ð¼ð¹ð ð ð¿ ð½ð³ð°ð½ ð±ðð¹ð²ð²ð¹ð¸. <li><b>Old Norse</b> <i>(Runes):</i> áá´ á·áá ááá ᧠á·ááá± áá¾ á¦ááá á¨á§ á¡á á±á§á¨ áá¨á± <li><b>Old Norse</b> <i>(Latin):</i> Ek get etið gler án þess að verða sár. <li><b>Norsk / Norwegian (Nynorsk):</b> Eg kan eta glas utan Ã¥ skada meg. <li><b>Norsk / Norwegian (BokmÃ¥l):</b> Jeg kan spise glass uten Ã¥ skade meg. <li><b>Føroyskt / Faroese</b>: Eg kann eta glas, skaðaleysur. <!-- <br><b>Føroyskt / Faroese</b>: Eg kann eta glas, uttan á nakran hátt at meinslast av hesum. --> <li><b>Ãslenska / Icelandic</b>: Ãg get etið gler án þess að meiða mig. <li><b>Svenska / Swedish</b>: Jag kan äta glas utan att skada mig. <li><b>Dansk / Danish</b>: Jeg kan spise glas, det gør ikke ondt pÃ¥ mig. <li><b>Sønderjysk</b>: à ka æe glass uhen at det go mæ naue. <li><b>Frysk / Frisian</b>: Ik kin glês ite, it docht me net sear. <!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet mij geen pijn. --> <!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten zonder dat het mij schaadt. --> <!-- <li><tt>Dutch: Ik kan glas eten, maar dat doet mij geen kwaad.</tt> --> <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet mij geen kwaad. 
<LI><B>Kirchröadsj/Bôchesserplat</B>: Iech ken glaas èèse, mer 't deet miech jing pieng.</LI> <li><b>Afrikaans</b>: Ek kan glas eet, maar dit doen my nie skade nie. <li><b>Lëtzebuergescht / Luxemburgish</b>: Ech kan Glas iessen, daat deet mir nët wei. <li><b>Deutsch / German</b>: Ich kann Glas essen, ohne mir zu schaden. <li><b>Ruhrdeutsch</b>: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut. <li><b>Langenfelder Platt</b>: Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd. <li><b>Lausitzer Mundart</b> ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii. <li><b>Odenwälderisch</b>: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud. <li><b>Sächsisch / Saxon</b>: 'sch kann Glos essn, ohne dass'sch mer wehtue. <li><b>Pfälzisch</b>: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud. <li><b>Schwäbisch / Swabian</b>: I kå Glas frässa, ond des macht mr nix! <li><b>Deutsch (Vorarlberg)</b>: I ka glas eassa, ohne dass mar weh tuat. <li><b>Bayrisch / Bavarian</b>: I koh Glos esa, und es duard ma ned wei. <li><b>Allemannisch</b>: I kaun Gloos essen, es tuat ma ned weh. <li><b>Schwyzerdütsch</b> (Zürich): Ich chan Glaas ässe, das schadt mir nöd. <li><b>Schwyzerdütsch</b> (Luzern): Ech cha Glâs ässe, das schadt mer ned. <br><b>Plautdietsch</b>: (NEEDED) <li><b>Hungarian</b>: Meg tudom enni az üveget, nem lesz tőle bajom. <li><b>Suomi / Finnish</b>: Voin syödä lasia, se ei vahingoita minua. <li><b>Sami (Northern)</b>: Sáhtán borrat lása, dat ii leat bávččas. <li><b>Erzian</b>: Мон ярсан суликадо, ды зыян эйстэнзэ а ули. <li><b>Northern Karelian</b>: Mie voin syvvä lasie ta minla ei ole kipie. <li><b>Southern Karelian</b>: Minä voin syvvä st'oklua dai minule ei ole kibie. <br><b>Vepsian</b>: (NEEDED) <br><b>Votian</b>: (NEEDED) <br><b>Livonian</b>: (NEEDED) <li><b>Estonian</b>: Ma võin klaasi süüa, see ei tee mulle midagi. <li><b>Latvian</b>: Es varu ēst stiklu, tas man nekaitē.
<li><b>Lithuanian</b>: Aš galiu valgyti stiklą ir jis manęs nežeidžia <br><b>Old Prussian</b>: (NEEDED) <br><b>Sorbian</b> (Wendish): (NEEDED) <li><b>Czech</b>: Mohu jíst sklo, neublíží mi. <li><b>Slovak</b>: Môžem jesť sklo. Nezraní ma. <li><b>Polska / Polish</b>: Mogę jeść szkło i mi nie szkodzi. <li><b>Slovenian:</b> Lahko jem steklo, ne da bi mi škodovalo. <!-- <li><b>Croatian</b>: Ja mogu jesti staklo i ne boli me. Serbian translation is very poor. Infinitive used and sound as: "I can eating glass". <li><b>Serbian</b> <i>(Latin):</i> Mogu jesti staklo a da mi ne škodi. <li><b>Serbian</b> <i>(Cyrillic):</i> Могу јести стакло а да ми не шкоди. <li><b>Serbian</b> <i>(Latin):</i> Ja mogu da jedem staklo. <li><b>Serbian</b> <i>(Cyrillic)</i>: Ја могу да једем стакло. <li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета. --> <li><b>Bosnian, Croatian, Montenegrin and Serbian</b> <i>(Latin)</i>: Ja mogu jesti staklo, i to mi ne šteti. <li><b>Bosnian, Montenegrin and Serbian</b> <i>(Cyrillic)</i>: Ја могу јести стакло, и то ми не штети. <li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета. <li><b>Russian</b>: Я могу есть стекло, оно мне не вредит. <li><b>Belarusian</b> <i>(Cyrillic):</i> Я магу есці шкло, яно мне не шкодзіць. <li><b>Belarusian</b> <i>(Lacinka):</i> Ja mahu jeści škło, jano mne ne škodzić. <!-- <li><b>Ukrainian</b>: Я можу їсти шкло, й воно мені не пошкодить. --> <li><b>Ukrainian</b>: Я можу їсти скло, і воно мені не зашкодить. <!-- <li><b>Bulgarian</b>: Мога да ям стъкло и не ме боли. --> <li><b>Bulgarian</b>: Мога да ям стъкло, то не ми вреди. <li><b>Georgian</b>: მინას ვჭამ და არა მტკივა. <li><b>Armenian</b>: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։ <li><b>Albanian</b>: Unë mund të ha qelq dhe nuk më gjen gjë. <li><b>Turkish</b>: Cam yiyebilirim, bana zararı dokunmaz.
<li><b>Turkish</b> <i>(Ottoman):</i> جا٠ÙÙ٠بÙÙر٠بÚا ضرر٠طÙÙÙÙ٠ز <li><b>Bangla / Bengali</b>: à¦à¦®à¦¿ à¦à¦¾à¦à¦ à¦à§à¦¤à§ পারি, তাতৠà¦à¦®à¦¾à¦° à¦à§à¦¨à§ à¦à§à¦·à¦¤à¦¿ হৠনা। <li><b>Marathi</b>: मॠà¤à¤¾à¤ à¤à¤¾à¤ शà¤à¤¤à¥, मला तॠदà¥à¤à¤¤ नाहà¥. <!-- <li><b>Hindi</b>: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤, मà¥à¤à¥ à¤à¤¸ सॠà¤à¥à¤ पà¥à¤¡à¤¾ नहà¥à¤ हà¥à¤¤à¥. --> <li><b>Kannada</b>: ನನà²à³ ಹಾನಿ à²à²à²¦à³, ನಾನೠà²à²à²¨à³à²¨à³ ತಿನಬಹà³à²¦à³ <!-- (à²à²¨à³à²¨à²¡): à²à²²à³à²²à²¾à²¦à²°à³ à²à²°à³, à²à²à²¤à²¾à²¦à²°à³ à²à²°à³, à²à²à²¦à³à²à²¦à²¿à²à³ ನೠà²à²¨à³à²¨à²¡à²µà²¾à²à²¿à²°à³, à²à²¨à³à²¨à²¡à²µà³ ಸತà³à²¯.. à²à²¨à³à²¨à²¡à²µà³ ನಿತà³à²¯.. --> <li><b>Hindi</b>: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤ à¤à¤° मà¥à¤à¥ à¤à¤¸à¤¸à¥ à¤à¥à¤ à¤à¥à¤ नहà¥à¤ पहà¥à¤à¤à¤¤à¥. <li><b>Malayam</b>: à´à´¨à´¿à´àµà´àµ à´àµà´²à´¾à´¸àµ തിനàµà´¨à´¾à´. à´ à´¤àµà´¨àµà´¨àµ à´µàµà´¦à´¨à´¿à´ªàµà´ªà´¿à´àµà´à´¿à´²àµà´². <li><b>Tamil</b>: நான௠à®à®£à¯à®£à®¾à®à®¿ à®à®¾à®ªà¯à®ªà®¿à®à¯à®µà¯à®©à¯, ஠தனால௠à®à®©à®à¯à®à¯ à®à®°à¯ à®à¯à®à¯à®®à¯ வராதà¯. <li><b>Telugu</b>: à°¨à±à°¨à± à°à°¾à°à± తినà°à°²à°¨à± మరియౠఠలా à°à±à°¸à°¿à°¨à°¾ నాà°à± à°à°®à°¿ à°à°¬à±à°¬à°à°¦à°¿ à°²à±à°¦à± <li><b>Sinhalese</b>: මට à·à·à¶¯à·à¶»à· à¶à·à¶¸à¶§ à·à·à¶à·à¶ºà·. à¶à¶ºà·à¶±à· මට à¶à·à·à· à·à·à¶±à·à¶ºà¶à· à·à·à¶¯à· නà·à·à·. <li><b>Urdu</b><a href="#notes">(3)</a>: <span dir="RTL" lang=UR> Ù ÛÚº کاÙÚ Ú©Ú¾Ø§ سکتا ÛÙÚº اÙر Ù Ø¬Ú¾Û ØªÚ©ÙÛÙ ÙÛÛÚº ÛÙØªÛ Û</span> <li><b>Pashto</b><a href="#notes">(3)</a>: ز٠شÙØ´Ù Ø®ÙÚÙÛ Ø´Ù Ø Ùغ٠٠ا ÙÙ Ø®ÙÚÙÙ <li><b>Farsi / Persian</b><a href="#notes">(3)</a>: .Ù Ù Ù Û ØªÙاÙ٠بدÙÙ٠اØساس درد Ø´Ùش٠بخÙر٠<li id="arabic"><b>Arabic</b><a href="#notes">(3)</a>: <span dir="RTL" lang=AR>Ø£Ùا Ùادر عÙ٠أÙ٠اÙزجاج Ù Ùذا Ùا ÙؤÙÙ ÙÙ.</span> <br><B>Aramaic</B>: (NEEDED) <li><b>Maltese</b>: Nista' niekol il-ħġieġ u ma jagħmilli xejn. 
<li id="hebrew"><B>Hebrew</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=HE>×× × ×××× ××××× ×××××ת ××× ×× ×××ק ××.</SPAN> <li><B>Yiddish</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=JI>××× ×§×¢× ×¢×¡× ×××Ö¸× ××× ×¢×¡ ××× ××ר × ××©× ×°×².</SPAN> <br><b>Judeo-Arabic</b>: (NEEDED) <br><b>Ladino</b>: (NEEDED) <br><b>GÇʼÇz</b>: (NEEDED) <br><b>Amharic</b>: (NEEDED) <li><b>Twi</b>: Metumi awe tumpan, ÉnyÉ me hwee. <li><b>Hausa</b> (<i>Latin</i>): InaÌ iya taunar gilaÌshi kuma in gamaÌ laÌfiyaÌ. <li><b>Hausa</b> (<i>Ajami</i>) <a href="#notes">(2)</a>: <SPAN dir=rtl lang=HA> Ø¥ÙÙا Ø¥ÙÙ٠تÙÙÙÙر غÙÙÙاش٠ÙÙ٠٠إÙ٠غÙÙ Ùا ÙÙاÙÙÙÙا</SPAN> <li><b>Yoruba</b><a href="#notes">(4)</a>: Mo lè jeÌ© dÃgÃ, kò nà pa mà lára. <li><b>Lingala</b>: NakokiÌ koliÌya biteÌni bya milungi, ekosaÌla ngaÌiÌ mabeÌ tÉÌ. <!-- <li><b>Lingala</b>: Nakokà kolÃya biténi bya milungi, ekosála ngáà mabé tÉÌ. --> <li><b>(Ki)Swahili</b>: Naweza kula bilauri na sikunyui. <li><b>Malay</b>: Saya boleh makan kaca dan ia tidak mencederakan saya. <li><b>Tagalog</b>: Kaya kong kumain nang bubog at hindi ako masaktan. <li><b>Chamorro</b>: Siña yo' chumocho krestat, ti ha na'lalamen yo'. <li><b>Fijian</b>: Au rawa ni kana iloilo, ia au sega ni vakacacani kina. <li><b>Javanese</b>: Aku isa mangan beling tanpa lara. <li><b>Burmese</b> (Unicode 4.0): áá¹áá¹ááá¹âáá±á¬á¹âááá¹áá¹ááá¹âá áá¹ááá¹âá á¬á¸áá¯ááá¹âááá¹âá ááá¹áá±á¬áá¹âá· áááá¯ááá¹âáá¹áᯠááá¹áááá¬á (9) <li><b>Burmese</b> (Unicode 5.0): áá»á½ááºáá±á¬áº áá»á½ááºá áá¾ááºá á¬á¸ááá¯ááºáááºá áááºá¸áá¼á±á¬ááºá· ááááá¯ááºáá¾á¯ááá¾ááá«á (9) <li><B>Vietnamese (quá»c ngữ)</B>: Tôi có thá» Än thủy tinh mà không hại gì. <li><B>Vietnamese (nôm)</B> (<a href="#notes">4</a>): äº ð£ ä¸ å¹ æ°´ æ¶ ð¦¡ ç©º ð£ 害 å¦ <li><b>Khmer</b>: áááá»áá¢á¶á áá»ááááá áááá¶á ááááááá¶ááááá á¶á <li><b>Lao</b>: àºàºà»àºàºàº´àºà»àºà»àº§à»àºà»à»àºàºàºàºµà»àº¡àº±àºàºà»à»à»àºà»à»àº®àº±àºà»àº«à»àºàºà»àºà»àºàº±àº. 
<li><b>Thai</b>: à¸à¸±à¸à¸à¸´à¸à¸à¸£à¸°à¸à¸à¹à¸à¹ à¹à¸à¹à¸¡à¸±à¸à¹à¸¡à¹à¸à¸³à¹à¸«à¹à¸à¸±à¸à¹à¸à¹à¸ <li><b>Mongolian</b> <i>(Cyrillic):</i> Ðи Ñил идÑй Ñадна, надад Ñ Ð¾ÑÑой Ð±Ð¸Ñ <li><b>Mongolian</b> <i>(Classic)</i> (<a href="#notes">5</a>): á ªá ¢ á °á ¢á ¯á ¢ á ¢á ³á ¡á ¶á ¦ á ´á ¢á ³á á ¨á á á ¨á á ³á ¤á · á ¬á £á ¤á ·á á ³á á ¢ á ªá ¢á °á ¢ <br><b>Dzongkha</b>: (NEEDED) <li><b>Nepali</b>: म à¤à¤¾à¤à¤ à¤à¤¾à¤¨ सà¤à¥à¤à¥ र मलाठà¤à¥à¤¹à¤¿ नॠहà¥à¤¨à¥âनॠ। <li><b>Tibetan</b>: ཤེལà¼à½¦à¾à½¼à¼à½à¼à½à½¦à¼à½à¼à½à¼à½à½²à¼à½à¼à½¢à½ºà½à¼ <li><b>Chinese</b>: <span lang=zh>æè½åä¸ç»çèä¸ä¼¤èº«ä½ã</span> <li><b>Chinese</b> (Traditional): æè½åä¸ç»çèä¸å·èº«é«ã <li><b>Taiwanese</b><a href="#notes">(6)</a>: Góa Ä-tà ng chiaÌh po-lê, mÄ bÄ tioÌh-siong. <li><b>Japanese</b>: <span lang=ja>ç§ã¯ã¬ã©ã¹ãé£ã¹ããã¾ããããã¯ç§ãå·ã¤ãã¾ããã</span> <li><b>Korean</b>: <span lang=ko>ëë ì 리를 먹ì ì ìì´ì. ê·¸ëë ìíì§ ììì</span> <li><b>Bislama</b>: Mi save kakae glas, hemi no save katem mi.<br> <li><b>Hawaiian</b>: Hiki iaÊ»u ke Ê»ai i ke aniani; Ê»aÊ»ole nÅ lÄ au e Ê»eha.<br> <li><b>Marquesan</b>: E koÊ»ana e kai i te karahi, mea Ê»Ä, Ê»aÊ»e hauhau. <li><b>Inuktitut</b> (10): áááá ááááááᯠá±áá±á¦áááá áá <li><b>Chinook Jargon:</b> Naika mÉkmÉk kakshÉt labutay, pi weyk ukuk munk-sik nay. <li><b>Navajo</b>: Tsésǫʼ yishÄ ÌÄ go bÃÃnÃshghah dóó doo shiÅ neezgai da. <br><b>Cherokee</b> <i>(and Cree, Chickasaw, Cree, Micmac, Ojibwa, Lakota, Náhuatl, Quechua, Aymara, and other American languages):</i> (NEEDED) <br><b>Garifuna</b>: (NEEDED) <br><b>Gullah</b>: (NEEDED) <li><b>Lojban</b>: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi <li><b>Nórdicg</b>: Ljœr ye caudran créneþ ý jor cẃran. </ol> <p> <i>(Additions, corrections, completions,</i> <a href="mailto:kermit@kermitproject.org"><i>gratefuly accepted</i></a><i>.)</i> <p> For testing purposes, some of these are repeated in a <b>monospace font</b> . . . 
<p> <ol> <li><tt>Euro Symbol: â¬.</tt> <li><tt>Greek: ÎÏοÏÏ Î½Î± ÏÎ¬Ï ÏÏαÏμÎνα Î³Ï Î±Î»Î¹Î¬ ÏÏÏÎ¯Ï Î½Î± ÏÎ¬Î¸Ï ÏίÏοÏα.</tt> <li><tt>Ãslenska / Icelandic: Ãg get etið gler án þess að meiða mig.</tt> <li><tt>Polish: MogÄ jeÅÄ szkÅo, i mi nie szkodzi.</tt> <li><tt>Romanian: Pot sÄ mÄnânc sticlÄ Èi ea nu mÄ rÄneÈte.</tt> <li><tt>Ukrainian: Я Ð¼Ð¾Ð¶Ñ ÑÑÑи Ñкло, й воно Ð¼ÐµÐ½Ñ Ð½Ðµ поÑкодиÑÑ.</tt> <li><tt>Armenian: Ô¿ÖÕ¶Õ¡Õ´ Õ¡ÕºÕ¡Õ¯Õ« Õ¸ÖÕ¿Õ¥Õ¬ Ö Õ«Õ¶Õ®Õ« Õ¡Õ¶Õ°Õ¡Õ¶Õ£Õ«Õ½Õ¿ Õ¹Õ¨Õ¶Õ¥ÖÖ</tt> <li><tt>Georgian: ááááá¡ áááá áá áá á áá¢áááá.</tt> <li><tt>Hindi: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤, मà¥à¤à¥ à¤à¤¸ सॠà¤à¥à¤ पà¥à¤¡à¤¾ नहà¥à¤ हà¥à¤¤à¥.</tt> <li><tt>Hebrew<a href="#notes">(2)</a>: <SPAN dir=rtl lang=HE>×× × ×××× ××××× ×××××ת ××× ×× ×××ק ××.</SPAN></tt> <li><tt>Yiddish<a href="#notes">(2)</a>: <SPAN dir=rtl lang=JI>××× ×§×¢× ×¢×¡× ×××Ö¸× ××× ×¢×¡ ××× ××ר × ××©× ×°×².</SPAN></tt> <li><tt>Arabic<a href="#notes">(2)</a>: <span dir="RTL" lang=AR>Ø£Ùا Ùادر عÙ٠أÙ٠اÙزجاج Ù Ùذا Ùا ÙؤÙÙ ÙÙ.</span></tt> <li><tt>Japanese: <span lang=ja>ç§ã¯ã¬ã©ã¹ãé£ã¹ããã¾ããããã¯ç§ãå·ã¤ãã¾ããã</span></tt> <li><tt>Thai: à¸à¸±à¸à¸à¸´à¸à¸à¸£à¸°à¸à¸à¹à¸à¹ à¹à¸à¹à¸¡à¸±à¸à¹à¸¡à¹à¸à¸³à¹à¸«à¹à¸à¸±à¸à¹à¸à¹à¸</tt> </ol> <p> <b><a name="notes">Notes:</a></b> <p> <ol> <li>The "I can eat glass" phrase and initial translations (about 30 of them) were borrowed from Ethan Mollick's <a href="http://hcs.harvard.edu/~igp/glass.html">I Can Eat Glass</a> page (which disappeared on or about June 2004) and converted to UTF-8. Since Ethan's original page is gone, I should mention that his purpose was to offer travelers a phrase they could use in any country that would command a certain kind of respect, or at least get attention. See <a href="#credits">Credits</a> for the many additional contributions since then. When submitting new entries, the word "hurt" (if you have a choice) is used in the sense of "cause harm", "do damage", or "bother", rather than "inflict pain" or "make sad". 
In this vein Otto Stolz comments (as do others further down; personally I think it's better for the purpose of this page to have extra entries and/or to show a greater repertoire of characters than it is to enforce a strict interpretation of the word "hurt"!): <p> <blockquote> This is the meaning I have translated to the Swabian dialect. However, I just have noticed that most of the German variants translate the "inflict pain" meaning. The German example should read: <p> <blockquote> "Ich kann Glas essen ohne mir zu schaden." </blockquote> <p> rather than: <p> <blockquote> "Ich kann Glas essen, ohne mir weh zu tun." </blockquote> <p> (The comma fell victim to the 1996 orthographic reform, cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>.) <p> You may wish to contact the contributors of the following translations to correct them: <p> <ul> <li> Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei. <li> Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii. <li> Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue. <li> Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei. <li> Allemannisch: I kaun Gloos essen, es tuat ma ned weh. <li> Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh. </ul> <p> In contrast, I deem the following translations *alright*: <p> <ul> <li> Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut. <li> Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud. <li> Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix! </ul> <p> (However, you could remove the commas, on account of <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a> and <a href="http://www.ids-mannheim.de/reform/e3-1.html#P72"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P72</tt></a>, respectively.)
<p> I guess, also these examples translate the <i>wrong</i> sense of "hurt", though I do not know these languages well enough to assert them definitely: <p> <ul> <li> Nederlands / Dutch: Ik kan glas eten; het doet mij geen pijn. <i>(This one has been changed)</i> <li> Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng. </ul> <p> In the Romanic languages, the variations on "fa male" (it) are probably wrong, whilst the variations on "hace daño" (es) and "damaĝas" (Esperanto) are probably correct; "nocet" (la) is definitely right. <p> The northern Germanic variants of "skada" are probably right, as are the Slavic variants of "škodi/шкоди" (se); however the Slavic variants of "boli" (hv) are probably wrong, as "bolena" means "pain/ache", IIRC. </blockquote> <p> That was from July 2004. In December 2007, Otto writes again: <p> <blockquote> <small> Hello Frank, in days of yore, I had written:<br> &gt; "Ich kann Glas essen ohne mir zu schaden." <br> &gt; (The comma fell victim to the 1996 orthographic reform, <p> cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76">http://www.ids-mannheim.de/reform/e3-1.html#P76</a>. <p> The latest revision (2006) of the official German orthography has revived the comma around infinitive clauses commencing with <i>ohne</i>, or 5 other conjunctions, or depending from a noun or from an announcing demonstrative (<a href="http://www.ids-mannheim.de/reform/regeln2006.pdf">http://www.ids-mannheim.de/reform/regeln2006.pdf</a>, §75). So, it's again: <i>Ich kann Glas essen, ohne mir zu schaden.</i> <p> Best wishes,<br> Otto Stolz </small> </blockquote> <p> <li>The numbering of the samples is arbitrary, done only to keep track of how many there are, and can change any time a new entry is added. The arrangement is also arbitrary but with some attempt to group related examples together. Note: All languages not listed are wanted, not just the ones that say (NEEDED).
<p> <li><a name="note1">Correct right-to-left display of these languages depends on the capabilities of your browser.</a> The period should appear on the left. In the monospace Yiddish example, the Yiddish digraphs should occupy one character cell. <p> <li>Yoruba: The third word is Latin letter small 'j' followed by small 'e' with U+0329, Combining Vertical Line Below. This displays correctly only if your Unicode font includes the U+0329 glyph and your browser supports combining diacritical marks. The Lingala and Indic examples also include combining sequences. <p> <li>Includes Unicode 3.1 (or later) characters beyond Plane 0. <p> <li>The Classic Mongolian example should be vertical, top-to-bottom and left-to-right. But such display is almost impossible. Also no font yet exists which provides the proper ligatures and positional variants for the characters of this script, which works somewhat like Arabic. <p> <li>Taiwanese is also known as Holo or Hoklo, and is related to Southern Min dialects such as Amoy. Contributed by Henry H. Tan-Tenn, who comments, "The above is the romanized version, in a script current among Taiwanese Christians since the mid-19th century. It was invented by British missionaries and saw use in hundreds of published works, mostly of a religious nature. Most Taiwanese did not know Chinese characters then, or at least not well enough to read. More to the point, though, a written standard using Chinese characters has never developed, so a significant minority of words are represented with different candidate characters, depending on one's personal preference or etymological theory. In this sentence, for example, "-tàng", "chia̍h", "mā" and "bē" are problematic using Chinese characters. "Góa" (I/me) and "po-lê" (glass) are as written in other Sinitic languages (e.g. Mandarin, Hakka)."
<p> <li>Wagner Amaral of Pinese & Amaral Associados notes that the Brazilian Portuguese sentence for "I can eat glass" should be identical to the Portuguese one, as the word "machuca" means "inflict pain", or rather "injuries". The words "faz mal" would more correctly translate as "cause harm". <p> <li>Burmese: In English the first person pronoun "I" stands for both genders, male and female. In Burmese (except in the central part of Burma) kyundaw (<font size="+1" face="Padauk">áá¹áá¹ááá¹âáá±á¬á¹â</font>) for male and kyanma (<font size="+1" face="Padauk">áá¹áá¹ááá¹âá</font>) for female. Using here a fully-compliant Unicode Burmese font -- sadly one and only one Padauk Graphite font exists -- rendering using graphite engine. <!--GONE <a href="http://h1.ripway.com/bamarsar/">CLICK HERE</a> to test Burmese characters. --> Unicode 4.0 or older standard did not have some medial and vowel character; the second example has them. <p> <li><i>From Louise Hope, 22 November 2010:</i> I decided to have a go at an Inuktitut rendering, mainly in hopes of shaming someone who actually knows the language into coming up with something better. Meanwhile, try this: <p> áááá ááááááᯠá±áá±á¦áááá áá <br> aliguq nirijaraangakku suranngittunnaqtunga <p> Loosely: I am able not to hurt myself whenever I eat glass. <p> aliguq >> glass (uninflected because it is the patient of a transitive verb in an ergative language)<br> nirijaraangakku >> "I eat him/her/it" in Frequentative mood (all one verb with inflectional ending, no affixes whatsoever)<br> suranngittunnaqtunga >> suraq (do permanent harm) + nngit (verb-negator) + tunnaq (ability) + tunga (intransitive ending, making the verb passive or reflexive) <p> See above about someone who knows the language, et cetera. <p> Script trivia: the syllable á± is a single unicode character representing the two elements á (syllable-final n) and á (syllable ngi). I think they just did it that way because it looks tidier than the expected áá. 
If your operating system didn't come with <a href="http://www.ffonts.net/Euphemia-UCAS.font">Euphemia</a> (all-purpose UCAS font), you can download <a href="http://www.allaboutshoes.ca/inuk/our-boots/piq_font.php">Pigiarniq</a>. It comes with a jolly little inuksuk ᐀ that the Unicode Consortium is trying to make into a squatter. <p> <!-- ᓯᖁᒥᐅᒪᓐᖏᒃᑯᓂ ᓈᒪᖕᓇᓐᖏᔾᔪᒃ <br> siqumiumanngikkuni naamangnanngijjuk. --> </ol> <h3><a name="quickbrownfox">The Quick Brown Fox... Pangrams</a></h3> The "I can eat glass" sentences do not necessarily show off the orthography of each language to best advantage. In many alphabetic written languages it is possible to include all (or most) letters (or "special" characters) in a single (often nonsense) <i>pangram</i>. These were traditionally used in typewriter instruction; now they are useful for stress-testing computer fonts and keyboard input methods. Here are a few examples (SEND MORE): <p> <ol> <li><b>English:</b> The quick brown fox jumps over the lazy dog. <li><b>Jamaican:</b> Chruu, a kwik di kwik brong fox a jomp huova di liezi daag de, yu no siit? <li><b>Irish:</b> "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena ṗóg éada ó ṡlí do leasa ṫú?" "D'ḟuascail Íosa Úrṁac na hÓiġe Beannaiṫe pór Éava agus Áḋaiṁ." <li><b>Dutch:</b> Pa's wijze lynx bezag vroom het fikse aquaduct. <li><b>German: </b> Falsches Üben von Xylophonmusik quält jeden größeren Zwerg. (1) <li><b>German: </b> <span lang=da>Im finſteren Jagdſchloß am offenen Felsquellwaſſer patzte der affig-flatterhafte kauzig-höf&zwnj;liche Bäcker über ſeinem verſifften kniffligen C-Xylophon.</span> (2) <li><b>Norwegian:</b> Blåbærsyltetøy ("blueberry jam", includes every extra letter used in Norwegian). <li><b>Swedish:</b> Flygande bäckasiner söka strax hwila på mjuka tuvor. <li><b>Icelandic:</b> Sævör grét áðan því úlpan var ónýt. <li><b>Finnish:</b> (5) Törkylempijävongahdus (This is a perfect pangram, every letter appears only once. 
Translating it is an art on its own, but I'll say "rude lover's yelp". :-D) <li><b>Finnish:</b> (5) Albert osti fagotin ja töräytti puhkuvan melodian. (Albert bought a bassoon and hooted an impressive melody.) <li><b>Finnish:</b> (5) On sangen hauskaa, että polkupyörä on maanteiden jokapäiväinen ilmiö. (It's pleasantly amusing, that the bicycle is an everyday sight on the roads.) <li><b>Polish:</b> Pchnąć w tę łódź jeża lub osiem skrzyń fig. <li><b>Czech:</b> Příliš žluťoučký kůň úpěl ďábelské kódy. <li><b>Slovak:</b> Starý kôň na hŕbe kníh žuje tíško povädnuté ruže, na stĺpe sa ďateľ učí kvákať novú ódu o živote. <li><b>Greek</b> (monotonic): ξεσκεπάζω την ψυχοφθόρα βδελυγμία <li><b>Greek</b> (polytonic): ξεσκεπάζω τὴν ψυχοφθόρα βδελυγμία <li><b>Russian:</b> Съешь же ещё этих мягких французских булок да выпей чаю. <li><b>Russian:</b> В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ. <li><b>Bulgarian:</b> Жълтата дюля беше щастлива, че пухът, който цъфна, замръзна като гьон. <li><b>Sami (Northern):</b> Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža. <li><b>Hungarian:</b> Árvíztűrő tükörfúrógép. <li><b>Spanish:</b> El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro. <li><b>Portuguese:</b> O próximo vôo à noite sobre o Atlântico, põe freqüentemente o único médico. (3) <li><b>French:</b> Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être déçus en voyant leurs drôles d'œufs abîmés. <li><b>Esperanto:</b> Eĥoŝanĝo ĉiuĵaŭde. 
<li><b>Hebrew:</b> <span dir="RTL" lang=HE>זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.</span> <li><b>Japanese</b> (Hiragana):<blockquote> いろはにほへど　ちりぬるを<br> わがよたれぞ　つねならむ<br> うゐのおくやま　けふこえて<br> あさきゆめみじ　ゑひもせず (4) </blockquote> </ol> <p id="oechtringen"> <a name="notes2"><b>Notes:</b></a> <p> <ol> <li>Other phrases commonly used in Germany include: "Ein wackerer Bayer vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and eszett. Previously, going for the shortest sentence that has all the umlauts and special characters, I had "Grüße aus Bärenhöfe (und Óechtringen)!" Acute accents are not used in native German words, so I was surprised to discover "Óechtringen" in the Deutsche Bundespost Postleitzahlenbuch: <p> <blockquote> <a href="http://www.columbia.edu/~fdc/misc/oechtringen.jpg"><img src="oechtringen-sm.jpg" alt="Click for full-size image (2.8MB)"></a> </blockquote> <p> It's a small village in eastern Lower Saxony. The "oe" in this case turns out to be the Lower Saxon "lengthening e" (Dehnungs-e), which makes the previous vowel long (used in a number of Lower Saxon place names such as Soest and Itzehoe), not the "e" that indicates umlaut of the preceding vowel. Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee (Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus Werner Lemberg who serves as Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer) for their relentless pursuit of the facts in this case. Conclusion: the accent almost certainly does not belong on this (or any other native German) word, but neither can it be dismissed as dirt on the page. To add to the mystery, it has been reported that other copies of the same edition of the PLZB do not show the accent! 
UPDATE (March 2006): David Krings was intrigued enough by this report to contact the mayor of Ebstorf, of which Oechtringen is a borough, who responded: <p> <blockquote style="font-family:sans-serif;font-size:80%"> Sehr geehrter Mr. Krings,<br> wenn Oechtringen irgendwo mit einem Akzent auf dem O geschrieben wurde, dann kann das nur ein Fehldruck sein. Die offizielle Schreibweise lautet jedenfalls „Oechtringen“.<br> Mit freundlichen Grüssen<br> Der Samtgemeindebürgermeister<br> i.A. Lothar Jessel </blockquote> <p> <li>From Karl Pentzlin (Kochel am See, Bavaria, Germany): "This German phrase is suited for display by a Fraktur (broken letter) font. It contains: all common three-letter ligatures: ffi ffl fft and all two-letter ligatures required by the Duden for Fraktur typesetting: ch ck ff fi fl ft ll ſch ſi ſſ ſt tz (all in a manner such they are not part of a three-letter ligature), one example of f-l where German typesetting rules prohibit ligating (marked by a ZWNJ), and all German letters a...z, ä,ö,ü,ß, ſ [long s] (all in a manner such that they are not part of a two-letter Fraktur ligature)." Otto Stolz notes that "'Schloß' is now spelled 'Schloss', in contrast to 'größer' (example 4) which has kept its 'ß'. Fraktur has been banned from general use, in 1942, and long-s (ſ) has ceased to be used with Antiqua (Roman) even earlier (the latest Antiqua-ſ I have seen is from 1913, but then I am no expert, so there may well be a later instance)." Later Otto confirms the latter theory, "Now I've run across a book “Deutsche Rechtschreibung” (edited by Lutz Mackensen) from 1954 (my reprint is from 1956) that has kept the Antiqua-ſ in its dictionary part (but neither in the preface nor in the appendix)." <p> <li>Diaeresis is not used in Iberian Portuguese. <p> <li>From Yurio Miyazawa: "This poetry contains all the sounds in the Japanese language and used to be the first thing for children to learn in their Japanese class. 
The Hiragana version is particularly neat because it covers every character in the phonetic Hiragana character set." Yurio also sent the Kanji version: <p> <blockquote> 色は匂へど 散りぬるを<br> 我が世誰ぞ 常ならむ<br> 有為の奥山 今日越えて<br> 浅き夢見じ 酔ひもせず </blockquote> <li>Finnish pangrams from Mikko Ristilä. </ol> <p> <b>Accented Cyrillic:</b> <p> <i>(This section contributed by Vladimir Marinov.)</i> <p> In Bulgarian it is desirable, customary, or in some cases required to write accents over vowels. Unfortunately, no computer character sets contain the full repertoire of accented Cyrillic letters. With Unicode, however, it is possible to combine any Cyrillic letter with any combining accent. The appearance of the result depends on the font and the rendering engine. Here are two examples. <p> <ol> <li>Той видя бялата коса́ по главата и́ и ко́са на рамото и́, и ре́че да и́ рече́: "Пара́та по́ па́ри от па́рата, не ща пари́!", но си поми́сли: "Хей, помисли́ си! А́ и́ река, а́ е скочила в тази река, която щеше да тече́, а не те́че." <p> <li>По пъ́тя пъту́ват кю́рди и югославя́ни. 
</ol> <h3><a name="html">HTML Features</a></h3> Here is the Russian alphabet (uppercase only) coded in three different ways, which should look identical: <p> <ol> <li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ <i>(Literal UTF-8)</i> <li>&#1040;&#1041;&#1042;&#1043;&#1044;&#1045;&#1046;&#1047;&#1048;&#1049;&#1050;&#1051;&#1052;&#1053;&#1054;&#1055;&#1056;&#1057;&#1058;&#1059;&#1060;&#1061;&#1062;&#1063;&#1064;&#1065;&#1066;&#1067;&#1068;&#1069;&#1070;&#1071; <i>(Decimal numeric character reference)</i> <li>&#x410;&#x411;&#x412;&#x413;&#x414;&#x415;&#x416;&#x417;&#x418;&#x419;&#x41A;&#x41B;&#x41C;&#x41D;&#x41E;&#x41F;&#x420;&#x421;&#x422;&#x423;&#x424;&#x425;&#x426;&#x427;&#x428;&#x429;&#x42A;&#x42B;&#x42C;&#x42D;&#x42E;&#x42F; <i>(Hexadecimal numeric character reference)</i> </ol> <p> In another test, we use HTML language tags to distinguish Bulgarian, Russian, and <a href="http://www.tiro.com/transfer/Serbian_Rendering.pdf">Serbian</a>, which have different italic forms for lowercase б, г, д, п, and/or т: <p> <blockquote> <table> <tr> <td><b>Bulgarian</b>: <td><span lang=BG>[ бгдпт</span> ] <td><span lang=BG>[ <i>бгдпт</i></span> ] <td><span lang=BG><i> Мога да ям стъкло и не ме боли.</i></span> <tr> <td><b>Russian</b>: <td><span lang=RU>[ бгдпт</span> ] <td><span lang=RU>[ <i>бгдпт</i></span> ] <td><span lang=RU><i>Я могу есть стекло, это мне не вредит.</i></span> <tr> <td><b>Serbian</b>: <td><span lang=SR>[ бгдпт</span> ] <td><span lang=SR>[ <i>бгдпт</i></span> ] <td> <span lang=SR><i>Могу јести стакло а да ми не шкоди.</i></span> </table> </blockquote> <p> <!-- acknowledgments --> <h3><a name="credits">Credits, Tools, and Commentary</a></h3> <dl> <dt><b>Credits:</b></dt> <dd> The "I can eat glass" phrase and the initial collection of translations: <a href="http://hcs.harvard.edu/~igp/glass.html">Ethan Mollick</a>. Transcription / conversion to UTF-8: Frank da Cruz. <b>Albanian:</b> Sindi Keesan. <b>Afrikaans:</b> Johan Fourie, Kevin Poalses. <b>Anglo Saxon:</b> Frank da Cruz. <b>Arabic:</b> Najib Tounsi. <b>Armenian:</b> Vaçe Kundakçı. <b>Belarusian:</b> Alexey Chernyak, Patricia Clausnitzer. <b>Bengali:</b> Somnath Purkayastha, Deepayan Sarkar. <b>Bislama:</b> Dan McGarry. <b>Bosnian:</b> Dmitrij D. Czarkoff. <b>Braille:</b> Frank da Cruz. <b>Bulgarian:</b> Sindi Keesan, Guentcho Skordev, Vladimir Marinov. 
<b>Burmese:</b> "cetanapa", Sithu Thwin. <b>Cabo Verde Creole:</b> Cláudio Alexandre Duarte. <b>Catalán:</b> Jordi Bancells. <b>Chinese:</b> Jack Soo, Wong Pui Lam. <b>Chinook Jargon:</b> David Robertson. <b>Cornish:</b> Chris Stephens. <b>Croatian:</b> Dmitrij D. Czarkoff, Marjan Baće. <b>Czech:</b> Stanislav Pecha, Radovan Garabík. <b>Danish:</b> Morten Due Jorgensen. <b>Dutch:</b> Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit. <b>Erzian:</b> Jack Rueter. <b>Esperanto:</b> Franko Luin, Radovan Garabík. <b>Estonian:</b> Meelis Roos. <b>Faroese:</b> Jón Gaasedal. <b>Farsi/Persian:</b> Payam Elahi. <b>Fijian:</b> Paul Cannon. <b>Finnish:</b> Sampsa Toivanen, Mikko Ristilä. <b>French:</b> Luc Carissimo, Anne Colin du Terrail, Sean M. Burke, Theo Morelli. <b>Galician:</b> Laura Probaos. <b>Georgian:</b> Giorgi Lebanidze. <b>German:</b> Christoph Päper, Otto Stolz, Karl Pentzlin, David Krings, Frank da Cruz, Peter Keel (Seegras), Elias Glantschnig. <b>Gothic:</b> Aurélien Coudurier. <b>Greek:</b> Ariel Glenn, Constantine Stathopoulos, Siva Nataraja, Christos Georgiou. <b>Hebrew:</b> Jonathan Rosenne, Tal Barnea. <b>Hausa:</b> Malami Buba, Tom Gewecke. <b>Hawaiian:</b> na Hauʻoli Motta, Anela de Rego, Kaliko Trapp. <b>Hindi:</b> Shirish Kalele, Nitin Dahra. <b>Hungarian:</b> András Rácz, Mark Holczhammer. <b>Icelandic:</b> Andrés Magnússon, Sveinn Baldursson. <b>International Phonetic Alphabet (IPA):</b> Siva Nataraja / Vincent Ramos. <b>Inuktitut</b>: Louise Hope. <b>Irish:</b> Michael Everson, Marion Gunn, James Kass, Curtis Clark. <b>Italian:</b> Thomas De Bellis. <b>Jamaican:</b> Stephen J. Cherin. <b>Japanese:</b> Makoto Takahashi, Yurio Miyazawa. <b>Kannada:</b> Sridhar R N, Alok G. Singh. <b>Karelian:</b> Aleksandr Semakov. <b>Khmer:</b> Tola Sann. <b>Kirchröadsj:</b> Roger Stoffers. <b>Kreyòl:</b> Sean M. Burke. <b>Korean:</b> Jungshik Shin. <b>Langenfelder Platt:</b> David Krings. <b>Lao:</b> Tola Sann. <b>Lëtzebuergescht:</b> Stefaan Eeckels. 
<b>Lingala:</b> <a href="http://home.sus.mcgill.ca/~moyogo">Denis Moyogo Jacquerye</a> (<a href="http://info-langues-congo.1sd.org/">Nkóta ya Kɔ́ngɔ míbalé </a>) (Nkóta ya Kɔ́ngɔ míbal). <b>Lithuanian:</b> Gediminas Grigas. <b>Lojban:</b> Edward Cherlin. <b>Lusatian:</b> Ronald Schaffhirt. <b>Macedonian:</b> Sindi Keesan. <b>Malay:</b> Zarina Mustapha. <b>Malayalam:</b> Anil Matthews. <b>Maltese:</b> Kenneth Joseph Vella. <b>Manx:</b> Éanna Ó Brádaigh. <b>Marathi:</b> Shirish Kalele. <b>Marquesan:</b> Kaliko Trapp. <b>Middle English:</b> Frank da Cruz. <b>Milanese:</b> Marco Cimarosti. <b>Mongolian:</b> Tom Gewecke. <b>Montenegrin:</b> Dmitrij D. Czarkoff. <b>Napoletano:</b> Diego Quintano. <b>Navajo:</b> Tom Gewecke. <a href="http://www.langmaker.com/db/mdl_nordicg.htm"><b>Nórdicg</b></a>: Yẃlyan Rott. <b>Nepali:</b> Ujjwol Lamichhane, Rabi Tripathi. <b>Norwegian:</b> Herman Ranes, Håvard Kvålen. <b>Odenwälderisch:</b> Alexander Heß. <b>Old Irish:</b> Michael Everson. <b>Old Norse:</b> Andrés Magnússon. <b>Papiamentu:</b> Bianca and Denise Zanardi. <b>Pashto:</b> N.R. Liwal. <b>Pfälzisch:</b> Dr. Johannes Sander. <b>Picard:</b> Philippe Mennecier. <b>Polish:</b> Juliusz Chroboczek, Paweł Przeradowski, Wlodzislaw Kostecki. <b>Portuguese:</b> "Cláudio" Alexandre Duarte, Bianca and Denise Zanardi, Pedro Palhoto Matos, Wagner Amaral. <b>Québécois:</b> Laurent Detillieux. <b>Roman:</b> Pierpaolo Bernardi. <b>Romanian:</b> Juliusz Chroboczek, Ionel Mugurel. <b>Romansch:</b> Alexandre Suter. <b>Ruhrdeutsch:</b> "Timwi". <b>Russian:</b> Alexey Chernyak, Serge Nesterovitch. <b>Sami:</b> Anne Colin du Terrail, Luc Carissimo. <b>Sanskrit:</b> Siva Nataraja / Vincent Ramos. <b>Sächsisch:</b> André Müller. <b>Schwäbisch:</b> Otto Stolz. <b>Scots:</b> Jonathan Riddell. <b>Serbian:</b> Dmitrij D. Czarkoff, Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba, O. Dag. <b>Sinhalese:</b> Abdul-Ahad (ASM). <b>Slovak:</b> G. Adam Stanislav, Radovan Garabík. 
<b>Slovenian:</b> Albert Kolar. <b>Spanish:</b> <a href="http://www.aleida.net">Aleida Morel</a>, Laura Probaos. <b>Swahili:</b> Ronald Schaffhirt. <b>Swedish:</b> Christian Rose, Bengt Larsson. <b>Taiwanese:</b> Henry H. Tan-Tenn. <b>Tagalog:</b> Jim Soliven. <b>Tamil:</b> Vasee Vaseeharan, Vetrivel P. <b>Telugu:</b> Arjuna Rao Chavala. <b>Tibetan:</b> D. Germano, Tom Gewecke. <b>Thai:</b> Alan Wood's wife. <b>Turkish:</b> Vaçe Kundakçı, Tom Gewecke, Merlign Olnon. <b>Ukrainian:</b> Michael Zajac, Oleg Podsadny. <b>Ulster Gaelic:</b> Ciarán Ó Duibhín. <b>Urdu:</b> Mustafa Ali. <a href="http://nomfoundation.org/"><b>Vietnamese</b></a>: Dixon Au, [James] Đỗ Bá Phước <font face="PMingLiU">杜 伯 福</font>. <b>Walloon:</b> Pablo Saratxaga. <b>Welsh:</b> Geiriadur Prifysgol Cymru (Andrew). <b>Yiddish:</b> Mark David. <b>Zeneise:</b> Angelo Pavese. <p> <dt><b>Tools Used to Create This Web Page:</b></dt> <dd>The UTF8-aware <a href="k95.html">Kermit 95</a> terminal emulator on Windows, to a Unix host with the <a href="http://www.gnu.org/directory/emacs.html">EMACS</a> text editor. Kermit 95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP characters as 4 hex digits, as shown <a href="glass.html">HERE</a>. Hex codes for Unicode values can be found in <a href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode Standard</a> (recommended) and the <a href="http://www.unicode.org/charts/">online code charts</a>. When submissions arrive by email encoded in some other character set (Latin-1, Latin-2, KOI, various PC code pages, JEUC, etc), I use the TRANSLATE command of <a href="ckermit.html">C-Kermit</a> on the Unix host (<a href="safe.html">where I read my mail</a>) to convert the character set to UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE command). That's it -- no "Web authoring" tools, no locales, no "smart" anything. It's just plain text, nothing more. 
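<p> For readers without Kermit at hand, the same kind of character-set conversion can be sketched in Python. This is only an illustration of the idea, not Kermit's TRANSLATE command: the function name <tt>to_utf8</tt> and the codec names are assumptions of this sketch, chosen from Python's standard codecs.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Re-encode text from a legacy character set as UTF-8.

    source_charset is a Python codec name (an assumption of this sketch),
    for example "latin-1", "iso8859_2", "koi8_r", "cp437", or "euc_jp".
    """
    # Decoding maps the legacy bytes to Unicode code points. For charsets
    # that have invalid byte sequences this raises UnicodeDecodeError.
    text = raw.decode(source_charset)
    # Encoding then writes each code point in its UTF-8 form.
    return text.encode("utf-8")


if __name__ == "__main__":
    # Latin-1 byte 0xE9 is "é"; UTF-8 writes it as the two bytes C3 A9.
    print(to_utf8(b"\xe9", "latin-1"))
```

The source character set has to be known in advance; decoding with the wrong one usually succeeds silently and yields the wrong letters. <p>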
By the way, there's nothing special about EMACS -- any text editor will do, providing it allows entry of arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1 actually supports UTF-8; earlier versions don't know about it and display the octal codes; either way is OK for this purpose. <p> <dt><b>Commentary:</b> <dd>Date: Wed, 27 Feb 2002 13:21:59 +0100<br> From: "Bruno DEDOMINICIS" <tt>&lt;b.dedominicis@cite-sciences.fr&gt;</tt><br> Subject: Je peux manger du verre, cela ne me fait pas mal. <p> I just found out your website and it makes me feel like proposing an interpretation of the choice of this peculiar phrase. <p> Glass is transparent and can hurt as everyone knows. The relation between people and civilisations is sometimes effusional and more often rude. The concept of breaking frontiers through globalization, in a way, is also an attempt to deny any difference. Isn't "transparency" the flag of modernity? Nothing should be hidden any more, authority is obsolete, and the new powers are supposed to reign through loving and smiling and no more through coercion... <p> Eating glass without pain sounds like a very nice metaphor of this attempt. That is, frontiers should become glass transparent first, and be denied by incorporating them. On the reverse, it shows that through globalization, frontiers undergo a process of displacement, that is, when they are not any more speakable, they become repressed from the speech and are therefore incorporated and might become painful symptoms, as for example what happens when one tries to eat glass. <p> The frontiers that used to separate bodies one from another tend to divide bodies from within and make them suffer.... The chosen phrase then appears as a denial of the symptom that might result from the destitution of traditional frontiers. 
<p> Best,<br> Bruno De Dominicis, Paris, France </dl> <p> <b>Other Unicode pages onsite:</b> <ul> <li><a href="postal.html">Frank's Compulsive Guide to Postal Addresses</a> (especially the <a href="postal.html#index">Index</a>) <li><a href="http://www.columbia.edu/~fdc/pace/">Peace in All Languages</a> <li><a href="sshclient-be.html">Kermit 95 кліента SSH</a> (Kermit 95 SSH Client documentation in Belarusian) <li><a href="st-erkenwald.html">Representing Middle English on the Web with UTF-8</a> <li><a href="biblio.html">The Kermit Bibliography</a> (in UTF-8) <li><a href="accents.html">Interchange of Non-English Computer Text</a> (UTF-8 math and box-drawing) <li><a href="utf8-t1.html">Unicode Table</a> (in UTF-8) </ul> <p> <b>Unicode samplers and resources offsite:</b> <ul> <li><a href="http://rishida.net/scripts/uniview/conversion">Unicode Code Converter</a> (converts among different Unicode encoding forms and notations). <li><a href="http://unicode.org/cldr/utility/confusables.jsp?a=paypal&n=on&x=on">Confusables</a> (every silver lining has a cloud). <li><a href="http://www.seigniorage.de/">Seigniorage</a> (Central Banks worldwide). 
<li>Michael Everson's <a href="http://www.evertype.com/scriptbib.html">Bibliography of Typography and Scripts</a> <li><a href="http://www.code2000.net/englishtestutf.htm">Does your browser support Unicode English?</a> (James Kass) <li><a href="http://crism.maden.org/dunno.html">I don't know, I only work here</a> <li><a href="http://www.trigeminal.com/samples/provincial.html">Anyone can be provincial!</a> <!-- defunct <li><a href="http://www.macchiato.com/unicode/Unicode_transcriptions.html">Transcriptions of "Unicode"</a> --> <li><a href="http://www.i18nguy.com/unicode-example.html">Example Unicode Usage for Business Applications</a> <li><a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps">UTF-8 and Unicode FAQ for Unix/Linux</a> </ul> <p> <b>Unicode fonts:</b> <ul> <li><a href="http://www.code2000.net/">Code 2000</a> (James Kass) <li><a href="http://www.alanwood.net/unicode/fonts.html">Unicode Fonts for Windows Computers</a> (Alan Wood) <li><a href="http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html">Unicode Fonts and Tools for X11</a> (Markus Kuhn) <li><a href="http://www.evertype.com/emono/">Everson Mono</a> (Michael Everson) <li><a href="http://www.monotype.com">Agfa Monotype</a> (now fonts.com) </ul> <p> [ <a href="k95.html">Kermit 95</a> ] [ <a href="glass.html">K95 Screen Shots</a> ] [ <a href="ckermit.html">C-Kermit</a> ] [ <a href="index.html">Kermit Home</a> ] [ <a href="http://www.unicode.org/help/display_problems.html">Display Problems?</a> ] [ <a href="http://www.unicode.org">The Unicode Consortium</a> ] <hr> <ADDRESS> UTF-8 Sampler / <a href="index.html">The Kermit Project</a> / <a href="http://www.columbia.edu">Columbia University</a> / <a href="mailto:kermit@kermitproject.org">kermit@kermitproject.org</a> </ADDRESS> </body> </html>