< > BotCompany Repo | #3000417 // Contents of kermitproject.org/utf8.html


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>UTF-8 Sampler</title>

<META http-equiv="Content-Style-Type" content="text/css">
<META name="viewport" content="width=device-width, initial-scale=1.0">
<LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css">
<LINK REL="shortcut icon" href="/favicon.ico" >
<LINK REL="icon" href="/favicon.ico" type="image/x-icon">
<LINK REL="icon" type="image/ico" href="/favicon.ico"> 
<style type="text/css">
  blockquote { margin-left:8px; margin-right:8px; font-size:90% }
  body { font-size:15px;
         font-family:calibri, arial narrow, arial, sans-serif, times;
         color:black;
         background:white;
         margin:16px;    
  }
  tt { font-size:94% }
</style>
</head>

<body>
26  
27  
<h1><tt>UTF-8 SAMPLER</tt></h1>
28  
29  
<big><big>&nbsp;&nbsp;¥&nbsp;·&nbsp;£&nbsp;·&nbsp;€&nbsp;·&nbsp;$&nbsp;·&nbsp;¢&nbsp;·&nbsp;₡&nbsp;·&nbsp;₢&nbsp;·&nbsp;₣&nbsp;·&nbsp;₤&nbsp;·&nbsp;₥&nbsp;·&nbsp;₦&nbsp;·&nbsp;₧&nbsp;·&nbsp;₨&nbsp;·&nbsp;₩&nbsp;·&nbsp;₪&nbsp;·&nbsp;₫&nbsp;·&nbsp;₭&nbsp;·&nbsp;₮&nbsp;·&nbsp;₯&nbsp;·&nbsp;&#8377;</big></big>
30  
31  
32  
33  
<p>
34  
<blockquote>
35  
Frank da Cruz<br>
36  
<a href="index.html">The Kermit Project</a><br>
37  
New York City<br>
38  
<a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a>
39  
40  
<p>
41  
<i>Last update:</i>
42  
Thu Sep 15 14:00:00 2016
43  
</blockquote>
44  
<p>
45  
<hr>
46  
[&nbsp;<a href="http://www.columbia.edu/~fdc/pace/">PEACE</a>&nbsp;]
47  
[&nbsp;<a href="#poetry">Poetry</a>&nbsp;]
48  
[&nbsp;<a href="#glass">I Can Eat Glass</a>&nbsp;]
49  
[&nbsp;<a href="#quickbrownfox">Pangrams</a>&nbsp;]
50  
[&nbsp;<a href="#html">HTML Features</a>&nbsp;]
51  
[&nbsp;<a href="#credits">Credits, Tools, Commentary</a>&nbsp;]
52  
<p>
53  
54  
<big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for
55  
<a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set
56  
(UCS).  The UCS encodes most of the world's writing systems in a single
57  
character set, allowing you to mix languages and scripts within a document
58  
without needing any tricks for switching character sets.  This web page is
59  
encoded directly in UTF-8.
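
<p>
A minimal sketch of what "ASCII-preserving" means, written in Python rather
than taken from this page (the helper name <tt>utf8_lengths</tt> is made up
for illustration): pure ASCII text encodes to the same bytes in ASCII and in
UTF-8, while the other currency symbols in the subtitle take two or three
bytes each.

```python
# Sketch only: utf8_lengths is a hypothetical helper, not part of any library.

def utf8_lengths(text):
    """Return (character, UTF-8 byte count) pairs for each character."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]

ascii_text = "UTF-8 SAMPLER"
# Pure ASCII text produces identical bytes under ASCII and UTF-8.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Dollar (U+0024), yen (U+00A5), euro (U+20AC), Indian rupee (U+20B9).
lengths = utf8_lengths("\u0024\u00a5\u20ac\u20b9")

print(same_bytes)
for ch, n in lengths:
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```

The dollar sign stays a single byte because it is ASCII; only the non-ASCII
symbols grow.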
60  
61  
<p>
62  
63  
As shown <a href="glass.html">HERE</a>,
64  
Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation
65  
software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista,
66  
or Windows 7/8/10 when using a monospace Unicode font like <a
67  
href="http://www.monotype.com">Andale Mono WT J</a> or <a
68  
href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less
fully populated Courier New, Lucida Console, or Andale Mono.  <a
70  
href="ckermit.html">C-Kermit</a> can handle it too,
71  
<a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode
72  
display</a>.  As many languages as are representable in your font can be seen
73  
on the screen at the same time.
74  
75  
<p>
76  
77  
This, however, is a Web page, which started out as a kind of stress test for
78  
UTF-8 support in Web browsers, which was spotty when this page was first
79  
created in the 1990s but which has become standard in all modern browsers.
80  
The problem now is mainly the fonts and the browser's (or font's) support
for less common blocks such as <a href="#braille">Braille</a> and for the
nonzero Unicode planes (as in, e.g., the <a href="#gothic">Gothic</a> example
below), and to some extent the rendition of combining sequences,
84  
right-to-left rendition (<a href="#arabic">Arabic</a>,
85  
<a href="#hebrew">Hebrew</a>), and so
86  
on.  <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for
87  
a survey of Unicode fonts for Windows.
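
<p>
To make the terms "plane" and "combining sequence" concrete, here is a small
Python sketch (an illustration only, not something this page depends on; the
helper <tt>plane</tt> is a made-up name).  It compares the euro sign with the
first Gothic letter of the sample further down, and inspects the sequence
"e" plus U+0329 that the Yoruba note describes.

```python
import unicodedata

def plane(ch):
    """Unicode plane of one character; Plane 0 is the Basic Multilingual Plane."""
    return ord(ch) // 0x10000

euro = "\u20ac"            # EURO SIGN, Plane 0
gothic_m = "\U0001033c"    # GOTHIC LETTER MANNA, Plane 1
planes = (plane(euro), plane(gothic_m))
utf8_sizes = (len(euro.encode("utf-8")), len(gothic_m.encode("utf-8")))

# A combining sequence: base letter "e" plus U+0329 COMBINING VERTICAL LINE BELOW.
# No precomposed character exists for it, so NFC normalization keeps two code points.
sequence = "e\u0329"
nfc_length = len(unicodedata.normalize("NFC", sequence))
mark_class = unicodedata.combining(sequence[1])  # nonzero for combining marks

print(planes, utf8_sizes, nfc_length, mark_class)
```

Characters outside Plane 0 need four bytes in UTF-8, and a combining sequence
stays two code points that the font and browser must draw as one glyph, which
is why both are harder to render than ordinary accented letters.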
88  
89  
<p>
90  
91  
The subtitle above shows currency symbols of many lands.  If they don't
92  
appear as blobs, we're off to a good start!  (The one on the end is the
93  
<a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee
94  
sign</a> which won't show up in fonts for a while.)
95  
96  
<h3><a name="poetry">Poetry</a></h3>
97  
98  
From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version):
99  
<p><blockquote>
100  
  ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br>
101  
  ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br>
102  
  ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br>
103  
</blockquote>
104  
<p>
105  
106  
From Laȝamon's<i> <a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i>
107  
(<i>The Chronicles of England</i>, Middle English, West Midlands):
108  
<p>
109  
<blockquote>
110  
An preost wes on leoden, Laȝamon was ihoten<br>
111  
He wes Leovenaðes sone -- liðe him be Drihten.<br>
112  
He wonede at Ernleȝe at æðelen are chirechen,<br>
113  
Uppen Sevarne staþe, sel þar him þuhte,<br>
114  
Onfest Radestone, þer he bock radde.
115  
</blockquote>
116  
<p>
117  
118  
(The third letter in the author's name is Yogh, missing from many fonts;
119  
<a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample
120  
with some explanation of letters and encoding).
121  
122  
<p>
123  
124  
From the <cite>Tagelied</cite> of 
125  
126  
<a href="http://gutenberg.spiegel.de/autoren/eschenba.htm">
127  
<b>Wolfram von Eschenbach</b></a> (Middle High German):
128  
<p><blockquote>
129  
Sîne klâwen durh die wolken sint geslagen,<br>
130  
er stîget ûf mit grôzer kraft,<br>
131  
ich sih in grâwen tägelîch als er wil tagen,<br>
132  
den tac, der im geselleschaft<br>
133  
erwenden wil, dem werden man,<br>
134  
den ich mit sorgen în verliez.<br>
135  
ich bringe in hinnen, ob ich kan.<br>
136  
sîn vil manegiu tugent michz leisten hiez.<br>
137  
</blockquote><p>
138  
139  
Some lines of 
140  
<a href="http://users.hol.gr/~artemis/odysseas_elytis.htm">
141  
<b>Odysseus Elytis</b></a> (Greek):
142  
143  
<blockquote>
144  
<table cellspacing=0 cellpadding=0>
145  
<tr>
146  
<td valign="top" style="padding-right:16">
147  
Monotonic:
148  
<p>
149  
Τη γλώσσα μου έδωσαν ελληνική<br>
150  
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br>
151  
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br>
152  
<p>
153  
από το Άξιον Εστί<br>
154  
του Οδυσσέα Ελύτη
155  
156  
<td valign="top">
157  
Polytonic:
158  
<p>
159  
Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ<br/>
160  
τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/>
161  
Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/>
162  
<p>
163  
ἀπὸ τὸ Ἄξιον ἐστί<br/>
164  
τοῦ Ὀδυσσέα Ἐλύτη<br/>
165  
166  
167  
168  
169  
170  
171  
172  
</table>
173  
</blockquote>
174  
175  
<p>
176  
177  
The first stanza of 
178  
<a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br>
179  
<p><blockquote>
180  
На берегу пустынных волн<br>
181  
Стоял он, дум великих полн,<br>
182  
И вдаль глядел.  Пред ним широко<br>
183  
Река неслася; бедный чёлн<br>
184  
По ней стремился одиноко.<br>
185  
По мшистым, топким берегам<br>
186  
Чернели избы здесь и там,<br>
187  
Приют убогого чухонца;<br>
188  
И лес, неведомый лучам<br>
189  
В тумане спрятанного солнца,<br>
190  
Кругом шумел.<br>
191  
</blockquote><p>
192  
193  
<a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Šota Rustaveli</b></a>'s Veṗxis Ṭq̇aosani,
<cite>The Knight in the Tiger's Skin</cite> (Georgian):<p>
195  
<blockquote>
196  
ვეპხის ტყაოსანი
197  
შოთა რუსთაველი
198  
<p>
199  
ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა,
200  
ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა;
201  
მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა,
202  
დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.
203  
</blockquote>
204  
<p>
205  
206  
Tamil poetry of Subramaniya Bharathiyar:
207  
208  
சுப்ரமணிய பாரதியார் (1882-1921):
209  
210  
<p>
211  
<blockquote>
212  
213  
யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம், <br>
214  
பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு, <br>
215  
நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!<br>
216  
தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும்.
217  
218  
</blockquote>
219  
<p>
220  
Kannada poetry by Kuvempu &mdash; ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು
221  
222  
<p>
223  
<blockquote>
224  
225  
226  
ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು ಇಂದೆನ್ನ ಹೃದಯದಲಿ
227  
<br>
228  
229  
ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ
230  
231  
<p>
232  
233  
234  
235  
236  
ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗೀ...
237  
238  
<br>
239  
240  
ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗಿ
241  
242  
<br>
243  
244  
ಭವ ಭವದಿ ಭತಿಸಿಹೇ ಭವತಿ ದೂರ
245  
246  
<br>
247  
248  
ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ || ಬಾ ಇಲ್ಲಿ ||
249  
250  
251  
</blockquote>
252  
253  
<h3><a name="glass">I Can Eat Glass</a></h3>
254  
255  
And from the sublime to the ridiculous, here is a
256  
<a href="#notes">certain phrase&sup1;</a> in an assortment of languages:
257  
258  
<p>
259  
<ol>
260  
<li><b>Sanskrit</b>: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥
261  
262  
<li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām.
263  
<li><b>Classical Greek</b>: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
264  
<li><b>Greek</b> (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
265  
<li><b>Greek</b> (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα.
266  
267  
<br><b>Etruscan</b>: (NEEDED)
268  
<li><b>Latin</b>:  Vitrum edere possum; mihi non nocet.
269  
<li><b>Old French</b>: Je puis mangier del voirre.  Ne me nuit.
270  
<li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal.
271  
<li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas.
272  
<li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal.
273  
<li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må.
274  
<br><b>Champenois</b>: (NEEDED)
275  
<br><b>Lorrain</b>: (NEEDED)
276  
<li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma.
277  
<br><b>Corsican/Corsu</b>: (NEEDED)
278  
<br><b>J&egrave;rriais</b>: (NEEDED)
279  
<li><b>Kreyòl Ayisyen</b> (Hait&iuml;):    Mwen kap manje vè, li pa blese'm.
280  
<li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen.
281  
<li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal.
282  
<li><b>Spanish</b>: Puedo comer vidrio, no me hace daño.
283  
<li><b>Aragon&eacute;s</b>: Puedo minchar beire, no me'n fa mal .
284  
<br><b>Aran&eacute;s</b>: (NEEDED)
285  
<br><b>Mallorquín</b>: (NEEDED)
286  
<li><b>Galician</b>: Eu podo xantar cristais e non cortarme.
287  
<li><b>European Portuguese</b>: Posso comer vidro, não me faz mal.
288  
<li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>):
289  
 Posso comer vidro, não me machuca.
290  
<li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'.
291  
<li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño.
292  
<li><b>Italian</b>:  Posso mangiare il vetro e non mi fa male.
293  
<li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal.
294  
<li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male.
295  
<li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal.
296  
<li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae.
297  
<li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ.
298  
<li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali.
299  
<br><b>Campidanese</b> (Sardinia): (NEEDED)
<br><b>Logudorese</b> (Sardinia): (NEEDED)
301  
<li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai.
302  
<br><b>Romany / Tsigane</b>: (NEEDED)
303  
<li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește.
304  
<li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min.
305  
<br><b>Pictish</b>: (NEEDED)
306  
<br><b>Breton</b>: (NEEDED)
307  
<li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
308  
<li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
309  
<li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee.
310  
<li><b>Old Irish</b> <i>(Ogham):</i> ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
311  
<li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna.
312  
313  
<li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
314  
<li><b>Ulster Gaelic</b>: Ithim-sa gloine agus ní miste damh é.
315  
<li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
316  
<li><b>Anglo-Saxon</b> <i>(Runes):</i>
317  
ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
318  
<li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me.
319  
<li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt.
320  
<li><b>English</b>: I can eat glass and it doesn't hurt me.
321  
<li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation)
322  
<li id="braille"><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
323  
<li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi.
324  
<li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us.
325  
<br><b>Glaswegian</b>: (NEEDED)
326  
<li id="gothic"><b>Gothic</b> (<a href="#notes">5</a>):
327  
𐌼𐌰𐌲
328  
𐌲𐌻𐌴𐍃
329  
𐌹̈𐍄𐌰𐌽,
330  
𐌽𐌹
331  
𐌼𐌹𐍃
332  
𐍅𐌿
333  
𐌽𐌳𐌰𐌽
334  
𐌱𐍂𐌹𐌲𐌲𐌹𐌸.
335  
<li><b>Old Norse</b> <i>(Runes):</i>  ᛖᚴ ᚷᛖᛏ ᛖᛏᛁ
336  
ᚧ ᚷᛚᛖᚱ ᛘᚾ 
337  
ᚦᛖᛋᛋ ᚨᚧ ᚡᛖ
338  
ᚱᚧᚨ ᛋᚨᚱ
339  
340  
<li><b>Old Norse</b> <i>(Latin):</i>  Ek get etið gler án þess að verða sár.
341  
342  
<li><b>Norsk / Norwegian (Nynorsk):</b> Eg kan eta glas utan å skada meg.
343  
<li><b>Norsk / Norwegian (Bokmål):</b> Jeg kan spise glass uten å skade meg.
344  
<li><b>Føroyskt / Faroese</b>: Eg kann eta glas, skaðaleysur.
345  
<!-- <br><b>Føroyskt / Faroese</b>:  Eg kann eta glas, uttan á nakran hátt at meinslast av hesum. -->
346  
<li><b>Íslenska / Icelandic</b>: Ég  get etið gler án þess að meiða mig.
347  
<li><b>Svenska / Swedish</b>: Jag kan äta glas utan att skada mig.
348  
<li><b>Dansk / Danish</b>: Jeg kan spise glas, det gør ikke ondt på mig.
349  
<li><b>S&oslash;nderjysk</b>: Æ ka æe glass uhen at det go mæ naue.
350  
<li><b>Frysk / Frisian</b>: Ik kin glês ite, it docht me net sear.
351  
<!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet mij geen pijn. -->
352  
<!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten zonder dat het
353  
mij 
354  
schaadt. -->
355  
<!-- <li><tt>Dutch: Ik kan glas eten, maar dat doet mij geen kwaad.</tt> -->
356  
<li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet
357  
mij 
358  
geen kwaad.
359  
360  
361  
<LI><B>Kirchröadsj/Bôchesserplat</B>: Iech ken glaas èèse, mer 't deet miech
362  
jing pieng.</LI>
363  
364  
<li><b>Afrikaans</b>: Ek kan glas eet, maar dit doen my nie skade nie.
365  
<li><b>Lëtzebuergescht / Luxemburgish</b>: Ech kan Glas iessen, daat deet mir nët wei.
366  
<li><b>Deutsch / German</b>: Ich kann Glas essen, ohne mir zu schaden.
367  
<li><b>Ruhrdeutsch</b>: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
368  
<li><b>Langenfelder Platt</b>:
369  
Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd.
370  
<li><b>Lausitzer Mundart</b> ("Lusatian"): Ich koann Gloos assn und doas
371  
dudd merr ni wii.
372  
<li><b>Odenwälderisch</b>: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud.
373  
<li><b>Sächsisch / Saxon</b>: 'sch kann Glos essn, ohne dass'sch mer wehtue.
374  
<li><b>Pfälzisch</b>: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
375  
<li><b>Schwäbisch / Swabian</b>: I kå Glas frässa, ond des macht mr nix!
376  
<li><b>Deutsch (Vorarlberg)</b>: I ka glas eassa, ohne dass mar weh tuat.
377  
<li><b>Bayrisch / Bavarian</b>: I koh Glos esa, und es duard ma ned wei.
378  
<li><b>Allemannisch</b>: I kaun Gloos essen, es tuat ma ned weh.
379  
380  
<li><b>Schwyzerdütsch</b> (Zürich): Ich chan Glaas ässe, das schadt mir nöd. 
381  
<li><b>Schwyzerdütsch</b> (Luzern):  Ech cha Glâs ässe, das schadt mer ned. 
382  
383  
<br><b>Plautdietsch</b>: (NEEDED)
384  
<li><b>Hungarian</b>: Meg tudom enni az üveget, nem lesz tőle bajom.
385  
<li><b>Suomi / Finnish</b>: Voin syödä lasia, se ei vahingoita minua.
386  
<li><b>Sami (Northern)</b>: Sáhtán borrat lása, dat ii leat bávččas.
387  
<li><b>Erzian</b>: Мон ярсан 
388  
суликадо, ды 
389  
зыян 
390  
эйстэнзэ а 
391  
ули.
392  
<li><b>Northern Karelian</b>: Mie voin syvvä lasie ta minla ei ole kipie.
393  
<li><b>Southern Karelian</b>: Minä voin syvvä st'oklua dai minule ei ole kibie.
394  
<br><b>Vepsian</b>: (NEEDED)
395  
<br><b>Votian</b>: (NEEDED)
396  
<br><b>Livonian</b>: (NEEDED)
397  
<li><b>Estonian</b>: Ma võin klaasi süüa, see ei tee mulle midagi.
398  
<li><b>Latvian</b>: Es varu ēst stiklu, tas man nekaitē.
399  
<li><b>Lithuanian</b>: Aš galiu valgyti stiklą ir jis manęs nežeidžia
400  
<br><b>Old Prussian</b>: (NEEDED)
401  
<br><b>Sorbian</b> (Wendish): (NEEDED)
402  
<li><b>Czech</b>: Mohu jíst sklo, neublíží mi.
403  
<li><b>Slovak</b>: Môžem jesť sklo. Nezraní ma.
404  
<li><b>Polska / Polish</b>: Mogę jeść szkło i mi nie szkodzi.
405  
<li><b>Slovenian:</b> Lahko jem steklo, ne da bi mi škodovalo.
406  
407  
<!--
408  
<li><b>Croatian</b>: Ja mogu jesti staklo i ne boli me.
409  
Serbian translation is very poor. Infinitive used and sound as: "I can 
410  
eating glass".
411  
<li><b>Serbian</b> <i>(Latin):</i> Mogu jesti staklo a da mi ne škodi.
412  
<li><b>Serbian</b> <i>(Cyrillic):</i> Могу јести стакло 
413  
а
414  
да ми 
415  
не 
416  
шкоди.
417  
<li><b>Serbian</b> <i>(Latin):</i> Ja mogu da jedem staklo.
418  
<li><b>Serbian</b> <i>(Cyrillic)</i>: Ја могу да једем стакло.
419  
<li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета.
420  
-->
421  
<li><b>Bosnian, Croatian, Montenegrin and Serbian</b> <i>(Latin)</i>: Ja mogu jesti staklo, i to mi ne šteti.
422  
423  
<li><b>Bosnian, Montenegrin and Serbian</b> <i>(Cyrillic)</i>:  Ја могу јести стакло, и то ми не штети.
424  
425  
<li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета.
426  
<li><b>Russian</b>: Я могу есть стекло, оно мне не вредит.
427  
<li><b>Belarusian</b> <i>(Cyrillic):</i> Я магу есці шкло, яно мне не шкодзіць.
428  
<li><b>Belarusian</b> <i>(Lacinka):</i> Ja mahu jeści škło, jano mne ne škodzić.
429  
<!--
430  
<li><b>Ukrainian</b>: Я можу їсти шкло, й воно мені не пошкодить.
431  
-->
432  
<li><b>Ukrainian</b>: Я можу їсти скло, і воно мені не зашкодить.
433  
434  
<!-- <li><b>Bulgarian</b>: Мога да ям стъкло и не ме боли. -->
435  
<li><b>Bulgarian</b>: Мога да ям стъкло, то не ми вреди.
436  
437  
<li><b>Georgian</b>: მინას ვჭამ და არა მტკივა.
438  
<li><b>Armenian</b>: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։
439  
<li><b>Albanian</b>: Unë mund të ha qelq dhe nuk më gjen gjë.
440  
<li><b>Turkish</b>: Cam yiyebilirim, bana zararı dokunmaz.
441  
<li><b>Turkish</b> <i>(Ottoman):</i> جام  ييه بلورم  بڭا  ضررى  طوقونمز
442  
<li><b>Bangla / Bengali</b>:
443  
আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না। 
444  
<li><b>Marathi</b>: मी काच खाऊ शकतो, मला ते दुखत नाही.
445  
446  
<!--
447  
<li><b>Hindi</b>: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.
448  
-->
449  
450  
<li><b>Kannada</b>:
451  
452  
453  
ನನಗೆ ಹಾನಿ ಆಗದೆ, ನಾನು ಗಜನ್ನು ತಿನಬಹುದು
454  
455  
456  
<!--
457  
458  
 (ಕನ್ನಡ): ಎಲ್ಲಾದರೂ ಇರು, ಎಂತಾದರು ಇರು, ಎಂದೆಂದಿಗೂ ನೀ ಕನ್ನಡವಾಗಿರು, ಕನ್ನಡವೇ ಸತ್ಯ.. ಕನ್ನಡವೇ ನಿತ್ಯ..
459  
460  
-->
461  
462  
<li><b>Hindi</b>: मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती.
463  
464  
465  
<li><b>Malayalam</b>:
466  
467  
എനിക്ക് ഗ്ലാസ് തിന്നാം. അതെന്നെ വേദനിപ്പിക്കില്ല.
468  
469  
470  
471  
<li><b>Tamil</b>: நான் கண்ணாடி சாப்பிடுவேன், அதனால் எனக்கு ஒரு கேடும் வராது.
472  
473  
474  
<li><b>Telugu</b>: నేను గాజు తినగలను మరియు అలా చేసినా నాకు  ఏమి ఇబ్బంది  లేదు
475  
476  
477  
<li><b>Sinhalese</b>: මට වීදුරු කෑමට හැකියි. එයින් මට කිසි හානියක් සිදු නොවේ.
478  
479  
<li><b>Urdu</b><a href="#notes">(3)</a>: <span dir="RTL" lang=UR>
480  
 میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔</span>
481  
<li><b>Pashto</b><a href="#notes">(3)</a>: زه شيشه خوړلې شم، هغه ما نه خوږوي
482  
<li><b>Farsi / Persian</b><a href="#notes">(3)</a>: .من می توانم بدونِ احساس درد شيشه بخورم
483  
<li id="arabic"><b>Arabic</b><a href="#notes">(3)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span>
484  
485  
<br><B>Aramaic</B>: (NEEDED)
486  
<li><b>Maltese</b>: Nista' niekol il-&#295;&#289;ie&#289; u ma jag&#295;milli xejn.
487  
<li id="hebrew"><B>Hebrew</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN>
488  
<li><B>Yiddish</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN>
489  
<br><b>Judeo-Arabic</b>: (NEEDED)
490  
<br><b>Ladino</b>: (NEEDED)
491  
<br><b>Gǝʼǝz</b>: (NEEDED)
492  
<br><b>Amharic</b>: (NEEDED)
493  
<li><b>Twi</b>: Metumi awe tumpan, ɜnyɜ me hwee.
494  
<li><b>Hausa</b> (<i>Latin</i>): Inā iya taunar gilāshi kuma in gamā lāfiyā.
495  
<li><b>Hausa</b> (<i>Ajami</i>) <a href="#notes">(3)</a>: <SPAN dir=rtl lang=HA>
496  
إِنا إِىَ تَونَر غِلَاشِ كُمَ إِن غَمَا لَافِىَا</SPAN>
497  
<li><b>Yoruba</b><a href="#notes">(4)</a>: Mo lè je̩ dígí, kò ní pa mí lára.
498  
<li><b>Lingala</b>: Nakokí kolíya biténi bya milungi, ekosála ngáí mabé tɛ́.
499  
500  
<!--
501  
<li><b>Lingala</b>: Nakokí kolíya biténi bya milungi, ekosála ngáí mabé tɛ́.
502  
-->
503  
<li><b>(Ki)Swahili</b>: Naweza kula bilauri na sikunyui.
504  
505  
<li><b>Malay</b>: Saya boleh makan kaca dan ia tidak mencederakan saya.
506  
<li><b>Tagalog</b>: Kaya kong kumain nang bubog at hindi ako masaktan.
507  
<li><b>Chamorro</b>: Siña yo' chumocho krestat, ti ha na'lalamen yo'.
508  
<li><b>Fijian</b>: Au rawa ni kana iloilo, ia au sega ni vakacacani kina.
509  
<li><b>Javanese</b>: Aku isa mangan beling tanpa lara.
510  
<li><b>Burmese</b> (Unicode 4.0):
511  
က္ယ္ဝန္‌တော္‌၊က္ယ္ဝန္‌မ မ္ယက္‌စားနုိင္‌သည္‌။ ၎က္ရောင္‌့
512  
ထိခုိက္‌မ္ဟု မရ္ဟိပာ။
513  
(9)
514  
515  
<li><b>Burmese</b> (Unicode 5.0):
516  
ကျွန်တော် ကျွန်မ မှန်စားနိုင်တယ်။ ၎င်းကြောင့် ထိခိုက်မှုမရှိပါ။
517  
(9)
518  
519  
<li><B>Vietnamese (quốc  ngữ)</B>: Tôi có thể ăn thủy tinh mà không hại gì.
520  
<li><B>Vietnamese (nôm)</B> (<a href="#notes">5</a>): 些 𣎏 世 咹 水 晶 𦓡 空 𣎏 害 咦
521  
<li><b>Khmer</b>:
522  
ខ្ញុំអាចញុំកញ្ចក់បាន
523  
ដោយគ្មានបញ្ហារ
524  
525  
526  
<li><b>Lao</b>:
527  
ຂອ້ຍກິນແກ້ວໄດ້ໂດຍທີ່ມັນບໍ່ໄດ້ເຮັດໃຫ້ຂອ້ຍເຈັບ.
528  
529  
530  
531  
<li><b>Thai</b>: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ
532  
<li><b>Mongolian</b> <i>(Cyrillic):</i>   Би шил идэй чадна, надад хортой биш
533  
<li><b>Mongolian</b> <i>(Classic)</i> (<a href="#notes">6</a>):
534  
  ᠪᠢ  ᠰᠢᠯᠢ  ᠢᠳᠡᠶᠦ  ᠴᠢᠳᠠᠨᠠ ᠂ ᠨᠠᠳᠤᠷ  ᠬᠣᠤᠷᠠᠳᠠᠢ  ᠪᠢᠰᠢ
535  
<br><b>Dzongkha</b>: (NEEDED)
536  
<li><b>Nepali</b>: म काँच खान सक्छू र मलाई केहि नी हुन्‍न् ।
537  
538  
<li><b>Tibetan</b>: ཤེལ་སྒོ་ཟ་ནས་ང་ན་གི་མ་རེད།
539  
<li><b>Chinese</b>: <span lang=zh>我能吞下玻璃而不伤身体。</span>
540  
<li><b>Chinese</b> (Traditional):   我能吞下玻璃而不傷身體。
541  
542  
<li><b>Taiwanese</b><a href="#notes">(7)</a>: Góa ē-tàng chia̍h po-lê, mā bē tio̍h-siong.
543  
<li><b>Japanese</b>: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span>
544  
<li><b>Korean</b>:  <span lang=ko>나는 유리를 먹을 수 있어요. 그래도 아프지 않아요</span>
545  
<li><b>Bislama</b>: Mi save kakae glas, hemi no save katem mi.<br>
546  
<li><b>Hawaiian</b>:  Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.<br>
547  
<li><b>Marquesan</b>: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau.
548  
<li><b>Inuktitut</b> (10): ᐊᓕᒍᖅ ᓂᕆᔭᕌᖓᒃᑯ ᓱᕋᙱᑦᑐᓐᓇᖅᑐᖓ
549  
 
550  
<li><b>Chinook Jargon:</b> Naika məkmək kakshət labutay, pi weyk ukuk munk-sik nay.
551  
<li><b>Navajo</b>: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da.
552  
<br><b>Cherokee</b> <i>(and Cree, Chickasaw, Micmac, Ojibwa, Lakota,
553  
N&aacute;huatl, Quechua, Aymara,
554  
and other American languages):</i> (NEEDED)
555  
<br><b>Garifuna</b>: (NEEDED)
556  
<br><b>Gullah</b>: (NEEDED)
557  
<li><b>Lojban</b>: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi
558  
<li><b>Nórdicg</b>: Lj&#339;r ye caudran créneþ ý jor c&#7811;ran.
559  
</ol>
560  
<p>
561  
562  
<i>(Additions, corrections, completions,</i>
563  
<a href="mailto:kermit@kermitproject.org"><i>gratefully accepted</i></a><i>.)</i>
564  
565  
<p>
566  
For testing purposes, some of these are repeated in a <b>monospace font</b>&nbsp;.&nbsp;.&nbsp;.
567  
<p>
568  
<ol>
569  
<li><tt>Euro Symbol: €.</tt>
570  
<li><tt>Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.</tt>
571  
572  
<li><tt>Íslenska / Icelandic: Ég  get etið gler án þess að meiða mig.</tt>
573  
574  
<li><tt>Polish: Mogę jeść szkło, i mi nie szkodzi.</tt>
575  
<li><tt>Romanian: Pot să mănânc sticlă și ea nu mă rănește.</tt>
576  
<li><tt>Ukrainian: Я можу їсти шкло, й воно мені не пошкодить.</tt>
577  
<li><tt>Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։</tt>
578  
<li><tt>Georgian: მინას ვჭამ და არა მტკივა.</tt>
579  
<li><tt>Hindi: मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती.</tt>
580  
<li><tt>Hebrew<a href="#notes">(3)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN></tt>
<li><tt>Yiddish<a href="#notes">(3)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN></tt>
<li><tt>Arabic<a href="#notes">(3)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span></tt>
583  
<li><tt>Japanese: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span></tt>
584  
<li><tt>Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ</tt>
585  
</ol>
586  
<p>
587  
588  
<b><a name="notes">Notes:</a></b>
589  
590  
<p>
591  
<ol>
592  
593  
<li>The "I can eat glass" phrase and initial translations (about 30 of them)
594  
were borrowed from Ethan Mollick's <a
595  
href="http://hcs.harvard.edu/~igp/glass.html">I Can Eat Glass</a> page
596  
(which disappeared on or about June 2004) and converted to UTF-8.  Since
597  
Ethan's original page is gone, I should mention that his purpose was to offer
598  
travelers a phrase they could use in any country that would command a
599  
certain kind of respect, or at least get attention.  See <a
600  
href="#credits">Credits</a> for the many additional contributions since
601  
then.  When submitting new entries, note that the word "hurt" (if you have a choice)
602  
is used in the sense of "cause harm", "do damage", or "bother", rather than
603  
"inflict pain" or "make sad".  In this vein Otto Stolz comments (as do
604  
others further down; personally I think it's better for the purpose of this
605  
page to have extra entries and/or to show a greater repertoire of characters
606  
than it is to enforce a strict interpretation of the word "hurt"!):
607  
608  
<p>
609  
<blockquote>
610  
611  
This is the meaning I have translated to the Swabian dialect.
612  
613  
However, I just have noticed that most of the German variants
614  
translate the "inflict pain" meaning. The German example should 
615  
read:
616  
617  
<p>
618  
<blockquote>
619  
"Ich kann Glas essen ohne mir zu schaden."
620  
</blockquote>
621  
<p>
622  
623  
rather than:
624  
625  
<p>
626  
<blockquote>
627  
"Ich kann Glas essen, ohne mir weh zu tun."
628  
</blockquote>
629  
<p>
630  
631  
(The comma fell victim to the 1996 orthographic reform,
632  
cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>.)
633  
634  
<p>
635  
636  
You may wish to contact the contributors of the following translations
637  
to correct them:
638  
639  
<p>
640  
<ul>
641  
642  
<li> Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei.
643  
<li> Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii.
644  
<li> Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue.
645  
<li> Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei.
646  
<li> Allemannisch: I kaun Gloos essen, es tuat ma ned weh.
647  
<li> Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh.
648  
</ul>
649  
<p>
650  
651  
In contrast, I deem the following translations *alright*:
652  
653  
<p>
654  
<ul>
655  
656  
<li> Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut.
657  
<li> Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud.
658  
<li> Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix!
659  
</ul>
660  
<p>
661  
662  
(However, you could remove the commas, on account of
663  
<a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>
664  
and
665  
666  
<a href="http://www.ids-mannheim.de/reform/e3-1.html#P72"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P72</tt></a>, respectively.)
667  
668  
<p>
669  
670  
I guess, also these examples translate the <i>wrong</i> sense of "hurt",
671  
though I do not know these languages well enough to assert them
672  
definitely:
673  
674  
<p>
675  
<ul>
676  
677  
<li> Nederlands / Dutch: Ik kan glas eten; het doet mij geen
678  
pijn. <i>(This one has been changed)</i>
679  
<li> Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng.
680  
681  
</ul>
682  
<p>
683  
684  
In the Romanic languages, the variations on "fa male" (it) are probably
685  
wrong, whilst the variations on "hace daño" (es) and "damaĝas" (Esperanto) are probably correct; "nocet" (la) is definitely right.
686  
687  
<p>
688  
689  
The northern Germanic variants of "skada" are probably right, as are
690  
the Slavic variants of "škodi/шкоди" (se); however the Slavic variants
691  
of " boli" (hv) are probably wrong, as "bolena" means "pain/ache", IIRC.
692  
693  
</blockquote>
694  
<p>
695  
That was from July 2004.  In December 2007, Otto writes again:
696  
697  
<p>
698  
<blockquote>
699  
<small>
700  
Hello Frank,
701  
702  
in days of yore, I had written:<br>
703  
&gt;     "Ich kann Glas essen ohne mir zu schaden." <br>
704  
&gt; (The comma fell victim to the 1996 orthographic reform,
705  
<p>
706  
cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76">http://www.ids-mannheim.de/reform/e3-1.html#P76</a>. 
707  
<p>
708  
709  
The latest revision (2006) of the official German orthography
710  
has revived the comma around infinitive clauses commencing with
711  
<i>ohne</i>, or 5 other conjunctions, or depending from a noun or
712  
from an announcing demonstrative
713  
(<a href="http://www.ids-mannheim.de/reform/regeln2006.pdf">http://www.ids-mannheim.de/reform/regeln2006.pdf</a>, &sect;75).
714  
So, it's again: <i>Ich kann Glas essen, ohne mir zu schaden.</i>
715  
<p>
716  
Best wishes,<br>
717  
&nbsp;&nbsp;&nbsp;&nbsp; Otto Stolz
718  
</small>
719  
</blockquote>
720  
<p>
721  
722  
<li>The numbering of the samples is arbitrary, done only to keep track of how
many there are, and can change any time a new entry is added.  The
arrangement is also arbitrary but with some attempt to group related
examples together.  Note: All languages not listed are wanted, not just the
ones that say (NEEDED).

<p>

<li><a name="note1">Correct right-to-left display of these languages
depends on the capabilities of your browser.</a>  The period should
appear on the left.  In the monospace Yiddish example, the Yiddish digraphs
should occupy one character cell.

<p>

<li>Yoruba: The third word is Latin letter small 'j' followed by
small 'e' with U+0329, Combining Vertical Line Below.  This displays
correctly only if your Unicode font includes the U+0329 glyph and your
browser supports combining diacritical marks.  The Lingala and Indic examples
also include combining sequences.  

<p>

<li>Includes Unicode 3.1 (or later) characters beyond Plane 0.

<p>

<li>The Classic Mongolian example should be vertical, top-to-bottom and 
left-to-right.  But such display is almost impossible.  Also no font yet
exists which provides the proper ligatures and positional variants for the
characters of this script, which works somewhat like Arabic.

<p>

<li>Taiwanese is also known as Holo or Hoklo, and is related to Southern
Min dialects such as Amoy.
Contributed by Henry H. Tan-Tenn, who comments, "The above is
the romanized version, in a script current among Taiwanese Christians since
the mid-19th century.  It was invented by British missionaries and saw use in
hundreds of published works, mostly of a religious nature.  Most Taiwanese did
not know Chinese characters then, or at least not well enough to read.  More
to the point, though, a written standard using Chinese characters has never
developed, so a significant minority of words are represented with different
candidate characters, depending on one's personal preference or etymological
theory.  In this sentence, for example, "-tàng", "chia̍h",
"mā" and "bē" are problematic using Chinese characters.
"Góa" (I/me) and "po-lê" (glass) are as written in other Sinitic
languages (e.g. Mandarin, Hakka)."
<p>

<li>Wagner Amaral of Pinese &amp; Amaral Associados notes that
the Brazilian Portuguese sentence for
"I can eat glass" should be identical to the Portuguese one, as the word
"machuca" means "inflict pain", or rather "injuries". The words "faz
mal" would more correctly translate as "cause harm".

<p>

<li>Burmese: In English the first-person pronoun "I" stands for both
genders, male and female.  In Burmese (except in the central part of Burma)
it is kyundaw (<font
 size="+1"
 face="Padauk">က္ယ္ဝန္‌တော္‌</font>) for male and kyanma (<font 
 size="+1" face="Padauk">က္ယ္ဝန္‌မ</font>) for female.
This example uses a fully compliant Unicode Burmese font, rendered with the
Graphite engine; sadly, Padauk is the only such Graphite font that exists.
<!--GONE
<a href="http://h1.ripway.com/bamarsar/">CLICK HERE</a> to test Burmese
characters. 
-->
Unicode 4.0 and earlier versions of the standard lacked some medial and vowel
characters; the second example has them.

<p>

<li><i>From Louise Hope, 22 November 2010:</i>&nbsp;
I decided to have a go at an Inuktitut rendering, mainly in hopes of shaming someone who actually knows the language into coming up with something better.
Meanwhile, try this:
<p>
ᐊᓕᒍᖅ ᓂᕆᔭᕌᖓᒃᑯ ᓱᕋᙱᑦᑐᓐᓇᖅᑐᖓ
<br>
aliguq nirijaraangakku suranngittunnaqtunga
<p>
Loosely: I am able not to hurt myself whenever I eat glass.
<p>
aliguq &gt;&gt; glass (uninflected because it is the patient of a transitive verb in an ergative language)<br>
nirijaraangakku &gt;&gt; "I eat him/her/it" in Frequentative mood (all one verb with inflectional ending, no affixes whatsoever)<br>
suranngittunnaqtunga &gt;&gt; suraq (do permanent harm) + nngit (verb-negator) + tunnaq (ability) + tunga (intransitive ending, making the verb passive or reflexive)
<p>
See above about someone who knows the language, et cetera.
<p>
Script trivia: the syllable ᙱ is a single Unicode character
representing the two elements ᓐ (syllable-final n) and ᖏ
(syllable ngi). I think they just did it that way because it looks tidier
than the expected ᓐᖏ. If your operating system didn't come
with <a href="http://www.ffonts.net/Euphemia-UCAS.font">Euphemia</a> (all-purpose UCAS font), you can download <a href="http://www.allaboutshoes.ca/inuk/our-boots/piq_font.php">Pigiarniq</a>. It comes with a jolly little inuksuk ᐀ that the Unicode Consortium is trying to make into a squatter.
<p>

<!--
ᓯᖁᒥᐅᒪᓐᖏᒃᑯᓂ ᓈᒪᖕᓇᓐᖏᔾᔪᒃ
<br>
siqumiumanngikkuni naamangnanngijjuk.
-->

</ol>
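<p>
The Yoruba note and the note on characters beyond Plane 0 both come down to
how text maps to code points and UTF-8 bytes.  The following is only an
illustrative sketch in Python (not one of the tools used to make this page);
the helper names <tt>utf8_hex</tt> and <tt>beyond_plane0</tt> are invented for
the example, and U+10330 is simply one convenient character outside Plane 0.
<pre>
```python
import unicodedata


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


def beyond_plane0(ch):
    """True if the single character ch lies outside Plane 0 (the BMP)."""
    return ord(ch) > 0xFFFF


# Yoruba note: 'e' followed by U+0329 stays two code points even after
# NFC normalization, because Unicode has no precomposed letter for the pair.
e_line_below = "e\u0329"
composed = unicodedata.normalize("NFC", e_line_below)
print(len(composed), utf8_hex(composed))

# A character beyond Plane 0 (U+10330, in the Gothic block) takes four
# bytes in UTF-8.
gothic = "\U00010330"
print(beyond_plane0(gothic), utf8_hex(gothic))
```
</pre>
<p>
Whether such sequences display correctly is still up to the font and the
browser, as the notes say; the code only shows what is stored.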
<h3><a name="quickbrownfox">The Quick Brown Fox... Pangrams</a></h3>

The "I can eat glass" sentences do not necessarily show off the orthography of
each language to best advantage.  In many alphabetic written languages it is
possible to include all (or most) letters (or "special" characters) in
a single (often nonsense) <i>pangram</i>.  These were traditionally used in
typewriter instruction; now they are useful for stress-testing computer fonts
and keyboard input methods.  Here are a few examples (SEND MORE):

<p>
<ol>

<li><b>English:</b> The quick brown fox jumps over the lazy dog.
<li><b>Jamaican:</b> Chruu, a kwik di kwik brong fox a jomp huova di liezi daag de, yu no siit?
<li><b>Irish:</b> "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena ṗóg éada ó
ṡlí do leasa ṫú?"
"D'ḟuascail Íosa Úrṁac na hÓiġe Beannaiṫe pór Éava agus Áḋaiṁ."
<li><b>Dutch:</b> Pa's wijze lynx bezag vroom het fikse aquaduct.
<li><b>German: </b>  Falsches Üben von Xylophonmusik quält jeden
größeren Zwerg.  (1)
<li><b>German: </b> <span lang=de>Im finſteren Jagdſchloß am offenen Felsquellwaſſer patzte der affig-flatterhafte kauzig-höf&zwnj;liche Bäcker über ſeinem verſifften kniffligen C-Xylophon.</span> (2)
<li><b>Norwegian:</b> Blåbærsyltetøy ("blueberry jam", includes every
extra letter used in Norwegian).
<li><b>Swedish:</b>  Flygande bäckasiner söka strax hwila på mjuka tuvor.
<li><b>Icelandic:</b> Sævör grét áðan því úlpan var ónýt.
<li><b>Finnish:</b> (5) Törkylempijävongahdus (This is a perfect pangram, every letter appears only once. Translating it is an art on its own, but I'll say "rude lover's yelp". :-D)
<li><b>Finnish:</b> (5) Albert osti fagotin ja töräytti puhkuvan melodian. (Albert bought a bassoon and hooted an impressive melody.)
<li><b>Finnish:</b> (5) On sangen hauskaa, että polkupyörä on maanteiden jokapäiväinen ilmiö. (It's pleasantly amusing, that the bicycle is an everyday sight on the roads.)
<li><b>Polish:</b> Pchnąć w tę łódź jeża lub osiem skrzyń fig.
<li><b>Czech:</b> Příliš žluťoučký kůň úpěl ďábelské kódy.
<li><b>Slovak:</b> Starý kôň na hŕbe kníh žuje tíško povädnuté
ruže, na stĺpe sa ďateľ učí kvákať novú ódu o živote.
<li><b>Greek</b> (monotonic): ξεσκεπάζω την ψυχοφθόρα βδελυγμία

<li><b>Greek</b> (polytonic):
ξεσκεπάζω τὴν ψυχοφθόρα βδελυγμία

<li><b>Russian:</b>
Съешь же ещё этих мягких французских булок да выпей чаю.

<li><b>Russian:</b>
В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ.

<li><b>Bulgarian:</b> Жълтата дюля беше щастлива, че пухът, който цъфна, замръзна като гьон.

<li><b>Sami (Northern):</b> Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža.
<li><b>Hungarian:</b> Árvíztűrő tükörfúrógép.
<li><b>Spanish:</b> El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro.
<li><b>Portuguese:</b> O próximo vôo à noite sobre o Atlântico, põe freqüentemente o único médico. (3)
<li><b>French:</b> Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être 
déçus en voyant leurs drôles d'œufs abîmés.

<li><b>Esperanto:</b> Eĥoŝanĝo ĉiuĵaŭde.

<li><b>Hebrew:</b> <span dir="RTL" lang=HE>זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.</span>

<li><b>Japanese</b> (Hiragana):<blockquote>
いろはにほへど ちりぬるを<br>
わがよたれぞ つねならむ<br>
うゐのおくやま けふこえて<br>
あさきゆめみじ ゑひもせず
(4)
</blockquote>

</ol>
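<p>
Claims like "includes every letter" or "every letter appears only once" can be
checked mechanically.  The sketch below is a Python illustration only; the
function names are made up, the alphabet must be supplied by the caller, and
text is first normalized to NFC so that a precomposed letter and its combining
equivalent count as the same character.
<pre>
```python
import unicodedata


def missing_letters(sentence, alphabet):
    """Return the letters of alphabet that do not occur in sentence, sorted."""
    present = set(unicodedata.normalize("NFC", sentence).lower())
    wanted = set(unicodedata.normalize("NFC", alphabet).lower())
    return sorted(wanted - present)


def has_no_repeats(word):
    """True if no character of word occurs more than once (case-insensitive).

    This checks only the "appears only once" half of a perfect pangram.
    """
    lowered = unicodedata.normalize("NFC", word).lower()
    return len(set(lowered)) == len(lowered)


ENGLISH = "abcdefghijklmnopqrstuvwxyz"
print(missing_letters("The quick brown fox jumps over the lazy dog.", ENGLISH))

# The first Finnish example, written with escapes for the two umlauted letters.
print(has_no_repeats("T\u00f6rkylempij\u00e4vongahdus"))
```
</pre>
<p>
An empty list from <tt>missing_letters</tt> means the sentence covers the
alphabet it was given; which letters belong in that alphabet is a judgment
call for each language.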
<p id="oechtringen">
<a name="notes2"><b>Notes:</b></a>
<p>
<ol>

<li>Other phrases commonly used in Germany include: "Ein wackerer Bayer
vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im
komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and
eszett.  Previously, going for the shortest sentence that has all the
umlauts and special characters, I had
"Grüße aus Bärenhöfe
(und Óechtringen)!"
Acute accents are not used in native German words, so I was surprised to
discover "Óechtringen" in the Deutsche Bundespost 
Postleitzahlenbuch:
<p>
<blockquote>
<a href="http://www.columbia.edu/~fdc/misc/oechtringen.jpg"><img
 src="oechtringen-sm.jpg" alt="Click for full-size image (2.8MB)"></a>
</blockquote>
<p>
It's a small village in eastern Lower Saxony.
The "oe" in this case
turns out to be the Lower Saxon "lengthening e" (Dehnungs-e), which makes the
previous vowel long (used in a number of Lower Saxon place names such as Soest
and Itzehoe), not the "e" that indicates umlaut of the preceding vowel.
Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee
(Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph P&auml;per, plus
Werner Lemberg who serves as
Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer)

for their relentless pursuit of the facts in this case. Conclusion: the
accent almost certainly does not belong on this (or any other native German)
word, but neither can it be dismissed as dirt on the page.  To add to the
mystery, it has been reported that other copies of the same edition of the
PLZB do not show the accent!  UPDATE (March 2006): David Krings was
intrigued enough by this report to contact the mayor of Ebstorf, of which
Oechtringen is a borough, who responded:

<p>
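<blockquote style="font-family:sans-serif;font-size:80%">
Sehr geehrter Mr. Krings,<br>
wenn Oechtringen irgendwo mit einem Akzent auf dem O geschrieben wurde,
dann kann das nur ein Fehldruck sein. Die offizielle Schreibweise lautet
jedenfalls „Oechtringen“.<br>
Mit freundlichen Grüssen<br>
Der Samtgemeindebürgermeister<br>
i.A. Lothar Jessel
<p>
<i>(In English: Dear Mr. Krings, if Oechtringen was written anywhere with an
accent on the O, that can only be a misprint. In any case, the official
spelling is "Oechtringen". With kind regards, the Mayor of the Samtgemeinde,
signed on his behalf by Lothar Jessel.)</i>

</blockquote>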
<p>
<li>From Karl Pentzlin (Kochel am See, Bavaria, Germany):
"This German phrase is suited for display by a Fraktur (broken letter)
font. It contains: all common three-letter ligatures: ffi ffl fft and all
two-letter ligatures required by the Duden for Fraktur typesetting: ch ck ff
fi fl ft ll ſch ſi ſſ ſt tz (all in a
manner such they are not part of a three-letter ligature), one example of f-l
where German typesetting rules prohibit ligating (marked by a ZWNJ), and all
German letters a...z, ä,ö,ü,ß, ſ [long s]
(all in a manner such that they are not part of a two-letter Fraktur
ligature)."

Otto Stolz notes that "'Schloß' is now spelled 'Schloss', in
contrast to 'größer' (example 4) which has kept its
'ß'.  Fraktur has been banned from general use, in 1942, and long-s
(ſ) has ceased to be used with Antiqua (Roman) even earlier (the
latest Antiqua-ſ I have seen is from 1913, but then
I am no expert, so there may well be a later instance."  Later Otto confirms
the latter theory, "Now I've run across a book “Deutsche
Rechtschreibung” (edited by Lutz Mackensen) from 1954 (my reprint
is from 1956) that has kept the Antiqua-ſ in its dictionary part (but
neither in the preface nor in the appendix)."

<p>

<li>Diaeresis is not used in Iberian Portuguese.

<p>

<li>From Yurio Miyazawa: "This poetry contains all the sounds in the
Japanese language and used to be the first thing for children to learn in
their Japanese class. The Hiragana version is particularly neat because it
covers every character in the phonetic Hiragana character set."  Yurio also
sent the Kanji version:

<p>
<blockquote>
色は匂へど 散りぬるを<br>
我が世誰ぞ 常ならむ<br>
有為の奥山 今日越えて<br>
浅き夢見じ 酔ひもせず
</blockquote>

<li>Finnish pangrams from Mikko Ristilä.

</ol>
<p>
<b>Accented Cyrillic:</b>
<p>

<i>(This section contributed by Vladimir Marinov.)</i>

<p>

In Bulgarian it is desirable, customary, or in some cases required to
write accents over vowels.  Unfortunately, no computer character sets
contain the full repertoire of accented Cyrillic letters.  With Unicode,
however, it is possible to combine any Cyrillic letter with any combining
accent.  The appearance of the result depends on the font and the rendering
engine.  Here are two examples.

<p>
<ol>

<li>Той видя бялата коса́ по главата и́ и ко́са на рамото и́, и ре́че да и́
рече́: "Пара́та по́ па́ри от па́рата, не ща пари́!", но си поми́сли: "Хей,
помисли́ си! А́ и́ река, а́ е скочила в тази река, която щеше да тече́,
а не те́че."
<p>
<li>По пъ́тя пъту́ват кю́рди и югославя́ни.  
</ol>
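<p>
Combining an accent with a Cyrillic letter, as described above, amounts to
placing a combining mark directly after the base letter.  The following
Python sketch is illustrative only: it assumes the accent in question is
U+0301 (Combining Acute Accent), and the helper names <tt>add_stress</tt> and
<tt>remove_stress</tt> are invented for the example.
<pre>
```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # assumption: the stress mark is U+0301


def add_stress(word, index):
    """Insert a combining acute accent after the letter at position index."""
    return word[:index + 1] + COMBINING_ACUTE + word[index + 1:]


def remove_stress(text):
    """Drop combining acute accents and leave every other character alone."""
    return text.replace(COMBINING_ACUTE, "")


plain = "коса"
stressed_last = add_stress(plain, 3)   # accent on the final vowel
stressed_first = add_stress(plain, 1)  # accent on the first vowel

# NFC normalization cannot shorten the accented word: Unicode has no
# precomposed Cyrillic vowel with an acute accent, so the mark stays separate.
print(len(plain), len(unicodedata.normalize("NFC", stressed_last)))
print(remove_stress(stressed_first) == plain)
```
</pre>
<p>
Only the accent itself is removed in <tt>remove_stress</tt>; decomposing the
text first and stripping all marks would also turn й into и.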
<h3><a name="html">HTML Features</a></h3>

Here is the Russian alphabet (uppercase only) coded in three
different ways, which should look identical:

<p>
<ol>
<li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
&nbsp; <i>(Literal UTF-8)</i>
<li>&#1040;&#1041;&#1042;&#1043;&#1044;&#1045;&#1046;&#1047;&#1048;&#1049;&#1050;&#1051;&#1052;&#1053;&#1054;&#1055;&#1056;&#1057;&#1058;&#1059;&#1060;&#1061;&#1062;&#1063;&#1064;&#1065;&#1066;&#1067;&#1068;&#1069;&#1070;&#1071;
&nbsp; <i>(Decimal numeric character reference)</i>
<li>&#x0410;&#x0411;&#x0412;&#x0413;&#x0414;&#x0415;&#x0416;&#x0417;&#x0418;&#x0419;&#x041a;&#x041b;&#x041c;&#x041d;&#x041e;&#x041f;&#x0420;&#x0421;&#x0422;&#x0423;&#x0424;&#x0425;&#x0426;&#x0427;&#x0428;&#x0429;&#x042a;&#x042b;&#x042c;&#x042d;&#x042e;&#x042f;
&nbsp; <i>(Hexadecimal numeric character reference)</i>
</ol>
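<p>
The two numeric forms above can be generated from the literal text.  This is
a small Python sketch, not how the page itself was produced; the function
names are invented, and <tt>html.unescape</tt> from the Python standard
library is used only to confirm that both forms decode to the same string.
<pre>
```python
import html


def decimal_refs(text):
    """Encode every character of text as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def hex_refs(text):
    """Encode every character of text as a 4-digit hexadecimal reference."""
    return "".join(f"&#x{ord(ch):04x};" for ch in text)


# The 32 uppercase letters U+0410 through U+042F, as in the list above.
alphabet = "".join(chr(cp) for cp in range(0x0410, 0x0430))

# Both reference forms decode back to the same literal string.
print(html.unescape(decimal_refs(alphabet)) == alphabet)
print(html.unescape(hex_refs(alphabet)) == alphabet)
```
</pre>
<p>
The literal form only works if the page declares its encoding correctly; the
numeric forms are plain ASCII and survive any encoding of the page.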
<p>

In another test, we use HTML language tags to distinguish Bulgarian, Russian,
and <a href="http://www.tiro.com/transfer/Serbian_Rendering.pdf">Serbian</a>, 
which have different italic forms for lowercase
б, г, д, п, and/or т:
<p>
<blockquote>
<table>
<tr>
<td><b>Bulgarian</b>: &nbsp;
<td><span lang=BG>[&nbsp;бгдпт</span>&nbsp;] &nbsp;
<td><span lang=BG>[&nbsp;<i>бгдпт</i></span>&nbsp;] &nbsp;
<td><span lang=BG><i> Мога да ям стъкло и не ме боли.</i></span>
<tr>
<td><b>Russian</b>:
<td><span lang=RU>[&nbsp;бгдпт</span>&nbsp;] &nbsp;
<td><span lang=RU>[&nbsp;<i>бгдпт</i></span>&nbsp;] &nbsp;
<td><span lang=RU><i>Я могу есть стекло, это мне не вредит.</i></span>
<tr>
<td><b>Serbian</b>:
<td><span lang=SR>[&nbsp;бгдпт</span>&nbsp;] &nbsp;
<td><span lang=SR>[&nbsp;<i>бгдпт</i></span>&nbsp;] &nbsp;
<td> <span lang=SR><i>Могу јести стакло а да ми не шкоди.</i></span>
</table>
</blockquote>
<p>

<!-- acknowledgments -->
<h3><a name="credits">Credits, Tools, and Commentary</a></h3>

<dl>
<dt><b>Credits:</b></dt>
<dd>
The "I can eat glass" phrase and the initial collection of translations:
<a href="http://hcs.harvard.edu/~igp/glass.html">Ethan Mollick</a>.
Transcription / conversion to UTF-8: Frank da&nbsp;Cruz.
<b>Albanian:</b> Sindi Keesan.
<b>Afrikaans:</b> Johan Fourie, Kevin Poalses.
<b>Anglo Saxon:</b> Frank da&nbsp;Cruz.
<b>Arabic:</b> Najib Tounsi.
<b>Armenian:</b> Vaçe Kundakçı.
<b>Belarusian:</b> Alexey Chernyak, Patricia Clausnitzer.
<b>Bengali:</b> Somnath Purkayastha, Deepayan Sarkar.
<b>Bislama:</b> Dan McGarry.
<b>Bosnian:</b> Dmitrij D. Czarkoff.
<b>Braille:</b> Frank da&nbsp;Cruz.
<b>Bulgarian:</b> Sindi Keesan, Guentcho Skordev, Vladimir Marinov.
<b>Burmese:</b> "cetanapa", Sithu Thwin.
<b>Cabo Verde Creole:</b> Cláudio Alexandre Duarte.
<b>Catal&aacute;n:</b>  Jordi Bancells.
<b>Chinese:</b> Jack Soo, Wong Pui Lam.
<b>Chinook Jargon:</b> David Robertson.
<b>Cornish:</b> Chris Stephens.
<b>Croatian:</b> Dmitrij D. Czarkoff, Marjan Baće.
<b>Czech:</b> Stanislav Pecha, Radovan Garabík.
<b>Danish:</b> Morten Due Jorgensen.
<b>Dutch:</b> Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit.
<b>Erzian:</b> Jack Rueter.
<b>Esperanto:</b> Franko Luin, Radovan Garabík.
<b>Estonian:</b> Meelis Roos.
<b>Faroese:</b> J&oacute;n Gaasedal.
<b>Farsi/Persian:</b> Payam Elahi.
<b>Fijian:</b> Paul Cannon.
<b>Finnish:</b> Sampsa Toivanen, Mikko Ristilä.
<b>French:</b> Luc Carissimo, Anne Colin du&nbsp;Terrail, Sean M. Burke, Theo Morelli.
<b>Galician:</b> Laura Probaos.
<b>Georgian:</b> Giorgi Lebanidze.
<b>German:</b> Christoph Päper, Otto Stolz, Karl Pentzlin, David Krings,
Frank da&nbsp;Cruz, Peter Keel (Seegras), Elias Glantschnig.
<b>Gothic:</b> Aur&eacute;lien Coudurier.
<b>Greek:</b> Ariel Glenn, Constantine Stathopoulos, Siva Nataraja, Christos Georgiou.
<b>Hebrew:</b> Jonathan Rosenne, Tal Barnea.
<b>Hausa:</b> Malami Buba, Tom Gewecke.
<b>Hawaiian:</b> na Hauʻoli Motta, Anela de&nbsp;Rego, Kaliko Trapp.
<b>Hindi:</b> Shirish Kalele, Nitin Dahra.
<b>Hungarian:</b> András Rácz, Mark Holczhammer.
<b>Icelandic:</b> Andrés Magnússon, Sveinn Baldursson.
<b>International Phonetic Alphabet (IPA):</b> Siva Nataraja / Vincent Ramos.
<b>Inuktitut</b>: Louise Hope.
<b>Irish:</b> Michael Everson, Marion Gunn, James Kass, Curtis Clark.
<b>Italian:</b> Thomas De Bellis.
<b>Jamaican:</b> Stephen J. Cherin.
<b>Japanese:</b> Makoto Takahashi, Yurio Miyazawa.
<b>Kannada:</b> Sridhar&nbsp;R&nbsp;N, Alok G. Singh.
<b>Karelian:</b> Aleksandr Semakov.
<b>Khmer:</b> Tola Sann.
<b>Kirchröadsj:</b> Roger Stoffers.
<b>Kreyòl:</b> Sean M. Burke.
<b>Korean:</b> Jungshik Shin.
<b>Langenfelder Platt:</b> David Krings.
<b>Lao:</b> Tola Sann.
<b>Lëtzebuergescht:</b> Stefaan Eeckels.
<b>Lingala:</b> <a href="http://home.sus.mcgill.ca/~moyogo">Denis Moyogo Jacquerye</a>
(<a href="http://info-langues-congo.1sd.org/">Nkóta ya Kɔ́ngɔ míbalé</a>).
<b>Lithuanian:</b> Gediminas Grigas.
<b>Lojban:</b> Edward Cherlin.
<b>Lusatian:</b> Ronald Schaffhirt.
<b>Macedonian:</b> Sindi Keesan.
<b>Malay:</b> Zarina Mustapha.
<b>Malayalam:</b> Anil Matthews.
<b>Maltese:</b> Kenneth Joseph Vella.
<b>Manx:</b> &Eacute;anna &Oacute; Br&aacute;daigh.
<b>Marathi:</b> Shirish Kalele.
<b>Marquesan:</b> Kaliko Trapp.
<b>Middle English:</b> Frank da&nbsp;Cruz.
<b>Milanese:</b> Marco Cimarosti.
<b>Mongolian:</b> Tom Gewecke.
<b>Montenegrin:</b> Dmitrij D. Czarkoff.
<b>Napoletano:</b> Diego Quintano.
<b>Navajo:</b> Tom Gewecke.
<a href="http://www.langmaker.com/db/mdl_nordicg.htm"><b>Nórdicg</b></a>:
Y&#7811;lyan Rott.
<b>Nepali:</b> Ujjwol Lamichhane, Rabi Tripathi.
<b>Norwegian:</b> Herman Ranes, Håvard Kvålen.
<b>Odenwälderisch:</b> Alexander He&szlig;.
<b>Old Irish:</b> Michael Everson.
<b>Old Norse:</b> Andrés Magnússon.
<b>Papiamentu:</b> Bianca and Denise Zanardi.
<b>Pashto:</b> N.R. Liwal.
<b>Pfälzisch:</b> Dr. Johannes Sander.
<b>Picard:</b> Philippe Mennecier.
<b>Polish:</b> Juliusz Chroboczek, Paweł Przeradowski, Wlodzislaw Kostecki.
<b>Portuguese:</b> "Cláudio" Alexandre Duarte, Bianca and Denise
Zanardi, Pedro Palhoto Matos, Wagner Amaral.
<b>Québécois:</b> Laurent Detillieux.
<b>Roman:</b> Pierpaolo Bernardi.
<b>Romanian:</b> Juliusz Chroboczek, Ionel Mugurel.
<b>Romansch:</b> Alexandre Suter.
<b>Ruhrdeutsch:</b> "Timwi".
<b>Russian:</b> Alexey Chernyak, Serge Nesterovitch.
<b>Sami:</b> Anne Colin du&nbsp;Terrail, Luc Carissimo.
<b>Sanskrit:</b> Siva Nataraja / Vincent Ramos.
<b>Sächsisch:</b> André Müller.
<b>Schwäbisch:</b> Otto Stolz.
<b>Scots:</b> Jonathan Riddell.
<b>Serbian:</b> Dmitrij D. Czarkoff, Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba,
O.&nbsp;Dag.
<b>Sinhalese:</b> Abdul-Ahad (ASM).
<b>Slovak:</b> G. Adam Stanislav, Radovan Garabík.
<b>Slovenian:</b> Albert Kolar.
<b>Spanish:</b> <a href="http://www.aleida.net">Aleida Morel</a>, Laura Probaos.
<b>Swahili:</b> Ronald Schaffhirt.
<b>Swedish:</b> Christian Rose, Bengt Larsson.
<b>Taiwanese:</b> Henry H. Tan-Tenn.
<b>Tagalog:</b> Jim Soliven.
<b>Tamil:</b> Vasee Vaseeharan, Vetrivel P.
<b>Telugu:</b> Arjuna Rao Chavala.
<b>Tibetan:</b> D. Germano, Tom Gewecke.
<b>Thai:</b> Alan Wood's wife.
<b>Turkish:</b> Vaçe Kundakçı, Tom Gewecke, Merlign Olnon.
<b>Ukrainian:</b> Michael Zajac, Oleg Podsadny.
<b>Ulster Gaelic:</b> Ciarán Ó Duibhín.
<b>Urdu:</b> Mustafa Ali.
<a href="http://nomfoundation.org/"><b>Vietnamese</b></a>: Dixon Au,
[James] Đỗ Bá Phước
<font face="PMingLiU">&#x675c; &#x4f2f; &#x798f;</font>.
<b>Walloon:</b> Pablo Saratxaga.
<b>Welsh:</b> Geiriadur Prifysgol Cymru (Andrew).
<b>Yiddish:</b> Mark David.
<b>Zeneise:</b> Angelo Pavese.

<p>

<dt><b>Tools Used to Create This Web Page:</b></dt>

<dd>The UTF8-aware <a href="k95.html">Kermit 95</a> terminal emulator on
Windows, to a Unix host with the <a
href="http://www.gnu.org/directory/emacs.html">EMACS</a> text editor.  Kermit
95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP
characters as 4 hex digits, as shown <a href="glass.html">HERE</a>.  Hex codes
for Unicode values can be found in <a
href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode
Standard</a> (recommended) and the <a
href="http://www.unicode.org/charts/">online code charts</a>.  When
submissions arrive by email encoded in some other character set (Latin-1,
Latin-2, KOI, various PC code pages, JEUC, etc), I use the TRANSLATE command
of <a href="ckermit.html">C-Kermit</a> on the Unix host (<a
href="safe.html">where I read my mail</a>) to convert the character set to
UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE
command).  That's it -- no "Web authoring" tools, no locales, no "smart"
anything.  It's just plain text, nothing more.  By the way, there's nothing
special about EMACS -- any text editor will do, providing it allows entry of
arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range.  EMACS 21.1
actually supports UTF-8; earlier versions don't know about it and display the
octal codes; either way is OK for this purpose.

<p>

<dt><b>Commentary:</b>
<dd>Date: Wed, 27 Feb 2002 13:21:59 +0100<br>
From: "Bruno DEDOMINICIS" <tt>&lt;b.dedominicis@cite-sciences.fr&gt;</tt><br>
Subject: Je peux manger du verre, cela ne me fait pas mal.

<p>

I just found out your website and it makes me feel like proposing an
interpretation of the choice of this peculiar phrase.

<p>

Glass is transparent and can hurt as everyone knows. The relation between
people and civilisations is sometimes effusional and more often rude. The
concept of breaking frontiers through globalization, in a way, is also an
attempt to deny any difference.  Isn't "transparency" the flag of modernity?
Nothing should be hidden any more, authority is obsolete, and the new powers
are supposed to reign through loving and smiling and no more through
coercion...

<p>

Eating glass without pain sounds like a very nice metaphor of this attempt.
That is, frontiers should become glass transparent first, and be denied by
incorporating them.  On the reverse, it shows that through globalization,
frontiers undergo a process of displacement, that is, when they are not any
more speakable, they become repressed from the speech and are therefore
incorporated and might become painful symptoms, as for example what happens
when one tries to eat glass.

<p>

The frontiers that used to separate bodies one from another tend to divide
bodies from within and make them suffer....  The chosen phrase then appears
as a denial of the symptom that might result from the destitution of
traditional frontiers.

<p>
Best,<br>
Bruno De Dominicis, Paris, France
</dl>
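<p>
The character-set conversion step mentioned under Tools above is done on this
site with Kermit's TRANSLATE command.  Purely as an illustration of the same
idea, and not as a description of how Kermit works, here is a Python sketch;
the function name <tt>to_utf8</tt> is invented, and <tt>koi8-r</tt> and
<tt>latin-1</tt> are codec names from the Python standard library.
<pre>
```python
def to_utf8(raw, source_charset):
    """Decode raw bytes from source_charset and re-encode them as UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


# Simulate a submission arriving in KOI8-R, then convert it.
original = "Я могу есть стекло"
koi8_bytes = original.encode("koi8-r")
utf8_bytes = to_utf8(koi8_bytes, "koi8-r")

# The text survives the round trip; only the byte representation changes.
print(utf8_bytes.decode("utf-8") == original)
print(len(koi8_bytes), len(utf8_bytes))
```
</pre>
<p>
The source character set has to be known in advance: the bytes alone do not
say whether they are Latin-1, Latin-2, or KOI8-R.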
<p>
<b>Other Unicode pages onsite:</b>
<ul>
<li><a href="postal.html">Frank's Compulsive Guide to Postal Addresses</a>
(especially the <a href="postal.html#index">Index</a>)
<li><a href="http://www.columbia.edu/~fdc/pace/">Peace in All Languages</a>
<li><a href="sshclient-be.html">Kermit 95 кліента SSH</a>
(Kermit 95 SSH Client documentation in Belarusian)
<li><a href="st-erkenwald.html">Representing Middle English on the Web with UTF-8</a>
<li><a href="biblio.html">The Kermit Bibliography</a> (in UTF-8)
<li><a href="accents.html">Interchange of Non-English Computer Text</a>
(UTF-8 math and box-drawing)
<li><a href="utf8-t1.html">Unicode Table</a> (in UTF-8)
</ul>
<p>
<b>Unicode samplers and resources offsite:</b>
<ul>
<li><a href="http://rishida.net/scripts/uniview/conversion">Unicode Code
Converter</a> (converts among different Unicode
encoding forms and notations).

<li><a href="http://unicode.org/cldr/utility/confusables.jsp?a=paypal&n=on&x=on">Confusables</a> (every silver lining has a cloud).
<li><a href="http://www.seigniorage.de/">Seigniorage</a> (Central Banks worldwide).
<li>Michael Everson's
<a href="http://www.evertype.com/scriptbib.html">Bibliography of Typography
and Scripts</a>
<li><a href="http://www.code2000.net/englishtestutf.htm">Does your browser
support Unicode English?</a> (James Kass)
<li><a href="http://crism.maden.org/dunno.html">I don't know, I only work here</a>
<li><a href="http://www.trigeminal.com/samples/provincial.html">Anyone
can be provincial!</a>
<!-- defunct
<li><a href="http://www.macchiato.com/unicode/Unicode_transcriptions.html">Transcriptions of "Unicode"</a>
-->
<li><a href="http://www.i18nguy.com/unicode-example.html">Example
Unicode Usage for Business Applications</a>
<li><a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps">UTF-8 and
Unicode FAQ for Unix/Linux</a>
</ul>
<p>
<b>Unicode fonts:</b>
<ul>
<li><a href="http://www.code2000.net/">Code 2000</a> (James Kass)

<li><a href="http://www.alanwood.net/unicode/fonts.html">Unicode Fonts
for Windows Computers</a> (Alan Wood)
<li><a href="http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html">Unicode Fonts and
Tools for X11</a> (Markus Kuhn)
<li><a href="http://www.evertype.com/emono/">Everson Mono</a> (Michael
Everson)
<li><a href="http://www.monotype.com">Agfa Monotype</a> (now fonts.com)
</ul>

<p>
[ <a href="k95.html">Kermit 95</a> ]
[ <a href="glass.html">K95 Screen Shots</a> ]
[ <a href="ckermit.html">C-Kermit</a> ]
[ <a href="index.html">Kermit Home</a> ]
[ <a href="http://www.unicode.org/help/display_problems.html">Display Problems?</a> ]
[ <a href="http://www.unicode.org">The Unicode Consortium</a> ]
<hr>
<ADDRESS>
UTF-8 Sampler / <a href="index.html">The Kermit Project</a> /
<a href="http://www.columbia.edu">Columbia University</a> /
<a href="mailto:kermit@kermitproject.org">kermit@kermitproject.org</a>
</ADDRESS>
</body>
</html>
