1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
2 | <html> |
3 | <head> |
4 | <META http-equiv="Content-Type" content="text/html; charset=utf-8"> |
5 | <title>UTF-8 Sampler</title> |
6 | |
7 | <META http-equiv="Content-Style-Type" content="text/css"> |
8 | <META name="viewport" content="width=device-width, initial-scale=1.0"> |
9 | <LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css"> |
10 | <LINK REL="shortcut icon" href="/favicon.ico" > |
11 | <LINK REL="icon" href="/favicon.ico" type="image/x-icon"> |
12 | <LINK REL="icon" type="image/ico" href="/favicon.ico"> |
13 | <style type="text/css"> |
14 | blockquote { margin-left:8px; margin-right:8px; font-size:90% } |
15 | body { font-size:15px; |
16 | font-family:calibri, arial narrow, arial, sans-serif, times; |
17 | color:black; |
18 | background:white; |
19 | margin:16px; |
20 | } |
21 | tt { font-size:94% } |
22 | </style> |
23 | </head> |
24 | |
25 | <body> |
26 | |
27 | <h1><tt>UTF-8 SAMPLER</tt></h1> |
28 | |
29 | <big><big> ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯ · ₹</big></big> |
30 | |
31 | |
32 | |
33 | <p> |
34 | <blockquote> |
35 | Frank da Cruz<br> |
36 | <a href="index.html">The Kermit Project</a><br> |
37 | New York City<br> |
38 | <a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a> |
39 | |
40 | <p> |
41 | <i>Last update:</i> |
42 | Thu Sep 15 14:00:00 2016 |
43 | </blockquote> |
44 | <p> |
45 | <hr> |
46 | [ <a href="http://www.columbia.edu/~fdc/pace/">PEACE</a> ] |
47 | [ <a href="#poetry">Poetry</a> ] |
48 | [ <a href="#glass">I Can Eat Glass</a> ] |
49 | [ <a href="#quickbrownfox">Pangrams</a> ] |
50 | [ <a href="#html">HTML Features</a> ] |
51 | [ <a href="#credits">Credits, Tools, Commentary</a> ] |
52 | <p> |
53 | |
54 | <big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for |
55 | <a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set |
56 | (UCS). The UCS encodes most of the world's writing systems in a single |
57 | character set, allowing you to mix languages and scripts within a document |
58 | without needing any tricks for switching character sets. This web page is |
59 | encoded directly in UTF-8. |
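<p>
As a minimal sketch of what "ASCII-preserving" means, the following Python fragment (not part of the original page; the sample strings are chosen here purely for illustration) checks that ASCII text has the same bytes in ASCII and in UTF-8, and that a non-ASCII character such as the Euro sign becomes a sequence of bytes that all lie outside the ASCII range:

```python
# Illustrative sketch; the sample strings are assumptions chosen for this example.
ascii_text = "UTF-8 Sampler"
euro = "\u20ac"  # EURO SIGN, U+20AC

# Pure ASCII text is byte-for-byte identical in ASCII and UTF-8.
ascii_bytes = ascii_text.encode("utf-8")
print(ascii_bytes == ascii_text.encode("ascii"))  # True

# A non-ASCII character becomes several bytes, each with the high bit set,
# so none of them can be confused with an ASCII character.
euro_bytes = euro.encode("utf-8")
print(euro_bytes.hex(" "))                 # e2 82 ac
print(all(b >= 0x80 for b in euro_bytes))  # True
```

This is why software that only understands ASCII can usually pass UTF-8 text through unharmed, even if it cannot display it.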
60 | |
61 | <p> |
62 | |
63 | As shown <a href="glass.html">HERE</a>, |
64 | Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation |
65 | software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista, |
66 | or Windows 7/8/10 when using a monospace Unicode font like <a |
67 | href="http://www.monotype.com">Andale Mono WT J</a> or <a |
68 | href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less fully |
69 | populated Courier New, Lucida Console, or Andale Mono. <a |
70 | href="ckermit.html">C-Kermit</a> can handle it too, |
71 | <a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode |
72 | display</a>. As many languages as are representable in your font can be seen |
73 | on the screen at the same time. |
74 | |
75 | <p> |
76 | |
77 | This, however, is a Web page. It started out as a kind of stress test for |
78 | UTF-8 support in Web browsers, which was spotty when this page was first |
79 | created in the 1990s but is now standard in all modern browsers. |
80 | The problem now is mainly font coverage: support for less common blocks (as in |
81 | the <a href="#braille">Braille</a> example below) and for the nonzero Unicode |
82 | planes (as in the <a href="#gothic">Gothic</a> example below). |
83 | To some extent it is also the rendition of combining sequences and |
84 | right-to-left text (<a href="#arabic">Arabic</a>, |
85 | <a href="#hebrew">Hebrew</a>). |
86 | <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for |
87 | a survey of Unicode fonts for Windows. |
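<p>
To make the notion of a Unicode plane concrete, here is a small Python sketch (an illustration added here; the helper name <tt>describe</tt> is hypothetical) that reports the plane and the UTF-8 length of a character. Characters in plane 0, such as the Braille patterns, take at most three bytes in UTF-8, while characters in the higher planes, such as the Gothic letters, take four:

```python
def describe(ch):
    """Return (code point, plane number, UTF-8 byte length) for one character."""
    cp = ord(ch)
    return cp, cp >> 16, len(ch.encode("utf-8"))

braille = "\u2801"      # BRAILLE PATTERN DOTS-1
gothic = "\U00010330"   # GOTHIC LETTER AHSA

for ch in (braille, gothic):
    cp, plane, nbytes = describe(ch)
    print(f"U+{cp:04X} plane {plane}, {nbytes} bytes in UTF-8")
```

Software or fonts that assume every character fits in 16 bits tend to fail on the second kind.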
88 | |
89 | <p> |
90 | |
91 | The subtitle above shows currency symbols of many lands. If they don't |
92 | appear as blobs, we're off to a good start! (The one on the end is the |
93 | <a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee |
94 | sign</a>, which may not show up in older fonts.) |
95 | |
96 | <h3><a name="poetry">Poetry</a></h3> |
97 | |
98 | From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version): |
99 | <p><blockquote> |
100 | ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br> |
101 | ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br> |
102 | ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br> |
103 | </blockquote> |
104 | <p> |
105 | |
106 | From Laȝamon's <i><a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i> |
107 | (<i>The Chronicles of England</i>, Middle English, West Midlands): |
108 | <p> |
109 | <blockquote> |
110 | An preost wes on leoden, Laȝamon was ihoten<br> |
111 | He wes Leovenaðes sone -- liðe him be Drihten.<br> |
112 | He wonede at Ernleȝe at æðelen are chirechen,<br> |
113 | Uppen Sevarne staþe, sel þar him þuhte,<br> |
114 | Onfest Radestone, þer he bock radde. |
115 | </blockquote> |
116 | <p> |
117 | |
118 | (The third letter in the author's name is Yogh, missing from many fonts; |
119 | <a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample |
120 | with some explanation of letters and encoding). |
121 | |
122 | <p> |
123 | |
124 | From the <cite>Tagelied</cite> of |
125 | |
126 | <a href="http://gutenberg.spiegel.de/autoren/eschenba.htm"> |
127 | <b>Wolfram von Eschenbach</b></a> (Middle High German): |
128 | <p><blockquote> |
129 | Sîne klâwen durh die wolken sint geslagen,<br> |
130 | er stîget ûf mit grôzer kraft,<br> |
131 | ich sih in grâwen tägelîch als er wil tagen,<br> |
132 | den tac, der im geselleschaft<br> |
133 | erwenden wil, dem werden man,<br> |
134 | den ich mit sorgen în verliez.<br> |
135 | ich bringe in hinnen, ob ich kan.<br> |
136 | sîn vil manegiu tugent michz leisten hiez.<br> |
137 | </blockquote><p> |
138 | |
139 | Some lines of |
140 | <a href="http://users.hol.gr/~artemis/odysseas_elytis.htm"> |
141 | <b>Odysseus Elytis</b></a> (Greek): |
142 | |
143 | <blockquote> |
144 | <table cellspacing=0 cellpadding=0> |
145 | <tr> |
146 | <td valign="top" style="padding-right:16px"> |
147 | Monotonic: |
148 | <p> |
149 | Τη γλώσσα μου έδωσαν ελληνική<br> |
150 | το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br> |
151 | Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br> |
152 | <p> |
153 | από το Άξιον Εστί<br> |
154 | του Οδυσσέα Ελύτη |
155 | |
156 | <td valign="top"> |
157 | Polytonic: |
158 | <p> |
159 | Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ<br/> |
160 | τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> |
161 | Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> |
162 | <p> |
163 | ἀπὸ τὸ Ἄξιον Ἐστί<br/> |
164 | τοῦ Ὀδυσσέα Ἐλύτη<br/> |
165 | |
166 | |
167 | |
168 | |
169 | |
170 | |
171 | |
172 | </table> |
173 | </blockquote> |
174 | |
175 | <p> |
176 | |
177 | The first stanza of |
178 | <a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br> |
179 | <p><blockquote> |
180 | На берегу пустынных волн<br> |
181 | Стоял он, дум великих полн,<br> |
182 | И вдаль глядел. Пред ним широко<br> |
183 | Река неслася; бедный чёлн<br> |
184 | По ней стремился одиноко.<br> |
185 | По мшистым, топким берегам<br> |
186 | Чернели избы здесь и там,<br> |
187 | Приют убогого чухонца;<br> |
188 | И лес, неведомый лучам<br> |
189 | В тумане спрятанного солнца,<br> |
190 | Кругом шумел.<br> |
191 | </blockquote><p> |
192 | |
193 | <a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Å ota Rustaveli</b></a>'s VepÌxis TÌ£qÌaosani, |
194 | ̣︡Th, <cite>The Knight in the Tiger's Skin</cite> (Georgian):<p> |
195 | <blockquote> |
196 | áááá®áá¡ á¢á§ááá¡ááá |
197 | á¨ááá á á£á¡áááááá |
198 | <p> |
199 | á¦ááá áá¡á á¨áááááá á, áá£áᣠáááá áááá®á¡ááá¡ á¡áá¤ááá¡á á¨á áááá¡á, |
200 | áªááªá®áá¡, á¬á§ááá¡á áá ááá¬áá¡á, á°ááá áá áááá áá áááá¡á; |
201 | ááááªááá¡ á¤á áááá áá áá¦áá¤á áááá, áááá°á®ááá ááá¡ á©ááá¡á áááááá¡á, |
202 | áá¦áá¡áá áá á¦áááá áá°á®áááááá áááá¡á áááááá áá áááááá¡á. |
203 | </blockquote> |
204 | <p> |
205 | |
206 | Tamil poetry of Subramaniya Bharathiyar: |
207 | |
208 | à®à¯à®ªà¯à®°à®®à®£à®¿à®¯ பாரதியார௠(1882-1921): |
209 | |
210 | <p> |
211 | <blockquote> |
212 | |
213 | யாமறிநà¯à®¤ à®®à¯à®´à®¿à®à®³à®¿à®²à¯ தமிழà¯à®®à¯à®´à®¿ பà¯à®²à¯ à®à®©à®¿à®¤à®¾à®µà®¤à¯ à®à®à¯à®à¯à®®à¯ à®à®¾à®£à¯à®®à¯, <br> |
214 | பாமரராய௠விலà®à¯à®à¯à®à®³à®¾à®¯à¯, à®à®²à®à®©à¯à®¤à¯à®¤à¯à®®à¯ à®à®à®´à¯à®à¯à®à®¿à®à¯à®²à®ªà¯ பானà¯à®®à¯ à®à¯à®à¯à®à¯, <br> |
215 | நாமமத௠தமிழரà¯à®©à®à¯ à®à¯à®£à¯à®à¯ à®à®à¯à®à¯ வாழà¯à®¨à¯à®¤à®¿à®à¯à®¤à®²à¯ நனà¯à®±à¯? à®à¯à®²à¯à®²à¯à®°à¯!<br> |
216 | தà¯à®®à®¤à¯à®°à®¤à¯ தமிழà¯à®à¯ à®à®²à®à®®à¯à®²à®¾à®®à¯ பரவà¯à®®à¯à®µà®à¯ à®à¯à®¯à¯à®¤à®²à¯ வà¯à®£à¯à®à¯à®®à¯. |
217 | |
218 | </blockquote> |
219 | <p> |
220 | Kannada poetry by Kuvempu — ಬಾ à²à²²à³à²²à²¿ ಸà²à²à²µà²¿à²¸à³ |
221 | |
222 | <p> |
223 | <blockquote> |
224 | |
225 | |
226 | ಬಾ à²à²²à³à²²à²¿ ಸà²à²à²µà²¿à²¸à³ à²à²à²¦à³à²¨à³à²¨ ಹà³à²¦à²¯à²¦à²²à²¿ |
227 | <br> |
228 | |
229 | ನಿತà³à²¯à²µà³ ಠವತರಿಪ ಸತà³à²¯à²¾à²µà²¤à²¾à²° |
230 | |
231 | <p> |
232 | |
233 | |
234 | |
235 | |
236 | ಮಣà³à²£à²¾à²à²¿ ಮರವಾà²à²¿ ಮಿà²à²µà²¾à²à²¿ à²à²à²µà²¾à²à³... |
237 | |
238 | <br> |
239 | |
240 | ಮಣà³à²£à²¾à²à²¿ ಮರವಾà²à²¿ ಮಿà²à²µà²¾à²à²¿ à²à²à²µà²¾à²à²¿ |
241 | |
242 | <br> |
243 | |
244 | à²à²µ à²à²µà²¦à²¿ à²à²¤à²¿à²¸à²¿à²¹à³ à²à²µà²¤à²¿ ದà³à²° |
245 | |
246 | <br> |
247 | |
248 | ನಿತà³à²¯à²µà³ ಠವತರಿಪ ಸತà³à²¯à²¾à²µà²¤à²¾à²° || ಬಾ à²à²²à³à²²à²¿ || |
249 | |
250 | |
251 | </blockquote> |
252 | |
253 | <h3><a name="glass">I Can Eat Glass</a></h3> |
254 | |
255 | And from the sublime to the ridiculous, here is a |
256 | <a href="#notes">certain phrase¹</a> in an assortment of languages: |
257 | |
258 | <p> |
259 | <ol> |
260 | <li><b>Sanskrit</b>: à¤à¤¾à¤à¤ शà¤à¥à¤¨à¥à¤®à¥à¤¯à¤¤à¥à¤¤à¥à¤®à¥ । नà¥à¤ªà¤¹à¤¿à¤¨à¤¸à¥à¤¤à¤¿ मामॠ॥ |
261 | |
262 | <li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām. |
263 | <li><b>Classical Greek</b>: ὕαλον φαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει. |
264 | <li><b>Greek</b> (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα. |
265 | <li><b>Greek</b> (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα. |
266 | |
267 | <br><b>Etruscan</b>: (NEEDED) |
268 | <li><b>Latin</b>: Vitrum edere possum; mihi non nocet. |
269 | <li><b>Old French</b>: Je puis mangier del voirre. Ne me nuit. |
270 | <li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal. |
271 | <li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas. |
272 | <li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal. |
273 | <li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må. |
274 | <br><b>Champenois</b>: (NEEDED) |
275 | <br><b>Lorrain</b>: (NEEDED) |
276 | <li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma. |
277 | <br><b>Corsican/Corsu</b>: (NEEDED) |
278 | <br><b>Jèrriais</b>: (NEEDED) |
279 | <li><b>Kreyòl Ayisyen</b> (Haitï): Mwen kap manje vè, li pa blese'm. |
280 | <li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen. |
281 | <li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal. |
282 | <li><b>Spanish</b>: Puedo comer vidrio, no me hace daño. |
283 | <li><b>Aragonés</b>: Puedo minchar beire, no me'n fa mal . |
284 | <br><b>Aranés</b>: (NEEDED) |
285 | <br><b>Mallorquín</b>: (NEEDED) |
286 | <li><b>Galician</b>: Eu podo xantar cristais e non cortarme. |
287 | <li><b>European Portuguese</b>: Posso comer vidro, não me faz mal. |
288 | <li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>): |
289 | Posso comer vidro, não me machuca. |
290 | <li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'. |
291 | <li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño. |
292 | <li><b>Italian</b>: Posso mangiare il vetro e non mi fa male. |
293 | <li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal. |
294 | <li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male. |
295 | <li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal. |
296 | <li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae. |
297 | <li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ. |
298 | <li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali. |
299 | <br><b>Campidanese</b> (Sardinia): (NEEDED) |
300 | <br><b>Logudorese</b> (Sardinia): (NEEDED) |
301 | <li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai. |
302 | <br><b>Romany / Tsigane</b>: (NEEDED) |
303 | <li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește. |
304 | <li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min. |
305 | <br><b>Pictish</b>: (NEEDED) |
306 | <br><b>Breton</b>: (NEEDED) |
307 | <li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya. |
308 | <li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi. |
309 | <li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee. |
310 | <li><b>Old Irish</b> <i>(Ogham):</i> ááááá áááááááááááááá ááá ááááá áá |
311 | <li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna. |
312 | |
313 | <li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom. |
314 | <li><b>Ulster Gaelic</b>: Ithim-sa gloine agus ní miste damh é. |
315 | <li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi. |
316 | <li><b>Anglo-Saxon</b> <i>(Runes):</i> |
317 | áá³á«áá¨á·á«á·áá¨áá«áá©ááªá¾á«á©á¾áá«á»ááá«á¾áá«á»ááªá±áááªá§á«ááᬠ|
318 | <li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me. |
319 | <li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt. |
320 | <li><b>English</b>: I can eat glass and it doesn't hurt me. |
321 | <li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation) |
322 | <li id="braille"><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑ |
323 | <li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi. |
324 | <li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us. |
325 | <br><b>Glaswegian</b>: (NEEDED) |
326 | <li id="gothic"><b>Gothic</b> (<a href="#notes">4</a>): |
327 | ð¼ð°ð² |
328 | ð²ð»ð´ð |
329 | ð¹Ìðð°ð½, |
330 | ð½ð¹ |
331 | ð¼ð¹ð |
332 | ð ð¿ |
333 | ð½ð³ð°ð½ |
334 | ð±ðð¹ð²ð²ð¹ð¸. |
335 | <li><b>Old Norse</b> <i>(Runes):</i> áá´ á·áá ááá |
336 | ᧠á·ááá± áá¾ |
337 | á¦ááá á¨á§ á¡á |
338 | á±á§á¨ áá¨á± |
339 | |
340 | <li><b>Old Norse</b> <i>(Latin):</i> Ek get etið gler án þess að verða sár. |
341 | |
342 | <li><b>Norsk / Norwegian (Nynorsk):</b> Eg kan eta glas utan å skada meg. |
343 | <li><b>Norsk / Norwegian (Bokmål):</b> Jeg kan spise glass uten å skade meg. |
344 | <li><b>Føroyskt / Faroese</b>: Eg kann eta glas, skaðaleysur. |
345 | <!-- <br><b>Føroyskt / Faroese</b>: Eg kann eta glas, uttan á nakran hátt at meinslast av hesum. --> |
346 | <li><b>Íslenska / Icelandic</b>: Ég get etið gler án þess að meiða mig. |
347 | <li><b>Svenska / Swedish</b>: Jag kan äta glas utan att skada mig. |
348 | <li><b>Dansk / Danish</b>: Jeg kan spise glas, det gør ikke ondt på mig. |
349 | <li><b>Sønderjysk</b>: Æ ka æe glass uhen at det go mæ naue. |
350 | <li><b>Frysk / Frisian</b>: Ik kin glês ite, it docht me net sear. |
351 | <!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet mij geen pijn. --> |
352 | <!-- <li><b>Nederlands / Dutch</b>: Ik kan glas eten zonder dat het |
353 | mij |
354 | schaadt. --> |
355 | <!-- <li><tt>Dutch: Ik kan glas eten, maar dat doet mij geen kwaad.</tt> --> |
356 | <li><b>Nederlands / Dutch</b>: Ik kan glas eten, het doet |
357 | mij |
358 | geen kwaad. |
359 | |
360 | |
361 | <LI><B>Kirchröadsj/Bôchesserplat</B>: Iech ken glaas èèse, mer 't deet miech |
362 | jing pieng.</LI> |
363 | |
364 | <li><b>Afrikaans</b>: Ek kan glas eet, maar dit doen my nie skade nie. |
365 | <li><b>Lëtzebuergescht / Luxemburgish</b>: Ech kan Glas iessen, daat deet mir nët wei. |
366 | <li><b>Deutsch / German</b>: Ich kann Glas essen, ohne mir zu schaden. |
367 | <li><b>Ruhrdeutsch</b>: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut. |
368 | <li><b>Langenfelder Platt</b>: |
369 | Isch kann Jlaas kimmeln, uuhne datt mich datt weh dääd. |
370 | <li><b>Lausitzer Mundart</b> ("Lusatian"): Ich koann Gloos assn und doas |
371 | dudd merr ni wii. |
372 | <li><b>Odenwälderisch</b>: Iech konn glaasch voschbachteln ohne dass es mir ebbs daun doun dud. |
373 | <li><b>Sächsisch / Saxon</b>: 'sch kann Glos essn, ohne dass'sch mer wehtue. |
374 | <li><b>Pfälzisch</b>: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud. |
375 | <li><b>Schwäbisch / Swabian</b>: I kå Glas frässa, ond des macht mr nix! |
376 | <li><b>Deutsch (Vorarlberg)</b>: I ka glas eassa, ohne dass mar weh tuat. |
377 | <li><b>Bayrisch / Bavarian</b>: I koh Glos esa, und es duard ma ned wei. |
378 | <li><b>Allemannisch</b>: I kaun Gloos essen, es tuat ma ned weh. |
379 | |
380 | <li><b>Schwyzerdütsch</b> (Zürich): Ich chan Glaas ässe, das schadt mir nöd. |
381 | <li><b>Schwyzerdütsch</b> (Luzern): Ech cha Glâs ässe, das schadt mer ned. |
382 | |
383 | <br><b>Plautdietsch</b>: (NEEDED) |
384 | <li><b>Hungarian</b>: Meg tudom enni az üveget, nem lesz tőle bajom. |
385 | <li><b>Suomi / Finnish</b>: Voin syödä lasia, se ei vahingoita minua. |
386 | <li><b>Sami (Northern)</b>: Sáhtán borrat lása, dat ii leat bávččas. |
387 | <li><b>Erzian</b>: Мон ярсан суликадо, ды зыян эйстэнзэ а ули. |
392 | <li><b>Northern Karelian</b>: Mie voin syvvä lasie ta minla ei ole kipie. |
393 | <li><b>Southern Karelian</b>: Minä voin syvvä st'oklua dai minule ei ole kibie. |
394 | <br><b>Vepsian</b>: (NEEDED) |
395 | <br><b>Votian</b>: (NEEDED) |
396 | <br><b>Livonian</b>: (NEEDED) |
397 | <li><b>Estonian</b>: Ma võin klaasi süüa, see ei tee mulle midagi. |
398 | <li><b>Latvian</b>: Es varu ēst stiklu, tas man nekaitē. |
399 | <li><b>Lithuanian</b>: Aš galiu valgyti stiklą ir jis manęs nežeidžia |
400 | <br><b>Old Prussian</b>: (NEEDED) |
401 | <br><b>Sorbian</b> (Wendish): (NEEDED) |
402 | <li><b>Czech</b>: Mohu jíst sklo, neublíží mi. |
403 | <li><b>Slovak</b>: Môžem jesť sklo. Nezraní ma. |
404 | <li><b>Polska / Polish</b>: Mogę jeść szkło i mi nie szkodzi. |
405 | <li><b>Slovenian:</b> Lahko jem steklo, ne da bi mi škodovalo. |
406 | |
407 | <!-- |
408 | <li><b>Croatian</b>: Ja mogu jesti staklo i ne boli me. |
409 | Serbian translation is very poor. Infinitive used and sound as: "I can |
410 | eating glass". |
411 | <li><b>Serbian</b> <i>(Latin):</i> Mogu jesti staklo a da mi ne Å¡kodi. |
412 | <li><b>Serbian</b> <i>(Cyrillic):</i> ÐÐ¾Ð³Ñ ÑеÑÑи ÑÑакло |
413 | а |
414 | да ми |
415 | не |
416 | Ñкоди. |
417 | <li><b>Serbian</b> <i>(Latin):</i> Ja mogu da jedem staklo. |
418 | <li><b>Serbian</b> <i>(Cyrillic)</i>: Ðа Ð¼Ð¾Ð³Ñ Ð´Ð° Ñедем ÑÑакло. |
419 | <li><b>Macedonian:</b> Ðожам да Ñадам ÑÑакло, а не ме ÑÑеÑа. |
420 | --> |
421 | <li><b>Bosnian, Croatian, Montenegrin and Serbian</b> <i>(Latin)</i>: Ja mogu jesti staklo, i to mi ne šteti. |
422 | |
423 | <li><b>Bosnian, Montenegrin and Serbian</b> <i>(Cyrillic)</i>: Ја могу јести стакло, и то ми не штети. |
424 | |
425 | <li><b>Macedonian:</b> Можам да јадам стакло, а не ме штета. |
426 | <li><b>Russian</b>: Я могу есть стекло, оно мне не вредит. |
427 | <li><b>Belarusian</b> <i>(Cyrillic):</i> Я магу есці шкло, яно мне не шкодзіць. |
428 | <li><b>Belarusian</b> <i>(Lacinka):</i> Ja mahu jeści škło, jano mne ne škodzić. |
429 | <!-- |
430 | <li><b>Ukrainian</b>: Я Ð¼Ð¾Ð¶Ñ ÑÑÑи Ñкло, й воно Ð¼ÐµÐ½Ñ Ð½Ðµ поÑкодиÑÑ. |
431 | --> |
432 | <li><b>Ukrainian</b>: Я можу їсти скло, і воно мені не зашкодить. |
433 | |
434 | <!-- <li><b>Bulgarian</b>: Ðога да Ñм ÑÑÑкло и не ме боли. --> |
435 | <li><b>Bulgarian</b>: Мога да ям стъкло, то не ми вреди. |
436 | |
437 | <li><b>Georgian</b>: მინას ვჭამ და არა მტკივა. |
438 | <li><b>Armenian</b>: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։ |
439 | <li><b>Albanian</b>: Unë mund të ha qelq dhe nuk më gjen gjë. |
440 | <li><b>Turkish</b>: Cam yiyebilirim, bana zararı dokunmaz. |
441 | <li><b>Turkish</b> <i>(Ottoman):</i> جام ييه بلورم بڭا ضررى طوقونمز |
442 | <li><b>Bangla / Bengali</b>: |
443 | à¦à¦®à¦¿ à¦à¦¾à¦à¦ à¦à§à¦¤à§ পারি, তাতৠà¦à¦®à¦¾à¦° à¦à§à¦¨à§ à¦à§à¦·à¦¤à¦¿ হৠনা। |
444 | <li><b>Marathi</b>: मॠà¤à¤¾à¤ à¤à¤¾à¤ शà¤à¤¤à¥, मला तॠदà¥à¤à¤¤ नाहà¥. |
445 | |
446 | <!-- |
447 | <li><b>Hindi</b>: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤, मà¥à¤à¥ à¤à¤¸ सॠà¤à¥à¤ पà¥à¤¡à¤¾ नहà¥à¤ हà¥à¤¤à¥. |
448 | --> |
449 | |
450 | <li><b>Kannada</b>: |
451 | |
452 | |
453 | ನನà²à³ ಹಾನಿ à²à²à²¦à³, ನಾನೠà²à²à²¨à³à²¨à³ ತಿನಬಹà³à²¦à³ |
454 | |
455 | |
456 | <!-- |
457 | |
458 | (à²à²¨à³à²¨à²¡): à²à²²à³à²²à²¾à²¦à²°à³ à²à²°à³, à²à²à²¤à²¾à²¦à²°à³ à²à²°à³, à²à²à²¦à³à²à²¦à²¿à²à³ ನೠà²à²¨à³à²¨à²¡à²µà²¾à²à²¿à²°à³, à²à²¨à³à²¨à²¡à²µà³ ಸತà³à²¯.. à²à²¨à³à²¨à²¡à²µà³ ನಿತà³à²¯.. |
459 | |
460 | --> |
461 | |
462 | <li><b>Hindi</b>: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤ à¤à¤° मà¥à¤à¥ à¤à¤¸à¤¸à¥ à¤à¥à¤ à¤à¥à¤ नहà¥à¤ पहà¥à¤à¤à¤¤à¥. |
463 | |
464 | |
465 | <li><b>Malayalam</b>: |
466 | |
467 | à´à´¨à´¿à´àµà´àµ à´àµà´²à´¾à´¸àµ തിനàµà´¨à´¾à´. à´ à´¤àµà´¨àµà´¨àµ à´µàµà´¦à´¨à´¿à´ªàµà´ªà´¿à´àµà´à´¿à´²àµà´². |
468 | |
469 | |
470 | |
471 | <li><b>Tamil</b>: நான௠à®à®£à¯à®£à®¾à®à®¿ à®à®¾à®ªà¯à®ªà®¿à®à¯à®µà¯à®©à¯, ஠தனால௠à®à®©à®à¯à®à¯ à®à®°à¯ à®à¯à®à¯à®®à¯ வராதà¯. |
472 | |
473 | |
474 | <li><b>Telugu</b>: à°¨à±à°¨à± à°à°¾à°à± తినà°à°²à°¨à± మరియౠఠలా à°à±à°¸à°¿à°¨à°¾ నాà°à± à°à°®à°¿ à°à°¬à±à°¬à°à°¦à°¿ à°²à±à°¦à± |
475 | |
476 | |
477 | <li><b>Sinhalese</b>: මට à·à·à¶¯à·à¶»à· à¶à·à¶¸à¶§ à·à·à¶à·à¶ºà·. à¶à¶ºà·à¶±à· මට à¶à·à·à· à·à·à¶±à·à¶ºà¶à· à·à·à¶¯à· නà·à·à·. |
478 | |
479 | <li><b>Urdu</b><a href="#notes">(3)</a>: <span dir="RTL" lang=UR> |
480 | میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی ۔</span> |
481 | <li><b>Pashto</b><a href="#notes">(3)</a>: زه شيشه خوړلې شم، هغه ما نه خوږوي |
482 | <li><b>Farsi / Persian</b><a href="#notes">(3)</a>: .من می توانم بدونِ احساس درد شيشه بخورم |
483 | <li id="arabic"><b>Arabic</b><a href="#notes">(3)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span> |
484 | |
485 | <br><B>Aramaic</B>: (NEEDED) |
486 | <li><b>Maltese</b>: Nista' niekol il-ħġieġ u ma jagħmilli xejn. |
487 | <li id="hebrew"><B>Hebrew</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN> |
488 | <li><B>Yiddish</B><a href="#notes">(3)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN> |
489 | <br><b>Judeo-Arabic</b>: (NEEDED) |
490 | <br><b>Ladino</b>: (NEEDED) |
491 | <br><b>Gǝʼǝz</b>: (NEEDED) |
492 | <br><b>Amharic</b>: (NEEDED) |
493 | <li><b>Twi</b>: Metumi awe tumpan, ɛnyɛ me hwee. |
494 | <li><b>Hausa</b> (<i>Latin</i>): InaÌ iya taunar gilaÌshi kuma in gamaÌ laÌfiyaÌ. |
495 | <li><b>Hausa</b> (<i>Ajami</i>) <a href="#notes">(2)</a>: <SPAN dir=rtl lang=HA> |
496 | Ø¥ÙÙا Ø¥ÙÙ٠تÙÙÙÙر غÙÙÙاش٠ÙÙ٠٠إÙ٠غÙÙ Ùا ÙÙاÙÙÙÙا</SPAN> |
497 | <li><b>Yoruba</b><a href="#notes">(4)</a>: Mo lè je̩ dígí, kò ní pa mí lára. |
498 | <li><b>Lingala</b>: Nakokí kolíya biténi bya milungi, ekosála ngáí mabé tɛ́. |
499 | |
500 | <!-- |
501 | <li><b>Lingala</b>: Nakokà kolÃya biténi bya milungi, ekosála ngáà mabé tÉÌ. |
502 | --> |
503 | <li><b>(Ki)Swahili</b>: Naweza kula bilauri na sikunyui. |
504 | |
505 | <li><b>Malay</b>: Saya boleh makan kaca dan ia tidak mencederakan saya. |
506 | <li><b>Tagalog</b>: Kaya kong kumain nang bubog at hindi ako masaktan. |
507 | <li><b>Chamorro</b>: Siña yo' chumocho krestat, ti ha na'lalamen yo'. |
508 | <li><b>Fijian</b>: Au rawa ni kana iloilo, ia au sega ni vakacacani kina. |
509 | <li><b>Javanese</b>: Aku isa mangan beling tanpa lara. |
510 | <li><b>Burmese</b> (Unicode 4.0): |
511 | áá¹áá¹ááá¹âáá±á¬á¹âááá¹áá¹ááá¹âá áá¹ááá¹âá á¬á¸áá¯ááá¹âááá¹âá ááá¹áá±á¬áá¹âá· |
512 | áááá¯ááá¹âáá¹áᯠááá¹áááá¬á |
513 | (9) |
514 | |
515 | <li><b>Burmese</b> (Unicode 5.0): |
516 | áá»á½ááºáá±á¬áº áá»á½ááºá áá¾ááºá á¬á¸ááá¯ááºáááºá áááºá¸áá¼á±á¬ááºá· ááááá¯ááºáá¾á¯ááá¾ááá«á |
517 | (9) |
518 | |
519 | <li><B>Vietnamese (quốc ngữ)</B>: Tôi có thể ăn thủy tinh mà không hại gì. |
520 | <li><B>Vietnamese (nôm)</B> (<a href="#notes">4</a>): äº ð£ ä¸ å¹ æ°´ æ¶ ð¦¡ ç©º ð£ 害 å¦ |
521 | <li><b>Khmer</b>: |
522 | áááá»áá¢á¶á áá»ááááá áááá¶á |
523 | ááááááá¶ááááá á¶á |
524 | |
525 | |
526 | <li><b>Lao</b>: |
527 | àºàºà»àºàºàº´àºà»àºà»àº§à»àºà»à»àºàºàºàºµà»àº¡àº±àºàºà»à»à»àºà»à»àº®àº±àºà»àº«à»àºàºà»àºà»àºàº±àº. |
528 | |
529 | |
530 | |
531 | <li><b>Thai</b>: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ |
532 | <li><b>Mongolian</b> <i>(Cyrillic):</i> Ðи Ñил идÑй Ñадна, надад Ñ Ð¾ÑÑой Ð±Ð¸Ñ |
533 | <li><b>Mongolian</b> <i>(Classic)</i> (<a href="#notes">5</a>): |
534 | á ªá ¢ á °á ¢á ¯á ¢ á ¢á ³á ¡á ¶á ¦ á ´á ¢á ³á á ¨á á á ¨á á ³á ¤á · á ¬á £á ¤á ·á á ³á á ¢ á ªá ¢á °á ¢ |
535 | <br><b>Dzongkha</b>: (NEEDED) |
536 | <li><b>Nepali</b>: म à¤à¤¾à¤à¤ à¤à¤¾à¤¨ सà¤à¥à¤à¥ र मलाठà¤à¥à¤¹à¤¿ नॠहà¥à¤¨à¥âनॠ। |
537 | |
538 | <li><b>Tibetan</b>: ཤེལà¼à½¦à¾à½¼à¼à½à¼à½à½¦à¼à½à¼à½à¼à½à½²à¼à½à¼à½¢à½ºà½à¼ |
539 | <li><b>Chinese</b>: <span lang=zh>我能吞下玻璃而不伤身体。</span> |
540 | <li><b>Chinese</b> (Traditional): 我能吞下玻璃而不傷身體。 |
541 | |
542 | <li><b>Taiwanese</b><a href="#notes">(6)</a>: Góa ē-tàng chia̍h po-lê, mā bē tio̍h-siong. |
543 | <li><b>Japanese</b>: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span> |
544 | <li><b>Korean</b>: <span lang=ko>나는 유리를 먹을 수 있어요. 그래도 아프지 않아요</span> |
545 | <li><b>Bislama</b>: Mi save kakae glas, hemi no save katem mi.<br> |
546 | <li><b>Hawaiian</b>: Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha.<br> |
547 | <li><b>Marquesan</b>: E koʻana e kai i te karahi, mea ʻā, ʻaʻe hauhau. |
548 | <li><b>Inuktitut</b> (10): áááá ááááááᯠá±áá±á¦áááá áá |
549 | |
550 | <li><b>Chinook Jargon:</b> Naika məkmək kakshət labutay, pi weyk ukuk munk-sik nay. |
551 | <li><b>Navajo</b>: Tsésǫʼ yishą́ągo bííníshghah dóó doo shił neezgai da. |
552 | <br><b>Cherokee</b> <i>(and Cree, Chickasaw, Micmac, Ojibwa, Lakota, |
553 | Náhuatl, Quechua, Aymara, |
554 | and other American languages):</i> (NEEDED) |
555 | <br><b>Garifuna</b>: (NEEDED) |
556 | <br><b>Gullah</b>: (NEEDED) |
557 | <li><b>Lojban</b>: mi kakne le nu citka le blaci .iku'i le se go'i na xrani mi |
558 | <li><b>Nórdicg</b>: Ljœr ye caudran créneþ ý jor cẃran. |
559 | </ol> |
560 | <p> |
561 | |
562 | <i>(Additions, corrections, completions,</i> |
563 | <a href="mailto:kermit@kermitproject.org"><i>gratefully accepted</i></a><i>.)</i> |
564 | |
565 | <p> |
566 | For testing purposes, some of these are repeated in a <b>monospace font</b> . . . |
567 | <p> |
568 | <ol> |
569 | <li><tt>Euro Symbol: €.</tt> |
570 | <li><tt>Greek: Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.</tt> |
571 | |
572 | <li><tt>Íslenska / Icelandic: Ég get etið gler án þess að meiða mig.</tt> |
573 | |
574 | <li><tt>Polish: Mogę jeść szkło, i mi nie szkodzi.</tt> |
575 | <li><tt>Romanian: Pot să mănânc sticlă și ea nu mă rănește.</tt> |
576 | <li><tt>Ukrainian: Я можу їсти скло, й воно мені не пошкодить.</tt> |
577 | <li><tt>Armenian: Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։</tt> |
578 | <li><tt>Georgian: მინას ვჭამ და არა მტკივა.</tt> |
579 | <li><tt>Hindi: मà¥à¤ à¤à¤¾à¤à¤ à¤à¤¾ सà¤à¤¤à¤¾ हà¥à¤, मà¥à¤à¥ à¤à¤¸ सॠà¤à¥à¤ पà¥à¤¡à¤¾ नहà¥à¤ हà¥à¤¤à¥.</tt> |
580 | <li><tt>Hebrew<a href="#notes">(2)</a>: <SPAN dir=rtl lang=HE>אני יכול לאכול זכוכית וזה לא מזיק לי.</SPAN></tt> |
581 | <li><tt>Yiddish<a href="#notes">(2)</a>: <SPAN dir=rtl lang=JI>איך קען עסן גלאָז און עס טוט מיר נישט װײ.</SPAN></tt> |
582 | <li><tt>Arabic<a href="#notes">(2)</a>: <span dir="RTL" lang=AR>أنا قادر على أكل الزجاج و هذا لا يؤلمني.</span></tt> |
583 | <li><tt>Japanese: <span lang=ja>私はガラスを食べられます。それは私を傷つけません。</span></tt> |
584 | <li><tt>Thai: ฉันกินกระจกได้ แต่มันไม่ทำให้ฉันเจ็บ</tt> |
585 | </ol> |
586 | <p> |
587 | |
588 | <b><a name="notes">Notes:</a></b> |
589 | |
590 | <p> |
591 | <ol> |
592 | |
593 | <li>The "I can eat glass" phrase and initial translations (about 30 of them) |
594 | were borrowed from Ethan Mollick's <a |
595 | href="http://hcs.harvard.edu/~igp/glass.html">I Can Eat Glass</a> page |
596 | (which disappeared on or about June 2004) and converted to UTF-8. Since |
597 | Ethan's original page is gone, I should mention that his purpose was to offer |
598 | travelers a phrase they could use in any country that would command a |
599 | certain kind of respect, or at least get attention. See <a |
600 | href="#credits">Credits</a> for the many additional contributions since |
601 | then. When submitting new entries, note that the word "hurt" (if you have a choice) |
602 | is meant in the sense of "cause harm", "do damage", or "bother", rather than |
603 | "inflict pain" or "make sad". In this vein Otto Stolz comments (as do |
604 | others further down; personally I think it's better for the purpose of this |
605 | page to have extra entries and/or to show a greater repertoire of characters |
606 | than it is to enforce a strict interpretation of the word "hurt"!): |
607 | |
608 | <p> |
609 | <blockquote> |
610 | |
611 | This is the meaning I have translated to the Swabian dialect. |
612 | |
613 | However, I just have noticed that most of the German variants |
614 | translate the "inflict pain" meaning. The German example should |
615 | read: |
616 | |
617 | <p> |
618 | <blockquote> |
619 | "Ich kann Glas essen ohne mir zu schaden." |
620 | </blockquote> |
621 | <p> |
622 | |
623 | rather than: |
624 | |
625 | <p> |
626 | <blockquote> |
627 | "Ich kann Glas essen, ohne mir weh zu tun." |
628 | </blockquote> |
629 | <p> |
630 | |
631 | (The comma fell victim to the 1996 orthographic reform, |
632 | cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a>.) |
633 | |
634 | <p> |
635 | |
636 | You may wish to contact the contributors of the following translations |
637 | to correct them: |
638 | |
639 | <p> |
640 | <ul> |
641 | |
642 | <li> Lëtzebuergescht / Luxemburgish: Ech kan Glas iessen, daat deet mir nët wei. |
643 | <li> Lausitzer Mundart ("Lusatian"): Ich koann Gloos assn und doas dudd merr ni wii. |
644 | <li> Sächsisch / Saxon: 'sch kann Glos essn, ohne dass'sch mer wehtue. |
645 | <li> Bayrisch / Bavarian: I koh Glos esa, und es duard ma ned wei. |
646 | <li> Allemannisch: I kaun Gloos essen, es tuat ma ned weh. |
647 | <li> Schwyzerdütsch: Ich chan Glaas ässe, das tuet mir nöd weeh. |
648 | </ul> |
649 | <p> |
650 | |
651 | In contrast, I deem the following translations *alright*: |
652 | |
653 | <p> |
654 | <ul> |
655 | |
656 | <li> Ruhrdeutsch: Ich kann Glas verkasematuckeln, ohne dattet mich wat jucken tut. |
657 | <li> Pfälzisch: Isch konn Glass fresse ohne dasses mer ebbes ausmache dud. |
658 | <li> Schwäbisch / Swabian: I kå Glas frässa, ond des macht mr nix! |
659 | </ul> |
660 | <p> |
661 | |
662 | (However, you could remove the commas, on account of |
663 | <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P76</tt></a> |
664 | and |
665 | |
666 | <a href="http://www.ids-mannheim.de/reform/e3-1.html#P72"><tt>http://www.ids-mannheim.de/reform/e3-1.html#P72</tt></a>, respectively.) |
667 | |
668 | <p> |
669 | |
670 | I guess, also these examples translate the <i>wrong</i> sense of "hurt", |
671 | though I do not know these languages well enough to assert them |
672 | definitely: |
673 | |
674 | <p> |
675 | <ul> |
676 | |
677 | <li> Nederlands / Dutch: Ik kan glas eten; het doet mij geen |
678 | pijn. <i>(This one has been changed)</i> |
679 | <li> Kirchröadsj/Bôchesserplat: Iech ken glaas èèse, mer 't deet miech jing pieng. |
680 | |
681 | </ul> |
682 | <p> |
683 | |
684 | In the Romanic languages, the variations on "fa male" (it) are probably |
685 | wrong, whilst the variations on "hace daño" (es) and "damaĝas" (Esperanto) are probably correct; "nocet" (la) is definitely right. |
686 | |
687 | <p> |
688 | |
689 | The northern Germanic variants of "skada" are probably right, as are |
690 | the Slavic variants of "škodi/шкоди" (se); however the Slavic variants |
691 | of " boli" (hv) are probably wrong, as "bolena" means "pain/ache", IIRC. |
692 | |
693 | </blockquote> |
694 | <p> |
695 | That was from July 2004. In December 2007, Otto writes again: |
696 | |
697 | <p> |
698 | <blockquote> |
699 | <small> |
700 | Hello Frank, |
701 | |
702 | in days of yore, I had written:<br> |
703 | > "Ich kann Glas essen ohne mir zu schaden." <br> |
704 | > (The comma fell victim to the 1996 orthographic reform, |
705 | <p> |
706 | cf. <a href="http://www.ids-mannheim.de/reform/e3-1.html#P76">http://www.ids-mannheim.de/reform/e3-1.html#P76</a>. |
707 | <p> |
708 | |
709 | The latest revision (2006) of the official German orthography |
710 | has revived the comma around infinitive clauses commencing with |
711 | <i>ohne</i>, or 5 other conjunctions, or depending from a noun or |
712 | from an announcing demonstrative |
713 | (<a href="http://www.ids-mannheim.de/reform/regeln2006.pdf">http://www.ids-mannheim.de/reform/regeln2006.pdf</a>, §75). |
714 | So, it's again: <i>Ich kann Glas essen, ohne mir zu schaden.</i> |
715 | <p> |
716 | Best wishes,<br> |
717 | Otto Stolz |
718 | </small> |
719 | </blockquote> |
720 | <p> |
721 | |
722 | <li>The numbering of the samples is arbitrary, done only to keep track of how |
723 | many there are, and can change any time a new entry is added. The |
724 | arrangement is also arbitrary but with some attempt to group related |
725 | examples together. Note: All languages not listed are wanted, not just the |
726 | ones that say (NEEDED). |
727 | |
728 | <p> |
729 | |
730 | <li><a name="note1">Correct right-to-left display of these languages |
731 | depends on the capabilities of your browser.</a> The period should |
732 | appear on the left. In the monospace Yiddish example, the Yiddish digraphs |
733 | should occupy one character cell. |
734 | |
735 | <p> |
736 | |
737 | <li>Yoruba: The third word is Latin letter small 'j' followed by |
738 | small 'e' with U+0329, Combining Vertical Line Below. This displays |
739 | correctly only if your Unicode font includes the U+0329 glyph and your |
740 | browser supports combining diacritical marks. The Lingala and Indic examples |
741 | also include combining sequences. |
742 | |
743 | <p> |
744 | |
745 | <li>Includes Unicode 3.1 (or later) characters beyond Plane 0. |
746 | |
747 | <p> |
748 | |
749 | <li>The Classic Mongolian example should be vertical, top-to-bottom and |
750 | left-to-right. But such display is almost impossible. Also no font yet |
751 | exists which provides the proper ligatures and positional variants for the |
752 | characters of this script, which works somewhat like Arabic. |
753 | |
754 | <p> |
755 | |
756 | <li>Taiwanese is also known as Holo or Hoklo, and is related to Southern |
757 | Min dialects such as Amoy. |
758 | Contributed by Henry H. Tan-Tenn, who comments, "The above is |
759 | the romanized version, in a script current among Taiwanese Christians since |
760 | the mid-19th century. It was invented by British missionaries and saw use in |
761 | hundreds of published works, mostly of a religious nature. Most Taiwanese did |
762 | not know Chinese characters then, or at least not well enough to read. More |
763 | to the point, though, a written standard using Chinese characters has never |
764 | developed, so a significant minority of words are represented with different |
765 | candidate characters, depending on one's personal preference or etymological |
766 | theory. In this sentence, for example, "-tàng", "chia̍h", |
767 | "mā" and "bē" are problematic using Chinese characters. |
768 | "Góa" (I/me) and "po-lê" (glass) are as written in other Sinitic |
769 | languages (e.g. Mandarin, Hakka)." |
770 | |
771 | <p> |
772 | |
773 | <li>Wagner Amaral of Pinese & Amaral Associados notes that |
774 | the Brazilian Portuguese sentence for |
775 | "I can eat glass" should be identical to the Portuguese one, as the word |
776 | "machuca" means "inflict pain", or rather "injuries". The words "faz |
777 | mal" would more correctly translate as "cause harm". |
778 | |
779 | <p> |
780 | |
781 | <li>Burmese: In English the first person pronoun "I" stands for both |
782 | genders, male and female. In Burmese (except in the central part of Burma) it is |
783 | kyundaw (<font |
784 | size="+1" |
785 | face="Padauk">က္ယ္ဝန္&zwnj;တော္&zwnj;</font>) for male and kyanma (<font |
786 | size="+1" face="Padauk">က္ယ္ဝန္&zwnj;မ</font>) for female. |
787 | These examples use a fully compliant Unicode Burmese font (sadly, only one exists: |
788 | Padauk, a Graphite font) and are rendered with the Graphite engine. |
789 | <!--GONE |
790 | <a href="http://h1.ripway.com/bamarsar/">CLICK HERE</a> to test Burmese |
791 | characters. |
792 | --> |
793 | Unicode 4.0 and earlier lacked some medial and vowel characters; |
794 | the second example has them. |
795 | |
796 | <p> |
797 | |
798 | <li><i>From Louise Hope, 22 November 2010:</i> |
799 | I decided to have a go at an Inuktitut rendering, mainly in hopes of shaming someone who actually knows the language into coming up with something better. |
800 | Meanwhile, try this: |
801 | <p> |
802 | ᐊᓕᒍᖅ ᓂᕆᔭᕌᖓᒃᑯ ᓱᕋᙱᑦᑐᓐᓇᖅᑐᖓ |
803 | <br> |
804 | aliguq nirijaraangakku suranngittunnaqtunga |
805 | <p> |
806 | Loosely: I am able not to hurt myself whenever I eat glass. |
807 | <p> |
808 | aliguq >> glass (uninflected because it is the patient of a transitive verb in an ergative language)<br> |
809 | nirijaraangakku >> "I eat him/her/it" in Frequentative mood (all one verb with inflectional ending, no affixes whatsoever)<br> |
810 | suranngittunnaqtunga >> suraq (do permanent harm) + nngit (verb-negator) + tunnaq (ability) + tunga (intransitive ending, making the verb passive or reflexive) |
811 | <p> |
812 | See above about someone who knows the language, et cetera. |
813 | <p> |
814 | Script trivia: the syllable ᙱ is a single Unicode character |
815 | representing the two elements ᓐ (syllable-final n) and ᖏ |
816 | (syllable ngi). I think they just did it that way because it looks tidier |
817 | than the expected ᓐᖏ. If your operating system didn't come |
818 | with <a href="http://www.ffonts.net/Euphemia-UCAS.font">Euphemia</a> (all-purpose UCAS font), you can download <a href="http://www.allaboutshoes.ca/inuk/our-boots/piq_font.php">Pigiarniq</a>. It comes with a jolly little inuksuk ᐀ that the Unicode Consortium is trying to make into a squatter. |
819 | <p> |
820 | |
821 | <!-- |
822 | á¯áá¥á áªáááá¯á ááªááááá¾áªá |
823 | <br> |
824 | siqumiumanngikkuni naamangnanngijjuk. |
825 | --> |
826 | |
827 | </ol> |
828 | |
829 | <h3><a name="quickbrownfox">The Quick Brown Fox... Pangrams</a></h3> |
830 | |
831 | The "I can eat glass" sentences do not necessarily show off the orthography of |
832 | each language to best advantage. In many alphabetic written languages it is |
833 | possible to include all (or most) letters (or "special" characters) in |
834 | a single (often nonsense) <i>pangram</i>. These were traditionally used in |
835 | typewriter instruction; now they are useful for stress-testing computer fonts |
836 | and keyboard input methods. Here are a few examples (SEND MORE): |
837 | |
838 | <p> |
839 | <ol> |
840 | |
841 | <li><b>English:</b> The quick brown fox jumps over the lazy dog. |
842 | <li><b>Jamaican:</b> Chruu, a kwik di kwik brong fox a jomp huova di liezi daag de, yu no siit? |
843 | <li><b>Irish:</b> "An ḃfuil do ċroí ag bualaḋ ó ḟaitíos an ġrá a ṁeall lena ṗóg éada ó |
844 | ṡlí do leasa ṫú?" |
845 | "D'ḟuascail Íosa Úrṁac na hÓiġe Beannaiṫe pór Éava agus Áḋaiṁ." |
846 | <li><b>Dutch:</b> Pa's wijze lynx bezag vroom het fikse aquaduct. |
847 | <li><b>German: </b> Falsches Üben von Xylophonmusik quält jeden |
848 | größeren Zwerg. (1) |
849 | <li><b>German: </b> <span lang=da>Im finſteren Jagdſchloß am offenen Felsquellwaſſer patzte der affig-flatterhafte kauzig-höf&zwnj;liche Bäcker über ſeinem verſifften kniffligen C-Xylophon.</span> (2) |
850 | <li><b>Norwegian:</b> Blåbærsyltetøy ("blueberry jam", includes every |
851 | extra letter used in Norwegian). |
852 | <li><b>Swedish:</b> Flygande bäckasiner söka strax hwila på mjuka tuvor. |
853 | <li><b>Icelandic:</b> Sævör grét áðan því úlpan var ónýt. |
854 | <li><b>Finnish:</b> (5) Törkylempijävongahdus (This is a perfect pangram, every letter appears only once. Translating it is an art on its own, but I'll say "rude lover's yelp". :-D) |
855 | <li><b>Finnish:</b> (5) Albert osti fagotin ja töräytti puhkuvan melodian. (Albert bought a bassoon and hooted an impressive melody.) |
856 | <li><b>Finnish:</b> (5) On sangen hauskaa, että polkupyörä on maanteiden jokapäiväinen ilmiö. (It's pleasantly amusing, that the bicycle is an everyday sight on the roads.) |
857 | <li><b>Polish:</b> Pchnąć w tę łódź jeża lub osiem skrzyń fig. |
858 | <li><b>Czech:</b> Příliš |
859 | žluťoučký kůň úpěl |
860 | ďábelské kódy. |
861 | <li><b>Slovak:</b> Starý kôň na hŕbe |
862 | kníh žuje tíško povädnuté |
863 | ruže, na stĺpe sa ďateľ |
864 | učí kvákať novú ódu o |
865 | živote. |
866 | <li><b>Greek</b> (monotonic): ξεσκεπάζω την ψυχοφθόρα βδελυγμία |
867 | |
868 | <li><b>Greek</b> (polytonic): |
869 | ξεσκεπάζω τὴν ψυχοφθόρα βδελυγμία |
870 | |
871 | |
872 | <li><b>Russian:</b> |
873 | Съешь же ещё этих мягких французских булок да выпей чаю. |
874 | |
875 | <li><b>Russian:</b> |
876 | В чащах юга жил-был цитрус? Да, но фальшивый экземпляр! ёъ. |
877 | |
878 | <li><b>Bulgarian:</b> Жълтата дюля беше щастлива, че пухът, който цъфна, замръзна като гьон. |
879 | |
880 | <li><b>Sami (Northern):</b> Vuol Ruoŧa geđggiid leat máŋga luosa ja čuovžža. |
881 | <li><b>Hungarian:</b> Árvíztűrő tükörfúrógép. |
882 | <li><b>Spanish:</b> El pingüino Wenceslao hizo kilómetros bajo exhaustiva lluvia y frío, añoraba a su querido cachorro. |
883 | <li><b>Portuguese:</b> O próximo vôo à noite sobre o Atlântico, põe freqüentemente o único médico. (3) |
884 | <li><b>French:</b> Les naïfs ægithales hâtifs pondant à Noël où il gèle sont sûrs d'être |
885 | déçus en voyant leurs drôles d'œufs abîmés. |
886 | |
887 | <li><b>Esperanto:</b> Eĥoŝanĝo |
888 | ĉiuĵaŭde. |
889 | |
890 | <li><b>Hebrew:</b> <span dir="RTL" lang=HE>זה כיף סתם לשמוע איך תנצח קרפד עץ טוב בגן.</span> |
891 | |
892 | <li><b>Japanese</b> (Hiragana):<blockquote> |
893 | いろはにほへど　ちりぬるを<br> |
894 | わがよたれぞ　つねならむ<br> |
895 | うゐのおくやま　けふこえて<br> |
896 | あさきゆめみじ　ゑひもせず |
897 | (4) |
898 | </blockquote> |
899 | |
900 | </ol> |
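<p>
A pangram claim like the ones above can be checked mechanically. The following is a minimal Python sketch, not part of the original page: the function names <tt>missing_letters</tt> and <tt>is_pangram</tt> and the alphabet strings are assumptions made for illustration, and each letter of the alphabet is assumed to be a single code point after NFC normalization.

```python
import string
import unicodedata


def missing_letters(sentence: str, alphabet: str) -> set:
    """Return the letters of `alphabet` that never occur in `sentence`."""
    # NFC makes precomposed and decomposed spellings of a letter compare equal;
    # lower() makes the comparison case-insensitive.
    text = unicodedata.normalize("NFC", sentence).lower()
    wanted = set(unicodedata.normalize("NFC", alphabet).lower())
    return wanted - set(text)


def is_pangram(sentence: str, alphabet: str) -> bool:
    """True if every letter of `alphabet` occurs at least once in `sentence`."""
    return not missing_letters(sentence, alphabet)


english = "The quick brown fox jumps over the lazy dog."
print(is_pangram(english, string.ascii_lowercase))

# The Norwegian entry only claims to cover the three extra letters,
# so it is tested against those rather than a full alphabet.
print(is_pangram("Blåbærsyltetøy", "æøå"))

# For a non-pangram, the missing letters show what still needs covering.
print(sorted(missing_letters("Hello, world", "abc")))
```

<p>
The alphabet is passed in explicitly because what counts as "every letter" differs per language (and per pangram: some cover only the special characters).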
901 | <p id="oechtringen"> |
902 | <a name="notes2"><b>Notes:</b></a> |
903 | <p> |
904 | <ol> |
905 | |
906 | <li>Other phrases commonly used in Germany include: "Ein wackerer Bayer |
907 | vertilgt ja bequem zwo Pfund Kalbshaxe" and, more recently, "Franz jagt im |
908 | komplett verwahrlosten Taxi quer durch Bayern", but both lack umlauts and |
909 | eszett. Previously, going for the shortest sentence that has all the |
910 | umlauts and special characters, I had |
911 | "Grüße aus Bärenhöfe |
912 | (und Óechtringen)!" |
913 | Acute accents are not used in native German words, so I was surprised to |
914 | discover "Óechtringen" in the Deutsche Bundespost |
915 | Postleitzahlenbuch: |
916 | <p> |
917 | <blockquote> |
918 | <a href="http://www.columbia.edu/~fdc/misc/oechtringen.jpg"><img |
919 | src="oechtringen-sm.jpg" alt="Click for full-size image (2.8MB)"></a> |
920 | </blockquote> |
921 | <p> |
922 | It's a small village in eastern Lower Saxony. |
923 | The "oe" in this case |
924 | turns out to be the Lower Saxon "lengthening e" (Dehnungs-e), which makes the |
925 | previous vowel long (used in a number of Lower Saxon place names such as Soest |
926 | and Itzehoe), not the "e" that indicates umlaut of the preceding vowel. |
927 | Many thanks to the Óechtringen-Namenschreibungsuntersuchungskomitee |
928 | (Alex Bochannek, Manfred Erren, Asmus Freytag, Christoph Päper, plus |
929 | Werner Lemberg who serves as |
930 | Óechtringen-Namenschreibungsuntersuchungskomiteerechtschreibungsprüfer) |
931 | |
932 | for their relentless pursuit of the facts in this case. Conclusion: the |
933 | accent almost certainly does not belong on this (or any other native German) |
934 | word, but neither can it be dismissed as dirt on the page. To add to the |
935 | mystery, it has been reported that other copies of the same edition of the |
936 | PLZB do not show the accent! UPDATE (March 2006): David Krings was |
937 | intrigued enough by this report to contact the mayor of Ebstorf, of which |
938 | Oechtringen is a borough, who responded: |
939 | |
940 | <p> |
941 | <blockquote style="font-family:sans-serif;font-size:80%"> |
942 | Sehr geehrter Mr. Krings,<br> |
943 | wenn Oechtringen irgendwo mit einem Akzent auf dem O geschrieben wurde, |
944 | dann kann das nur ein Fehldruck sein. Die offizielle Schreibweise lautet |
945 | jedenfalls „Oechtringen“.<br> |
946 | Mit freundlichen Grüssen<br> |
947 | Der Samtgemeindebürgermeister<br> |
948 | i.A. Lothar Jessel |
949 | |
950 | </blockquote> |
951 | |
952 | |
953 | <p> |
954 | <li>From Karl Pentzlin (Kochel am See, Bavaria, Germany): |
955 | "This German phrase is suited for display by a Fraktur (broken letter) |
956 | font. It contains: all common three-letter ligatures: ffi ffl fft and all |
957 | two-letter ligatures required by the Duden for Fraktur typesetting: ch ck ff |
958 | fi fl ft ll ſch ſi ſſ ſt tz (all in a |
959 | manner such they are not part of a three-letter ligature), one example of f-l |
960 | where German typesetting rules prohibit ligating (marked by a ZWNJ), and all |
961 | German letters a...z, ä,ö,ü,ß, ſ [long s] |
962 | (all in a manner such that they are not part of a two-letter Fraktur |
963 | ligature)." |
964 | |
965 | Otto Stolz notes that "'Schloß' is now spelled 'Schloss', in |
966 | contrast to 'größer' (example 4) which has kept its |
967 | 'ß'. Fraktur has been banned from general use, in 1942, and long-s |
968 | (ſ) has ceased to be used with Antiqua (Roman) even earlier (the |
969 | latest Antiqua-ſ I have seen is from 1913, but then |
970 | I am no expert, so there may well be a later instance)." Later Otto confirms |
971 | the latter theory, "Now I've run across a book „Deutsche |
972 | Rechtschreibung“ (edited by Lutz Mackensen) from 1954 (my reprint |
973 | is from 1956) that has kept the Antiqua-ſ in its dictionary part (but |
974 | neither in the preface nor in the appendix)." |
975 | |
976 | <p> |
977 | |
978 | <li>Diaeresis is not used in Iberian Portuguese. |
979 | |
980 | <p> |
981 | |
982 | <li>From Yurio Miyazawa: "This poetry contains all the sounds in the |
983 | Japanese language and used to be the first thing for children to learn in |
984 | their Japanese class. The Hiragana version is particularly neat because it |
985 | covers every character in the phonetic Hiragana character set." Yurio also |
986 | sent the Kanji version: |
987 | |
988 | <p> |
989 | <blockquote> |
990 | 色は匂へど 散りぬるを<br> |
991 | 我が世誰ぞ 常ならむ<br> |
992 | 有為の奥山 今日越えて<br> |
993 | 浅き夢見じ 酔ひもせず |
994 | </blockquote> |
995 | |
996 | <li>Finnish pangrams from Mikko Ristilä. |
997 | |
998 | </ol> |
999 | <p> |
1000 | <b>Accented Cyrillic:</b> |
1001 | <p> |
1002 | |
1003 | <i>(This section contributed by Vladimir Marinov.)</i> |
1004 | |
1005 | <p> |
1006 | |
1007 | In Bulgarian it is desirable, customary, or in some cases required to |
1008 | write accents over vowels. Unfortunately, no computer character sets |
1009 | contain the full repertoire of accented Cyrillic letters. With Unicode, |
1010 | however, it is possible to combine any Cyrillic letter with any combining |
1011 | accent. The appearance of the result depends on the font and the rendering |
1012 | engine. Here are two examples. |
1013 | |
1014 | <p> |
1015 | <ol> |
1016 | |
1017 | <li>Той видя бялата коса́ по главата и́ и ко́са на рамото и́, и ре́че да и́ |
1018 | рече́: "Пара́та по́ па́ри от па́рата, не ща пари́!", но си поми́сли: "Хей, |
1019 | помисли́ си! А́ и́ река, а́ е скочила в тази река, която щеше да тече́, |
1020 | а не те́че." |
1021 | |
1022 | <p> |
1023 | |
1024 | <li>По пъ́тя пъту́ват кю́рди и югославя́ни. |
1025 | |
1026 | </ol> |
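<p>
The point about combining accents can be made concrete with a short Python sketch. It is illustrative only and not part of the original page: the helper name <tt>with_accent</tt> is hypothetical, and U+0301 (combining acute accent) is simply an assumed example accent. It shows that a Latin letter plus accent collapses to one precomposed code point under NFC normalization, while the Cyrillic letter has no precomposed form and stays a two-code-point combining sequence.

```python
import unicodedata

# Chosen for illustration; any combining accent is attached the same way.
COMBINING_ACUTE = "\u0301"


def with_accent(letter: str, accent: str = COMBINING_ACUTE) -> str:
    """Attach a combining accent to a base letter and normalize to NFC."""
    return unicodedata.normalize("NFC", letter + accent)


latin = with_accent("a")          # Latin small letter a
cyrillic = with_accent("\u0430")  # Cyrillic small letter a

# Latin a + acute has a precomposed form, so NFC yields a single code point.
print(len(latin), [f"U+{ord(c):04X}" for c in latin])

# Cyrillic a + acute has none, so the sequence stays base letter + accent.
print(len(cyrillic), [f"U+{ord(c):04X}" for c in cyrillic])

# In UTF-8 the sequence is two bytes for the letter plus two for the accent.
print(cyrillic.encode("utf-8").hex(" "))
```

<p>
Because the accented Cyrillic letter exists only as a sequence, how well the accent is positioned is left to the font and rendering engine, as noted above.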
1027 | |
1028 | <h3><a name="html">HTML Features</a></h3> |
1029 | |
1030 | Here is the Russian alphabet (uppercase only) coded in three |
1031 | different ways, which should look identical: |
1032 | |
1033 | <p> |
1034 | <ol> |
1035 | <li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ |
1036 | <i>(Literal UTF-8)</i> |
1037 | <li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ |
1038 | <i>(Decimal numeric character reference)</i> |
1039 | <li>АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ |
1040 | <i>(Hexadecimal numeric character reference)</i> |
1041 | </ol> |
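<p>
The three codings above can be generated from one another. The Python sketch below is an illustration under stated assumptions, not the method used to build this page: the helper names <tt>decimal_ncr</tt> and <tt>hex_ncr</tt> are hypothetical, and the alphabet is taken to be the 32 code points U+0410 through U+042F, which is what the numeric references in the list cover.

```python
import html

# Uppercase Russian alphabet as used in the list: U+0410 (А) to U+042F (Я).
ALPHABET = "".join(chr(cp) for cp in range(0x0410, 0x0430))


def decimal_ncr(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def hex_ncr(text: str) -> str:
    """Encode every character as a hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


# 1. Literal UTF-8: two bytes per Cyrillic letter.
print(ALPHABET.encode("utf-8"))
# 2. Decimal numeric character references.
print(decimal_ncr(ALPHABET))
# 3. Hexadecimal numeric character references.
print(hex_ncr(ALPHABET))

# A conforming HTML parser turns both reference forms back into the same text.
print(html.unescape(decimal_ncr(ALPHABET)) == html.unescape(hex_ncr(ALPHABET)))
```

<p>
Numeric references survive even when the page is served or stored in a non-Unicode character set, whereas the literal form depends on the declared <tt>charset</tt> being honored.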
1042 | |
1043 | <p> |
1044 | |
1045 | In another test, we use HTML language tags to distinguish Bulgarian, Russian, |
1046 | and <a href="http://www.tiro.com/transfer/Serbian_Rendering.pdf">Serbian</a>, |
1047 | which have different italic forms for lowercase |
1048 | б, г, д, п, and/or т: |
1049 | <p> |
1050 | <blockquote> |
1051 | <table> |
1052 | <tr> |
1053 | <td><b>Bulgarian</b>: |
1054 | <td><span lang=BG>[ бгдпт</span> ] |
1055 | <td><span lang=BG>[ <i>бгдпт</i></span> ] |
1056 | <td><span lang=BG><i> Мога да ям стъкло и не ме боли.</i></span> |
1057 | <tr> |
1058 | <td><b>Russian</b>: |
1059 | <td><span lang=RU>[ бгдпт</span> ] |
1060 | <td><span lang=RU>[ <i>бгдпт</i></span> ] |
1061 | <td><span lang=RU><i>Я могу есть стекло, это мне не вредит.</i></span> |
1062 | <tr> |
1063 | <td><b>Serbian</b>: |
1064 | <td><span lang=SR>[ бгдпт</span> ] |
1065 | <td><span lang=SR>[ <i>бгдпт</i></span> ] |
1066 | <td> <span lang=SR><i>Могу јести стакло |
1067 | а |
1068 | да ми |
1069 | не |
1070 | шкоди.</i></span> |
1071 | </table> |
1072 | </blockquote> |
1073 | <p> |
1074 | |
1075 | <!-- acknowledgments --> |
1076 | <h3><a name="credits">Credits, Tools, and Commentary</a></h3> |
1077 | |
1078 | <dl> |
1079 | <dt><b>Credits:</b></dt> |
1080 | <dd> |
1081 | The "I can eat glass" phrase and the initial collection of translations: |
1082 | <a href="http://hcs.harvard.edu/~igp/glass.html">Ethan Mollick</a>. |
1083 | Transcription / conversion to UTF-8: Frank da Cruz. |
1084 | <b>Albanian:</b> Sindi Keesan. |
1085 | <b>Afrikaans:</b> Johan Fourie, Kevin Poalses. |
1086 | <b>Anglo Saxon:</b> Frank da Cruz. |
1087 | <b>Arabic:</b> Najib Tounsi. |
1088 | <b>Armenian:</b> Vaçe Kundakçı. |
1089 | <b>Belarusian:</b> Alexey Chernyak, Patricia Clausnitzer. |
1090 | <b>Bengali:</b> Somnath Purkayastha, Deepayan Sarkar. |
1091 | <b>Bislama:</b> Dan McGarry. |
1092 | <b>Bosnian:</b> Dmitrij D. Czarkoff. |
1093 | <b>Braille:</b> Frank da Cruz. |
1094 | <b>Bulgarian:</b> Sindi Keesan, Guentcho Skordev, Vladimir Marinov. |
1095 | <b>Burmese:</b> "cetanapa", Sithu Thwin. |
1096 | <b>Cabo Verde Creole:</b> Cláudio Alexandre Duarte. |
1097 | <b>Catalán:</b> Jordi Bancells. |
1098 | <b>Chinese:</b> Jack Soo, Wong Pui Lam. |
1099 | <b>Chinook Jargon:</b> David Robertson. |
1100 | <b>Cornish:</b> Chris Stephens. |
1101 | <b>Croatian:</b> Dmitrij D. Czarkoff, Marjan Baće. |
1102 | <b>Czech:</b> Stanislav Pecha, Radovan Garabík. |
1103 | <b>Danish:</b> Morten Due Jorgensen. |
1104 | <b>Dutch:</b> Peter Gotink, Pim Blokland, Rob Daniel, Rob de Wit. |
1105 | <b>Erzian:</b> Jack Rueter. |
1106 | <b>Esperanto:</b> Franko Luin, Radovan Garabík. |
1107 | <b>Estonian:</b> Meelis Roos. |
1108 | <b>Faroese:</b> Jón Gaasedal. |
1109 | <b>Farsi/Persian:</b> Payam Elahi. |
1110 | <b>Fijian:</b> Paul Cannon. |
1111 | <b>Finnish:</b> Sampsa Toivanen, Mikko Ristilä. |
1112 | <b>French:</b> Luc Carissimo, Anne Colin du Terrail, Sean M. Burke, Theo Morelli. |
1113 | <b>Galician:</b> Laura Probaos. |
1114 | <b>Georgian:</b> Giorgi Lebanidze. |
1115 | <b>German:</b> Christoph Päper, Otto Stolz, Karl Pentzlin, David Krings, |
1116 | Frank da Cruz, Peter Keel (Seegras), Elias Glantschnig. |
1117 | <b>Gothic:</b> Aurélien Coudurier. |
1118 | <b>Greek:</b> Ariel Glenn, Constantine Stathopoulos, Siva Nataraja, Christos Georgiou. |
1119 | <b>Hebrew:</b> Jonathan Rosenne, Tal Barnea. |
1120 | <b>Hausa:</b> Malami Buba, Tom Gewecke. |
1121 | <b>Hawaiian:</b> na Hauʻoli Motta, Anela de Rego, Kaliko Trapp. |
1122 | <b>Hindi:</b> Shirish Kalele, Nitin Dahra. |
1123 | <b>Hungarian:</b> András Rácz, Mark Holczhammer. |
1124 | <b>Icelandic:</b> Andrés Magnússon, Sveinn Baldursson. |
1125 | <b>International Phonetic Alphabet (IPA):</b> Siva Nataraja / Vincent Ramos. |
1126 | <b>Inuktitut</b>: Louise Hope. |
1127 | <b>Irish:</b> Michael Everson, Marion Gunn, James Kass, Curtis Clark. |
1128 | <b>Italian:</b> Thomas De Bellis. |
1129 | <b>Jamaican:</b> Stephen J. Cherin. |
1130 | <b>Japanese:</b> Makoto Takahashi, Yurio Miyazawa. |
1131 | <b>Kannada:</b> Sridhar R N, Alok G. Singh. |
1132 | <b>Karelian:</b> Aleksandr Semakov. |
1133 | <b>Khmer:</b> Tola Sann. |
1134 | <b>Kirchröadsj:</b> Roger Stoffers. |
1135 | <b>Kreyòl:</b> Sean M. Burke. |
1136 | <b>Korean:</b> Jungshik Shin. |
1137 | <b>Langenfelder Platt:</b> David Krings. |
1138 | <b>Lao:</b> Tola Sann. |
1139 | <b>Lëtzebuergescht:</b> Stefaan Eeckels. |
1140 | <b>Lingala:</b> <a href="http://home.sus.mcgill.ca/~moyogo">Denis Moyogo Jacquerye</a> |
1141 | (<a href="http://info-langues-congo.1sd.org/">Nkóta ya Kɔ́ngɔ míbalé </a>) |
1142 | (Nkóta ya Kɔ́ngɔ míbal). |
1143 | <b>Lithuanian:</b> Gediminas Grigas. |
1144 | <b>Lojban:</b> Edward Cherlin. |
1145 | <b>Lusatian:</b> Ronald Schaffhirt. |
1146 | <b>Macedonian:</b> Sindi Keesan. |
1147 | <b>Malay:</b> Zarina Mustapha. |
1148 | <b>Malayalam:</b> Anil Matthews. |
1149 | <b>Maltese:</b> Kenneth Joseph Vella. |
1150 | <b>Manx:</b> Éanna Ó Brádaigh. |
1151 | <b>Marathi:</b> Shirish Kalele. |
1152 | <b>Marquesan:</b> Kaliko Trapp. |
1153 | <b>Middle English:</b> Frank da Cruz. |
1154 | <b>Milanese:</b> Marco Cimarosti. |
1155 | <b>Mongolian:</b> Tom Gewecke. |
1156 | <b>Montenegrin:</b> Dmitrij D. Czarkoff. |
1157 | <b>Napoletano:</b> Diego Quintano. |
1158 | <b>Navajo:</b> Tom Gewecke. |
1159 | <a href="http://www.langmaker.com/db/mdl_nordicg.htm"><b>Nórdicg</b></a>: |
1160 | Yẃlyan Rott. |
1161 | <b>Nepali:</b> Ujjwol Lamichhane, Rabi Tripathi. |
1162 | <b>Norwegian:</b> Herman Ranes, Håvard Kvålen. |
1163 | <b>Odenwälderisch:</b> Alexander Heß. |
1164 | <b>Old Irish:</b> Michael Everson. |
1165 | <b>Old Norse:</b> Andrés Magnússon. |
1166 | <b>Papiamentu:</b> Bianca and Denise Zanardi. |
1167 | <b>Pashto:</b> N.R. Liwal. |
1168 | <b>Pfälzisch:</b> Dr. Johannes Sander. |
1169 | <b>Picard:</b> Philippe Mennecier. |
1170 | <b>Polish:</b> Juliusz Chroboczek, Paweł Przeradowski, Wlodzislaw Kostecki. |
1171 | <b>Portuguese:</b> "Cláudio" Alexandre Duarte, Bianca and Denise |
1172 | Zanardi, Pedro Palhoto Matos, Wagner Amaral. |
1173 | <b>Québécois:</b> Laurent Detillieux. |
1174 | <b>Roman:</b> Pierpaolo Bernardi. |
1175 | <b>Romanian:</b> Juliusz Chroboczek, Ionel Mugurel. |
1176 | <b>Romansch:</b> Alexandre Suter. |
1177 | <b>Ruhrdeutsch:</b> "Timwi". |
1178 | <b>Russian:</b> Alexey Chernyak, Serge Nesterovitch. |
1179 | <b>Sami:</b> Anne Colin du Terrail, Luc Carissimo. |
1180 | <b>Sanskrit:</b> Siva Nataraja / Vincent Ramos. |
1181 | <b>Sächsisch:</b> André Müller. |
1182 | <b>Schwäbisch:</b> Otto Stolz. |
1183 | <b>Scots:</b> Jonathan Riddell. |
1184 | <b>Serbian:</b> Dmitrij D. Czarkoff, Sindi Keesan, Ranko Narancic, Boris Daljevic, Szilvia Csorba, |
1185 | O. Dag. |
1186 | <b>Sinhalese:</b> Abdul-Ahad (ASM). |
1187 | <b>Slovak:</b> G. Adam Stanislav, Radovan Garabík. |
1188 | <b>Slovenian:</b> Albert Kolar. |
1189 | <b>Spanish:</b> <a href="http://www.aleida.net">Aleida Morel</a>, Laura Probaos. |
1190 | <b>Swahili:</b> Ronald Schaffhirt. |
1191 | <b>Swedish:</b> Christian Rose, Bengt Larsson. |
1192 | <b>Taiwanese:</b> Henry H. Tan-Tenn. |
1193 | <b>Tagalog:</b> Jim Soliven. |
1194 | <b>Tamil:</b> Vasee Vaseeharan, Vetrivel P. |
1195 | <b>Telugu:</b> Arjuna Rao Chavala. |
1196 | <b>Tibetan:</b> D. Germano, Tom Gewecke. |
1197 | <b>Thai:</b> Alan Wood's wife. |
1198 | <b>Turkish:</b> Vaçe Kundakçı, Tom Gewecke, Merlign Olnon. |
1199 | <b>Ukrainian:</b> Michael Zajac, Oleg Podsadny. |
1200 | <b>Ulster Gaelic:</b> Ciarán Ó Duibhín. |
1201 | <b>Urdu:</b> Mustafa Ali. |
1202 | <a href="http://nomfoundation.org/"><b>Vietnamese</b></a>: Dixon Au, |
1203 | [James] Đỗ Bá Phước |
1204 | <font face="PMingLiU">杜 伯 福</font>. |
1205 | <b>Walloon:</b> Pablo Saratxaga. |
1206 | <b>Welsh:</b> Geiriadur Prifysgol Cymru (Andrew). |
1207 | <b>Yiddish:</b> Mark David. |
1208 | <b>Zeneise:</b> Angelo Pavese. |
1209 | |
1210 | <p> |
1211 | |
1212 | <dt><b>Tools Used to Create This Web Page:</b></dt> |
1213 | |
1214 | <dd>The UTF8-aware <a href="k95.html">Kermit 95</a> terminal emulator on |
1215 | Windows, to a Unix host with the <a |
1216 | href="http://www.gnu.org/directory/emacs.html">EMACS</a> text editor. Kermit |
1217 | 95 displays UTF-8 and also allows keyboard entry of arbitrary Unicode BMP |
1218 | characters as 4 hex digits, as shown <a href="glass.html">HERE</a>. Hex codes |
1219 | for Unicode values can be found in <a |
1220 | href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode |
1221 | Standard</a> (recommended) and the <a |
1222 | href="http://www.unicode.org/charts/">online code charts</a>. When |
1223 | submissions arrive by email encoded in some other character set (Latin-1, |
1224 | Latin-2, KOI, various PC code pages, JEUC, etc), I use the TRANSLATE command |
1225 | of <a href="ckermit.html">C-Kermit</a> on the Unix host (<a |
1226 | href="safe.html">where I read my mail</a>) to convert the character set to |
1227 | UTF-8 (I could also use Kermit 95 for this; it has the same TRANSLATE |
1228 | command). That's it -- no "Web authoring" tools, no locales, no "smart" |
1229 | anything. It's just plain text, nothing more. By the way, there's nothing |
1230 | special about EMACS -- any text editor will do, providing it allows entry of |
1231 | arbitrary 8-bit bytes as text, including the 0x80-0x9F "C1" range. EMACS 21.1 |
1232 | actually supports UTF-8; earlier versions don't know about it and display the |
1233 | octal codes; either way is OK for this purpose. |
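<p>
For readers without Kermit at hand, the same kind of character-set conversion can be sketched in Python. This is a stand-in for illustration only and makes no claim about how the TRANSLATE command works internally: the function name <tt>to_utf8</tt> is hypothetical, and the codec names are Python's own (<tt>latin-1</tt>, <tt>iso8859_2</tt>, <tt>koi8_r</tt>). The sender's character set still has to be known or guessed, exactly as with submissions arriving by email.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode `raw` from `source_encoding` and re-encode it as UTF-8."""
    # Strict error handling: bytes that are invalid in the claimed source
    # encoding raise UnicodeDecodeError instead of being silently mangled.
    return raw.decode(source_encoding, errors="strict").encode("utf-8")


# A Latin-1 submission: "Grüße" as five single-byte characters.
latin1_bytes = b"Gr\xfc\xdfe"
print(to_utf8(latin1_bytes, "latin-1"))

# A KOI8-R submission: the Russian word for "glass".
koi8_bytes = "стекло".encode("koi8_r")
print(to_utf8(koi8_bytes, "koi8_r").decode("utf-8"))

# A Latin-2 submission: first word of the Polish pangram.
latin2_bytes = "Pchnąć".encode("iso8859_2")
print(to_utf8(latin2_bytes, "iso8859_2").decode("utf-8"))
```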
1234 | |
1235 | <p> |
1236 | |
1237 | <dt><b>Commentary:</b> |
1238 | <dd>Date: Wed, 27 Feb 2002 13:21:59 +0100<br> |
1239 | From: "Bruno DEDOMINICIS" <tt>&lt;b.dedominicis@cite-sciences.fr&gt;</tt><br> |
1240 | Subject: Je peux manger du verre, cela ne me fait pas mal. |
1241 | |
1242 | <p> |
1243 | |
1244 | I just found out your website and it makes me feel like proposing an |
1245 | interpretation of the choice of this peculiar phrase. |
1246 | |
1247 | <p> |
1248 | |
1249 | Glass is transparent and can hurt as everyone knows. The relation between |
1250 | people and civilisations is sometimes effusional and more often rude. The |
1251 | concept of breaking frontiers through globalization, in a way, is also an |
1252 | attempt to deny any difference. Isn't "transparency" the flag of modernity? |
1253 | Nothing should be hidden any more, authority is obsolete, and the new powers |
1254 | are supposed to reign through loving and smiling and no more through |
1255 | coercion... |
1256 | |
1257 | <p> |
1258 | |
1259 | Eating glass without pain sounds like a very nice metaphor of this attempt. |
1260 | That is, frontiers should become glass transparent first, and be denied by |
1261 | incorporating them. On the reverse, it shows that through globalization, |
1262 | frontiers undergo a process of displacement, that is, when they are not any |
1263 | more speakable, they become repressed from the speech and are therefore |
1264 | incorporated and might become painful symptoms, as for example what happens |
1265 | when one tries to eat glass. |
1266 | |
1267 | <p> |
1268 | |
1269 | The frontiers that used to separate bodies one from another tend to divide |
1270 | bodies from within and make them suffer.... The chosen phrase then appears |
1271 | as a denial of the symptom that might result from the destitution of |
1272 | traditional frontiers. |
1273 | |
1274 | <p> |
1275 | Best,<br> |
1276 | Bruno De Dominicis, Paris, France |
1277 | </dl> |
1278 | |
1279 | <p> |
1280 | <b>Other Unicode pages onsite:</b> |
1281 | <ul> |
1282 | <li><a href="postal.html">Frank's Compulsive Guide to Postal Addresses</a> |
1283 | (especially the <a href="postal.html#index">Index</a>) |
1284 | <li><a href="http://www.columbia.edu/~fdc/pace/">Peace in All Languages</a> |
1285 | <li><a href="sshclient-be.html">Kermit 95 кліента SSH</a> |
1286 | (Kermit 95 SSH Client documentation in Belarusian) |
1287 | <li><a href="st-erkenwald.html">Representing Middle English on the Web with UTF-8</a> |
1288 | <li><a href="biblio.html">The Kermit Bibliography</a> (in UTF-8) |
1289 | <li><a href="accents.html">Interchange of Non-English Computer Text</a> |
1290 | (UTF-8 math and box-drawing) |
1291 | <li><a href="utf8-t1.html">Unicode Table</a> (in UTF-8) |
1292 | </ul> |
1293 | <p> |
1294 | <b>Unicode samplers and resources offsite:</b> |
1295 | <ul> |
1296 | <li><a href="http://rishida.net/scripts/uniview/conversion">Unicode Code |
1297 | Converter</a> (converts among different Unicode |
1298 | encoding forms and notations). |
1299 | |
1300 | <li><a href="http://unicode.org/cldr/utility/confusables.jsp?a=paypal&n=on&x=on">Confusables</a> (every silver lining has a cloud). |
1301 | <li><a href="http://www.seigniorage.de/">Seigniorage</a> (Central Banks worldwide). |
1302 | <li>Michael Everson's |
1303 | <a href="http://www.evertype.com/scriptbib.html">Bibliography of Typography |
1304 | and Scripts</a> |
1305 | <li><a href="http://www.code2000.net/englishtestutf.htm">Does your browser |
1306 | support Unicode English?</a> (James Kass) |
1307 | <li><a href="http://crism.maden.org/dunno.html">I don't know, I only work here</a> |
1308 | <li><a href="http://www.trigeminal.com/samples/provincial.html">Anyone |
1309 | can be provincial!</a> |
1310 | <!-- defunct |
1311 | <li><a href="http://www.macchiato.com/unicode/Unicode_transcriptions.html">Transcriptions of "Unicode"</a> |
1312 | --> |
1313 | <li><a href="http://www.i18nguy.com/unicode-example.html">Example |
1314 | Unicode Usage for Business Applications</a> |
1315 | <li><a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps">UTF-8 and |
1316 | Unicode FAQ for Unix/Linux</a> |
1317 | </ul> |
1318 | <p> |
1319 | <b>Unicode fonts:</b> |
1320 | <ul> |
1321 | <li><a href="http://www.code2000.net/">Code 2000</a> (James Kass) |
1322 | |
1323 | <li><a href="http://www.alanwood.net/unicode/fonts.html">Unicode Fonts |
1324 | for Windows Computers</a> (Alan Wood) |
1325 | <li><a href="http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html">Unicode Fonts and |
1326 | Tools for X11</a> (Markus Kuhn) |
1327 | <li><a href="http://www.evertype.com/emono/">Everson Mono</a> (Michael |
1328 | Everson) |
1329 | <li><a href="http://www.monotype.com">Agfa Monotype</a> (now fonts.com) |
1330 | </ul> |
1331 | |
1332 | <p> |
1333 | [ <a href="k95.html">Kermit 95</a> ] |
1334 | [ <a href="glass.html">K95 Screen Shots</a> ] |
1335 | [ <a href="ckermit.html">C-Kermit</a> ] |
1336 | [ <a href="index.html">Kermit Home</a> ] |
1337 | [ <a href="http://www.unicode.org/help/display_problems.html">Display Problems?</a> ] |
1338 | [ <a href="http://www.unicode.org">The Unicode Consortium</a> ] |
1339 | <hr> |
1340 | <ADDRESS> |
1341 | UTF-8 Sampler / <a href="index.html">The Kermit Project</a> / |
1342 | <a href="http://www.columbia.edu">Columbia University</a> / |
1343 | <a href="mailto:kermit@kermitproject.org">kermit@kermitproject.org</a> |
1344 | </ADDRESS> |
1345 | </body> |
1346 | </html> |
Snippet ID: | #3000417 |
Snippet name: | Contents of kermitproject.org/utf8.html |
Eternal ID of this version: | #3000417/1 |
Text MD5: | b208d08a9884912af97ca249c44cb697 |
Author: | someone |
Created/modified: | 2016-10-25 18:18:37 |