1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> |
2 | <html> |
3 | <head> |
4 | <META http-equiv="Content-Type" content="text/html; charset=utf-8"> |
5 | <title>UTF-8 Sampler</title> |
6 | |
7 | <META http-equiv="Content-Style-Type" content="text/css"> |
8 | <META name="viewport" content="width=device-width, initial-scale=1.0"> |
9 | <LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css"> |
10 | <LINK REL="shortcut icon" href="/favicon.ico" > |
11 | <LINK REL="icon" href="/favicon.ico" type="image/x-icon"> |
12 | <LINK REL="icon" type="image/ico" href="/favicon.ico"> |
13 | <style type="text/css"> |
14 | blockquote { margin-left:8px; margin-right:8px; font-size:90% } |
15 | body { font-size:15px; |
16 | font-family:calibri, arial narrow, arial, sans-serif, times; |
17 | color:black; |
18 | background:white; |
19 | margin:16px; |
20 | } |
21 | tt { font-size:94% } |
22 | </style> |
23 | </head> |
24 | |
25 | <body> |
26 | |
27 | <h1><tt>UTF-8 SAMPLER</tt></h1> |
28 | |
29 | <big><big> ¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯ · ₹</big></big> |
30 | |
31 | |
32 | |
33 | <p> |
34 | <blockquote> |
35 | Frank da Cruz<br> |
36 | <a href="index.html">The Kermit Project</a><br> |
37 | New York City<br> |
38 | <a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a> |
39 | |
40 | <p> |
41 | <i>Last update:</i> |
42 | Thu Sep 15 14:00:00 2016 |
43 | </blockquote> |
44 | <p> |
45 | <hr> |
46 | [ <a href="http://www.columbia.edu/~fdc/pace/">PEACE</a> ] |
47 | [ <a href="#poetry">Poetry</a> ] |
48 | [ <a href="#glass">I Can Eat Glass</a> ] |
49 | [ <a href="#quickbrownfox">Pangrams</a> ] |
50 | [ <a href="#html">HTML Features</a> ] |
51 | [ <a href="#credits">Credits, Tools, Commentary</a> ] |
52 | <p> |
53 | |
54 | <big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for |
55 | <a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set |
56 | (UCS). The UCS encodes most of the world's writing systems in a single |
57 | character set, allowing you to mix languages and scripts within a document |
58 | without needing any tricks for switching character sets. This web page is |
59 | encoded directly in UTF-8. |
60 | |
61 | <p> |
62 | |
63 | As shown <a href="glass.html">HERE</a>, |
64 | Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation |
65 | software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista, |
66 | or Windows 7/8/10 when using a monospace Unicode font like <a |
67 | href="http://www.monotype.com">Andale Mono WT J</a> or <a |
68 | href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less fully |
69 | populated Courier New, Lucida Console, or Andale Mono. <a |
70 | href="ckermit.html">C-Kermit</a> can handle it too, |
71 | <a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode |
72 | display</a>. As many languages as are representable in your font can be seen |
73 | on the screen at the same time. |
74 | |
75 | <p> |
76 | |
77 | This, however, is a Web page. It started out as a kind of stress test for |
78 | UTF-8 support in Web browsers; that support was spotty when this page was first |
79 | created in the 1990s but has since become standard in all modern browsers. |
80 | The problem now is mainly the fonts and the browser's (or font's) support |
81 | for less common blocks and the nonzero Unicode planes (as in, e.g., the <a href="#braille">Braille</a> |
82 | and <a href="#gothic">Gothic</a> examples |
83 | below), and to some extent the rendition of combining sequences, |
84 | right-to-left text (<a href="#arabic">Arabic</a>, |
85 | <a href="#hebrew">Hebrew</a>), and so |
86 | on. <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for |
87 | a survey of Unicode fonts for Windows. |
88 | |
89 | <p> |
90 | |
91 | The subtitle above shows currency symbols of many lands. If they don't |
92 | appear as blobs, we're off to a good start! (The one on the end is the |
93 | <a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee |
94 | sign</a> which won't show up in fonts for a while.) |
95 | |
96 | <h3><a name="poetry">Poetry</a></h3> |
97 | |
98 | From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version): |
99 | <p><blockquote> |
100 | ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br> |
101 | ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br> |
102 | ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br> |
103 | </blockquote> |
104 | <p> |
105 | |
106 | From Laȝamon's <i><a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i> |
107 | (<i>The Chronicles of England</i>, Middle English, West Midlands): |
108 | <p> |
109 | <blockquote> |
110 | An preost wes on leoden, Laȝamon was ihoten<br> |
111 | He wes Leovenaðes sone -- liðe him be Drihten.<br> |
112 | He wonede at Ernleȝe at æðelen are chirechen,<br> |
113 | Uppen Sevarne staþe, sel þar him þuhte,<br> |
114 | Onfest Radestone, þer he bock radde. |
115 | </blockquote> |
116 | <p> |
117 | |
118 | (The third letter in the author's name is Yogh, missing from many fonts; |
119 | <a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample |
120 | with some explanation of letters and encoding.) |
121 | |
122 | <p> |
123 | |
124 | From the <cite>Tagelied</cite> of |
125 | |
126 | <a href="http://gutenberg.spiegel.de/autoren/eschenba.htm"> |
127 | <b>Wolfram von Eschenbach</b></a> (Middle High German): |
128 | <p><blockquote> |
129 | Sîne klâwen durh die wolken sint geslagen,<br> |
130 | er stîget ûf mit grôzer kraft,<br> |
131 | ich sih in grâwen tägelîch als er wil tagen,<br> |
132 | den tac, der im geselleschaft<br> |
133 | erwenden wil, dem werden man,<br> |
134 | den ich mit sorgen în verliez.<br> |
135 | ich bringe in hinnen, ob ich kan.<br> |
136 | sîn vil manegiu tugent michz leisten hiez.<br> |
137 | </blockquote><p> |
138 | |
139 | Some lines of |
140 | <a href="http://users.hol.gr/~artemis/odysseas_elytis.htm"> |
141 | <b>Odysseus Elytis</b></a> (Greek): |
142 | |
143 | <blockquote> |
144 | <table cellspacing=0 cellpadding=0> |
145 | <tr> |
146 | <td valign="top" style="padding-right:16px"> |
147 | Monotonic: |
148 | <p> |
149 | Τη γλώσσα μου έδωσαν ελληνική<br> |
150 | το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br> |
151 | Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br> |
152 | <p> |
153 | από το Άξιον Εστί<br> |
154 | του Οδυσσέα Ελύτη |
155 | |
156 | <td valign="top"> |
157 | Polytonic: |
158 | <p> |
159 | Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ<br/> |
160 | τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> |
161 | Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/> |
162 | <p> |
163 | ἀπὸ τὸ Ἄξιον ἐστί<br/> |
164 | τοῦ Ὀδυσσέα Ἐλύτη<br/> |
165 | |
166 | |
167 | |
168 | |
169 | |
170 | |
171 | |
172 | </table> |
173 | </blockquote> |
174 | |
175 | <p> |
176 | |
177 | The first stanza of |
178 | <a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br> |
179 | <p><blockquote> |
180 | На берегу пустынных волн<br> |
181 | Стоял он, дум великих полн,<br> |
182 | И вдаль глядел. Пред ним широко<br> |
183 | Река неслася; бедный чёлн<br> |
184 | По ней стремился одиноко.<br> |
185 | По мшистым, топким берегам<br> |
186 | Чернели избы здесь и там,<br> |
187 | Приют убогого чухонца;<br> |
188 | И лес, неведомый лучам<br> |
189 | В тумане спрятанного солнца,<br> |
190 | Кругом шумел.<br> |
191 | </blockquote><p> |
192 | |
193 | <a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Šota Rustaveli</b></a>'s Veṗxis Ṭq̇aosani, |
194 | <cite>The Knight in the Tiger's Skin</cite> (Georgian):<p> |
195 | <blockquote> |
196 | ვეპხის ტყაოსანი |
197 | შოთა რუსთაველი |
198 | <p> |
199 | ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა, |
200 | ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა; |
201 | მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა, |
202 | დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა. |
203 | </blockquote> |
204 | <p> |
205 | |
206 | Tamil poetry of Subramaniya Bharathiyar, |
207 | |
208 | சுப்ரமணிய பாரதியார் (1882-1921): |
209 | |
210 | <p> |
211 | <blockquote> |
212 | |
213 | யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம், <br> |
214 | பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு, <br> |
215 | நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!<br> |
216 | தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும். |
217 | |
218 | </blockquote> |
219 | <p> |
220 | Kannada poetry by Kuvempu — ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು |
221 | |
222 | <p> |
223 | <blockquote> |
224 | |
225 | |
226 | ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು ಇಂದೆನ್ನ ಹೃದಯದಲಿ |
227 | <br> |
228 | |
229 | ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ |
230 | |
231 | <p> |
232 | |
233 | |
234 | |
235 | |
236 | ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗೀ... |
237 | |
238 | <br> |
239 | |
240 | ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗಿ |
241 | |
242 | <br> |
243 | |
244 | ಭವ ಭವದಿ ಭತಿಸಿಹೇ ಭವತಿ ದೂರ |
245 | |
246 | <br> |
247 | |
248 | ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ || ಬಾ ಇಲ್ಲಿ || |
249 | |
250 | |
251 | </blockquote> |
252 | |
253 | <h3><a name="glass">I Can Eat Glass</a></h3> |
254 | |
255 | And from the sublime to the ridiculous, here is a |
256 | <a href="#notes">certain phrase¹</a> in an assortment of languages: |
257 | |
258 | <p> |
259 | <ol> |
260 | <li><b>Sanskrit</b>: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥ |
261 | |
262 | <li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām. |
263 | <li><b>Classical Greek</b>: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει. |
264 | <li><b>Greek</b> (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα. |
265 | <li><b>Greek</b> (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα. |
266 | |
267 | <br><b>Etruscan</b>: (NEEDED) |
268 | <li><b>Latin</b>: Vitrum edere possum; mihi non nocet. |
269 | <li><b>Old French</b>: Je puis mangier del voirre. Ne me nuit. |
270 | <li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal. |
271 | <li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas. |
272 | <li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal. |
273 | <li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må. |
274 | <br><b>Champenois</b>: (NEEDED) |
275 | <br><b>Lorrain</b>: (NEEDED) |
276 | <li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma. |
277 | <br><b>Corsican/Corsu</b>: (NEEDED) |
278 | <br><b>Jèrriais</b>: (NEEDED) |
279 | <li><b>Kreyòl Ayisyen</b> (Haïti): Mwen kap manje vè, li pa blese'm. |
280 | <li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen. |
281 | <li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal. |
282 | <li><b>Spanish</b>: Puedo comer vidrio, no me hace daño. |
283 | <li><b>Aragonés</b>: Puedo minchar beire, no me'n fa mal. |
284 | <br><b>Aranés</b>: (NEEDED) |
285 | <br><b>Mallorquín</b>: (NEEDED) |
286 | <li><b>Galician</b>: Eu podo xantar cristais e non cortarme. |
287 | <li><b>European Portuguese</b>: Posso comer vidro, não me faz mal. |
288 | <li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>): |
289 | Posso comer vidro, não me machuca. |
290 | <li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'. |
291 | <li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño. |
292 | <li><b>Italian</b>: Posso mangiare il vetro e non mi fa male. |
293 | <li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal. |
294 | <li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male. |
295 | <li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal. |
296 | <li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae. |
297 | <li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ. |
298 | <li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali. |
299 | <br><b>Campidanese</b> (Sardinia): (NEEDED) |
300 | <br><b>Logudorese</b> (Sardinia): (NEEDED) |
301 | <li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai. |
302 | <br><b>Romany / Tsigane</b>: (NEEDED) |
303 | <li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește. |
304 | <li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min. |
305 | <br><b>Pictish</b>: (NEEDED) |
306 | <br><b>Breton</b>: (NEEDED) |
307 | <li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya. |
308 | <li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi. |
309 | <li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee. |
310 | <li><b>Old Irish</b> <i>(Ogham):</i> ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜ |
311 | <li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna. |
312 | |
313 | <li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom. |
314 | <li><b>Ulster Gaelic</b>: Ithim-sa gloine agus ní miste damh é. |
315 | <li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi. |
316 | <li><b>Anglo-Saxon</b> <i>(Runes):</i> |
317 | ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬ |
318 | <li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me. |
319 | <li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt. |
320 | <li><b>English</b>: I can eat glass and it doesn't hurt me. |
321 | <li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation) |
322 | <li id="braille"><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑ |
323 | <li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi. |
324 | <li><b>Lallans Scots / Doric</b>: Ah can eat gless, it disnae hurt us. |
325 | <br><b>Glaswegian</b>: (NEEDED) |
326 | <li id="gothic"><b>Gothic</b> (<a href="#notes">4</a>): |