<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>UTF-8 Sampler</title>

<META http-equiv="Content-Style-Type" content="text/css">
<META name="viewport" content="width=device-width, initial-scale=1.0">
<LINK REL="stylesheet" TYPE="text/css" HREF="/kermit.css">
<LINK REL="shortcut icon" href="/favicon.ico" >
<LINK REL="icon" href="/favicon.ico" type="image/x-icon">
<LINK REL="icon" type="image/ico" href="/favicon.ico"> 
<style type="text/css">
  blockquote { margin-left:8px; margin-right:8px; font-size:90% }
  body { font-size:15px;
         font-family:calibri, arial narrow, arial, sans-serif, times;
         color:black;
         background:white;
         margin:16px;    
  }
  tt { font-size:94% }
</style>
</head>

<body>

<h1><tt>UTF-8 SAMPLER</tt></h1>

<big><big>&nbsp;&nbsp;¥&nbsp;·&nbsp;£&nbsp;·&nbsp;€&nbsp;·&nbsp;$&nbsp;·&nbsp;¢&nbsp;·&nbsp;₡&nbsp;·&nbsp;₢&nbsp;·&nbsp;₣&nbsp;·&nbsp;₤&nbsp;·&nbsp;₥&nbsp;·&nbsp;₦&nbsp;·&nbsp;₧&nbsp;·&nbsp;₨&nbsp;·&nbsp;₩&nbsp;·&nbsp;₪&nbsp;·&nbsp;₫&nbsp;·&nbsp;₭&nbsp;·&nbsp;₮&nbsp;·&nbsp;₯&nbsp;·&nbsp;&#8377;</big></big>



<p>
<blockquote>
Frank da Cruz<br>
<a href="index.html">The Kermit Project</a><br>
New York City<br>
<a href="mailto:fdc@kermitproject.org">fdc@kermitproject.org</a>

<p>
<i>Last update:</i>
Thu Sep 15 14:00:00 2016
</blockquote>
<p>
<hr>
[&nbsp;<a href="http://www.columbia.edu/~fdc/pace/">PEACE</a>&nbsp;]
[&nbsp;<a href="#poetry">Poetry</a>&nbsp;]
[&nbsp;<a href="#glass">I Can Eat Glass</a>&nbsp;]
[&nbsp;<a href="#quickbrownfox">Pangrams</a>&nbsp;]
[&nbsp;<a href="#html">HTML Features</a>&nbsp;]
[&nbsp;<a href="#credits">Credits, Tools, Commentary</a>&nbsp;]
<p>

<big><big>U</big>TF-8</big> is an ASCII-preserving encoding method for
<a href="unicode.html">Unicode</a> (ISO 10646), the Universal Character Set
(UCS).  The UCS encodes most of the world's writing systems in a single
character set, allowing you to mix languages and scripts within a document
without needing any tricks for switching character sets.  This web page is
encoded directly in UTF-8.
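<p>
To make "ASCII-preserving" concrete, here is a minimal Python sketch (not part
of the original page; the helper name <tt>utf8_hex</tt> is made up for
illustration).  It prints the UTF-8 bytes of a few characters that appear on
this page and checks that plain ASCII text comes out byte-for-byte unchanged.
<p>
<pre>
```python
# Illustrative sketch only: utf8_hex is a hypothetical helper, not from the page.
samples = ["A", "£", "€", "ᚠ"]

def utf8_hex(ch):
    """Return the UTF-8 bytes of ch as space-separated uppercase hex."""
    return " ".join(format(b, "02X") for b in ch.encode("utf-8"))

# Show the code point and the encoded bytes of each sample character.
for ch in samples:
    print("U+%04X" % ord(ch), utf8_hex(ch))

# ASCII text is byte-for-byte identical whether encoded as ASCII or UTF-8.
ascii_text = "UTF-8 SAMPLER"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")
```
</pre>
<p>
The ASCII letter stays a single byte, while the pound sign takes two bytes and
the euro sign and the rune take three each, all with the high bit set, so they
can never be mistaken for ASCII.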

<p>

As shown <a href="glass.html">HERE</a>,
Columbia University's <a href="k95.html">Kermit 95</a> terminal emulation
software can display UTF-8 plain text in Windows 95, 98, ME, NT, XP, Vista,
or Windows 7/8/10 when using a monospace Unicode font like <a
href="http://www.monotype.com">Andale Mono WT J</a> or <a
href="http://www.evertype.com/emono/">Everson Mono Terminal</a>, or the less fully
populated Courier New, Lucida Console, or Andale Mono.  <a
href="ckermit.html">C-Kermit</a> can handle it too,
<a href="http://www.cl.cam.ac.uk/~mgk25/unicode.html">if you have a Unicode
display</a>.  As many languages as are representable in your font can be seen
on the screen at the same time.

<p>

This, however, is a Web page.  It started out as a kind of stress test for
UTF-8 support in Web browsers, which was spotty when this page was first
created in the 1990s but has since become standard in all modern browsers.
The problem now is mainly the fonts, and the browser's (or font's) support
for less common blocks and for the nonzero Unicode planes (as in, e.g., the
<a href="#braille">Braille</a> and <a href="#gothic">Gothic</a> examples
below), and to some extent the rendition of combining sequences,
right-to-left rendition (<a href="#arabic">Arabic</a>,
<a href="#hebrew">Hebrew</a>), and so
on.  <a href="http://www.alanwood.net/unicode/fonts.html">CLICK HERE</a> for
a survey of Unicode fonts for Windows.
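<p>
The properties that tend to trip up fonts and browsers can be inspected
directly.  The following is a rough Python sketch using the standard
<tt>unicodedata</tt> module; the helper names <tt>plane</tt> and
<tt>describe</tt> are assumptions made for this example, not anything the
page itself relies on.
<p>
<pre>
```python
import unicodedata

def plane(ch):
    """Unicode plane number; 0 is the Basic Multilingual Plane."""
    return ord(ch) // 0x10000

def describe(ch):
    """Return (plane, UTF-8 length, bidi class, is-combining) for one character."""
    return (plane(ch),
            len(ch.encode("utf-8")),
            unicodedata.bidirectional(ch),
            unicodedata.combining(ch) != 0)

# Latin A, a Braille pattern, a Gothic letter, Hebrew alef,
# Arabic alef, and a combining acute accent.
for ch in ["A", "\u2801", "\U00010330", "\u05D0", "\u0627", "\u0301"]:
    print("U+%04X" % ord(ch), describe(ch))
```
</pre>
<p>
The Gothic letter is the only one of these outside plane 0 and needs four
UTF-8 bytes; the Hebrew and Arabic letters report right-to-left bidi classes,
and the accent reports a nonzero combining class.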

<p>

The subtitle above shows currency symbols of many lands.  If they don't
appear as blobs, we're off to a good start!  (The one on the end is the
<a href="http://en.wikipedia.org/wiki/Indian_rupee_sign">new Indian Rupee
sign</a> which won't show up in fonts for a while.)

<h3><a name="poetry">Poetry</a></h3>

From the Anglo-Saxon <a href="http://www.ragweedforge.com/poems.html"><cite>Rune Poem</cite></a> (Rune version):
<p><blockquote>
  ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ<br>
  ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ<br>
  ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬<br>
</blockquote>
<p>

From Laȝamon's <i><a href="http://mesl.itd.umich.edu/b/brut/">Brut</a></i>
(<i>The Chronicles of England</i>, Middle English, West Midlands):
<p>
<blockquote>
An preost wes on leoden, Laȝamon was ihoten<br>
He wes Leovenaðes sone -- liðe him be Drihten.<br>
He wonede at Ernleȝe at æðelen are chirechen,<br>
Uppen Sevarne staþe, sel þar him þuhte,<br>
Onfest Radestone, þer he bock radde.
</blockquote>
<p>

(The third letter in the author's name is Yogh, missing from many fonts;
<a href="st-erkenwald.html">CLICK HERE</a> for another Middle English sample
with some explanation of letters and encoding).

<p>

From the <cite>Tagelied</cite> of 

<a href="http://gutenberg.spiegel.de/autoren/eschenba.htm">
<b>Wolfram von Eschenbach</b></a> (Middle High German):
<p><blockquote>
Sîne klâwen durh die wolken sint geslagen,<br>
er stîget ûf mit grôzer kraft,<br>
ich sih in grâwen tägelîch als er wil tagen,<br>
den tac, der im geselleschaft<br>
erwenden wil, dem werden man,<br>
den ich mit sorgen în verliez.<br>
ich bringe in hinnen, ob ich kan.<br>
sîn vil manegiu tugent michz leisten hiez.<br>
</blockquote><p>

Some lines of 
<a href="http://users.hol.gr/~artemis/odysseas_elytis.htm">
<b>Odysseus Elytis</b></a> (Greek):

<blockquote>
<table cellspacing=0 cellpadding=0>
<tr>
<td valign="top" style="padding-right:16px">
Monotonic:
<p>
Τη γλώσσα μου έδωσαν ελληνική<br>
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.<br>
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.<br>
<p>
από το Άξιον Εστί<br>
του Οδυσσέα Ελύτη

<td valign="top">
Polytonic:
<p>
Τὴ γλῶσσα μοῦ ἔδωσαν ἑλληνικὴ<br/>
τὸ σπίτι φτωχικὸ στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/>
Μονάχη ἔγνοια ἡ γλῶσσα μου στὶς ἀμμουδιὲς τοῦ Ὁμήρου.<br/>
<p>
ἀπὸ τὸ Ἄξιον ἐστί<br/>
τοῦ Ὀδυσσέα Ἐλύτη<br/>







</table>
</blockquote>

<p>

The first stanza of 
<a href="http://www.ocf.berkeley.edu/%7Eleong/Russkaya%20Literatura/Aleksandr%20Sergeevich%20Pushkin.htm"><b>Pushkin</b></a>'s <cite>Bronze Horseman</cite> (Russian):<br>
<p><blockquote>
На берегу пустынных волн<br>
Стоял он, дум великих полн,<br>
И вдаль глядел.  Пред ним широко<br>
Река неслася; бедный чёлн<br>
По ней стремился одиноко.<br>
По мшистым, топким берегам<br>
Чернели избы здесь и там,<br>
Приют убогого чухонца;<br>
И лес, неведомый лучам<br>
В тумане спрятанного солнца,<br>
Кругом шумел.<br>
</blockquote><p>

<a href="http://www.compling.hu-berlin.de/~johannes/mxedruli/"><b>Šota Rustaveli</b></a>'s Veṗxis Ṭq̇aosani,
<cite>The Knight in the Tiger's Skin</cite> (Georgian):<p>
<blockquote>
ვეპხის ტყაოსანი<br>
შოთა რუსთაველი
<p>
ღმერთსი შემვედრე, ნუთუ კვლა დამხსნას სოფლისა შრომასა,<br>
ცეცხლს, წყალსა და მიწასა, ჰაერთა თანა მრომასა;<br>
მომცნეს ფრთენი და აღვფრინდე, მივჰხვდე მას ჩემსა ნდომასა,<br>
დღისით და ღამით ვჰხედვიდე მზისა ელვათა კრთომაასა.
</blockquote>
<p>

Tamil poetry of Subramaniya Bharathiyar
(சுப்ரமணிய பாரதியார், 1882-1921):

<p>
<blockquote>

யாமறிந்த மொழிகளிலே தமிழ்மொழி போல் இனிதாவது எங்கும் காணோம், <br>
பாமரராய் விலங்குகளாய், உலகனைத்தும் இகழ்ச்சிசொலப் பான்மை கெட்டு, <br>
நாமமது தமிழரெனக் கொண்டு இங்கு வாழ்ந்திடுதல் நன்றோ? சொல்லீர்!<br>
தேமதுரத் தமிழோசை உலகமெலாம் பரவும்வகை செய்தல் வேண்டும்.

</blockquote>
<p>
Kannada poetry by Kuvempu &mdash; ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು

<p>
<blockquote>


ಬಾ ಇಲ್ಲಿ ಸಂಭವಿಸು ಇಂದೆನ್ನ ಹೃದಯದಲಿ
<br>

ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ

<p>




ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗೀ...

<br>

ಮಣ್ಣಾಗಿ ಮರವಾಗಿ ಮಿಗವಾಗಿ ಕಗವಾಗಿ

<br>

ಭವ ಭವದಿ ಭತಿಸಿಹೇ ಭವತಿ ದೂರ

<br>

ನಿತ್ಯವೂ ಅವತರಿಪ ಸತ್ಯಾವತಾರ || ಬಾ ಇಲ್ಲಿ ||


</blockquote>

<h3><a name="glass">I Can Eat Glass</a></h3>

And from the sublime to the ridiculous, here is a
<a href="#notes">certain phrase&sup1;</a> in an assortment of languages:

<p>
<ol>
<li><b>Sanskrit</b>: काचं शक्नोम्यत्तुम् । नोपहिनस्ति माम् ॥

<li><b>Sanskrit</b> <i>(standard transcription):</i> kācaṃ śaknomyattum; nopahinasti mām.
<li><b>Classical Greek</b>: ὕαλον ϕαγεῖν δύναμαι· τοῦτο οὔ με βλάπτει.
<li><b>Greek</b> (monotonic): Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα.
<li><b>Greek</b> (polytonic): Μπορῶ νὰ φάω σπασμένα γυαλιὰ χωρὶς νὰ πάθω τίποτα.

<br><b>Etruscan</b>: (NEEDED)
<li><b>Latin</b>:  Vitrum edere possum; mihi non nocet.
<li><b>Old French</b>: Je puis mangier del voirre.  Ne me nuit.
<li><b>French</b>: Je peux manger du verre, ça ne me fait pas <!--de--> mal.
<li><b>Provençal / Occitan</b>: Pòdi manjar de veire, me nafrariá pas.
<li><b>Québécois</b>: J'peux manger d'la vitre, ça m'fa pas mal.
<li><b>Walloon</b>: Dji pou magnî do vêre, çoula m' freut nén må.
<br><b>Champenois</b>: (NEEDED)
<br><b>Lorrain</b>: (NEEDED)
<li><b>Picard</b>: Ch'peux mingi du verre, cha m'foé mie n'ma.
<br><b>Corsican/Corsu</b>: (NEEDED)
<br><b>J&egrave;rriais</b>: (NEEDED)
<li><b>Kreyòl Ayisyen</b> (Ha&iuml;ti): Mwen kap manje vè, li pa blese'm.
<li><b>Basque</b>: Kristala jan dezaket, ez dit minik ematen.
<li><b>Catalan / Català</b>: Puc menjar vidre, que no em fa mal.
<li><b>Spanish</b>: Puedo comer vidrio, no me hace daño.
<li><b>Aragon&eacute;s</b>: Puedo minchar beire, no me'n fa mal.
<br><b>Aran&eacute;s</b>: (NEEDED)
<br><b>Mallorquín</b>: (NEEDED)
<li><b>Galician</b>: Eu podo xantar cristais e non cortarme.
<li><b>European Portuguese</b>: Posso comer vidro, não me faz mal.
<li><b>Brazilian Portuguese</b> (<a href="#notes">8</a>):
 Posso comer vidro, não me machuca.
<li><b>Caboverdiano/Kabuverdianu</b> (Cape Verde): M' podê cumê vidru, ca ta maguâ-m'.
<li><b>Papiamentu</b>: Ami por kome glas anto e no ta hasimi daño.
<li><b>Italian</b>:  Posso mangiare il vetro e non mi fa male.
<li><b>Milanese</b>: Sôn bôn de magnà el véder, el me fa minga mal.
<li><b>Roman</b>: Me posso magna' er vetro, e nun me fa male.
<li><b>Napoletano</b>: M' pozz magna' o'vetr, e nun m' fa mal.
<li><b>Venetian</b>: Mi posso magnare el vetro, no'l me fa mae.
<li><b>Zeneise</b> <i>(Genovese):</i> Pòsso mangiâ o veddro e o no me fà mâ.
<li><b>Sicilian</b>: Puotsu mangiari u vitru, nun mi fa mali.
<br><b>Campidanese</b> (Sardinia): (NEEDED)
<br><b>Logudorese</b> (Sardinia): (NEEDED)
<li><b>Romansch (Grischun)</b>: Jau sai mangiar vaider, senza che quai fa donn a mai.
<br><b>Romany / Tsigane</b>: (NEEDED)
<li><b>Romanian</b>: Pot să mănânc sticlă și ea nu mă rănește.
<li><b>Esperanto</b>: Mi povas manĝi vitron, ĝi ne damaĝas min.
<br><b>Pictish</b>: (NEEDED)
<br><b>Breton</b>: (NEEDED)
<li><b>Cornish</b>: Mý a yl dybry gwéder hag éf ny wra ow ankenya.
<li><b>Welsh</b>: Dw i'n gallu bwyta gwydr, 'dyw e ddim yn gwneud dolur i mi.
<li><b>Manx Gaelic</b>: Foddym gee glonney agh cha jean eh gortaghey mee.
<li><b>Old Irish</b> <i>(Ogham):</i> ᚛᚛ᚉᚑᚅᚔᚉᚉᚔᚋ ᚔᚈᚔ ᚍᚂᚐᚅᚑ ᚅᚔᚋᚌᚓᚅᚐ᚜
<li><b>Old Irish</b> <i>(Latin):</i> Con·iccim ithi nglano. Ním·géna.

<li><b>Irish</b>: Is féidir liom gloinne a ithe. Ní dhéanann sí dochar ar bith dom.
<li><b>Ulster Gaelic</b>: Ithim-sa gloine agus ní miste damh é.
<li><b>Scottish Gaelic</b>: S urrainn dhomh gloinne ithe; cha ghoirtich i mi.
<li><b>Anglo-Saxon</b> <i>(Runes):</i>
ᛁᚳ᛫ᛗᚨᚷ᛫ᚷᛚᚨᛋ᛫ᛖᚩᛏᚪᚾ᛫ᚩᚾᛞ᛫ᚻᛁᛏ᛫ᚾᛖ᛫ᚻᛖᚪᚱᛗᛁᚪᚧ᛫ᛗᛖ᛬
<li><b>Anglo-Saxon</b> <i>(Latin):</i> Ic mæg glæs eotan ond hit ne hearmiað me.
<li><b>Middle English</b>: Ich canne glas eten and hit hirtiþ me nouȝt.
<li><b>English</b>: I can eat glass and it doesn't hurt me.
<li><b>English</b> <i>(IPA):</i> [aɪ kæn iːt glɑːs ænd ɪt dɐz nɒt hɜːt miː] (Received Pronunciation)
<li id="braille"><b>English</b> <i>(Braille):</i> ⠊⠀⠉⠁⠝⠀⠑⠁⠞⠀⠛⠇⠁⠎⠎⠀⠁⠝⠙⠀⠊⠞⠀⠙⠕⠑⠎⠝⠞⠀⠓⠥⠗⠞⠀⠍⠑
<li><b>Jamaican</b>: Mi kian niam glas han i neba hot mi.
<li><b>Lalland Scots / Doric</b>: Ah can eat gless, it disnae hurt us.
<br><b>Glaswegian</b>: (NEEDED)
<li id="gothic"><b>Gothic</b> (<a href="#notes">4</a>):
