Arnt Gulbrandsen

GROẞ in use

The unicode committee added ẞ about five years ago; until then that ligature was only available in lower case. (The Germans used to write Straße and STRASSE, the Swiss and Austrians wrote Strasse.)

That was good timing.

In the ten years from when I moved to Germany until the unicode committee added ẞ I only saw it once, carved on a wooden clock in the Straußwirtschaft at a winery near Kaiserstuhl. (The food and wine were both good.) In the five years since 2010 I've seen it five times. Yesterday I saw it twice, once on a poster that encouraged me to TANZEN UND GENIEẞEN and once in the credits for a film.

OK, the film is from 2008, but still, good timing. The committee caught the start of wider use well.

Here's an example (the middle headline):

Looks quite naturally German, I think.

Update: And in the two weeks since I wrote that I've seen two more examples: another poster and a jar of mustard.
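For the curious, here is a small sketch of how the two letters behave in software. It uses only Python's standard library; the helper name describe is made up for this example. As far as I know, the default upper-casing of ß still produces SS, so ẞ only appears when someone types it on purpose.

```python
import unicodedata

CAPITAL_SHARP_S = "\u1e9e"  # ẞ
SMALL_SHARP_S = "\u00df"    # ß


def describe(ch):
    """Return the code point and the official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(CAPITAL_SHARP_S))
print(describe(SMALL_SHARP_S))

# Default upper-casing expands ß to SS rather than producing ẞ.
print("Straße".upper())

# Lower-casing the capital letter gives the familiar small one.
print(CAPITAL_SHARP_S.lower())
```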


Deleting the wrong file

Many options are open to me. ⓐ Recovering the file from tarsnap, or ⓑ from the NAS, or maybe ⓒ I have a copy elsewhere. But if the file isn't available anywhere, what then?

I can ⓓ be polite and express my anger and sorrow in flawless prose.

I can ⓔ express myself freely, but discreetly. ██████ and █████████ the █████ ███████!

Finally, if nothing else helps, my last recourse is to ⓕ leverage the power of unicode: ℄♓♜⌘⌼⑆↺☂☊⚠雷𝀲☠⏣☡☢☣☧⍾♏♣⚑⚒⏁⚡⬌⭓𝀴🁠😒!



Some fonts have already been updated to include ẞ, including the ones I generally use (on ubuntu 10.04). Lovely.


Detecting character encodings

Archiveopteryx often needs to massage incoming mail to make it syntactically valid. 99% may be valid, but 1% is still a lot. One of the chores is to guess how a message is encoded — unicode, ISO-8859-x or what? For that Archiveopteryx uses a novel and good algorithm. (more…)
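To give a feel for the problem, here is a deliberately naive baseline in Python. It is not the Archiveopteryx algorithm, which this excerpt does not describe; it is only the common try-strict-decoders-in-order trick. The function name guess_decode and the fallback list are assumptions made up for this sketch.

```python
def guess_decode(raw, fallbacks=("iso-8859-1",)):
    """Guess the encoding of raw bytes; return (text, encoding_name).

    Naive order: pure ASCII, then strict UTF-8, then each fallback.
    Random 8-bit text is rarely valid UTF-8, so a successful strict
    UTF-8 decode is a fairly strong hint.
    """
    for enc, label in (("ascii", "us-ascii"), ("utf-8", "utf-8")):
        try:
            return raw.decode(enc), label
        except UnicodeDecodeError:
            pass
    for enc in fallbacks:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Nothing fit: keep what we can and mark the damage.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


print(guess_decode("Straße".encode("utf-8")))
print(guess_decode("Straße".encode("iso-8859-1")))
```

The weakness is obvious: ISO-8859-1 accepts every byte sequence, so this baseline can never tell the ISO-8859-x variants apart. That is exactly the part where a better algorithm has to look at the text itself.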



I'm sitting in my office, pondering whether my location is best described as an office, an office, an office or perhaps (overwhelmingly correct) an office. I think I like room. Simple words are so… unruffling.



Here's what I did to get a sensibly large character repertoire for my keyboard, using ubuntu 9.10.

First, read any of the fine explanations of the compose key and configure some suitable key.

Next, explain to ubuntu that you do wish to use XIM:

im-switch -s default-xim

By this time, X applications will read your ~/.XCompose file when they start, so set it up. There's a large repertoire on github:

cd ~/src
git clone git://

Here's my ~/.XCompose (more…)


Various neat signs and glyphs

Another browser test (I'm looking at you, android). These signs are mostly ones I've used in 2008-9 (with some added for symmetry etc.), and I do not think they are too odd to be worth rendering.


Feature not supported (U+2610)
Feature supported! (U+2611)
Feature supported! (U+2612)
Check mark (U+2713)
Heavy check mark (U+2714)
Multiplication x (U+2715)
Heavy multiplication x (U+2716)
Ballot x (U+2717)
Heavy ballot x (U+2718)
Telephone (U+260f)
Telephone (U+260e)
Peace (U+262e)
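If you would rather run the same test in a terminal than in a browser, a short Python sketch can print each sign next to its code point and official Unicode name. The code points are the ones listed above; the function name glyph_table is made up for this example, and the official names will differ from my labels.

```python
import unicodedata

# The code points from the list above, in the same order.
CODE_POINTS = [
    0x2610, 0x2611, 0x2612,
    0x2713, 0x2714, 0x2715, 0x2716, 0x2717, 0x2718,
    0x260F, 0x260E, 0x262E,
]


def glyph_table(code_points):
    """Return one line per code point: glyph, U+XXXX, official name."""
    rows = []
    for cp in code_points:
        ch = chr(cp)
        name = unicodedata.name(ch, "<unnamed>")
        rows.append(f"{ch}  U+{cp:04X}  {name}")
    return rows


for row in glyph_table(CODE_POINTS):
    print(row)
```

Whether the glyphs actually show up still depends on the fonts your terminal has, which is the whole point of the test.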

Arrows and that kind of thing: (more…)


All the world's alphabets

Below are almost all the letters in Unicode. Let me see what my browser can render.

Arabic: ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ى ي ٮ ٯ ٱ ٲ ٳ ٴ ٵ ٶ ٷ ٸ ٹ ٺ ٻ ټ ٽ پ ٿ ڀ ځ ڂ ڃ ڄ څ چ ڇ ڈ ډ ڊ ڋ ڌ ڍ ڎ ڏ ڐ ڑ ڒ ړ ڔ ڕ ږ ڗ ژ ڙ ښ ڛ ڜ ڝ ڞ ڟ ڠ ڡ ڢ ڣ ڤ ڥ ڦ ڧ ڨ ک ڪ ګ ڬ ڭ ڮ گ ڰ ڱ ڲ ڳ ڴ ڵ ڶ ڷ ڸ ڹ ں ڻ ڼ ڽ ھ ڿ ۀ ہ ۂ ۃ ۄ ۅ ۆ ۇ ۈ ۉ ۊ ۋ ی ۍ ێ ۏ ې ۑ ے ۓ ە ۮ ۯ ۺ ۻ ۼ ۿ ݐ ݑ ݒ ݓ ݔ ݕ ݖ ݗ ݘ ݙ ݚ ݛ ݜ ݝ ݞ ݟ ݠ ݡ ݢ ݣ ݤ ݥ ݦ ݧ ݨ ݩ ݪ ݫ ݬ ݭ ﭿ ﯿ ﺿ (more…)