George V. Reilly

FSymbols for Unicode Weirdness

My display name on Twitter currently looks like @ɢᴇᴏʀɢᴇᴠʀᴇɪʟʟʏ@ᴛᴇᴄʜ.ʟɢʙᴛ, an attempt to route around Twitter's apparent censorship of Mastodon information.

I used the FSymbols Generators to produce several variants.

@𝕘𝕖𝕠𝕣𝕘𝕖𝕧𝕣𝕖𝕚𝕝𝕝𝕪@𝕥𝕖𝕔𝕙.𝕝𝕘𝕓𝕥
ʇqƃʅ.ɥɔǝʇ@ʎʅʅᴉǝɹʌǝƃɹoǝƃ@
@𝗀𝖾𝗈𝗋𝗀𝖾𝗏𝗋𝖾𝗂𝗅𝗅𝗒@𝗍𝖾𝖼𝗁.𝗅𝗀𝖻𝗍
@𝘨𝘦𝘰𝘳𝘨𝘦𝘷𝘳𝘦𝘪𝘭𝘭𝘺@𝘵𝘦𝘤𝘩.𝘭𝘨𝘣𝘵
@𝑔𝑒𝑜𝑟𝑔𝑒𝑣𝑟𝑒𝑖𝑙𝑙𝑦@𝑡𝑒𝑐ℎ.𝑙𝑔𝑏𝑡
@𝙜𝙚𝙤𝙧𝙜𝙚𝙫𝙧𝙚𝙞𝙡𝙡𝙮@𝙩𝙚𝙘𝙝.𝙡𝙜𝙗𝙩
@𝚐𝚎𝚘𝚛𝚐𝚎𝚟𝚛𝚎𝚒𝚕𝚕𝚢@𝚝𝚎𝚌𝚑.𝚕𝚐𝚋𝚝
@𝔤𝔢𝔬𝔯𝔤𝔢𝔳𝔯𝔢𝔦𝔩𝔩𝔶@𝔱𝔢𝔠𝔥.𝔩𝔤𝔟𝔱

Many of these variants come from the Unicode block "Mathematical Alphanumeric Symbols".
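As a rough illustration of how these styled letters relate to plain ASCII, here is a minimal Python sketch. It builds the double-struck variant by offsetting from U+1D552 (MATHEMATICAL DOUBLE-STRUCK SMALL A) and folds it back with NFKC normalization. The function names are invented for this example, and this is not a description of how the FSymbols generators work internally.

```python
import unicodedata

# Code point of MATHEMATICAL DOUBLE-STRUCK SMALL A; the lowercase
# double-struck letters a-z follow it in one contiguous run.
DOUBLE_STRUCK_SMALL_A = 0x1D552


def to_double_struck(text: str) -> str:
    """Replace ASCII lowercase letters with double-struck equivalents."""
    return "".join(
        chr(DOUBLE_STRUCK_SMALL_A + ord(ch) - ord("a")) if "a" <= ch <= "z" else ch
        for ch in text
    )


def fold_to_plain(text: str) -> str:
    """NFKC normalization maps the math alphanumerics back to plain letters."""
    return unicodedata.normalize("NFKC", text)
```

Calling fold_to_plain(to_double_struck("@georgevreilly@tech.lgbt")) should give back the original ASCII handle, which is one reason such styling is a weak disguise against anything that normalizes text before matching.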

There are a lot more things you can do with Unicode than just upside-down text.

Path Traversal Attacks

I was surprised to read this evening that the Apache Web Server just fixed an actively exploited path traversal flaw.

🚨 Apache has disclosed an *actively exploited* Path traversal flaw in the #opensource "httpd" server. Over 112,000 exposed Apache servers run version 2.4.49, and should be upgraded now!
New fix checks for encoded path traversal characters e.g. /../.%2E/ https://t.co/1tLNc3LAul pic.twitter.com/mDHLEU3k9N
— Ax Sharma (@Ax_Sharma) October 5, 2021

Apparently, it was introduced over a year ago.

I'm gobsmacked that Apache didn't have a robust suite of tests for this.

Directory Traversal attacks have been a problem for web servers since the beginning. OWASP, PortSwigger, and Spanning all have explanations that you can read. The essence is that you make …
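To make the encoded-traversal problem concrete, here is a small Python sketch of the kind of check a server needs: percent-decode the request path first, normalize it, and only then confirm that it still sits under the document root. The root "/var/www" and the function name are assumptions for illustration. This is not Apache's actual fix, and it ignores symlinks and double encoding.

```python
import posixpath
from urllib.parse import unquote


def is_inside_root(url_path: str, root: str = "/var/www") -> bool:
    """Return True if the decoded, normalized path stays under root."""
    # Decode first, so that ".%2E" is seen as ".." before normalizing.
    decoded = unquote(url_path)
    # Join relative to the root, then collapse any "." and ".." segments.
    candidate = posixpath.normpath(posixpath.join(root, decoded.lstrip("/")))
    return candidate == root or candidate.startswith(root + "/")
```

With this sketch, "/index.html" is accepted, while "/cgi-bin/.%2e/.%2e/etc/passwd" is rejected because it normalizes to a path outside the root.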

Homograph Attacks

During an internal training exercise today, as a sort of one-man Chaos Monkey, I deliberately broke a test system by changing a config setting to read:

itemfinder.url = http://test-іtemfinder.example.com/

The correct value should have been:

itemfinder.url = http://test-itemfinder.example.com/

What's that, you say? There's no difference, you say?

There is a difference, but it's subtle. The first i in the URL is 'CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I' (U+0456), not 'LATIN SMALL LETTER I' (U+0069). Depending upon the font, the two i's may be visually indistinguishable, very similar looking, or the Cyrillic i may not render.

This is an example of an Internationalized Domain Name (IDN) Homograph Attack. There are Greek letters and Cyrillic letters that look …
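One way to spot this kind of substitution is to list every non-ASCII character in the value along with its Unicode name, or to convert the hostname to its ASCII (Punycode) form. The following Python sketch does both. The function names are made up for this example, and the hostname is the one from the config setting above.

```python
import unicodedata


def non_ascii_report(text: str) -> list[tuple[int, str, str]]:
    """List (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


def to_ascii_hostname(host: str) -> str:
    """Encode a hostname with Python's built-in IDNA codec.

    Labels containing non-ASCII characters come back with an "xn--" prefix,
    which makes a disguised label easy to see.
    """
    return host.encode("idna").decode("ascii")


# The broken hostname, with the Cyrillic letter written as an escape.
BROKEN_HOST = "test-\u0456temfinder.example.com"
```

Here non_ascii_report(BROKEN_HOST) flags the character at index 5 as CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I, while the correct hostname produces an empty report.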

Unicode Upside-Down Mapping, Part 2

Yesterday I showed FileFormat's ɹǝʇɹǝʌuoↃ uʍo◖-ǝpısd∩ ǝpoɔıu∩. Although the lowercase letters generally looked good, several of the uppercase letters and numerals were unsatisfactory. Looking through the Unicode Table site, I came across the Fraser Lisu alphabet, which is unfortunately not well supported in most fonts. The following renders in Hack and Source Code Pro in MacVim, but not in the Source Code Pro webfont from Google Fonts:

B: ꓭ U+A4ED  Lisu Letter Gha
D: ꓷ U+A4F7  Lisu Letter Oe
J: ꓩ U+A4E9  Lisu Letter Fa
K: ꓘ U+A4D8  Lisu Letter Kha
L: ꓶ U+A4F6  Lisu Letter Uh
R: ꓤ U+A4E4  Lisu Letter Za
T: ꓕ U+A4D5  Lisu …

Unicode Upside-Down Mapping

Unicode is so versatile that you can (more or less) invert the Latin alphabet:

ɐqɔpǝɟƃɥıɾʞʃɯuodbɹsʇnʌʍxʎz ∀𐐒Ↄ◖ƎℲ⅁HIſ⋊⅂WᴎOԀΌᴚS⊥∩ᴧMX⅄Z 012Ɛᔭ59Ɫ86
abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ 0123456789
68Ɫ95ᔭƐ210 Z⅄XMᴧ∩⊥SᴚΌԀOᴎW⅂⋊ſIH⅁ℲƎ◖Ↄ𐐒∀ zʎxʍʌnʇsɹbdouɯʃʞɾıɥƃɟǝpɔqɐ

Obtained via the ɹǝʇɹǝʌuoↃ uʍo◖-ǝpısd∩ ǝpoɔıu∩. More at Unicode Upside-Down Mapping.
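A converter like this can be approximated with a translation table plus a string reversal. The Python sketch below covers only the lowercase row of the mapping shown above. The names are invented for illustration, and it is not the FileFormat converter's actual code.

```python
PLAIN = "abcdefghijklmnopqrstuvwxyz"
# Lowercase row of the upside-down mapping, in the same a-z order.
FLIPPED = "ɐqɔpǝɟƃɥıɾʞʃɯuodbɹsʇnʌʍxʎz"

# str.maketrans requires both strings to have the same length.
FLIP_TABLE = str.maketrans(PLAIN, FLIPPED)


def flip(text: str) -> str:
    """Swap each lowercase letter for its inverted look-alike, then reverse
    the string so it reads correctly when rotated 180 degrees."""
    return text.translate(FLIP_TABLE)[::-1]
```

Characters outside a-z pass through unchanged, so uppercase letters and digits would need the other two rows of the mapping added to the table.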

Update: more tomorrow.

URLs from Unicode Strings

[Previously published at the now defunct MetaBrite Dev Blog.]

Some time ago, we made an ill-considered decision to use recipe names for image URLs, which simplified image management with our then-rudimentary tools. For example, the recipe named "Twisted Pasta With Browned Butter, Sage, and Walnuts" becomes a URL ending in "Twisted%20Pasta%20With%20Browned%20Butter%2C%20Sage%2C%20and%20Walnuts.jpg".

Life becomes more interesting when you escape the confines of 7-bit ASCII and use Unicode. How should u"Sautéed crème fraîche Provençale" be handled? The only reasonable thing to do is to first convert the Unicode string to UTF-8 and then percent-encode those octets as hex: "Saut%C3%A9ed%20cr%C3%A8me%20fra%C3%AEche%20Proven%C3%A7ale".
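In Python 3, the standard library's urllib.parse.quote performs exactly this UTF-8-then-percent-encode step. The sketch below is only an illustration: the helper name and the ".jpg" default are assumptions, and it is not the code we ran at the time.

```python
from urllib.parse import quote


def recipe_image_name(recipe_name: str, extension: str = ".jpg") -> str:
    """Percent-encode a recipe name for use as the last segment of a URL.

    quote() encodes the string as UTF-8 and then escapes every octet outside
    the unreserved set; safe="" means "/" is escaped as well.
    """
    return quote(recipe_name, safe="") + extension
```

For the two names above, this should reproduce the encoded strings shown, with spaces as %20, commas as %2C, and each accented letter as two percent-encoded octets.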

That seems reasonable, but it was giving us inconsistent results when the images were uploaded to an S3 bucket. When …