learning and documenting UTF-8 like a bozo

starting off strongly with da goat, da wikipedia page

it says i can be stupid when needed

It was designed for backward compatibility with ASCII

which i like to see, i'm stupid after all

so, first i need to tell ascii apart from everything else. if a byte is <128 then it's plain ascii, otherwise it's part of a multi-byte utf-8 sequence

right? right.
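
a minimal sketch of that check, going one step past "<128 or not": the top bits of the first byte also say how long the sequence is. the name utf8_seq_len is made up for illustration, and it's deliberately sloppy: it doesn't reject lead bytes that are never valid (0xc0, 0xc1, 0xf5 to 0xf7).

```c
#include <stdint.h>

/* Hypothetical helper (not from lol.c): number of bytes in the UTF-8
 * sequence that starts with byte b. Returns 0 when b cannot start a
 * sequence (continuation byte 10xxxxxx, or 0xf8 and above). */
int utf8_seq_len(uint8_t b)
{
    if (b < 0x80)           return 1;  /* 0xxxxxxx: plain ascii */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                          /* continuation byte or junk */
}
```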

reading, reading... ok i'm bored

let's write some stupid ass C program that readlines and displays bytes. yeah, you right, i'm rewriting hexdump. will take 5 minutes, stop complaining and shut the fuck up please

thank u ;3 brb

wrote da lol.c

well why readline when i can use argc/argv? lol
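
lol.c itself isn't pasted here, so this is only a guess at what its core could look like: walk the bytes of one argument and print each as decimal/0xhex. format_bytes and its buffer-based signature are assumptions, picked so the formatting can be checked on its own.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reconstruction, not the real lol.c: write every byte of s
 * into out as "dec/0xhex", separated by single spaces.
 * Returns 0 on success, -1 if out (capacity cap) is too small. */
int format_bytes(const char *s, char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    /* go through unsigned char so bytes >= 0x80 don't turn negative */
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        int n = snprintf(out + used, cap - used, "%s%u/0x%02x",
                         used ? " " : "", (unsigned)*p, (unsigned)*p);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;  /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

a main that calls this on argv[1] and puts() the buffer would be enough to get the byte dump, assuming the terminal hands the argument over as utf-8.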

let's do some tests now and analyse like a fucking genius (i ain't that)

i'll need to write french characters mostly so i'll only try these out, if easy enough i'll probably hardcode it

me good engineer

$ gcc -o lol lol.c
$ ./lol e
101/0x65
$ ./lol é
195/0xc3 169/0xa9
$ ./lol è
195/0xc3 168/0xa8
$ ./lol ê
195/0xc3 170/0xaa
$ ./lol ë
195/0xc3 171/0xab
$ ./lol à
195/0xc3 160/0xa0
$ ./lol â
195/0xc3 162/0xa2
$ ./lol ù
195/0xc3 185/0xb9
$ ./lol ô
195/0xc3 180/0xb4

seems pretty clear now that 0xc3 is the cool letter page huh?
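
why 0xc3 every time: in a two-byte sequence the first byte carries the top 5 bits of the code point and the second byte the low 6. 0xc3 gives 00011, so everything behind it lands in U+00C0..U+00FF, which is where the french accents live. a small sketch of the decode, with utf8_decode2 being a made-up name:

```c
#include <stdint.h>

/* Hypothetical helper: decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx)
 * into its code point. Returns 0 if the bytes don't form a valid
 * two-byte sequence. */
uint32_t utf8_decode2(uint8_t lead, uint8_t cont)
{
    if ((lead & 0xE0) != 0xC0 || (cont & 0xC0) != 0x80)
        return 0;

    /* 5 payload bits from the lead byte, 6 from the continuation byte */
    uint32_t cp = ((uint32_t)(lead & 0x1F) << 6) | (uint32_t)(cont & 0x3F);

    /* anything below 0x80 should have been a single byte (overlong) */
    return cp >= 0x80 ? cp : 0;
}
```

with the bytes from the run above, 0xc3 0xa9 comes out as 0xe9, which is U+00E9 (é).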

let's duck this shit

https://www.utf8-chartable.de/

yeah they be latin small and big letter shit

i don't give a fuck about big letters, but i'm down for the small

obvious solution in my eyes is to have a separate tileset for unicode codepoints and use them in my rendering (since i use bitmap fonts, they be cooler for rendering tricks)
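
one way the separate tileset idea could look, purely as a sketch: the layout below (128 ascii tiles, then 32 extra tiles for U+00E0..U+00FF, the lowercase half of that page) and the names tile_index / EXTRA_TILE_BASE are all assumptions, not the actual font format.

```c
#include <stdint.h>

/* Assumed layout, made up for illustration: tiles 0..127 are ascii,
 * tiles 128..159 cover U+00E0..U+00FF (small accented letters, plus the
 * division sign at U+00F7 that sits in the same range). */
#define EXTRA_TILE_BASE 128u

/* Map a decoded code point to a tile index; anything the font doesn't
 * cover falls back to the '?' tile. */
unsigned tile_index(uint32_t cp)
{
    if (cp < 0x80)
        return (unsigned)cp;
    if (cp >= 0xE0 && cp <= 0xFF)
        return EXTRA_TILE_BASE + (unsigned)(cp - 0xE0);
    return (unsigned)'?';
}
```

big letters (U+00C0..U+00DF) hit the fallback on purpose here, matching the "small letters only" decision.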

so yeah uh... that's the end of the unicode tale i guess. i didn't learn much kek

i'm out