# learning and documenting UTF-8 like a bozo

starting off strongly with da goat, [da wikipedia page](https://en.wikipedia.org/wiki/UTF-8)

it says i can be stupid when needed
> It was designed for backward compatibility with ASCII

which i like to see, i'm stupid after all

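so, first i need to tell ascii apart from everything else. if a byte is <128 it's plain ascii (one byte, same value as the code point), otherwise it's part of a multi-byte sequence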

right? right.
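
a rough sketch of that check, assuming the standard UTF-8 bit layout from the wikipedia page (`utf8_seq_len` is a name i made up for illustration, it's not from lol.c). it only looks at the top bits of the first byte, and it does not reject the lead bytes UTF-8 never uses (0xc0, 0xc1, 0xf5 and up):

```c
#include <stdint.h>

/* Guess the length of a UTF-8 sequence from its first byte only.
 * Returns 0 for a byte that cannot start a sequence
 * (a continuation byte 10xxxxxx, or 11111xxx).
 * Hypothetical helper: it does not validate the bytes that follow. */
int utf8_seq_len(uint8_t b)
{
    if (b < 0x80)           return 1; /* 0xxxxxxx: plain ASCII    */
    if ((b & 0xE0) == 0xC0) return 2; /* 110xxxxx: 2-byte lead    */
    if ((b & 0xF0) == 0xE0) return 3; /* 1110xxxx: 3-byte lead    */
    if ((b & 0xF8) == 0xF0) return 4; /* 11110xxx: 4-byte lead    */
    return 0;                         /* continuation or invalid  */
}
```

so `utf8_seq_len(0x65)` gives 1 and anything that looks like `10xxxxxx` gives 0, which is how you know you landed in the middle of a character.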

reading, reading... ok i'm bored

let's write some stupid ass C program that reads a line and displays its bytes.
yeah, you right, i'm rewriting hexdump.
will take 5 minutes, stop complaining and shut the fuck up please

thank u ;3 brb

## wrote da lol.c

well why readline when i can use argc/argv? lol
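
lol.c isn't pasted in this README, so here's a guess at what its core could look like, not the real file. `format_bytes` is a made-up name; it builds the same `decimal/0xhex` format as the output further down, into a caller-supplied buffer:

```c
#include <stdio.h>
#include <stddef.h>

/* Write every byte of s into out as "decimal/0xhex", space separated.
 * Returns the number of characters written (without the '\0'),
 * or -1 if out is too small. Hypothetical reconstruction of lol.c. */
int format_bytes(const char *s, char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Go through unsigned char: plain char may be signed, and
         * bytes >= 0x80 would then come out as negative numbers. */
        unsigned char b = (unsigned char)s[i];
        int n = snprintf(out + used, cap - used, "%s%u/0x%02x",
                         i ? " " : "", (unsigned)b, (unsigned)b);
        if (n < 0 || (size_t)n >= cap - used)
            return -1; /* output would have been truncated */
        used += (size_t)n;
    }
    return (int)used;
}
```

a `main` would just check `argc`, call this on `argv[1]` with some buffer and `puts` the result. the terminal hands the argument over as raw bytes, so no decoding is needed to see them.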

let's do some tests now and analyse like a fucking genius (i ain't that)

i'll need to write french characters mostly so i'll only try these out, if easy enough i'll probably hardcode it

me good engineer

```sh
$ gcc -o lol lol.c
$ ./lol e
101/0x65
$ ./lol é
195/0xc3 169/0xa9
$ ./lol è
195/0xc3 168/0xa8
$ ./lol ê
195/0xc3 170/0xaa
$ ./lol ë
195/0xc3 171/0xab
$ ./lol à
195/0xc3 160/0xa0
$ ./lol â
195/0xc3 162/0xa2
$ ./lol ù
195/0xc3 185/0xb9
$ ./lol ô
195/0xc3 180/0xb4
```

seems pretty clear now that 0xc3 is the cool letter page, uh? every accented letter i tried is 0xc3 followed by exactly one more byte
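
to see why 0xc3 keeps showing up, here's a sketch of decoding a 2-byte sequence by hand, assuming the standard `110xxxxx 10xxxxxx` layout (`utf8_decode2` is a name made up for this example). the lead byte carries the top 5 bits of the code point and the second byte carries the low 6 bits:

```c
#include <stdint.h>

/* Decode one 2-byte UTF-8 sequence (110xxxxx 10xxxxxx).
 * Returns the code point, or 0xFFFD (the replacement character)
 * when the two bytes are not a valid 2-byte sequence.
 * Hypothetical helper, only handles the 2-byte case. */
uint32_t utf8_decode2(uint8_t lead, uint8_t cont)
{
    if ((lead & 0xE0) != 0xC0 || (cont & 0xC0) != 0x80)
        return 0xFFFD;

    /* 5 payload bits from the lead byte, 6 from the continuation */
    uint32_t cp = ((uint32_t)(lead & 0x1F) << 6) | (uint32_t)(cont & 0x3F);

    /* anything below 0x80 fits in one byte, so this is an overlong encoding */
    if (cp < 0x80)
        return 0xFFFD;
    return cp;
}
```

for é that's `(0x03 << 6) | 0x29 = 0xe9`, so U+00E9. with 0xc3 as the lead byte the top bits are always `0xc0`, which means that one byte covers U+00C0 through U+00FF.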

let's duck this shit

> https://www.utf8-chartable.de/

yeah, per the table the 0xc3 page is all "latin small letter" and "latin capital letter" shit

i don't give a fuck about big letters, but i'm down for the small

obvious solution in my eyes is to have a separate tileset for unicode codepoints and use them in my rendering (since i use bitmap fonts, they be cooler for rendering tricks)
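
a minimal sketch of that lookup, with everything in it assumed: the two tileset ids, the `tile_ref` struct and the layout (one tile per code point, accented tiles stored in code point order starting at U+00E0) are invented for illustration and are not from any real renderer of mine. note the U+00E0..U+00FF range also contains ÷ at U+00F7, which isn't a letter:

```c
#include <stdint.h>

/* Hypothetical tileset ids: one bitmap sheet for ASCII,
 * one for the small accented letters U+00E0..U+00FF. */
enum { TILESET_ASCII = 0, TILESET_LATIN1_SMALL = 1 };

struct tile_ref {
    int tileset; /* which bitmap sheet to draw from */
    int index;   /* tile number inside that sheet   */
};

/* Map a decoded code point to a tile. Anything outside the two
 * supported ranges falls back to the ASCII '?' tile. */
struct tile_ref tile_for_codepoint(uint32_t cp)
{
    struct tile_ref t;

    if (cp < 0x80) {
        t.tileset = TILESET_ASCII;
        t.index = (int)cp;
    } else if (cp >= 0xE0 && cp <= 0xFF) {
        t.tileset = TILESET_LATIN1_SMALL;
        t.index = (int)(cp - 0xE0);
    } else {
        t.tileset = TILESET_ASCII;
        t.index = '?';
    }
    return t;
}
```

with that layout é (U+00E9) lands on tile 9 of the second sheet, and hardcoding the french letters just means drawing those tiles.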

so yeah uh... that's the end of the unicode tale i guess. i didn't learn much kek

[i'm out](out.gif)