annotate src/libvterm/t/03encoding_utf8.test @ 18478:94223687df0e

Added tag v8.1.2233 for changeset e93cab5d0f0f27fad7882f1f412927df055b090d
author Bram Moolenaar <Bram@vim.org>
date Tue, 29 Oct 2019 04:30:05 +0100
parents b8299e742f41
11621
b8299e742f41 patch 8.0.0693: no terminal emulator support
Christian Brabandt <cb@256bit.org>
parents:
diff changeset
INIT
WANTENCODING

!Low
ENCIN "123"
  encout 0x31,0x32,0x33

# We want to prove the UTF-8 parser correctly handles all the sequences.
# Easy way to do this is to check it does low/high boundary cases, as that
# leaves only two for each sequence length
#
# These ranges are therefore:
#
# Two bytes:
# U+0080 = 000 10000000 => 00010 000000
#                       => 11000010 10000000 = C2 80
# U+07FF = 111 11111111 => 11111 111111
#                       => 11011111 10111111 = DF BF
#
# Three bytes:
# U+0800 = 00001000 00000000 => 0000 100000 000000
#                            => 11100000 10100000 10000000 = E0 A0 80
# U+FFFD = 11111111 11111101 => 1111 111111 111101
#                            => 11101111 10111111 10111101 = EF BF BD
# (We avoid U+FFFE and U+FFFF as they're invalid codepoints)
#
# Four bytes:
# U+10000 = 00001 00000000 00000000 => 000 010000 000000 000000
#                                   => 11110000 10010000 10000000 10000000 = F0 90 80 80
# U+1FFFFF = 11111 11111111 11111111 => 111 111111 111111 111111
#                                    => 11110111 10111111 10111111 10111111 = F7 BF BF BF
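
The bit-packing worked out in the comments above can be reproduced mechanically. The following is a minimal Python sketch, not part of libvterm or its test harness. The helper name `encode_utf8` is hypothetical, and it deliberately performs no range checks, so it can also produce the out-of-Unicode-range F7 BF BF BF case used by the test.

```python
# Hypothetical helper: pack a code point into an n-byte UTF-8 style sequence.
# No validity checks are done on purpose, so boundary cases can be generated.
LEAD_PREFIX = {2: 0xC0, 3: 0xE0, 4: 0xF0}  # 110xxxxx, 1110xxxx, 11110xxx


def encode_utf8(cp: int, nbytes: int) -> bytes:
    if nbytes == 1:
        return bytes([cp & 0x7F])
    out = []
    # Peel off six payload bits at a time, lowest bits first.
    for _ in range(nbytes - 1):
        out.append(0x80 | (cp & 0x3F))  # continuation byte 10xxxxxx
        cp >>= 6
    # Whatever is left goes into the lead byte.
    out.append(LEAD_PREFIX[nbytes] | cp)
    return bytes(reversed(out))


if __name__ == "__main__":
    for cp, n in [(0x0080, 2), (0x07FF, 2), (0x0800, 3),
                  (0xFFFD, 3), (0x10000, 4), (0x1FFFFF, 4)]:
        print(f"U+{cp:04X} ->", encode_utf8(cp, n).hex(" ").upper())
```

Run as a script, this prints the same six byte sequences the comments derive by hand (C2 80, DF BF, E0 A0 80, EF BF BD, F0 90 80 80, F7 BF BF BF).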

!2 byte
ENCIN "\xC2\x80\xDF\xBF"
  encout 0x0080, 0x07FF

!3 byte
ENCIN "\xE0\xA0\x80\xEF\xBF\xBD"
  encout 0x0800,0xFFFD

!4 byte
ENCIN "\xF0\x90\x80\x80\xF7\xBF\xBF\xBF"
  encout 0x10000,0x1fffff

# Next up, we check some invalid sequences
#  + Early termination (back to low bytes too soon)
#  + Early restart (another sequence introduction before the previous one was finished)

!Early termination
ENCIN "\xC2!"
  encout 0xfffd,0x21

ENCIN "\xE0!\xE0\xA0!"
  encout 0xfffd,0x21,0xfffd,0x21

ENCIN "\xF0!\xF0\x90!\xF0\x90\x80!"
  encout 0xfffd,0x21,0xfffd,0x21,0xfffd,0x21

!Early restart
ENCIN "\xC2\xC2\x90"
  encout 0xfffd,0x0090

ENCIN "\xE0\xC2\x90\xE0\xA0\xC2\x90"
  encout 0xfffd,0x0090,0xfffd,0x0090

ENCIN "\xF0\xC2\x90\xF0\x90\xC2\x90\xF0\x90\x80\xC2\x90"
  encout 0xfffd,0x0090,0xfffd,0x0090,0xfffd,0x0090
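
The early-termination and early-restart expectations above can be pictured as a small state machine that counts the continuation bytes it still expects. The sketch below is an illustration in Python under that assumption, not libvterm's actual decoder. The function name `decode` is hypothetical, and the sketch does not reject overlong forms or surrogates.

```python
REPLACEMENT = 0xFFFD  # emitted once for every abandoned sequence


def decode(data: bytes) -> list:
    """Hypothetical decoder: returns code points, U+FFFD for broken sequences."""
    out = []
    remaining = 0  # continuation bytes still expected
    cp = 0         # payload bits collected so far
    for b in data:
        if b < 0x80:
            if remaining:            # early termination: ASCII arrived too soon
                out.append(REPLACEMENT)
                remaining = 0
            out.append(b)
        elif b < 0xC0:               # continuation byte 10xxxxxx
            if not remaining:        # continuation with nothing pending
                out.append(REPLACEMENT)
                continue
            cp = (cp << 6) | (b & 0x3F)
            remaining -= 1
            if remaining == 0:
                out.append(cp)
        else:                        # lead byte
            if remaining:            # early restart: previous sequence unfinished
                out.append(REPLACEMENT)
                remaining = 0
            if b < 0xE0:
                remaining, cp = 1, b & 0x1F
            elif b < 0xF0:
                remaining, cp = 2, b & 0x0F
            elif b < 0xF8:
                remaining, cp = 3, b & 0x07
            else:                    # 0xF8..0xFF never start a sequence here
                out.append(REPLACEMENT)
    # A sequence still pending at end of input is silently dropped in this sketch.
    return out
```

For example, `decode(b"\xC2!")` yields `[0xFFFD, 0x21]` and `decode(b"\xC2\xC2\x90")` yields `[0xFFFD, 0x90]`, matching the first case of each group.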

# Test the overlong sequences by giving an overlong encoding of U+0000 and
# of the highest codepoint that is still too small to need that length
#
# Two bytes:
# U+0000 = C0 80
# U+007F = 000 01111111 => 00001 111111
#                       => 11000001 10111111 = C1 BF
#
# Three bytes:
# U+0000 = E0 80 80
# U+07FF = 00000111 11111111 => 0000 011111 111111
#                            => 11100000 10011111 10111111 = E0 9F BF
#
# Four bytes:
# U+0000 = F0 80 80 80
# U+FFFF = 11111111 11111111 => 000 001111 111111 111111
#                            => 11110000 10001111 10111111 10111111 = F0 8F BF BF

!Overlong
ENCIN "\xC0\x80\xC1\xBF"
  encout 0xfffd,0xfffd

ENCIN "\xE0\x80\x80\xE0\x9F\xBF"
  encout 0xfffd,0xfffd

ENCIN "\xF0\x80\x80\x80\xF0\x8F\xBF\xBF"
  encout 0xfffd,0xfffd
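
One way to express the overlong rule is to compare the decoded payload against the smallest code point that genuinely needs a sequence of that length. The Python sketch below illustrates this. The names `MIN_CODEPOINT`, `raw_value` and `is_overlong` are hypothetical, and each input is assumed to be exactly one structurally complete sequence of two to four bytes.

```python
# Smallest code point that actually requires each sequence length.
MIN_CODEPOINT = {2: 0x0080, 3: 0x0800, 4: 0x10000}
# Payload bits carried by the lead byte for each sequence length.
LEAD_MASK = {2: 0x1F, 3: 0x0F, 4: 0x07}


def raw_value(seq: bytes) -> int:
    """Extract the payload bits of one 2- to 4-byte sequence."""
    cp = seq[0] & LEAD_MASK[len(seq)]
    for b in seq[1:]:
        cp = (cp << 6) | (b & 0x3F)
    return cp


def is_overlong(seq: bytes) -> bool:
    """True if the value would have fit in a shorter sequence."""
    return raw_value(seq) < MIN_CODEPOINT[len(seq)]
```

With these definitions, `is_overlong(b"\xC1\xBF")` and `is_overlong(b"\xF0\x8F\xBF\xBF")` are true, while the lowest valid encodings `b"\xC2\x80"`, `b"\xE0\xA0\x80"` and `b"\xF0\x90\x80\x80"` are not flagged.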

# UTF-16 surrogates U+D800 and U+DFFF
!UTF-16 Surrogates
ENCIN "\xED\xA0\x80\xED\xBF\xBF"
  encout 0xfffd,0xfffd
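
To see why these two inputs are the boundary cases, the three-byte payload can be unpacked by hand and compared with the surrogate range. The Python sketch below does this. The helper names `decode3` and `is_surrogate` are hypothetical and are not taken from libvterm.

```python
def decode3(seq: bytes) -> int:
    """Unpack the 4 + 6 + 6 payload bits of a three-byte sequence."""
    b0, b1, b2 = seq
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)


def is_surrogate(cp: int) -> bool:
    """UTF-16 surrogate halves are not valid scalar values in UTF-8."""
    return 0xD800 <= cp <= 0xDFFF


if __name__ == "__main__":
    for seq in (b"\xED\xA0\x80", b"\xED\xBF\xBF"):
        cp = decode3(seq)
        print(f"{seq.hex(' ').upper()} -> U+{cp:04X}, surrogate: {is_surrogate(cp)}")
```

`ED A0 80` unpacks to U+D800 and `ED BF BF` to U+DFFF, the first and last code points of the surrogate block, which is why a decoder that rejects surrogates reports U+FFFD for both.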

!Split write
ENCIN "\xC2"
ENCIN "\xA0"
  encout 0x000A0

ENCIN "\xE0"
ENCIN "\xA0\x80"
  encout 0x00800
ENCIN "\xE0\xA0"
ENCIN "\x80"
  encout 0x00800

ENCIN "\xF0"
ENCIN "\x90\x80\x80"
  encout 0x10000
ENCIN "\xF0\x90"
ENCIN "\x80\x80"
  encout 0x10000
ENCIN "\xF0\x90\x80"
ENCIN "\x80"
  encout 0x10000
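
The split-write cases require the decoder to keep its partial sequence between calls. A minimal way to sketch that in Python is to hold the pending state on an object instead of in local variables. The class name `Utf8Stream` and its `feed` method are hypothetical, this is not libvterm's API, and overlong and surrogate checks are left out for brevity.

```python
REPLACEMENT = 0xFFFD


class Utf8Stream:
    """Hypothetical streaming decoder that remembers a partial sequence."""

    def __init__(self):
        self.remaining = 0  # continuation bytes still expected
        self.cp = 0         # payload bits collected so far

    def feed(self, data: bytes) -> list:
        out = []
        for b in data:
            if 0x80 <= b < 0xC0 and self.remaining:
                # Continuation byte extending the pending sequence.
                self.cp = (self.cp << 6) | (b & 0x3F)
                self.remaining -= 1
                if self.remaining == 0:
                    out.append(self.cp)
                continue
            if self.remaining:
                # Pending sequence interrupted by a non-continuation byte.
                out.append(REPLACEMENT)
                self.remaining = 0
            if b < 0x80:
                out.append(b)
            elif 0xC0 <= b < 0xE0:
                self.remaining, self.cp = 1, b & 0x1F
            elif 0xE0 <= b < 0xF0:
                self.remaining, self.cp = 2, b & 0x0F
            elif 0xF0 <= b < 0xF8:
                self.remaining, self.cp = 3, b & 0x07
            else:
                # Stray continuation byte or invalid lead byte.
                out.append(REPLACEMENT)
        return out
```

Feeding `b"\xF0\x90"` returns an empty list because the sequence is still incomplete, and a following `feed(b"\x80\x80")` on the same object returns `[0x10000]`.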