*mbyte.txt*     For Vim version 7.4. Last change: 2013 May 18


                  VIM REFERENCE MANUAL    by Bram Moolenaar et al.


Multi-byte support                              *multibyte* *multi-byte*
                                                *Chinese* *Japanese* *Korean*
This is about editing text in languages which have many characters that can
not be represented using one byte (one octet). Examples are Chinese, Japanese
and Korean. Unicode is also covered here.

For an introduction to the most common features, see |usr_45.txt| in the user
manual.
For changing the language of messages and menus see |mlang.txt|.

{not available when compiled without the |+multi_byte| feature}


1.  Getting started             |mbyte-first|
2.  Locale                      |mbyte-locale|
3.  Encoding                    |mbyte-encoding|
4.  Using a terminal            |mbyte-terminal|
5.  Fonts on X11                |mbyte-fonts-X11|
6.  Fonts on MS-Windows         |mbyte-fonts-MSwin|
7.  Input on X11                |mbyte-XIM|
8.  Input on MS-Windows         |mbyte-IME|
9.  Input with a keymap         |mbyte-keymap|
10. Using UTF-8                 |mbyte-utf8|
11. Overview of options         |mbyte-options|

NOTE: This file contains UTF-8 characters. These may show up as strange
characters or boxes when using another encoding.

==============================================================================
1. Getting started                                      *mbyte-first*

This is a summary of the multibyte features in Vim. If you are lucky it works
as described and you can start using Vim without much trouble. If something
doesn't work you will have to read the rest. Don't be surprised if it takes
quite a bit of work and experimenting to make Vim use all the multi-byte
features. Unfortunately, every system has its own way to deal with multibyte
languages and it is quite complicated.


COMPILING

If you already have a compiled Vim program, check if the |+multi_byte| feature
is included. The |:version| command can be used for this.

If +multi_byte is not included, you should compile Vim with "normal", "big" or
"huge" features. You can further tune what features are included. See the
INSTALL files in the source directory.


LOCALE

First of all, you must make sure your current locale is set correctly. If
your system has been installed to use the language, it probably works right
away. If not, you can often make it work by setting the $LANG environment
variable in your shell: >

        setenv LANG ja_JP.EUC

Unfortunately, the name of the locale depends on your system. Japanese might
also be called "ja_JP.EUCjp" or just "ja". To see what is currently used: >

        :language

To change the locale inside Vim use: >

        :language ja_JP.EUC

Vim will give an error message if this doesn't work. This is a good way to
experiment and find the locale name you want to use. But it's always better
to set the locale in the shell, so that it is used right from the start.
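
The same kind of experiment can be scripted outside Vim. The sketch below is
in Python rather than Vim script, and "locale_is_available" is a hypothetical
helper name. It only assumes that Python's locale.setlocale() asks the same C
library setlocale() that decides which locale names your system accepts:

```python
import locale


def locale_is_available(name):
    """Return True if the C library accepts `name` for LC_CTYPE."""
    previous = locale.setlocale(locale.LC_CTYPE)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Put the original setting back so the check has no side effects.
        locale.setlocale(locale.LC_CTYPE, previous)


# Candidate spellings taken from the text above. Which of them exist
# depends entirely on the system, so no particular result is expected.
for candidate in ("ja_JP.EUC", "ja_JP.EUCjp", "ja"):
    print(candidate, locale_is_available(candidate))
```

A name that is reported as available here is a reasonable first candidate for
$LANG or the |:language| command.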

See |mbyte-locale| for details.


ENCODING

If your locale works properly, Vim will try to set the 'encoding' option
accordingly. If this doesn't work you can overrule its value: >

        :set encoding=utf-8

See |encoding-values| for a list of acceptable values.

The result is that all the text that is used inside Vim will be in this
encoding. Not only the text in the buffers, but also in registers, variables,
etc. This also means that changing the value of 'encoding' makes the existing
text invalid! The text doesn't change, but it will be displayed wrong.

You can edit files in an encoding other than the one 'encoding' is set to.
Vim will convert the file when you read it and convert it back when you write
it. See 'fileencoding', 'fileencodings' and |++enc|.


DISPLAY AND FONTS

If you are working in a terminal (emulator) you must make sure it accepts the
same encoding that Vim is working with. If this is not the case, you can use
the 'termencoding' option to make Vim convert text automatically.

For the GUI you must select fonts that work with the current 'encoding'. This
is the difficult part. It depends on the system you are using, the locale and
a few other things. See the chapters on fonts: |mbyte-fonts-X11| for
X-Windows and |mbyte-fonts-MSwin| for MS-Windows.

For GTK+ 2, you can skip most of this section. The option 'guifontset' no
longer exists. You only need to set 'guifont' and everything should "just
work". If your system comes with Xft2 and fontconfig and the current font
does not contain a certain glyph, a different font will be used automatically
if available. The 'guifontwide' option is still supported but usually you do
not need to set it. It is only necessary if the automatic font selection does
not suit your needs.

For X11 you can set the 'guifontset' option to a list of fonts that together
cover the characters that are used. Example for Korean: >

        :set guifontset=k12,r12

Alternatively, you can set 'guifont' and 'guifontwide'. 'guifont' is used for
the single-width characters, 'guifontwide' for the double-width characters.
Thus the 'guifontwide' font must be exactly twice as wide as 'guifont'.
Example for UTF-8: >

        :set guifont=-misc-fixed-medium-r-normal-*-18-120-100-100-c-90-iso10646-1
        :set guifontwide=-misc-fixed-medium-r-normal-*-18-120-100-100-c-180-iso10646-1

You can also set 'guifont' alone; Vim will then try to find a matching
'guifontwide' for you.


INPUT

There are several ways to enter multi-byte characters:
- For X11 XIM can be used. See |XIM|.
- For MS-Windows IME can be used. See |IME|.
- For all systems keymaps can be used. See |mbyte-keymap|.

The options 'iminsert', 'imsearch' and 'imcmdline' can be used to choose
the different input methods or disable them temporarily.

==============================================================================
2. Locale                                               *mbyte-locale*

The easiest setup is when your whole system uses the locale you want to work
in. But it's also possible to set the locale for one shell you are working
in, or just use a certain locale inside Vim.


WHAT IS A LOCALE?                                       *locale*

There are many languages in the world, and at least as many different
cultures and environments. A linguistic environment corresponding to an area
is called a "locale". This includes information about the language used, the
charset, collating order for sorting, date format, currency format and so on.
For Vim only the language and charset really matter.

You can only use a locale if your system has support for it. Some systems
have only a few locales, especially in the USA. The language which you want
to use may not be on your system. In that case you might be able to install
it as an extra package. Check your system documentation for how to do that.

The location in which the locales are installed varies from system to system.
For example, "/usr/share/locale" or "/usr/lib/locale". See your system's
setlocale() man page.

Looking in these directories will show you the exact name of each locale.
Mostly upper/lowercase matters, thus "ja_JP.EUC" and "ja_jp.euc" are
different. Some systems have a locale.alias file, which allows translation
from a short name like "nl" to the full name "nl_NL.ISO_8859-1".

Note that X-windows has its own locale stuff, and unfortunately it uses locale
names different from what is used elsewhere. This is confusing! For Vim it
matters what the setlocale() function uses, which is generally NOT the
X-windows stuff. You might have to do some experiments to find out what
really works.

                                                        *locale-name*
The (simplified) format of a |locale| name is:

        language
or      language_territory
or      language_territory.codeset

Territory means the country (or part of it), codeset means the |charset|. For
example, the locale name "ja_JP.eucJP" means:
        ja      the language is Japanese
        JP      the country is Japan
        eucJP   the codeset is EUC-JP
But it also could be "ja", "ja_JP.EUC", "ja_JP.ujis", etc. And unfortunately,
the locale name for a specific language, territory and codeset is not unified
and depends on your system.
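
To make the simplified format concrete, here is a small Python sketch that
splits a locale name into its three parts. The function name
"split_locale_name" is made up for this illustration, and modifiers such as
"@euro" are deliberately not handled, since the simplified format above does
not cover them:

```python
def split_locale_name(name):
    """Split 'language[_territory][.codeset]' into a 3-tuple.

    Parts that are absent from the name are returned as None.
    """
    base, _, codeset = name.partition(".")
    language, _, territory = base.partition("_")
    return language, territory or None, codeset or None


print(split_locale_name("ja_JP.eucJP"))  # ('ja', 'JP', 'eucJP')
print(split_locale_name("ja"))           # ('ja', None, None)
```

The split only follows the syntax. It cannot tell you that "ja_JP.eucJP" and
"ja_JP.ujis" name the same codeset, because that mapping is system specific.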

Examples of locale names:
    charset     language                locale name ~
    GB2312      Chinese (simplified)    zh_CN.EUC, zh_CN.GB2312
    Big5        Chinese (traditional)   zh_TW.BIG5, zh_TW.Big5
    CNS-11643   Chinese (traditional)   zh_TW
    EUC-JP      Japanese                ja, ja_JP.EUC, ja_JP.ujis, ja_JP.eucJP
    Shift_JIS   Japanese                ja_JP.SJIS, ja_JP.Shift_JIS
    EUC-KR      Korean                  ko, ko_KR.EUC


USING A LOCALE

To start using a locale for the whole system, see the documentation of your
system. Mostly you need to set it in a configuration file in "/etc".

To use a locale in a shell, set the $LANG environment variable. When you want
to use Korean and the |locale| name is "ko", do this:

    sh:    export LANG=ko
    csh:   setenv LANG ko

You can put this in your ~/.profile or ~/.cshrc file to always use it.

To use a locale in Vim only, use the |:language| command: >

        :language ko

Put this in your ~/.vimrc file to use it always.

Or specify $LANG when starting Vim:

    sh:    LANG=ko vim {vim-arguments}
    csh:   env LANG=ko vim {vim-arguments}

You could make a small shell script for this.

==============================================================================
3. Encoding                                             *mbyte-encoding*

Vim uses the 'encoding' option to specify how characters are identified and
encoded when they are used inside Vim. This applies to all the places where
text is used, including buffers (files loaded into memory), registers and
variables.

                                                        *charset* *codeset*
Charset is another name for encoding. There are subtle differences, but these
don't matter when using Vim. "codeset" is another similar name.

Each character is encoded as one or more bytes. When all characters are
encoded with one byte, we call this a single-byte encoding. The most often
used one is called "latin1". This limits the number of characters to 256.
Some of these are control characters, thus even fewer can be used for text.

When some characters use two or more bytes, we call this a multi-byte
encoding. This allows using many more than 256 characters, which is required
for most East Asian languages.

Most multi-byte encodings use one byte for the first 128 characters. These
are equal to ASCII, which makes it easy to exchange plain-ASCII text, no
matter what language is used. Thus you might see the right text even when the
encoding was set wrong.
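
The ASCII compatibility is easy to check for yourself. The following Python
sketch is only an illustration: it uses the codecs from Python's standard
library, not Vim's own conversion code, and "encoded_length" is a name chosen
for this example:

```python
def encoded_length(text, codec):
    """Number of bytes `text` occupies when encoded with `codec`."""
    return len(text.encode(codec))


CODECS = ("latin-1", "euc_kr", "euc_jp", "shift_jis", "big5", "utf-8")

# Plain ASCII text yields identical bytes in every one of these encodings.
for codec in CODECS:
    assert "Vim".encode(codec) == b"Vim"

# A non-ASCII character needs more than one byte, and how many depends on
# the encoding. U+D55C is a Korean syllable.
print(encoded_length("\ud55c", "euc_kr"))  # 2
print(encoded_length("\ud55c", "utf-8"))   # 3
```

This is why an ASCII-only file looks right under almost any 'encoding', while
the first non-ASCII character reveals a wrong setting.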

                                                        *encoding-names*
Vim can use many different character encodings. There are three major groups:

1   8bit        Single-byte encodings, 256 different characters. Mostly used
                in USA and Europe. Example: ISO-8859-1 (Latin1). All
                characters occupy one screen cell only.

2   2byte       Double-byte encodings, over 10000 different characters.
                Mostly used in Asian countries. Example: euc-kr (Korean)
                The number of screen cells is equal to the number of bytes
                (except for euc-jp when the first byte is 0x8e).

u   Unicode     Universal encoding, can replace all others. ISO 10646.
                Millions of different characters. Example: UTF-8. The
                relation between bytes and screen cells is complex.

Other encodings cannot be used by Vim internally. But files in other
encodings can be edited by using conversion, see 'fileencoding'.
Note that all encodings must use ASCII for the characters up to 128 (except
when compiled for EBCDIC).

Supported 'encoding' values are:                        *encoding-values*
1   latin1      8-bit characters (ISO 8859-1, also used for cp1252)
1   iso-8859-n  ISO_8859 variant (n = 2 to 15)
1   koi8-r      Russian
1   koi8-u      Ukrainian
1   macroman    MacRoman (Macintosh encoding)
1   8bit-{name} any 8-bit encoding (Vim specific name)
1   cp437       similar to iso-8859-1
1   cp737       similar to iso-8859-7
1   cp775       Baltic
1   cp850       similar to iso-8859-4
1   cp852       similar to iso-8859-1
1   cp855       similar to iso-8859-2
1   cp857       similar to iso-8859-5
1   cp860       similar to iso-8859-9
1   cp861       similar to iso-8859-1
1   cp862       similar to iso-8859-1
1   cp863       similar to iso-8859-8
1   cp865       similar to iso-8859-1
1   cp866       similar to iso-8859-5
1   cp869       similar to iso-8859-7
1   cp874       Thai
1   cp1250      Czech, Polish, etc.
1   cp1251      Cyrillic
1   cp1253      Greek
1   cp1254      Turkish
1   cp1255      Hebrew
1   cp1256      Arabic
1   cp1257      Baltic
1   cp1258      Vietnamese
1   cp{number}  MS-Windows: any installed single-byte codepage
2   cp932       Japanese (Windows only)
2   euc-jp      Japanese (Unix only)
2   sjis        Japanese (Unix only)
2   cp949       Korean (Unix and Windows)
2   euc-kr      Korean (Unix only)
2   cp936       simplified Chinese (Windows only)
2   euc-cn      simplified Chinese (Unix only)
2   cp950       traditional Chinese (on Unix alias for big5)
2   big5        traditional Chinese (on Windows alias for cp950)
2   euc-tw      traditional Chinese (Unix only)
2   2byte-{name} Unix: any double-byte encoding (Vim specific name)
2   cp{number}  MS-Windows: any installed double-byte codepage
u   utf-8       32 bit UTF-8 encoded Unicode (ISO/IEC 10646-1)
u   ucs-2       16 bit UCS-2 encoded Unicode (ISO/IEC 10646-1)
u   ucs-2le     like ucs-2, little endian
u   utf-16      ucs-2 extended with double-words for more characters
u   utf-16le    like utf-16, little endian
u   ucs-4       32 bit UCS-4 encoded Unicode (ISO/IEC 10646-1)
u   ucs-4le     like ucs-4, little endian

The {name} can be any encoding name that your system supports. It is passed
to iconv() to convert between the encoding of the file and the current locale.
For MS-Windows "cp{number}" means using codepage {number}.
Examples: >
        :set encoding=8bit-cp1252
        :set encoding=2byte-cp932

The MS-Windows codepage 1252 is very similar to latin1. For practical reasons
the same encoding is used and it's called latin1. 'isprint' can be used to
display the characters 0x80 - 0xA0 or not.

Several aliases can be used, they are translated to one of the names above.
An incomplete list:

1   ansi        same as latin1 (obsolete, for backward compatibility)
2   japan       Japanese: on Unix "euc-jp", on MS-Windows cp932
2   korea       Korean: on Unix "euc-kr", on MS-Windows cp949
2   prc         simplified Chinese: on Unix "euc-cn", on MS-Windows cp936
2   chinese     same as "prc"
2   taiwan      traditional Chinese: on Unix "euc-tw", on MS-Windows cp950
u   utf8        same as utf-8
u   unicode     same as ucs-2
u   ucs2be      same as ucs-2 (big endian)
u   ucs-2be     same as ucs-2 (big endian)
u   ucs-4be     same as ucs-4 (big endian)
u   utf-32      same as ucs-4
u   utf-32le    same as ucs-4le
    default     stands for the default value of 'encoding', depends on the
                environment

For the UCS codes the byte order matters. This is tricky, use UTF-8 whenever
you can. The default is to use big-endian (most significant byte comes
first):
    name        bytes           char ~
    ucs-2             11 22         1122
    ucs-2le           22 11         1122
    ucs-4       11 22 33 44     11223344
    ucs-4le     44 33 22 11     11223344

On MS-Windows systems you often want to use "ucs-2le", because it uses little
endian UCS-2.
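
The byte orders can be reproduced with a few lines of Python. This is a sketch
under one assumption: Python's utf-16 and utf-32 codecs are used as stand-ins
for ucs-2 and ucs-4, which is safe for the character 0x1122 because it lies in
the range where the encodings produce identical bytes. The helper name
"hex_bytes" is made up for this example:

```python
def hex_bytes(data):
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{byte:02x}" for byte in data)


ch = "\u1122"  # the character with code point 0x1122

print(hex_bytes(ch.encode("utf-16-be")))  # 11 22        (like ucs-2)
print(hex_bytes(ch.encode("utf-16-le")))  # 22 11        (like ucs-2le)
print(hex_bytes(ch.encode("utf-32-be")))  # 00 00 11 22  (like ucs-4)
print(hex_bytes(ch.encode("utf-32-le")))  # 22 11 00 00  (like ucs-4le)
```

The same character gives four different byte sequences, which is why a reader
that guesses the wrong byte order shows garbage instead of text.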

There are a few encodings which are similar, but not exactly the same. Vim
treats them as if they were different encodings, so that conversion will be
done when needed. You might want to use the similar name to avoid conversion
or when conversion is not possible:

        cp932, shift-jis, sjis
        cp936, euc-cn

                                                        *encoding-table*
Normally 'encoding' is equal to your current locale and 'termencoding' is
empty. This means that your keyboard and display work with characters encoded
in your current locale, and Vim uses the same characters internally.

You can make Vim use characters in a different encoding by setting the
'encoding' option to a different value. Since the keyboard and display still
use the current locale, conversion needs to be done. The 'termencoding' then
takes over the value of the current locale, so Vim converts between 'encoding'
and 'termencoding'. Example: >
        :let &termencoding = &encoding
        :set encoding=utf-8

However, not all combinations of values are possible. The table below tells
you how each of the nine combinations works. This is further restricted by
not all conversions being possible, iconv() being present, etc. Since this
depends on the system used, no detailed list can be given.

('tenc' is the short name for 'termencoding' and 'enc' short for 'encoding')

'tenc'      'enc'       remark ~

8bit        8bit        Works. When 'termencoding' is different from
                        'encoding' typing and displaying may be wrong for some
                        characters, Vim does NOT perform conversion (set
                        'encoding' to "utf-8" to get this).
8bit        2byte       MS-Windows: works for all codepages installed on your
                        system; you can only type 8bit characters;
                        Other systems: does NOT work.
8bit        Unicode     Works, but only 8bit characters can be typed directly
                        (others through digraphs, keymaps, etc.); in a
                        terminal you can only see 8bit characters; the GUI can
                        show all characters that the 'guifont' supports.

2byte       8bit        Works, but typing non-ASCII characters might
                        be a problem.
2byte       2byte       MS-Windows: works for all codepages installed on your
                        system; typing characters might be a problem when
                        locale is different from 'encoding'.
                        Other systems: Only works when 'termencoding' is equal
                        to 'encoding', you might as well leave it empty.
2byte       Unicode     works, Vim will translate typed characters.

Unicode     8bit        works (unusual)
Unicode     2byte       does NOT work
Unicode     Unicode     works very well (leaving 'termencoding' empty works
                        the same way, because all Unicode is handled
                        internally as UTF-8)

CONVERSION                                              *charset-conversion*

Vim will automatically convert from one to another encoding in several places:
- When reading a file and 'fileencoding' is different from 'encoding'
- When writing a file and 'fileencoding' is different from 'encoding'
- When displaying characters and 'termencoding' is different from 'encoding'
- When reading input and 'termencoding' is different from 'encoding'
- When displaying messages and the encoding used for LC_MESSAGES differs from
  'encoding' (requires a gettext version that supports this).
- When reading a Vim script where |:scriptencoding| is different from
  'encoding'.
- When reading or writing a |viminfo| file.
Most of these require the |+iconv| feature. Conversion for reading and
writing files may also be specified with the 'charconvert' option.
447 Useful utilities for converting the charset: | |
448 All: iconv | |
449 GNU iconv can convert most encodings. Unicode is used as the | |
450 intermediate encoding, which allows conversion from and to all other | |
451 encodings. See http://www.gnu.org/directory/libiconv.html. | |
452 | |
453 Japanese: nkf | |
454 Nkf is "Network Kanji code conversion Filter". One of the most unique | |
455 facility of nkf is the guess of the input Kanji code. So, you don't | |
456 need to know what the inputting file's |charset| is. When convert to | |
457 EUC-JP from ISO-2022-JP or Shift_JIS, simply do the following command | |
458 in Vim: | |
459 :%!nkf -e | |
460 Nkf can be found at: | |
461 http://www.sfc.wide.ad.jp/~max/FreeBSD/ports/distfiles/nkf-1.62.tar.gz | |
462 | |
463 Chinese: hc | |
464 Hc is "Hanzi Converter". Hc convert a GB file to a Big5 file, or Big5 | |
465 file to GB file. Hc can be found at: | |
466 ftp://ftp.cuhk.hk/pub/chinese/ifcss/software/unix/convert/hc-30.tar.gz | |
467 | |
468 Korean: hmconv | |
236 | 469 Hmconv is Korean code conversion utility especially for E-mail. It can |
7 | 470 convert between EUC-KR and ISO-2022-KR. Hmconv can be found at: |
471 ftp://ftp.kaist.ac.kr/pub/hangul/code/hmconv/ | |
472 | |
473 Multilingual: lv | |
474 Lv is a Powerful Multilingual File Viewer. And it can be worked as | |
475 |charset| converter. Supported |charset|: ISO-2022-CN, ISO-2022-JP, | |
476 ISO-2022-KR, EUC-CN, EUC-JP, EUC-KR, EUC-TW, UTF-7, UTF-8, ISO-8859 | |
236 | 477 series, Shift_JIS, Big5 and HZ. Lv can be found at: |
3682 | 478 http://www.ff.iij4u.or.jp/~nrt/lv/index.html |
7 | 479 |
480 | |
481 *mbyte-conversion* | |
482 When reading and writing files in an encoding different from 'encoding', | |
483 conversion needs to be done. These conversions are supported: | |
484 - All conversions between Latin-1 (ISO-8859-1), UTF-8, UCS-2 and UCS-4 are | |
485 handled internally. | |
486 - For MS-Windows, when 'encoding' is a Unicode encoding, conversion from and | |
487 to any codepage should work. | |
488 - Conversion specified with 'charconvert' | |
489 - Conversion with the iconv library, if it is available. | |
490 Old versions of GNU iconv() may cause the conversion to fail (they | |
491 request a very large buffer, more than Vim is willing to provide). | |
492 Try getting another iconv() implementation. | |
493 | |
557 | 494 *iconv-dynamic* |
495 On MS-Windows Vim can be compiled with the |+iconv/dyn| feature. This means | |
496 Vim will search for the "iconv.dll" and "libiconv.dll" libraries. When | |
497 neither of them can be found Vim will still work but some conversions won't be | |
498 possible. | |

==============================================================================
4. Using a terminal                                     *mbyte-terminal*

The GUI fully supports multi-byte characters. It is also possible in a
terminal, if the terminal supports the same encoding that Vim uses. Thus this
is less flexible.

For example, you can run Vim in an xterm with added multi-byte support and/or
|XIM|. Examples are kterm (Kanji term) and hanterm (for Korean), Eterm
(Enlightened terminal) and rxvt.

If your terminal does not support the right encoding, you can set the
'termencoding' option. Vim will then convert the typed characters from
'termencoding' to 'encoding'. And displayed text will be converted from
'encoding' to 'termencoding'. If the encoding supported by the terminal
doesn't include all the characters that Vim uses, this leads to lost
characters. This may mess up the display. If you use a terminal that
supports Unicode, such as the xterm mentioned below, it should work just fine,
since nearly every character set can be converted to Unicode without loss of
information.
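
What "lost characters" means can be shown with a short Python sketch. It is
an illustration only: the function name "to_terminal" is invented here, and
the "?" substitution is the behavior of Python's "replace" error handler, which
is not necessarily what Vim displays in the same situation:

```python
def to_terminal(text, termencoding):
    """Encode text for display, replacing unrepresentable characters by '?'."""
    return text.encode(termencoding, errors="replace")


sample = "\u65e5\u672c\u8a9e abc"  # three Japanese characters, then ASCII

print(to_terminal(sample, "latin-1"))  # b'??? abc'
print(to_terminal(sample, "utf-8") == sample.encode("utf-8"))  # True
```

With the 8-bit terminal encoding the Japanese characters are gone and only
the ASCII part survives. With a Unicode terminal nothing is lost.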


UTF-8 IN XFREE86 XTERM                                  *UTF8-xterm*

This is a short explanation of how to use UTF-8 character encoding in the
xterm that comes with XFree86 by Thomas Dickey (text by Markus Kuhn).

Get the latest xterm version, which now has UTF-8 support:

        http://invisible-island.net/xterm/xterm.html

Compile it with "./configure --enable-wide-chars ; make"

Also get the ISO 10646-1 version of various fonts, which is available at

        http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz

and install the fonts as described in the README file.

Now start xterm with >

  xterm -u8 -fn -misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1
or, for bigger characters: >
  xterm -u8 -fn -misc-fixed-medium-r-normal--15-140-75-75-c-90-iso10646-1

and you will have a working UTF-8 terminal emulator. Try both >

   cat utf-8-demo.txt
   vim utf-8-demo.txt

with the demo text that comes with ucs-fonts.tar.gz in order to see
whether there are any problems with UTF-8 in your xterm.

For Vim you may need to set 'encoding' to "utf-8".

==============================================================================
5. Fonts on X11                                         *mbyte-fonts-X11*

Unfortunately, using fonts in X11 is complicated. The name of a single-byte
font is a long string. For multi-byte fonts we need several of these...

Note: Most of this is no longer relevant for GTK+ 2. Selecting a font via
its XLFD is not supported; see 'guifont' for an example of how to
set the font. Do yourself a favor and ignore the |XLFD| and |xfontset|
sections below.

First of all, Vim only accepts fixed-width fonts for displaying text. You
cannot use proportionally spaced fonts. This excludes many of the available
(and nicer looking) fonts. However, for menus and tooltips any font can be
used.

Note that Display and Input are independent. It is possible to see your
language even though you have no input method for it.

You should get a default font for menus and tooltips that works, but it might
be ugly. Read the following to find out how to select a better font.


X LOGICAL FONT DESCRIPTION (XLFD)
                                                        *XLFD*
XLFD is the X font name and contains the information about the font size,
charset, etc. The name is in this format:

FOUNDRY-FAMILY-WEIGHT-SLANT-WIDTH-STYLE-PIXEL-POINT-X-Y-SPACE-AVE-CR-CE

Each field means:

- FOUNDRY:  FOUNDRY field. The company that created the font.
- FAMILY:   FAMILY_NAME field. Basic font family name. (helvetica, gothic,
            times, etc)
- WEIGHT:   WEIGHT_NAME field. How thick the letters are. (light, medium,
            bold, etc)
- SLANT:    SLANT field.
                r:  Roman (no slant)
                i:  Italic
                o:  Oblique
                ri: Reverse Italic
                ro: Reverse Oblique
                ot: Other
                number: Scaled font
- WIDTH:    SETWIDTH_NAME field. Width of characters. (normal, condensed,
            narrow, double wide)
- STYLE:    ADD_STYLE_NAME field. Extra info to describe font. (Serif, Sans
            Serif, Informal, Decorated, etc)
- PIXEL:    PIXEL_SIZE field. Height, in pixels, of characters.
- POINT:    POINT_SIZE field. Ten times height of characters in points.
- X:        RESOLUTION_X field. X resolution (dots per inch).
- Y:        RESOLUTION_Y field. Y resolution (dots per inch).
- SPACE:    SPACING field.
                p:  Proportional
                m:  Monospaced
                c:  CharCell
- AVE:      AVERAGE_WIDTH field. Ten times average width in pixels.
- CR:       CHARSET_REGISTRY field. The name of the charset group.
- CE:       CHARSET_ENCODING field. The rest of the charset name. For some
            charsets, such as JIS X 0208, if this field is 0, code points have
            the same value as GL, and GR if 1.

For example, a 16 dots font corresponding to JIS X 0208 is written like:
    -misc-fixed-medium-r-normal--16-110-100-100-c-160-jisx0208.1990-0
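
To see which value ends up in which field, the name can be taken apart
mechanically. The following Python sketch uses the short field names from the
list above. "parse_xlfd" is a hypothetical helper, and it assumes a fully
specified name with exactly 14 fields (an empty field, such as STYLE in the
example, is allowed):

```python
XLFD_FIELDS = (
    "FOUNDRY", "FAMILY", "WEIGHT", "SLANT", "WIDTH", "STYLE", "PIXEL",
    "POINT", "X", "Y", "SPACE", "AVE", "CR", "CE",
)


def parse_xlfd(name):
    """Map the 14 short field names to the values found in an XLFD."""
    if not name.startswith("-"):
        raise ValueError("an XLFD starts with '-'")
    values = name[1:].split("-")
    if len(values) != len(XLFD_FIELDS):
        raise ValueError(f"expected 14 fields, got {len(values)}")
    return dict(zip(XLFD_FIELDS, values))


font = parse_xlfd(
    "-misc-fixed-medium-r-normal--16-110-100-100-c-160-jisx0208.1990-0")
print(font["PIXEL"], font["AVE"], font["CR"], font["CE"])
# 16 160 jisx0208.1990 0
```

The AVE value of 160 means an average width of 16 pixels, so this font has the
square 16x16 cell that is usual for double-width characters.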
621 | |
622 | |
623 X FONTSET | |
624 *fontset* *xfontset* | |
625 A single-byte charset is typically associated with one font. For multi-byte | |
626 charsets a combination of fonts is often used. This means that one group of | |
627 characters are used from one font and another group from another font (which | |
628 might be double wide). This collection of fonts is called a fontset. | |
629 | |
630 Which fonts are required in a fontset depends on the current locale. X | |
631 windows maintains a table of which groups of characters are required for a | |
632 locale. You have to specify all the fonts that a locale requires in the | |
633 'guifontset' option. | |
634 | |
635 NOTE: The fontset always uses the current locale, even though 'encoding' may | |
636 be set to use a different charset. In that situation you might want to use | |
637 'guifont' and 'guifontwide' instead of 'guifontset'. | |
638 | |
Example:
    |charset|   language               "groups of characters" ~
    GB2312      Chinese (simplified)   ISO-8859-1 and GB 2312
    Big5        Chinese (traditional)  ISO-8859-1 and Big5
    CNS-11643   Chinese (traditional)  ISO-8859-1, CNS 11643-1 and CNS 11643-2
    EUC-JP      Japanese               JIS X 0201 and JIS X 0208
    EUC-KR      Korean                 ISO-8859-1 and KS C 5601 (KS X 1001)
646 | |
647 You can search for fonts using the xlsfonts command. For example, when you're | |
648 searching for a font for KS C 5601: > | |
649 xlsfonts | grep ksc5601 | |
650 | |
651 This is complicated and confusing. You might want to consult the X-Windows | |
652 documentation if there is something you don't understand. | |
653 | |
654 *base_font_name_list* | |
655 When you have found the names of the fonts you want to use, you need to set | |
656 the 'guifontset' option. You specify the list by concatenating the font names | |
657 and putting a comma in between them. | |
658 | |
659 For example, when you use the ja_JP.eucJP locale, this requires JIS X 0201 | |
660 and JIS X 0208. You could supply a list of fonts that explicitly specifies | |
661 the charsets, like: > | |
662 | |
663 :set guifontset=-misc-fixed-medium-r-normal--14-130-75-75-c-140-jisx0208.1983-0, | |
664 \-misc-fixed-medium-r-normal--14-130-75-75-c-70-jisx0201.1976-0 | |
665 | |
666 Alternatively, you can supply a base font name list that omits the charset | |
667 name, letting X-Windows select font characters required for the locale. For | |
668 example: > | |
669 | |
670 :set guifontset=-misc-fixed-medium-r-normal--14-130-75-75-c-140, | |
671 \-misc-fixed-medium-r-normal--14-130-75-75-c-70 | |
672 | |
673 Alternatively, you can supply a single base font name that allows X-Windows to | |
674 select from all available fonts. For example: > | |
675 | |
676 :set guifontset=-misc-fixed-medium-r-normal--14-* | |
677 | |
678 Alternatively, you can specify alias names. See the fonts.alias file in the | |
679 fonts directory (e.g., /usr/X11R6/lib/X11/fonts/). For example: > | |
680 | |
681 :set guifontset=k14,r14 | |
682 < | |
683 *E253* | |
684 Note that in East Asian fonts, the standard character cell is square. When | |
685 mixing a Latin font and an East Asian font, the East Asian font width should | |
686 be twice the Latin font width. | |
687 | |
If 'guifontset' is not empty, the "font" argument of the |:highlight| command
is also interpreted as a fontset.  For example, for highlighting you should
use: >
691 :hi Comment font=english_font,your_font | |
692 If you use a wrong "font" argument you will get an error message. | |
693 Also make sure that you set 'guifontset' before setting fonts for highlight | |
694 groups. | |
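
The ordering can be sketched in a |gvimrc| like this.  It assumes the alias
names k14 and r14 from the example above exist on your system; replace them
with fonts you actually have: >

    if has('gui_running') && has('xfontset')
      " Fontset first, so that "font=" below is interpreted as a fontset.
      set guifontset=k14,r14
      highlight Comment font=k14,r14
    endif
<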
695 | |
696 | |
697 USING RESOURCE FILES | |
698 | |
699 Instead of specifying 'guifontset', you can set X11 resources and Vim will | |
700 pick them up. This is only for people who know how X resource files work. | |
701 | |
702 For Motif and Athena insert these three lines in your $HOME/.Xdefaults file: | |
703 | |
704 Vim.font: |base_font_name_list| | |
705 Vim*fontSet: |base_font_name_list| | |
706 Vim*fontList: your_language_font | |
707 | |
Note: Vim.font is for the text area.
      Vim*fontSet is for the menus.
      Vim*fontList is for the menus (for the Motif GUI).
711 | |
For example, when you are using Japanese and a 14-dot font: >
713 | |
714 Vim.font: -misc-fixed-medium-r-normal--14-* | |
715 Vim*fontSet: -misc-fixed-medium-r-normal--14-* | |
716 Vim*fontList: -misc-fixed-medium-r-normal--14-* | |
717 < | |
718 or: > | |
719 | |
720 Vim*font: k14,r14 | |
721 Vim*fontSet: k14,r14 | |
722 Vim*fontList: k14,r14 | |
723 < | |
724 To have them take effect immediately you will have to do > | |
725 | |
726 xrdb -merge ~/.Xdefaults | |
727 | |
728 Otherwise you will have to stop and restart the X server before the changes | |
729 take effect. | |
730 | |
731 | |
The GTK+ version of GUI Vim does not use .Xdefaults; use ~/.gtkrc instead.
The default mostly works OK, but for the menus you might have to change
it.  Example: >
735 | |
736 style "default" | |
737 { | |
738 fontset="-*-*-medium-r-normal--14-*-*-*-c-*-*-*" | |
739 } | |
740 widget_class "*" style "default" | |
741 | |
742 ============================================================================== | |
743 6. Fonts on MS-Windows *mbyte-fonts-MSwin* | |
744 | |
The simplest way is to use the font dialog to select fonts and try them out.
You can find it in the "Edit/Select Font..." menu.  Once you have found a font
that works well, you can use this command to see its name: >
748 | |
749 :set guifont | |
750 | |
751 Then add a command to your |gvimrc| file to set 'guifont': > | |
752 | |
753 :set guifont=courier_new:h12 | |
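
A slightly more defensive variant for a |gvimrc| that is shared between
systems might look like this.  The font name is just the one from the example
above; the has() check keeps the setting away from other GUIs: >

    if has('gui_win32')
      set guifont=Courier_New:h12
    endif
<
The font dialog can also be opened from the command line with: >

    :set guifont=*
<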
754 | |
755 ============================================================================== | |
756 7. Input on X11 *mbyte-XIM* | |
757 | |
758 X INPUT METHOD (XIM) BACKGROUND *XIM* *xim* *x-input-method* | |
759 | |
XIM is an international input module for X.  There are two kinds of
structure: the Xlib unit type and the |IM-server| (Input-Method server) type.
The |IM-server| type is suitable for complex input, such as CJK.
763 | |
764 - IM-server | |
765 *IM-server* | |
  In |IM-server| type input structures, input events are handled in one of
  two ways: by a FrontEnd system or by a BackEnd system.  In a FrontEnd
  system, input events are captured by the |IM-server| first, and the
  |IM-server| then gives the application the result of the input.  A BackEnd
  system works in the reverse order.  MS-Windows uses a BackEnd system.  In
  X, most |IM-server|s use a FrontEnd system.  The drawback of a BackEnd
  system is the large communication overhead, but it provides safe
  synchronization with no restrictions on applications.
774 | |
  For example, xwnmo and kinput2 are Japanese |IM-server|s; both are FrontEnd
  systems.  Xwnmo is distributed with Wnn (see below), kinput2 can be found
  at: ftp://ftp.sra.co.jp/pub/x11/kinput2/
778 | |
  For Chinese, there's a great XIM server named "xcin", with which you can
  input both Traditional and Simplified Chinese characters.  It can also
  accept other locales if you make a correct input table.  Xcin can be found
  at:
      http://cle.linux.org.tw/xcin/
  Others are scim: http://scim.freedesktop.org/ and fcitx:
  http://www.fcitx.org/

786 - Conversion Server | |
787 *conversion-server* | |
  Some systems need an additional server: a conversion server.  Most Japanese
  |IM-server|s need a Kana-Kanji conversion server.  For Chinese input it
  depends on the input method; some methods need a PinYin or ZhuYin to HanZi
  conversion server.  For Korean input, a Hangul-Hanja conversion server is
  needed if you want to input Hanja.
793 | |
  For example, the Japanese input process is divided into two steps.  There
  are very many Kanji characters (6349 Kanji characters are defined in JIS X
  0208), but only 76 Hira-gana characters.  So, first, we pre-input the text
  as pronounced, in Hira-gana; second, we convert the Hira-gana to Kanji or
  Kata-Kana, if needed.  There are several Kana-Kanji conversion servers:
  jserver (distributed with Wnn, see below) and canna.  Canna can be found
  at: http://canna.sourceforge.jp/

  There is a good input system: Wnn 4.2.  Wnn 4.2 contains:
      xwnmo (|IM-server|)
      jserver (Japanese Kana-Kanji conversion server)
      cserver (Chinese PinYin or ZhuYin to simplified HanZi conversion server)
      tserver (Chinese PinYin or ZhuYin to traditional HanZi conversion server)
      kserver (Hangul-Hanja conversion server)
  Wnn 4.2 for several systems can be found at various places on the internet.
  Use the RPM or port for your system.
811 | |
812 | |
813 - Input Style | |
814 *xim-input-style* | |
  When inputting CJK, there are four areas:
      1. The area to display the input while it is being composed.
      2. The area to display the currently active input mode.
      3. The area to display the next candidate for the selection.
      4. The area to display other tools.
820 | |
  The third area is needed when converting.  For example, in Japanese
  input, multiple Kanji characters can have the same pronunciation, so a
  sequence of Hira-gana characters could map to several different sequences
  of Kanji characters.
825 | |
  The first and second areas are defined in the international input of X
  under the names "Preedit Area" and "Status Area" respectively.  The third
  and fourth areas are not defined and are left to be managed by the
  |IM-server|.  In the international input, four input styles have been
  defined using combinations of Preedit Area and Status Area: |OnTheSpot|,
  |OffTheSpot|, |OverTheSpot| and |Root|.

  Currently, GUI Vim supports three styles: |OverTheSpot|, |OffTheSpot| and
  |Root|.
835 | |
*. on-the-spot                                          *OnTheSpot*
   The Preedit Area and Status Area are drawn by the client application
   inside its own window.  The client application is directed by the
   |IM-server| to display all pre-edit data at the location of text
   insertion.  The client registers callbacks that are invoked by the input
   method during pre-editing.
*. over-the-spot                                        *OverTheSpot*
   The Status Area is created at a fixed position within the application
   window; in the case of Vim, this is the additional status line.  The
   Preedit Area is made at the current input position of the application.
   The input method displays pre-edit data in a window which it brings up
   directly over the text insertion position.
*. off-the-spot                                         *OffTheSpot*
   The Preedit Area and Status Area are both displayed within the application
   window; in the case of Vim, this is the additional status line.  The
   client application provides display windows for the pre-edit data to the
   input method, which displays into them directly.
*. root-window                                          *Root*
   The Preedit Area and Status Area are outside of the application.  The
   input method displays all pre-edit data in a separate area of the screen,
   in a window specific to the input method.
857 | |
858 | |
859 USING XIM *multibyte-input* *E284* *E286* *E287* *E288* | |
860 *E285* *E289* |
7 | 861 |
Note that Display and Input are independent.  It is possible to see your
language even though you have no input method for it.  But when your Display
method doesn't match your Input method, the text will be displayed
incorrectly.

        Note: You cannot use IM unless you specify 'guifontset'.
        Users of Latin languages therefore also have to set 'guifontset'
        if they use IM.
869 | |
To input your language you should run the |IM-server| which supports your
language, and a |conversion-server| if needed.

The next 3 lines should be put in your ~/.Xdefaults file.  They are common to
all X applications which use |XIM|.  If you already use |XIM|, you can skip
this. >
876 | |
877 *international: True | |
878 *.inputMethod: your_input_server_name | |
879 *.preeditType: your_input_style | |
880 < | |
        your_input_server_name  is your |IM-server| name (check your
                                |IM-server| manual).
        your_input_style        is one of |OverTheSpot|, |OffTheSpot|, |Root|.
                                See also |xim-input-style|.

        *international may not be necessary if you use X11R6.
        *.inputMethod and *.preeditType are optional if you use X11R6.
888 | |
889 For example, when you are using kinput2 as |IM-server|, > | |
890 | |
891 *international: True | |
892 *.inputMethod: kinput2 | |
893 *.preeditType: OverTheSpot | |
894 < | |
895 When using |OverTheSpot|, GUI Vim always connects to the IM Server even in | |
896 Normal mode, so you can input your language with commands like "f" and "r". | |
897 But when using one of the other two methods, GUI Vim connects to the IM Server | |
898 only if it is not in Normal mode. | |
899 | |
If your IM Server does not support |OverTheSpot|, and you want to use your
language with Normal mode commands like "f" or "r", then you should use a
localized xterm or an xterm which supports |XIM|.
903 | |
904 If needed, you can set the XMODIFIERS environment variable: | |
905 | |
906 sh: export XMODIFIERS="@im=input_server_name" | |
907 csh: setenv XMODIFIERS "@im=input_server_name" | |
908 | |
909 For example, when you are using kinput2 as |IM-server| and sh, > | |
910 | |
911 export XMODIFIERS="@im=kinput2" | |
912 < | |
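
To check from inside Vim whether the variable actually reached the editor,
something like the following can be used.  This is only a sketch; the
messages are made up for the illustration: >

    if !has('xim')
      echo 'This Vim was compiled without the +xim feature'
    elseif empty($XMODIFIERS)
      echo 'XMODIFIERS is not set'
    else
      echo 'XMODIFIERS = ' . $XMODIFIERS
    endif
<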
913 | |
914 FULLY CONTROLLED XIM | |
915 | |
You can fully control XIM, as with the IME on MS-Windows (see
|multibyte-ime|).  This is currently only available for the GTK GUI.
918 | |
919 Before using fully controlled XIM, one setting is required. Set the | |
920 'imactivatekey' option to the key that is used for the activation of the input | |
921 method. For example, when you are using kinput2 + canna as IM Server, the | |
922 activation key is probably Shift+Space: > | |
923 | |
924 :set imactivatekey=S-space | |
925 | |
926 See 'imactivatekey' for the format. | |
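
In a vimrc that is shared with builds that lack this option, the setting can
be guarded.  A minimal sketch, assuming Shift+Space really is the activation
key of your input method: >

    if exists('+imactivatekey')
      set imactivatekey=S-space
    endif
<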
927 | |
928 ============================================================================== | |
929 8. Input on MS-Windows *mbyte-IME* | |
930 | |
931 (Windows IME support) *multibyte-ime* *IME* | |
932 | |
{only works in the Windows GUI and when compiled with the |+multi_byte_ime|
feature}
934 | |
To input multibyte characters on Windows, you can use an Input Method Editor
(IME).  While editing text you have to switch the IME on and off many, many
times: when the IME is on it hooks all of your key input, so you cannot type
'j', 'k' or almost any other key directly to Vim.
939 | |
The |+multi_byte_ime| feature helps with this: it reduces the number of times
you have to switch the IME status manually.  In Normal mode there is almost
no need for the IME, even when editing multibyte text.  So when you leave
Insert mode with <Esc>, Vim remembers the last IME status and forces the IME
off.  When you re-enter Insert mode, Vim automatically restores the
remembered IME status.
945 | |
This works not only between Insert and Normal mode, but also for
search-command input and Replace mode.
The options 'iminsert', 'imsearch' and 'imcmdline' can be used to choose
the different input methods or to disable them temporarily.
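
As an illustration, these options could be set in a vimrc as follows.  The
values are just one possible choice, not a recommendation: >

    if has('multi_byte_ime')
      " Start Insert mode with the IME off.
      set iminsert=0
      " Let search patterns follow the value of 'iminsert'.
      set imsearch=-1
      " Do not turn the IME on when starting to edit a command line.
      set noimcmdline
    endif
<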
7 | 950 |
951 WHAT IS IME | |
    IME is a part of the East Asian versions of Windows.  It helps you to
    input multibyte characters.  English and other language versions of
    Windows do not have an IME.  (Usually there is no need for one.)  But
    there is one, called Microsoft Global IME.  Global IME is a part of
    Internet Explorer 4.0 or above.  You can get more information about
    Global IME at the URLs below.
958 | |
959 WHAT IS GLOBAL IME *global-ime* | |
    Global IME makes it possible to input Chinese, Japanese and Korean text
    into a Vim buffer on any language version of Windows 98, Windows 95 and
    Windows NT 4.0.
    On Windows 2000 and XP it should work as well (without downloading).  On
    Windows 2000 Professional, Global IME is built in, and the Input Locales
    can be added through Control Panel/Regional Options/Input Locales.
    Please see the URLs below for details about Global IME.  You can also
    find the various language versions of Global IME at the same place.
968 | |
969 - Global IME detailed information. | |
970 http://search.microsoft.com/results.aspx?q=global+ime |
7 | 971 |
972 - Active Input Method Manager (Global IME) | |
973 http://msdn.microsoft.com/en-us/library/aa741221(v=VS.85).aspx |
7 | 974 |
1621 | 975 Support for Global IME is an experimental feature. |
7 | 976 |
NOTE: For IME to work you must make sure the input locales of your language
are added to your system.  The exact location of this depends on the version
of Windows you use.  For example, on my Windows 2000 box:
1. Control Panel
2. Regional Options
3. Input Locales Tab
4. Add Installed input locales -> Chinese(PRC)
The default is still English (United States)
985 | |
986 | |
987 Cursor color when IME or XIM is on *CursorIM* | |
    There is a handy little feature for the IME: the cursor can indicate the
    IME status by changing its color.  Usually the IME status is indicated
    by a little icon in a corner of the desktop (or in the taskbar), which
    makes it hard to check.  This feature helps with that.
    It works in the same way when using XIM.

    You can select the cursor color used when the IME is on with the
    CursorIM highlight group.  For example, add these lines to your
    |gvimrc|: >
7 | 996 |
997 if has('multi_byte_ime') | |
998 highlight Cursor guifg=NONE guibg=Green | |
999 highlight CursorIM guifg=NONE guibg=Purple | |
1000 endif | |
1001 < | |
    With the IME off the cursor is green; a purple cursor indicates that the
    IME is on.
1004 | |
1005 ============================================================================== | |
1006 9. Input with a keymap *mbyte-keymap* | |
1007 | |
1008 When the keyboard doesn't produce the characters you want to enter in your | |
1009 text, you can use the 'keymap' option. This will translate one or more | |
1010 (English) characters to another (non-English) character. This only happens | |
1011 when typing text, not when typing Vim commands. This avoids having to switch | |
1012 between two keyboard settings. | |
1013 | |
1014 The value of the 'keymap' option specifies a keymap file to use. The name of | |
1015 this file is one of these two: | |
1016 | |
1017 keymap/{keymap}_{encoding}.vim | |
1018 keymap/{keymap}.vim | |
1019 | |
1020 Here {keymap} is the value of the 'keymap' option and {encoding} of the | |
1021 'encoding' option. The file name with the {encoding} included is tried first. | |
1022 | |
1023 'runtimepath' is used to find these files. To see an overview of all | |
1024 available keymap files, use this: > | |
1025 :echo globpath(&rtp, "keymap/*.vim") | |
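
The lookup order described above can be tried out for one specific keymap
with a small script.  This is only a sketch of the naming rule, not the code
Vim uses internally; KeymapCandidates() is a made-up helper and "hebrew" is
just an example name: >

    function! KeymapCandidates(name) abort
      " The name that includes the encoding comes first.
      return ['keymap/' . a:name . '_' . &encoding . '.vim',
            \ 'keymap/' . a:name . '.vim']
    endfunction

    for s:pat in KeymapCandidates('hebrew')
      echo s:pat . ' -> ' . globpath(&rtp, s:pat)
    endfor
<
An empty result after the arrow means that no such file was found in
'runtimepath'.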
1026 | |
1027 In Insert and Command-line mode you can use CTRL-^ to toggle between using the | |
1028 keyboard map or not. |i_CTRL-^| |c_CTRL-^| | |
1029 This flag is remembered for Insert mode with the 'iminsert' option. When | |
1030 leaving and entering Insert mode the previous value is used. The same value | |
1031 is also used for commands that take a single character argument, like |f| and | |
1032 |r|. | |
1033 For Command-line mode the flag is NOT remembered. You are expected to type an | |
1034 Ex command first, which is ASCII. | |
1035 For typing search patterns the 'imsearch' option is used. It can be set to | |
1036 use the same value as for 'iminsert'. | |
1037 *lCursor* |
It is possible to give the GUI cursor another color when the language mappings
are being used.  This is disabled by default, so that the cursor does not
become invisible when you use a non-standard background color.  Here is an
example that uses a brightly colored cursor: >
1042 :highlight Cursor guifg=NONE guibg=Green | |
1043 :highlight lCursor guifg=NONE guibg=Cyan | |
1044 < | |
839 | 1045 *keymap-file-format* *:loadk* *:loadkeymap* *E105* *E791* |
7 | 1046 The keymap file looks something like this: > |
1047 | |
1048 " Maintainer: name <email@address> | |
1049 " Last Changed: 2001 Jan 1 | |
1050 | |
1051 let b:keymap_name = "short" | |
1052 | |
1053 loadkeymap | |
1054 a A | |
1055 b B comment | |
1056 | |
1057 The lines starting with a " are comments and will be ignored. Blank lines are | |
1058 also ignored. The lines with the mappings may have a comment after the useful | |
1059 text. | |
1060 | |
1061 The "b:keymap_name" can be set to a short name, which will be shown in the | |
1062 status line. The idea is that this takes less room than the value of | |
1063 'keymap', which might be long to distinguish between different languages, | |
1064 keyboards and encodings. | |
1065 | |
1066 The actual mappings are in the lines below "loadkeymap". In the example "a" | |
1067 is mapped to "A" and "b" to "B". Thus the first item is mapped to the second | |
1068 item. This is done for each line, until the end of the file. | |
1069 These items are exactly the same as what can be used in a |:lnoremap| command, | |
4186 | 1070 using "<buffer>" to make the mappings local to the buffer. |
7 | 1071 You can check the result with this command: > |
1072 :lmap | |
1073 The two items must be separated by white space. You cannot include white | |
1074 space inside an item, use the special names "<Tab>" and "<Space>" instead. | |
1075 The length of the two items together must not exceed 200 bytes. | |
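
To make the relation with |:lnoremap| concrete: for the example file above,
the two mapping lines have roughly the same effect as typing the commands
below in the buffer.  This is a sketch for experimenting; normally setting
'keymap' does all of this for you, including setting 'iminsert' to one: >

    lnoremap <buffer> a A
    lnoremap <buffer> b B
    " Language mappings are only used when 'iminsert' is 1.
    setlocal iminsert=1
    " List the result.
    lmap
<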
1076 | |
1077 It's possible to have more than one character in the first column. This works | |
1078 like a dead key. Example: > | |
1079 'a á | |
1080 Since Vim doesn't know if the next character after a quote is really an "a", | |
1081 it will wait for the next character. To be able to insert a single quote, | |
1082 also add this line: > | |
1083 '' ' | |
1084 Since the mapping is defined with |:lnoremap| the resulting quote will not be | |
1085 used for the start of another character. | |
818 | 1086 The "accents" keymap uses this. *keymap-accents* |
7 | 1087 |
3893 | 1088 The first column can also be in |<>| form: |
1089 <C-c> Ctrl-C | |
1090 <A-c> Alt-c | |
1091 <A-C> Alt-C | |
1092 Note that the Alt mappings may not work, depending on your keyboard and | |
1093 terminal. | |
1094 | |
7 | 1095 Although it's possible to have more than one character in the second column, |
1096 this is unusual. But you can use various ways to specify the character: > | |
1097 A a literal character | |
1098 A <char-97> decimal value | |
1099 A <char-0x61> hexadecimal value | |
1100 A <char-0141> octal value | |
1101 x <Space> special key name | |
1102 | |
1103 The characters are assumed to be encoded for the current value of 'encoding'. | |
1104 It's possible to use ":scriptencoding" when all characters are given | |
1105 literally. That doesn't work when using the <char-> construct, because the | |
1106 conversion is done on the keymap file, not on the resulting character. | |
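
A small keymap file that gives its characters literally could look like this.
It is a sketch only: the file name and the short name "acc" are made up, and
it assumes the file itself is saved as UTF-8: >

    " Hypothetical file: ~/.vim/keymap/myaccents.vim
    scriptencoding utf-8
    let b:keymap_name = "acc"

    loadkeymap
    'a  á
    ''  '
<
With the file in place it would be selected with ":set keymap=myaccents".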
1107 | |
1108 The lines after "loadkeymap" are interpreted with 'cpoptions' set to "C". | |
1109 This means that continuation lines are not used and a backslash has a special | |
1110 meaning in the mappings. Examples: > | |
1111 | |
1112 " a comment line | |
1113 \" x maps " to x | |
1114 \\ y maps \ to y | |
1115 | |
1116 If you write a keymap file that will be useful for others, consider submitting | |
1117 it to the Vim maintainer for inclusion in the distribution: | |
1118 <maintainer@vim.org> | |
1119 | |
1120 | |
1121 HEBREW KEYMAP *keymap-hebrew* | |
1122 | |
1123 This file explains what characters are available in UTF-8 and CP1255 encodings, | |
1124 and what the keymaps are to get those characters: | |
1125 | |
1126 glyph encoding keymap ~ | |
1127 Char utf-8 cp1255 hebrew hebrewp name ~ | |
1128 א 0x5d0 0xe0 t a 'alef | |
1129 ב 0x5d1 0xe1 c b bet | |
1130 ג 0x5d2 0xe2 d g gimel | |
1131 ד 0x5d3 0xe3 s d dalet | |
1132 ה 0x5d4 0xe4 v h he | |
1133 ו 0x5d5 0xe5 u v vav | |
1134 ז 0x5d6 0xe6 z z zayin | |
1135 ח 0x5d7 0xe7 j j het | |
1136 ט 0x5d8 0xe8 y T tet | |
1137 י 0x5d9 0xe9 h y yod | |
1138 ך 0x5da 0xea l K kaf sofit | |
1139 כ 0x5db 0xeb f k kaf | |
1140 ל 0x5dc 0xec k l lamed | |
1141 ם 0x5dd 0xed o M mem sofit | |
1142 מ 0x5de 0xee n m mem | |
1143 ן 0x5df 0xef i N nun sofit | |
1144 נ 0x5e0 0xf0 b n nun | |
1145 ס 0x5e1 0xf1 x s samech | |
1146 ע 0x5e2 0xf2 g u `ayin | |
1147 ף 0x5e3 0xf3 ; P pe sofit | |
1148 פ 0x5e4 0xf4 p p pe | |
1149 ץ 0x5e5 0xf5 . X tsadi sofit | |
1150 צ 0x5e6 0xf6 m x tsadi | |
1151 ק 0x5e7 0xf7 e q qof | |
1152 ר 0x5e8 0xf8 r r resh | |
1153 ש 0x5e9 0xf9 a w shin | |
1154 ת 0x5ea 0xfa , t tav | |
1155 | |
1156 Vowel marks and special punctuation: | |
1157 הְ 0x5b0 0xc0 A: A: sheva | |
1158 הֱ 0x5b1 0xc1 HE HE hataf segol | |
1159 הֲ 0x5b2 0xc2 HA HA hataf patah | |
1160 הֳ 0x5b3 0xc3 HO HO hataf qamats | |
1161 הִ 0x5b4 0xc4 I I hiriq | |
1162 הֵ 0x5b5 0xc5 AY AY tsere | |
1163 הֶ 0x5b6 0xc6 E E segol | |
1164 הַ 0x5b7 0xc7 AA AA patah | |
1165 הָ 0x5b8 0xc8 AO AO qamats | |
1166 הֹ 0x5b9 0xc9 O O holam | |
1167 הֻ 0x5bb 0xcb U U qubuts | |
1168 כּ 0x5bc 0xcc D D dagesh | |
1169 הֽ 0x5bd 0xcd ]T ]T meteg | |
1170 ה־ 0x5be 0xce ]Q ]Q maqaf | |
1171 בֿ 0x5bf 0xcf ]R ]R rafe | |
1172 ב׀ 0x5c0 0xd0 ]p ]p paseq | |
1173 שׁ 0x5c1 0xd1 SR SR shin-dot | |
1174 שׂ 0x5c2 0xd2 SL SL sin-dot | |
1175 ׃ 0x5c3 0xd3 ]P ]P sof-pasuq | |
1176 װ 0x5f0 0xd4 VV VV double-vav | |
1177 ױ 0x5f1 0xd5 VY VY vav-yod | |
1178 ײ 0x5f2 0xd6 YY YY yod-yod | |
1179 | |
1180 The following are only available in utf-8 | |
1181 | |
1182 Cantillation marks: | |
1183 glyph | |
1184 Char utf-8 hebrew name | |
1185 ב֑ 0x591 C: etnahta | |
1186 ב֒ 0x592 Cs segol | |
1187 ב֓ 0x593 CS shalshelet | |
1188 ב֔ 0x594 Cz zaqef qatan | |
1189 ב֕ 0x595 CZ zaqef gadol | |
1190 ב֖ 0x596 Ct tipeha | |
1191 ב֗ 0x597 Cr revia | |
1192 ב֘ 0x598 Cq zarqa | |
1193 ב֙ 0x599 Cp pashta | |
1194 ב֚ 0x59a C! yetiv | |
1195 ב֛ 0x59b Cv tevir | |
1196 ב֜ 0x59c Cg geresh | |
1197 ב֝ 0x59d C* geresh qadim | |
1198 ב֞ 0x59e CG gershayim | |
1199 ב֟ 0x59f CP qarnei-parah | |
1200 ב֪ 0x5aa Cy yerach-ben-yomo | |
1201 ב֫ 0x5ab Co ole | |
1202 ב֬ 0x5ac Ci iluy | |
1203 ב֭ 0x5ad Cd dehi | |
1204 ב֮ 0x5ae Cn zinor | |
1205 ב֯ 0x5af CC masora circle | |
1206 | |
1207 Combining forms: | |
1208 ﬠ 0xfb20 X` Alternative `ayin | |
1209 ﬡ 0xfb21 X' Alternative 'alef | |
1210 ﬢ 0xfb22 X-d Alternative dalet | |
1211 ﬣ 0xfb23 X-h Alternative he | |
1212 ﬤ 0xfb24 X-k Alternative kaf | |
1213 ﬥ 0xfb25 X-l Alternative lamed | |
1214 ﬦ 0xfb26 X-m Alternative mem-sofit | |
1215 ﬧ 0xfb27 X-r Alternative resh | |
1216 ﬨ 0xfb28 X-t Alternative tav | |
1217 ﬩ 0xfb29 X-+ Alternative plus | |
1218 שׁ 0xfb2a XW shin+shin-dot | |
1219 שׂ 0xfb2b Xw shin+sin-dot | |
1220 שּׁ 0xfb2c X..W shin+shin-dot+dagesh | |
1221 שּׂ 0xfb2d X..w shin+sin-dot+dagesh | |
1222 אַ 0xfb2e XA alef+patah | |
1223 אָ 0xfb2f XO alef+qamats | |
1224 אּ 0xfb30 XI alef+hiriq (mapiq) | |
1225 בּ 0xfb31 X.b bet+dagesh | |
1226 גּ 0xfb32 X.g gimel+dagesh | |
1227 דּ 0xfb33 X.d dalet+dagesh | |
1228 הּ 0xfb34 X.h he+dagesh | |
1229 וּ 0xfb35 Xu vav+dagesh | |
1230 זּ 0xfb36 X.z zayin+dagesh | |
1231 טּ 0xfb38 X.T tet+dagesh | |
1232 יּ 0xfb39 X.y yud+dagesh | |
1233 ךּ 0xfb3a X.K kaf sofit+dagesh | |
1234 כּ 0xfb3b X.k kaf+dagesh | |
1235 לּ 0xfb3c X.l lamed+dagesh | |
1236 מּ 0xfb3e X.m mem+dagesh | |
1237 נּ 0xfb40 X.n nun+dagesh | |
1238 סּ 0xfb41 X.s samech+dagesh | |
1239 ףּ 0xfb43 X.P pe sofit+dagesh | |
1240 פּ 0xfb44 X.p pe+dagesh | |
1241 צּ 0xfb46 X.x tsadi+dagesh | |
1242 קּ 0xfb47 X.q qof+dagesh | |
1243 רּ 0xfb48 X.r resh+dagesh | |
1244 שּ 0xfb49 X.w shin+dagesh | |
1245 תּ 0xfb4a X.t tav+dagesh | |
1246 וֹ 0xfb4b Xo vav+holam | |
1247 בֿ 0xfb4c XRb bet+rafe | |
1248 כֿ 0xfb4d XRk kaf+rafe | |
1249 פֿ 0xfb4e XRp pe+rafe | |
1250 ﭏ 0xfb4f Xal alef-lamed | |
1251 | |
1252 ============================================================================== | |
1253 10. Using UTF-8 *mbyte-utf8* *UTF-8* *utf-8* *utf8* | |
1254 *Unicode* *unicode* | |
1255 The Unicode character set was designed to include all characters from other | |
1256 character sets. Therefore it is possible to write text in any language using | |
1257 Unicode (with a few rarely used languages excluded). And it's mostly possible | |
1258 to mix these languages in one file, which is impossible with other encodings. | |
1259 | |
1260 Unicode can be encoded in several ways. The most popular one is UTF-8, which |
1261 uses one or more bytes for each character and is backwards compatible with |
1262 ASCII. On MS-Windows UTF-16 is also used (previously UCS-2), which uses |
1263 16-bit words. Vim can support all of these encodings, but always uses UTF-8 |
7 | 1264 internally. |
1265 | |
1266 Vim has comprehensive UTF-8 support. It works well in: |
7 | 1267 - xterm with utf-8 support enabled |
1268 - Athena, Motif and GTK GUI | |
1269 - MS-Windows GUI | |
1270 - several other platforms |
7 | 1271 |
Double-width characters are supported.  This works best with 'guifontwide' or
'guifontset'.  When using only 'guifont' the wide characters are drawn in the
normal width, followed by a space to fill the gap.  Note that the
'guifontset' option is no longer relevant in the GTK+ 2 GUI.
1276 | |
1277 *bom-bytes* |
1278 When reading a file a BOM (Byte Order Mark) can be used to recognize the |
1279 Unicode encoding: |
1280 EF BB BF utf-8 |
1281 FE FF utf-16 big endian |
1282 FF FE utf-16 little endian |
1283 00 00 FE FF utf-32 big endian |
1284 FF FE 00 00 utf-32 little endian |
1285 |
1286 Utf-8 is the recommended encoding. Note that it's difficult to tell utf-16 |
1287 and utf-32 apart. Utf-16 is often used on MS-Windows, utf-32 is not |
de5a43c5eedc
Update documentation files.
Bram Moolenaar <bram@zimbu.org>
parents:
1702
diff
changeset
|
1288 widespread as file format. |
1289 |
1290 |
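The BOM table above can be turned into a small lookup. The following is only a sketch in Python, not Vim's actual detection code; the names `BOMS` and `detect_bom` are made up for this illustration. The 4-byte utf-32 marks are tested before the 2-byte utf-16 marks, because "FF FE 00 00" also starts with the utf-16 little endian mark "FF FE". A utf-16 little endian file whose first character is NUL would still be misread as utf-32, which is one reason the two are hard to tell apart.

```python
import codecs

# (mark, name) pairs, using the names from the table above.
# utf-32 comes before utf-16 so the longer mark is matched first.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_BE, "utf-32 big endian"),
    (codecs.BOM_UTF32_LE, "utf-32 little endian"),
    (codecs.BOM_UTF16_BE, "utf-16 big endian"),
    (codecs.BOM_UTF16_LE, "utf-16 little endian"),
]


def detect_bom(data):
    """Return (encoding name, BOM length), or (None, 0) when there is no BOM."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

The returned length tells the caller how many bytes to skip before decoding the rest of the file.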
714 | 1291 *mbyte-combining* *mbyte-composing* |
1292 A composing or combining character is used to change the meaning of the | |
1293 character before it. The combining characters are drawn on top of the | |
856 | 1294 preceding character. |
714 | 1295 Up to two combining characters can be used by default. This can be changed |
1296 with the 'maxcombine' option. | |
1297 When editing text a composing character is mostly considered part of the | |
1298 preceding character. For example "x" will delete a character and its | |
1299 following composing characters by default. | |
1300 If the 'delcombine' option is on, then pressing "x" will delete the combining | |
7 | 1301 characters, one at a time, then the base character. But when inserting, you |
1302 type the first character and the following composing characters separately, | |
1303 after which they will be joined. The "r" command will not allow you to type a | |
1304 combining character, because it doesn't know one is coming. Use "R" instead. | |
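To illustrate how a composing character belongs to the character before it, here is a rough Python sketch that groups each base character with the composing characters that follow it. It assumes that "composing" can be approximated by a non-zero Unicode canonical combining class; Vim uses its own tables, so the result may differ for some characters. The function name `split_clusters` is hypothetical.

```python
import unicodedata


def split_clusters(text):
    """Group every base character with its following composing characters."""
    clusters = []
    for ch in text:
        # A non-zero combining class marks a composing character; attach it
        # to the preceding cluster when there is one.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters
```

With this grouping, deleting one list item corresponds to what "x" does by default, while deleting the last code point of an item first is closer to the behavior with 'delcombine' set.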
1305 | |
1306 Bytes which are not part of a valid UTF-8 byte sequence are handled like a | |
1307 single character and displayed as <xx>, where "xx" is the hex value of the | |
1308 byte. | |
1309 | |
1310 Overlong sequences are not handled specially and are displayed like a valid | |
1311 character. However, search patterns may not match on an overlong sequence. | |
1312 (An overlong sequence is one where more bytes are used than required for the | |
1313 character.) An exception is NUL (zero), which is displayed as "<00>". | |
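As an illustration of what "overlong" means, the Python sketch below decodes a single UTF-8 style sequence without rejecting overlong forms, then checks whether the same character could have been encoded in fewer bytes. It is not how Vim implements this; `decode_lenient` and `is_overlong` are names invented for the example, and surrogates and values above U+10FFFF are not handled.

```python
def decode_lenient(seq):
    """Decode one UTF-8 style sequence, accepting overlong forms."""
    first = seq[0]
    if first < 0x80:
        return first
    # The number of leading 1 bits in the first byte is the sequence length.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length < 2 or length > 6 or length != len(seq):
        raise ValueError("not a complete multi-byte sequence")
    # The bits below the length marker carry the top of the value.
    value = first & (mask - 1)
    for byte in seq[1:]:
        if byte & 0xC0 != 0x80:
            raise ValueError("bad continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


def is_overlong(seq):
    """True when the character would fit in fewer bytes than were used."""
    value = decode_lenient(seq)
    return len(chr(value).encode("utf-8")) < len(seq)
```

For example, the bytes C0 AF decode to "/" (hex 2F), which needs only one byte, so the sequence is overlong. The bytes C0 80 are the overlong form of NUL mentioned above.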
1314 | |
1315 In the file and buffer the full range of Unicode characters can be used (31 | |
2965 | 1316 bits). However, displaying only works for the characters present in the |
1317 selected font. | |
7 | 1318 |
1319 Useful commands: | |
1320 - "ga" shows the decimal, hexadecimal and octal value of the character under | |
236 | 1321 the cursor. If there are composing characters these are shown too. (If the |
7 | 1322 message is truncated, use ":messages".) |
1323 - "g8" shows the bytes used in a UTF-8 character, also the composing | |
1324 characters, as hex numbers. | |
1325 - ":set encoding=utf-8 fileencodings=" forces using UTF-8 for all files. The | |
1326 default is to use the current locale for 'encoding' and set 'fileencodings' | |
1621 | 1327 to automatically detect the encoding of a file. |
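The information reported by "ga" and "g8" can be reproduced outside Vim. The Python sketch below is only an approximation of what those commands show, not their exact output format; the helper names are hypothetical.

```python
def code_points(text):
    """Like "ga": decimal, hex and octal value of each code point."""
    return [(ord(ch), "%x" % ord(ch), "%o" % ord(ch)) for ch in text]


def utf8_bytes_hex(text):
    """Like "g8": the UTF-8 bytes of the text as hex numbers."""
    return " ".join("%02x" % byte for byte in text.encode("utf-8"))
```

For an "e" followed by the composing character U+0301, `code_points` yields one entry for each of the two code points, and `utf8_bytes_hex` yields "65 cc 81".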
7 | 1328 |
1329 | |
1330 STARTING VIM | |
1331 | |
1332 If your current locale uses a utf-8 encoding, Vim will automatically start | |
1333 in utf-8 mode. | |
1334 | |
1335 If you are using another locale: > | |
1336 | |
1337 set encoding=utf-8 | |
1338 | |
1339 You might also want to select the font used for the menus. Unfortunately this | |
1340 doesn't always work. See the system specific remarks below, and 'langmenu'. | |
1341 | |
1342 | |
1343 USING UTF-8 IN X-Windows *utf-8-in-xwindows* | |
1344 | |
1345 Note: This section does not apply to the GTK+ 2 GUI. | |
1346 | |
1347 You need to specify a font to be used. For double-wide characters another | |
1348 font is required, which is exactly twice as wide. There are three ways to do | |
1349 this: | |
1350 | |
1351 1. Set 'guifont' and let Vim find a matching 'guifontwide' | |
1352 2. Set 'guifont' and 'guifontwide' | |
1353 3. Set 'guifontset' | |
1354 | |
1355 See the documentation for each option for details. Example: > | |
1356 | |
1357 :set guifont=-misc-fixed-medium-r-normal--15-140-75-75-c-90-iso10646-1 | |
1358 | |
1359 You might also want to set the font used for the menus. This only works for | |
1360 Motif. Use the ":hi Menu font={fontname}" command for this. |:highlight| | |
1361 | |
1362 | |
1363 TYPING UTF-8 *utf-8-typing* | |
1364 | |
1365 If you are using X-Windows, you should find an input method that supports | |
1366 utf-8. | |
1367 | |
1368 If your system does not provide support for typing utf-8, you can use the | |
1369 'keymap' feature. This allows writing a keymap file, which defines a utf-8 | |
1370 character as a sequence of ASCII characters. See |mbyte-keymap|. | |
1371 | |
1372 Another method is to set the current locale to the language you want to use | |
1373 and for which you have an XIM available. Then set 'termencoding' to the encoding | |
1374 of that locale and Vim will convert the typed characters to 'encoding' for you. | |
1375 | |
1376 If everything else fails, you can type any character as four hex digits: > | |
1377 | |
1378 CTRL-V u 1234 | |
1379 | |
1380 "1234" is interpreted as a hex number. You must type four digits; prepend | |
1381 zeros if necessary. | |
1382 | |
1383 | |
1384 COMMAND ARGUMENTS *utf-8-char-arg* | |
1385 | |
1386 Commands like |f|, |F|, |t| and |r| take an argument of one character. For | |
167 | 1387 UTF-8 this argument may include one or two composing characters. These need |
7 | 1388 to be produced together with the base character; Vim doesn't wait for the next |
1389 character to be typed to find out if it is a composing character or not. | |
1390 Using 'keymap' or |:lmap| is a nice way to type these characters. | |
1391 | |
1392 The commands that search for a character in a line handle composing characters | |
1393 as follows. When searching for a character without a composing character, | |
1394 this will find matches in the text with or without composing characters. When | |
1395 searching for a character with a composing character, this will only find | |
1396 matches with that composing character. It was implemented this way, because | |
1397 not everybody is able to type a composing character. | |
1398 | |
1399 | |
1400 ============================================================================== | |
1401 11. Overview of options *mbyte-options* | |
1402 | |
1403 These options are relevant for editing multi-byte files. Check the help in | |
1404 options.txt for detailed information. | |
1405 | |
1406 'encoding' Encoding used for the keyboard and display. It is also the | |
1407 default encoding for files. | |
1408 | |
1409 'fileencoding' Encoding of a file. When it's different from 'encoding' | |
1410 conversion is done when reading or writing the file. | |
1411 | |
1412 'fileencodings' List of possible encodings of a file. When opening a file | |
1413 these will be tried and the first one that doesn't cause an | |
1414 error is used for 'fileencoding'. | |
1415 | |
1416 'charconvert' Expression used to convert files from one encoding to another. | |
1417 | |
1418 'formatoptions' The 'm' flag can be included to have formatting break a line | |
1419 at a multibyte character of 256 or higher. This is useful for | |
1420 languages where a sequence of characters can be broken | |
1421 anywhere. | |
1422 | |
1423 'guifontset' The list of font names used for a multi-byte encoding. When | |
1424 this option is not empty, it replaces 'guifont'. | |
1425 | |
1426 'keymap' Specify the name of a keyboard mapping. | |
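The 'fileencodings' entry above describes a "try each one, take the first that gives no error" strategy. A minimal Python sketch of that idea follows. It leaves out everything else Vim does when reading a file (BOM checks, conversion to 'encoding'), and `read_with_fallback` is a name made up for this example.

```python
def read_with_fallback(data, encodings):
    """Decode data with the first encoding in the list that raises no error."""
    for name in encodings:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    raise ValueError("no encoding in the list could decode the data")
```

The order of the list matters: an 8-bit encoding such as latin-1 accepts any byte sequence, so it only makes sense as the last entry.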
1427 | |
1428 ============================================================================== | |
1429 | |
1430 Contributions specifically for the multi-byte features by: | |
1431 Chi-Deok Hwang <hwang@mizi.co.kr> | |
1432 SungHyun Nam <goweol@gmail.com> |
7 | 1433 K.Nagano <nagano@atese.advantest.co.jp> |
1434 Taro Muraoka <koron@tka.att.ne.jp> | |
1435 Yasuhiro Matsumoto <mattn@mail.goo.ne.jp> | |
1436 | |
1437 vim:tw=78:ts=8:ft=help:norl: |