|
1 *usr_27.txt* For Vim version 7.0. Last change: 2006 Apr 24
|
|
2
|
|
3 VIM USER MANUAL - by Bram Moolenaar
|
|
4
|
|
5 Search commands and patterns
|
|
6
|
|
7
|
|
8 In chapter 3 a few simple search patterns were mentioned |03.9|. Vim can do
|
|
9 much more complex searches. This chapter explains the most often used ones.
|
|
10 A detailed specification can be found here: |pattern|
|
|
11
|
|
12 |27.1| Ignoring case
|
|
13 |27.2| Wrapping around the file end
|
|
14 |27.3| Offsets
|
|
15 |27.4| Matching multiple times
|
|
16 |27.5| Alternatives
|
|
17 |27.6| Character ranges
|
|
18 |27.7| Character classes
|
|
19 |27.8| Matching a line break
|
|
20 |27.9| Examples
|
|
21
|
|
22 Next chapter: |usr_28.txt| Folding
|
|
23 Previous chapter: |usr_26.txt| Repeating
|
|
24 Table of contents: |usr_toc.txt|
|
|
25
|
|
26 ==============================================================================
|
|
27 *27.1* Ignoring case
|
|
28
|
|
29 By default, Vim's searches are case sensitive. Therefore, "include",
|
|
30 "INCLUDE", and "Include" are three different words and a search will match
|
|
31 only one of them.
|
|
32 Now switch on the 'ignorecase' option: >
|
|
33
|
|
34 :set ignorecase
|
|
35
|
|
36 Search for "include" again, and now it will match "Include", "INCLUDE" and
|
|
37 "InClUDe". (Set the 'hlsearch' option to quickly see where a pattern
|
|
38 matches.)
|
|
39 You can switch this off again with: >
|
|
40
|
|
41 :set noignorecase
|
|
42
|
|
But let's keep it set, and search for "INCLUDE". It will match exactly the
|
|
44 same text as "include" did. Now set the 'smartcase' option: >
|
|
45
|
|
46 :set ignorecase smartcase
|
|
47
|
|
48 If you have a pattern with at least one uppercase character, the search
|
|
49 becomes case sensitive. The idea is that you didn't have to type that
|
|
50 uppercase character, so you must have done it because you wanted case to
|
|
51 match. That's smart!
|
|
52 With these two options set you find the following matches:
|
|
53
|
|
	pattern			matches	~
	word			word, Word, WORD, WoRd, etc.
	Word			Word
	WORD			WORD
	WoRd			WoRd
|
|
59
|
|
60
|
|
61 CASE IN ONE PATTERN
|
|
62
|
|
63 If you want to ignore case for one specific pattern, you can do this by
|
|
prepending the "\c" string. Using "\C" will make the pattern match case.
This overrules the 'ignorecase' and 'smartcase' options; when "\c" or "\C" is
used their value doesn't matter.
|
|
67
|
|
	pattern			matches	~
	\Cword			word
	\CWord			Word
	\cword			word, Word, WORD, WoRd, etc.
	\cWord			word, Word, WORD, WoRd, etc.
|
|
73
|
|
74 A big advantage of using "\c" and "\C" is that it sticks with the pattern.
|
|
75 Thus if you repeat a pattern from the search history, the same will happen, no
|
|
76 matter if 'ignorecase' or 'smartcase' was changed.
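
To check how "\c" and "\C" behave without touching any options, you can test a
pattern against a string with the "=~" operator. This is only a sketch: "=~"
belongs to Vim script expressions, which this chapter does not cover, and the
sample strings are made up for illustration: >

	:echo 'Word' =~ '\cword'
	:echo 'Word' =~ '\Cword'
	:echo 'word' =~ '\CWord'

The first command should echo 1 (a match), the other two 0 (no match),
whatever the values of 'ignorecase' and 'smartcase' are.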
|
|
77
|
|
Note:
	The use of "\" items in search patterns depends on the 'magic' option.
	In this chapter we will assume 'magic' is on, because that is the
	standard and recommended setting. If you changed 'magic', many
	search patterns would suddenly become invalid.
|
|
83
|
|
84 Note:
|
|
85 If your search takes much longer than you expected, you can interrupt
|
|
86 it with CTRL-C on Unix and CTRL-Break on MS-DOS and MS-Windows.
|
|
87
|
|
88 ==============================================================================
|
|
89 *27.2* Wrapping around the file end
|
|
90
|
|
91 By default, a forward search starts searching for the given string at the
|
|
92 current cursor location. It then proceeds to the end of the file. If it has
|
|
93 not found the string by that time, it starts from the beginning and searches
|
|
94 from the start of the file to the cursor location.
|
|
95 Keep in mind that when repeating the "n" command to search for the next
|
|
96 match, you eventually get back to the first match. If you don't notice this
|
|
97 you keep searching forever! To give you a hint, Vim displays this message:
|
|
98
|
|
99 search hit BOTTOM, continuing at TOP ~
|
|
100
|
|
101 If you use the "?" command, to search in the other direction, you get this
|
|
102 message:
|
|
103
|
|
104 search hit TOP, continuing at BOTTOM ~
|
|
105
|
|
106 Still, you don't know when you are back at the first match. One way to see
|
|
107 this is by switching on the 'ruler' option: >
|
|
108
|
|
109 :set ruler
|
|
110
|
|
111 Vim will display the cursor position in the lower righthand corner of the
|
|
112 window (in the status line if there is one). It looks like this:
|
|
113
|
|
114 101,29 84% ~
|
|
115
|
|
116 The first number is the line number of the cursor. Remember the line number
|
|
117 where you started, so that you can check if you passed this position again.
|
|
118
|
|
119
|
|
120 NOT WRAPPING
|
|
121
|
|
122 To turn off search wrapping, use the following command: >
|
|
123
|
|
124 :set nowrapscan
|
|
125
|
|
126 Now when the search hits the end of the file, an error message displays:
|
|
127
|
|
128 E385: search hit BOTTOM without match for: forever ~
|
|
129
|
|
Thus you can find all matches by going to the start of the file with "gg" and
searching until you see this message.
|
|
132 If you search in the other direction, using "?", you get:
|
|
133
|
|
134 E384: search hit TOP without match for: forever ~
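
The "gg" recipe above can also be written as a small script. This is a
sketch, not the only way to do it: it assumes the search() function, whose "W"
flag stops the search at the end of the file regardless of 'wrapscan' and
whose "c" flag accepts a match at the cursor position. The buffer text and
the name "hits" are made up for illustration: >

	:new
	:call setline(1, ['forever', 'and forever', 'never'])
	:call cursor(1, 1)
	:let hits = []
	:let flags = 'cW'
	:while search('forever', flags) > 0
	:  call add(hits, line('.'))
	:  let flags = 'W'
	:endwhile
	:echo hits

This should echo [1, 2]: one match in each of the first two lines, and none
in "never".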
|
|
135
|
|
136 ==============================================================================
|
|
137 *27.3* Offsets
|
|
138
|
|
139 By default, the search command leaves the cursor positioned on the beginning
|
|
140 of the pattern. You can tell Vim to leave it some other place by specifying
|
|
141 an offset. For the forward search command "/", the offset is specified by
|
|
142 appending a slash (/) and the offset: >
|
|
143
|
|
144 /default/2
|
|
145
|
|
146 This command searches for the pattern "default" and then moves to the
|
|
147 beginning of the second line past the pattern. Using this command on the
|
|
148 paragraph above, Vim finds the word "default" in the first line. Then the
|
|
149 cursor is moved two lines down and lands on "an offset".
|
|
150
|
|
151 If the offset is a simple number, the cursor will be placed at the beginning
|
|
152 of the line that many lines from the match. The offset number can be positive
|
|
153 or negative. If it is positive, the cursor moves down that many lines; if
|
|
154 negative, it moves up.
|
|
155
|
|
156
|
|
157 CHARACTER OFFSETS
|
|
158
|
|
159 The "e" offset indicates an offset from the end of the match. It moves the
|
|
160 cursor onto the last character of the match. The command: >
|
|
161
|
|
162 /const/e
|
|
163
|
|
164 puts the cursor on the "t" of "const".
|
|
165 From that position, adding a number moves forward that many characters.
|
|
166 This command moves to the character just after the match: >
|
|
167
|
|
168 /const/e+1
|
|
169
|
|
170 A positive number moves the cursor to the right, a negative number moves it to
|
|
171 the left. For example: >
|
|
172
|
|
173 /const/e-1
|
|
174
|
|
175 moves the cursor to the "s" of "const".
|
|
176
|
|
177 If the offset begins with "b", the cursor moves to the beginning of the
|
|
178 pattern. That's not very useful, since leaving out the "b" does the same
|
|
179 thing. It does get useful when a number is added or subtracted. The cursor
|
|
180 then goes forward or backward that many characters. For example: >
|
|
181
|
|
182 /const/b+2
|
|
183
|
|
184 Moves the cursor to the beginning of the match and then two characters to the
|
|
185 right. Thus it lands on the "n".
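
The character offsets can be tried out in a scratch window. A minimal sketch,
assuming a made-up line of text; ":execute" with ":normal!" is used here only
to type the search command from a script: >

	:new
	:call setline(1, 'int const x;')
	:call cursor(1, 1)
	:execute "normal! /const/e\<CR>"
	:echo col('.')
	:call cursor(1, 1)
	:execute "normal! /const/b+2\<CR>"
	:echo col('.')

In this line "const" occupies columns 5 to 9. The first ":echo" should show
9 (the "t"), the second 7 (the "n").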
|
|
186
|
|
187
|
|
188 REPEATING
|
|
189
|
|
190 To repeat searching for the previously used search pattern, but with a
|
|
191 different offset, leave out the pattern: >
|
|
192
|
|
193 /that
|
|
194 //e
|
|
195
|
|
196 Is equal to: >
|
|
197
|
|
198 /that/e
|
|
199
|
|
200 To repeat with the same offset: >
|
|
201
|
|
202 /
|
|
203
|
|
204 "n" does the same thing. To repeat while removing a previously used offset: >
|
|
205
|
|
206 //
|
|
207
|
|
208
|
|
209 SEARCHING BACKWARDS
|
|
210
|
|
211 The "?" command uses offsets in the same way, but you must use "?" to separate
|
|
212 the offset from the pattern, instead of "/": >
|
|
213
|
|
214 ?const?e-2
|
|
215
|
|
216 The "b" and "e" keep their meaning, they don't change direction with the use
|
|
217 of "?".
|
|
218
|
|
219
|
|
220 START POSITION
|
|
221
|
|
222 When starting a search, it normally starts at the cursor position. When you
|
|
223 specify a line offset, this can cause trouble. For example: >
|
|
224
|
|
225 /const/-2
|
|
226
|
|
227 This finds the next word "const" and then moves two lines up. If you
|
|
228 use "n" to search again, Vim could start at the current position and find the same
|
|
229 "const" match. Then using the offset again, you would be back where you started.
|
|
230 You would be stuck!
|
|
231 It could be worse: Suppose there is another match with "const" in the next
|
|
232 line. Then repeating the forward search would find this match and move two
|
|
233 lines up. Thus you would actually move the cursor back!
|
|
234
|
|
235 When you specify a character offset, Vim will compensate for this. Thus the
|
|
236 search starts a few characters forward or backward, so that the same match
|
|
237 isn't found again.
|
|
238
|
|
239 ==============================================================================
|
|
240 *27.4* Matching multiple times
|
|
241
|
|
242 The "*" item specifies that the item before it can match any number of times.
|
|
243 Thus: >
|
|
244
|
|
245 /a*
|
|
246
|
|
247 matches "a", "aa", "aaa", etc. But also "" (the empty string), because zero
|
|
248 times is included.
|
|
249 The "*" only applies to the item directly before it. Thus "ab*" matches
|
|
250 "a", "ab", "abb", "abbb", etc. To match a whole string multiple times, it
|
|
251 must be grouped into one item. This is done by putting "\(" before it and
|
|
252 "\)" after it. Thus this command: >
|
|
253
|
|
254 /\(ab\)*
|
|
255
|
|
256 Matches: "ab", "abab", "ababab", etc. And also "".
|
|
257
|
|
258 To avoid matching the empty string, use "\+". This makes the previous item
|
|
259 match one or more times. >
|
|
260
|
|
261 /ab\+
|
|
262
|
|
263 Matches "ab", "abb", "abbb", etc. It does not match "a" when no "b" follows.
|
|
264
|
|
265 To match an optional item, use "\=". Example: >
|
|
266
|
|
267 /folders\=
|
|
268
|
|
269 Matches "folder" and "folders".
|
|
270
|
|
271
|
|
272 SPECIFIC COUNTS
|
|
273
|
|
274 To match a specific number of items use the form "\{n,m}". "n" and "m" are
|
|
275 numbers. The item before it will be matched "n" to "m" times |inclusive|.
|
|
276 Example: >
|
|
277
|
|
278 /ab\{3,5}
|
|
279
|
|
280 matches "abbb", "abbbb" and "abbbbb".
|
|
281 When "n" is omitted, it defaults to zero. When "m" is omitted it defaults
|
|
282 to infinity. When ",m" is omitted, it matches exactly "n" times.
|
|
283 Examples:
|
|
284
|
|
	pattern		match count ~
	\{,4}		0, 1, 2, 3 or 4
	\{3,}		3, 4, 5, etc.
	\{0,1}		0 or 1, same as \=
	\{0,}		0 or more, same as *
	\{1,}		1 or more, same as \+
	\{3}		3
|
|
292
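You can check a count against sample text with the matchstr() function, which
returns the text that a pattern matches in a string, or an empty string when
there is no match. A sketch with made-up strings: >

	:echo matchstr('abbbbbbb', 'ab\{3,5}')
	:echo matchstr('abb', 'ab\{3,5}')
	:echo matchstr('folder', 'folders\=')

The first command should echo "abbbbb" (at most five b's are used), the second
an empty line (there are too few b's), the third "folder".
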
|
|
293
|
|
294 MATCHING AS LITTLE AS POSSIBLE
|
|
295
|
|
296 The items so far match as many characters as they can find. To match as few
|
|
297 as possible, use "\{-n,m}". It works the same as "\{n,m}", except that the
|
|
298 minimal amount possible is used.
|
|
299 For example, use: >
|
|
300
|
|
301 /ab\{-1,3}
|
|
302
|
|
This will match "ab" in "abbb". Actually, it will never match more than one b,
|
|
304 because there is no reason to match more. It requires something else to force
|
|
305 it to match more than the lower limit.
|
|
306 The same rules apply to removing "n" and "m". It's even possible to remove
|
|
307 both of the numbers, resulting in "\{-}". This matches the item before it
|
|
308 zero or more times, as few as possible. The item by itself always matches
|
|
309 zero times. It is useful when combined with something else. Example: >
|
|
310
|
|
311 /a.\{-}b
|
|
312
|
|
This matches "axb" in "axbxb". If this pattern were used: >
|
|
314
|
|
315 /a.*b
|
|
316
|
|
317 It would try to match as many characters as possible with ".*", thus it
|
|
318 matches "axbxb" as a whole.
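
The difference is easy to see with the matchstr() function, which returns the
text matched in a string. A sketch using the sample text from above: >

	:echo matchstr('axbxb', 'a.\{-}b')
	:echo matchstr('axbxb', 'a.*b')
	:echo matchstr('abbb', 'ab\{-1,3}')

These should echo "axb", "axbxb" and "ab".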
|
|
319
|
|
320 ==============================================================================
|
|
321 *27.5* Alternatives
|
|
322
|
|
323 The "or" operator in a pattern is "\|". Example: >
|
|
324
|
|
325 /foo\|bar
|
|
326
|
|
327 This matches "foo" or "bar". More alternatives can be concatenated: >
|
|
328
|
|
329 /one\|two\|three
|
|
330
|
|
331 Matches "one", "two" and "three".
|
|
332 To match multiple times, the whole thing must be placed in "\(" and "\)": >
|
|
333
|
|
334 /\(foo\|bar\)\+
|
|
335
|
|
336 This matches "foo", "foobar", "foofoo", "barfoobar", etc.
|
|
337 Another example: >
|
|
338
|
|
339 /end\(if\|while\|for\)
|
|
340
|
|
341 This matches "endif", "endwhile" and "endfor".
|
|
342
|
|
343 A related item is "\&". This requires that both alternatives match in the
|
|
344 same place. The resulting match uses the last alternative. Example: >
|
|
345
|
|
346 /forever\&...
|
|
347
|
|
348 This matches "for" in "forever". It will not match "fortuin", for example.
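
To see what the alternatives match, you can feed them to the matchstr()
function, which returns the matched part of a string (empty when nothing
matches). A sketch: >

	:echo matchstr('endwhile', 'end\(if\|while\|for\)')
	:echo matchstr('forever', 'forever\&...')
	:echo matchstr('fortuin', 'forever\&...')

These should echo "endwhile", "for" and an empty line.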
|
|
349
|
|
350 ==============================================================================
|
|
351 *27.6* Character ranges
|
|
352
|
|
353 To match "a", "b" or "c" you could use "/a\|b\|c". When you want to match all
|
|
354 letters from "a" to "z" this gets very long. There is a shorter method: >
|
|
355
|
|
356 /[a-z]
|
|
357
|
|
358 The [] construct matches a single character. Inside you specify which
|
|
359 characters to match. You can include a list of characters, like this: >
|
|
360
|
|
361 /[0123456789abcdef]
|
|
362
|
|
363 This will match any of the characters included. For consecutive characters
|
|
364 you can specify the range. "0-3" stands for "0123". "w-z" stands for "wxyz".
|
|
365 Thus the same command as above can be shortened to: >
|
|
366
|
|
367 /[0-9a-f]
|
|
368
|
|
369 To match the "-" character itself make it the first or last one in the range.
|
|
370 These special characters are accepted to make it easier to use them inside a
|
|
371 [] range (they can actually be used anywhere in the search pattern):
|
|
372
|
|
373 \e <Esc>
|
|
374 \t <Tab>
|
|
375 \r <CR>
|
|
376 \b <BS>
|
|
377
|
|
378 There are a few more special cases for [] ranges, see |/[]| for the whole
|
|
379 story.
|
|
380
|
|
381
|
|
382 COMPLEMENTED RANGE
|
|
383
|
|
384 To avoid matching a specific character, use "^" at the start of the range.
|
|
385 The [] item then matches everything but the characters included. Example: >
|
|
386
|
|
387 /"[^"]*"
|
|
388 <
|
|
389 " a double quote
|
|
390 [^"] any character that is not a double quote
|
|
391 * as many as possible
|
|
392 " a double quote again
|
|
393
|
|
394 This matches "foo" and "3!x", including the double quotes.
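
A quick way to try this on sample text is the matchstr() function, which
returns the first part of a string that matches. The sentence is made up: >

	:echo matchstr('say "foo" and "3!x"', '"[^"]*"')

This should echo "foo", including the double quotes.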
|
|
395
|
|
396
|
|
397 PREDEFINED RANGES
|
|
398
|
|
399 A number of ranges are used very often. Vim provides a shortcut for these.
|
|
400 For example: >
|
|
401
|
|
402 /\a
|
|
403
|
|
404 Finds alphabetic characters. This is equal to using "/[a-zA-Z]". Here are a
|
|
405 few more of these:
|
|
406
|
|
	item	matches			equivalent ~
	\d	digit			[0-9]
	\D	non-digit		[^0-9]
	\x	hex digit		[0-9a-fA-F]
	\X	non-hex digit		[^0-9a-fA-F]
	\s	white space		[ 	]     (<Tab> and <Space>)
	\S	non-white characters	[^ 	]     (not <Tab> and <Space>)
	\l	lowercase alpha		[a-z]
	\L	non-lowercase alpha	[^a-z]
	\u	uppercase alpha		[A-Z]
	\U	non-uppercase alpha	[^A-Z]
|
|
418
|
|
Note:
	Using these predefined ranges works a lot faster than the character
	ranges they stand for.
	These items cannot be used inside []. Thus "[\d\l]" does NOT work to
	match a digit or lowercase alpha. Use "\(\d\|\l\)" instead.
|
|
424
|
|
425 See |/\s| for the whole list of these ranges.
|
|
426
|
|
427 ==============================================================================
|
|
428 *27.7* Character classes
|
|
429
|
|
430 The character range matches a fixed set of characters. A character class is
|
|
431 similar, but with an essential difference: The set of characters can be
|
|
432 redefined without changing the search pattern.
|
|
433 For example, search for this pattern: >
|
|
434
|
|
435 /\f\+
|
|
436
|
|
The "\f" item stands for file name characters. Thus this matches a sequence
|
|
438 of characters that can be a file name.
|
|
439 Which characters can be part of a file name depends on the system you are
|
|
440 using. On MS-Windows, the backslash is included, on Unix it is not. This is
|
|
441 specified with the 'isfname' option. The default value for Unix is: >
|
|
442
|
|
443 :set isfname
|
|
444 isfname=@,48-57,/,.,-,_,+,,,#,$,%,~,=
|
|
445
|
|
446 For other systems the default value is different. Thus you can make a search
|
|
447 pattern with "\f" to match a file name, and it will automatically adjust to
|
|
448 the system you are using it on.
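
As a sketch of how this works out, take a made-up sentence that mentions a
file name and let the matchstr() function return the matching part: >

	:echo matchstr('edit /usr/share/vim/vimrc now', '/\f\+')

With the Unix default for 'isfname' shown above this should echo
"/usr/share/vim/vimrc"; the match stops at the space, which is not a file name
character. With another 'isfname' value the result can differ.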
|
|
449
|
|
450 Note:
|
|
451 Actually, Unix allows using just about any character in a file name,
|
|
452 including white space. Including these characters in 'isfname' would
|
|
453 be theoretically correct. But it would make it impossible to find the
|
|
454 end of a file name in text. Thus the default value of 'isfname' is a
|
|
455 compromise.
|
|
456
|
|
457 The character classes are:
|
|
458
|
|
	item	matches				option ~
	\i	identifier characters		'isident'
	\I	like \i, excluding digits
	\k	keyword characters		'iskeyword'
	\K	like \k, excluding digits
	\p	printable characters		'isprint'
	\P	like \p, excluding digits
	\f	file name characters		'isfname'
	\F	like \f, excluding digits
|
|
468
|
|
469 ==============================================================================
|
|
470 *27.8* Matching a line break
|
|
471
|
|
472 Vim can find a pattern that includes a line break. You need to specify where
|
|
473 the line break happens, because all items mentioned so far don't match a line
|
|
474 break.
|
|
475 To check for a line break in a specific place, use the "\n" item: >
|
|
476
|
|
477 /the\nword
|
|
478
|
|
479 This will match at a line that ends in "the" and the next line starts with
|
|
480 "word". To match "the word" as well, you need to match a space or a line
|
|
481 break. The item to use for it is "\_s": >
|
|
482
|
|
483 /the\_sword
|
|
484
|
|
485 To allow any amount of white space: >
|
|
486
|
|
487 /the\_s\+word
|
|
488
|
|
489 This also matches when "the " is at the end of a line and " word" at the
|
|
490 start of the next one.
|
|
491
|
|
492 "\s" matches white space, "\_s" matches white space or a line break.
|
|
493 Similarly, "\a" matches an alphabetic character, and "\_a" matches an
|
|
494 alphabetic character or a line break. The other character classes and ranges
|
|
495 can be modified in the same way by inserting a "_".
|
|
496
|
|
497 Many other items can be made to match a line break by prepending "\_". For
|
|
498 example: "\_." matches any character or a line break.
|
|
499
|
|
500 Note:
|
|
501 "\_.*" matches everything until the end of the file. Be careful with
|
|
502 this, it can make a search command very slow.
|
|
503
|
|
504 Another example is "\_[]", a character range that includes a line break: >
|
|
505
|
|
506 /"\_[^"]*"
|
|
507
|
|
This finds text in double quotes that may be split across several lines.
|
|
509
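To try the difference between a plain space and "\_s", you can use the
search() function on a scratch buffer. A sketch with made-up text; the "n"
flag tells search() not to move the cursor, and the result is the line number
of the match or zero: >

	:new
	:call setline(1, ['find the', 'word here'])
	:call cursor(1, 1)
	:echo search('the word', 'n')
	:echo search('the\_sword', 'n')

The first ":echo" should show 0, because "the" and "word" are on different
lines. The second should show 1, the line where the match starts.
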
|
|
510 ==============================================================================
|
|
511 *27.9* Examples
|
|
512
|
|
513 Here are a few search patterns you might find useful. This shows how the
|
|
514 items mentioned above can be combined.
|
|
515
|
|
516
|
|
517 FINDING A CALIFORNIA LICENSE PLATE
|
|
518
|
|
A sample license plate number is "1MGU103". It has one digit, three uppercase
|
|
520 letters and three digits. Directly putting this into a search pattern: >
|
|
521
|
|
522 /\d\u\u\u\d\d\d
|
|
523
|
|
524 Another way is to specify that there are three digits and letters with a
|
|
525 count: >
|
|
526
|
|
527 /\d\u\{3}\d\{3}
|
|
528
|
|
529 Using [] ranges instead: >
|
|
530
|
|
531 /[0-9][A-Z]\{3}[0-9]\{3}
|
|
532
|
|
Which one of these should you use? Whichever one you can remember. The
|
|
534 simple way you can remember is much faster than the fancy way that you can't.
|
|
535 If you can remember them all, then avoid the last one, because it's both more
|
|
536 typing and slower to execute.
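
All three patterns can be checked against the sample plate with the
matchstr() function, which returns the matched part of a string. A sketch;
the variable name and the surrounding words are made up: >

	:let plate = 'plate 1MGU103 seen'
	:echo matchstr(plate, '\d\u\u\u\d\d\d')
	:echo matchstr(plate, '\d\u\{3}\d\{3}')
	:echo matchstr(plate, '[0-9][A-Z]\{3}[0-9]\{3}')

Each command should echo "1MGU103".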
|
|
537
|
|
538
|
|
539 FINDING AN IDENTIFIER
|
|
540
|
|
541 In C programs (and many other computer languages) an identifier starts with a
|
|
542 letter and further consists of letters and digits. Underscores can be used
|
|
543 too. This can be found with: >
|
|
544
|
|
545 /\<\h\w*\>
|
|
546
|
|
547 "\<" and "\>" are used to find only whole words. "\h" stands for "[A-Za-z_]"
|
|
548 and "\w" for "[0-9A-Za-z_]".
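
A sketch of this pattern at work, using the matchstr() function on a made-up
line of C code and assuming the default 'iskeyword' value: >

	:echo matchstr('  foo_bar1 = 3;', '\<\h\w*\>')

This should echo "foo_bar1", the first identifier in the line.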
|
|
549
|
|
550 Note:
|
|
551 "\<" and "\>" depend on the 'iskeyword' option. If it includes "-",
|
|
552 for example, then "ident-" is not matched. In this situation use: >
|
|
553
|
|
554 /\w\@<!\h\w*\w\@!
|
|
555 <
|
|
556 This checks if "\w" does not match before or after the identifier.
|
|
557 See |/\@<!| and |/\@!|.
|
|
558
|
|
559 ==============================================================================
|
|
560
|
|
561 Next chapter: |usr_28.txt| Folding
|
|
562
|
|
563 Copyright: see |manual-copyright| vim:tw=78:ts=8:ft=help:norl:
|