   1 @node Search, Fixit, Display, Top
   2 @chapter Searching and Replacement
   3 @cindex searching
   4
   5   Like other editors, Emacs has commands for searching for occurrences of
   6 a string.  The principal search command is unusual in that it is
   7 @dfn{incremental}: it begins to search before you have finished typing the
   8 search string.  There are also non-incremental search commands more like
   9 those of other editors.
  10
  11   Besides the usual @code{replace-string} command that finds all
  12 occurrences of one string and replaces them with another, Emacs has a fancy
  13 replacement command called @code{query-replace} which asks interactively
  14 which occurrences to replace.
  15
  16 @menu
  17 * Incremental Search::     Search happens as you type the string.
  18 * Non-Incremental Search:: Specify entire string and then search.
  19 * Word Search::            Search for sequence of words.
  20 * Regexp Search::          Search for match for a regexp.
  21 * Regexps::                Syntax of regular expressions.
  22 * Search Case::            To ignore case while searching, or not.
  23 * Replace::                Search, and replace some or all matches.
  24 * Other Repeating Search:: Operating on all matches for some regexp.
  25 @end menu
  26
  27 @node Incremental Search, Non-Incremental Search, Search, Search
  28 @section Incremental Search
  29
  30   An incremental search begins searching as soon as you type the first
  31 character of the search string.  As you type in the search string, Emacs
  32 shows you where the string (as you have typed it so far) is found.
  33 When you have typed enough characters to identify the place you want, you
  34 can stop.  Depending on what you do next, you may or may not need to
  35 terminate the search explicitly with a @key{RET}.
  36
  37 @c WideCommands
  38 @table @kbd
  39 @item C-s
  40 Incremental search forward (@code{isearch-forward}).
  41 @item C-r
  42 Incremental search backward (@code{isearch-backward}).
  43 @end table
  44
  45 @kindex C-s
  46 @kindex C-r
  47 @findex isearch-forward
  48 @findex isearch-backward
  49   @kbd{C-s} starts an incremental search.  @kbd{C-s} reads characters from
  50 the keyboard and positions the cursor at the first occurrence of the
  51 characters that you have typed.  If you type @kbd{C-s} and then @kbd{F},
  52 the cursor moves right after the first @samp{F}.  Type an @kbd{O}, and see
  53 the cursor move to after the first @samp{FO}.  After another @kbd{O}, the
  54 cursor is after the first @samp{FOO} after the place where you started the
  55 search.  Meanwhile, the search string @samp{FOO} has been echoed in the
  56 echo area.@refill
  57
  58   The echo area display ends with three dots when actual searching is going
  59 on.  When search is waiting for more input, the three dots are removed.
  60 (On slow terminals, the three dots are not displayed.)
  61
  62   If you make a mistake in typing the search string, you can erase
  63 characters with @key{DEL}.  Each @key{DEL} cancels the last character of the
  64 search string.  This does not happen until Emacs is ready to read another
  65 input character; first it must either find, or fail to find, the character
  66 you want to erase.  If you do not want to wait for this to happen, use
  67 @kbd{C-g} as described below.@refill
  68
  69   When you are satisfied with the place you have reached, you can type
  70 @key{RET} (or @kbd{C-m}), which stops searching, leaving the cursor where
  71 the search brought it.  Any command not specially meaningful in searches also
  72 stops the search and is then executed.  Thus, typing @kbd{C-a} exits the
  73 search and then moves to the beginning of the line.  @key{RET} is necessary
  74 only if the next command you want to type is a printing character,
  75 @key{DEL}, @key{ESC}, or another control character that is special
  76 within searches (@kbd{C-q}, @kbd{C-w}, @kbd{C-r}, @kbd{C-s}, or @kbd{C-y}).
  77
  78   Sometimes you search for @samp{FOO} and find it, but were actually
  79 looking for a different occurrence of it.  To move to the next occurrence
  80 of the search string, type another @kbd{C-s}.  Do this as often as
  81 necessary.  If you overshoot, you can cancel some @kbd{C-s}
  82 characters with @key{DEL}.
  83
  84   After you exit a search, you can search for the same string again by
  85 typing just @kbd{C-s C-s}: the first @kbd{C-s} is the key that invokes
  86 incremental search, and the second @kbd{C-s} means ``search again''.
  87
  88   If the specified string is not found at all, the echo area displays
  89 the text @samp{Failing I-Search}.  The cursor is after the place where
  90 Emacs found as much of your string as it could.  Thus, if you search for
  91 @samp{FOOT}, and there is no @samp{FOOT}, the cursor may be after the
  92 @samp{FOO} in @samp{FOOL}.  At this point there are several things you
  93 can do.  If you mistyped the search string, correct it.  If you like the
  94 place you have found, you can type @key{RET} or some other Emacs command
  95 to ``accept what the search offered''.  Or you can type @kbd{C-g}, which
  96 removes from the search string the characters that could not be found
  97 (the @samp{T} in @samp{FOOT}), leaving those that were found (the
  98 @samp{FOO} in @samp{FOOT}).  A second @kbd{C-g} at that point cancels
  99 the search entirely, returning point to where it was when the search
 100 started.
 101
 102   If a search is failing and you ask to repeat it by typing another
 103 @kbd{C-s}, it starts again from the beginning of the buffer.  Repeating
 104 a failing backward search with @kbd{C-r} starts again from the end.  This
 105 is called @dfn{wrapping around}.  @samp{Wrapped} appears in the search
 106 prompt once this has happened.
 107
 108 @cindex quitting (in search)
 109   The @kbd{C-g} ``quit'' character does special things during searches;
 110 just what it does depends on the status of the search.  If the search has
 111 found what you specified and is waiting for input, @kbd{C-g} cancels the
 112 entire search.  The cursor moves back to where you started the search.  If
 113 @kbd{C-g} is typed when there are characters in the search string that have
 114 not been found---because Emacs is still searching for them, or because it
 115 has failed to find them---then the search string characters which have not
 116 been found are discarded from the search string.  The
 117 search is now successful and waiting for more input, so a second @kbd{C-g}
 118 cancels the entire search.
 119
 120   To search for a control character such as @kbd{C-s} or @key{DEL} or
 121 @key{ESC}, you must quote it by typing @kbd{C-q} first.  This function
 122 of @kbd{C-q} is analogous to its meaning as an Emacs command: it causes
 123 the following character to be treated the way a graphic character would
 124 normally be treated in the same context.
 125
 126  To search backwards, you can use @kbd{C-r} instead of @kbd{C-s} to
 127 start the search; @kbd{C-r} is the key that runs the command
 128 (@code{isearch-backward}) to search backward.  You can also use
 129 @kbd{C-r} to change from searching forward to searching backwards.  Do
 130 this if a search fails because the place you started was too far down in the
 131 file.  Repeated @kbd{C-r} keeps looking for more occurrences backwards.
 132 @kbd{C-s} starts going forward again.  You can cancel @kbd{C-r} in a
 133 search with @key{DEL}.
 134
 135   The characters @kbd{C-y} and @kbd{C-w} can be used in incremental search
 136 to grab text from the buffer into the search string.  This makes it
 137 convenient to search for another occurrence of text at point.  @kbd{C-w}
 138 copies the word after point as part of the search string, advancing
 139 point over that word.  Another @kbd{C-s} to repeat the search will then
 140 search for a string including that word.  @kbd{C-y} is similar to @kbd{C-w}
 141 but copies the rest of the current line into the search string.
 142
 143   The characters @kbd{M-p} and @kbd{M-n} can be used in an incremental
 144 search to recall things which you have searched for in the past.  A
 145 list of the last 16 things you have searched for is retained, and
 146 @kbd{M-p} and @kbd{M-n} let you cycle through that ring.
 147
 148 The character @kbd{M-@key{TAB}} does completion on the elements in
 149 the search history ring.  For example, if you know that you have
 150 recently searched for the string @code{POTATOE}, you could type
 151 @kbd{C-s P O M-@key{TAB}}.  If you had searched for other strings
 152 beginning with @code{PO} then you would be shown a list of them, and
 153 would need to type more to select one.
 154
 155   You can change any of the special characters in incremental search via
 156 the normal keybinding mechanism: simply add a binding to the
 157 @code{isearch-mode-map}.  For example, to make the character
 158 @kbd{C-b} mean ``search backwards'' while in isearch-mode, do this:
 159
 160 @example
 161 (define-key isearch-mode-map "\C-b" 'isearch-repeat-backward)
 162 @end example
 163
 164 These are the default bindings of isearch-mode:
 165
 166 @findex isearch-delete-char
 167 @findex isearch-exit
 168 @findex isearch-quote-char
 169 @findex isearch-repeat-forward
 170 @findex isearch-repeat-backward
 171 @findex isearch-yank-line
 172 @findex isearch-yank-word
 173 @findex isearch-abort
 174 @findex isearch-ring-retreat
 175 @findex isearch-ring-advance
 176 @findex isearch-complete
 177
 178 @kindex DEL (isearch-mode)
 179 @kindex RET (isearch-mode)
 180 @kindex C-q (isearch-mode)
 181 @kindex C-s (isearch-mode)
 182 @kindex C-r (isearch-mode)
 183 @kindex C-y (isearch-mode)
 184 @kindex C-w (isearch-mode)
 185 @kindex C-g (isearch-mode)
 186 @kindex M-p (isearch-mode)
 187 @kindex M-n (isearch-mode)
 188 @kindex M-TAB (isearch-mode)
 189
 190 @table @kbd
 191 @item DEL
 192 Delete a character from the incremental search string (@code{isearch-delete-char}).
 193 @item RET
 194 Exit incremental search (@code{isearch-exit}).
 195 @item C-q
 196 Quote special characters for incremental search (@code{isearch-quote-char}).
 197 @item C-s
 198 Repeat incremental search forward (@code{isearch-repeat-forward}).
 199 @item C-r
 200 Repeat incremental search backward (@code{isearch-repeat-backward}).
 201 @item C-y
 202 Pull rest of line from buffer into search string (@code{isearch-yank-line}).
 203 @item C-w
 204 Pull next word from buffer into search string (@code{isearch-yank-word}).
 205 @item C-g
 206 Cancels input back to what has been found successfully, or aborts the
 207 isearch (@code{isearch-abort}).
 208 @item M-p
 209 Recall the previous element in the isearch history ring
 210 (@code{isearch-ring-retreat}).
 211 @item M-n
 212 Recall the next element in the isearch history ring
 213 (@code{isearch-ring-advance}).
 214 @item M-@key{TAB}
 215 Do completion on the elements in the isearch history ring
 216 (@code{isearch-complete}).
 217
 218 @end table
 219
 220 Any other character which is normally inserted into a buffer when typed
 221 is automatically added to the search string in isearch-mode.
 222
 223 @subsection Slow Terminal Incremental Search
 224
 225   Incremental search on a slow terminal uses a modified style of display
 226 that is designed to take less time.  Instead of redisplaying the buffer at
 227 each place the search gets to, it creates a new single-line window and uses
 228 that to display the line the search has found.  The single-line window
 229 appears as soon as point gets outside of the text that is already
 230 on the screen.
 231
 232   When the search is terminated, the single-line window is removed.  Only
 233 then is the window in which the search was done redisplayed to show its
 234 new value of point.
 235
 236   The three dots at the end of the search string, normally used to indicate
 237 that searching is going on, are not displayed in slow style display.
 238
 239 @vindex search-slow-speed
 240   The slow terminal style of display is used when the terminal baud rate is
 241 less than or equal to the value of the variable @code{search-slow-speed},
 242 initially 1200.
 243
 244 @vindex search-slow-window-lines
 245   The number of lines to use in slow terminal search display is controlled
 246 by the variable @code{search-slow-window-lines}.  Its normal value is 1.
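
  As a minimal sketch, assuming you want the slow style of display to
apply on somewhat faster lines and to show more context, you could set
both variables in your init file.  The values 2400 and 3 are
illustrative choices, not recommendations:

@example
;; Use the slow-terminal display at 2400 baud or below
;; (illustrative value; the text above gives 1200 as the initial value).
(setq search-slow-speed 2400)
;; Show three lines instead of one in the temporary window
;; (illustrative value).
(setq search-slow-window-lines 3)
@end example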
 247
 248 @node Non-Incremental Search, Word Search, Incremental Search, Search
 249 @section Non-Incremental Search
 250 @cindex non-incremental search
 251
 252   Emacs also has conventional non-incremental search commands, which require
 253 you to type the entire search string before searching begins.
 254
 255 @table @kbd
 256 @item C-s @key{RET} @var{string} @key{RET}
 257 Search for @var{string}.
 258 @item C-r @key{RET} @var{string} @key{RET}
 259 Search backward for @var{string}.
 260 @end table
 261
 262   To do a non-incremental search, first type @kbd{C-s @key{RET}}
 263 (or @kbd{C-s C-m}).  This enters the minibuffer to read the search string.
 264 Terminate the string with @key{RET} to start the search.  If the string
 265 is not found, the search command gets an error.
 266
 267  By default, @kbd{C-s} invokes incremental search, but if you give it an
 268 empty argument, which would otherwise be useless, it invokes non-incremental
 269 search.  Therefore, @kbd{C-s @key{RET}} invokes non-incremental search.
 270 @kbd{C-r @key{RET}} also works this way.
 271
 272 @findex search-forward
 273 @findex search-backward
 274   Forward and backward non-incremental searches are implemented by the
 275 commands @code{search-forward} and @code{search-backward}.  You can bind
 276 these commands to keys.  The reason that incremental
 277 search is programmed to invoke them as well is that @kbd{C-s @key{RET}}
 278 is the traditional sequence of characters used in Emacs to invoke
 279 non-incremental search.
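
  The same commands can also be called as functions from Lisp.  The
following is only a sketch: the helper name @code{my-find-string} and
the key @code{[f5]} are hypothetical choices made for illustration, and
passing @code{t} as the third argument (so that a failed search returns
@code{nil} rather than signaling an error) is an assumption about what
you want.

@example
;; Hypothetical helper: search forward for STRING from point.
;; Returns the position just after the match, or nil if none is found.
(defun my-find-string (string)
  (search-forward string nil t))

;; Illustrative key binding; the key [f5] is an arbitrary choice.
(global-set-key [f5] 'search-forward)
@end example

  In a buffer containing @samp{foo bar foo} with point at the
beginning, @code{(my-find-string "bar")} would return 8, the position
just after @samp{bar}.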
 280
 281  Non-incremental searches performed using @kbd{C-s @key{RET}} do
 282 not call @code{search-forward} right away.  They first check
 283 if the next character is @kbd{C-w}, which requests a word search.
 284 @ifinfo
 285 @xref{Word Search}.
 286 @end ifinfo
 287
 288 @node Word Search, Regexp Search, Non-Incremental Search, Search
 289 @section Word Search
 290 @cindex word search
 291
 292   Word search looks for a sequence of words without regard to how the
 293 words are separated.  More precisely, you type a string of many words,
 294 using single spaces to separate them, and the string is found even if
 295 there are multiple spaces, newlines or other punctuation between the words.
 296
 297   Word search is useful in editing documents formatted by text formatters.
 298 If you edit while looking at the printed, formatted version, you can't tell
 299 where the line breaks are in the source file.  Word search allows you
 300 to search without having to know where the line breaks are.
 301
 302 @table @kbd
 303 @item C-s @key{RET} C-w @var{words} @key{RET}
 304 Search for @var{words}, ignoring differences in punctuation.
 305 @item C-r @key{RET} C-w @var{words} @key{RET}
 306 Search backward for @var{words}, ignoring differences in punctuation.
 307 @end table
 308
 309   Word search is a special case of non-incremental search.  It is invoked
 310 with @kbd{C-s @key{RET} C-w} followed by the search string, which
 311 must always be terminated with another @key{RET}.  Being non-incremental, this
 312 search does not start until the argument is terminated.  It works by
 313 constructing a regular expression and searching for that.  @xref{Regexp
 314 Search}.
 315
 316  You can do a backward word search with @kbd{C-r @key{RET} C-w}.
 317
 318 @findex word-search-forward
 319 @findex word-search-backward
 320   Forward and backward word searches are implemented by the commands
 321 @code{word-search-forward} and @code{word-search-backward}.  You can
 322 bind these commands to keys.  The reason that incremental
 323 search is programmed to invoke them as well is that @kbd{C-s @key{RET} C-w}
 324 is the traditional Emacs sequence of keys for word search.
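
  A small sketch of the effect, using invented sample text and a
hypothetical function name: the two words in the buffer are separated
by a comma, a newline and several spaces, yet the search string with a
single space between the words is still found.

@example
;; Hypothetical demonstration of word search from Lisp.
(defun my-word-search-demo ()
  (with-temp-buffer
    (insert "hello,\n   world")
    (goto-char (point-min))
    ;; Returns non-nil (the position after the match) on success,
    ;; or nil on failure because the third argument is t.
    (word-search-forward "hello world" nil t)))
@end example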
 325
 326 @node Regexp Search, Regexps, Word Search, Search
 327 @section Regular Expression Search
 328 @cindex regular expression
 329 @cindex regexp
 330
 331   A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
 332 denotes a (possibly infinite) set of strings.  Searching for matches
 333 for a regexp is a powerful operation that editors on Unix systems have
 334 traditionally offered.
 335
 336  To gain a thorough understanding of regular expressions and how to use
 337 them to best advantage, we recommend that you study @cite{Mastering
 338 Regular Expressions, by Jeffrey E.F. Friedl, O'Reilly and Associates,
 339 1997}.  (It's known as the ``Hip Owls'' book, because of the picture on its
 340 cover.)  You might also read the manuals to @ref{(gawk)Top},
 341 @ref{(ed)Top}, @cite{sed}, @cite{grep}, @ref{(perl)Top},
 342 @ref{(regex)Top}, @ref{(rx)Top}, @cite{pcre}, and @ref{(flex)Top}, which
 343 also make good use of regular expressions.
 344
 345  The SXEmacs regular expression syntax most closely resembles that of
 346 @cite{ed} or @cite{grep}, the GNU versions of which both use the GNU
 347 @cite{regex} library.  SXEmacs' version of @cite{regex} has recently been
 348 extended with some Perl-like capabilities, described in the next
 349 section.
 350
 351  In SXEmacs, you can search for the next match for a regexp either
 352 incrementally or not.
 353
 354 @kindex M-C-s
 355 @kindex M-C-r
 356 @findex isearch-forward-regexp
 357 @findex isearch-backward-regexp
 358   Incremental search for a regexp is done by typing @kbd{M-C-s}
 359 (@code{isearch-forward-regexp}).  This command reads a search string
 360 incrementally just like @kbd{C-s}, but it treats the search string as a
 361 regexp rather than looking for an exact match against the text in the
 362 buffer.  Each time you add text to the search string, you make the regexp
 363 longer, and the new regexp is searched for.  A reverse regexp search command
 364 @code{isearch-backward-regexp} also exists, bound to @kbd{M-C-r}.
 365
 366   All of the control characters that do special things within an ordinary
 367 incremental search have the same functionality in incremental regexp search.
 368 Typing @kbd{C-s} or @kbd{C-r} immediately after starting a search
 369 retrieves the last incremental search regexp used:
 370 incremental regexp and non-regexp searches have independent defaults.
 371
 372 @findex re-search-forward
 373 @findex re-search-backward
 374   Non-incremental search for a regexp is done by the functions
 375 @code{re-search-forward} and @code{re-search-backward}.  You can invoke
 376 them with @kbd{M-x} or bind them to keys.  You can also call
 377 @code{re-search-forward} by way of incremental regexp search with
 378 @kbd{M-C-s @key{RET}}; similarly for @code{re-search-backward} with
 379 @kbd{M-C-r @key{RET}}.
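
  From Lisp, a non-incremental regexp search might be wrapped as
follows.  This is a sketch under stated assumptions: the helper name
@code{my-next-regexp-match} is hypothetical, and returning the matched
text (rather than a buffer position) is merely one convenient choice.

@example
;; Hypothetical helper: return the text of the next match for REGEXP
;; after point, or nil if there is no match.
(defun my-next-regexp-match (regexp)
  (when (re-search-forward regexp nil t)
    (match-string 0)))
@end example

  For instance, in a buffer containing @samp{x fooo y} with point at
the beginning, @code{(my-next-regexp-match "fo+")} would return
@code{"fooo"}.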
 380
 381 @node Regexps, Search Case, Regexp Search, Search
 382 @section Syntax of Regular Expressions
 383
 384   Regular expressions have a syntax in which a few characters are
 385 special constructs and the rest are @dfn{ordinary}.  An ordinary
 386 character is a simple regular expression that matches that character and
 387 nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
 388 @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
 389 special characters will be defined in the future.  Any other character
 390 appearing in a regular expression is ordinary, unless a @samp{\}
 391 precedes it.
 392
 393 For example, @samp{f} is not a special character, so it is ordinary, and
 394 therefore @samp{f} is a regular expression that matches the string
 395 @samp{f} and no other string.  (It does @emph{not} match the string
 396 @samp{ff}.)  Likewise, @samp{o} is a regular expression that matches
 397 only @samp{o}.@refill
 398
 399 Any two regular expressions @var{a} and @var{b} can be concatenated.  The
 400 result is a regular expression that matches a string if @var{a} matches
 401 some amount of the beginning of that string and @var{b} matches the rest of
 402 the string.@refill
 403
 404 As a simple example, we can concatenate the regular expressions @samp{f}
 405 and @samp{o} to get the regular expression @samp{fo}, which matches only
 406 the string @samp{fo}.  Still trivial.  To do something more powerful, you
 407 need to use one of the special characters.  Here is a list of them:
 408
 409 @need 1200
 410 @table @kbd
 411 @item .@: @r{(Period)}
 412 @cindex @samp{.} in regexp
 413 is a special character that matches any single character except a newline.
 414 Using concatenation, we can make regular expressions like @samp{a.b}, which
 415 matches any three-character string that begins with @samp{a} and ends with
 416 @samp{b}.@refill
 417
 418 @item *
 419 @cindex @samp{*} in regexp
 420 is not a construct by itself; it is a quantifying suffix operator that
 421 means to repeat the preceding regular expression as many times as
 422 possible.  In @samp{fo*}, the @samp{*} applies to the @samp{o}, so
 423 @samp{fo*} matches one @samp{f} followed by any number of @samp{o}s.
 424 The case of zero @samp{o}s is allowed: @samp{fo*} does match
 425 @samp{f}.@refill
 426
 427 @samp{*} always applies to the @emph{smallest} possible preceding
 428 expression.  Thus, @samp{fo*} has a repeating @samp{o}, not a
 429 repeating @samp{fo}.@refill
 430
 431 The matcher processes a @samp{*} construct by matching, immediately, as
 432 many repetitions as can be found; it is ``greedy''.  Then it continues
 433 with the rest of the pattern.  If that fails, backtracking occurs,
 434 discarding some of the matches of the @samp{*}-modified construct in
 435 case that makes it possible to match the rest of the pattern.  For
 436 example, in matching @samp{ca*ar} against the string @samp{caaar}, the
 437 @samp{a*} first tries to match all three @samp{a}s; but the rest of the
 438 pattern is @samp{ar} and there is only @samp{r} left to match, so this
 439 try fails.  The next alternative is for @samp{a*} to match only two
 440 @samp{a}s.  With this choice, the rest of the regexp matches
 441 successfully.@refill
 442
 443 Nested repetition operators can be extremely slow if they specify
 444 backtracking loops.  For example, it could take hours for the regular
 445 expression @samp{\(x+y*\)*a} to match the sequence
 446 @samp{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxz}.  The slowness is because
 447 Emacs must try each imaginable way of grouping the 35 @samp{x}'s before
 448 concluding that none of them can work.  To make sure your regular
 449 expressions run fast, check nested repetitions carefully.
 450
 451 @item +
 452 @cindex @samp{+} in regexp
 453 is a quantifying suffix operator similar to @samp{*} except that the
 454 preceding expression must match at least once.  It is also ``greedy''.
 455 So, for example, @samp{ca+r} matches the strings @samp{car} and
 456 @samp{caaaar} but not the string @samp{cr}, whereas @samp{ca*r} matches
 457 all three strings.
 458
 459 @item ?
 460 @cindex @samp{?} in regexp
 461 is a quantifying suffix operator similar to @samp{*}, except that the
 462 preceding expression can match either once or not at all.  For example,
 463 @samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything
 464 else.
 465
 466 @item *?
 467 @cindex @samp{*?} in regexp
 468 works just like @samp{*}, except that rather than matching the longest
 469 match, it matches the shortest match.  @samp{*?} is known as a
 470 @dfn{non-greedy} quantifier, a regexp construct borrowed from Perl.
 471 @c Did perl get this from somewhere?  What's the real history of *? ?
 472
 473 This construct is very useful for when you want to match the text inside
 474 a pair of delimiters.  For instance, @samp{/\*.*?\*/} will match C
 475 comments in a string.  This could not easily be achieved without the use
 476 of a non-greedy quantifier.
 477
 478 This construct has not been available prior to XEmacs 20.4.  It is not
 479 available in FSF Emacs.
 480
 481 @item +?
 482 @cindex @samp{+?} in regexp
 483 is the non-greedy version of @samp{+}.
 484
 485 @item ??
 486 @cindex @samp{??} in regexp
 487 is the non-greedy version of @samp{?}.
 488
 489 @item \@{n,m\@}
 490 @c Note the spacing after the close brace is deliberate.
 491 @cindex @samp{\@{n,m\@} }in regexp
 492 serves as an interval quantifier, analogous to @samp{*} or @samp{+}, but
 493 specifies that the expression must match at least @var{n} times, but no
 494 more than @var{m} times.  This syntax is supported by most Unix regexp
 495 utilities, and was introduced in XEmacs 20.3.
 496
 497 Unfortunately, the non-greedy version of this quantifier does not exist
 498 currently, although it does in Perl.
 499
 500 @item [ @dots{} ]
 501 @cindex character set (in regexp)
 502 @cindex @samp{[} in regexp
 503 @cindex @samp{]} in regexp
 504 @samp{[} begins a @dfn{character set}, which is terminated by a
 505 @samp{]}.  In the simplest case, the characters between the two brackets
 506 form the set.  Thus, @samp{[ad]} matches either one @samp{a} or one
 507 @samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s
 508 and @samp{d}s (including the empty string), from which it follows that
 509 @samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
 510 @samp{caddaar}, etc.@refill
 511
 512 The usual regular expression special characters are not special inside a
 513 character set.  A completely different set of special characters exists
 514 inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill
 515
 516 @samp{-} is used for ranges of characters.  To write a range, write two
 517 characters with a @samp{-} between them.  Thus, @samp{[a-z]} matches any
 518 lower case letter.  Ranges may be intermixed freely with individual
 519 characters, as in @samp{[a-z$%.]}, which matches any lower case letter
 520 or @samp{$}, @samp{%}, or a period.@refill
 521
 522 To include a @samp{]} in a character set, make it the first character.
 523 For example, @samp{[]a]} matches @samp{]} or @samp{a}.  To include a
 524 @samp{-}, write @samp{-} as the first character in the set, or put it
 525 immediately after a range.  (You can replace one individual character
 526 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the
 527 @samp{-}.)  There is no way to write a set containing just @samp{-} and
 528 @samp{]}.
 529
 530 To include @samp{^} in a set, put it anywhere but at the beginning of
 531 the set.
 532
 533 @item [^ @dots{} ]
 534 @cindex @samp{^} in regexp
 535 @samp{[^} begins a @dfn{complement character set}, which matches any
 536 character except the ones specified.  Thus, @samp{[^a-z0-9A-Z]}
 537 matches all characters @emph{except} letters and digits.@refill
 538
 539 @samp{^} is not special in a character set unless it is the first
 540 character.  The character following the @samp{^} is treated as if it
 541 were first (thus, @samp{-} and @samp{]} are not special there).
 542
 543 Note that a complement character set can match a newline, unless
 544 newline is mentioned as one of the characters not to match.
 545
 546 @item ^
 547 @cindex @samp{^} in regexp
 548 @cindex beginning of line in regexp
 549 is a special character that matches the empty string, but only at the
 550 beginning of a line in the text being matched.  Otherwise it fails to
 551 match anything.  Thus, @samp{^foo} matches a @samp{foo} that occurs at
 552 the beginning of a line.
 553
 554 When matching a string instead of a buffer, @samp{^} matches at the
 555 beginning of the string or after a newline character @samp{\n}.
 556
 557 @item $
 558 @cindex @samp{$} in regexp
 559 is similar to @samp{^} but matches only at the end of a line.  Thus,
 560 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
 561
 562 When matching a string instead of a buffer, @samp{$} matches at the end
 563 of the string or before a newline character @samp{\n}.
 564
 565 @item \
 566 @cindex @samp{\} in regexp
 567 has two functions: it quotes the special characters (including
 568 @samp{\}), and it introduces additional special constructs.
 569
 570 Because @samp{\} quotes special characters, @samp{\$} is a regular
 571 expression that matches only @samp{$}, and @samp{\[} is a regular
 572 expression that matches only @samp{[}, and so on.
 573
 574 @c Removed a paragraph here in lispref about doubling backslashes inside
 575 @c of Lisp strings.
 576
 577 @end table
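
  Several of the constructs in the table above can be tried out from
Lisp.  The following sketch uses a hypothetical helper,
@code{my-matched-text}, and invented sample strings; it illustrates the
greedy and non-greedy quantifiers, @samp{+} and @samp{?}, and
character sets.  It is meant as an aid to experimentation, not as a
definitive test of the matcher.

@example
;; Hypothetical helper: return the part of STRING matched by REGEXP,
;; or nil if REGEXP does not match anywhere in STRING.
(defun my-matched-text (regexp string)
  (when (string-match regexp string)
    (match-string 0 string)))

;; Greedy `*' takes as much as it can; non-greedy `*?' as little.
(my-matched-text "<.*>"  "<a><b>")     ; "<a><b>"
(my-matched-text "<.*?>" "<a><b>")     ; "<a>"
;; `+' needs at least one repetition; `?' allows zero or one.
(my-matched-text "ca+r" "cr")          ; nil
(my-matched-text "ca?r" "cr")          ; "cr"
;; A character set, then a complement character set.
(my-matched-text "c[ad]*r" "caddaar")  ; "caddaar"
(my-matched-text "[^a-z]+" "abc123")   ; "123"
@end example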
 578
 579 @strong{Please note:} For historical compatibility, special characters
 580 are treated as ordinary ones if they are in contexts where their special
 581 meanings make no sense.  For example, @samp{*foo} treats @samp{*} as
 582 ordinary since there is no preceding expression on which the @samp{*}
 583 can act.  It is poor practice to depend on this behavior; quote the
 584 special character anyway, regardless of where it appears.@refill
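
  As a brief illustration with an invented sample string, the
following quotes the leading @samp{*} explicitly rather than relying
on the historical fallback.  Note that when a regexp is written as a
Lisp string, each backslash must be doubled; @code{regexp-quote} can
produce the quoted form for you.

@example
;; Match a literal `*' followed by "foo".  The value is the index
;; at which the match starts.
(string-match "\\*foo" "a *foo b")   ; 2
;; Let Lisp add the quoting.
(regexp-quote "*foo")                ; "\\*foo"
@end example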
 585
 586 For the most part, @samp{\} followed by any character matches only
 587 that character.  However, there are several exceptions: characters
 588 that, when preceded by @samp{\}, are special constructs.  Such
 589 characters are always ordinary when encountered on their own.  Here
 590 is a table of @samp{\} constructs:
 591
 592 @table @kbd
 593 @item \|
 594 @cindex @samp{|} in regexp
 595 @cindex regexp alternative
 596 specifies an alternative.
 597 Two regular expressions @var{a} and @var{b} with @samp{\|} in
 598 between form an expression that matches anything that either @var{a} or
 599 @var{b} matches.@refill
 600
 601 Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
 602 but no other string.@refill
 603
 604 @samp{\|} applies to the largest possible surrounding expressions.  Only a
 605 surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
 606 @samp{\|}.@refill
 607
 608 Full backtracking capability exists to handle multiple uses of @samp{\|}.
 609
 610 @item \( @dots{} \)
 611 @cindex @samp{(} in regexp
 612 @cindex @samp{)} in regexp
 613 @cindex regexp grouping
 614 is a grouping construct that serves three purposes:
 615
 616 @enumerate
 617 @item
 618 To enclose a set of @samp{\|} alternatives for other operations.
 619 Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.
 620
 621 @item
 622 To enclose an expression for a suffix operator such as @samp{*} to act
 623 on.  Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any
 624 (zero or more) number of @samp{na} strings.@refill
 625
 626 @item
 627 To record a matched substring for future reference.
 628 @end enumerate
 629
 630 This last application is not a consequence of the idea of a
 631 parenthetical grouping; it is a separate feature that happens to be
 632 assigned as a second meaning to the same @samp{\( @dots{} \)} construct
 633 because there is no conflict in practice between the two meanings.
 634 Here is an explanation of this feature:
 635
 636 @item \@var{digit}
 637 matches the same text that matched the @var{digit}th occurrence of a
 638 @samp{\( @dots{} \)} construct.
 639
 640 In other words, after the end of a @samp{\( @dots{} \)} construct, the
 641 matcher remembers the beginning and end of the text matched by that
 642 construct.  Then, later on in the regular expression, you can use
 643 @samp{\} followed by @var{digit} to match that same text, whatever it
 644 may have been.
 645
 646 The strings matching the first nine @samp{\( @dots{} \)} constructs
 647 appearing in a regular expression are assigned numbers 1 through 9 in
 648 the order that the open parentheses appear in the regular expression.
 649 So you can use @samp{\1} through @samp{\9} to refer to the text matched
 650 by the corresponding @samp{\( @dots{} \)} constructs.
 651
 652 For example, @samp{\(.*\)\1} matches any newline-free string that is
 653 composed of two identical halves.  The @samp{\(.*\)} matches the first
 654 half, which may be anything, but the @samp{\1} that follows must match
 655 the same exact text.
 656
 657 @item \(?: @dots{} \)
 658 @cindex @samp{\(?:} in regexp
 659 @cindex regexp grouping
 660 is called a @dfn{shy} grouping operator, and it is used just like
 661 @samp{\( @dots{} \)}, except that it does not cause the matched
 662 substring to be recorded for future reference.
 663
 664 This is useful when you need a lot of grouping @samp{\( @dots{} \)}
 665 constructs, but only want to remember one or two -- or if you have
 666 more than nine groupings and need backreferences to the later ones,
 667 since only @samp{\1} through @samp{\9} are available.
 668
 669 Using @samp{\(?: @dots{} \)} rather than @samp{\( @dots{} \)} when you
 670 don't need the captured substrings ought to speed up your programs some,
 671 since it shortens the code path followed by the regular expression
 672 engine, as well as the amount of memory allocation and string copying it
 673 must do.  The actual performance gain to be observed has not been
 674 measured or quantified as of this writing.
 675 @c This is used to good advantage by the font-locking code, and by
 676 @c `regexp-opt.el'.
 677
 678 The shy grouping operator was borrowed from Perl.  It was not available
 679 before XEmacs 20.3, nor was it available in FSF Emacs at that time.
 680
 681 @item \w
 682 @cindex @samp{\w} in regexp
 683 matches any word-constituent character.  The editor syntax table
 684 determines which characters these are.  @xref{Syntax}.
 685
 686 @item \W
 687 @cindex @samp{\W} in regexp
 688 matches any character that is not a word constituent.
 689
 690 @item \s@var{code}
 691 @cindex @samp{\s} in regexp
 692 matches any character whose syntax is @var{code}.  Here @var{code} is a
 693 character that represents a syntax code: thus, @samp{w} for word
 694 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
 695 etc.  @xref{Syntax}, for a list of syntax codes and the characters that
 696 stand for them.
 697
 698 @item \S@var{code}
 699 @cindex @samp{\S} in regexp
 700 matches any character whose syntax is not @var{code}.
 701 @end table
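
  As an illustrative sketch (not part of the original examples), the
grouping constructs above can be tried from Lisp with
@code{string-match} and @code{match-string}.  In Lisp string syntax
every backslash of the regexp is doubled; the sample strings are
invented for the demonstration.

@example
;; \( ... \) records what it matched; \1 must match that same text.
(string-match "\\(.*\\)\\1" "abcabc")            ; returns 0
(match-string 1 "abcabc")                         ; returns "abc"

;; A shy group still groups for `*' but is not numbered, so the
;; capturing group that follows it is group 1.
(string-match "ba\\(?:na\\)*\\(s\\)" "bananas")  ; returns 0
(match-string 1 "bananas")                        ; returns "s"
@end example

@noindent
Evaluate each @code{match-string} form right after the
@code{string-match} it belongs to, since a later search overwrites the
recorded match positions.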
 702
 703   The following regular expression constructs match the empty string---that is,
 704 they don't use up any characters---but whether they match depends on the
 705 context.
 706
 707 @table @kbd
 708 @item \`
 709 @cindex @samp{\`} in regexp
 710 matches the empty string, but only at the beginning
 711 of the buffer or string being matched against.
 712
 713 @item \'
 714 @cindex @samp{\'} in regexp
 715 matches the empty string, but only at the end of
 716 the buffer or string being matched against.
 717
 718 @item \=
 719 @cindex @samp{\=} in regexp
 720 matches the empty string, but only at point.
 721 (This construct is not defined when matching against a string.)
 722
 723 @item \b
 724 @cindex @samp{\b} in regexp
 725 matches the empty string, but only at the beginning or
 726 end of a word.  Thus, @samp{\bfoo\b} matches any occurrence of
 727 @samp{foo} as a separate word.  @samp{\bballs?\b} matches
 728 @samp{ball} or @samp{balls} as a separate word.@refill
 729
 730 @item \B
 731 @cindex @samp{\B} in regexp
 732 matches the empty string, but @emph{not} at the beginning or
 733 end of a word.
 734
 735 @item \<
 736 @cindex @samp{\<} in regexp
 737 matches the empty string, but only at the beginning of a word.
 738
 739 @item \>
 740 @cindex @samp{\>} in regexp
 741 matches the empty string, but only at the end of a word.
 742 @end table
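
  The context-dependent constructs can be compared with a few
@code{string-match} calls.  This is only a sketch: it assumes the
standard syntax table, in which letters are word constituents and
spaces are not, and the test strings are made up.

@example
(string-match "\\bfoo\\b" "a foo b")            ; returns 2
(string-match "\\bfoo\\b" "a food b")           ; returns nil
(string-match "\\bballs?\\b" "two balls left")  ; returns 4
(string-match "\\<ball" "football ball")        ; returns 9
(string-match "\\`foo" "a foo")                 ; returns nil
@end example

@noindent
The return value is the index where the match starts, counting from
zero, or @code{nil} if there is no match.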
 743
 744   Here is a complicated regexp used by Emacs to recognize the end of a
 745 sentence together with any whitespace that follows.  It is given in Lisp
 746 syntax to enable you to distinguish the spaces from the tab characters.  In
 747 Lisp syntax, the string constant begins and ends with a double-quote.
 748 @samp{\"} stands for a double-quote as part of the regexp, @samp{\\} for a
 749 backslash as part of the regexp, @samp{\t} for a tab and @samp{\n} for a
 750 newline.
 751
 752 @example
 753 "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*"
 754 @end example
 755
 756 @noindent
 757 This regexp contains four parts: a character set matching
 758 period, @samp{?} or @samp{!}; a character set matching close-brackets,
 759 quotes or parentheses, repeated any number of times; an alternative in
 760 backslash-parentheses that matches end-of-line, a tab or two spaces; and
 761 a character set matching whitespace characters, repeated any number of
 762 times.
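
  To see the four parts at work, the regexp can be applied to a short
string.  The following is a sketch only; the variable names @code{re}
and @code{text} and the sample sentence are chosen for illustration.

@example
(let ((re "[.?!][]\"')]*\\($\\|\t\\|  \\)[ \t\n]*")
      (text "He said \"Stop!\"  Then he left."))
  (list (string-match re text)    ; where the match starts
        (match-end 0)))           ; where the match ends
;; returns (13 17)
@end example

@noindent
The match starts at the @samp{!}, takes in the closing double-quote,
and ends after the two spaces that separate the sentences.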
 763
 764 @node Search Case, Replace, Regexps, Search
 765 @section Searching and Case
 766
 767 @vindex case-fold-search
 768   All searches in Emacs normally ignore the case of the text they
 769 are searching through; if you specify searching for @samp{FOO},
 770 @samp{Foo} and @samp{foo} are also considered a match.  Regexps, and in
 771 particular character sets, are included: @samp{[aB]} matches @samp{a}
 772 or @samp{A} or @samp{b} or @samp{B}.@refill
 773
 774   If you want a case-sensitive search, set the variable
 775 @code{case-fold-search} to @code{nil}.  Then all letters must match
 776 exactly, including case.  @code{case-fold-search} is a per-buffer
 777 variable; altering it affects only the current buffer, but
 778 there is a default value which you can change as well.  @xref{Locals}.
 779 You can also use @b{Case Sensitive Search} from the @b{Options} menu
 780 on your screen.
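
  In Lisp, the effect of @code{case-fold-search} can be observed by
binding it temporarily with @code{let}.  This is a minimal sketch with
made-up test strings:

@example
(let ((case-fold-search t))
  (string-match "[aB]" "xAy"))    ; returns 1
(let ((case-fold-search nil))
  (string-match "[aB]" "xAy"))    ; returns nil
(let ((case-fold-search nil))
  (string-match "FOO" "a foo"))   ; returns nil
@end example

@noindent
Binding the variable with @code{let} leaves both the buffer's own value
and the default value untouched once the form finishes.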
 781
 782 @node Replace, Other Repeating Search, Search Case, Search
 783 @section Replacement Commands
 784 @cindex replacement
 785 @cindex string substitution
 786 @cindex global substitution
 787
 788   Global search-and-replace operations are not needed as often in Emacs as
 789 they are in other editors, but they are available.  In addition to the
 790 simple @code{replace-string} command which is like that found in most
 791 editors, there is a @code{query-replace} command which asks you, for each
 792 occurrence of a pattern, whether to replace it.
 793
 794   The replace commands all replace one string (or regexp) with one
 795 replacement string.  It is possible to perform several replacements in
 796 parallel using the command @code{expand-region-abbrevs}.  @xref{Expanding
 797 Abbrevs}.
 798
 799 @menu
 800 * Unconditional Replace::  Replacing all matches for a string.
 801 * Regexp Replace::         Replacing all matches for a regexp.
 802 * Replacement and Case::   How replacements preserve case of letters.
 803 * Query Replace::          How to use querying.
 804 @end menu
 805
 806 @node Unconditional Replace, Regexp Replace, Replace, Replace
 807 @subsection Unconditional Replacement
 808 @findex replace-string
 809 @findex replace-regexp
 810
 811 @table @kbd
 812 @item M-x replace-string @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
 813 Replace every occurrence of @var{string} with @var{newstring}.
 814 @item M-x replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
 815 Replace every match for @var{regexp} with @var{newstring}.
 816 @end table
 817
 818   To replace every instance of @samp{foo} after point with @samp{bar},
 819 use the command @kbd{M-x replace-string} with the two arguments
 820 @samp{foo} and @samp{bar}.  Replacement occurs only after point: if you
 821 want to cover the whole buffer you must go to the beginning first.  By
 822 default, all occurrences up to the end of the buffer are replaced.  To
 823 limit replacement to part of the buffer, narrow to that part of the
 824 buffer before doing the replacement (@pxref{Narrowing}).
 825
 826   When @code{replace-string} exits, point is left at the last occurrence
 827 replaced.  The value of point when the @code{replace-string} command was
 828 issued is remembered on the mark ring; @kbd{C-u C-@key{SPC}} moves back
 829 there.
 830
 831   A numeric argument restricts replacement to matches that are surrounded
 832 by word boundaries.
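
  The rule that replacement occurs only after point can be seen in a
small Lisp sketch.  This is not how @code{replace-string} itself is
implemented; it is an equivalent loop built from
@code{search-forward} and @code{replace-match}, run in a temporary
buffer with sample text.

@example
(with-temp-buffer
  (insert "foo one foo two")
  (goto-char 5)                     ; point is now past the first "foo"
  (while (search-forward "foo" nil t)
    (replace-match "bar" t t))      ; insert "bar" literally, case as given
  (buffer-string))
;; returns "foo one bar two"
@end example

@noindent
The first @samp{foo} is left alone because it lies before point when
the loop starts.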
 833
 834 @node Regexp Replace, Replacement and Case, Unconditional Replace, Replace
 835 @subsection Regexp Replacement
 836
 837   @code{replace-string} replaces exact matches for a single string.  The
 838 similar command @code{replace-regexp} replaces any match for a specified
 839 pattern.
 840
 841   In @code{replace-regexp}, the @var{newstring} need not be constant.  It
 842 can refer to all or part of what is matched by the @var{regexp}.  @samp{\&}
 843 in @var{newstring} stands for the entire text being replaced.
 844 @samp{\@var{d}} in @var{newstring}, where @var{d} is a digit, stands for
 845 whatever matched the @var{d}'th parenthesized grouping in @var{regexp}.
 846 For example,@refill
 847
 848 @example
 849 M-x replace-regexp @key{RET} c[ad]+r @key{RET} \&-safe @key{RET}
 850 @end example
 851
 852 @noindent
 853 would replace (for example) @samp{cadr} with @samp{cadr-safe} and @samp{cddr}
 854 with @samp{cddr-safe}.
 855
 856 @example
 857 M-x replace-regexp @key{RET} \(c[ad]+r\)-safe @key{RET} \1 @key{RET}
 858 @end example
 859
 860 @noindent
 861 would perform exactly the opposite replacements.  To include a @samp{\}
 862 in the text to replace with, you must give @samp{\\}.
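
  The same two replacements can be sketched in Lisp with
@code{re-search-forward} and @code{replace-match}, which interprets
@samp{\&} and @samp{\1} in the replacement text in the same way.  The
buffer contents below are sample text for the demonstration.

@example
(with-temp-buffer
  (insert "(cadr x) (cddr y)")
  (goto-char (point-min))
  (while (re-search-forward "c[ad]+r" nil t)
    (replace-match "\\&-safe" t))
  ;; The buffer now holds "(cadr-safe x) (cddr-safe y)".
  (goto-char (point-min))
  (while (re-search-forward "\\(c[ad]+r\\)-safe" nil t)
    (replace-match "\\1" t))
  (buffer-string))
;; returns "(cadr x) (cddr y)"
@end example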
 863
 864 @node Replacement and Case, Query Replace, Regexp Replace, Replace
 865 @subsection Replace Commands and Case
 866
 867 @vindex case-replace
 868 @vindex case-fold-search
 869   If the arguments to a replace command are in lower case, the command
 870 preserves case when it makes a replacement.  Thus, the following command:
 871
 872 @example
 873 M-x replace-string @key{RET} foo @key{RET} bar @key{RET}
 874 @end example
 875
 876 @noindent
 877 replaces a lower-case @samp{foo} with a lower-case @samp{bar}, @samp{FOO}
 878 with @samp{BAR}, and @samp{Foo} with @samp{Bar}.  If upper-case letters are
 879 used in the second argument, they remain upper-case every time that
 880 argument is inserted.  If upper-case letters are used in the first
 881 argument, the second argument is always substituted exactly as given, with
 882 no case conversion.  Likewise, if the variable @code{case-replace} is set
 883 to @code{nil}, replacement is done without case conversion.  If
 884 @code{case-fold-search} is set to @code{nil}, case is significant in
 885 matching occurrences of @samp{foo} to replace; also, case conversion of the
 886 replacement string is not done.
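
  A comparable case conversion is available from Lisp through the
second argument of @code{replace-match}: when it is @code{nil}, the
replacement text is adjusted to the case pattern of the text it
replaces.  The following sketch uses sample buffer text and assumes
that behavior; it does not consult @code{case-replace}.

@example
(with-temp-buffer
  (insert "foo FOO Foo")
  (goto-char (point-min))
  (let ((case-fold-search t))       ; match regardless of case
    (while (search-forward "foo" nil t)
      (replace-match "bar" nil t))) ; nil: convert case to fit the match
  (buffer-string))
;; returns "bar BAR Bar"
@end example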
 887
 888 @node Query Replace,, Replacement and Case, Replace
 889 @subsection Query Replace
 890 @cindex query replace
 891
 892 @table @kbd
 893 @item M-% @var{string} @key{RET} @var{newstring} @key{RET}
 894 @itemx M-x query-replace @key{RET} @var{string} @key{RET} @var{newstring} @key{RET}
 895 Replace some occurrences of @var{string} with @var{newstring}.
 896 @item M-x query-replace-regexp @key{RET} @var{regexp} @key{RET} @var{newstring} @key{RET}
 897 Replace some matches for @var{regexp} with @var{newstring}.
 898 @end table
 899
 900 @kindex M-%
 901 @findex query-replace
 902   If you want to change only some of the occurrences of @samp{foo} to
 903 @samp{bar}, not all of them, use @kbd{M-%} (@code{query-replace}) instead
 904 of @code{replace-string}.  @code{query-replace} finds occurrences of
 905 @samp{foo} one by one, displays each occurrence, and asks you whether to replace it.  A numeric
 906 argument to @code{query-replace} tells it to consider only occurrences
 907 that are bounded by word-delimiter characters.@refill
 908
 909 @findex query-replace-regexp
 910   Aside from querying, @code{query-replace} works just like
 911 @code{replace-string}, and @code{query-replace-regexp} works
 912 just like @code{replace-regexp}.@refill
 913
 914   The things you can type when you are shown an occurrence of @var{string}
 915 or a match for @var{regexp} are:
 916
 917 @kindex SPC (query-replace)
 918 @kindex DEL (query-replace)
 919 @kindex , (query-replace)
 920 @kindex ESC (query-replace)
 921 @kindex . (query-replace)
 922 @kindex ! (query-replace)
 923 @kindex ^ (query-replace)
 924 @kindex C-r (query-replace)
 925 @kindex C-w (query-replace)
 926 @kindex C-l (query-replace)
 927
 928 @c WideCommands
 929 @table @kbd
 930 @item @key{SPC}
 931 to replace the occurrence with @var{newstring}.  This preserves case, just
 932 like @code{replace-string}, provided @code{case-replace} is non-@code{nil},
 933 as it normally is.@refill
 934
 935 @item @key{DEL}
 936 to skip to the next occurrence without replacing this one.
 937
 938 @item , @r{(Comma)}
 939 to replace this occurrence and display the result.  You are then
 940 prompted for another input character.  However, since the replacement has
 941 already been made, @key{DEL} and @key{SPC} are equivalent: both move to the next occurrence.  At this
 942 point, you can type @kbd{C-r} (see below) to alter the replaced text.  To
 943 undo the replacement, you can type @kbd{C-x u}.
 944 This exits the @code{query-replace}.  If you want to do further
 945 replacement you must use @kbd{C-x @key{ESC} @key{ESC}} to restart (@pxref{Repetition}).
 946
 947 @item @key{ESC}
 948 to exit without doing any more replacements.
 949
 950 @item .@: @r{(Period)}
 951 to replace this occurrence and then exit.
 952
 953 @item !
 954 to replace all remaining occurrences without asking again.
 955
 956 @item ^
 957 to go back to the location of the previous occurrence (or what used to
 958 be an occurrence), in case you changed it by mistake.  This works by
 959 popping the mark ring.  Only one @kbd{^} in a row is allowed, because
 960 only one previous replacement location is kept during @code{query-replace}.
 961
 962 @item C-r
 963 to enter a recursive editing level, in case the occurrence needs to be
 964 edited rather than just replaced with @var{newstring}.  When you are
 965 done, exit the recursive editing level with @kbd{C-M-c} and the next
 966 occurrence will be displayed.  @xref{Recursive Edit}.
 967
 968 @item C-w
 969 to delete the occurrence, and then enter a recursive editing level as
 970 in @kbd{C-r}.  Use the recursive edit to insert text to replace the
 971 deleted occurrence of @var{string}.  When done, exit the recursive
 972 editing level with @kbd{C-M-c} and the next occurrence will be
 973 displayed.
 974
 975 @item C-l
 976 to redisplay the screen and then give another answer.
 977
 978 @item C-h
 979 to display a message summarizing these options, then give another
 980 answer.
 981 @end table
 982
 983   If you type any other character, Emacs exits the @code{query-replace}, and
 984 executes the character as a command.  To restart the @code{query-replace},
 985 use @kbd{C-x @key{ESC} @key{ESC}}, which repeats the @code{query-replace} because it
 986 used the minibuffer to read its arguments.  @xref{Repetition, C-x ESC ESC}.
 987
 988 @node Other Repeating Search,, Replace, Search
 989 @section Other Search-and-Loop Commands
 990
 991   Here are some other commands that find matches for a regular expression.
 992 They all operate from point to the end of the buffer.
 993
 994 @findex list-matching-lines
 995 @findex occur
 996 @findex count-matches
 997 @findex delete-non-matching-lines
 998 @findex delete-matching-lines
 999 @c grosscommands
1000 @table @kbd
1001 @item M-x occur
1002 Print each line that follows point and contains a match for the
1003 specified regexp.  A numeric argument specifies the number of context
1004 lines to print before and after each matching line; the default is
1005 none.
1006
1007 @kindex C-c C-c (Occur mode)
1008 The buffer @samp{*Occur*} containing the output serves as a menu for
1009 finding occurrences in their original context.  Find an occurrence
1010 as listed in @samp{*Occur*}, position point there, and type @kbd{C-c
1011 C-c}; this switches to the buffer that was searched and moves point to
1012 the original of the same occurrence.
1013
1014 @item M-x list-matching-lines
1015 Synonym for @kbd{M-x occur}.
1016
1017 @item M-x count-matches
1018 Print the number of matches following point for the specified regexp.
1019
1020 @item M-x delete-non-matching-lines
1021 Delete each line that follows point and does not contain a match for
1022 the specified regexp.
1023
1024 @item M-x delete-matching-lines
1025 Delete each line that follows point and contains a match for the
1026 specified regexp.
1027 @end table
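
  The idea of operating on every match from point to the end of the
buffer can be sketched as a simple Lisp loop.  This is only an
illustration with sample text and a local variable named @code{count};
it is not the actual definition of @code{count-matches}.

@example
(with-temp-buffer
  (insert "one\ntwo\nthree\n")
  (goto-char (point-min))
  (let ((count 0))
    ;; Each successful search leaves point after the match,
    ;; so the loop advances through the buffer.
    (while (re-search-forward "^t" nil t)
      (setq count (1+ count)))
    count))
;; returns 2
@end example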