Expert Search with Regular Expressions (regex)
GREP is a well-known search tool in the UNIX world. The original GREP tool printed each line containing the search pattern, hence the acronym GREP (Globally search for a Regular Expression and Print matching lines).
In ATLAS.ti, the results of a GREP search are not printed line-by-line; rather, the text matching the search pattern is highlighted on the screen, or you can automatically code the results including some surrounding context.
The core of a GREP search is the inclusion of special characters in the search string that control the matching process. GREP finds instances in your data that match certain patterns.
You can test and debug any regular expression you formulate on this website: https://regex101.com/
Carrying out a text search with Regex
To open the tool, select the Search & Code tab and from there Expert Search.
Select documents or document groups that you want to search and click Continue.
Select the base unit for the search and the coding:
- Paragraphs
- Sentences
- Words
- Exact matches
Enter a search term. You can test your search expression in the text that you see in the lower half of the screen.
To run the search, click on Show Results.
The results page opens a Quotation Reader that shows where quotations will be created if you (auto)code the data. Codes that already exist at those locations are also shown.
By clicking on the eye icon, you can change between small, medium and large previews.
You can autocode all results with one code by selecting all data segments, e.g. via Ctrl+A. Then select Apply Codes, enter a code name and click the plus icon. Depending on the base unit you selected, either the exact match, the word, the sentence or the paragraph is coded.
Another option is to review each hit and code it individually by clicking on the coding icon. This opens the regular Coding Dialogue.
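To illustrate how the base unit changes what gets coded, here is a minimal Python sketch. It is not how ATLAS.ti works internally: the function name coded_segments and the simple rules for splitting words, sentences and paragraphs are assumptions made only for this illustration.

```python
import re

# Hypothetical, simplified definitions of the base units (assumptions).
UNIT_PATTERNS = {
    "word": r"\w+",
    "sentence": r"[^.!?\n]+[.!?]?",  # naive: a sentence ends at . ! or ?
    "paragraph": r"[^\n]+",          # naive: one paragraph per line
}

def coded_segments(text, pattern, unit):
    """Return the text segment that would be coded for each match."""
    segments = []
    for match in re.finditer(pattern, text):
        if unit == "exact":
            segments.append(match.group())
            continue
        # Find the surrounding unit that contains the start of the match.
        for candidate in re.finditer(UNIT_PATTERNS[unit], text):
            if candidate.start() <= match.start() < candidate.end():
                segments.append(candidate.group().strip())
                break
    return segments

sample = "The dog barked.\nNo animals here. A grey cat slept!"
for unit in ("exact", "word", "sentence", "paragraph"):
    print(unit, coded_segments(sample, r"anim", unit))
```

The same search term yields a progressively larger segment as the base unit grows from the exact match to the whole paragraph.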
GREP Examples
GREP Expression | Description |
---|---|
^ | Matches an empty string at the beginning of a line. |
$ | Matches an empty string at the end of a line. |
. | Matches any character except a new line. |
+ | Matches at least one occurrence of the preceding expression or character. |
* | Matches the preceding element zero or more times. For example, ab*c matches "ac," "abc," "abbbc," etc. |
? | Matches the preceding element zero or one time. For example, ba? matches "b" or "ba." |
[] | Matches a range or set of characters: [a-z] or [0-9] or [aeiou]. For example, [0-9] finds all numeric characters, while [^0-9] finds all non-numeric characters. |
\b | Matches an empty string at a word boundary. |
\B | Matches an empty string that is not at a word boundary. |
\< | Matches an empty string at the beginning of a word. |
\> | Matches an empty string at the end of a word. |
The | character is the OR operator. Enclose ORed expressions in parentheses if the OR should be restricted to certain sequences of characters or expressions. See the examples below.
The backslash character (\) disables the special GREP functionality of the character that follows it. Followed by certain letters, it forms the shorthand expressions below:
GREP Expression | Description |
---|---|
\d | Matches any digit (equivalent to [0-9] ) |
\D | Matches anything but a digit |
\s | Matches a white-space character |
\S | Matches anything but a white-space character |
\w | Matches any word constituent character |
\W | Matches any character but a word constituent |
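If you want to try some of the expressions from the tables above outside of ATLAS.ti, the following sketch uses Python's re module. Treat it as an approximation rather than a description of the ATLAS.ti engine: regex dialects differ, and in Python the ^ and $ anchors only work line by line when the re.MULTILINE flag is set. The sample text is invented for the example.

```python
import re

text = "Cats chase mice.\nA cat naps at 10:30.\nConcatenate 2 lists."

# ^ and $ anchor to the start and end of a line (needs re.MULTILINE in Python).
first_words = re.findall(r"^\w+", text, flags=re.MULTILINE)
last_tokens = re.findall(r"\w+\.$", text, flags=re.MULTILINE)

# \b limits the match to whole words; without it, "Concatenate" also matches.
whole_word = re.findall(r"\bcat\b", text)
anywhere = re.findall(r"cat", text)

# \d+ finds every run of one or more digits.
numbers = re.findall(r"\d+", text)

# A backslash turns the special character . into a literal period.
periods = re.findall(r"\.", text)

# * allows zero or more occurrences of the preceding element.
star_hits = [s for s in ("ac", "abc", "abbbc", "adc") if re.fullmatch(r"ab*c", s)]

print(first_words, last_tokens, whole_word, anywhere, numbers, periods, star_hits)
```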
Character classes
Character Class | Description |
---|---|
[:alnum:] | Any alphanumeric, i.e., a word constituent, character |
[:alpha:] | Any alphabetic character |
[:cntrl:] | Any control character. In this version, it means any character whose ASCII code is < 32. |
[:digit:] | Any decimal digit |
[:graph:] | Any graphical character. In this version, this means any character with the code >= 32. |
[:lower:] | Any lowercase character |
[:punct:] | Any punctuation character |
[:space:] | Any white-space character |
[:upper:] | Any uppercase character |
[:xdigit:] | Any hexadecimal character |
Note that these elements are components of character classes, i.e. they have to be enclosed in an extra set of square brackets to form a valid regular expression. A non-empty string of digits of arbitrary length would be represented as [[:digit:]]+.
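Not every regex dialect understands these bracketed class names. Python's re module, for instance, does not support them, so testing such an expression there requires a translation first. The sketch below shows one way to do that; the helper translate_posix and its ASCII-only mapping table are assumptions for illustration and cover only some of the classes listed above.

```python
import re

# Hypothetical ASCII-only equivalents for a few of the character classes.
POSIX_TO_PYTHON = {
    "[:alnum:]": "a-zA-Z0-9",
    "[:alpha:]": "a-zA-Z",
    "[:digit:]": "0-9",
    "[:lower:]": "a-z",
    "[:upper:]": "A-Z",
    "[:space:]": r"\s",
    "[:xdigit:]": "0-9A-Fa-f",
}

def translate_posix(pattern):
    """Replace each class name with a range; the outer brackets stay in place."""
    for name, replacement in POSIX_TO_PYTHON.items():
        pattern = pattern.replace(name, replacement)
    return pattern

sample = "Room 101 in Berlin, floor 3"
digits = re.findall(translate_posix("[[:digit:]]+"), sample)
names = re.findall(translate_posix("[[:upper:]][[:lower:]]+"), sample)
print(translate_posix("[[:digit:]]+"), digits, names)
```

Because only the inner name is replaced, the extra set of square brackets described above is what turns the result into a valid set such as [0-9].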
Examples of GREP Searches
The following search examples each show a GREP expression together with the text it matches.
- The expression man|woman matches "man" and "woman." You could also use (|wo)man to the same effect.
- H(a|e)llo matches "Hello" and "Hallo."
- H(a|e)+llo matches "Haaaaaallo" as well as "Heeeeeaaaaeaeaeaeaello."
- And how about the (angry|lazy|stupid) (man|woman) (walk|run|play|fight)ing with the gr(a|e)y dog? Get the idea?
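The examples above can be checked with a few lines of Python. This is only a sketch using Python's re module as a stand-in; the sample sentences are made up, and ATLAS.ti's own engine may differ in details.

```python
import re

# man|woman and (|wo)man accept the same two words.
alternation = re.findall(r"man|woman", "a man and a woman")
grouped = [w for w in ("man", "woman", "human") if re.fullmatch(r"(|wo)man", w)]

# H(a|e)llo allows exactly one vowel, H(a|e)+llo allows one or more.
single = [w for w in ("Hello", "Hallo", "Haallo") if re.fullmatch(r"H(a|e)llo", w)]
repeated = [
    w for w in ("Haaaaaallo", "Heeeeeaaaaeaeaeaeaello", "Hllo")
    if re.fullmatch(r"H(a|e)+llo", w)
]

# The long example combines several ORed groups in one expression.
long_pattern = (
    r"the (angry|lazy|stupid) (man|woman) "
    r"(walk|run|play|fight)ing with the gr(a|e)y dog"
)
hit = re.fullmatch(long_pattern, "the angry man fighting with the gray dog")
# "run" + "ing" gives "runing", so the correctly spelled "running" is not matched.
miss = re.fullmatch(long_pattern, "the lazy woman running with the grey dog")

print(alternation, grouped, single, repeated, bool(hit), bool(miss))
```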