Available Operators for Querying Data

Knowing the available operators is important when using the Code Co-occurrence Tools or the Query Tool, and when creating Smart Codes and Smart Groups. The following types of operators are available:

  • Boolean operators allow combinations of keywords according to set operations. They are the most common operators used in information retrieval systems.

  • Semantic operators exploit the network structures that were built from the codes.

  • Proximity operators are used to analyze the spatial relations (e.g., distance, embeddedness, overlapping, co-occurrence) between coded data segments.

Boolean Operators

OR, AND, ONE OF and NOT

  • OR, AND, and ONE OF are binary operators which need exactly two operands as input.
  • NOT needs only one operand.
  • Codes, code groups, or smart codes can be used as operands in a query.

AND: ALL of the following are true

The AND operator finds quotations that match ALL the conditions specified in the query. This means you have applied two or more codes to the same quotation.

Example: All quotations coded with both Earth AND Fire.

The AND operator is very selective and often produces an empty result set as it requires that the selected codes have all been applied to exactly the same data segment. It produces best results when combined with less restrictive operators or when the overall number of the available text segments is large.

OR: ANY of the following are true

The OR operator does not really match the everyday usage of OR. Its meaning is At least one of..., including the case where ALL conditions match. The OR operator retrieves all quotations that are coded with any of the codes used in the expression.

Example: All quotations coded with Earth OR Fire will produce a result list that contains all quotations coded with 'Earth' and all quotations coded with 'Fire', or coded with both codes.

ONE OF: Exactly one of the following is true

The ONE OF operator requires that EXACTLY one of the conditions is met. It corresponds to the everyday either-or.

Example: All quotations coded with EITHER 'code A' OR 'code B' (but not with both).

NOT: None of the following are true

The NOT operator tests for the absence of a condition. Technically, it subtracts the findings of the non-negated term from all data segments available. Given 1000 quotations in the project and 20 quotations coded with 'code A', the query NOT 'code A' retrieves 980 quotations - those which are not coded with 'code A'.

The operator can also take an arbitrary expression as its argument, as in NOT ('code A' OR 'code B'), which is the equivalent of neither 'code A' nor 'code B'.
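The four Boolean operators map directly onto set operations. The following is a minimal sketch in Python, not ATLAS.ti code: the quotation IDs, the `coded` dictionary, and the helper names `q_and`, `q_or`, `q_one_of`, and `q_not` are hypothetical and chosen only to mirror the 1000/20 example above.

```python
# Hypothetical project: 1000 quotations, identified by the numbers 1..1000.
all_quotations = set(range(1, 1001))

# Hypothetical coding: each code maps to the set of quotation IDs it is applied to.
coded = {
    "code A": set(range(1, 21)),    # 20 quotations (1..20)
    "code B": set(range(11, 41)),   # 30 quotations (11..40), 10 shared with code A
}

def q_and(a, b):
    """AND: quotations coded with both operands (set intersection)."""
    return a & b

def q_or(a, b):
    """OR: quotations coded with at least one operand (set union)."""
    return a | b

def q_one_of(a, b):
    """ONE OF: quotations coded with exactly one operand (symmetric difference)."""
    return a ^ b

def q_not(a, universe):
    """NOT: all quotations in the project minus those matching the operand."""
    return universe - a

a, b = coded["code A"], coded["code B"]
print(len(q_and(a, b)))                        # 10
print(len(q_or(a, b)))                         # 40
print(len(q_one_of(a, b)))                     # 30
print(len(q_not(a, all_quotations)))           # 980
print(len(q_not(q_or(a, b), all_quotations)))  # 960, neither code A nor code B
```

Note that NOT needs the full set of quotations in the project as its reference, which is why it is the only helper that takes a `universe` argument.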

Proximity Operators

Proximity describes the spatial relation between quotations. Quotations can be embedded in one another, one may follow another, etc. The operators in this section exploit these relationships. They require two operands as their arguments.

Proximity operators differ from the other operators in one important aspect: Proximity operators are non-commutative. This property makes their usage a little more difficult to learn.

Non-commutativity requires a certain input sequence for the operands. While A OR B is equal to B OR A, this does not hold for any of the proximity operators: A FOLLOWS B is not equal to B FOLLOWS A. When building a query, always enter the expressions in the order in which they appear in their natural language manifestation.

Another important characteristic for these operators when using the query tool is the specification of the operand for which you want the quotations retrieved. A WITHIN B specifies the constraint, but you must also specify if you want the quotations for the As or the Bs. This is done implicitly by the sequence.

If you enter the query A WITHIN B, all quotations coded with A are retrieved.

If you enter B WITHIN A, all quotations coded with B are retrieved.

For example, if you want to retrieve all segments in a focus group transcript where '@speaker: Tom' talks about 'holiday experience' you enter:

'holiday experience' WITHIN '@speaker: Tom'

If you enter: '@speaker: Tom' WITHIN 'holiday experience'

You will probably not find anything, as the code '@speaker: Tom' codes the whole speaker unit, and anything Tom said is embedded within that unit.

If you enter: '@speaker: Tom' ENCLOSES 'holiday experience'

ATLAS.ti retrieves all speaker units of Tom where he talks about holiday experiences, thus the larger segments that contain something about the holiday experience.
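The effect of operand order in the three queries above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ATLAS.ti's implementation: quotations are modelled as character spans in a single document, the `Quotation` type and the helper functions are hypothetical, and a quotation is assumed not to be WITHIN a quotation with identical boundaries.

```python
from typing import NamedTuple

class Quotation(NamedTuple):
    start: int          # first character position of the segment
    end: int            # last character position of the segment
    codes: frozenset    # codes applied to this segment

def coded_with(quotations, code):
    """All quotations that carry the given code."""
    return [q for q in quotations if code in q.codes]

def contains(outer, inner):
    """True if inner lies inside outer and the two spans are not identical."""
    same_span = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_span

def within(quotations, a, b):
    """A WITHIN B: quotations coded with A that lie inside a quotation coded with B."""
    return [qa for qa in coded_with(quotations, a)
            if any(contains(qb, qa) for qb in coded_with(quotations, b))]

def encloses(quotations, a, b):
    """A ENCLOSES B: quotations coded with A that contain a quotation coded with B."""
    return [qa for qa in coded_with(quotations, a)
            if any(contains(qa, qb) for qb in coded_with(quotations, b))]

TOM, ANNA, HOLIDAY = "@speaker: Tom", "@speaker: Anna", "holiday experience"

# Hypothetical focus group transcript: three speaker units, two coded passages.
transcript = [
    Quotation(0, 100, frozenset({TOM})),
    Quotation(101, 200, frozenset({ANNA})),
    Quotation(201, 300, frozenset({TOM})),
    Quotation(20, 40, frozenset({HOLIDAY})),     # inside Tom's first unit
    Quotation(120, 150, frozenset({HOLIDAY})),   # inside Anna's unit
]

def spans(result):
    return [(q.start, q.end) for q in result]

print(spans(within(transcript, HOLIDAY, TOM)))    # [(20, 40)]
print(spans(within(transcript, TOM, HOLIDAY)))    # []
print(spans(encloses(transcript, TOM, HOLIDAY)))  # [(0, 100)]
```

In each call the first code names the quotations that are returned and the second code only acts as the constraint, which is why swapping the operands changes the result.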

In the Query Tool, you need to enter the code whose content you are most interested in on the right-hand side of the query.

  • Quotations enclosing quotations: A ENCLOSES B retrieves all quotations coded with A that contain quotations coded with B.

  • Quotations being enclosed by quotations: A being enclosed by B (WITHIN) retrieves all quotations coded with A that are contained within data segments coded with B.

  • Overlaps (quotations overlapping at the start): A OVERLAPS B retrieves all quotations coded with A that overlap quotations coded with B.

  • Overlapped by (quotations overlapping at the end): A OVERLAPPED BY B retrieves all quotations coded with A that are overlapped by quotations coded with B.

  • Co-occurs: Often when interested in the relation between two or more codes, you don't really care whether something overlaps or is overlapped by, or is within or encloses. If this is the case, you simply use the Co-occurs operator.

Co-occurs is essentially a shortcut for a combination of the four proximity operators discussed above, plus the operator AND. AND is a Boolean operator, but also finds co-occurrence, namely all coded segments that overlap 100%.

The more general co-occurrence operator is quite useful when working with transcripts. In interviews, people often jump back and forth in time or between contexts, and therefore it often does not make much sense to use the very specific embedding or overlap operators. With other types of data they are however quite useful. Think of video data where it might be important whether action A was already going on before action B started or vice versa. Or if you have coded longer sections in your data like biographical time periods in a person's life and then did some more fine-grained coding within these time periods. Then the WITHIN operator comes in very handy. The same applies when working with pre-coded survey or focus group data. Using WITHIN, you can for instance find all data segments coded with 'topic x' WITHIN 'question 5'; or all data segments coded with 'code A' WITHIN 'speaker unit: Tom'.
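The relationship between the specific proximity operators and Co-occurs can be made concrete with a small interval sketch in Python. The code is illustrative only and rests on assumptions that the text above does not spell out: segments are (start, end) pairs in seconds, A OVERLAPS B is read as "A starts before B and ends inside B", and A OVERLAPPED BY B as the mirror case "B starts before A and A ends after B". All function names and data are hypothetical.

```python
def encloses(a, b):
    """a contains b, and the two segments are not identical."""
    return a[0] <= b[0] and b[1] <= a[1] and a != b

def within(a, b):
    """a lies inside b."""
    return encloses(b, a)

def overlaps(a, b):
    """Assumed reading: a starts before b and ends inside b."""
    return a[0] < b[0] <= a[1] < b[1]

def overlapped_by(a, b):
    """Assumed reading: b starts before a and a ends after b."""
    return overlaps(b, a)

def same_segment(a, b):
    """The AND case: both codes on exactly the same segment."""
    return a == b

def co_occurs(a, b):
    """Shortcut: any of the four proximity relations, or identical segments."""
    return any(rel(a, b) for rel in
               (encloses, within, overlaps, overlapped_by, same_segment))

def retrieve(relation, segments_a, segments_b):
    """Segments coded with A that stand in `relation` to some segment coded with B."""
    return [a for a in segments_a if any(relation(a, b) for b in segments_b)]

# Hypothetical video coding, times in seconds.
action_a = [(0, 10), (20, 30), (38, 52), (60, 70), (75, 85), (90, 95)]
action_b = [(5, 15), (18, 32), (40, 50), (60, 70), (72, 80)]

print(retrieve(overlaps, action_a, action_b))       # [(0, 10)]
print(retrieve(within, action_a, action_b))         # [(20, 30)]
print(retrieve(encloses, action_a, action_b))       # [(38, 52)]
print(retrieve(overlapped_by, action_a, action_b))  # [(75, 85)]
print(retrieve(co_occurs, action_a, action_b))      # the first five segments of action_a
print(retrieve(overlaps, action_b, action_a))       # [(72, 80)], operand order matters
```

Under these assumptions the five relations are mutually exclusive and together cover every case in which two segments share at least one position, which is why Co-occurs is the convenient choice when the exact kind of overlap does not matter.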

Semantic Operators

The operators in this section exploit code-code links that have been previously established by linking codes via drag-and-drop in the Code Manager, or in a network. See Linking Nodes. While Boolean-based queries are extensional and simply enumerate the elements of combined sets (e.g., Love OR Kindness), semantic operators are intensional, as they already capture some meaning expressed in appropriately linked concepts.

Only transitive code-code relations are processed when using semantic operators. See About Relations.

  • The UP operator looks at all directly linked codes at higher levels and their quotations - that is, all parents of a code.

  • The DOWN operator traverses the network from higher to lower concepts, collecting all quotations from any of the sub codes.

  • The SIBLING operator finds all quotations that are connected to the selected code or any other descendants of the same parent code.
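The three semantic operators can be read as traversals of the code network. The Python sketch below is one possible interpretation, not ATLAS.ti's implementation: the `parents` links, the quotation IDs, and the function names are hypothetical, and it assumes that each operator includes the quotations of the selected code itself, that UP stops at the direct parents, and that SIBLING covers the other direct children of the same parent.

```python
# Hypothetical transitive code-code links: child code -> set of parent codes.
parents = {
    "Love": {"Positive emotion"},
    "Kindness": {"Positive emotion"},
    "Fear": {"Negative emotion"},
    "Positive emotion": {"Emotion"},
    "Negative emotion": {"Emotion"},
}

# Hypothetical coding: code -> quotation IDs.
quotations = {
    "Emotion": {1},
    "Positive emotion": {2},
    "Love": {3, 4},
    "Kindness": {5},
    "Negative emotion": {6},
    "Fear": {7},
}

def children_of(code):
    """Codes that are directly linked below the given code."""
    return {child for child, ps in parents.items() if code in ps}

def up(code):
    """Assumed reading of UP: the code plus its direct parents."""
    return {code} | parents.get(code, set())

def down(code):
    """Assumed reading of DOWN: the code plus all of its sub codes, at any depth."""
    found, stack = {code}, [code]
    while stack:
        for child in children_of(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

def sibling(code):
    """Assumed reading of SIBLING: the code plus the other children of its parents."""
    result = {code}
    for parent in parents.get(code, set()):
        result |= children_of(parent)
    return result

def quotations_for(codes):
    """Union of the quotations of all given codes."""
    return set().union(*(quotations.get(c, set()) for c in codes))

print(sorted(quotations_for(up("Love"))))                # [2, 3, 4]
print(sorted(quotations_for(down("Positive emotion"))))  # [2, 3, 4, 5]
print(sorted(quotations_for(sibling("Love"))))           # [3, 4, 5]
```

Unlike Love OR Kindness, none of these queries names 'Kindness' explicitly; it is found only because of how the codes are linked.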