Office 365 Security & Compliance Center eDiscovery – Part 3: Phrases and Grouping AND OR’ing (Oh, my!)


This is the third in a series of posts focusing on helping you get the most out of Office 365 Content Search and eDiscovery.

Intro

Over the posts in this series, I’m going to go over the following concepts:

In this, the third installment of “eDiscovery in the time of COVID-19,” we’re going to look at using phrasing and grouping effectively, as well as some of the nuances of operators.

In the words of Baz Luhrmann (Gen Y’s Bob Dylan):

Accept certain inalienable truths
Prices will rise, politicians will philander, you too, will get old
And when you do, you’ll fantasize that when you were young
Prices were reasonable, politicians were noble
And children respected their elders

When it comes to eDiscovery and searching for data, our inalienable truths include:

  • You probably AND’d when you should have OR’d
  • You probably OR’d when you should have AND’d
  • Grouping helps both you and the system understand what you’re trying to do
  • Boolean operators <> Property operators
  • When in doubt, quote it out

Understanding the syntax of how these things work will definitely help you be more effective.

Hello, Operators

No, not that operator.

Operators are special words and symbols that tell the search process how to join, connect, and order your keywords, conditions, and values. There are several types of operators you can use.

AND and OR

These are the most commonly used operators, so I’m giving them a special section. AND and OR are logical Boolean operators used to join words, phrases, and search conditions together in a way that helps the search engine find stuff.

There are four fundamental things to know about AND and OR:

  • AND is not the same as and
  • OR is not the same as or
  • (c:c) is the same as AND
  • (c:s) is the same as OR

When constructing searches using the KQL syntax (typically, by typing the KQL query into the Keywords condition card), it’s very important to know that capitalization matters. Let’s say, for example, you construct a search with this query:

aaron and robert

as shown below:

The capitalization of AND and OR is so important to the syntax that the Security & Compliance Center user interface raises a dialog box, asking you to confirm:

If you select Keep query, the system will interpret that to construct a search that looks like this:

aaron and robert

That is equivalent to searching for the phrase "aaron and robert". To qualify as a match, an item must contain all three of those words in that order. You can see this in the following screencap:

Instead of choosing Keep query when you save and run, clicking Replace query will replace aaron and robert with aaron AND robert, which will, in turn, have different results.  The results will now include messages with both aaron and robert in any order. The operator word AND is no longer part of the search text (as it was in the previous example), so it isn’t highlighted:

It works the same way with the OR operator. If you choose Keep query with a lowercase or, your words will be joined as a phrase. If you choose Replace query, your keywords will be connected with the capitalized OR, indicating a logical connection.
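The check behind that dialog can be approximated outside the portal. The following is a minimal Python sketch, not the Security & Compliance Center's actual logic. The names `find_lowercase_operators` and `replace_query` are hypothetical, and the sketch assumes that only the bare words and/or outside double quotes are of interest.

```python
import re

# Operator words this sketch checks (an assumption: only the two discussed above).
OPERATOR_WORDS = {"and", "or"}

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN_PATTERN = re.compile(r'"[^"]*"|\S+')


def find_lowercase_operators(query: str) -> list[str]:
    """Return operator words outside quotes that are not fully capitalized."""
    tokens = TOKEN_PATTERN.findall(query)
    return [t for t in tokens if t.lower() in OPERATOR_WORDS and t != t.upper()]


def replace_query(query: str) -> str:
    """Capitalize bare operator words, leaving quoted phrases untouched."""
    def fix(match: re.Match) -> str:
        token = match.group(0)
        return token.upper() if token.lower() in OPERATOR_WORDS else token
    return TOKEN_PATTERN.sub(fix, query)


print(find_lowercase_operators("aaron and robert"))    # ['and']
print(replace_query("aaron and robert"))               # aaron AND robert
print(replace_query('"aaron and robert" or penguin'))  # "aaron and robert" OR penguin
```

Running something like this over a query before pasting it into the Keywords card is a cheap way to catch an accidental phrase search.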

Other Operators

In addition to OR and AND (which we already covered in a bit of detail), there are a number of other operators that may be used, depending on the scenario, placement, and conditions.   There are two main kinds of operators that I’ve already alluded to:

  • Boolean search operators
  • Condition property search operators

What are they? When are they used?

That’s why you read this blog–for all the cliffhangers.

Boolean Search Operators

Boolean search operators work by allowing you to join keywords and conditions together.  So if I wanted to locate documents that contain both the words penguins and cool, I’d use the Boolean operator AND to join those keywords together:

penguins AND cool

The following table lists the Boolean search operators available in Office 365.

| Operator | Example | Description |
|---|---|---|
| AND | keyword1 AND keyword2 | Returns items that include all of the specified keywords or property:value expressions. For example, from:"Aaron Guilmette" AND subject:"penguin videos" would return all messages sent by Aaron Guilmette whose subject contains the phrase “penguin videos”. |
| + (plus symbol) | keyword1 +keyword2 +keyword3 | Returns items that contain either keyword2 or keyword3 and that also contain keyword1. This example is equivalent to the query (keyword2 OR keyword3) AND keyword1. However, the query keyword1 + keyword2 (with a space after the + symbol) isn’t the same as using the AND operator. That query is equivalent to "keyword1 + keyword2", returning content that matches the exact phrase “keyword1 + keyword2”. This operator can be difficult to use, so I recommend using OR and AND to accomplish the same task with less confusion. |
| OR | keyword1 OR keyword2 | Returns items that include one or more of the specified keywords or property:value expressions. For example, (cat OR dog) OR author:Felix would return items that contain either of the keywords cat or dog, or that have Felix as an author. |
| NOT | keyword1 NOT keyword2; NOT from:"Aaron Guilmette"; NOT kind:im | Excludes items specified by a keyword or a property:value expression. When scoped to Exchange mailbox locations, the second example excludes messages sent from Aaron Guilmette. When scoped to Exchange locations, the third example excludes instant messages. |
| - (minus or hyphen symbol) | keyword1 -keyword2 | The minus sign or hyphen is functionally equivalent to the NOT operator. The example query returns items with keyword1 and excludes items that contain keyword2. For example, the queries hamsters NOT gerbils and hamsters -gerbils are evaluated the same way and return items that contain hamsters but do not contain gerbils. Because no one needs that many rodents. |
| NEAR | keyword1 NEAR(n) keyword2 | Returns items with words that are near each other, where n equals the number of words apart. For example, best NEAR(5) worst returns any item where the word “worst” is within five words of “best”. If no number is specified, the default distance is eight words. It searches in either direction, so whether keyword2 comes before or after keyword1, it will match as long as it’s within the distance. |
| ONEAR | keyword1 ONEAR(n) keyword2 | Similar to NEAR, but returns items with words that are near each other in the specified order. For example, best ONEAR(5) worst returns any item where the word “best” occurs before the word “worst” and the two words are within five words of each other. If no number is specified, the default distance is eight words. Note: The ONEAR operator isn’t supported when searching mailboxes. It only works when searching SharePoint and OneDrive for Business sites. If you’re searching mailboxes and sites in the same search and the query includes the ONEAR operator, the search returns mailbox items as if you were using the NEAR operator. In other words, the search returns items in which the specified words are near each other regardless of the order in which the words occur. |
| WORDS(a,b) | WORDS(cat,feline) | The WORDS operator tells the search process to treat the words inside the parentheses as synonyms. You can just as easily use OR, but I thought I’d include it here because it’s not very well-known. |
| : (colon symbol) | property:value | The colon (:) in the property:value syntax specifies that the property being searched contains the specified value. For example, recipients:garthf@contoso.com returns any message sent to garthf@contoso.com. |
| = (equals symbol) | property=value | Functionally the same as the : operator. |
| < (less than symbol) | property<value | Indicates that the property being searched is less than the specified value. |
| > (greater than symbol) | property>value | Indicates that the property being searched is greater than the specified value. |
| <= (less than symbol followed by equals symbol) | property<=value | Indicates that the property being searched is less than or equal to a specific value. |
| >= (greater than symbol followed by equals symbol) | property>=value | Indicates that the property being searched is greater than or equal to a specific value. |
| .. (double period) | property:value1..value2 | Indicates that the property being searched is between two values (greater than or equal to value1 and less than or equal to value2). |
| " " (double quotation marks) | "watermelon diaries"; subject:"cat memes" | Use double quotation marks (" ") to search for an exact phrase or term in keyword and property:value search queries. |
| * (asterisk symbol) | cat*; subject:set* | Prefix wildcard searches (where the asterisk is placed at the end of a word) match zero or more characters in keywords or property:value queries. For example, title:set* returns documents that contain the words set, setup, and setting (and other words that start with “set”) in the document title. Note: You can use only prefix wildcard searches, for example, cat* or set*. Suffix searches (*cat), infix searches (c*t), and substring searches (*cat*) are not supported. |
| ( ) (parentheses) | (cheap OR free) AND beer; (from:contoso.com)(pizza OR burger*) AND (party); (free beer) | Parentheses can be used to group keywords together as a phrase, or to combine keywords, phrases, and property:value pairs with Boolean operators. |

You can combine multiple query parameters using the operators to create more complex queries.
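To make the difference between NEAR and ONEAR concrete, here is a small Python sketch of proximity matching. It is an illustration only: the function name `near`, the word-splitting regex, and the way distance is counted (difference in word positions) are all assumptions, and the real search engine may tokenize and count differently.

```python
import re


def _positions(words: list[str], target: str) -> list[int]:
    """Indexes at which `target` occurs in the word list."""
    return [i for i, w in enumerate(words) if w == target]


def near(text: str, first: str, second: str, n: int = 8, ordered: bool = False) -> bool:
    """True if `first` and `second` occur within n word positions of each other.

    ordered=False behaves like NEAR (either direction).
    ordered=True behaves like ONEAR (`first` must come before `second`).
    The default of 8 mirrors the default distance described in the table.
    """
    words = re.findall(r"\w+", text.lower())
    for i in _positions(words, first.lower()):
        for j in _positions(words, second.lower()):
            gap = j - i
            if ordered and gap <= 0:
                continue  # ONEAR: second word must follow the first
            if 0 < abs(gap) <= n:
                return True
    return False


quote = "It was the best of times, it was the worst of times"
print(near(quote, "best", "worst", n=5))           # False: six positions apart
print(near(quote, "worst", "best"))                # True: NEAR ignores order
print(near(quote, "worst", "best", ordered=True))  # False: ONEAR requires order
```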

Condition property search operators

In the user interface, these are usually presented as drop-down choices:

Using those property operators (greater than, greater than or equal, etc.) on a condition card translates directly to a KQL query. Here are the search operators related to property conditions and how you can use them:

| Operator | Query equivalent | Description |
|---|---|---|
| After | property>date | Used with date conditions. Returns items that were sent, received, or modified after the specified date. |
| Before | property<date | Used with date conditions. Returns items that were sent, received, or modified before the specified date. |
| Between | date..date | Used with date and size conditions. When used with a date condition, returns items that were sent, received, created, or modified within the specified date range. When used with a size condition, returns items whose size is within the specified range. |
| Contains any of | (property:value) OR (property:value) | Used with conditions that use a string value type. Returns items that contain any part of one or more specified string values. |
| Doesn’t contain any of | NOT property:value; -property:value | Used with conditions for properties that specify a string value. Its function is to match items that don’t contain any part of the specified string (as in “everything except this”). |
| Doesn’t equal any of | NOT property=value; -property=value | Used with conditions for properties that specify a string value. Returns items that don’t contain the specific string (as in “everything except this”). |
| Equals | size=value | Returns items whose size is equal to the specified value. |
| Equals any of | (property=value) OR (property=value) | Used with conditions for properties that specify a string value to match. |
| Greater | size>value | Returns items where the specified property is greater than the specified value. |
| Greater or equal | size>=value | Returns items where the specified property is greater than or equal to the specified value. |
| Less | size<value | Returns items that are less than the specified value. |
| Less or equal | size<=value | Returns items that are less than or equal to the specified value. |
| Not equal | size<>value | Returns items that don’t equal the specified size. |
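The translation from a condition card to KQL can be sketched as a lookup. The Python below is a hypothetical helper, not anything the portal exposes: `condition_to_kql` and the three dictionaries are names made up for illustration, the negated operators accept only a single value here, and values are not quoted or escaped.

```python
# Condition-card operator name -> KQL comparison symbol (single value).
COMPARISONS = {
    "After": ">",
    "Before": "<",
    "Equals": "=",
    "Greater": ">",
    "Greater or equal": ">=",
    "Less": "<",
    "Less or equal": "<=",
    "Not equal": "<>",
}

# Operators that accept several values, joined with OR.
ANY_OF = {"Contains any of": ":", "Equals any of": "="}

# Negated operators; this sketch handles exactly one value for these.
NEGATIONS = {"Doesn't contain any of": ":", "Doesn't equal any of": "="}


def condition_to_kql(operator: str, prop: str, *values: str) -> str:
    """Build the KQL fragment for one condition card (illustrative only)."""
    if operator in COMPARISONS:
        (value,) = values
        return f"{prop}{COMPARISONS[operator]}{value}"
    if operator == "Between":
        low, high = values
        return f"{prop}:{low}..{high}"
    if operator in ANY_OF:
        symbol = ANY_OF[operator]
        return " OR ".join(f"({prop}{symbol}{v})" for v in values)
    if operator in NEGATIONS:
        (value,) = values
        return f"NOT {prop}{NEGATIONS[operator]}{value}"
    raise ValueError(f"Unsupported operator: {operator}")


print(condition_to_kql("After", "date", "2020-03-03"))
print(condition_to_kql("Between", "date", "2020-01-01", "2020-03-31"))
print(condition_to_kql("Equals any of", "filetype", "pptx", "docx"))
```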

Grouping

You’ve already seen a few examples in passing of using parentheses to group terms in a query.  We’re going to look at evaluating them a little more closely in this section.

Operations are generally evaluated left to right, so if you need to be specific (or want to ensure you’re getting the results you anticipate), you can use parentheses to group the expressions.

For example:

cat AND mouse OR dog

could be evaluated several ways:

(cat AND mouse) OR dog or (cat) AND (mouse OR dog) 

It’s best to group the expressions so that the meaning and precedence are clear and the search returns a predictable result.
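To see why the grouping matters, the two readings can be compared against a few sample items. This is a rough Python sketch that treats each item as a plain set of words. The document names and contents are invented for illustration, and it says nothing about how the search service actually evaluates an ungrouped query.

```python
def matches_group_a(words: set[str]) -> bool:
    # (cat AND mouse) OR dog
    return ("cat" in words and "mouse" in words) or "dog" in words


def matches_group_b(words: set[str]) -> bool:
    # cat AND (mouse OR dog)
    return "cat" in words and ("mouse" in words or "dog" in words)


# Hypothetical items, each reduced to the keywords it contains.
documents = {
    "doc1": {"cat", "mouse"},
    "doc2": {"dog"},
    "doc3": {"cat", "dog"},
    "doc4": {"mouse"},
}

for name, words in documents.items():
    print(name, matches_group_a(words), matches_group_b(words))
# doc1 True True
# doc2 True False
# doc3 True True
# doc4 False False
```

The item that mentions only dog is returned by the first grouping and dropped by the second, which is exactly the kind of difference that goes unnoticed until someone asks why a message is missing.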

Tips

  • When using the Keyword condition card, any other conditions you supply are logically connected by the AND (c:c) operator.  For example, if you enter cat and then select a date condition card and choose After 2020-03-03 as the date parameter, you’re only going to return content that meets both criteria (cat AND date>2020-03-03).  This also applies to using other unique conditions (for example, using the Date and Sender conditions or Sender and Subject conditions).
  • If you add more than one value to a condition, those values are logically connected with an OR.  For example, if you use the File type card and you enter pptx, docx in the box, it is the same as using the KQL search syntax (filetype:pptx) OR (filetype:docx).
  • Use parentheses to create groupings and give order and structure to your query.
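The first two tips can be combined into one rule of thumb: values inside a condition are OR'd, and the keyword box and each condition are AND'd. Here is a minimal Python sketch of that rule. The function `build_query` is hypothetical, it uses the colon operator for every condition (as in the File type example above), and the query the portal actually generates may look different.

```python
def build_query(keywords: str, conditions: dict[str, list[str]]) -> str:
    """Join keywords and conditions with AND; join each condition's values with OR."""
    parts = []
    if keywords:
        parts.append(f"({keywords})")
    for prop, values in conditions.items():
        if not values:
            continue  # a condition with no values contributes nothing
        ored = " OR ".join(f"({prop}:{value})" for value in values)
        # Wrap multi-value conditions so the OR stays inside its own group.
        parts.append(f"({ored})" if len(values) > 1 else ored)
    return " AND ".join(parts)


print(build_query("cat", {"filetype": ["pptx", "docx"]}))
# (cat) AND ((filetype:pptx) OR (filetype:docx))
```

Wrapping each multi-value condition in its own parentheses follows the third tip: the OR between file types cannot leak out and combine with the keyword.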

Further Reading

In case this post wasn’t long enough, I’ve compiled a list of resources that you can also refer to.

Series Navigation
<< Office 365 Security & Compliance Center eDiscovery – Part 2: Condition Cards: Sender, Recipients, & Participants and Content Types
Office 365 Security & Compliance Center eDiscovery – Part 4: Learning NEAR and ONEAR >>
Aaron Guilmette
Helping companies conquer inferior technology since 1997. I spend my time developing and implementing technology solutions so people can spend less time with technology. Specialties: Active Directory and Exchange consulting and deployment, Virtualization, Disaster Recovery, Office 365, datacenter migration/consolidation, cheese.