Word frequency query

Use word frequency queries to list the most frequently occurring words in your files.

You can use a word frequency query to:

  • Identify possible themes, particularly in the early stages of a project.
  • Analyze the most frequently used words in a particular demographic. For example, analyze the most common words used by farmers when discussing climate change. To do this, first gather the relevant content into a result code, then select the result code as the criteria for the word frequency query.
  • Look for exact words or include words with the same stem. For example, if you look for the most frequent exact words in a set of interviews, you might find that water, health, and harmful occur most often. However, if you include words with the same stem, you might find that pollution (including pollutants, pollution, polluted, and pollutes) occurs most frequently.
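
The difference between exact and stemmed counting can be illustrated with a short Python sketch. This is not how NVivo implements stemming: the crude_stem function and its suffix list are hypothetical and only handle the example words above.

```python
from collections import Counter

# Hypothetical suffix list for illustration only; real stemmers are far more thorough.
SUFFIXES = ("ants", "ant", "ion", "ed", "es", "e", "s")

def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["water", "water", "water", "health", "health",
         "pollution", "pollutants", "polluted", "pollutes"]

exact = Counter(words)                           # each spelling counted separately
stemmed = Counter(crude_stem(w) for w in words)  # spellings grouped by stem

print(exact.most_common(1))    # [('water', 3)]
print(stemmed.most_common(1))  # [('pollut', 4)]
```

With exact matching, each form of pollution appears only once, so water ranks first. Grouping by stem combines the four forms into a single entry that outranks it.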

Before you run a word frequency query, make sure the text content language is set to the language of your files. See Text content language & stop words.

Create a word frequency query

  1. On the Explore tab, click Word Frequency.
  2. Choose where you want to search for matching text:
    • Files and Externals: search for content in all the files and externals in your project.
    • Selected Items: restrict your search to selected items (for example, a set containing interview transcripts).
    • Items in Selected Folders: restrict your search to content in selected folders (for example, a folder of interview transcripts).
  3. (Optional) Select Include stemmed words if you want matches to include words with the same stem (for example, a search for 'talk' also finds 'talking').
  4. (Optional) Choose how many words to display:
    • All: include all words found in the selected project items.
    • <number> most frequent: include a specific number of words. For example, you could display the 100 most frequently occurring words.
  5. (Optional) Enter a value for With minimum length to exclude short words from the results. For example, enter 7 to display only words with seven or more letters.
  6. Click the Run Query button at the top of the Detail View.

When the query has finished running, the results are displayed as a temporary preview in Detail View.
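
To see how the display limit and minimum length options interact, consider the following Python sketch. It is an illustration under stated assumptions rather than NVivo's implementation: the most_frequent function and its parameter names are hypothetical, and treating uppercase and lowercase forms as the same word is an assumption.

```python
from collections import Counter
from typing import Optional

def most_frequent(words, limit: Optional[int] = None, min_length: int = 1):
    """Count words with at least min_length letters.

    Returns (word, count) pairs, most frequent first.
    limit=None returns every word; otherwise only the top `limit` words.
    """
    # Assumption: case is ignored, so "Climate" and "climate" are one word.
    counts = Counter(w.lower() for w in words if len(w) >= min_length)
    return counts.most_common(limit)

sample = ["Climate", "climate", "rain", "rainfall", "rainfall",
          "rainfall", "farming", "soil"]

# Display the 2 most frequent words that have seven or more letters.
print(most_frequent(sample, limit=2, min_length=7))
# [('rainfall', 3), ('climate', 2)]
```

In this sketch the length filter is applied before the ranking, so short words such as rain and soil never take up one of the displayed positions.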

NOTE

  • To save query settings so you can run a query again later, click Save Criteria. Name the query criteria file and optionally add a description. The file is saved under Queries / Query Criteria in the Navigation View.

View the results

When you run a word frequency query, the results are displayed in the Detail View.

Click the Undock icon (in the top right of the Detail View) to open the results in their own window, giving you more space to work. See Customize the workspace.

Click on the tabs displayed at the top to see different views of the results:

  • Summary: Displays a list of the most frequently occurring words (excluding stop words).
  • Word Cloud: Displays up to 100 words in varying font sizes, where more frequently occurring words appear in larger fonts.

What words are counted in a word frequency query?

  • Words containing punctuation (such as hyphens, periods and other symbols) are divided into separate words. For example, part-time will be counted as part and time.
  • Words containing apostrophes (such as o'clock and d'accord) are treated as one word. However, if the apostrophe is followed by an s, the 's is not included (Tom's is counted as Tom).
  • In audio and video transcripts, only words in the Transcript field (column) are included in the query—any words in the Speaker field are ignored.
  • In datasets, only words in codable fields (columns) are included in the query—any words in classifying fields are ignored.
  • When searching text in selected codes, if a word is coded against multiple codes, it is counted once for each code. Similarly, if a word has been coded by multiple users to the same code, it is counted once for each user.
  • Word frequency queries do not include stop words. See Text content language & stop words.
  • Word frequency queries do not search text within images. PDFs created by scanning paper documents may contain only images, with each page stored as a single image. To explore the text in these PDFs with a word frequency query, consider using optical character recognition (OCR) to convert the scanned images to text before you import the files into NVivo. You can also try using NVivo's 'recognize text' feature to turn image-based text into codable content. See Turning a scanned document into codable text.
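
The punctuation and apostrophe rules above can be approximated with a short Python sketch. It is an illustration only and does not reproduce NVivo's own tokenizer: the tokenize function and the TOKEN pattern are hypothetical, and counting digits as part of words is an assumption.

```python
import re

# One or more letters or digits, optionally joined by internal apostrophes
# (straight or curly). Any other punctuation, such as a hyphen, ends the word.
TOKEN = re.compile(r"[^\W_]+(?:['’][^\W_]+)*")

def tokenize(text: str) -> list[str]:
    """Split text into words following the rules sketched above."""
    tokens = []
    for match in TOKEN.finditer(text):
        word = match.group()
        # Drop a trailing 's so that "Tom's" is counted as "Tom".
        if word.lower().endswith(("'s", "’s")):
            word = word[:-2]
        tokens.append(word)
    return tokens

print(tokenize("Tom's part-time shift ends at five o'clock."))
# ['Tom', 'part', 'time', 'shift', 'ends', 'at', 'five', "o'clock"]
```

Note that part-time yields two words while o'clock stays whole, matching the examples in the list above.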

Exclude particular words

Word frequency queries do not include stop words. By default, these are less significant words, such as conjunctions or prepositions, that may not be meaningful to your analysis.

Select the word you want to exclude from the query results, then click Add to Stop Words List in the Explore menu. The words you add to the stop word list will be excluded the next time you run the query.

You can view and edit the list of stop words. See Text content language & stop words.
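
The effect of adding a word to the stop word list can be pictured with the following Python sketch. It is a simplified illustration, not NVivo's implementation: the stop_words set here is a tiny hypothetical list, and count_words is a hypothetical helper.

```python
from collections import Counter

# A tiny hypothetical stop word list; the real list depends on the
# text content language set for the project.
stop_words = {"the", "and", "of", "to", "in"}

def count_words(words, stop_words):
    """Count words, skipping any that appear in the stop word list."""
    return Counter(w for w in map(str.lower, words) if w not in stop_words)

words = "the farmers said the drought and the flood hurt the farmers".split()

first_run = count_words(words, stop_words)
print(first_run["said"])   # 1

# Comparable to selecting 'said' and clicking Add to Stop Words List.
stop_words.add("said")

# The word is excluded the next time the query is run.
second_run = count_words(words, stop_words)
print(second_run["said"])  # 0
```

The first set of results is unchanged by the addition. Only a new run of the query reflects the updated list.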

Run a text search for a word

You can run text searches directly on words in the results of word frequency queries.

  1. Select the word in the word frequency results.
  2. In the Explore menu, click Other Actions > Run Text Search Query.
  3. (Optional) Change the Text Search Criteria. See Text search.
  4. Click Run Query.

NOTE

  • You can also double-click a word in the Word Cloud to run a text search.