Word frequency query

Use Word Frequency queries to list the most frequently occurring words or concepts in your files.

You could use a Word Frequency query to:

  • Identify possible themes, particularly in the early stages of a project.
  • Analyze the most frequently used words in a particular demographic. For example, analyze the most common words used by farmers when discussing climate change. You could run a Coding query to gather all content coded at climate change and at 'case' nodes with the attribute farmer, then select the result node as the criteria for the Word Frequency query.
  • Look for exact words, or broaden your search to find the most frequently occurring concepts. For example, if you look for the most frequent words in a survey dataset, you might find that water, health, and harmful are the most frequently occurring words. However, if you group similar words together, you might find that the concept of pollution (including pollutants, pollution, polluted, and pollutes) occurs most frequently.
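
The difference between counting exact words and counting grouped words can be sketched in a few lines of Python. This is only an illustration of the idea, not how NVivo groups words internally: the sample responses, the STOP_WORDS set, and the crude_stem function (a rough suffix stripper) are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words and suffix list, for illustration only.
STOP_WORDS = {"the", "is", "to", "in", "and"}
SUFFIXES = ("ants", "ant", "ion", "ed", "es", "s")

def crude_stem(word):
    """Strip one common suffix so related word forms share a key."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

responses = [
    "The water is harmful to health",
    "Pollutants in the water",
    "Pollution is harmful",
    "The river is polluted and health suffers",
    "Industry pollutes the water",
]

# Lowercase the text and keep alphabetic runs that are not stop words.
words = [
    w
    for text in responses
    for w in re.findall(r"[a-z]+", text.lower())
    if w not in STOP_WORDS
]

exact = Counter(words)                           # exact matches only
grouped = Counter(crude_stem(w) for w in words)  # similar words grouped

print(exact.most_common(3))
print(grouped.most_common(3))
```

With exact matching, water is the most frequent word in this sample. Once the four pollution-related forms are grouped under one key, that group overtakes it.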

Before you run a Word Frequency query, make sure the text content language is set to the language of your files.

Create a word frequency query

If you are not familiar with NVivo queries, you may want to create your query using the Wizard—the Wizard guides you through the process of setting your query criteria. However, not all query features are available in the Wizard, so you may sometimes want to create your query outside the Wizard.

When the query has finished running, the results are displayed as a temporary preview in Detail View.

NOTE

  • To save the query, click the Add to Project button, then enter a name and (optionally) a description.

View the results

When you run a Word Frequency query, the results are displayed in Detail View.

Click on the tabs displayed on the right to see different views of the results:

  • Summary: Displays a list of the most frequently occurring words (excluding stop words).
  • Word Cloud: Displays up to 100 words in varying font sizes, where frequently occurring words are in larger fonts.
  • Tree Map: Displays up to 100 words as a series of rectangles, where frequently occurring words are in larger rectangles.
  • Cluster Analysis: Displays up to 100 words as a dendrogram, where words that co-occur are clustered together.
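
To see how word counts might translate into a Word Cloud, consider the following Python sketch. It keeps the 100 most frequent words and scales font size linearly with the count. The linear scaling, the point sizes, and the cloud_sizes function are assumptions made for illustration; NVivo's actual sizing rules are not described here.

```python
from collections import Counter

def cloud_sizes(counts, max_words=100, min_pt=10.0, max_pt=48.0):
    """Map the most frequent words to font sizes (hypothetical linear scale)."""
    top = counts.most_common(max_words)  # keep at most max_words entries
    if not top:
        return {}
    highest = top[0][1]
    lowest = top[-1][1]
    span = highest - lowest
    sizes = {}
    for word, n in top:
        # If every word has the same count, give them all the largest size.
        scale = (n - lowest) / span if span else 1.0
        sizes[word] = min_pt + scale * (max_pt - min_pt)
    return sizes

sample = Counter({"water": 9, "health": 5, "harmful": 1})
print(cloud_sizes(sample))
```

A Tree Map could use the same counts to set rectangle areas instead of font sizes.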

What words are counted in a Word Frequency query?

  • Words containing punctuation (such as hyphens, periods and other symbols) are divided into separate words. For example, part-time will be counted as part and time.
  • Words containing apostrophes (such as o'clock and d'accord) are treated as one word. However, if the word ends in 's, the 's is not included (Tom's would be counted as Tom).
  • In audio and video transcripts, only words in the Content field (column) are included in the query—any words in custom transcript fields are ignored.
  • In datasets, only words in codable fields (columns) are included in the query—any words in classifying fields are ignored.
  • When searching text in selected nodes, if a word is coded against multiple nodes, it is counted once for each node. Similarly, if a word has been coded by multiple users to the same node, it is counted once for each user.
  • Word Frequency queries do not include 'stop words'.
  • Word Frequency queries do not search text within images. PDFs created by scanning paper documents may contain only images—each page is a single image. If you want to use Word Frequency queries to explore the text in these PDFs, then you should consider using optical character recognition (OCR) to convert the scanned images to text (before you import the PDF files into NVivo).
  • A Word Frequency query does not search text in framework matrix summaries.
  • If the text content language is Japanese, the 'base form' is listed in the query results, but the count includes any alternate forms of the word.
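
The punctuation, apostrophe, and stop word rules above can be approximated with a short Python sketch. It is a simplified model, not NVivo's tokenizer: the tokenize and count_words functions are hypothetical, and the sketch additionally assumes lowercase folding, plain ASCII apostrophes, and alphabetic words only.

```python
import re
from collections import Counter

# A word is a run of letters, optionally joined by internal apostrophes.
# Hyphens, periods, and other symbols are not matched, so they split words.
TOKEN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")

def tokenize(text):
    """Split text into words following the rules listed above (simplified)."""
    words = []
    for match in TOKEN.finditer(text):
        word = match.group().lower()  # case folding is an assumption
        if word.endswith("'s"):
            word = word[:-2]          # Tom's is counted as Tom
        words.append(word)
    return words

def count_words(text, stop_words):
    """Count words, leaving out anything on the stop word list."""
    return Counter(w for w in tokenize(text) if w not in stop_words)

text = "Tom's part-time shift starts at nine o'clock."
print(tokenize(text))
print(count_words(text, stop_words={"at"}))
```

Here part-time is split into part and time, o'clock stays whole, and Tom's is reduced to tom before counting.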

Exclude particular words

Word Frequency queries do not include 'stop words'. By default, these are less significant words, such as conjunctions or prepositions, that may not be meaningful to your analysis.

In the query results, select the word you want to exclude. Then, on the Word Frequency Query tab, in the Words group, click Add to Stop Words List. The words you add to the stop word list are excluded the next time you run the query.
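
The effect of adding a word to the stop word list can be pictured with a small Python sketch. The word list, the starting stop words, and the word_frequency function are hypothetical. The point is only that results already on screen do not change, while the next run leaves the word out.

```python
from collections import Counter

def word_frequency(words, stop_words):
    """Count every word that is not on the stop word list."""
    return Counter(w for w in words if w not in stop_words)

words = ["water", "said", "water", "health", "said", "said"]
stop_words = {"the", "and"}

first_run = word_frequency(words, stop_words)
print(first_run.most_common(1))   # 'said' dominates the first run

stop_words.add("said")            # like clicking Add to Stop Words List

second_run = word_frequency(words, stop_words)
print(second_run.most_common(1))  # 'said' is gone from the next run
```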

You can view and edit the list of stop words. See Text content language and stop words.

NOTE  In NVivo for Teams projects, only Project Owners can add words to the stop word list.

Create a node to gather references

  1. In the query results, select the word you want to use to create a node.
  2. On the Word Frequency Query tab, in the Words group, click Create As Code.
  3. Select a location and name the node.
  4. Click OK.

Run a Text Search query for a word

  1. In the query results, select the word, then on the Word Frequency Query tab, in the Words group, click Run Text Search Query.
  2. (Optional) Change the Text Search Criteria or Query Options. See Text Search query.
  3. Click Run.

NOTES

  • You can also double-click a word in the Word Cloud to run a Text Search query.
  • If the text content language is Japanese, the Text Search query will find all occurrences of the base form or any alternate forms of the word.