Word frequency query

Use Word Frequency queries to list the most frequently occurring words or concepts in your files.

You could use a Word Frequency query to:

  • Identify possible themes, particularly in the early stages of a project.
  • Analyze the most frequently used words in a particular demographic. For example, to analyze the most common words used by farmers when discussing climate change, first gather that content into a result node, then select the result node as the criteria for the Word Frequency query.
  • Compare exact words with words that share the same stem. For example, if you look for the most frequent exact words in a set of interviews, you might find that water, health, and harmful are the most frequently occurring words. However, if you include words with the same stem, you might find that pollution (including pollutants, pollution, polluted, and pollutes) occurs most frequently.
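
The difference between exact and stemmed counting can be illustrated with a minimal Python sketch. It does not show how NVivo implements stemming: crude_stem and its suffix list are hypothetical helpers invented for this illustration, and the word list is made-up sample data.

```python
from collections import Counter

# Hypothetical suffix list for illustration only; real stemmers are far more thorough.
SUFFIXES = ("ants", "ion", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least four leading letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


# Made-up sample words standing in for interview text.
words = [
    "water", "water", "water",
    "health", "health",
    "pollution", "polluted", "pollutants", "pollutes",
]

exact_counts = Counter(words)
stemmed_counts = Counter(crude_stem(w) for w in words)

print(exact_counts.most_common(1))    # most frequent exact word
print(stemmed_counts.most_common(1))  # most frequent stem
```

With this sample data, water is the most frequent exact word (3 occurrences), while grouping by stem puts the pollution family first (4 occurrences).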

Before you run a Word Frequency query, make sure the text content language is set to the language of your files.

Create a word frequency query

  1. On the Query tab, in the Create group, click Word Frequency.
  2. Choose where you want to search for matching text:
    • Files and Externals: search for content in all the files and externals in your project.
    • Selected Items: restrict your search to selected items (for example, a set containing interview transcripts).
    • Items in Selected Folders: restrict your search to content in selected folders (for example, a folder of interview transcripts).
  3. (Optional) Select Include stemmed words if you want to include words with the same stem when finding matches (for example, look for 'talk' and also find 'talking').
  4. (Optional) You can choose to display:
    • All to include all words found in the selected project items.
    • <number> most frequent to include a specific number of words—for example, you could display the 100 most frequently occurring words.
  5. (Optional) Enter a value in With minimum length to exclude short words from the results. For example, enter 7 to display only words with seven or more letters.
  6. Click the Run Query button at the top of Detail View.
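
As a rough illustration of how the display and minimum length options shape the results, here is a minimal Python sketch. The function word_frequency and its parameters top_n and min_length are hypothetical names chosen for this example, not part of NVivo, and the sample words are made up.

```python
from collections import Counter


def word_frequency(words, top_n=None, min_length=1):
    """Count words of at least min_length letters; return the top_n most frequent.

    top_n=None returns every counted word, mirroring the "All" display option.
    """
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(top_n)


# Made-up sample words.
sample = ["climate", "climate", "climate", "farm", "farm", "drought", "drought", "rain"]

print(word_frequency(sample))                # all words
print(word_frequency(sample, top_n=1))       # single most frequent word
print(word_frequency(sample, min_length=7))  # words with seven or more letters
```

Applying the length filter before counting keeps short words out of the ranking entirely, so they do not take up places among the most frequent words.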

When the query has finished running, the results are displayed as a temporary preview in Detail View.

NOTE

  • To save the query, click the Save Query button and enter the name and description (optional).

View the results

When you run a Word Frequency query the results are displayed in Detail View.

Click the Expand button (at the top right of Detail View) to make more room for working with your data. Refer to Customize the work area for more information.

Click on the tabs displayed at the top to see different views of the results:

  • Summary: displays a list of the most frequently occurring words (excluding stop words).
  • Word Cloud: displays up to 100 words in varying font sizes, where frequently occurring words are in larger fonts.
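
To show how frequency might translate into font size, the following Python sketch scales sizes linearly between a minimum and a maximum. The linear scheme, the function font_sizes, and the size limits are assumptions made for illustration. How NVivo actually sizes the words is not documented here.

```python
def font_sizes(counts, min_size=10, max_size=48, max_words=100):
    """Map word counts to font sizes by linear scaling (an assumed scheme)."""
    # Keep only the most frequent words, up to max_words.
    top = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:max_words]
    if not top:
        return {}
    highest = top[0][1]
    lowest = top[-1][1]
    spread = highest - lowest
    sizes = {}
    for word, count in top:
        # When all counts are equal, give every word the largest size.
        scale = (count - lowest) / spread if spread else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


# Made-up counts for three words.
print(font_sizes({"water": 30, "health": 20, "harmful": 10}))
```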

What words are counted in a Word Frequency query?

  • Words containing punctuation (such as hyphens, periods and other symbols) are divided into separate words. For example, part-time will be counted as part and time.
  • Words containing apostrophes (such as o'clock and d'accord) are treated as one word, but if the apostrophe is followed by an s, the s is not included (Tom's would be counted as Tom).
  • In audio and video transcripts, only words in the Transcript field (column) are included in the query—any words in the Speaker field are ignored.
  • In datasets, only words in codable fields (columns) are included in the query—any words in classifying fields are ignored.
  • When searching text in selected nodes, if a word is coded against multiple nodes, it is counted once for each node. Similarly, if a word has been coded by multiple users to the same node, it is counted once for each user.
  • Word Frequency queries do not include 'stop words'—refer to Exclude particular words for more information.
  • Word Frequency queries do not search text within images. PDFs created by scanning paper documents may contain only images—each page is a single image. If you want to use Word Frequency queries to explore the text in these PDFs, then you should consider using optical character recognition (OCR) to convert the scanned images to text (before you import the PDF files into NVivo). You can also try using NVivo's recognize text feature to turn image-based text into codable content.
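
The punctuation and apostrophe rules above can be approximated with a short Python sketch. It is an approximation written for illustration, not NVivo's tokenizer: the tokenize function and its regular expression are hypothetical, and only the straight ASCII apostrophe is handled.

```python
import re

# One or more letters or digits, optionally joined by ASCII apostrophes.
# Hyphens, periods, and other symbols are not matched, so they split words.
WORD_PATTERN = re.compile(r"[^\W_]+(?:'[^\W_]+)*")


def tokenize(text):
    """Split text into words, dropping a trailing 's from each word."""
    tokens = []
    for token in WORD_PATTERN.findall(text):
        if token.endswith("'s"):
            token = token[:-2]  # Tom's is counted as Tom
        tokens.append(token)
    return tokens


print(tokenize("Tom's part-time shift ends at 5 o'clock."))
```

In this sketch, part-time yields the two words part and time, o'clock stays a single word, and Tom's is reduced to Tom.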

Exclude particular words

Word Frequency queries do not include 'stop words'. By default, these are less significant words, such as conjunctions or prepositions, that may not be meaningful to your analysis.

Select the word you want to exclude from the query results, then click Add to Stop Words List, in the Actions group on the Query tab. The words you add to the stop word list will be excluded the next time you run the query.

You can view and edit the list of stop words. Refer to Set the text content language and stop words for more information.
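
The effect of a stop word list can be sketched in a few lines of Python. The list below is a tiny made-up example rather than NVivo's default list, the function count_without_stop_words is a hypothetical name, and the case-insensitive matching is an assumption.

```python
from collections import Counter

# A tiny, made-up stop word list; NVivo's default list is not reproduced here.
stop_words = {"the", "and", "of", "in"}


def count_without_stop_words(words, stop_words):
    """Count words case-insensitively (an assumption), skipping stop words."""
    return Counter(w.lower() for w in words if w.lower() not in stop_words)


# Made-up sample words.
words = ["The", "water", "and", "the", "health", "of", "water", "really", "really", "really"]

print(count_without_stop_words(words, stop_words).most_common(1))

# Adding a word to the list changes only later runs of the count.
stop_words.add("really")
print(count_without_stop_words(words, stop_words).most_common(1))
```

The first count ranks really highest. After really is added to the list and the count is run again, water takes its place.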

Run a Text Search query for a word

  1. On the Query tab, in the Actions group, click Other Actions, and then click Run Text Search Query.
  2. (Optional) Change the Text Search Criteria. Refer to Run a Text Search query for more information.
  3. Click Run Query.

NOTE

  • You can also double-click a word in the Word Cloud to run a Text Search query.