Word frequency query
Use Word Frequency queries to list the most frequently occurring words or concepts in your files.
You could use a Word Frequency query to:
- Identify possible themes, particularly in the early stages of a project.
- Analyze the most frequently used words in a particular demographic. For example, analyze the most common words used by farmers when discussing climate change.
You could do a coding query to gather all content coded at climate change and at 'case' nodes with the attribute farmer—then select the result node as the criteria for the Word Frequency query.
- Look for exact words, or broaden your search to find the most frequently occurring concepts. For example, if you look for the most frequent words in a dataset survey, you might find that water, health, and harmful are the most frequently occurring words. However, if you group similar words together, you might find that the concept of pollution (including pollutants, pollution, polluted, and pollutes) occurs most frequently.
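The difference between exact matching and grouping similar words can be illustrated outside NVivo with a short Python sketch. This is not NVivo's implementation: the word list is made-up sample data, and crude_stem and its SUFFIXES list are hypothetical stand-ins for a proper stemmer.

```python
from collections import Counter

# Hypothetical, deliberately crude suffix list (an assumption for this sketch).
# A real stemmer handles far more cases than this.
SUFFIXES = ("ants", "ant", "ion", "ed", "es", "s")


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least four leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


# Made-up sample data standing in for the words found in a survey dataset.
words = [
    "water", "water", "water",
    "health", "health",
    "harmful",
    "pollutants", "pollution", "polluted", "pollutes",
]

exact = Counter(words)                          # exact matches only
grouped = Counter(crude_stem(w) for w in words)  # words sharing a stem are merged

print(exact.most_common(1))    # most frequent exact word
print(grouped.most_common(1))  # most frequent stem group
```

With exact matching, water is the most frequent word in this sample. Once the four pollution-related words are merged under one stem, that group overtakes it.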
Before you run a Word Frequency query, make sure the text content language is set to the language of your files.
Create a word frequency query
If you are not familiar with NVivo queries, you may want to create your query using the Wizard—the Wizard guides you through the process of setting your query criteria. However, not all query features are available in the Wizard, so you may sometimes want to create your query outside the Wizard.
On the Explore tab, in the Query group, click Query Wizard.
The Query Wizard opens. Follow the steps on the Wizard.
Choose the query you want to run.
Click Identify frequently occurring terms in content.
Specify the terms you want to search for.
In the Display words box, specify the number of words displayed in the results—for example, show only the top 20 words.
In the Minimum word length box, type the number of characters of the smallest word you want to include. For example, a minimum word length of 4 excludes words of three or fewer characters from the results.
Select a Grouping option. Choose to find exact matches or group words with the same stem together—for example, you can search for sport and find sporting.
You can adjust the slider to broaden your search to find similar concepts. For example, find sport, play and recreation.
Choose where you want to search.
Choose whether you want to search text in all your files, or restrict the search to selected items or folders.
Choose whether to add the query to your project.
You can run the query once or choose to add it to your project (and run it).
If you choose to add it to your project, you must enter a name. You can optionally enter a description.
Click Run.
NOTE If you want to use Word Frequency query features that are not available via the Wizard—for example, only count words in files created by specific users—you can add the query to your project and update it later.
To create the query outside the Wizard:
- On the Explore tab, in the Query group, click Word Frequency.
- Choose where you want to search for matching text:
- Files & Externals—search for content in all the files and externals in your project
- Selected Items—restrict your search to selected items (for example, a set containing interview transcripts)
- Selected Folders—restrict your search to content in selected folders (for example, a folder of interview transcripts)
- Specify how many words you want to display:
- <number> most frequent to include a specific number of words. For example, you could display the 100 most frequently occurring words.
- All to include all words found in the selected project items.
- (Optional) Enter a minimum word length to exclude short words from the results—for example, enter 5 to display only words with five or more letters.
- Select a Grouping option. Choose to find exact matches or group words with the same stem together—for example, search for sport and find sporting. You can adjust the slider to broaden your search to find similar concepts. For example, find sport, play and recreation.
- Click the Run Query button at the top of Detail View.
When the query has finished running, the results are displayed as a temporary preview in Detail View.
- To save the query, click the Add to Project button and enter the name and description (optional).
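The effect of the display and minimum word length settings can be sketched in Python. This is an illustration of the idea only, not how NVivo computes results: most_frequent, top_n and min_length are hypothetical names, and the sentence is made-up sample text.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def most_frequent(
    words: Iterable[str],
    top_n: Optional[int] = None,
    min_length: int = 1,
) -> List[Tuple[str, int]]:
    """Count words of at least min_length characters.

    top_n=None returns every word found (like the All option);
    an integer returns only that many of the most frequent words.
    """
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(top_n)


# Made-up sample text, already split into words.
sample = "the farm needs water and the farm needs rain today".split()

print(most_frequent(sample, top_n=1, min_length=5))
print(most_frequent(sample, min_length=5))
```

With a minimum length of 5, short words such as the, and, farm and rain are dropped before counting, so they never appear in the results regardless of how often they occur.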
View the results
When you run a Word Frequency query, the results are displayed in Detail View.
Click the tabs to view the results in different formats:
- Summary: displays a list of the most frequently occurring words (excluding stop words).
- Word Cloud: displays up to 100 words in varying font sizes, where frequently occurring words are in larger fonts.
- Tree Map: displays up to 100 words as a series of rectangles, where frequently occurring words are in larger rectangles.
- Cluster Analysis: displays up to 100 words as a dendrogram, where words that co-occur are clustered together.
What words are counted in a Word Frequency query?
- Words containing punctuation (such as hyphens, periods and other symbols) are divided into separate words. For example, part-time will be counted as part and time.
- Words containing apostrophes (such as o'clock and d'accord) are treated as one word. However, if the apostrophe is followed by an s, the s is not included. For example, Tom's is counted as Tom.
- In audio and video transcripts, only words in the Content field (column) are included in the query. Any words in custom transcript fields are ignored.
- In datasets, only words in codable fields (columns) are included in the query—any words in classifying fields are ignored.
- When searching text in selected nodes, if a word is coded against multiple nodes, it is counted once for each node. Similarly, if a word has been coded by multiple users to the same node, it is counted once for each user.
- Word Frequency queries do not include 'stop words'.
- Word Frequency queries do not search text within images. PDFs created by scanning paper documents may contain only images—each page is a single image. If you want to use Word Frequency queries to explore the text in these PDFs, then you should consider using optical character recognition (OCR) to convert the scanned images to text (before you import the PDF files into NVivo).
- A Word Frequency query does not search text in framework matrix summaries.
- If the text content language is Japanese, the 'base form' is listed in the query results, but the count includes any alternate forms of the word.
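The punctuation and apostrophe rules above can be approximated with a small Python sketch. It is an assumption-laden illustration, not NVivo's tokenizer: the tokenize function is hypothetical, it handles only the straight apostrophe ('), and it treats digits as word characters, which the rules above do not specify.

```python
import re
from typing import List

# A word is a run of letters or digits, optionally continued by an
# apostrophe plus more letters or digits (o'clock, d'accord, Tom's).
# Any other punctuation, such as a hyphen or period, ends the word.
WORD_PATTERN = re.compile(r"[^\W_]+(?:'[^\W_]+)*")


def tokenize(text: str) -> List[str]:
    """Split text into words, then drop a trailing 's from each word."""
    tokens = WORD_PATTERN.findall(text)
    return [t[:-2] if t.lower().endswith("'s") else t for t in tokens]


print(tokenize("Tom's part-time job, d'accord"))
```

In this sketch, part-time is split into part and time, d'accord stays whole, and Tom's is reduced to Tom, matching the behavior described in the list.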
Word Frequency queries do not include 'stop words'—by default, these are less significant words like conjunctions or prepositions, that may not be meaningful to your analysis.
You can view and edit the list of stop words. For details, see Text content language and stop words.
NOTE In server projects, only Project Owners can add words to the stop word list.
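As a rough illustration of how a stop word list affects the counts, consider the following Python sketch. The STOP_WORDS set is a hypothetical example, not NVivo's actual list, and the case-insensitive comparison is an assumption made for the sketch.

```python
from collections import Counter
from typing import Iterable, Set

# Hypothetical stop word list for illustration only.
STOP_WORDS: Set[str] = {"and", "the", "of", "in", "to"}


def count_without_stop_words(
    words: Iterable[str],
    stop_words: Set[str] = STOP_WORDS,
) -> Counter:
    """Count words, skipping any whose lowercase form is a stop word."""
    return Counter(w for w in words if w.lower() not in stop_words)


# Made-up sample words.
sample = ["The", "cost", "of", "water", "and", "the", "cost", "of", "feed"]

print(count_without_stop_words(sample))
```

Adding a word to the stop list removes it from the counts entirely, which is why editing the list changes which words appear in the results.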
Create a node to gather references
- In the query results, select the word you want to use to create a node.
- On the Word Frequency Query tab, in the Words group, click Create As Code.
- Select a location and name the node.
- Click OK.
Run a Text Search query for a word
- On the Word Frequency Query tab, in the Words group, click Run Text Search Query.
- (Optional) Change the Text Search Criteria or Query Options. For details, see Text Search query.
- Click Run.
- You can also double-click a word in the Word Cloud to run a Text Search query.
- If the text content language is Japanese, the Text Search query will find all occurrences of the base form or any alternate forms of the word.