Automatically detect and code themes
Quickly identify broad themes in files using the Autocode Wizard. Autocode themes to answer the question "What are the key themes in my files?"
Text analytics is a complex process—manual coding is always going to be more accurate.
The process
Select multiple files, codes or cases and use the Autocode Wizard to produce results. A coding matrix is created, and content is coded to codes.
NVivo analyzes your material using a language pack. Themes are identified by analyzing the content and the sentence structure within it. NVivo assigns significance to some themes over others based on how frequently each theme occurs in the material being analyzed.
The themes are combined into groups and results are presented as a code for each broad idea, with child codes for each theme within that group.
The relevant content is coded to the codes that are created. The results are summarized in a coding matrix which shows the codes for each broad idea, and the number of coding references from each file.
What makes a theme?
The process detects significant noun phrases (for example, real estate development) to identify the most frequently occurring themes.
The process collects the themes and counts their mentions across all files in the set being processed.
NVivo actively filters the themes—only the most relevant themes are presented in the results. You can choose which themes to create as codes at the end of the process.
The automated insights process may produce different results in different languages. For example, if you have a translated version of the same file, analyzing the French version in French and the English version in English may produce different results.
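To make the counting and filtering described in this section concrete, here is a minimal Python sketch. It is not NVivo's implementation: it assumes the noun phrases have already been extracted from each file, and the function names and the `limit` cut-off are hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_theme_mentions(phrases_by_file):
    """Count how often each candidate phrase occurs across all files in the set."""
    totals = Counter()
    for phrases in phrases_by_file.values():
        totals.update(phrase.lower() for phrase in phrases)
    return totals


def most_relevant(totals, limit):
    """Keep only the `limit` most frequently mentioned phrases (assumed cut-off)."""
    return [phrase for phrase, _ in totals.most_common(limit)]


# Hypothetical, already-extracted noun phrases for two files.
phrases_by_file = {
    "interview_1": ["real estate development", "water quality", "real estate development"],
    "interview_2": ["real estate development", "tourism industry"],
}

totals = count_theme_mentions(phrases_by_file)
top = most_relevant(totals, 1)
print(top)
```

In this sample, "real estate development" is mentioned three times across the two files, so it is the only phrase that survives a cut-off of one.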
Theme grouping
NVivo groups themes by comparing words with the same stem, for example house, houses and housing. It then filters the themes and excludes those groups that represent a much smaller proportion of your content.
For each group, NVivo uses the most frequently shared phrase or word as the name of the code.
Themes can belong to more than one group—for example, storm water runoff may be grouped with the theme water and the theme storm—and have the same coding references.
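The grouping idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NVivo's algorithm: `naive_stem` is a deliberately crude, hypothetical stemmer that only strips a few common endings, and the sketch omits the later step that drops groups covering a small proportion of the content.

```python
from collections import defaultdict


def naive_stem(word):
    """Very rough stemmer for illustration only: strips a few common endings."""
    word = word.lower()
    for suffix in ("ing", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(themes):
    """Place each theme in one group per word stem it contains."""
    groups = defaultdict(set)
    for theme in themes:
        for word in theme.split():
            groups[naive_stem(word)].add(theme)
    return groups


themes = ["storm water runoff", "water quality", "storm damage", "housing", "houses"]
groups = group_by_stem(themes)

print(sorted(groups["water"]))  # themes sharing the stem "water"
print(sorted(groups["storm"]))  # themes sharing the stem "storm"
```

Because "storm water runoff" contains both stems, it lands in the water group and in the storm group, which mirrors the multi-group membership described above.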
Collections of material
You can analyze files, codes or cases individually or as a combination of items.
If you analyze each item individually, you'll get different results compared to analyzing a group of items together.
1 Items A, B and C each mention water quality once. Even though each item mentions it only once, the process identifies water as a concept that appears in the majority of the items being analyzed, so water quality is suggested as a code in the results. If item A, B or C were analyzed individually, water quality might not be identified as a theme. For example, if you are reviewing submissions on the same issue and want to see which overall themes are detected, wait until you have received all the submissions and then process them together as a group.
2 Item D mentions tourism industry several times, but it is not suggested as a code in the results because it appears in only one file in the group being analyzed. If you ran Item D individually, tourism industry might be identified as a theme.
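The difference between group and individual analysis can be illustrated with a small Python sketch. It assumes a simple rule, that a theme is suggested when more than half of the items mention it; the `min_share` threshold and the function name are hypothetical, and NVivo's actual criteria are not documented here.

```python
def themes_for_group(mentions_by_item, min_share=0.5):
    """Return themes mentioned in more than `min_share` of the items (assumed rule)."""
    item_count = len(mentions_by_item)
    items_mentioning = {}
    for mentions in mentions_by_item.values():
        for theme, count in mentions.items():
            if count > 0:
                items_mentioning[theme] = items_mentioning.get(theme, 0) + 1
    return sorted(
        theme for theme, n in items_mentioning.items() if n / item_count > min_share
    )


# Mention counts per item, matching the A-D example above.
mentions_by_item = {
    "A": {"water quality": 1},
    "B": {"water quality": 1},
    "C": {"water quality": 1},
    "D": {"tourism industry": 4},
}

suggested_for_group = themes_for_group(mentions_by_item)
suggested_for_d_alone = themes_for_group({"D": mentions_by_item["D"]})
print(suggested_for_group, suggested_for_d_alone)
```

Under this assumed rule, the group run suggests only water quality (three of four items), while running item D on its own suggests tourism industry.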
- In List View, select the codes, cases or text files you want to autocode. Text files include documents, PDFs, surveys and transcripts. They do not have to be of the same type, but they should be in the same language.
NOTE If you want to select items from different folders, use the dynamic set All Files or All Codes.
- On the Home tab, in the Coding group, click Autocode then follow the steps on the Wizard.
Wizard step | Description |
---|---|
Choose how you would like to autocode | Click Identify themes. You may be prompted to download and install additional files. Make sure your project text content language is set to the language of the sources you are analyzing. |
Identifying themes | NVivo analyzes your files for commonly occurring themes. The identified themes are grouped into proposed codes and child codes, and sorted according to number of mentions in the files. Click the Expand buttons to view the child codes for each theme. Choose the codes you want to create; by default, all codes are selected. Clear the checkboxes for the codes you don't want created. You can merge and delete codes later. See Organize codes and cases. |
Select how your text passages will be coded | Choose how finely NVivo should code text passages. The results are displayed as a coding matrix in Detail View and saved in the Coding Matrices folder. The created codes are stored in a folder called Autocoded Themes under the Coding folder. |
Specify a location for the results | The wizard displays your existing code structure. Select a location in the structure to place the autocreated codes. Click Create folder or Create code to create a new location. |
Work with the results of autocoded themes
When you autocode themes, the results are displayed in Detail View and saved as a coding matrix in the Coding Matrices folder. You can view the saved coding matrix later if you want a record of the coding performed by the Wizard at a particular date and time. This coding matrix is a static record that is not updated if you subsequently uncode some of the content.
Each time you run the autocode themes process, the results are merged with any existing autocoded themes. If you want to keep the themes separate, rename the Autocoded Themes folder before you run the Autocode Wizard.
1 Columns display the names of the parent codes that have been coded by the Wizard.
2 Rows display the files that have been coded by the Wizard.
3 Cells display the number of coding references that were created for a file (row) at a code (column). You can change the display, for example transpose the columns and rows—click Transpose in the View group, on the Matrix tab.
4 Click the Chart tab to see a visual representation of the autocoding results.
NOTE The number of coding references for a file displayed in the Codes column in List View includes the coding matrix.
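The layout of the coding matrix (files as rows, codes as columns, reference counts in the cells) can be modeled with a short Python sketch. This is only a conceptual illustration: the data structure, the function names and the sample references are hypothetical and do not reflect how NVivo stores its results.

```python
def build_matrix(references):
    """Turn (file, code) reference pairs into {file: {code: count}}."""
    files = sorted({file for file, _ in references})
    codes = sorted({code for _, code in references})
    matrix = {file: {code: 0 for code in codes} for file in files}
    for file, code in references:
        matrix[file][code] += 1
    return matrix


def transpose(matrix):
    """Swap rows and columns, as the Transpose option does for the display."""
    transposed = {}
    for row, cells in matrix.items():
        for col, value in cells.items():
            transposed.setdefault(col, {})[row] = value
    return transposed


# Hypothetical coding references created by an autocoding run.
references = [
    ("Interview 1", "water"),
    ("Interview 1", "water"),
    ("Interview 2", "housing"),
]

matrix = build_matrix(references)
by_code = transpose(matrix)
print(matrix["Interview 1"]["water"], by_code["water"]["Interview 1"])
```

Each cell holds the same count before and after transposing; only the orientation of rows and columns changes.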
How can I identify theme coding references?
Autocoded theme references that were created by the Wizard are associated with the user profile 'NVivo' with the initials 'NV'.
If you have performed multiple theme autocoding operations, you will not be able to distinguish which references were created by a particular coding operation. To see the references from a particular operation, you can view the specific coding matrix in the Coding Matrices folder.
You can run a matrix coding query to display the coding references currently associated with the user 'NVivo'.
Review and fine-tune theme coding
Review the results to confirm that you are satisfied with the autocoding before performing other actions in your project—so that you can adjust the coding if you need to.
In the coding matrix, double-click a cell to see the content that was coded to the intersection of the file and code. Is the content relevant to that code? Take a look at other cells in the matrix.
Decide whether you are satisfied with the results—do you want to keep some of the coding or undo the entire autocoding operation?
If you are mostly satisfied with the results, but need to fine-tune some of the autocoding, you may want to uncode some of the references. Note that the coding reference is still displayed in the coding matrix cell, even if you have uncoded it.
If you are not satisfied with the overall results, you may want to 'undo' the autocoding completely.
Review the codes in the Autocoded Themes folder. You may want to refine the codes by merging, moving or deleting some.
If you intend to run the Autocode Wizard again, and want to keep the new themes separate, rename the Autocoded Themes folder.
Why am I getting unexpected results from autocoding themes?
Autocoding themes uses machine-based algorithms to scan your files and identify themes. This is a complex task, and manual coding is always going to be more accurate.
- Choose your context carefully. Autocoding for themes takes your selection of files as a whole into account, so you may get different results from a larger set of data than from a smaller set or from individual files.
- Understand the structure of your files—look for clear sentence and paragraph structure. If you choose to Code sentences, full stops are used to designate the end of sentences. Make sure that sentences in your files end with a full stop, including bullet lists and text in table cells within a document. Any full stops used to designate an abbreviation will be interpreted as the end of a sentence.
- Make sure each file is in one language, and that you process files of the same language together. The autocoding themes process can only detect one language at a time—and this is based on the text content language setting of your project.
- Minimize the presence of advertising or repeated content in your files. If you are working with web pages, capture only the main content on the page before importing into your project. For example, if you are capturing content from a news site, repeated headlines displayed in a sidebar may be detected as frequently occurring themes.
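The effect of full stops on sentence coding, mentioned in the list above, can be seen in a deliberately simplified Python sketch. It assumes sentences are split on every full stop and nothing else; the function name is hypothetical and NVivo's real sentence detection may differ.

```python
def split_on_full_stops(text):
    """Split text into passages at every full stop, dropping empty pieces."""
    parts = [part.strip() for part in text.split(".")]
    return [part for part in parts if part]


# The abbreviation "Dr." is treated as the end of a sentence.
with_abbreviation = split_on_full_stops("Dr. Lee tested the water. Results were poor.")

# Bullet items without full stops run together into a single passage.
without_full_stops = split_on_full_stops("Water quality\nTourism industry")

print(with_abbreviation)
print(len(without_full_stops))
```

The first call yields three passages instead of two because of the abbreviation, and the second yields one passage because neither bullet item ends with a full stop.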