Automatic coding in datasets

Use auto coding to organize dataset content into case nodes or theme nodes for further exploration.

Understand automatic coding in datasets

A dataset contains structured data arranged in records (rows) and fields (columns). Datasets can contain classifying columns and codable columns.

You can use the Auto Code Wizard to code the content of codable columns. Choose an auto coding method to:

  • Use the dataset structure
  • Use existing coding patterns
  • Identify themes or sentiment (NVivo 12 Plus only)

Auto code a dataset

  1. In List View or Detail View, click the dataset you want to auto code.
  2. Do one of the following:
    • List View: On the Home tab, in the Coding group, click Auto Code.
    • Detail View: On the Dataset tab, in the Coding group, click Auto Code.
  3. Follow the steps in the Auto Code Wizard.

Auto code a dataset based on structure

For datasets containing survey data, auto coding may have already taken place during import with the Survey Import Wizard. For each respondent, the wizard creates and classifies a case and codes that respondent's answers to it. For each question, it creates a node and codes the responses to it.

Code at nodes for selected columns

For example, if you have imported survey results from a spreadsheet, you can auto code to gather all the responses for each question.

The image below shows a dataset containing survey responses to two questions. With auto coding, you can create a node (Q1 and Q2) for each question. The responses to each question are automatically coded at the relevant node.

Making nodes from columns in a survey dataset.
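The idea of one node per column can be sketched outside NVivo as a simple grouping. This is a conceptual illustration only, not NVivo's implementation or an NVivo API: the sample rows, the column names and the function `code_columns_to_nodes` are all hypothetical.

```python
# Hypothetical survey dataset: one dict per record (row).
survey_rows = [
    {"Respondent": "DE001", "Q1": "More bike lanes", "Q2": "Buses are late"},
    {"Respondent": "DE002", "Q1": "Cheaper parking", "Q2": "Trains are fine"},
    {"Respondent": "DE003", "Q1": "", "Q2": "More night buses"},
]


def code_columns_to_nodes(rows, codable_columns):
    """Return {node name: [responses]} with one node per selected column."""
    nodes = {column: [] for column in codable_columns}
    for row in rows:
        for column in codable_columns:
            response = row.get(column)
            if response:  # skip empty cells
                nodes[column].append(response)
    return nodes


question_nodes = code_columns_to_nodes(survey_rows, ["Q1", "Q2"])
for node_name, responses in question_nodes.items():
    print(node_name, responses)
```

Each selected column becomes a node name, and every non-empty cell in that column is gathered under it.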

Code at cases for each value in a column

For example, if your dataset contains survey results, you can auto code to gather everything a particular respondent has said.

In the image below, a case is created for each respondent (DE001, DE002, and DE003) and the responses are coded at the relevant case.

Making cases from values in a survey dataset.
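Coding at cases works the other way around: rows are grouped by the value in one column rather than by column name. The sketch below illustrates that grouping under assumed data. The `Respondent` column, the sample rows and the function `code_rows_to_cases` are hypothetical and do not represent NVivo internals.

```python
from collections import defaultdict

# Hypothetical survey dataset: one dict per record (row).
respondent_rows = [
    {"Respondent": "DE001", "Q1": "More bike lanes", "Q2": "Buses are late"},
    {"Respondent": "DE002", "Q1": "Cheaper parking", "Q2": "Trains are fine"},
    {"Respondent": "DE003", "Q1": "Safer crossings", "Q2": "More night buses"},
]


def code_rows_to_cases(rows, case_column, codable_columns):
    """Return {case name: [responses]} with one case per value in case_column."""
    cases = defaultdict(list)
    for row in rows:
        case_name = row[case_column]
        for column in codable_columns:
            response = row.get(column)
            if response:  # skip empty cells
                cases[case_name].append(response)
    return dict(cases)


cases = code_rows_to_cases(respondent_rows, "Respondent", ["Q1", "Q2"])
print(cases["DE001"])
```

Here everything a respondent said across the codable columns ends up under that respondent's case.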

If you have collected demographic information about the survey respondents, you could auto code based on the values in other classifying columns—for example Gender or Age. You can also group the responses together to create and code at a node for each demographic group—for example Age 21-30, Age 31-40 and so on.
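As a rough illustration of grouping by demographic values, the sketch below places responses into ten-year age bands such as Age 21-30. It assumes a numeric `Age` column. The data, the helper `age_band` and the function `code_by_age_band` are hypothetical and only show the grouping idea, not how NVivo performs it.

```python
from collections import defaultdict

# Hypothetical dataset with a classifying column (Age) and a codable column (Q1).
demographic_rows = [
    {"Respondent": "DE001", "Age": 24, "Q1": "More bike lanes"},
    {"Respondent": "DE002", "Age": 30, "Q1": "Cheaper parking"},
    {"Respondent": "DE003", "Age": 31, "Q1": "Safer crossings"},
]


def age_band(age):
    """Map an age to a ten-year band label, e.g. 24 -> 'Age 21-30'."""
    lower = (age - 1) // 10 * 10 + 1
    return f"Age {lower}-{lower + 9}"


def code_by_age_band(rows, codable_column):
    """Return {band label: [responses]} with one node per age band."""
    nodes = defaultdict(list)
    for row in rows:
        response = row.get(codable_column)
        if response:  # skip empty cells
            nodes[age_band(row["Age"])].append(response)
    return dict(nodes)


age_nodes = code_by_age_band(demographic_rows, "Q1")
print(age_nodes)
```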

Code at nodes for each value in predefined columns (social media datasets only)

For social media data collected with NCapture—for example from Facebook or Twitter—you can choose to code at nodes for each value in predefined columns.

For example, you might auto code to theme nodes based on hashtag, or case nodes based on username.

The image below shows an example of a dataset containing Twitter data. You can auto code to gather Tweets from predefined columns—for example hashtag. A theme node is created for each hashtag (hashtag1 and hashtag2) and the relevant content is coded at the node.

Creating a theme node for each hashtag in a social media dataset.
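The same grouping idea applies to social media data, with the difference that one Tweet can carry several hashtags and is therefore coded at several theme nodes. The following is a minimal sketch under assumed data. The column names `Username`, `Tweet` and `Hashtags`, the sample rows and the function `code_by_hashtag` are hypothetical and are not the NCapture or NVivo data model.

```python
from collections import defaultdict

# Hypothetical Twitter dataset: the Hashtags column holds a list per row.
tweet_rows = [
    {"Username": "@ana", "Tweet": "Loving the rain #hashtag1",
     "Hashtags": ["hashtag1"]},
    {"Username": "@ben", "Tweet": "Traffic again #hashtag1 #hashtag2",
     "Hashtags": ["hashtag1", "hashtag2"]},
]


def code_by_hashtag(rows):
    """Return {hashtag: [tweets]}; a tweet is coded at every hashtag it uses."""
    theme_nodes = defaultdict(list)
    for row in rows:
        for hashtag in row["Hashtags"]:
            theme_nodes[hashtag].append(row["Tweet"])
    return dict(theme_nodes)


theme_nodes = code_by_hashtag(tweet_rows)
for hashtag, tweets in theme_nodes.items():
    print(hashtag, len(tweets))
```

Grouping on `Username` instead of `Hashtags` would give the case-node variant described above.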

Auto code a dataset based on existing coding patterns

Available in NVivo 12 Plus only

Pattern-based auto coding is an experimental feature that you can test and try out. This feature is designed to speed up the coding process for large volumes of textual content.

When you auto code using existing patterns, NVivo compares each text passage—for example, sentence or paragraph—to the content already coded to existing nodes. If the content of the text passage is similar in wording to content already coded to a node, then the text passage will be coded to that node.
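To make the idea of "similar in wording" concrete, the sketch below compares each passage with the content already coded to a node and suggests the node when the word overlap is high enough. NVivo's actual comparison method is not described here, so this uses Jaccard word overlap as a stand-in. The threshold, the sample data and the function names are all assumptions for illustration.

```python
import re


def words(text):
    """Lowercase a passage and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_coding(passages, coded_examples, threshold=0.3):
    """Return {passage: [nodes whose coded content is similar enough]}."""
    suggestions = {}
    for passage in passages:
        passage_words = words(passage)
        matches = []
        for node, examples in coded_examples.items():
            best = max(
                (jaccard(passage_words, words(example)) for example in examples),
                default=0.0,
            )
            if best >= threshold:  # assumed cut-off, not an NVivo setting
                matches.append(node)
        suggestions[passage] = matches
    return suggestions


# Hypothetical content already coded to existing nodes.
coded_examples = {
    "Transport": ["The buses are always late", "Trains are too crowded"],
    "Cost": ["Parking is too expensive"],
}
new_passages = ["The buses are late again", "I like the new park"]

suggestions = suggest_coding(new_passages, coded_examples)
print(suggestions)
```

A passage that shares most of its words with previously coded content is suggested for that node, while a passage with little overlap is left uncoded.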

You can also use pattern-based auto coding in conjunction with the other automatic coding techniques. For example, you could auto code your dataset containing survey responses to create nodes for each question. Then, you could use pattern-based coding to 'code on' from a question node.

Because the feature is experimental, it may work better for some projects than others. For more information, see Automatic coding using existing coding patterns.

Auto code a dataset based on themes or sentiment

Available in NVivo 12 Plus only

Identify themes or sentiment in a dataset, and code sentences or entire dataset cells to theme or sentiment nodes. For more information, see Automated insights.