Autocode datasets

Datasets contain structured data arranged in records (rows) and fields (columns). Some fields contain classifying information, e.g. respondents' age, and some contain codable information, e.g. respondents' responses to a question. Autocoding allows you to automatically code the codable elements into cases and/or codes.

You can use the Autocode Wizard to code the content of codable columns. You can choose an autocoding method to:

  • Use the dataset structure
  • Use existing coding patterns
  • Identify themes or sentiment (This feature is only available in NVivo installations with coding enhancements enabled.)

Autocode a dataset

  1. In List View or Detail View, click the dataset you want to autocode.
  2. Do one of the following:
    • List View: On the Home tab, click Autocode.
    • Detail View: On the Dataset tab, click Autocode.
  3. Follow the steps in the Autocode Wizard.

Autocode a dataset based on structure

For datasets containing survey data, autocoding may have already taken place during import with the Survey Import Wizard. For each respondent, the wizard creates and classifies a case and codes that respondent's answers to it. For each question, it creates a code and codes the responses to it.

Code to codes for selected columns

For example, if you have imported survey results from a spreadsheet, you can autocode to gather all the responses for each question.

The image below shows a dataset containing survey responses to two questions. With autocoding, you can create a code (Q1 and Q2) for each question. The responses to each question are automatically coded to the relevant code.

Making codes from columns in a survey dataset.
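To make the result of this kind of coding concrete, here is a minimal Python sketch. It is not NVivo's implementation or API: the function name code_columns, the row layout and the response texts are all assumptions made for illustration. It models a dataset as a list of records and gathers the responses in each selected column under a code named after that column.

```python
from collections import defaultdict

# Hypothetical survey dataset: one dict per record (row).
# The response texts are made up for illustration.
rows = [
    {"ID": "DE001", "Q1": "Too expensive", "Q2": "More buses"},
    {"ID": "DE002", "Q1": "Good value", "Q2": "Longer hours"},
    {"ID": "DE003", "Q1": "No opinion", "Q2": ""},
]

def code_columns(rows, columns):
    """Return {code name: [(record ID, response), ...]} for the chosen columns."""
    codes = defaultdict(list)
    for row in rows:
        for column in columns:
            response = row.get(column, "")
            if response:  # skip empty cells
                codes[column].append((row["ID"], response))
    return dict(codes)

# One code per question column, each holding that question's responses.
codes = code_columns(rows, ["Q1", "Q2"])
```

In this sketch, codes["Q1"] holds all three responses to the first question, while codes["Q2"] holds only two because the empty cell is skipped.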

Code to cases for each value in a column

For example, if your dataset contains survey results, you can autocode to gather everything a particular respondent has said.

In the image below, a case is created for each respondent (DE001, DE002, and DE003) and the responses are coded to the relevant case.

Making cases from values in a survey dataset.

If you have collected demographic information about the survey respondents, you can also autocode based on the values in other classifying columns, for example Gender or Age. You can group the responses by demographic group (for example, Age 21-30, Age 31-40 and so on), creating a code for each group and coding the matching responses to it.
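The two kinds of value-based coding described in this section (a case per respondent, and a code per demographic group) follow the same grouping idea. The Python sketch below illustrates it under stated assumptions: code_by_value, the AgeGroup column holding pre-banded ages, and the sample responses are hypothetical, and none of this reflects how NVivo works internally.

```python
from collections import defaultdict

# Hypothetical dataset: ID and AgeGroup are classifying columns, Q1 is codable.
rows = [
    {"ID": "DE001", "AgeGroup": "21-30", "Q1": "Too expensive"},
    {"ID": "DE002", "AgeGroup": "31-40", "Q1": "Good value"},
    {"ID": "DE003", "AgeGroup": "21-30", "Q1": "No opinion"},
]

def code_by_value(rows, value_column, codable_columns, prefix=""):
    """Group codable cell text under one name per distinct value in value_column."""
    groups = defaultdict(list)
    for row in rows:
        name = prefix + row[value_column]
        for column in codable_columns:
            if row.get(column):  # skip empty or missing cells
                groups[name].append(row[column])
    return dict(groups)

# A case for each respondent, holding everything that respondent said.
cases = code_by_value(rows, "ID", ["Q1"])

# A code for each age band, holding the responses from that group.
age_codes = code_by_value(rows, "AgeGroup", ["Q1"], prefix="Age ")
```

Only the grouping column changes between the two calls: grouping on ID gives one entry per respondent, while grouping on AgeGroup collects responses from several respondents under "Age 21-30" and "Age 31-40".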

Code to codes for each value in predefined columns (social media datasets only)

For social media data collected with NCapture—for example from Facebook or Twitter—you can choose to code to codes for each value in predefined columns.

For example, you might autocode to codes based on hashtag, or to cases based on username.

The image below shows an example of a dataset containing Twitter data. You can autocode to gather Tweets from predefined columns—for example hashtag. A code is created for each hashtag (hashtag1 and hashtag2) and the relevant content is coded to the code.

Creating a code for each hashtag in a social media dataset.
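Hashtag coding differs from the survey examples in that a single Tweet can carry several hashtags and so be coded to several codes. The Python sketch below illustrates that idea only. The column names (Username, Tweet, Hashtags), the space-separated hashtag cell and the function code_by_hashtag are assumptions for this example, not the actual layout of an NCapture dataset.

```python
from collections import defaultdict

# Hypothetical social media rows; the "Hashtags" cell format is an assumption.
tweets = [
    {"Username": "user_a", "Tweet": "Loving the new park #hashtag1",
     "Hashtags": "hashtag1"},
    {"Username": "user_b", "Tweet": "Traffic again #hashtag1 #hashtag2",
     "Hashtags": "hashtag1 hashtag2"},
]

def code_by_hashtag(tweets):
    """Return {hashtag: [tweet text, ...]}; a tweet is listed under each of its hashtags."""
    codes = defaultdict(list)
    for tweet in tweets:
        for tag in tweet["Hashtags"].split():
            codes[tag].append(tweet["Tweet"])
    return dict(codes)

hashtag_codes = code_by_hashtag(tweets)
```

Here the second Tweet appears under both hashtag1 and hashtag2, because it is coded once for each hashtag it contains.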

Autocode a dataset based on existing coding patterns

(This feature is only available in NVivo installations with coding enhancements enabled.)

Pattern-based autocoding is an experimental feature that you can try out. It is designed to speed up coding for large volumes of text.

When you autocode using existing patterns, NVivo compares each text passage—for example, sentence or paragraph—to the content already coded to existing codes. If the content of the text passage is similar in wording to content already coded to a code, then the text passage will be coded to that code.
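The general idea of matching a passage against already-coded content can be sketched in Python as follows. This documentation does not describe the algorithm NVivo uses, so the sketch substitutes a simple stand-in: Jaccard word overlap with an arbitrary threshold. The function names, the threshold value of 0.3 and the sample text are all assumptions for illustration.

```python
import re

def words(text):
    """Return the set of lowercase words in a passage."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a, b):
    """Share of words two word sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def suggest_codes(passage, coded_examples, threshold=0.3):
    """Return the codes whose already-coded content resembles the passage."""
    passage_words = words(passage)
    matches = []
    for code, examples in coded_examples.items():
        # Compare against every passage already coded to this code; keep the best score.
        best = max((jaccard(passage_words, words(e)) for e in examples), default=0.0)
        if best >= threshold:
            matches.append(code)
    return matches

# Hypothetical existing coding: code name -> passages already coded to it.
coded_examples = {
    "Cost": ["The tickets are too expensive", "Prices keep going up"],
    "Service": ["The buses are always late"],
}

result = suggest_codes("Tickets are far too expensive for students", coded_examples)
```

In this sketch, the new passage shares four of eight distinct words with an example already coded to Cost, so it is suggested for Cost but not for Service. A passage that resembles nothing already coded is left uncoded, which is one reason results depend on how much content has been coded beforehand.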

You can also use pattern-based autocoding in conjunction with the other automatic coding techniques. For example, you could autocode your dataset containing survey responses to create codes for each question. Then, you could use pattern-based coding to 'code on' from a question code.

Because pattern-based autocoding is experimental, it may work better for some projects than others. For more information, see Automatic coding using existing coding patterns.

Autocode a dataset based on themes or sentiment

(This feature is only available in NVivo installations with coding enhancements enabled.)

You can identify themes or sentiment in a dataset and code sentences or entire dataset cells to theme or sentiment codes. For more information, see Automated insights.