Code files and manage codes

Coding, or gathering material by topic, is a fundamental task in most qualitative projects. It tends to be a cumulative rather than a one-stage process, with the meaning and structure of your nodes changing over time.

Where to start?

Coding is one of several techniques for making sense of your data—you can use it in conjunction with annotating, writing memos, linking and creating maps. The way you handle coding depends on your methodology and research design.

You could start with ‘broad-brush’ coding to organize your material into broad topic areas (you can use Text Search queries to help with this)—then explore the node for each topic and do more detailed coding:

[Figure: Process of broad-brush coding.]

Researchers working in methodologies such as phenomenology or discourse analysis may get straight into detailed coding (making nodes as required) and then, later on, combine and group the nodes into related categories.

You could combine these approaches or experiment to see what works best for you.

What is the purpose of coding?

Coding the content of your files can contribute to your analysis in the following important ways:

  • The process of coding brings you closer to your data and 'forces' you to focus on your material, asking questions like: What is this about? Is it about more than one thing? How does it help me answer my research question?
  • Coding lets you gather all the material about a topic in one place (for example, what did everyone say about water quality?). This makes it easier to see patterns and contradictions and to develop theories.
  • Coding facilitates the use of queries and visualizations, allowing you to look for connections between themes and test your understanding. For example, if you have a hunch that tourism impacts water quality, use a coding query to gather content coded at water quality where it is NEAR content coded at tourism, and then explore the connections.
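
To make the idea of a NEAR query concrete, here is a minimal Python sketch of a proximity search over coded references. It is not how the software implements coding queries. The `Reference` structure, the character offsets, the file names and the 200-character window are all assumptions made for illustration.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    """A coded passage: hypothetical file name, character span, and node."""
    file: str
    start: int
    end: int
    node: str


def gap(a: Reference, b: Reference) -> int:
    """Number of characters between two spans (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def near(references, node_a, node_b, window=200):
    """Return (a, b) pairs in the same file whose spans lie within `window` characters."""
    hits = []
    for a in references:
        if a.node != node_a:
            continue
        for b in references:
            if b.node == node_b and b.file == a.file and gap(a, b) <= window:
                hits.append((a, b))
    return hits


# Made-up references, for illustration only.
REFERENCES = [
    Reference("interview_1", 0, 120, "water quality"),
    Reference("interview_1", 180, 260, "tourism"),
    Reference("interview_1", 900, 1000, "water quality"),
    Reference("interview_2", 50, 150, "tourism"),
    Reference("interview_2", 100, 200, "water quality"),
]

hits = near(REFERENCES, "water quality", "tourism")
for a, b in hits:
    print(a.file, (a.start, a.end), "is near", (b.start, b.end))
```

In this sketch, the third reference is excluded because it sits 640 characters away from the nearest tourism passage. Changing `window` changes how strictly "near" is interpreted.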

How much coding should I do?

The amount of coding you do depends on your research design, the project time frame and the volume of data you are working with. A social scientist on a long-term project may carefully code the content, looking for subtle themes and the connections between them. A brand manager may be more interested in analyzing broad topic areas in order to make rapid strategic decisions.

A researcher working with a large dataset (like the results of a survey) may use a combination of auto coding and Text Search queries to speed up the coding process.

You may not need to exhaustively code all your material. For example, if after working through twelve interviews you are not finding any new themes or ideas, you may have reached ‘saturation’. You could then use Text Search queries to do some broad-brush coding in subsequent interviews.

Different types of coding

As you reflect on a piece of content, think about these different types of coding:

  • Topic coding: What is the topic being discussed? For example, water quality, real estate development, tourism and so on. Text Search queries can help you to search for topics in your material.
  • Analytical coding: Once you have organized your material by topic, you can review the content of your nodes and ask: What is this content really about? Why is it interesting? How does it relate to my research question? Consider the meaning in context and express new ideas about the data, for example, ideals vs reality or tension between developers and residents.
  • Case coding: Who is speaking? What place, organization or other entity is being observed? You can code the material at a case node and assign demographic attributes. For details, see Classify cases (set attribute values to record information).

Coding at multiple nodes (co-occurring nodes)

As you work with your files, consider coding content at multiple nodes. For example, you could code Barbara’s comment, quoted after the list, at all of these nodes:

  • water quality
  • development
  • sea level
  • negative attitude
  • Barbara

Well it’s a major one. Water quality in general and– I don’t know all of the issues related to larger scale development, but yeah I think that a lot of the easy land that can be approved easily has already been developed. It’s very low. And sea level rises. And so I think that problem is only gonna get worse.

If you code all your interviews like this, then you can use queries to gather your material in different combinations. For example, show me:

  • All the content coded at water quality and development
  • Negative attitudes about water quality
  • What women said about water quality
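
The following Python sketch models why coding at multiple nodes makes these combinations possible. It is a conceptual illustration, not the software's query engine. The `Segment` structure, the two unnamed participants, their text and all gender attribute values (including Barbara's) are assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A coded passage with the case (speaker) and the thematic nodes it is coded at."""
    speaker: str
    text: str
    nodes: frozenset


# Hypothetical case attributes; the values are assumed for illustration.
CASE_ATTRIBUTES = {
    "Barbara": {"gender": "female"},
    "Participant 2": {"gender": "male"},
    "Participant 3": {"gender": "female"},
}

SEGMENTS = [
    Segment("Barbara", "Well it's a major one. Water quality in general...",
            frozenset({"water quality", "development", "sea level", "negative attitude"})),
    Segment("Participant 2", "(made-up passage)",
            frozenset({"water quality", "positive attitude"})),
    Segment("Participant 3", "(made-up passage)",
            frozenset({"water quality", "tourism"})),
]


def coded_at_all(segments, *nodes):
    """Segments coded at every one of the given nodes (an AND query)."""
    wanted = set(nodes)
    return [s for s in segments if wanted <= s.nodes]


def coded_at_by_attribute(segments, node, attribute, value):
    """Segments coded at `node` whose speaker has the given attribute value."""
    return [
        s for s in segments
        if node in s.nodes
        and CASE_ATTRIBUTES.get(s.speaker, {}).get(attribute) == value
    ]


both = coded_at_all(SEGMENTS, "water quality", "development")
negative = coded_at_all(SEGMENTS, "negative attitude", "water quality")
women = coded_at_by_attribute(SEGMENTS, "water quality", "gender", "female")

print([s.speaker for s in both])
print([s.speaker for s in negative])
print([s.speaker for s in women])
```

Because each passage carries a set of nodes rather than a single label, every new question becomes a different filter over the same coded material.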

Take time to reflect and 'code on'

After exploring and coding a file, take some time to reflect on what you have discovered. Display coding stripes to see your coding or run a report to see which nodes have been used most often—how do these nodes relate to each other? Make a map to explore the relationships and record your thoughts in a memo.

Coding is rarely a one-stage process. As you review the coded data at a node, you will often see ways to improve your coding—for example, you may want to:

  • Include more of the context around coded content—for example, expand coding to include the whole sentence or paragraph.
  • Remove some of the coded content by 'uncoding' it (especially useful if you used a Text Search query to do some broad-brush coding).
  • Develop ideas by coding content to other nodes; this process is termed 'coding-on'.
  • Create or reorganize nodes as you respond to what you are seeing in the coded data.

For more detail, see Review the references in a node.

Build an efficient node hierarchy

Take time out from coding to reflect on your nodes and organize the themes that are emerging. Reasons for keeping your nodes organized include:

  • Being able to find your nodes easily saves time during the coding process.
  • If you have trouble finding a node, you are less likely to use it consistently.
  • You are more likely to lose or confuse ideas in an unwieldy node structure.

Here are some strategies for building an efficient node structure:

  • Keep node names short and pertinent.
  • Make sure a node only appears once in the whole hierarchy.
  • Try not to combine concepts in a node. For example, instead of coding some text at skeptical attitudes about government policy, code it at both of these nodes:
    • skeptical attitudes
    • government policy

      Remember that you can use queries to gather your coded content in all sorts of combinations—for example, find all content coded at the node skeptical attitudes AND at the node government policy.

  • Try not to force nodes into a hierarchy—if a node is not related to any other concept then leave it at the top level.
  • Try not to nest more than 3 levels deep if you can help it.
  • Make a node to gather 'great quotes'.
  • Prune your nodes regularly. Merge, reorganize, rename.
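
Two of these strategies (a node appears only once, and nesting stays within 3 levels) can be checked mechanically. The sketch below assumes the hierarchy has been written out by hand as a nested Python dictionary. The node names and the tree itself are hypothetical, and nothing here reads from a real project file.

```python
from collections import Counter


def walk(tree, depth=1, path=()):
    """Yield (path, depth) for every node in a nested {name: children} dict."""
    for name, children in tree.items():
        here = path + (name,)
        yield here, depth
        yield from walk(children, depth + 1, here)


def check_hierarchy(tree, max_depth=3):
    """Report node names that appear more than once and paths nested too deeply."""
    entries = list(walk(tree))
    counts = Counter(path[-1] for path, _ in entries)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    too_deep = [" > ".join(path) for path, depth in entries if depth > max_depth]
    return {"duplicates": duplicates, "too_deep": too_deep}


# Hypothetical node structure, for illustration only.
NODES = {
    "attitudes": {"skeptical attitudes": {}, "negative attitude": {}},
    "government policy": {},
    "environment": {"water quality": {"causes": {"development": {}}}},
    "economy": {"development": {}},
    "great quotes": {},
}

report = check_hierarchy(NODES)
print(report["duplicates"])
print(report["too_deep"])
```

Here `development` is flagged twice: it appears in two branches, and one of those copies sits four levels deep. Merging the two copies into a single top-level node would resolve both warnings.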

Use a codebook to manage nodes

As you develop your thematic node structure, you can export or report on the nodes and their descriptions to make a codebook. A codebook can clarify the meaning of the thematic nodes so that you (or members of your team) can apply them to your data in a consistent manner.
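
As a rough illustration of what a codebook contains, the Python sketch below turns a list of node names and descriptions into a CSV table and flags nodes that still lack a description. It does not reproduce the software's export or report features. The node list and the description wording are invented for the example.

```python
import csv
import io

# Hypothetical nodes and descriptions, for illustration only.
NODES = [
    {"name": "water quality", "description": "Comments on the condition of local water."},
    {"name": "tourism", "description": ""},
    {"name": "development", "description": "Residential or commercial building activity."},
]


def build_codebook(nodes):
    """Return the nodes as CSV text, sorted by name, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "description"],
                            lineterminator="\n")
    writer.writeheader()
    for node in sorted(nodes, key=lambda n: n["name"]):
        writer.writerow(node)
    return buffer.getvalue()


def missing_descriptions(nodes):
    """Names of nodes whose description is empty, so they can be filled in."""
    return [n["name"] for n in nodes if not n["description"].strip()]


codebook = build_codebook(NODES)
print(codebook)
print("Needs a description:", missing_descriptions(NODES))
```

A check like `missing_descriptions` is worth running before sharing a codebook with a team, since a node without a description is the one most likely to be applied inconsistently.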

If you have added information about the meaning of the node in the Description field in node properties, then you can: