Code-Document Table: Relative Frequencies

Relative frequencies are useful for comparing code distributions across or within documents or document groups, because percentages are easier to interpret than raw counts.

If documents are of unequal length, or if document groups have an unequal number of members, it is recommended to normalize the counts, because absolute counts may distort the results. See Data Normalization.

Within Group Comparison: Column Relative Frequencies

[Figure: Code-Document Table, within-group comparison]

The table shows data from a survey with open-ended questions in which people evaluate the computer game Minecraft. You can download the project here. Three groups are compared: parents who do not play the game themselves, parents who play, and others who play but are not parents.

You must read this table from top to bottom along the columns. It shows the distribution of the selected codes within each of the three document groups. A quick glance at the heat map indicates that people who play (parents or others) mention more benefits than parents who do not play. The latter group appears more concerned with the downsides of playing a computer game, especially with the social and emotional dangers. Players mention very few downsides.

Whether you need to select row or column relative frequencies depends on how the table is oriented. If documents or document groups are listed as rows and the codes as columns, you use row relative frequencies for a within-group comparison.
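As a rough illustration of the calculation, the following Python sketch computes column relative frequencies for a table with codes as rows and document groups as columns. The function name `column_relative` and the counts are hypothetical. They are not taken from the Minecraft project or from the application itself.

```python
def column_relative(table):
    """Return each cell as a share of its column total.

    `table` maps a code name to {group name: absolute count}.
    """
    groups = sorted({group for row in table.values() for group in row})
    # Column total = all codings of the selected codes within one group.
    totals = {g: sum(row.get(g, 0) for row in table.values()) for g in groups}
    return {
        code: {g: (row.get(g, 0) / totals[g] if totals[g] else 0.0) for g in groups}
        for code, row in table.items()
    }


# Hypothetical counts for illustration only.
counts = {
    "benefits": {"players": 30, "non-playing parents": 10},
    "downsides": {"players": 10, "non-playing parents": 30},
}

shares = column_relative(counts)
for code, row in shares.items():
    print(code, {group: f"{value:.0%}" for group, value in row.items()})
```

Each column of the result sums to 100%, which is why the values are read from top to bottom. If the table is transposed, the same calculation applies to the rows.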

Across Group Comparisons: Row Relative Frequencies


In the following example from the Children & Happiness sample project, document 3 is about twice as long as document 5. Thus, the number of codings for a specific topic can be expected to be higher in document 3. If we compare how often positive and negative effects of parenting were mentioned in both documents, normalizing the data will give a more accurate picture.

[Figure: Code-Document Table, across-group comparison]

Looking at the data without normalization, it appears that the readers of the NYTM article mention far fewer positive and negative effects of parenting. After normalization, this interpretation needs to be adjusted. The readers of the NYTM article mentioned more negative effects of parenting than the readers of the parenting blog (55% as compared to 45%) and only slightly fewer positive effects (47% as compared to 53%).
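The effect of normalization on row relative frequencies can be sketched in Python. This is a minimal sketch under one stated assumption: normalization scales each column so that all column totals equal the largest column total. The function names and the counts are hypothetical and do not reproduce the sample project data.

```python
def normalize_columns(table):
    """Scale every column so its total matches the largest column total.

    Assumption: this mirrors the idea of normalization, not necessarily
    the exact procedure used by the application.
    """
    columns = sorted({col for row in table.values() for col in row})
    totals = {c: sum(row.get(c, 0) for row in table.values()) for c in columns}
    target = max(totals.values())
    return {
        code: {
            c: (row.get(c, 0) * target / totals[c] if totals[c] else 0.0)
            for c in columns
        }
        for code, row in table.items()
    }


def row_relative(table):
    """Return each cell as a share of its row total."""
    result = {}
    for code, row in table.items():
        total = sum(row.values())
        result[code] = {c: (v / total if total else 0.0) for c, v in row.items()}
    return result


# Hypothetical counts: the long document has twice as many codings overall.
counts = {
    "positive effects": {"long document": 30, "short document": 10},
    "negative effects": {"long document": 20, "short document": 15},
    "other": {"long document": 30, "short document": 15},
}

raw = row_relative(counts)
normalized = row_relative(normalize_columns(counts))
print("raw:", raw["negative effects"])
print("normalized:", normalized["negative effects"])
```

With these made-up numbers, the long document appears to contribute most of the negative effects before normalization, while the short document contributes the larger share afterwards.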

Total Relative Frequencies

If you select total relative frequencies, the calculation is based on the total number of codings of all selected codes in the table.

In the example below, this total is 178. When comparing how much readers of the parenting blog and of the NYTM article have written about reasons for having and not having children, the distribution is as follows:

  • readers of the parenting blog have given 68 / 178 = 60% of the reasons for having children and 32% of the reasons for not having children
  • readers of the NYTM article have contributed 44% of the reasons for having children and 44% of the reasons for not having children (the data have been normalized).

Thus, comparatively, the readers of the NYTM article have given considerably more reasons for not having children, which fits the results reported above that they also wrote more about negative effects of parenting.
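For comparison with the row and column variants, here is a minimal Python sketch of total relative frequencies, where every cell is divided by the grand total of all codings in the table. The function name `total_relative` and the counts are hypothetical and are not the figures from the example above.

```python
def total_relative(table):
    """Return each cell as a share of the grand total of all cells."""
    grand_total = sum(v for row in table.values() for v in row.values())
    return {
        code: {c: (v / grand_total if grand_total else 0.0) for c, v in row.items()}
        for code, row in table.items()
    }


# Hypothetical (already normalized) counts for illustration only.
counts = {
    "reasons for having children": {"blog readers": 40, "article readers": 10},
    "reasons for not having children": {"blog readers": 60, "article readers": 90},
}

shares = total_relative(counts)
for code, row in shares.items():
    print(code, {group: f"{value:.0%}" for group, value in row.items()})
```

Because every cell shares the same denominator, all cells of the table together add up to 100%.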

You can display all values (absolute frequencies and all relative frequencies) in one table by selecting all options. Which options to choose depends on the purpose of the table. For interpreting the data, it is probably easier to look at each type of relative frequency separately. For a comprehensive report in an appendix, you may want to export the table with all options included.