Type-Token Ratio

The type-token ratio (TTR) is the relationship between the number of unique words that occur in a text (types) and the total number of words in that text (tokens). Because many words are repeated, the number of tokens is usually larger than the number of types.

  • The type-token ratio can vary between 0 and 1, or between 0% and 100% when it is expressed as a percentage.
  • The more types there are in comparison to the number of tokens (that is, the higher the value), the more varied the vocabulary and the greater the lexical variety of the text.

The type-token ratio is calculated as follows:

Type-Token Ratio (in %) = (number of types / number of tokens) * 100
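
The calculation can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular tool: the `tokenize` helper and its rule (lowercase everything, keep runs of letters, digits, and apostrophes) are assumptions made for this example, and real software may count words differently, for instance by applying stop lists or lemmatization.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(text):
    """Return (number of types, number of tokens, TTR in percent)."""
    tokens = tokenize(text)
    if not tokens:
        # An empty text has no tokens, so the ratio is reported as 0.
        return 0, 0, 0.0
    types = set(tokens)  # each distinct word counts once
    return len(types), len(tokens), len(types) / len(tokens) * 100


sample = "the cat sat on the mat and the dog sat on the rug"
n_types, n_tokens, ttr = type_token_ratio(sample)
print(n_types, n_tokens, round(ttr, 1))  # prints: 8 13 61.5
```

The sample sentence has 13 tokens but only 8 types, because "the", "sat", and "on" are repeated, which gives a ratio of 8 / 13, or about 61.5%.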

In word lists and word clouds, you find the type-token ratio in the status bar at the bottom.

The type-token ratio varies widely with the length of the text, or corpus of texts, being studied. A 1,000-word article might have a TTR of 40%; a shorter one might reach 70%; 4 million words will probably give a TTR of about 2%. The ratio falls as a text grows because common words keep recurring while new types appear less and less often, so TTR values are best compared between texts of similar length.
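
The effect of text length can be illustrated with a small Python sketch that computes the ratio over growing prefixes of the same text. The toy text and the helper names `ttr_percent` and `ttr_by_length` are hypothetical, chosen only for this example; the figures it produces are not comparable to the percentages quoted above.

```python
def ttr_percent(tokens):
    """Type-token ratio in percent for a list of tokens."""
    return len(set(tokens)) / len(tokens) * 100 if tokens else 0.0


def ttr_by_length(tokens, lengths):
    """Return (n, TTR of the first n tokens) for each n that fits the text."""
    return [(n, ttr_percent(tokens[:n])) for n in lengths if n <= len(tokens)]


# Hypothetical toy text: a 5-word vocabulary reused over 280 tokens.
words = "a rose is a rose is a rose and a daisy is a daisy".split()
tokens = words * 20

for n, ttr in ttr_by_length(tokens, [5, 20, 100, 280]):
    print(n, round(ttr, 1))
```

Once all five types have appeared, every additional token lowers the ratio, so the values fall steadily as the prefix grows.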