Word Frequencies: Analyze Word Frequencies

The simplest MAXDictio function determines the vocabulary used across all texts in the current project.

This function can be accessed by either:

  • selecting the option MAXDictio > Word Frequencies, or
  • clicking the corresponding quick button in the “MAXDictio” toolbar.
Start “Word frequencies” in MAXQDA

After starting the function, the following dialogue window appears. Here you may select all the options you need.

Word Frequency options in MAXQDA 2018

Selection of texts to be analyzed

Only for activated documents – the frequencies procedure will be restricted to the activated text files

Only in retrieved segments – the frequencies procedure will be restricted to the coded segments currently displayed in the “Retrieved Segments” window

If neither option is selected, all text and table documents in the MAXQDA project will be analyzed.

Please note: Hyphenation is not recognized in PDF documents.

Differentiation of results

None: The results table does not differentiate the results, providing only the totals over all analyzed texts.

By documents, document groups, document sets, focus group speakers: The results table contains additional columns that can be used to compare word frequency within individual documents, document groups, document sets, or focus group speakers (see Differentiation by Documents, Document Groups, Document Sets, and Focus Group Speakers). When the Only for activated documents option is selected, only activated documents within the document groups or document sets are taken into account, and only document groups or document sets containing activated documents will be analyzed.
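To make the idea of a differentiated results table concrete, here is a minimal Python sketch. It is not MAXDictio's implementation: the function names, the naive whitespace tokenizer, and the sample documents are all assumptions made for illustration. It produces one count per word over all texts plus one column of counts per document.

```python
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer; an assumption for illustration only.
    return text.lower().split()


def frequencies_by_document(documents):
    """Return (totals, per_document) word counts for a dict of name -> text."""
    per_document = {name: Counter(tokenize(text)) for name, text in documents.items()}
    totals = Counter()
    for counts in per_document.values():
        totals.update(counts)
    return totals, per_document


# Hypothetical sample documents.
documents = {
    "Interview 1": "work and family and work",
    "Interview 2": "family first",
}
totals, per_document = frequencies_by_document(documents)
```

Here `totals` corresponds to the undifferentiated result (the None option), while `per_document` holds the additional columns that allow a comparison between documents.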

By Codes: This option is available only if the analysis is restricted to the segments in the "Retrieved Segments" window and a "Simple Coding Query" has been performed. The results table contains additional columns with the word frequencies for each code that appears in the "Code System". This option is particularly helpful when texts have been divided into text units using codes for MAXDictio analysis, as it allows you to compare the word frequencies within different codes.

Ignore

Hyperlinks – words within a web link will be ignored.

Email addresses – if checked, words that are part of an email address will be ignored.

Hashtags – if checked, words from hashtags will be ignored.

Numbers – if checked, numbers won’t be treated as words.

Text within square brackets – if checked, words within square brackets will be ignored.

Text within curly brackets – if checked, words within curly brackets will be ignored.

Minimal number of characters – words with fewer characters than the number entered here will be skipped.
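The effect of the Ignore options and the minimal word length can be pictured with a short Python sketch. The regular expressions below are simplified assumptions chosen for readability; they do not reproduce the rules MAXDictio actually applies, and the function name and parameters are hypothetical.

```python
import re

# Simplified patterns; real-world links and addresses are more varied.
URL = re.compile(r"https?://\S+|www\.\S+")
EMAIL = re.compile(r"\S+@\S+\.\S+")
HASHTAG = re.compile(r"#\w+")
SQUARE = re.compile(r"\[[^\]]*\]")
CURLY = re.compile(r"\{[^}]*\}")
WORD = re.compile(r"\w+")


def extract_words(text, ignore_numbers=True, min_chars=1):
    """Return the words of text that remain after all ignore rules are applied."""
    # Blank out every span that should not contribute words.
    for pattern in (URL, EMAIL, HASHTAG, SQUARE, CURLY):
        text = pattern.sub(" ", text)
    words = WORD.findall(text)
    if ignore_numbers:
        # Drop tokens that consist of digits only.
        words = [w for w in words if not w.isdigit()]
    # Skip words that are shorter than the minimal length.
    return [w for w in words if len(w) >= min_chars]
```

In this sketch every ignore rule is always active; turning the individual checkboxes on and off would correspond to applying only a subset of the patterns.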

Apply stop word list – If a stop word list is to be used, the corresponding box must be checked. Click on the button with the three dots to open and edit the stop lists.
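Conceptually, a stop word list is a set of words that are excluded from the count. The following Python sketch illustrates this under stated assumptions: the three-word list and the function name are hypothetical and unrelated to the stop lists shipped with or created in MAXDictio.

```python
from collections import Counter

# Hypothetical stop word list for illustration.
STOP_WORDS = {"the", "and", "of"}


def count_without_stop_words(words, stop_words=STOP_WORDS):
    """Count words, leaving out those on the stop word list (compared in lowercase)."""
    return Counter(w for w in words if w.lower() not in stop_words)
```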

Case sensitivity: If this setting is activated, "Give" and "give", for example, will be counted as different words. If the setting is inactive, all words will be displayed in lowercase in the results list.
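The difference between the two settings can be sketched in a few lines of Python. This is an illustration only, with a hypothetical function name, and not the way MAXDictio counts internally.

```python
from collections import Counter


def count_words(words, case_sensitive=False):
    """Count words; without case sensitivity all words are lowercased first."""
    if not case_sensitive:
        words = [w.lower() for w in words]
    return Counter(words)
```

With `case_sensitive=True`, "Give" and "give" receive separate entries; otherwise both are counted under the lowercase form "give".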

Lemmatize words – when this box is checked, the words identified in the texts will be reduced to their base forms (lemmas) using a lemma lexicon for the chosen language. For example, if a text contains the words “gave”, “given”, and “gives”, MAXDictio will list only the base form “give” in the results table.
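A lexicon-based lemmatization can be thought of as a lookup table from word forms to base forms. The Python sketch below uses a tiny hand-made table as a stand-in; the entries and the function name are assumptions for illustration and say nothing about the contents or format of the lemma lexicon MAXDictio uses.

```python
from collections import Counter

# Tiny hypothetical lemma lexicon: word form -> base form.
LEMMAS = {"gave": "give", "given": "give", "gives": "give"}


def count_lemmas(words, lemmas=LEMMAS):
    """Count words after mapping each one to its base form, if the lexicon has one."""
    return Counter(lemmas.get(w.lower(), w.lower()) for w in words)
```

Words that are not in the lexicon are kept unchanged (in lowercase), so they still appear in the counts.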

Click OK to begin the analysis of word frequencies. Depending on the size of the texts, this process may take a few moments. A progress display keeps you informed while the analysis runs.
