This is a guest article written by Matthew Loxton, a Principal Healthcare Analyst and Professional MAXQDA Trainer.
How MAXQDA can assist with Affinity Mapping processes: analysis of word and phrase frequency within a large sample of existing healthcare documents
Knowledgebases (KB) and electronic document library systems (eDLS) often form the core of organizational business workflow and operations. Optimal classification of electronic documents is the foundation for easy browsing and effective searching to retrieve the documents during work activities. As a result, creating a representative taxonomy is a necessary and crucial step in developing an effective KB (Dalkir, 2013).
Taxonomy (from Ancient Greek τάξις (taxis), meaning ‘arrangement’, and -νομία (-nomia), meaning ‘method’) is the science of defining and naming groups on the basis of shared characteristics. A representative taxonomy allows documents in a KB to be categorized in a comprehensive and coherent manner that increases the precision and recall of documents (Busch, 2007). An effective taxonomy can be developed using a number of different approaches, including the analysis of existing tables of contents, wireframes and content structure of existing web pages and SharePoint sites, categories used in existing library systems, and the analysis of word and phrase frequency within a corpus of existing documents.
Word and phrase frequency can be used to identify taxonomy elements. The more common a word or phrase is in a given document corpus, the more likely it is to represent a category by which documents in the corpus can be classified. For example, if the term “potable water” occurs with great frequency throughout a corpus, it suggests that it may be an effective term under which to categorize documents, as well as an efficient search term or menu item for finding associated documents.
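To make the idea concrete, here is a minimal Python sketch of phrase-frequency counting. It is not how MAXDictio works internally; the `tokenize` and `phrase_counts` helpers and the three-sentence corpus are invented purely for illustration.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def phrase_counts(documents, min_len=2, max_len=5):
    """Count every run of min_len..max_len consecutive words across all documents."""
    counts = Counter()
    for text in documents:
        tokens = tokenize(text)
        for n in range(min_len, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Hypothetical three-document corpus, invented for illustration.
corpus = [
    "Potable water must be tested weekly.",
    "The clinic stores potable water on site.",
    "Staff reported that potable water was available.",
]

counts = phrase_counts(corpus)
print(counts.most_common(1))  # [('potable water', 3)]
```

In this toy corpus, “potable water” is the only phrase that appears in every document, which is the kind of signal that would make it a candidate category.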
Affinity Mapping (also known as Card Sorting) is a method used to conceptually (and often manually) group semantically similar words or phrases under a collective category or to split categories into different elements using human insight and subject knowledge (Rosenfeld & Morville, 2002). The purpose of Affinity Mapping here is to derive a taxonomy that makes categorization more effective and enhances search precision and recall.
For example, context and human insight could determine if “potable water,” “clean water,” and “water” should be grouped under a single category or if there are two or more categories into which they could be better sorted. Users searching for documents related to potable water would more easily find results if those documents were categorized or tagged with that term.
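A script can suggest which phrases might belong together, but it cannot make the decision. The sketch below simply lists phrases that share a word so a team can review them; the `candidate_groups` helper and the “health care” entry are hypothetical additions for illustration.

```python
from collections import defaultdict

def candidate_groups(phrases):
    """Map each word to the phrases containing it, keeping words shared by 2+ phrases."""
    by_word = defaultdict(list)
    for phrase in phrases:
        for word in set(phrase.lower().split()):
            by_word[word].append(phrase)
    return {word: found for word, found in by_word.items() if len(found) > 1}

groups = candidate_groups(["potable water", "clean water", "water", "health care"])
print(groups)  # {'water': ['potable water', 'clean water', 'water']}
```

Whether those three phrases end up as one category or several is still a matter for subject knowledge and discussion.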
Note: For the purposes of this blog, Card Sorting, Affinity Diagramming, and Affinity Mapping will be synonymous.
As an example, my team explored a set of 368 healthcare documents using the MAXDictio feature in MAXQDA. MAXDictio was used to identify the most frequent 2-5 word phrases used in the documents, as well as the most frequent single-word terms. The 100 most frequent 2-5 word phrases and single words in the document set were extracted into Excel, and formatted and imported as a code list. The team collated, grouped, combined, and split phrases and words using MAXQDA.
Note: The team used MAXQDA 2018 (VERBI Software, 2017).
As a first step, select MAXDictio “Word Combinations” and set the number of combinations in the dialogue box to suit your purpose. In our study, we used phrase lengths of two to five words. Select stop lists that will eliminate frequent but low-salience words, such as “a,” “and,” “on,” etc., and allow lemmatization so that words with the same stem are grouped under a single word (see Figure 1).
The purpose of first removing low-value phrases and words is to reduce the list and to familiarize the team with the contents.
Figure 1: MAXDictio Word Combinations
The result is a list of phrases ordered from highest to lowest frequency, with lemmatization clustering trivial variations of the same words (see extract in Figure 2).
Figure 2: Word Frequency Output
- Review the phrase list, identify any that have low saliency, add these to the stop list, and refresh to filter them out.
- Repeat the search with single words using the MAXDictio “Word Frequencies” function.
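The effect of the stop list and lemmatization on a word count can be sketched in a few lines of Python. This is only an illustration: the stop list is a short hypothetical one, and `normalize` is a deliberately crude stand-in (it only strips a trailing plural “s”) rather than real lemmatization or anything MAXDictio does.

```python
from collections import Counter
import re

# Hypothetical stop list; a real one would be much longer.
STOP_WORDS = {"a", "and", "on", "the", "of", "to", "in", "is", "are", "for"}

def normalize(word):
    """Crude stand-in for lemmatization: strip a trailing plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def word_counts(text, stop_words=STOP_WORDS):
    """Count words after dropping stop words and collapsing simple plurals."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [normalize(token) for token in tokens if token not in stop_words]
    return Counter(kept)

# Invented sample text.
sample = (
    "The patients and the providers are in the clinic. "
    "A patient is on a waiting list for providers."
)
counts = word_counts(sample)
print(counts["patient"], counts["provider"], counts["the"])  # 2 2 0
```

Adding a newly spotted low-salience word to `STOP_WORDS` and rerunning mirrors the review-and-refresh loop described in the steps above.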
Once the lists have been trimmed, we can export them as an Excel file by using the “Open as Excel Table” or “Export” functions in the top right of the table view (See Figure 3).
Figure 3: Export to Excel Table
In the Excel sheet, merge the Word and Word Combination columns into a single column and rename it “Code.” Then clear the other columns. If you would like to add comments to explain or discuss any of the phrases or words, enter these in the cell to the right of the code and add the header “Memo” in the top cell of that column. These changes will allow the sheet to be imported back into the MAXQDA project as codes with associated memos (see Figure 4).
Figure 4: Excel Sheet – Most Frequent Words
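If you prefer to prepare the sheet with a script rather than by hand, the layout is simple enough to sketch. The example below assumes the two exported lists are already available as Python lists, and the `build_code_sheet` helper and sample values are hypothetical. It writes CSV text only to stay dependency-free; the result would still need to be saved as an Excel workbook before import.

```python
import csv
import io

def build_code_sheet(words, phrases, memos=None):
    """Return CSV text with a 'Code' column and a 'Memo' column.

    words and phrases are the two exported frequency lists; memos maps a
    code to an optional comment. Duplicate codes are written only once.
    """
    memos = memos or {}
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Code", "Memo"])
    for code in list(phrases) + list(words):
        if code in seen:
            continue
        seen.add(code)
        writer.writerow([code, memos.get(code, "")])
    return buffer.getvalue()

# Invented sample values.
sheet = build_code_sheet(
    words=["health", "insurance"],
    phrases=["health care", "military health"],
    memos={"health care": "Check overlap with 'medical care'."},
)
print(sheet)
```

The first row holds the “Code” and “Memo” headers described above, followed by one row per phrase or word.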
Import the new codes by using the Codes/“Import Codes and Memos from Excel Spreadsheet” function. In Figure 5, we can see the codes and memos have been imported as subcodes under a parent code that I named “Voice of the Document.”
Figure 5: Imported Codes
At this point, we are ready to conduct the Affinity Mapping process. The process involves the following activities:
- Grouping codes together that belong to similar constructs
- Arranging codes in a hierarchy
- Merging codes that are synonymous
- Migrating codes if there are parent codes with greater affinity
- Reassigning codes if the parent code loses strength
- Creating new codes if existing parent or sub-codes are insufficient
- Renaming codes to better represent a construct
- Updating memos where applicable
Assuming you have fewer than 20 codes or have a large-format monitor, you can use MAXQDA’s Creative Coding feature to run an Affinity Mapping session. If the coding is too cluttered, see the note below for an alternative approach using the Code System panel.
- Click on the “Creative Coding” feature under Coding (see Figure 6), and drag the resulting window to your large-format screen. This step will ensure that you will have sufficient space to move codes.
- Drag the parent code for your imported “Voice of the Document” codes to the Creative Coding window, and click on “Start Organizing Codes.”
Figure 6: Creative Coding Function
The Creative Coding window will now contain all the codes imported from the word and phrase frequency list created in MAXDictio. In Figure 7, you can see the imported codes in a graphic structure.
Figure 7: Creative Coding Activity Window
At this point, we are ready to combine, split, or add codes to better represent our insight and understanding of the constructs. As codes are moved, renamed, or created, you can add memos to explain your thought process and document the meaning behind the codes.
For example, we decided that “Military Medical” and “Military Health” were the same thing for the purposes of our work, and so dragged “Military Medical” onto “Military Health” to merge them. We decided the same held for “Medical Care” and “Health Care,” and “Care Provider” and “Care Delivery.”
We also decided that “Army Medical” and “Navy Medicine” were facets of “Military Health,” and that “Population Health” and “Mental Health” were facets of “Health Care.” To organize these topics accordingly, we clicked on the “Link (Define as Subcode)” button and connected them as subcodes by dragging the parent to the subcode.
After discussion, we decided that “Homeland Security” and “Defense Authorization” should be grouped as subcodes of a new code, “Government Policy.” We created this new code by clicking on the “New Code” icon, naming it “Government Policy,” linking it as a subcode of the “Voice of the Document” master parent code, and then linking it to the two subcodes.
We judged that “Health Insurance” really comprised three very different things: insurance obtained from employers, private insurance, and government insurance. These distinctions were added as new codes and linked as subcodes under “Health Insurance.” This transitional state can be seen in Figure 8.
Figure 8: Affinity Mapping Code Transitional State
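The merge, link, and create operations described above can also be pictured as edits to a simple parent/child structure. The sketch below is a toy model, not MAXQDA's data model or API; the `CodeSystem` class is hypothetical, and the three insurance subcode names are invented labels for the distinctions we drew.

```python
class CodeSystem:
    """Toy code hierarchy: each code maps to its parent (None for the root)."""

    def __init__(self, root):
        self.root = root
        self.parent = {root: None}

    def add(self, code, parent=None):
        """Create a code under the given parent, or under the root by default."""
        self.parent[code] = parent or self.root

    def move(self, code, new_parent):
        """Make an existing code a subcode of new_parent."""
        self.parent[code] = new_parent

    def merge(self, source, target):
        """Fold source into target; subcodes of source move to target."""
        for code, parent in self.parent.items():
            if parent == source:
                self.parent[code] = target
        del self.parent[source]

    def children(self, code):
        return sorted(c for c, p in self.parent.items() if p == code)

codes = CodeSystem("Voice of the Document")
for name in ["Military Medical", "Military Health", "Army Medical", "Navy Medicine",
             "Homeland Security", "Defense Authorization", "Health Insurance"]:
    codes.add(name)

codes.merge("Military Medical", "Military Health")      # treated as synonyms
codes.move("Army Medical", "Military Health")           # facets of Military Health
codes.move("Navy Medicine", "Military Health")
codes.add("Government Policy")                          # new parent code
codes.move("Homeland Security", "Government Policy")
codes.move("Defense Authorization", "Government Policy")
# Hypothetical labels for the three kinds of insurance.
for name in ["Employer Insurance", "Private Insurance", "Government Insurance"]:
    codes.add(name, parent="Health Insurance")

print(codes.children("Military Health"))  # ['Army Medical', 'Navy Medicine']
```

Thinking of the session this way makes it easier to record each decision in a memo, since every step is a merge, a move, or a new code.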
The map can be exported in a variety of formats using the “Export Map” function. Exported maps can be used in reports and other documents as a way to explain the conceptual structure.
The process may require several iterations until there is consensus that no new codes are needed to describe all the constructs, that all linking and splitting of codes has been accomplished, and that a stable description of the concepts has been achieved.
When the Affinity Mapping process is completed, click “Quit creative coding” to end the session. MAXQDA will offer an option to apply the new structure to the MAXQDA project and update the Code System (See Figure 9).
Figure 9: MAXQDA Code System
Note: If a large number of imported codes are involved or no large-format monitor is available, the creative coding window can become too cluttered to be used effectively. A less intuitive (or enjoyable) method is to undock and maximize the code system panel and to use the activation and “move activated codes here” functions to move codes in bulk. Adding, moving, linking codes as subcodes, and renaming codes is still possible using the code system view, but it may be less intuitive and slower than Creative Coding.
At this point, the Affinity Mapping is completed, but since the codes are already in MAXQDA as a project with the associated documents, further segment coding and analysis can be carried out without having to switch tools or environments. This further enhances the typical Affinity Mapping process and allows for reports, statistics, and figures to be exported and shared.
The use of MAXQDA for Affinity Mapping enables more efficient sorting and categorization and adds the bonus of analysis features that are often lacking.
- Busch, J. A. (2007). Getting Started with Business Taxonomy Design. Retrieved from https://taxonomystrategies.com/wp-content/uploads/2016/02/Getting_Started.ppt
- Dalkir, K. (2013). Knowledge management in theory and practice.
- Rosenfeld, L., & Morville, P. (2002). Information architecture for the world wide web. O’Reilly Media, Inc.
- Walk, P. (2011). Dublin Core metadata. Retrieved from https://github.com/dcmi/repository/blob/master/mediawiki_wiki/User_Guide.md
About the Author
Matthew Loxton is a Principal Analyst at Whitney, Bradley, and Brown Inc. focused on healthcare improvement, serves on the board of directors of the Blue Faery Liver Cancer Association, and holds a master’s degree in KM from the University of Canberra. Matthew is the founder of the Monitoring & Evaluation, Quality Assurance, and Process Improvement (MEQAPI) organization, and regularly blogs for Physician’s Weekly. Matthew is active on social media related to healthcare improvement and hosts the weekly #MEQAPI chat. You can also read other guest posts by Matthew Loxton on Mixed Methods Research here in the MAXQDA Research Blog: