I want to use the dictionary and autocoding functions to code large numbers of words more easily across 10-20 large PDFs. The PDFs are mostly text, but they also include lots of tables and some images.
However, most of these PDFs have headers or footers that repeat the title or other keywords, so those would be picked up by the autocoding as well. I also want to leave certain sections out, like the table of contents or the bibliography. I've tried and searched, but I can't find any way to exclude these parts. Is it simply not possible?
The only other option I can think of is to extract all of the text and then edit out the parts I don't want, but that is of course quite tedious and time-consuming. I would also prefer to keep the documents as PDFs to maintain a better idea of where the text occurs in each document.
Is it generally better to just use pre-cleaned text? Even when I try to clean the text in Word, there are lots of formatting irregularities that I have to fix, and I assume these would be carried over to MAXQDA, right? For example, sentences that are abruptly split by paragraph breaks.
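To make the manual route concrete, this is roughly the cleanup I would otherwise have to script outside MAXQDA. It is only a minimal Python sketch, assuming the text of each page has already been extracted into a list of strings (one string per page). The function names `strip_repeated_lines` and `rejoin_broken_sentences` and the 60% threshold are my own placeholders, not MAXQDA features.

```python
import re
from collections import Counter


def strip_repeated_lines(pages, min_share=0.6):
    """Drop lines that recur on at least min_share of pages (likely headers/footers)."""

    def normalise(line):
        # Replace digits so "Page 3" and "Page 4" count as the same line.
        return re.sub(r"\d+", "#", line.strip().lower())

    counts = Counter()
    for page in pages:
        # Count each normalised, non-blank line at most once per page.
        counts.update({normalise(l) for l in page.splitlines() if l.strip()})

    # Assumption: a line must appear on at least two pages to count as repeated.
    threshold = max(2, min_share * len(pages))
    repeated = {key for key, n in counts.items() if n >= threshold}

    return [
        "\n".join(l for l in page.splitlines() if normalise(l) not in repeated)
        for page in pages
    ]


def rejoin_broken_sentences(text):
    """Turn a line break into a space when it looks like it splits a sentence."""
    # Heuristic (an assumption): the break is spurious if the text before it
    # does not end in . ! ? or : and the next line starts with a lowercase letter.
    return re.sub(r"(?<![.!?:])\n+(?=[a-z])", " ", text)


if __name__ == "__main__":
    # Made-up sample pages, purely for illustration.
    sample = [
        "Annual Report 2021\nThe results were\n\nbetter than expected.\nPage 1",
        "Annual Report 2021\nWe coded all\ninterviews twice.\nPage 2",
    ]
    for page in strip_repeated_lines(sample):
        print(rejoin_broken_sentences(page))
```

This would not handle the table of contents or the bibliography, which I would still have to cut by hand, and the result would be plain text rather than the original PDFs.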
Any tips would be greatly appreciated!
Version: MAXQDA 2022
System: Mac OS X 12.x (Monterey)