File metadata -> document variables?

17.05.2017, 05:43

I'm working on a Mac with a lot of downloaded PDFs. macOS automatically creates a metadata field, com.apple.metadata:kMDItemWhereFroms, which stores the URL from which the file was downloaded (see https://apple.stackexchange.com/questio ... via-chrome for a discussion).

Is there a way to read this in MAXQDA and automatically create a document variable with the same value? I'd hate to have to re-enter information that is already stored in the file system.

Version: MAXQDA 12
System: Mac OS X 10.11 (El Capitan)
derickfay
 
Posts: 19
Joined: 01.04.2017, 18:22

Re: File metadata -> document variables?

18.05.2017, 15:30

Dear derickfay,
I don't know much about metadata, and even less about metadata on the Mac, but I do know there is currently no direct import function for metadata in MAXQDA. There is, however, a possible workaround: if you can get this metadata into an Excel table, you can import that table and so turn the column with the URL into a document variable. I don't know of any metadata tools that can do that, so if you find one, it would be nice to know how you did it. Maybe someone else here has another idea.
All the best,
Andreas
MAXQDA Support Team
Andreas V.
 
Posts: 42
Joined: 13.04.2017, 16:23

Re: File metadata -> document variables?

19.05.2017, 09:21

Thanks for the suggestion. I've got a Python script working that goes through a directory and produces a CSV with various bits of document metadata (e.g. the full path to the source file, the "where from" URL when available, macOS tags, etc.). I can then convert the CSV to Excel and import its contents successfully as document variables in MAXQDA.
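
For anyone who wants to try the same thing, here is a minimal sketch of that kind of script. It is not the actual script described above. It assumes the attribute is read with the macOS `xattr` command-line tool and that its value is a binary property list holding a list of URLs. The function names, the CSV column names, and the choice to keep only the first URL in the list are all assumptions made for illustration; check the column names against what your MAXQDA version expects on import.

```python
import csv
import plistlib
import subprocess
from pathlib import Path

WHERE_FROMS = "com.apple.metadata:kMDItemWhereFroms"


def parse_where_froms(raw):
    """Decode the attribute's property-list bytes into a list of URL strings."""
    if not raw:
        return []
    value = plistlib.loads(raw)
    return [str(v) for v in value] if isinstance(value, list) else [str(value)]


def read_where_froms(path):
    """Read the attribute via the macOS `xattr` tool; empty list if unavailable."""
    try:
        # -p prints the value, -x forces hexadecimal output
        result = subprocess.run(
            ["xattr", "-px", WHERE_FROMS, str(path)],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return []  # `xattr` is not installed (e.g. not running on macOS)
    if result.returncode != 0:
        return []  # the file has no such attribute
    return parse_where_froms(bytes.fromhex(result.stdout))


def write_metadata_csv(directory, csv_path, reader=read_where_froms):
    """Write one CSV row per PDF under `directory`; return the number of rows."""
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        # Column names are placeholders, not confirmed MAXQDA headers.
        writer.writerow(["Document name", "Path", "WhereFrom"])
        for pdf in sorted(Path(directory).rglob("*.pdf")):
            urls = reader(pdf)
            # Assumption: the first entry is the URL of interest.
            writer.writerow([pdf.stem, str(pdf.resolve()), urls[0] if urls else ""])
            rows += 1
    return rows
```

Calling `write_metadata_csv("/path/to/pdfs", "metadata.csv")` would then give a file that can be opened in Excel and saved as a spreadsheet for import.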

I'm running into two issues with document variables, though.

First, text variables seem to be limited to 64 characters, which means that a lot of the URLs get cut off.
Second, activating by document variables only supports the operators =, <>, < and >, without wildcards. I had hoped to be able to activate documents where the "where from" URL contains a given domain, for example, or matches = *domain* using wildcard syntax (and to do the same kind of thing with macOS file tags, which I've been using for preliminary coding of overall file content), but this doesn't look to be possible at present.

Given these limitations, it's turning out to be less immediately useful than I had hoped.

I think the solution is going to be to parse the URLs in the script, produce a field with just the domain (which is all I really need for analysis at the moment), and import that. And I guess I could set up a Boolean variable for the presence/absence of each of the macOS file tags.
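
A rough sketch of those two steps, using only the Python standard library. The helper names `domain_of` and `tag_flags` are made up for illustration, and writing the Boolean values as 1/0 is an assumption about what imports cleanly. Note that `domain_of` returns the full host name (e.g. "www.example.org"), not just the registered domain.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lower-cased host of a URL, or "" if none can be parsed."""
    if not url:
        return ""
    try:
        # Prefix scheme-less values with "//" so urlparse treats them as hosts.
        host = urlparse(url if "://" in url else "//" + url).hostname
    except ValueError:
        return ""
    return host or ""


def tag_flags(file_tags, all_tags):
    """Map each known tag to 1 or 0 depending on whether the file carries it."""
    present = set(file_tags)
    return {tag: int(tag in present) for tag in all_tags}
```

Each key returned by `tag_flags` would become its own column in the CSV, and the host name from `domain_of` should normally fit well within the 64-character limit on text variables.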

In the longer term, I think this would also be useful for importing geospatial information for photos, to create sets or activate by location, etc.
derickfay
 
Posts: 19
Joined: 01.04.2017, 18:22
