File metadata -> document variables?

17.05.2017, 05:43

I'm working on a Mac with a lot of downloaded PDFs. MacOS automatically creates a metadata field that stores the URL from which each file was downloaded (see ... via-chrome for a discussion).

Is there a way to read this in MAXQDA and automatically create a document variable with the same value? I'd hate to need to reenter information which is already stored in the file system.

Version: MAXQDA 12
System: Mac OS X 10.11 (El Capitan)
Posts: 52
Joined: 01.04.2017, 18:22

Re: File metadata -> document variables?

18.05.2017, 15:30

Dear derickfay,
I don't know much about metadata, and even less about metadata on the Mac, but I do know that MAXQDA currently has no direct import function for file metadata. There is a possible workaround, however: if you can get the metadata into an Excel table, you can import that table and turn the column containing the URL into a document variable. I don't know of any metadata tools that can do this, so if you find one, it would be nice to hear how you did it. Maybe someone else here has another idea.
All the best,
MAXQDA Support Team
Andreas V.
Posts: 274
Joined: 13.04.2017, 16:23

Re: File metadata -> document variables?

19.05.2017, 09:21

Thanks for the suggestion. I have got a Python script working that goes through a directory and produces a CSV containing various bits of document metadata (e.g. the full path to the source file, the "Where from" URL when available, MacOS tags, etc.). I can then convert the CSV to Excel and import its contents successfully as document variables in MAXQDA.
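For anyone who wants to try the same approach, here is a minimal sketch of this kind of script, not the exact one I'm running. It assumes the download URL is stored in the extended attribute com.apple.metadata:kMDItemWhereFroms as a binary property list, and that the MacOS command-line tool xattr is available to read it. The function names and the CSV column headers are placeholders of my own; check the MAXQDA manual for the headers its variable import expects.

```python
import csv
import plistlib
import subprocess
from pathlib import Path

# Assumed attribute name for the "Where from" field on MacOS.
WHERE_FROM_ATTR = "com.apple.metadata:kMDItemWhereFroms"


def decode_where_froms(raw):
    """Decode the attribute's property-list bytes into a list of URL strings."""
    if not raw:
        return []
    value = plistlib.loads(raw)
    if isinstance(value, list):
        return [str(item) for item in value]
    return [str(value)]


def read_where_froms(path):
    """Return the download URLs recorded for a file, or [] if there are none.

    Shells out to `xattr -px`, which prints the attribute value as hex.
    """
    try:
        result = subprocess.run(
            ["xattr", "-px", WHERE_FROM_ATTR, str(path)],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return []  # xattr tool not present (e.g. not running on MacOS)
    if result.returncode != 0:
        return []  # attribute not set on this file
    hex_text = "".join(result.stdout.split())  # drop spaces and newlines
    return decode_where_froms(bytes.fromhex(hex_text))


def write_metadata_csv(folder, out_csv, reader=read_where_froms):
    """Write one CSV row per PDF found under `folder`; return the row count."""
    rows = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Placeholder headers, to be adjusted to the import format.
        writer.writerow(["Document name", "Path", "Where from"])
        for pdf in sorted(Path(folder).rglob("*.pdf")):
            urls = reader(pdf)
            writer.writerow([pdf.stem, str(pdf.resolve()), urls[0] if urls else ""])
            rows += 1
    return rows
```

Calling write_metadata_csv("/path/to/pdfs", "metadata.csv") would then produce the file to convert to Excel. The sketch keeps only the first URL in the list; the attribute can hold more than one entry.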

I'm running into two issues with document variables, though.

First, text variables seem to be limited to 64 characters, which means that a lot of the URLs get cut off.
Second, activating by document variables only supports the operators = <> < >, without wildcards. I had hoped to be able to activate documents where the "Where from" URL contains a given domain, for example, or = *domain* using wildcard syntax (and to do the same kind of thing with MacOS file tags, which I've been using for preliminary coding of overall file content), but this doesn't look to be possible at present.

Given these limitations, it's turning out to be less immediately useful than I had hoped.

I think the solution is going to be to parse the URLs in the script, and produce a field with just the domain (which is all I really need for purposes of analysis for the moment), and import that. And I guess I could set up a Boolean variable for the presence/absence of each of the MacOS file tags.
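A rough sketch of both ideas, assuming the URLs are ordinary http(s) URLs and the tags for each file have already been collected into a dictionary. The names domain_of and tag_columns are made up for illustration:

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL without a leading 'www.'.

    Returns "" when the string has no recognisable host.
    """
    host = urlparse(url.strip()).hostname or ""
    return host[4:] if host.startswith("www.") else host


def tag_columns(tags_per_file):
    """Turn {file name: [tags]} into one 0/1 column per tag.

    Returns (sorted tag names, {file name: [0 or 1 for each tag]}).
    """
    all_tags = sorted({tag for tags in tags_per_file.values() for tag in tags})
    table = {
        name: [1 if tag in tags else 0 for tag in all_tags]
        for name, tags in tags_per_file.items()
    }
    return all_tags, table
```

For example, domain_of("https://www.Example.org/path/file.pdf") gives "example.org", which should stay well within the text-variable limit. Each tag then becomes its own column that can be matched with a plain = comparison.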

In the longer term, I think the same approach would also be useful for importing geospatial information from photos, e.g. to create sets or activate documents by location.
Posts: 52
Joined: 01.04.2017, 18:22
