Descriptive statistics not possible in large projects

16.08.2018, 10:48

I tried to run descriptive statistics in our project, but I encountered a problem. After selecting all documents, I get an error message:



So, apparently, MAXQDA has an internal limitation here, while our project is bigger than this.
How can I adjust this setting to a higher number, so that we don't encounter a limitation anymore?



Version: MAXQDA 2018
System: Windows 10
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

17.08.2018, 17:08

Can you perhaps describe what you are attempting to do, and explain how many documents, codes, and variables you are working with in this project?
mloxton
 
Posts: 117
Joined: 28.02.2014, 19:38
Location: Washington DC & Denver Colorado

Re: Descriptive statistics not possible in large projects

19.08.2018, 11:01

I got an answer from the MAXQDA support team:

We are sorry, but as stated in the manual for MAXQDA Stats, the limit for the number of variables is 1,000:
https://www.maxqda.com/help-max18-stats/limits-and-technical-notes

There won't be any technical adjustments on this point in the near future, but we will take your suggestion for improvement into account.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

19.08.2018, 11:26

The order in which I did things was: first go to -> Stats --> Start with all Documents
and THEN, after this process has run, I can select which codes I want to include and which not.

Following this procedure, I cannot get around the error message,
even if I only want to make, for instance, a frequency table on a subcode (which would definitely stay below 1000 codes).
-
I just tested again.
If I start with 'start with all documents' I get the error message.

If I start with 'start with activated documents and codes' and I have activated all documents, but only 1 parent code (with its subcodes), I get no error message.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

19.08.2018, 11:45

What I am attempting to do is: I try to understand why or why not making frequency tables is a useful function. What is the purpose of the function Frequency tables?
When is it useful to use this?

For instance, I can use 'frequency tables' --> and then add a code
but I can also go to Analysis > Code frequencies.

To me it is not clear what the added value could be of making frequency tables by 'Descriptive stats'.

I'm interested in the best way to count all kinds of numbers (frequencies, co-occurrence of codes in the same document, etc.) for our quantitative content analysis.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 13:00

So, apparently, MAXQDA has an internal limitation here, while our project is bigger than this.

I'm sorry to hear that you have reached the limit of 1000 variables and codes – and thank you for sharing your workaround! I have to admit that I've never heard of a project of this size before.. even the World Values Survey doesn't reach the number of 1000 variables. I think it is safe to say that >99% of all MAXQDA users will never reach that number, which is probably why those issues haven't been addressed before.

I try to understand why or why not making frequency tables is a useful function. What is the purpose of the function Frequency tables? When is it useful to use this?

Uhm.. I don't want to sound impolite by stating the obvious, but I'm afraid I don't know what else to say: The purpose of all features that produce frequency tables is to display frequencies in tables. It is useful when you want to see frequencies in tables, for example when you are interested in the distribution of attributes like age or gender across your cases.

For instance, I can use 'frequency tables' --> and then add a code
but I can also go to Analysis > Code frequencies.

To me it is not clear what the added value could be of making frequency tables by 'Descriptive stats'.


Yes, it is mainly the same feature, put in two places, so that you don't have to close the "Stats"-module when interested in frequencies. The main difference is that the feature "Subcode Statistics" will allow you to aggregate frequencies, and the entry in the Stats-module will allow you to produce frequencies for both codes & variables, while those features are split up in the standard version.

I'm interested in the best way to count all kinds of numbers (frequencies, co-occurrence of codes in the same document, etc.) for our quantitative content analysis.

Well, all the frequency-features simply produce tables that show frequencies. They all look pretty much the same, so I'm having trouble in finding a criterion which would allow me to distinguish between a "good" or "bad" or "best" way to count and display frequencies in tables.
MAXQDA Support Team
Andreas V.
 
Posts: 272
Joined: 13.04.2017, 16:23

Re: Descriptive statistics not possible in large projects

20.08.2018, 14:25

"I have to admit that I've never heard of a project of this size before.. even the world values survey doesn't reach the number of 1000 variables."

To be clear, we have only 7 variables so far, but about 1000 codes.
And the error message is also about the number of codes.

(We started with almost every document having its own code, and with about 1000 documents that resulted in about 1000 codes as well.
We have reduced this number of course, but so far haven't deleted the original detailed codes.)
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 14:41

"Well, all the frequency-features simply produce tables that show frequencies. They all look pretty much the same, so I'm having trouble in finding a criterion which would allow me to distinguish between a "good" or "bad" or "best" way to count and display frequencies in tables."

The criterion for me is what happens on an aggregate level. We spoke about this already, see
https://www.maxqda.com/en/support/forum/viewtopic.php?f=11&t=1266

If collapsing sub-sub-subcodes into higher-order codes gives strange results in functions that do not support 'collapse and aggregate', then I call this 'bad'.

And 'best' is when I can collapse and aggregate, and sum (or split), on every level I want.
For instance, the function 'subcode statistics' from the right mouse button works best.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 14:59

I will try to show you what I do, and why 'counting frequencies' is not as easy as it seems when you have a system with sub-sub-subcodes.

First: one of our main categories is called 'Support, treatment and medication'. On parent level we aggregate this, because it is all about help/care/treatment/support.

See here:



table 1

From 923 documents in total, there are 158 documents (=cases) about 'Support, treatment and medication'.

Looking more in detail, we see that 36 documents are about 'treatment' (behandeling).

Second: I take a closer look into the part about 'treatment'.
I use the function 'subcode statistics'.

Result:



table 2

We see for instance that 'effectiviteit van behandeling' is most often mentioned, while other aspects of treatment are mentioned less.
This is a useful frequency table.


Third: using the function 'Code frequencies' for the subcode 'Treatment' (remember this is a subcode in the category support and treatment and medication)

This results in a long list:



table 3

Conclusion for me: not useful

It does not automatically collapse, aggregate and sum sub-sub-level codes. So, for instance, I can no longer see how often 'effectiviteit van behandeling' is mentioned.
From the previous results I know that this is in 12 documents.
This result says only 4 documents, which is not true: the sublevel codes are not summed into the parent, but listed separately.
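
To make the difference between the two counts concrete, here is a minimal Python sketch. It is not how MAXQDA computes anything; the document IDs and the second subcode name ('effectivity of intervention Y') are made up for illustration, and only the totals 4 and 12 come from the tables above. It assumes the aggregated document count is the number of distinct documents containing the code or any of its subcodes, so a document that contains two subcodes is counted once.

```python
# Hypothetical data: which documents contain each code directly.
# Only the totals (4 direct, 12 aggregated) are taken from the thread.
docs_by_code = {
    "effectiviteit van behandeling": {1, 2, 3, 4},
    "effectivity of intervention X": {3, 5, 6, 7, 8},
    "effectivity of intervention Y": {8, 9, 10, 11, 12},
}

# Hypothetical code hierarchy: parent code -> list of direct subcodes.
children = {
    "effectiviteit van behandeling": [
        "effectivity of intervention X",
        "effectivity of intervention Y",
    ],
}


def aggregated_docs(code):
    """Return the set of documents containing the code or any of its subcodes."""
    docs = set(docs_by_code.get(code, set()))
    for child in children.get(code, []):
        docs |= aggregated_docs(child)  # union, so shared documents count once
    return docs


parent = "effectiviteit van behandeling"
direct_count = len(docs_by_code[parent])          # documents coded with the parent itself
aggregated_count = len(aggregated_docs(parent))   # parent plus all subcodes, distinct documents
naive_sum = direct_count + sum(len(docs_by_code[c]) for c in children[parent])

print(direct_count, aggregated_count, naive_sum)
```

With this made-up data the direct count is 4, the aggregated count is 12, and simply adding up the per-code document counts would give 14, because documents 3 and 8 each contain two of the codes.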

[continue in next message]
Last edited by Loekarin on 20.08.2018, 15:09, edited 1 time in total.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 15:08

[continue]

Fourth: use function Code Frequencies
Remove all lower level codes (sub sub subcodes), so I keep only the same subcodes as in the frequency table made by subcode statistics.

See:


Running Code Frequencies results in this table:



table 4

As you can see, still no 'true' results. Effectiviteit van behandeling is only mentioned 4 times. Subcodes are not counted.

Although Table 4 should be the same as Table 2, it is not.

Conclusion: not useful
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 15:35

Fifth: using descriptive statistics --> frequency tables

using 'activated documents and activated codes',
activating all documents
activating the subcode 'Treatment' (in the higher parent category treatment, medication and support)



Taking a closer look makes me already suspect problems:



I see that on a bit more aggregated level, the color is light grey (left column).

For instance the sub-subcode 'effectivity of treatment' (effectiviteit van behandeling) is light grey, but has gone to the right column.
More detailed sub-subcodes below 'effectivity of treatment', such as 'effectivity of intervention X' (e.g. effectivity of cannabis), are still in the left column.

-Then running descriptive statistics - frequency tables

Result:



Table 5

It still counts only 4 documents for effectivity of treatment. It does not include subcodes and sum them.

'Effectivity of treatment' is a subcode below 'treatment'.
On a higher order, it looks like 'treatment' is only mentioned once in all 923 cases.



This is not true.
From our code system I learn that the code 'treatment' (behandeling) is applied to 47 segments, not just to one.



And from Table 1 we already learned that these 47 segments occur in 36 documents.

So, the function Descriptive Stats - Frequency Tables is not useful at all.
Again, a problem with collapse, aggregate and sum.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 15:44

Concluding:

There are 3 different ways to make frequency tables:
1) using subcode statistics
2) using Code Frequencies
3) using descriptive statistics --> frequency tables

Although you might expect these functions to give the same results for the same code and subcode, as I tried to show, they do not.
If you like language, you might even notice that options 2 and 3 have the word 'frequency' in the name, but give the worst results. They do not give the actual frequencies, since they ignore the subcodes.

And 'subcode statistics', which has a name that suggests it only gives results on subcode-level, actually is the most useful function to count aggregate frequencies.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

20.08.2018, 15:53

Concluding 2:

Although you might think that "all the frequency-features simply produce tables that show frequencies", you might wonder WHICH frequencies they show, and how reliable those results are. Cross-checking against other tables, or simply looking at the code system, is definitely something I recommend.

(see again my Table 2 and Table 4, which look the same, but produce different results for the same subcode.)

More importantly, simply producing frequencies is not simple at all, when you have sub-sub-sublevel codes.
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39

Re: Descriptive statistics not possible in large projects

22.08.2018, 16:36

Thank you for your extensive reply. I can follow your line of thought most of the time, but it seems to me that every instance or variation of the sentence "the results are not true" is not valid, because it is based on the arbitrary assumption that only aggregated frequencies are "actual frequencies" or "logical" or "true". If, however, you imagine for example that someone uses a parent code to simply denote a phenomenon without going into details about its form/attributes/shapes etc., and then uses the subcodes to denote specific (identified) instances of this phenomenon, then it might be necessary to not automatically aggregate frequencies. And this someone could rightfully argue that automatically aggregating frequencies in this instance would falsify the results, because it would ignore the distinction between an unidentified instance of a phenomenon and instances that have been identified in some regard.

So I completely agree that an option that aggregates frequencies would be very nice to have in both the "Descriptive Statistics > Frequencies"-feature and the "Code Frequencies"-feature, but I cannot agree with your assessment that any of the results are "untrue" or "unreliable" just because the frequencies are not aggregated.. frequencies are frequencies, and aggregated frequencies are aggregated frequencies. Nonetheless, thank you again very much for all of your feedback. I think the frequency tables will really improve when the frequencies can be aggregated not just by the "Subcode Statistics"-function.. as it stands, it is a bit of "insider knowledge" that aggregation is only possible with this one feature.

Kind regards,

Andreas
MAXQDA Support Team
Andreas V.
 
Posts: 272
Joined: 13.04.2017, 16:23

Re: Descriptive statistics not possible in large projects

24.08.2018, 12:04

Hi Andreas (again),

Yes, you're totally right. The frequency-results are 'true' for the kind of function that it is programmed to do.
But (as we also agree) it would be nice to have a choice, to aggregate or not.

I also recognize "that someone uses a parent code to simply denote a phenomenon without going into details about its form/attributes/shapes etc.,".
We encountered this in our project as well. We made a subcode called 'not otherwise specified' for these cases.

Maybe the instructions, or the description of how to read the frequency tables, could explain that a parent code's count covers only the segments coded directly with the parent code, not those of its subcodes. I mean:

If you code:

bird - 4 segments
(subcodes):
  pigeons - 6 segments
  sparrows - 2 segments
  blackbirds - 10 segments

Now some functions would say that only 4 segments are about birds, while I think it would be better to say that 4 segments are about birds - not otherwise specified; and 22 segments are about birds - specified or unspecified.
Specified = with attributes/shapes/other kinds of specification.
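
The bird example can be written out as a small Python sketch. This is only an illustration of the two ways of counting, not MAXQDA's implementation; the `Code` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Code:
    """A code in a hierarchical code system (hypothetical structure)."""
    name: str
    own_segments: int = 0  # segments coded directly with this code
    children: List["Code"] = field(default_factory=list)

    def aggregated_segments(self) -> int:
        """Own segments plus the segments of all subcodes, at every level."""
        return self.own_segments + sum(
            child.aggregated_segments() for child in self.children
        )


bird = Code(
    "bird",
    own_segments=4,
    children=[
        Code("pigeons", 6),
        Code("sparrows", 2),
        Code("blackbirds", 10),
    ],
)

# 'birds - not otherwise specified': only the parent code itself
print(bird.own_segments)           # 4
# 'birds - specified or unspecified': parent plus all subcodes
print(bird.aggregated_segments())  # 22
```

Both numbers are valid frequencies; the point is that a table should make clear which of the two it reports.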
Loekarin
 
Posts: 45
Joined: 25.06.2018, 12:39
