Saturday, 16 April 2016

EgaFem Analysis Part 2: A glossary of terms.

Author: Drew Roan

Previous piece in this series:
A basic guide on what to look for in research papers -  http://egafeminist.blogspot.co.uk/2016/04/egafem-analysis-part-1-basic-guide-on.html

As stated in my previous piece, Egalitarian Feminism wants to help people learn how to read research papers, what to look for and, from there, how to interpret the data.
Continuing on with that theme, I have compiled a basic glossary of some frequently used terms and their definitions to help people understand some of the language used. This list is not exhaustive by any means, but hopefully people may find it useful.

Aggregate:
Definition: A total created from multiple smaller units. To take an example, the population of a country is the aggregate of all the cities, towns, villages, rural areas and so forth.

Attrition:
Definition: The rate at which participants drop out of a study over an extended period of time. A study with a high attrition rate risks significant bias in its results, because the people who remain may differ from those who left, and this can threaten the overall quality of the research.

Bias:
Definition: A systematic influence on a set of data that produces lopsided or misleading results. As a result of different biases, a set of data might over-represent or under-represent parts of the larger population. Bias can come in many forms, such as positive and negative response rate bias in samples, common method bias (see below), instruction bias (where the instructions are unclear, so researchers use their own judgement about what is wanted and participants respond differently depending on how they read the instructions) and more besides.

Chi square:
Definition: A statistical test used to compare expected data with data that has been collected, usually represented as χ² (the Greek letter chi, squared). A large difference between expected and collected results indicates that something other than chance may have caused the discrepancy. A suitably large difference allows researchers to reject the null hypothesis (see further down).
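
To make the idea a little more concrete, here is a minimal sketch of how the chi square statistic is calculated. The die-roll counts are invented for illustration, and the function name is my own rather than anything from a particular statistics package.

```python
def chi_square_statistic(observed, expected):
    """Add up (observed - expected) squared, divided by expected, for each category."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: a die rolled 60 times. A fair die should land on each face about 10 times.
observed_counts = [5, 8, 9, 8, 10, 20]
expected_counts = [10, 10, 10, 10, 10, 10]

statistic = chi_square_statistic(observed_counts, expected_counts)
print(round(statistic, 2))
```

With six categories there are five degrees of freedom, and the usual cut-off at the 0.05 level is roughly 11.07. A statistic above that would lead a researcher to doubt that the die is fair.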

Coefficient:
Definition: The number or known factor (usually a constant) by which another value (usually a variable) is multiplied. As an example, imagine you have a sample of workers that is 10% of the total population of workers in an area. Having collected your results from the sample (let’s say how many employees work in sales), you wish to estimate how the larger population is likely to look if your sample is representative. Because 10% is one tenth of the population, you would multiply your variable (sales employees) by your coefficient (10).
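
The arithmetic in that example can be sketched in a few lines. The figure of 45 sales employees is made up, and the calculation assumes the sample really is representative of the wider population.

```python
sample_fraction = 0.10           # the sample is 10% of all workers in the area
sales_employees_in_sample = 45   # hypothetical count found in the sample

# The coefficient is how many times larger the population is than the sample.
coefficient = round(1 / sample_fraction)

estimated_sales_employees = sales_employees_in_sample * coefficient
print(coefficient, estimated_sales_employees)
```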

Common Method Variance/Bias:
Definition: A term for the concern that differences in survey results may come from the way the data was collected and interpreted rather than from the thing being measured. For example, a survey in which respondents use their own interpretation of terms might receive very different responses depending on who is responding and how they interpret the questions being asked.

Confidence Interval:
Definition: A range of values used to express the researchers’ level of uncertainty in their estimates, reported alongside a confidence level. A researcher who reports an interval at a 60% confidence level is telling you that if you were to use the same sampling method but choose different samples, you would expect intervals calculated this way to contain the true population parameter about 60% of the time (95% is the level most commonly reported). At a given confidence level, a narrower interval indicates a more precise estimate and a wider interval indicates greater uncertainty.
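
As a rough sketch, this is one common way to calculate a confidence interval for a survey percentage (the normal approximation). The survey figures are invented, and the function name is mine. Other methods exist and may suit small samples better.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # z is about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 520 of 1,000 respondents answer "yes".
low, high = proportion_confidence_interval(520, 1000)

# Same percentage, but ten times as many respondents.
big_low, big_high = proportion_confidence_interval(5200, 10000)

print(round(low, 3), round(high, 3))
print(round(big_low, 3), round(big_high, 3))
```

The larger survey gives a narrower interval around the same 52%, which is why sample size matters so much when judging how precise an estimate is.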

Control group:
Definition: In an experiment, the control group is the group that does not receive the treatment or change being tested, although data is still collected from it. Its purpose is to show what would normally happen in a given situation and to compare that data with what happens when you alter an independent variable. This allows the researchers to determine if altering a variable is having an effect on a test group, as well as demonstrating what effect it has.

Dependent variable:
Definition: The variable that researchers measure as an outcome, and which can be influenced by another variable. For example, consider two variables “employment” and “age”. Here, “employment” is the dependent variable. It can be affected by the variable “age”.

Double blind experiment:
Definition: An experiment where neither the researcher(s) nor the participants know which group is the control group and which is the treatment group. This is often done in psychology studies to further reduce potential bias introduced by the participants and the researchers.

Hypothesis:
Definition: A testable statement or prediction. For example, “My hypothesis is that if I water my plants regularly and give them lots of sunshine, they will grow healthily”.

Independent Variable:
Definition: The variable in an experiment that is manipulated by the researchers. It also refers to a variable that is not affected by, but does affect, a dependent variable. In the example given under “dependent variable”, “age” would be an independent variable. “Employment” does not affect “age”, but “age” may have an impact on “employment”.

Meta-Analysis:
Definition: A term used to describe the method of combining and analysing data from multiple studies on the same subject.
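
One common way of combining results is to give more weight to the more precise studies (known as fixed-effect, inverse-variance pooling). The sketch below uses three made-up studies. It is only one of several approaches, and real meta-analyses involve many more checks.

```python
import math


def pooled_estimate(estimates, standard_errors):
    """Fixed-effect pooling: weight each study by 1 / (standard error squared)."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * est for w, est in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes and standard errors from three studies.
study_estimates = [0.30, 0.10, 0.20]
study_standard_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pooled_estimate(study_estimates, study_standard_errors)
print(round(pooled, 3), round(pooled_se, 3))
```

The second study has the largest standard error, so it counts for the least in the combined figure.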

Null hypothesis:
Definition: This term represents the default assumption that the variables being tested in an experiment have no effect on the results. In the example given for “hypothesis”, the null hypothesis would be that “regular water and lots of sunshine will not help plants grow more healthily”.

P-Value:
Definition: The probability of seeing results at least as extreme as those observed if the null hypothesis were true, that is, if chance alone were at work. It is usually represented by a lower case p (for probability). For example, if you see p < 0.05 in a study, this tells you that results like these would be expected less than 1 time in 20 if there were no real effect. It does not tell you the probability that the results themselves were down to luck. Most researchers treat a p-value greater than 0.05 as meaning the results were not statistically significant, or are too prone to chance to be relied upon.
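
To show where a p-value comes from, here is a small sketch using coin flips. The null hypothesis is that the coin is fair, and the observed result (16 heads in 20 flips) is invented. The function name is my own.

```python
from math import comb


def two_sided_coin_p_value(heads, flips):
    """Probability, for a fair coin, of a result at least as far from 50/50 as the one observed."""
    distance = abs(heads - flips / 2)
    # Count every outcome that is at least as lopsided as the observed one.
    extreme_outcomes = sum(
        comb(flips, k) for k in range(flips + 1) if abs(k - flips / 2) >= distance
    )
    return extreme_outcomes / 2 ** flips


p_value = two_sided_coin_p_value(16, 20)
print(round(p_value, 4))
```

The result is well below 0.05, so a researcher would usually reject the idea that the coin is fair. Note that this is the chance of such a lopsided result if the coin is fair, not the chance that the coin is fair.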

Parameter:
Definition: “Parameter” refers to a summary – usually a percentage or average – that describes the entire population.

Population:
Definition: “Population” refers to any large group of objects or individuals about which information is desired, such as Germans, flowers or insects.

Random Sampling:
Definition: A sampling technique where individuals from a population are picked at random. In simple random sampling, every individual has an equal chance of being picked.
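
A minimal sketch of simple random sampling, assuming a made-up population of 1,000 people identified by number. The seed is fixed only so that the example gives the same sample each time it is run.

```python
import random

# Hypothetical population: 1,000 people, each identified by a number.
population_ids = list(range(1, 1001))

rng = random.Random(42)  # fixed seed so the example is repeatable
sample_ids = rng.sample(population_ids, 50)  # pick 50 people, nobody picked twice

print(len(sample_ids))
```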

Regression Analysis:
Definition: A method of statistical analysis used to examine relationships between variables. Perhaps the best way to think of this is to think of a scatter chart (a chart where the data points are marked with little dots). The regression is represented by the line of best fit drawn through the data, which shows the average trend in how one variable changes as the other changes.
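
For the simplest case (one straight line through a scatter chart), the line of best fit can be worked out with the least squares method. This is a sketch with invented data on hours studied and test scores. The function name is mine.

```python
def fit_line(xs, ys):
    """Least squares line of best fit: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied and test score for five students.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours_studied, test_scores)
print(round(slope, 2), round(intercept, 2))
```

The slope is the average change in test score for each extra hour studied in this made-up data set.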

Sampling error:
Definition: A term used to describe the degree to which results from a sample differ from the results that would be obtained from the larger population.
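
A tiny illustration with invented ages: the sampling error here is simply the gap between the sample average and the population average. In real research the population figure is usually unknown, which is why the error has to be estimated.

```python
# Hypothetical population of ten people and a sample of four of them.
population_ages = [22, 25, 31, 38, 44, 47, 53, 59, 62, 69]
sample_ages = [25, 38, 47, 62]

population_mean = sum(population_ages) / len(population_ages)
sample_mean = sum(sample_ages) / len(sample_ages)

# A negative value means the sample underestimates the population average.
sampling_error = sample_mean - population_mean
print(population_mean, sample_mean, sampling_error)
```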

Statistically significant:
Definition: A term used to indicate that a difference in results is unlikely to have occurred by chance alone (see P-Value).

Variable:
Definition: A characteristic or trait that varies between any group of objects or people. Race, gender, age and education are all examples of variables.

Weighted sample:
Definition: A correcting technique used to adjust responses given by survey respondents to match the larger population. Typically this is done when certain demographics (based on race, age, etc) are over or under-represented in a survey and the researchers wish to have their results reflect the larger population.
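
Here is a simplified sketch of how weighting works. All the numbers are invented: a survey where women make up 70% of respondents but 50% of the population. Each group's weight is its share of the population divided by its share of the sample.

```python
# Hypothetical survey: who responded, and what share each group is of the real population.
sample_counts = {"women": 70, "men": 30}
population_shares = {"women": 0.5, "men": 0.5}
# Share of each group answering "yes" to some question (made-up figures).
group_support = {"women": 0.6, "men": 0.4}

total_respondents = sum(sample_counts.values())

# Weight = population share / sample share. Over-represented groups get a weight below 1.
weights = {
    group: population_shares[group] / (sample_counts[group] / total_respondents)
    for group in sample_counts
}

unweighted_support = (
    sum(group_support[g] * sample_counts[g] for g in sample_counts) / total_respondents
)
weighted_support = sum(
    group_support[g] * sample_counts[g] * weights[g] for g in sample_counts
) / sum(sample_counts[g] * weights[g] for g in sample_counts)

print(round(unweighted_support, 2), round(weighted_support, 2))
```

Without weighting, the over-represented group pulls the overall figure towards its own answer. With weighting, the result reflects the population's actual make-up.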


A word of caution: Correlation Vs Causation

A trap many people (and even some researchers) can fall into is to confuse correlation in data with causation. Although a trend may exist, it does not mean that one causes the other.
Correlation can be understood as recognising when two or more variables show a tendency to fluctuate together, but a change in one does not necessarily cause a change in another.
Causation can be understood as recognising when two or more variables show a tendency to fluctuate together and a change in one WILL cause a change in another.
As an example, we could look at profits from ice cream sales and warm weather. As the weather becomes warmer, we would expect companies to sell more ice cream and thus make more profit. The correlation in this example is that ice cream sales increase with rising temperatures. The causation is that selling more ice cream leads to a rise in profits.
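
Correlation is usually measured with a correlation coefficient, which runs from -1 to 1. The sketch below calculates the most common one (Pearson's r) for some invented temperature and ice cream sales figures. The function name is mine.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: daily temperature (°C) and ice creams sold.
temperatures = [18, 21, 24, 27, 30]
ice_creams_sold = [120, 150, 185, 210, 250]

r = pearson_r(temperatures, ice_creams_sold)
print(round(r, 3))
```

A value close to 1 tells you the two sets of figures rise together. On its own it says nothing about whether one causes the other.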

Ending Notes

Hopefully you have found the above piece useful and it leaves you feeling more confident about reading research papers for yourself.

In the near future, I will begin tackling claims made by different people as well as research pieces of interest. If there are any pieces you wish to see examined in more detail, please leave a comment below or contact me on Twitter (@DrewRoanEgaFem).
