Saturday, 30 April 2016

EgaFem Analysis Part 3: How poorly constructed statistics can create false narratives.

Author: Drew Roan

Previous pieces in the analysis series:
1) http://egafeminist.blogspot.co.uk/2016/04/egafem-analysis-part-1-basic-guide-on.html
2) http://egafeminist.blogspot.co.uk/2016/04/egafem-analysis-part-2-glossary-of-terms.html

My previous pieces in this series have covered what to look for when reading research, as well as a beginners’ glossary of research terms.

This time, I want to demonstrate how careless representation of data can lead to extremely misleading claims, even when the intentions behind it are good.

The example I am using is an older infographic created by RAINN (Rape, Abuse and Incest National Network). [1]

I should make it clear that whilst RAINN have since provided an updated version, that version still operates on the same principles as this one:


On their website, RAINN list reports from both the FBI and the Department of Justice as the sources for their data. [1] This gives us a good opportunity to track down the data used.

Having understood the claim and learned where the sources come from, we should look at their methodology.

32 are reported to the police

The data for this claim was taken from the National Crime Victimization Survey, 2008 – 2012. [2] The infographic tells us that 68 out of 100 rapes (or 68 per cent) weren’t reported to law enforcement over this time.

The method used to produce this estimate was to first survey 90,000 households on self-reported crime victimization. [13] Once this data was collected, a series of complex calculations was used to produce estimates for the larger population. Finally, this estimate was cross-referenced with the data collected by police forces on the total number of reports lodged with them for each type of crime. The difference between the survey estimate and the police records is then treated as the “estimated unreported incidents” of a crime. You can see a full breakdown of the methodology below. [3]
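For readers who find a worked example helpful, here is a minimal sketch of that final comparison step, written in Python. The figures are illustrative placeholders of my own, not actual NCVS numbers:

# A minimal sketch of the survey-vs-police-records comparison described above.
# The figures below are illustrative placeholders, NOT actual NCVS data.

survey_estimate = 300_000  # incidents estimated from the household survey
police_reports = 100_000   # incidents actually reported to law enforcement

unreported = survey_estimate - police_reports
unreported_share = unreported / survey_estimate * 100

print(f"Estimated unreported incidents: {unreported:,}")           # 200,000
print(f"Estimated share not reported:   {unreported_share:.0f}%")  # 67%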

So far, so good.

However, there are several caveats to consider:

i) It’s worth bearing in mind that the total estimate RAINN uses in their initial calculation is NOT a total estimate for “rape”, but a total estimate for “rape/sexual assault”. [4] It is possible that, since many states use varying definitions of “rape” and “sexual assault”, the compilers simply wished to create a broader category to collate the data under. However, this is purely speculation on my part. The reality is that rape is not the same as sexual assault, and conflating the totals only muddies the waters.

ii) It would seem RAINN is aware of this distinction and has chosen to ignore it, based on their own phrasing. You can see in the text above that they refer to “sexual assault”, then refer to “rape” in the infographic, despite the fact that the data source they used does not make this distinction. [1][4]

iii) A final note of caution here – you cannot treat a “report” as a “confirmed incident”, which is what RAINN have done in their total estimates. We have no way of knowing how many cases had little or no evidence, how many allegations were withdrawn or how many may have been purposefully false (for example).

It’s worth noting that for all of their claims, they compare the combined survey estimate for “rape/sexual assault” with the total estimates for arrests, referrals to prosecutors, convictions and those who spend a single day in prison.

As for the Department of Justice’s definition of “sexual assault”, according to their website it is:
“…any type of sexual contact or behavior that occurs without the explicit consent of the recipient. Falling under the definition of sexual assault are sexual activities as forced sexual intercourse, forcible sodomy, child molestation, incest, fondling, and attempted rape.” [5]

Whilst all of those things should be treated with the utmost seriousness, conflating such a broad spectrum of sexual offences with rape is simply not helpful.

7 Lead to an arrest

The second source used by RAINN is the FBI’s Uniform Crime Reports, focusing on arrest data covering 2006-2010. I have included a link to the summary of their 2010 data below. [6]

Their claim that 7 out of 100 incidents lead to an arrest comes from collecting the total arrest counts for rape over a five-year period from the FBI databases and comparing them to the data provided by the Justice Department on the total estimated rape/sexual assault numbers.

To use an example based on just one year’s data: in 2010 there were estimated to be 188,380 “rape/sexual assault” incidents according to the Justice Department. [7] According to the FBI, in 2010 there were an estimated 20,088 arrests for “forcible rape”. [6] Therefore:

20,088 ÷ 188,380 ≈ 0.1066, and 0.1066 × 100 ≈ 10.66%

By this calculation, the total estimated arrests for “forcible rape” (by the FBI’s figures) amounted to close to 11% of all estimated incidents of rape/sexual assault (by the Justice Department’s figures) for 2010.

There is a serious issue here. The FBI and the Justice Department record rape and sexual assault differently. Whereas the Justice Department data combines “rape/sexual assault” into a single category, the FBI has separate categories for “forcible rape” and “sexual assault (excluding forcible rape and prostitution)”. Yet in RAINN’s analysis, they looked specifically at arrest rates for “forcible rape” whilst ignoring “sexual assault” arrest rates.

The only effect of doing this is to inflate the number of “rapes” in one category whilst minimizing the number of arrests in the other. In this way, RAINN has managed to present a misleading narrative which does not draw an accurate comparison between the two data sets.

There is another point to consider here, out of interest. The FBI have their own estimate of rapes reported to law enforcement, which comes in at 84,767 incidents in 2010 alone. [8] Had RAINN compared the total estimated rapes reported to law enforcement with the total estimated arrests from the same database, the calculation would have looked like this:

20,088 ÷ 84,767 ≈ 0.2370, and 0.2370 × 100 ≈ 23.70%

We can see when comparing these two data sources that almost 1 in 4 estimated reports of rape to law enforcement led to an arrest in 2010. This is far from definitive and ignores estimates of unreported crimes. However, it also paints a very different picture and could even be used to suggest that law enforcement is more proactive than RAINN’s infographic implies.
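As a quick sketch in Python, using only the 2010 figures quoted above, here is how much the choice of denominator changes the headline rate:

# Comparing the same 2010 arrest figure against two different denominators,
# using the numbers quoted above.

arrests_forcible_rape = 20_088      # FBI UCR, estimated arrests, 2010 [6]
ncvs_rape_sexual_assault = 188_380  # DOJ estimate, "rape/sexual assault", 2010 [7]
fbi_reported_rapes = 84_767         # FBI estimate, rapes reported to police, 2010 [8]

rainn_style_rate = arrests_forcible_rape / ncvs_rape_sexual_assault * 100
same_source_rate = arrests_forcible_rape / fbi_reported_rapes * 100

print(f"Arrests vs DOJ survey estimate: {rainn_style_rate:.2f}%")  # ~10.66%
print(f"Arrests vs FBI report estimate: {same_source_rate:.2f}%")  # ~23.70%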

One final point to consider here is that there is a discrepancy in the “rape/sexual assault” numbers available from the Justice Department. For reference, this chart details the “rape/sexual assault” estimates for 2010-2011, as cited by [9]:


Whereas this one details the 2009-2010 estimates, as cited by [10]:


Note how the estimate for 2010 differs between the two charts by 80,190 estimated incidents. At this time, I am unable to confirm how this discrepancy occurred, or whether RAINN factored these differences into their infographic.

If anyone is curious, I have included a link to another piece, “Victimizations Not Reported to the Police, 2006-2010”, which offers some explanations as to why crimes were not reported. [11]


3 were referred to prosecutors

This data is relatively simple to work out. RAINN cite “Uniform Crime Reports, Offenses Cleared Data” from the FBI over a five-year period. I have provided a link to the 2010 data. [12]

A “cleared offense” can happen in one of several ways, but it typically refers to a case being “closed”, usually because someone has been arrested and turned over for prosecution. RAINN appear to have taken the overall “clearance rate” for each year (for example, in 2010, 40.3% of reported forcible rape offenses were “cleared”), worked out a broad average and applied it to the arrest figures.

The clearance rates for 2006-2010 according to the FBI stood at:

40.9, 40.0, 40.4, 41.2 & 40.3 per cent

RAINN have then taken roughly 40% of the total estimated arrests in the infographic (in this case, 7) and referred to the result as “cases referred to prosecution” (a total of 3), which is a simple but mostly reasonable assumption to have made.
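Here is a minimal sketch of that step in Python, reproducing the averaging and rounding from the figures above (the assumption that RAINN rounded 2.84 up to 3 is mine):

# Reconstructing the "3 referred to prosecutors" step from the figures above.

clearance_rates = [40.9, 40.0, 40.4, 41.2, 40.3]  # FBI clearance rates, 2006-2010 (%)
average_rate = sum(clearance_rates) / len(clearance_rates)

arrests_per_100 = 7  # the infographic's arrest figure
referred = arrests_per_100 * average_rate / 100

print(f"Average clearance rate: {average_rate:.2f}%")  # 40.56%
print(f"Referred per 100 incidents: {referred:.2f}")   # ~2.84, shown as 3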

Conclusion

To save on word count, I will simply say that the remainder of the infographic is, surprisingly, loosely accurate, though the relevance and usefulness of claiming that "the other 98 will walk free" is up for debate.

Unfortunately, RAINN have chosen to conflate “rape/sexual assault” estimates with “forcible rape” data drawn from two different sources. This is then followed by purposefully treating “reported & unreported estimates” as meaning “confirmed incidents of rape”, which is something the data does not tell us.

To top this off, these looser estimates of possible criminal incidents are then compared to arrest figures based on a much stricter definition, further increasing the perceived disparity between criminal activity and the actions taken by the justice system.

This is precisely why we must take the time to understand what data is used to make a claim and HOW the claim is made, not simply repeat the claim uncritically. Misleading claims, even well-intentioned ones, can have disastrous consequences and counterproductive real-world results.

I hope you have found this to be useful.

References
[1] https://rainn.org/get-information/statistics/reporting-rates
[2] http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/34650
[3] http://www.bjs.gov/content/pub/pdf/NCVS_Variance_User_Guide%2011.06.14.pdf
[4] http://www.bjs.gov/content/pub/pdf/cv12.pdf
[5] https://www.justice.gov/ovw/sexual-assault
[6] https://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010
[7] http://www.bjs.gov/content/pub/pdf/cv10.pdf
[8] https://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010/violent-crime/rapemain
[9] http://www.bjs.gov/content/pub/pdf/cv11.pdf
[10] http://www.bjs.gov/content/pub/pdf/cv10.pdf
[11] http://www.bjs.gov/content/pub/pdf/vnrp0610.pdf
[12] https://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2010/crime-in-the-u.s.-2010/clearances
[13] http://www.bjs.gov/index.cfm?ty=dcdetail&iid=245

Saturday, 16 April 2016

EgaFem Analysis Part 2: A glossary of terms.

Author: Drew Roan

Previous piece in this series:
A basic guide on what to look for in research papers -  http://egafeminist.blogspot.co.uk/2016/04/egafem-analysis-part-1-basic-guide-on.html

As stated in my previous piece, Egalitarian Feminism want to help people learn how to read research papers, what to look for and, from there, how to interpret the data.
Continuing with that theme, I have compiled a basic glossary of frequently used terms and their definitions to help people understand some of the language involved. This list is by no means exhaustive, but hopefully people will find it useful.

Aggregate:
Definition: A total created from multiple smaller units. To take an example, the population of a country is the aggregate of all the cities, towns, villages, rural areas and so forth.

Attrition:
Definition: The rate at which participants drop out of a study over an extended period of time. A study with a high attrition rate risks significant bias in its results, potentially threatening the overall quality of the research.

Bias:
Definition: Bias is a form of influence in a set of data which ends up producing lopsided or misleading results. As a result of different biases, a set of data might over-represent or under-represent parts of the larger population. Bias comes in many forms, such as positive & negative response rate bias in samples, common method bias (see below), instruction bias (where instructions are unclear, so participants fall back on their own judgement and respond differently depending on how they interpret what is being asked) and more besides.

Chi square:
Definition: A statistical test used to compare expected data with the data actually collected, usually written as χ² (chi-squared). A large difference between expected and collected results indicates that something other than chance may have caused the discrepancy. A suitably large difference allows researchers to reject the null hypothesis (see further down).
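For readers comfortable with a little code, here is a toy chi-square calculation in Python, using made-up coin-flip counts:

# A toy chi-square test: comparing observed coin-flip counts against
# the counts a fair coin would be expected to produce. Data is made up.

observed = [62, 38]  # heads and tails actually seen in 100 flips
expected = [50, 50]  # what a fair coin predicts

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"Chi-square statistic: {chi_square:.2f}")  # 5.76

# With 1 degree of freedom, the 5% critical value is about 3.84, so a
# statistic of 5.76 is large enough to reject the "fair coin" null hypothesis.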

Coefficient:
Definition: The number or known factor (usually a constant) by which another value (usually a variable) is multiplied. As an example, imagine you have a sample of workers that is 10% of the total population of workers in an area. Having collected your results from the sample (let’s say how many employees work in sales), you wish to estimate how the larger population is likely to look if your data is accurate. You would therefore multiply your variable (sales employees) by your coefficient (10).
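The same worked example in code, with a made-up sample count:

# The scaling example from the definition above. The sample count is made up.

sample_sales_employees = 45  # sales employees found in the 10% sample
coefficient = 10             # multiplier: the sample is 1/10 of the population

estimated_total = sample_sales_employees * coefficient
print(f"Estimated sales employees in the full population: {estimated_total}")  # 450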

Common Method Variance/Bias:
Definition: A term for concerns about how data is interpreted and supplied in surveys. For example, a survey in which respondents use their own interpretation of terms might receive very different responses depending on who is responding and how they read the questions being asked.

Confidence Interval:
Definition: A term used to express the researchers’ level of uncertainty in their estimates. Strictly, the confidence level describes the method: a researcher reporting a 60% confidence level is telling you that if you took the same sampling method but chose different samples, the intervals produced would contain the true population parameter 60% of the time. The confidence interval is the range of values itself. The lower the confidence level, or the wider the interval, the greater the uncertainty in the results.
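A small simulation makes this concrete. The sketch below builds a 95% interval for a sample mean over and over, then counts how often the true mean falls inside; all numbers are illustrative:

import random

# Simulating what a 95% confidence level means: roughly 95% of the
# intervals we construct should contain the true population mean.
random.seed(1)
TRUE_MEAN, TRUE_SD, N, TRIALS = 50.0, 10.0, 100, 1000
covered = 0

for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = sum(sample) / N
    sd = (sum((x - mean) ** 2 for x in sample) / (N - 1)) ** 0.5
    margin = 1.96 * sd / N ** 0.5  # 95% margin, normal approximation
    if mean - margin <= TRUE_MEAN <= mean + margin:
        covered += 1

print(f"Intervals containing the true mean: {covered / TRIALS:.1%}")  # ~95%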

Control group:
Definition: In an experiment, the control group is the group that does not receive the treatment or change being tested. Its purpose is to show what would normally happen in a given situation, providing a baseline to compare against what happens when you alter an independent variable. This allows the researchers to determine whether altering a variable has an effect on the test group, as well as demonstrating what that effect is.

Dependent variable:
Definition: A variable that can be influenced by another variable which researchers can change. For example, consider two variables “employment” and “age”. Here, “employment” is the dependent variable. It can be affected by the variable “age”.

Double blind experiment:
Definition: An experiment where neither the researcher(s) nor the participants know which group is the control group and which is the treatment group. This is often done in psychology studies to further reduce potential bias introduced by the participants and the researchers.

Hypothesis:
Definition: A testable prediction. For example, “My hypothesis is that if I water my plants regularly and give them lots of sunshine, they will grow healthily”.

Independent Variable:
Definition: The variable in an experiment that is manipulated by the researchers. It also refers to a variable that is not affected by, but does affect, a dependent variable. In the example given under “dependent variable”, “age” would be an independent variable. “Employment” does not affect “age”, but “age” may have an impact on “employment”.

Meta-Analysis:
Definition: A term used to describe the method of combining and analysing data from multiple studies on the same subject.

Null hypothesis:
Definition: This term represents the default assumption that the variables of an experiment have no effect on the results. In the example given for “hypothesis”, the null hypothesis would be that “regular water and lots of sunshine will not help plants grow more healthily”.

P-Value:
Definition: The p-value expresses how likely it is that results at least as extreme as those observed would occur by chance alone (that is, if the null hypothesis were true), usually represented by a lower case p (for probability). For example, if you see p < 0.05 in a study, this tells you there is a less than 1 in 20 chance of obtaining such results through luck alone. Most researchers treat a p-value greater than 0.05 as meaning the results are not statistically significant, being too prone to chance to be considered reliable.
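As a rough illustration in code, the sketch below estimates a p-value by brute force: how often does a fair coin produce a result at least as extreme as 62 heads in 100 flips? The scenario is made up:

import random

# Estimating a two-sided p-value by simulation: how often would a fair
# coin land at least as far from 50/50 as an observed 62 heads in 100?
random.seed(1)
TRIALS = 100_000
extreme = 0

for _ in range(TRIALS):
    heads = sum(random.random() < 0.5 for _ in range(100))
    if abs(heads - 50) >= 12:  # at least as extreme as 62 (or 38) heads
        extreme += 1

print(f"Approximate p-value: {extreme / TRIALS:.3f}")  # ~0.02, i.e. p < 0.05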

Parameter:
Definition: “Parameter” refers to a summary – usually a percentage or average – that describes the entire population.

Population:
Definition: “Population” refers to any large group of objects or individuals about which information is desired, such as Germans, flowers or insects.

Random Sampling:
Definition: A sampling technique where individuals from a population are picked at random.

Regression Analysis:
Definition: A method of statistical analysis used to examine relationships between variables. Perhaps the easiest way to picture this is a scatter chart (a chart where the data points are marked with little dots). A simple regression is represented by the line of best fit drawn through those points.
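Here is a minimal least-squares fit in Python, drawing that line through a handful of made-up (temperature, ice cream sales) points:

# A minimal least-squares regression: fitting the line of best fit
# through made-up (temperature, ice cream sales) data points.

xs = [15, 18, 21, 24, 27, 30]        # temperature (degrees C)
ys = [120, 150, 180, 230, 260, 300]  # ice creams sold

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(f"Slope: {slope:.2f} extra sales per degree")  # ~12.19
print(f"Intercept: {intercept:.2f}")                 # ~-67.62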

Sampling error:
Definition: A term used to describe the degree to which results from a sample differ from the results that would be expected from the larger population.

Statistically significant:
Definition: A term used to indicate that a difference in results is unlikely to have occurred by chance alone.

Variable:
Definition: A characteristic or trait that varies between any group of objects or people. Race, gender, age and education are all examples of variables.

Weighted sample:
Definition: A correcting technique used to adjust the responses given by survey respondents so that the sample better matches the larger population. Typically this is done when certain demographics (based on race, age, etc.) are over- or under-represented in a survey and the researchers want their results to reflect the population as a whole.
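A toy example in code, with made-up shares, shows how weighting shifts an estimate when one group is under-represented:

# A toy weighting example with made-up numbers: the under-30 group makes
# up 40% of the population but only 20% of the survey, so it is weighted up.

survey_share     = {"under_30": 0.20, "30_plus": 0.80}
population_share = {"under_30": 0.40, "30_plus": 0.60}
yes_rate         = {"under_30": 0.70, "30_plus": 0.40}  # share answering "yes"

weights = {g: population_share[g] / survey_share[g] for g in survey_share}

unweighted = sum(survey_share[g] * yes_rate[g] for g in survey_share)
weighted = sum(survey_share[g] * weights[g] * yes_rate[g] for g in survey_share)

print(f"Unweighted 'yes' estimate: {unweighted:.0%}")  # 46%
print(f"Weighted 'yes' estimate:   {weighted:.0%}")    # 52%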


A word of caution: Correlation Vs Causation

A trap many people (and even some researchers) can fall into is to confuse correlation in data with causation. Although a trend may exist, it does not mean that one causes the other.
Correlation can be understood as recognising when two or more variables show a tendency to fluctuate together, but a change in one does not necessarily cause a change in another.
Causation can be understood as recognising when two or more variables show a tendency to fluctuate together and a change in one WILL cause a change in another.
As an example, we could look at profits from ice cream sales and warm weather. As the weather becomes warmer, we would expect to see companies sell more ice cream and thus make more profit. The correlation in this example is that ice cream sales increase with rising temperatures. The causation is that selling more ice cream leads to a rise in profits.

Ending Notes

Hopefully you have found the above piece useful and it leaves you feeling more confident about reading research papers for yourself.

In the near future, I will begin tackling claims made by different people as well as research pieces of interest. If there are any pieces you wish examined in more detail, please leave a comment below or contact me on twitter (@DrewRoanEgaFem).

Friday, 8 April 2016

EgaFem Analysis Part 1: A basic guide on what to look for in research papers

Author: Drew Roan

At Egalitarian Feminism, we strongly believe in making evidence-based claims and in using good quality research to back our assertions. We believe poor quality research can have the unfortunate effect of increasing people’s fears and creating hostility where it is not warranted. In striving for equality for all regardless of gender, we must make sure that any research we use is as fair and reasonable as possible. We are also keenly aware that research is sometimes misrepresented in the media, and that this can affect the assumptions people make.

Therefore, Egalitarian Feminism will be starting an “EgaFem Analysis” series where research is scrutinized in detail and any findings are posted here for anyone to read.

But we want to do more than just analyse data. We want to help people learn what to look for, how to understand academic pieces and facilitate themselves in being able to interpret research and come to their own conclusions, not to solely rely on the reports of others.

Here are some points I have found to be highly useful when examining research papers. Hopefully, you will too:

Do you know where the claim has come from?


Statistics and claims are often thrown around during discussions online and in the media. Ask yourself: “Do I know where this statistic has come from?” Have you seen any source material for it? If you have no idea where the claim has come from, can you honestly say you understand it and can trust it?

As a rule of thumb, if a claim is made but no evidence can be or has been provided for it, it’s best to assume that it may not be accurate or fully trustworthy.

Don’t rely on the summary. Skim the introduction. Study the method and the results.


A trap many people fall into is to read the summary and assume it reflects the research accurately enough that they feel safe using it as a source. This is a bad idea. The summary at most gives you a snapshot of what the researchers want you to take away from their study. It tells you nothing about the methods, results, conclusions, errors or disclaimers that might complicate the picture.

Introductions are usually filler text to explain the background and the necessity for the study in the first place. That said, it is always worth skimming through an introduction as it can yield a lot of information about the researchers’ point of view before conducting the study. Researchers who make bold or extraordinary claims, especially if those claims lack citations, may be more inclined to offer up incomplete or misleading research.

It cannot be stressed enough that reading the method and results are essential to understanding any research piece.

Always check the sample size and neutrality.


Sample size and bias can massively alter the quality of the results. A study with a small sample size might produce disproportionately large or small results that do not accurately reflect the wider population. Samples made up of volunteers often carry a higher response bias than samples that are randomly selected.

On rare occasions, participants in a sample may have been “coached” to elicit certain responses before participating in the research. Again, this can massively distort the results and potentially make the research extremely unreliable.

Who funded and conducted the research?


A commonly overlooked point to consider is where the money for the research came from. It is not unheard of for researchers to be funded by specialist or advocacy groups to conduct a study on a particular topic. Whilst it should be expected that researchers will always remain impartial, this is not always the case. You might find a study on the “harms of sugar on the brain” being funded by diabetes research groups, or a study on the benefits of alcohol being privately funded by alcohol manufacturers. This does not necessarily invalidate a piece of research, but it may warrant taking the findings with a pinch of salt.

Of equal interest is the question “who conducted the research?” Was it conducted by a survey group paid to find data on a particular topic, or by professors with a known history of a particular ideological bias? All people are capable of being influenced by their own biases, and this can be reflected in their research. Once again, it does not necessarily invalidate a piece of research, but it may give a reason to be cautious when repeating the findings.

Are you sure you understand their definitions?


Another common trap to fall into is to assume the definitions the researchers are using match up to legal or common use definitions. Equally, some terms that are frequently used in regular conversations may have a different meaning when applied in a certain context. Some examples might be:

  • Confusing “wage” (pay per hour) with “salary” (pay over the course of a year).
  • Mixing up “sexual offences” with “sexual assault”.
  • A researcher using a non-legal definition of “discrimination” that includes examples which may not match the legal definition.

Always make sure that you are clear on exactly what the researchers mean when using particular phrases.

If the research includes a survey, are the questions clear cut and straightforward to respond to?


Researchers have long known that asking a question directly often does not yield particularly fruitful responses. Instead, questions with more ambiguous wording often yield more responses, though this sometimes comes at the risk of inaccurate reporting and artificially inflated figures.

For example, consider the following two questions:
  • Have you ever been raped whilst drunk?
  • Have you ever had sexual intercourse when you did not want to whilst under the influence of alcohol?
On the surface, these two may appear to be the same in nature. In reality, the second version may yield a greater response rate, but may also capture occasions that were not rape (such as drunken one-night stands, incidents of cheating and so forth).

 Have the researchers used all the available data in their conclusions?


Always check to see if all relevant and available data has been used in the conclusions of the research. If data has been left out, why? Would it have affected the results? Was that data potentially relevant to the overall piece?

Whilst it may not necessarily invalidate the research in question, if researchers have chosen to leave out certain results or data sources, it may lead others to misinterpret the information, or to claims being made which lack context. It may even lead the researchers themselves to draw a potentially unreasonable conclusion.

A little critical thinking is a great thing.


One of the best things you can do with any piece of research is to think critically about the findings, method and conclusions. Be mindful of potential flaws in the study. What would you have done differently? Would you have come to the same conclusions? Have you double-checked their mathematics to see if the numbers add up properly? Not all criticisms will be reasonable, but you might be surprised what issues you can find with a research piece if you examine it in more detail.

I hope you have found the above points useful in some way and that in turn, you feel more confident in reading research papers for yourself. My next piece will be a basic glossary of technical terms you may frequently encounter to help further deepen your understanding.

If you come across any piece of research or a claim you would like examined, feel free to leave a comment and let us know.

Saturday, 2 April 2016

Campaign: Rape - Recognising Women's Impact and Agency

Update 28/09/2016: new petition has been launched - let's get this one debated in parliament!

Articles in this series:
Gendered Equality of Opportunity: http://egafeminist.blogspot.co.uk/2016/03/gender-equality-of-opportunity.html
Definition of UK rape law: http://egafeminist.blogspot.co.uk/2016/03/rape-definition-and-nounswap-test.html
Campaign Info: This article
Impact on Reports: http://egafeminist.blogspot.co.uk/2016/04/rape-impact-on-reports.html
Denial of Women’s Impact and Agency: http://egafeminist.blogspot.co.uk/2016/05/rape-denial-of-womens-impact-and-agency.html
Response to the GovUK Reply: http://egafeminist.blogspot.co.uk/2016/05/rape-uk-governments-response.html

Author: Blaise Wilson

Egalitarian Feminism didn't write this petition, but we support it. This petition needs to reach 100,000 UK signatures before September 2016. The sooner the better to send a strong message to the UK Government that this is a serious issue.

If you're not in the UK, please spread the word and advertise it. The more people who share it, the better. Once the UK changes its law, it will put pressure on other countries to follow suit.

Sign and Share:

Share it on Twitter, Facebook, Tumblr, and every other social media outlet. But don't stop there! Tell your friends and family. Get the message out in as many ways as possible. And send this to any YouTube personalities, politicians, feminists, and Twitter accounts you think would be interested. Check out this page for ideas of who to send it to: http://egafeminist.blogspot.co.uk/2016/01/egafems-campaign-stakeholders.html

Here is a useful image that summarizes the petition. Download and share it - but if you make any changes please ensure the link to the petition is included.


If you have twitter, please retweet this:





And for all your feminist friends out there:

Here is a useful summary on why this is a feminist issue. Please download it and use it to support the campaign and send it to all the feminists you know.

If a feminist believes in true equality, at all levels and not only when it is convenient to women, they should support this petition and help spread the word.

It doesn't matter if they are Opportunity or Outcome Feminists, as this issue impacts both. Women's actions should be recognised as equal to men's, and forcing someone to have sex is rape. Women should have equal outcome and opportunity under the law when they choose to commit the same crime as men.

However, if they believe in female supremacy, that women are incapable of evil or are a 'special case', then they don't believe in equality and can't be feminists. This isn't a 'no true Scotsman' fallacy: the very definition of feminism is the belief that women are equal to men, including in their actions, and should be treated equally by society and law. If they don't believe women should be equal, they are not feminists.

Only a misogynist would deny women their impact and agency. Only someone who doesn't believe women are capable would claim they cannot choose to commit wrongdoing because of their gender. If women's wrongdoings are not recognised, if women shouldn't be held equal to men, then why should the good that women do be equally recognised? If women are a 'special case' to be handled carefully, as if they were fragile pieces of glass, then why should women be paid equally? Why should women be given equality when they are good, if they have less responsibility, impact and agency than men?

Either you believe women should be equal, or you don't. You don't get to pick and choose equality only when it benefits women.


Previous Article: Definition of UK rape law: http://egafeminist.blogspot.co.uk/2016/03/rape-definition-and-nounswap-test.html

Next Article: Impact on Reports: http://egafeminist.blogspot.co.uk/2016/04/rape-impact-on-reports.html