Saturday, 30 April 2016

EgaFem Analysis Part 3: How poorly created statistics can create false narratives.

Author: Drew Roan

My previous pieces in this series covered what to look for when reading research pieces and provided a beginners’ glossary of research terms.

This time, I want to demonstrate how careless representation of data can lead to extremely misleading claims, even if well intentioned.

The example I am using is an older infographic created by RAINN (Rape, Abuse and Incest National Network). [1]

I should make it clear that whilst RAINN have since provided an updated version, that version still operates on the same principles as this one:

On their website, RAINN list reports from both the FBI and the Department of Justice as the sources for their data. [1] This gives us a good opportunity to track down the data used.

Having understood the claim and learned where the sources come from, we should look at their methodology.

32 are reported to the police

The data for this claim was taken from the National Crime Victimization Survey, 2008 – 2012. [2] The infographic tells us that 68 out of 100 rapes (or 68 per cent) weren’t reported to law enforcement over this time.

The method used to produce this estimate begins with a survey of 90,000 households, which collects self-reports of crime victimization. [13] Once this data has been collected, a series of complex calculations is used to extrapolate estimates for the larger population. Finally, this estimate is cross-referenced with the data collected by police forces on the total number of reports lodged with them for each type of crime. The difference between the survey estimate and the police records is then considered the “estimated unreported incidences” of a crime. You can see a full breakdown of the methodology below. [3]
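As a rough sketch of the subtraction described above, the snippet below derives the unreported share from a survey-based estimate and a count of police reports. The per-100 figures are the ones presented in the infographic; the function name `unreported_share` is a hypothetical helper introduced purely for illustration and is not part of any published methodology.

```python
def unreported_share(estimated_incidents: float, reported_to_police: float) -> float:
    """Return the fraction of estimated incidents with no matching police report."""
    if estimated_incidents <= 0:
        raise ValueError("estimated_incidents must be positive")
    # The "unreported" figure is never observed directly: it is a difference.
    unreported = estimated_incidents - reported_to_police
    return unreported / estimated_incidents


# Per-100 figures as presented in the infographic: 32 of every 100 are reported.
share = unreported_share(100, 32)
print(f"{share:.0%} estimated unreported")
```

The point to take from this is that the unreported figure inherits any uncertainty in both inputs, since it is only ever calculated and not counted.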

So far, so good.

However, there are several caveats to consider:

i) It’s worth bearing in mind that the total estimate RAINN use in their initial calculation is NOT a total estimate for “rape”, but a total estimate for “rape/sexual assault”. [4] It is possible that, since states use varying definitions of “rape/sexual assault”, the Justice Department simply wished to create a broader term under which to collate the data. However, this is purely speculation on my part. The reality is that rape is not the same as sexual assault, and conflating the totals only muddies the waters.

ii) It would seem RAINN are aware of this distinction and have ignored it, judging by their own phrasing. You can see in the text above they refer to “sexual assault”, then refer to “rape” in the infographic, despite the fact the data source they used does not make this distinction. [1][4]

iii) A final note of caution here: you cannot treat a “report” as a “confirmed incident”, which RAINN have done in their total estimates. We have no way of telling, for example, how many cases had little or no evidence, how many allegations were withdrawn or how many may have been purposefully false.

It’s worth noting that for all of their claims, RAINN compare the combined survey estimate for “rape/sexual assault” against the total estimates for arrests, referrals to prosecutors, convictions and those who spend a single day in prison.

As for the Department of Justice’s definition of “sexual assault”, according to their website it is:
“…any type of sexual contact or behavior that occurs without the explicit consent of the recipient. Falling under the definition of sexual assault are sexual activities as forced sexual intercourse, forcible sodomy, child molestation, incest, fondling, and attempted rape.” [5]

Whilst all of those things should be treated with the utmost seriousness, conflating a broad spectrum of sexual offences with rape is simply not helpful in any way.

7 Lead to an arrest

The second source used by RAINN is the FBI’s Uniform Crime Reports, focusing on arrest data over the duration of 2006-2010. I have included a link to the summary of their 2010 data below. [6]

Their claim that 7 out of 100 incidences lead to an arrest comes from collecting the total arrest rates for rape over a 5 year period from the FBI databases and comparing that to the data provided by the Justice Department on the total estimated rape/sexual assault numbers.

To use an example based on just one year’s data, in 2010 there were an estimated 188,380 “rape/sexual assault” incidences according to the Justice Department. [7] According to the FBI, in 2010 there were an estimated 20,088 arrests for “forcible rape”. [6] Therefore:

20,088 ÷ 188,380 × 100 ≈ 10.7%

By this calculation, we can assert that the total estimated arrests for “forcible rape” (by the FBI’s figures) amounted to close to 11% of all estimated incidences of rape/sexual assault (by the Justice Department’s figures) for 2010.

There is a serious issue here. The FBI and the Justice Department record rape and sexual assault differently. Whereas the Justice Department data combines “rape/sexual assault” in a single category, the FBI has separate categories for “forcible rape” and “sexual assault (excluding forcible rape and prostitution)”. Yet in their analysis, RAINN looked specifically at arrest rates relating to “forcible rape” whilst ignoring “sexual assault” arrest rates in the process.

The only reason for doing this is to inflate the number of “rapes” in one category while minimizing the number of arrests in the other. In this way, RAINN have managed to present a misleading narrative that does not compare like with like.

There is another point to consider here, out of interest. The FBI have their own estimate of rapes reported to law enforcement, which comes in at 84,767 incidences in 2010 alone. [8] Had RAINN compared the total estimated rapes reported to law enforcement with the total estimated arrests using the same database, the calculation would have looked like this:

20,088 ÷ 84,767 × 100 ≈ 23.7%

We can see when comparing these two figures that almost 1 in 4 estimated reports of rape to law enforcement led to an arrest in 2010. This is far from definitive and ignores estimates of unreported crimes. However, it also paints a very different picture and could even be used to suggest that law enforcement is more proactive than RAINN’s infographic suggests.
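To make the effect of the denominator concrete, here is a small sketch that divides the same 2010 arrest figure by each of the two totals quoted above. The figures are the ones cited in this piece [6][7][8]; the constant names and the `rate` helper are illustrative assumptions of mine and do not come from either agency.

```python
# 2010 figures as quoted in this piece.
FBI_ARRESTS_FORCIBLE_RAPE = 20_088   # FBI arrests for "forcible rape" [6]
DOJ_RAPE_SEXUAL_ASSAULT = 188_380    # Justice Department survey estimate [7]
FBI_REPORTED_RAPES = 84_767          # FBI estimate of rapes reported to police [8]


def rate(numerator: int, denominator: int) -> float:
    """Return numerator as a percentage of denominator."""
    return 100 * numerator / denominator


# Same numerator, two different denominators.
mixed_sources = rate(FBI_ARRESTS_FORCIBLE_RAPE, DOJ_RAPE_SEXUAL_ASSAULT)
same_source = rate(FBI_ARRESTS_FORCIBLE_RAPE, FBI_REPORTED_RAPES)

print(f"Arrests vs. survey estimate of rape/sexual assault: {mixed_sources:.1f}%")
print(f"Arrests vs. FBI estimate of reported rapes: {same_source:.1f}%")
```

Nothing about the arrest figure changes between the two lines of output. Only the choice of what it is divided by moves the result from roughly 1 in 10 to roughly 1 in 4.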

One final point to consider here is that there is a discrepancy with the numbers on “rape/sexual assault” available from the Justice Department. For reference, this chart details the “rape/sexual assault” estimates from 2010-2011, as cited by [9]:

Whereas this one details the 2009-2010 estimates, as cited by [10]:

Note how the estimate for 2010 differs between the two charts by 80,190 estimated incidences. At this time, I am unable to confirm how this discrepancy has occurred and whether or not RAINN factored these differences into their infographic.

If anyone is curious, I have included a link to another piece, “Victimizations Not Reported to the Police, 2006-2010”, which offers some explanations as to why crimes were not reported. [11]

3 were referred to prosecutors

This data is relatively simple to work out. RAINN cite “Uniform Crime Reports, Offenses Cleared Data” from the FBI over a 5 year period. I have provided a link to the 2010 data. [12]

A “cleared offense” can come about in one of several ways, but the term typically refers to a case being “closed” and usually turned over for prosecution. RAINN appear to have taken the overall “clearance rate” for each year, then compared that to the FBI’s arrest rates (for example, in 2010, 40.3% of arrests for forcible rape ended in the case being “cleared”) and worked out a broad average.

The clearance rates for 2006-2010 according to the FBI stood at:

40.9, 40.0, 40.4, 41.2 & 40.3 per cent

RAINN have then taken 40% of total estimated arrests in the infographic (in this case, 7) and then referred to them as “cases referred to prosecution” (a total of 3), which is a simple but mostly reasonable assumption to have made.
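The averaging step described above can be sketched as follows. The five yearly clearance rates and the figure of 7 arrests per 100 are taken from the text. Applying the mean rate to the arrest figure and rounding to the nearest whole number is an assumption about how the 3 was reached, not a documented RAINN method.

```python
from statistics import mean

# FBI clearance rates for forcible rape, 2006-2010, in per cent (from the text).
clearance_rates = [40.9, 40.0, 40.4, 41.2, 40.3]
arrests_per_100 = 7  # arrests per 100 incidences, per the infographic

average_rate = mean(clearance_rates)

# Assumption: the average rate is applied to the arrest figure, then rounded.
referred_per_100 = arrests_per_100 * average_rate / 100

print(f"Average clearance rate: {average_rate:.2f}%")
print(f"Referred per 100: {referred_per_100:.1f}, or about {round(referred_per_100)}")
```

Under these assumptions the result is just under 3 per 100, which is consistent with the infographic rounding up to a whole number.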


In order to save on the word count, I will say that the remainder of the infographic, surprisingly, is loosely accurate, though the relevance & use of claiming "the other 98 will walk free" is up for debate.

Unfortunately, RAINN have chosen to conflate “rape/sexual assault” estimates with data specifically on “forcible rape”, drawn from two different data sources. This is then followed by purposefully treating “reported & unreported estimates” as meaning “confirmed incidence of rape”, which the data does not tell us.

To top this off, these looser estimates on possible criminal incidences are then compared to arrest rates based on a much stricter definition, further increasing the perceived disparity between criminal activity and actions taken by the justice system.

This is precisely the reason why we must take time to understand what data is used to make a claim and HOW the claim is made, and not simply repeat a claim uncritically. Misleading claims, even if well intentioned, can have disastrous consequences and impractical real-world results.

I hope you have found this to be useful.


1 comment:

  1. My first interpretation of the graphic at face value would also indicate that rapes aren't equal to rapists: multiple accounts of rape from one perpetrator automatically inflate this statistic, which actually makes "98 walk free" patently false.