Coding a murderer
Discover how algorithms put serial killers at the centre of the map – Tamsin Oxford investigates the Murder Accountability Project
How the Murder Accountability Project is helping to catch killers
One journalist, a story on prostitution, a passion for open source software, and an old yet powerful database led to the creation of the Murder Accountability Project (MAP, www.murderdata.org). The MAP involves finding serial killers, preventing murders and connecting the statistical dots. It’s also a project that has a fascinating back story.
Thomas Hargrove – the journalist – had purchased a Uniform Crime Report from the University of Missouri while doing research on a story about prostitution in 2004. The university threw in the Supplemental Homicide Report at no extra cost, and this free, data-heavy document changed the course of Thomas’s life.
“The document contained row after row of information about individual murders that covered everything from the month, the year the murder happened, and the jurisdiction,” says Thomas. “The file also contained data around the age, sex, method of killing, race and the police theory around the killing, plus the offender’s details if the information was available.
“The moment I saw this file I was asking myself one very important question: would it be possible to use this data to teach a computer to detect serial murders?” he continues. “Could I use open source tools to build a platform that enables people to access this data and understand it in ways that allow for these murders to be solved more effectively?”
Today, MAP is a viable open source, non-profit organisation with an algorithm that is ‘capable of detecting serial killers who target multiple victims using similar methods of killing within a specific geographic region’. The platform takes a depressingly endless list of violent deaths and transforms it into a visual tool that highlights patterns and trends which may have been previously undetected by the FBI or the police. It also neatly helps to solve one of the biggest challenges facing police detectives – ‘linkage blindness’ – where they don’t recognise the link between one case and another, or that there may be a common offender involved. It’s understandable: different people working different cases reported in different ways would make it hard to pick up similarities. It’s something MAP can support detectives in overcoming, because it connects the data dots to create visual patterns and highlight similarities.
The project has undergone several iterations since Thomas first held the data in his hands back in 2004. It wasn’t easy finding the algorithm and teaching the code to translate the data in meaningful ways. Today, MAP has evolved thanks to inventive code, determined development and a passion for solving the unsolved…
Once Thomas discovered the data in 2004, he spent the next six years trying to persuade his editor to allow him to test his theory. He knew that the information the university had given him could be used to help solve murders; he just needed the time and the budget. After years of pushing the right buttons he finally got the chance to use the data while working on a project known as Murder Mysteries in 2010.
The project won awards and raised just the right amount of awareness to allow for Thomas to take his work to the next level. The University of Missouri offered him the support of a talented Masters candidate, Liz Lucas, and so started the hunt for the perfect algorithm.
“We found hundreds of things that didn’t work,” says Hargrove. “We tested to see if an elevated rate of murders would indicate the presence of a serial killer, and it failed. We tested to see if there was an elevated rate of female murders, and this didn’t deliver results. We tested to see if there was an elevated rate of murder for a particular type, such as strangulation, and this was also a no.”
The team tried a lot of variations, but ultimately it was a serial killer who taught them what would work: the Green River Killer, Gary Ridgway. He had left a series of bodies behind him as one of the most prolific killers in the United States. He strangled 49 women before he was caught and confessed to the crimes.
“We knew there had been a serial killer in Seattle and we wanted to know if we could craft a computer program that would alert us to this pattern in Seattle at that time,” says Hargrove. “We tried all these different combinations to see if Seattle would appear and nothing worked. Then we tried the rates of unsolved murders, which also didn’t really impact on the results. Finally, what worked was cluster analysis.”
It was a blend of the victim’s gender, location and method of murder that finally worked. Originally, the data used age, but this metric has been subsequently discarded because it had a minimal impact on results. The murders were then allocated a murder group number based on these four categories and the system used this structure to create around 100,000 groups.
“Then we told the computer to calculate how often, in each of these groups, the murders were solved, and to alert us to any large groups of murder clusters that had low rates of resolution,” Hargrove says. “It worked like a dream. We had finally hit upon the algorithm. We then applied it to the area where the Green River Killer had been active and there it was – a huge bubble over Seattle. Another serial killer came right after Ridgway and the algorithm detected him easily.”
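The grouping-and-counting idea Hargrove describes can be sketched in a few lines of Python. This is a minimal illustration rather than MAP’s actual code: the field names (location, victim_sex, method, solved), the minimum group size and the clearance threshold are all assumptions chosen for the example, and the sample records are invented.

```python
from collections import defaultdict


def find_suspicious_clusters(records, min_size=5, max_clearance=0.33):
    """Group murders by (location, victim sex, method) and flag large groups
    with a low share of solved cases.

    The thresholds are illustrative assumptions, not MAP's real settings.
    """
    groups = defaultdict(lambda: [0, 0])  # key -> [total cases, solved cases]
    for rec in records:
        key = (rec["location"], rec["victim_sex"], rec["method"])
        groups[key][0] += 1
        if rec["solved"]:
            groups[key][1] += 1

    flagged = []
    for key, (total, solved) in groups.items():
        clearance = solved / total
        if total >= min_size and clearance <= max_clearance:
            flagged.append((key, total, clearance))

    # Biggest groups first; among equal sizes, lowest clearance first.
    flagged.sort(key=lambda item: (-item[1], item[2]))
    return flagged


# Invented sample data: County A has 10 similar cases, only 2 of them solved.
SAMPLE = (
    [{"location": "County A", "victim_sex": "F", "method": "strangulation",
      "solved": i < 2} for i in range(10)]
    + [{"location": "County B", "victim_sex": "M", "method": "firearm",
        "solved": i < 7} for i in range(10)]
    + [{"location": "County C", "victim_sex": "F", "method": "strangulation",
        "solved": False} for _ in range(2)]
)

if __name__ == "__main__":
    for key, total, clearance in find_suspicious_clusters(SAMPLE):
        print(key, total, round(clearance, 2))
```

In this toy data only the County A group is flagged: County B is large but mostly solved, and County C is unsolved but too small to count as a cluster.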
The algorithm was now capable of finding dozens of clusters all over the country, a plethora of deadly bubbles that indicated an excess of murder, which may or may not indicate that a serial killer is at work. Some are known, some are not.
The tech behind the algorithm
“We have used a variety of different solutions to create MAP,” explains Hargrove. “We have proprietary software called Tableau (www.tableau.com) that enables us to visually display the data on the website, but as we’re committed to ensuring all the data and tools are open source and accessible, we also make the raw data available to anyone to use on any platform they wish.”
Visitors keen to turn their eye to unlocking patterns and potentially saving lives can use a free copy of Tableau, download the workbook created by MAP and manipulate the data. It can’t be saved onto a local hard drive or server, but it does enable anyone to work with the technology and the data using the same system as MAP. The team of the non-profit organisation, now extended to include a vice chairman, treasurer and board of directors, uses Tableau because it provides them with the tech they need to display large sets of data on the website.
“We have ensured that the base data and everything that we have built onto the platform is not proprietary,” says Hargrove. “We are dedicated to making homicide data more readily available to the world, which will hopefully encourage the public to review murder occurrence data and easily access the information. Our target audience is, of course, homicide detectives and we wanted a system that would allow them to call up records on their own and share them with other departments.”
MAP is designed to be as open a platform as possible, to ensure it can be accessed across jurisdictions, US counties and varying levels of technical ability. The team isn’t interested in secrets or mystery or obfuscation – they take the data that the FBI has openly provided and use it to enhance and expand their capabilities and the efficiency of the algorithm.
“The report provided by the FBI accounts for all the homicides in each jurisdiction, along with case-level detail that includes around 30 variables for each individual,” says Hargrove. “We have also combined this with the data gathered by regional police departments. Many of these don’t collaborate with the FBI, so we have added their information to the data that the FBI has given us to create an incredibly detailed database. We currently have case-level details on 752,000 murders from 1976 to the present. You won’t find that level of insight anywhere else.”
This data is also freely accessible to the public and can be downloaded as UCR (Uniform Crime Reporting) reports. These reports are assembled using the IBM program SPSS Statistics, which is capable of managing sophisticated operations and delving into the database to impressive depths. Because this is a proprietary system, MAP has opted to use PSPP (www.gnu.org/software/pspp) as the open source alternative. This GNU project runs on multiple platforms, which underpins the MAP philosophy: everyone should be able to play a part in detecting serial murderers.
“We advise people to rather use PSPP because it can be downloaded at no extra cost – many people have done just that as they want to work with the datasets we have created,” says Hargrove. “Making the entire project open source is incredibly important to us as there’s still so much more work to be done. The murder resolution rate has been declining and we believe that by encouraging people to come to our site and download the workbooks, raw data and data dictionaries they can call us with questions and suggestions. This is the entire principle that drives the open source community and can help us save lives.”
The data collected by MAP from the FBI is decidedly user-unfriendly. Written in Cobol, it comes with some oddities and complexities that impact on its accessibility, but this was the language that was common when the FBI started assembling the data.
To resolve the challenges of shifting the data from complex to accessible, MAP has written sophisticated syntax commands that take these wads of data and assemble them into the PSPP files. These can be further split out into CSV files which, Hargrove admits, may not be as useful as PSPP but are a lingua franca that every statistical package can open.
“These syntax commands are also available to anyone who wants them,” says Hargrove. “They have made a laborious process far easier. They also enable us to expand the searches more widely – from only getting insights from a specific jurisdiction to insights across borders, because criminals don’t respect geopolitical boundaries. This entire process is made very simple with the PSPP platform.”
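To give a flavour of what converting mainframe-era records into CSV involves, here is a rough Python sketch. It assumes the raw data arrives as fixed-width text lines, which is typical of Cobol-era files but is not confirmed by MAP for this dataset. The column layout, field names and sample record are entirely hypothetical; MAP’s own syntax commands are written for PSPP, not Python.

```python
import csv
import io

# Hypothetical fixed-width layout: (field name, start offset, end offset).
# This is NOT the FBI's real record layout; it only illustrates the idea.
LAYOUT = [
    ("state", 0, 2),
    ("year", 2, 6),
    ("month", 6, 8),
    ("victim_sex", 8, 9),
    ("weapon", 9, 11),
]


def parse_record(line):
    """Slice one fixed-width line into named fields."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


def fixed_width_to_csv(lines):
    """Convert an iterable of fixed-width lines into CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=[name for name, _, _ in LAYOUT],
        lineterminator="\n",
    )
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(parse_record(line))
    return out.getvalue()


if __name__ == "__main__":
    # Invented record: state "XX", January 1990, male victim, weapon code 11.
    print(fixed_width_to_csv(["XX199001M11"]))
```

Once a layout is captured in one place like this, the same definition can be reused for every year of data, which is the kind of labour-saving the syntax commands are said to provide.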
The records are assembled and put into multiple formats so they can be downloaded on demand. This is the data that is then loaded into Tableau to showcase all the information in a visual format – age, gender, method, location, type of murder and more – and allows for quick cross-tabulation. For detectives who work with limited time and plenty of information, Tableau helps them to test a theory or determine if their type of crime has occurred in different jurisdictions.
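For readers who prefer code to dashboards, the cross-tabulation described above can be approximated with Python’s standard library. This is only a sketch of the general technique, not a reproduction of what Tableau does internally, and the field names and records are made up for the example.

```python
from collections import Counter


def cross_tabulate(records, row_field, col_field):
    """Count records for every combination of two fields.

    Returns a nested dict: table[row_value][col_value] -> count.
    """
    counts = Counter((rec[row_field], rec[col_field]) for rec in records)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}


if __name__ == "__main__":
    # Invented records with hypothetical field names.
    cases = [
        {"jurisdiction": "A", "method": "firearm"},
        {"jurisdiction": "A", "method": "knife"},
        {"jurisdiction": "B", "method": "firearm"},
        {"jurisdiction": "B", "method": "firearm"},
    ]
    for jurisdiction, row in cross_tabulate(cases, "jurisdiction", "method").items():
        print(jurisdiction, row)
```

A detective’s question such as “has this method turned up in neighbouring jurisdictions?” then becomes a matter of reading across one column of the table.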
“Over the past year we have added further refinement to the system, where we now visually show murder clusters identified by the algorithm we’ve developed,” adds Hargrove. “There’s a kind of magic to carefully counting things and uncovering answers. Our algorithm counts through the three-quarters of a million records looking for clusters that have an elevated probability of being serial murders and puts them on this visual map that reveals serial murder potential.”
So the system works – and now the team is overwhelmed with cases. Cases that need to be solved because, as Hargrove concludes, “Budgets, limited resources, increased death tolls – all these factors are influencing the ability of the authorities to solve crimes. What makes this more of a concern is that if murders go unsolved it inspires even more murders. Whenever a killer gets away with a crime, he becomes a living testament to a system that isn’t working. Murder begets murder, and clearance reduces murder.”
SPSS Statistics www.ibm.com/uk-en/marketplace/spss-statistics
Supplemental Homicide Report https://ucr.fbi.gov/nibrs/addendum-for-submittingcargo-theft-data/shr
University of Missouri https://missouri.edu
Uniform Crime Reports https://ucr.fbi.gov
Above Big graph, bigger problems: murder is on the rise in the US
Thomas underscoring the value of murder accountability at the Investigative Reporters and Editors (IRE) annual conference