Friday, January 13, 2017

Open Letter to the Atypical Homicide Research Community


I have crafted this open letter to you – members of the atypical homicide research community – as a notification that I have resigned as a longstanding contributor to and promoter of the Radford/FGCU Database Project after being accused by its organizer of misappropriation.

In the interest of transparency, the charge came after users of the online data science platform Kaggle were provided with access to the collectively built serial homicide dataset in an effort to address its limitations and gain answers to several outstanding research questions. A research space on the Harvard Dataverse was also established so that all collaborators could be credited for their contributions – for the first and only time in four years – via the following citation:

Aamodt, Michael; Fox, James Alan; Hickey, Eric; Hinch, Ronald; Labuschagne, Gerard; Levin, Jack; McClellan, Janet; Nelson, Bryan; Newton, Michael; Quinet, Kenna; Steiger, Cloyd (HITS); White, John; Yaksic, Enzo, 2017, "Consolidated Serial Homicide Offender Database", doi:10.7910/DVN/4N0RND, Harvard Dataverse, V1[i]

I have since been instructed that I “must” dismantle these pages and remove all references to the citation. To placate my longtime colleague, I complied with the demands. But I came to realize upon reflection that my participation in an initiative that wittingly subsumes all other efforts while presenting a face-forward view of being a separate and unique entity cannot continue. Doing so directly conflicts with the principles engendered in me by my work in open science.

Encouraging open and wide collaboration among researchers has been my primary mission for more than a decade, but I am now forced to confront the futility of this idealistic, foolish and Sisyphean errand. The exploration of the database on Kaggle was characterized as “damaging” even after ample evidence was offered that users had created visualizations using tools like Python, and that their output featured statistical analyses serving to move our field beyond a reliance on the typical and somewhat limiting descriptive variety.

I can no longer endorse or contribute to the continual presentation of work as one’s own while this organizer has accepted hordes of information from sources that remain uncredited. This practice is not only unethical but dangerous in that it lessens the legitimacy of the parent dataset and, as a result, the validity of its contents. Warehousing data and constricting access and use of it to a band of hand-selected individuals and media contacts encourages bias and favoritism and, in turn, influences results and impacts research outcomes. Acting as an arbiter of freely available information and exercising total control over inclusion and exclusion calls into question the methods used to assemble databases and prevents the timely publication of academic articles.

Openness to critiques is a prerequisite for occupying positions as scientists and is in the nature of one’s assumed status as a subject matter expert. Triaging requests for access and granting fewer than 0.01 percent of them should rightly be seen as a protection of a product that may, under scrutiny, be revealed to be either a useful or worthless resource depending on how it responds to the techniques applied to it by those in the data science community. I wholeheartedly came to believe, in my naivety, that such are the risks and leaps we are required to take in the altruistic advancement of science and that improvement was predicated on building upon each other’s work. Instead, it was demonstrated to me that one individual could command an entire domain if their instinct to trump other efforts is strong enough.

It may be appropriate to create a record detailing the proper version of history in light of my recent experience, as some members of this collaborative may be unaware of how the group came to be in its current form. This resource exists due solely to an unabashed hunt for data. Kenna Quinet shared her database with Jamie and Jack’s Extreme Killing research team eight years ago, and shortly thereafter I disseminated a survey to thought leaders (many of you were respondents) in an effort to gauge how common a practice it was to freely offer up one’s work. It was quickly apparent that data was rarely traded, an unfortunate limitation that many of you had been privy to and had worked around for decades. In what can be described as a long shot, I began approaching all those who claimed to maintain serial murder data and, after trading the EK dataset, had received seven datasets. All collection efforts eventually shifted towards building out the Radford “brand” and the original consortium was abandoned to devote resources to the larger effort. Although I engaged in this process willingly as it felt like a natural progression at first, I can only equate the process to what founders probably go through after being acquired by a corporation – the slow dissolution of preexisting endeavors over a period of time.

Now and again it is good to review our current course while taking stock of achievements and how our attitudes impact others. The presence of my beliefs began to create strife for the creator of a product so it is only fair that I step aside so that business can continue unabated and without objection. I have resolved in this new year to attempt a return to a time when interesting projects were sought on their merit rather than as a byproduct of forwarding the stature of those with agendas that stray from open science. My reputation will undoubtedly suffer for choosing to break away from the pursuit of notoriety and media releases but I take solace in knowing that my integrity remains intact.

Although I report with regret that the larger data sharing effort within the realm of atypical homicide research has been an abysmal failure, I would like to encourage folks to trust the process of open data science and direct attention towards a paper published in Nature Human Behaviour just yesterday. The authors of the article advocate for the promotion of transparency and open collaborations in team science in the context of reproducibility. This message has delivered a measure of consolation to me at this difficult time in my career.

Please do not allow my experiences to dissuade you from engaging in an exciting arena filled with possibilities such as answering the Real-Time Forecasting Challenge or applying for the 2017 Risk Terrain Modelling Exemplar Award. There will be an Open Data Science Conference in Boston in May and as I rebuild the Serial Homicide Expertise and Information Sharing Collaborative database I will attend in the hopes of learning best practices in team building and establishing deep and truly collaborative relationships in conjunction with open data.

All the best,

Enzo Yaksic


[i] The relationship I had cultivated with agents of the FBI’s BAU-5 has gone unnoticed, but I commend their willingness to dedicate time to correcting errors within the data and they should be acknowledged publicly.
