I have crafted this open letter to you – members of the
atypical homicide research community – as a notification that I have resigned
as a longstanding contributor to and promoter of the Radford/FGCU Database
Project after being accused by its organizer of misappropriation.
In the interest of transparency, the charge came
after users of the online data science platform Kaggle were provided with
access to the collectively built serial homicide dataset in an effort to
address its limitations and gain answers to several outstanding research
questions. A research space on the Harvard Dataverse was also established so
that all collaborators could be credited for their contributions – for the
first and only time in four years – via the following citation:
Aamodt, Michael; Fox, James Alan; Hickey, Eric; Hinch, Ronald; Labuschagne,
Gerard; Levin, Jack; McClellan, Janet; Nelson, Bryan; Newton, Michael; Quinet,
Kenna; Steiger, Cloyd (HITS); White, John; Yaksic, Enzo, 2017, "Consolidated
Serial Homicide Offender Database", doi:10.7910/DVN/4N0RND, Harvard Dataverse,
V1[i]
I have since been instructed that I “must” dismantle
these pages and remove all references to the citation. To placate my longtime colleague,
I complied with the demands. But I came to realize upon reflection that my
participation in an initiative that wittingly subsumes all other efforts while
presenting a face-forward view of being a separate and unique entity cannot
continue. Doing so directly conflicts with the principles engendered in me by
my work in open science.
Encouraging open and wide collaboration among
researchers has been my primary mission for more than a decade but I am now
forced to confront the futility of this idealistic, foolish and Sisyphean
errand. The exploration of the database on Kaggle was characterized as
“damaging” even after ample evidence was offered that users created
visualizations using tools like Python, whose output featured
statistical analyses serving to move our field beyond a reliance on the typical
and somewhat limiting descriptive variety.
I can no longer endorse or contribute to the
continual presentation of work as one’s own while this organizer has accepted troves
of information from sources that remain uncredited. This practice is not only unethical
but dangerous in that it lessens the legitimacy of the parent dataset and, as a
result, the validity of its contents. Warehousing data and restricting access
to and use of it to a band of hand-selected individuals and media contacts encourages
bias and favoritism and, in turn, influences results and impacts research
outcomes. Acting as an arbiter of freely available information and exercising total
control over inclusion and exclusion calls into question the methods used to
assemble databases and prevents the timely publication of academic articles.
Openness to critique is a prerequisite for occupying
a position as a scientist and is inherent in one’s assumed status as a subject
matter expert. Triaging requests for access and granting less than 0.01 percent of them should
rightly be seen as a protection of a product that may, under scrutiny, be
revealed to be either a useful or worthless resource depending on how it responds
to the techniques applied to it by those in the data science community. I wholeheartedly
came to believe, in my naivety, that such are the risks and leaps we are required
to take in the altruistic advancement of science and that improvement was predicated
on building upon each other’s work. Instead, it was demonstrated to me that one
individual could command an entire domain if their instinct to trump other
efforts is strong enough.
It may be appropriate to create a record detailing the
proper version of history in light of my recent experience as some members of
this collaborative may be unaware of how the group came to be in its current
form. This resource exists due solely to an unabashed hunt for data. Kenna Quinet
shared her database with Jamie and Jack’s Extreme Killing research team eight
years ago and shortly thereafter I disseminated a survey to thought leaders
(many of you were respondents) in an effort to gauge how common a practice it
was to freely offer up one’s work. It was immediately apparent that data was
rarely traded, an unfortunate limitation that many of you had been privy to and
had worked around for decades. In what can be described as a long shot,
I began approaching all those who claimed to maintain serial murder data and,
after trading the EK dataset, received seven datasets. All collection
efforts eventually shifted towards building out the Radford “brand” and the
original consortium was abandoned to devote resources to the larger effort.
Although I engaged in this process willingly as it felt like a natural
progression at first, I can only equate the process to what founders probably
go through after being acquired by a corporation – the slow dissolution of
preexisting endeavors over a period of time.
Now and again it is good to review our current
course while taking stock of achievements and how our attitudes impact others. The
presence of my beliefs began to create strife for the creator of a product so
it is only fair that I step aside so that business can continue unabated and without
objection. I have resolved in this new year to attempt a return to a time when
interesting projects were sought on their merit rather than as a byproduct of
forwarding the stature of those with agendas that stray from open science. My
reputation will undoubtedly suffer for choosing to break away from the pursuit of
notoriety and media releases but I take solace in knowing that my integrity
remains intact.
Although I report with regret that the larger data
sharing effort within the realm of atypical homicide research has been an
abysmal failure I would like to encourage folks to trust the process of open
data science and direct attention towards a paper published in Nature
Human Behaviour just yesterday. The authors of the
article advocate for the promotion of transparency and open collaborations in
team science in the context of reproducibility. This message has delivered a
measure of consolation to me at this difficult time in my career.
Please do not allow my experiences to dissuade you
from engaging in an exciting arena filled with possibilities such as answering
the Real-Time
Forecasting Challenge or applying for the 2017 Risk Terrain
Modelling Exemplar Award. There will be an Open
Data Science Conference in Boston in May and as I rebuild the Serial Homicide
Expertise and Information Sharing Collaborative database I will attend in the
hopes of learning best practices in team building and establishing deep and truly
collaborative relationships in conjunction with open data.
All the best,
Enzo Yaksic
[i] The
relationship I had cultivated with agents of the FBI’s BAU-5 has gone unnoticed
but I commend their willingness to dedicate time to correcting errors within
the data and they should be acknowledged publicly.