Is this an American project that sources & aggregates data from American police forces, or is it some general thing that wants to draw from multiple countries? (Because the site doesn't mention anything about locale).
I thought the same, but there are apparently 17,985 police forces in the USA [1], so it looks to be American-centric. I think they should list this in their FAQ or make it more obvious.
[1] https://en.wikipedia.org/wiki/Law_enforcement_in_the_United_...
An issue I see is that the public data police departments put out is often only what they choose to release, meaning any analysis of that data is potentially biased to an unknown degree. I suppose adding FOIA data into the mix could improve that, but that doesn't sound easy since those responses are usually delivered as PDFs.
I think this "the dog that didn't bark" phenomenon may be much more important than meets the eye. People's perception of the world is a function of their observations of reality. For starters, the vast majority of people don't have any exposure to things like police data (other than the typically misleading narratives they might encounter in the news). But even the small minority who do dig deeper into the details will leave with a distorted perception of reality if only a subset of the data is revealed and they do not know that it is only a subset.
The end result of all of this complexity is that people think the world is one way, but it is actually another, and the vastness of the gap between the two could be anywhere from small and insignificant, to gigantic and crucially significant, if one applies this idea to not just police data, but everything (the entirety of reality is subject to this flaw, and many others that we do not realize).
Add the fact that businessmen, regulators, influential people, politicians and judges all build their image of the world mostly on the same sources as everyone else, and it becomes worryingly apparent that all the control systems in societies are operating with very low visibility into the things they're controlling.
Imagine a PID controller keeping a boiler from overheating. Now imagine the temperature input to that controller is a thermometer mounted to the door of a metal cabinet next to the boiler, and thus it reports anything related to the boiler temperature only when somebody is looking for something in that cabinet (when the opened metal doors touch the boiler and conduct heat). Doesn't sound like a good control system.
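To make the analogy concrete, here is a rough Python sketch of that boiler. Everything in it is an assumption chosen for illustration (the heating and loss constants, the PID gains, the 90-degree setpoint, and the idea that the cabinet door gets opened every 50 steps); it is a toy model, not a real boiler or a tuned controller.

```python
def simulate(door_interval: int, steps: int = 300) -> float:
    """Return the peak boiler temperature seen during the run.

    door_interval=1 means the controller gets a fresh reading every step.
    A larger value means it only sees the real temperature when the cabinet
    door happens to be opened, and reuses the stale reading in between.
    All constants below are illustrative assumptions.
    """
    setpoint, ambient = 90.0, 20.0
    heat_gain, loss_rate = 5.0, 0.02   # degrees per step at full power; loss per degree above ambient
    kp, ki, kd = 0.1, 0.005, 0.05      # hypothetical PID gains
    dt = 1.0

    temp = ambient
    reading = temp
    integral = 0.0
    prev_error = None
    peak = temp

    for step in range(steps):
        # The thermometer only reports when somebody opens the cabinet.
        if step % door_interval == 0:
            reading = temp

        error = setpoint - reading
        derivative = 0.0 if prev_error is None else (error - prev_error) / dt
        prev_error = error

        unclamped = kp * error + ki * integral + kd * derivative
        power = min(1.0, max(0.0, unclamped))
        # Simple anti-windup: only integrate while the heater is not saturated.
        if power == unclamped:
            integral += error * dt

        # The boiler heats with applied power and loses heat to the room.
        temp += (heat_gain * power - loss_rate * (temp - ambient)) * dt
        peak = max(peak, temp)

    return peak


if __name__ == "__main__":
    print("peak with live sensor:  ", round(simulate(door_interval=1), 1))
    print("peak with cabinet sensor:", round(simulate(door_interval=50), 1))
```

With these made-up numbers the controller itself is identical in both runs; only the visibility changes. The run with the cabinet-door sensor keeps the heater at full power long after the boiler has passed the setpoint, because the last reading it saw was still cold.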
They could and should document and account for this somewhere in the data set.
It wouldn't solve the problem but it would allow the bias to be visible.
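As a rough idea of what documenting this could look like, here is a small Python sketch of a provenance note attached to each dataset. The class name, fields, and release categories are all hypothetical; nothing here comes from the project itself.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetProvenance:
    """Hypothetical note recording how a dataset was obtained and what is missing."""
    agency: str
    release_type: str  # assumed categories: "voluntary", "foia", "court_order"
    known_gaps: list[str] = field(default_factory=list)

    def caveat(self) -> str:
        # Produce a one-line disclosure that can be shown next to any analysis.
        gaps = "; ".join(self.known_gaps) if self.known_gaps else "none documented"
        return f"{self.agency}: released via {self.release_type}; known gaps: {gaps}"


if __name__ == "__main__":
    note = DatasetProvenance(
        agency="Example PD",
        release_type="voluntary",
        known_gaps=["complaints not included"],
    )
    print(note.caveat())
```

A note like this does not remove the selection bias, but anyone reading an analysis would at least see that the data was released voluntarily and what is known to be absent.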
"Are you affiliated with a political party or have a political agenda?
No. Our only motivation is to provide trusted data in an age of disinformation."
If true, this is an awesome project.
They're already working with one political candidate:
You have linked to a message saying:
"Any chance you guys have data on NJ? I'm running for office there and was looking to get some police misconduct data to double check a couple things"
It is not clear how this means that they are working with a political candidate, or what that means to you.
Can you explain what conclusions you are drawing from this message?
From their slack, a political candidate posted that they were interested.
One of their two volunteer scrapers said that he would drop his work in California to work on New Jersey instead.
I guess "working with" was poor wording; "interested in dropping other work in favor of requests by political candidates" is more accurate.
"Working with" feels like a loaded phrase. Someone created that ticket requesting that data 14 minutes ago, and the ticket has yet to even be assigned to anyone (at the time of my post). I'm not sure how you can characterize this project's lack of a response to a request as "already working with".
A political candidate asked for help accessing some data; that is not the same thing as affiliating with a political party.
On the flip side, this will likely expose a lot more arrest data, and complicate the "innocent until proven guilty" thing that's mostly not true.
This group seems to have turned into a group that's only interested in scraping "criminal records" and "court records" over police accountability records, and believes that cop names should be removed from scraped records. They've also openly discussed inviting cops and FBI employees into their slack to discuss scraping. Consider that "Police Data Accessibility Project" is another way of saying "We only make available data that the police want us to make available."
The project itself has tons of managerial types (who happily put PDAP on their LinkedIn pages) trying to manage the 2,500 people who have joined their slack. However, only a small number of contributors (~2) are actually working on scrapers, and, like I said, they are focusing on "criminal" datasets and not datasets around accountability. In doing so, they're effectively ignoring the core of the data that exists to aid in police accountability, e.g. complaints and misconduct records themselves. This shouldn't be a surprise, since police agencies tend NOT to provide information online that could be used against them in court, and they WILL fight tooth and nail to prevent the release of damning records. Source: I spend a lot of time with FOIA trying to get misconduct records.
The group's foundation is also shaky. The group was started by owners of a marketing company, frac.tl, that specializes in exploiting emotions to make things go viral, which on its own should raise flags. The blog post that "started" their supposed movement has had three listed writers over the course of several months [1][2][3]. The data from this blog post (afaict the only dataset actually released that's tied to this project) was found to contain private information and had to be cleaned up, six months after it was pushed onto GitHub and forked many times.
If you are looking for a group to support, please consider either volunteering with or donating to local police accountability groups. Your time and donations would be much better spent with them.
[1] https://web.archive.org/web/20191118214540/https://lawsuit.o...
[2] https://web.archive.org/web/20200518181855/https://lawsuit.o...
[3] https://web.archive.org/web/20200527213804/https://lawsuit.o...
So where's the data? All I see is a FAQ page and a donation button.
Edit: Found the answer, they don't have any publicly available. Lol.
> Our data isn't hosted anywhere yet.
Some States have comprehensive databases. Here's one for New Jersey: https://force.nj.com/
Just to clarify, The Force Report was the result of over a year of work by journalists who had to fight tooth and nail for every FOIA request, and they freely acknowledge it is imperfect. To me at least, it is something our government should already be doing.
"Most comprehensive" is different from comprehensive. Curious, I went to take a look. The latest set of data was released in November 2018 and covers the years 2012-2016; it is purchasable for $200.
Do you happen to have the links for any other states?
It really would be nice if we could get some legally-recognized standards for data schemas and interfaces so projects like this wouldn't be so necessary. But I have absolutely no idea if there's any precedent for that or how we'd get there.