Hacker News
3 years ago by cyberlab

Is this an American project that sources & aggregates data from American police forces, or is it some general thing that wants to draw from multiple countries? (Because the site doesn't mention anything about locale).

3 years ago by _joel

I thought the same, but there are apparently 17,985 police forces in the USA [1], so it looks to be American-centric. I think they should list this in their FAQ or make it more obvious.

[1] https://en.wikipedia.org/wiki/Law_enforcement_in_the_United_...

3 years ago by glenda

An issue I see is that the public data police departments put out is often only what they choose to, meaning any analysis of that data is potentially biased to an unknown degree. I suppose adding FOIA data into the mix could improve that, but doesn’t sound easy since those are usually delivered as PDFs.

3 years ago by mistermann

I think this "the dog that didn't bark" phenomenon may be much more important than meets the eye. People's perception of the world is a function of their observations of reality. For starters, the vast majority of people have no exposure to things like police data (other than the typically misleading narratives they might encounter in the news). But even the small minority who do dig deeper into the details will leave with a distorted perception of reality if only a subset of the data is revealed and they do not know that it is only a subset.

The end result of all of this complexity is that people think the world is one way, but it is actually another, and the vastness of the gap between the two could be anywhere from small and insignificant, to gigantic and crucially significant, if one applies this idea to not just police data, but everything (the entirety of reality is subject to this flaw, and many others that we do not realize).

3 years ago by TeMPOraL

Add the fact that even the businessmen and regulators, influential people, politicians and judges, they all build their image of the world mostly on the same sources as everyone else - and it's becoming worryingly apparent that all the control systems in societies are operating with very low visibility into the things they're controlling.

Imagine a PID controller keeping a boiler from overheating. Now imagine the temperature input to that controller is a thermometer mounted to the door of a metal cabinet next to the boiler, and thus it reports anything related to the boiler temperature only when somebody is looking for something in that cabinet (when the opened metal doors touch the boiler and conduct heat). Doesn't sound like a good control system.
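A toy simulation can make the analogy concrete. This is only a sketch under stated assumptions: the boiler model, the gains, and the function name `simulate` are all invented for illustration, and the "thermometer on the cabinet door" is modelled simply as a sensor that refreshes every `sensor_interval` steps and repeats its last value in between.

```python
def simulate(sensor_interval, steps=200):
    """Return the peak temperature of a toy boiler under a simple PID loop.

    The controller gets a fresh reading only every `sensor_interval` steps;
    between refreshes it keeps acting on the last (stale) value.
    All constants are hypothetical and chosen only for illustration.
    """
    setpoint, ambient = 80.0, 20.0
    kp, ki, kd = 5.0, 0.05, 0.5  # illustrative gains, not tuned for any real system
    temp = ambient
    reading = temp
    integral = 0.0
    prev_error = None
    peak = temp
    for step in range(steps):
        if step % sensor_interval == 0:
            reading = temp  # the sensor reports the true temperature only now
        error = setpoint - reading
        integral += error
        derivative = 0.0 if prev_error is None else error - prev_error
        prev_error = error
        power = kp * error + ki * integral + kd * derivative
        power = max(0.0, min(100.0, power))  # heater power is limited to 0..100
        # Toy physics: heating proportional to power, loss proportional to excess over ambient.
        temp += 0.1 * power - 0.02 * (temp - ambient)
        peak = max(peak, temp)
    return peak


fresh_peak = simulate(sensor_interval=1)   # controller sees every reading
stale_peak = simulate(sensor_interval=50)  # controller sees one reading in fifty
print(f"peak with fresh readings: {fresh_peak:.1f}")
print(f"peak with stale readings: {stale_peak:.1f}")
```

With these made-up numbers, the loop that sees every reading settles near the setpoint, while the loop fed stale readings keeps heating at full power long after the boiler has passed its target.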

3 years ago by zepto

They could and should document and account for this somewhere in the data set.

It wouldn’t solve the problem but it would allow the bias to be visible.

3 years ago by zepto

“Are you affiliated with a political party or have a political agenda?

No. Our only motivation is to provide trusted data in an age of disinformation.”

If true, this is an awesome project.

3 years ago by chaps

They're already working with one political candidate:

https://pdap.atlassian.net/browse/PDAP-149

3 years ago by zepto

You have linked to a message saying:

“Any chance you guys have data on NJ? I'm running for office there and was looking to get some police misconduct data to double check a couple things”

It is not clear how this means that they are working with a political candidate, or what that means to you.

Can you explain what conclusions you are drawing from this message?

3 years ago by chaps

From their Slack: a political candidate posted that they were interested.

One of their two volunteer scrapers said that he would drop his work in California to work on New Jersey instead.

I guess "working with" is bad language, though "interested in dropping other work in favor of requests by political candidates" is more accurate.

3 years ago by jjulius

"Working with" feels like a loaded phrase. Someone created that ticket requesting that data 14 minutes ago, and the ticket has yet to even be assigned to anyone (at the time of my post). I'm not sure how you can construe this project's lack of a response to a request as "already working with".

3 years ago by Hasnep

A political candidate asked for help accessing some data, which is not the same thing as affiliating with a political party.

3 years ago by tyingq

On the flip side, this will likely expose a lot more arrest data, and complicate the "innocent until proven guilty" thing that's mostly not true.

3 years ago by chaps

This group seems to have turned into a group that's only interested in scraping "criminal records" and "court records" over police accountability records, and believes that cop names should be removed from scraped records. They've also openly discussed inviting cops and FBI employees into their Slack to discuss scraping. Consider that "Police Data Accessibility Project" is another way of saying "We only make available data that the police want us to make available."

The project itself has tons of managerial types (who happily put PDAP on their LinkedIn pages) trying to manage the 2,500 people who have joined their Slack. However, only a small number of contributors (~2) are actually working on scrapers, and, like I said, they are focusing on "criminal" datasets and not datasets around accountability. In doing so, they're effectively ignoring the core of the data that exists to aid in police accountability, e.g. complaints and misconduct records themselves. This shouldn't be a surprise, since police agencies tend NOT to provide information online that could be used against them in court, and they WILL fight tooth and nail to prevent the release of damning records. Source: I spend a lot of time with FOIA trying to get misconduct records.

The group's foundation is also shaky. The group was started by owners of a marketing company, frac.tl, that specializes in exploiting emotions to turn something viral, which on its own should raise flags. The blog post that "started" their supposed movement has three listed writers over the course of several months [1][2][3]. The data from this blog post (afaict the only dataset actually released that's tied to this project) was found to contain private information and had to be cleaned up, six months after it was pushed onto GitHub and forked many times.

If you are looking for a group to support, please consider either volunteering with or donating to local police accountability groups. Your time/donations would be much better served with them.

[1] https://web.archive.org/web/20191118214540/https://lawsuit.o...

[2] https://web.archive.org/web/20200518181855/https://lawsuit.o...

[3] https://web.archive.org/web/20200527213804/https://lawsuit.o...

3 years ago by bigth

So where's the data? All I see is a FAQ page and a donation button.

Edit: Found the answer, they don't have any publicly available. Lol.

> Our data isn't hosted anywhere yet.

3 years ago by ClosedPistachio

Some states have comprehensive databases. Here's one for New Jersey: https://force.nj.com/

3 years ago by rhcom2

Just to clarify, The Force Report was the result of over a year of work by journalists that had to fight tooth and nail for every FOIA request and they freely acknowledge it is imperfect. To me at least it is something our government should be already doing.

3 years ago by psychlops

"Most comprehensive" is different from comprehensive. Curious, I went to take a look. The latest set of data was released in November 2018 and covers the years 2012-2016; it is purchasable for $200.

3 years ago by electricBllue

Do you happen to have the links for any other states?

3 years ago by CivBase

It really would be nice if we could get some legally-recognized standards for data schemas and interfaces so projects like this wouldn't be so necessary. But I have absolutely no idea if there's any precedent for that or how we'd get there.
