I’ve been riding my bike to the Metro to head into DC to attend the Flatiron School. I’m not particularly “strong” or “physically fit.” I’m not good at “balancing” or “riding in straight lines.” But parking is $5 a day and, you know, the environment. So I bike, which means I worry about getting rolled over by a middle-aged dude in an SUV. The lane I ride down is pretty vacant first thing in the morning and last thing at night. So if I were run over, it would be a good long while before someone found my body.
I came across a Twitter bot, @HowsMyDrivingNY. Tweet a license plate at the bot, and she will respond with chapter and verse on speed and red-light camera violations in the 5 boroughs of New York. There is also a web-based front end.
I wondered how it was done. I suspected that the city of New York publishes traffic and parking violations in an easy-to-parse format, and I was right. The raw data can be downloaded in a number of formats, and there is also a JSON API. The data is not updated in real time.
Turns out that my county, Montgomery County, MD, publishes a similar data source, with a comparable JSON API, updated every day with data going back 5 years or so.
Putting the Twitter API to the side for the moment, would I have the knowledge to iterate over this data, take the information that I want, and present it in a useful way? Sure.
The published JSON is an array of hashes, 1.5 million records long (as of this publication), and it is updated every 24 hours by the county government as part of their open data initiative. Unlike the NYC set, this data set does not include Personally Identifiable Information (PII), which I consider to be good and right, but it gives us enough data to make some good decisions about the relative safety of a particular intersection, a route, or even the make and model of an automobile.
The County’s public data initiative is driven by a tool called Socrata, which makes public the parameters that can be passed to their API. For example, the $limit and $offset parameters let you paginate your results.
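To make the pagination idea concrete, here is a minimal sketch of how the page URLs could be built. The `page_url` helper and its default page size of 1000 are my own inventions for illustration; only the `$limit` and `$offset` parameters and the endpoint come from the county's API.

```ruby
BASE_URL = "https://data.montgomerycountymd.gov/resource/4mse-ku6q.json"

# Build the URL for one page of results.
# `page` is zero-based. `per_page` defaults to 1000, which is an
# arbitrary choice for this sketch, not a county recommendation.
def page_url(page, per_page = 1000)
  offset = page * per_page
  "#{BASE_URL}?$limit=#{per_page}&$offset=#{offset}"
end

puts page_url(0, 5) # the first five records
puts page_url(3)    # records 3000 through 3999
```

Walking the whole data set would then mean asking for page after page until one comes back empty.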
Using something like Ruby's CSV class, I can write the elements from the parsed hash to a CSV file. I could then use that CSV file, including Date of Violation, Latitude, Longitude, Car Make and Model, to build a map of traffic incidents in and around my route to the Metro.
Here is the method that I created (h/t to Nick and our Star Wars API lab) to parse the JSON, and put out the data I want into a CSV file:
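require 'rest-client'
require 'json'
require 'csv'

# Fetch traffic violations from the county API and append the
# fields I care about to traffic.csv, one row per violation.
def moco_details
  raw_data = RestClient.get('https://data.montgomerycountymd.gov/resource/4mse-ku6q.json?$limit=5&$offset=0')
  parsed_data = JSON.parse(raw_data.body)
  # Open the file once in append mode instead of once per record.
  CSV.open("traffic.csv", "ab") do |csv|
    parsed_data.each do |event|
      # values_at returns nil for any key a record is missing.
      csv << event.values_at("latitude", "longitude", "description", "make", "model", "color")
    end
  end
end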
And here is the resulting CSV from the method above:
39.1187633333333,-77.182455,DRIVING VEH. W/O ADEQUATE REAR REG. PLATE ILLUMINATION,TOYOTA,CAMRY,RED
39.1187633333333,-77.182455,FAILURE TO DISPLAY REGISTRATION CARD UPON DEMAND BY POLICE OFFICER,TOYOTA,CAMRY,RED
39.1163733333333,-77.2057483333333,DRIVER FAILURE TO OBEY PROPERLY PLACED TRAFFIC CONTROL DEVICE INSTRUCTIONS,TOYOTA,CAMRY,GRAY
39.1187633333333,-77.182455,DRIVING VEHICLE ON HIGHWAY WITH SUSPENDED REGISTRATION,TOYOTA,CAMRY,RED
39.0298833333333,-77.1263516666667,DRIVER USING HANDS TO USE HANDHELD TELEPHONE WHILEMOTOR VEHICLE IS IN MOTION,MAZDA,GX5,GRAY
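With rows like these in hand, the make-and-model question from earlier becomes a short tally. This is only a sketch: `violations_by_make` is a hypothetical helper, and it assumes the column order written by the method above (latitude, longitude, description, make, model, color).

```ruby
require "csv"

# Count rows per vehicle make.
# Assumes make is the fourth column (index 3), as in the CSV above.
def violations_by_make(csv_text)
  CSV.parse(csv_text).each_with_object(Hash.new(0)) do |row, counts|
    counts[row[3]] += 1
  end
end

# The five rows shown above, pasted in as sample input.
sample = <<~ROWS
  39.1187633333333,-77.182455,DRIVING VEH. W/O ADEQUATE REAR REG. PLATE ILLUMINATION,TOYOTA,CAMRY,RED
  39.1187633333333,-77.182455,FAILURE TO DISPLAY REGISTRATION CARD UPON DEMAND BY POLICE OFFICER,TOYOTA,CAMRY,RED
  39.1163733333333,-77.2057483333333,DRIVER FAILURE TO OBEY PROPERLY PLACED TRAFFIC CONTROL DEVICE INSTRUCTIONS,TOYOTA,CAMRY,GRAY
  39.1187633333333,-77.182455,DRIVING VEHICLE ON HIGHWAY WITH SUSPENDED REGISTRATION,TOYOTA,CAMRY,RED
  39.0298833333333,-77.1263516666667,DRIVER USING HANDS TO USE HANDHELD TELEPHONE WHILEMOTOR VEHICLE IS IN MOTION,MAZDA,GX5,GRAY
ROWS

p violations_by_make(sample)
```

Five rows prove nothing about Toyotas, of course. Against the real file, the call would be `violations_by_make(File.read("traffic.csv"))`.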
From here, I would post the CSV to a globally visible web site (such as this one) and take advantage of Leaflet, a JavaScript framework for the creation and management of web-based maps, along with an associated CSV plugin, to turn that file into a map with markers and descriptions. Visually, then, I would have a good idea of where there are traffic problems in my neighborhood.
This is surely not as interactive as the Twitter bot leveraging NYC data, privacy considered, but it is an interesting exercise in using public data to highlight safe areas and troubled areas. This exercise is kind of the long way around the Horn, however. The County data source also has a GeoJSON API, a format that Leaflet processes natively, so this could be done much more simply. Surely more to come.
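To give a sense of what GeoJSON looks like, here is a rough sketch that builds a FeatureCollection by hand from the same six-column rows. It is an illustration of the format only: `rows_to_geojson` is a hypothetical helper, and the county's own GeoJSON output may well use different property names.

```ruby
require "json"

# Convert rows of [latitude, longitude, description, make, model, color]
# into a GeoJSON FeatureCollection (as a plain Ruby hash).
def rows_to_geojson(rows)
  features = rows.map do |lat, lon, description, make, model, color|
    {
      "type" => "Feature",
      # GeoJSON orders coordinates as [longitude, latitude].
      "geometry" => { "type" => "Point", "coordinates" => [lon.to_f, lat.to_f] },
      "properties" => {
        "description" => description, "make" => make,
        "model" => model, "color" => color
      }
    }
  end
  { "type" => "FeatureCollection", "features" => features }
end

# One row from the CSV above, as strings, the way CSV.parse would return it.
rows = [["39.0298833333333", "-77.1263516666667",
         "DRIVER USING HANDS TO USE HANDHELD TELEPHONE WHILEMOTOR VEHICLE IS IN MOTION",
         "MAZDA", "GX5", "GRAY"]]
puts JSON.pretty_generate(rows_to_geojson(rows))
```

The one thing to watch is the coordinate order: longitude comes first in GeoJSON, the reverse of how the CSV above lists them.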