As part of a data visualization workshop I helped put together, I created a few example visualizations to inspire participants. In this case, we were working with crime data from the City of Denver. I brought the reasonably large dataset (retrieved on 10/24/2017) into Google Fusion Tables and filtered it down to just January of 2016. From there, I pulled the resulting .csv file into Tableau Desktop and created the following visualizations, which reveal some interesting patterns.
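For readers who would rather script the filtering step than use Fusion Tables, here is a minimal sketch in plain Python. The column name `REPORTED_DATE`, the timestamp format, and the sample rows are all assumptions made for illustration; check them against the actual export before relying on this.

```python
import csv
import io
from datetime import datetime

# Hypothetical column name and timestamp format; adjust to the real file.
DATE_COLUMN = "REPORTED_DATE"
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def filter_to_month(rows, year, month):
    """Keep only the rows whose DATE_COLUMN falls in the given year and month."""
    kept = []
    for row in rows:
        reported = datetime.strptime(row[DATE_COLUMN], DATE_FORMAT)
        if reported.year == year and reported.month == month:
            kept.append(row)
    return kept

# Tiny made-up sample standing in for the full .csv file.
sample_csv = io.StringIO(
    "REPORTED_DATE,NEIGHBORHOOD_ID\n"
    "1/5/2016 4:30:00 PM,five-points\n"
    "2/1/2016 9:00:00 AM,baker\n"
    "1/31/2016 11:59:00 PM,stapleton\n"
)
january_2016 = filter_to_month(csv.DictReader(sample_csv), 2016, 1)
print(len(january_2016))  # number of rows reported in January 2016
```

With a real file, `open("your_file.csv", newline="")` would take the place of the `io.StringIO` sample.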
Number of Crime Reports vs. Time of Day and Area
An immediate question this chart raises for me: why is there such a sharp drop-off in reported crimes between 4AM and 6AM? Is it because this is when criminals decide to go to sleep? Is it because crimes aren't actually reported until the victims wake up and discover that a crime has occurred? It's a question I would be fascinated to know the answer to.
This chart looks at the top 7 neighborhoods in Denver by the number of crimes reported within them. There don't seem to be any significant patterns that would point to a certain neighborhood being particularly active at a certain time of day.
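The two aggregations behind a chart like this one, reports per hour of day and the busiest neighborhoods, can be sketched with the standard library. I built the actual chart in Tableau, so this is only an approximation of the same idea; the records and the neighborhood spellings below are made up.

```python
from collections import Counter
from datetime import datetime

def reports_by_hour(timestamps):
    """Return a 24-item list: the number of reports in each hour, 0 through 23."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts[hour] for hour in range(24)]

def top_neighborhoods(names, n=7):
    """Return the n most frequent neighborhood names, busiest first."""
    return [name for name, _ in Counter(names).most_common(n)]

# Made-up records: (report time, neighborhood).
records = [
    (datetime(2016, 1, 4, 3, 15), "five-points"),
    (datetime(2016, 1, 4, 3, 50), "baker"),
    (datetime(2016, 1, 4, 17, 5), "five-points"),
    (datetime(2016, 1, 5, 17, 40), "stapleton"),
    (datetime(2016, 1, 6, 17, 55), "five-points"),
]
hourly = reports_by_hour([ts for ts, _ in records])
busiest = top_neighborhoods([name for _, name in records], n=2)
print(hourly)
print(busiest)
```

Keeping hours with zero reports in the list matters here: a drop-off like the one between 4AM and 6AM only shows up if the quiet hours are plotted too.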
What day of the week is crime in Denver the worst?
Friday looks like the answer to that question. Interestingly, traffic accidents see a substantial spike on Fridays. Is that because commuters are in a hurry to get home, or because more people are driving on a Friday than on a Sunday, for example?
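One thing worth checking with a single-month slice: January 2016 began on a Friday and has 31 days, so Fridays, Saturdays, and Sundays each occur five times while the other weekdays occur four times. A raw count per weekday therefore favors those three days slightly. The sketch below divides each weekday's total by the number of times that weekday occurs in the month; the function names and report dates are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_occurrences(year, month):
    """Count how many times each weekday falls within the given month."""
    counts = Counter()
    day = date(year, month, 1)
    while day.month == month:
        counts[DAY_NAMES[day.weekday()]] += 1
        day += timedelta(days=1)
    return counts

def average_reports_per_weekday(report_dates, year, month):
    """Average number of reports per occurrence of each weekday in the month."""
    totals = Counter(DAY_NAMES[d.weekday()] for d in report_dates)
    occurrences = weekday_occurrences(year, month)
    return {name: totals[name] / occurrences[name] for name in DAY_NAMES}

# Made-up report dates: two Fridays and one Monday.
report_dates = [date(2016, 1, 1), date(2016, 1, 8), date(2016, 1, 4)]
averages = average_reports_per_weekday(report_dates, 2016, 1)
print(weekday_occurrences(2016, 1))
print(averages)
```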
What types of crime are most common in the top neighborhoods?
This chart reveals some interesting patterns. Stapleton and Baker see a lot of traffic accidents compared to the other neighborhoods, while Five Points, East Colfax, and the Capitol Hill area see a significant number of drug and alcohol-related crimes compared to the other neighborhoods.
Back to the Basics
Here, we see that traffic accidents account for almost a quarter of all crimes that occurred during this time period. We also run into a common data science problem: the "other" category. Although I wasn't looking for granular categorical data on the crimes in this case, I could see how the crimes someone wants to study might fall into this category, which would be frustrating for anyone seeking answers from this dataset.
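A share-of-total breakdown like this one is simple to compute by hand, which also makes it easy to see how much of the data ends up in the catch-all bucket. This is a minimal sketch; the category labels and counts are made up and are not the real Denver figures.

```python
from collections import Counter

def category_shares(categories):
    """Return each category's fraction of all reports, largest first."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.most_common()}

# Made-up labels, chosen only to show the calculation.
categories = (
    ["other"] * 3
    + ["traffic-accident"] * 2
    + ["larceny"] * 2
    + ["drug-alcohol"] * 1
)
shares = category_shares(categories)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

If the "other" share is large, it is worth checking whether the source data carries a more specific offense field before drawing conclusions from the coarse categories.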
Notes
- This is a very small subset of the data. A more comprehensive analysis would take into account all of the available data or a random sample of it
- No tests of statistical significance were done, so any claims I make in this analysis are based purely on the visualizations
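As a starting point for the kind of test mentioned in the last note, here is a rough sketch of a chi-square goodness-of-fit check on weekday counts. It is not something I ran for this post. The observed counts are made up, the expected counts assume a constant daily rate (weighted by how often each weekday occurs in January 2016), and 12.592 is the standard critical value for 6 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Critical value for 6 degrees of freedom (7 weekdays minus 1), alpha = 0.05.
CRITICAL_VALUE_DF6_ALPHA_05 = 12.592

# Occurrences of Monday through Sunday in January 2016.
occurrences = [4, 4, 4, 4, 5, 5, 5]

# Made-up report counts, Monday through Sunday.
observed = [40, 42, 38, 41, 70, 48, 31]

# Expected counts if every calendar day had the same reporting rate.
total = sum(observed)
expected = [total * n / sum(occurrences) for n in occurrences]

statistic = chi_square_statistic(observed, expected)
print(statistic > CRITICAL_VALUE_DF6_ALPHA_05)  # True suggests a weekday effect
```

Crime reports are unlikely to be fully independent events, so a result from a test like this should be read as a rough signal rather than a firm conclusion.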