I’m well-versed in Excel and SQL, regularly producing data analysis and visualizations. I’m starting to use Python for data analysis.
Python
I used Python to join multiple spreadsheets of federal crash data for each year, then combined many consecutive years into a database I could query with SQL. The analysis revealed that New Jersey and Newark lead the nation in racial disparity in deaths from police car chases.
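A minimal sketch of that kind of pipeline, not the actual code: it joins two per-year sheets on a shared case ID, stacks the years into one SQLite table, and runs a query against it. The sample rows, the table name chase_deaths and the columns case_id, state and race are all hypothetical stand-ins for the real federal files.

```python
import sqlite3

# Hypothetical sample data standing in for two spreadsheets per year.
# Column layout (case_id, state) and (case_id, race) is an assumption.
crashes_by_year = {
    2019: [("A1", "NJ"), ("A2", "NY")],
    2020: [("B1", "NJ")],
}
victims_by_year = {
    2019: [("A1", "Black"), ("A2", "White")],
    2020: [("B1", "Black")],
}


def build_database(crashes_by_year, victims_by_year):
    """Join each year's two sheets on case_id, then stack all years in one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chase_deaths (year INTEGER, case_id TEXT, state TEXT, race TEXT)"
    )
    for year in sorted(crashes_by_year):
        # Index the crash sheet by case_id so each victim row can find its crash.
        state_by_case = dict(crashes_by_year[year])
        rows = [
            (year, case_id, state_by_case[case_id], race)
            for case_id, race in victims_by_year.get(year, [])
            if case_id in state_by_case  # inner join: keep only matched cases
        ]
        conn.executemany("INSERT INTO chase_deaths VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


conn = build_database(crashes_by_year, victims_by_year)

# Once the years are stacked, ordinary SQL works across the whole period.
deaths_by_state = conn.execute(
    "SELECT state, COUNT(*) FROM chase_deaths GROUP BY state ORDER BY state"
).fetchall()
print(deaths_by_state)
```

With real files, the dictionaries would be replaced by rows read from each year's spreadsheets; the join-then-stack structure stays the same.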
One of my most useful Python applications was a hotkey that inserts the current date and time, which is helpful for taking notes.
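A minimal sketch of the timestamp half of such a tool. The function name note_timestamp and the date format are assumptions, and binding the function to a key is left out because it depends on the operating system or hotkey utility in use.

```python
from datetime import datetime


def note_timestamp(now=None):
    """Return a date-and-time stamp for pasting into notes.

    The format (year-month-day hour:minute) is an illustrative choice.
    """
    now = now or datetime.now()
    return now.strftime("%Y-%m-%d %H:%M")


print(note_timestamp())
```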
SQL
Here’s the backstory on two of my best SQL applications:
Hidden Misconduct: I fought for years to get full access to a government employment database, arguing that the state should include every row for people who held multiple jobs. I crafted a SQL query to return employees who appeared an odd number of times, meaning those who held a job, left it and found another. I filtered further to find 71 cops who had previously been fired from public safety jobs. But I didn’t simply trust the data. I contacted every police agency involved and found that in at least 22 cases the state data conflicted with local records and recollections.
Pension crooks: I joined state databases of pensions and convictions, finding dozens of convicted criminals collecting state retirement checks, including one in prison. This reporting prompted the state to stop that prisoner’s retirement checks.
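The two queries described above can be sketched as follows, run here through Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the original queries: the tables employment, pensions and convictions, their columns, and every sample row are hypothetical. In particular, the sketch assumes a shared person_id for the pension join, whereas real records might have to be matched on names and birth dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employment (employee_id TEXT, agency TEXT);
    CREATE TABLE pensions (person_id TEXT, monthly_benefit REAL);
    CREATE TABLE convictions (person_id TEXT, offense TEXT);
    """
)

# Hypothetical rows: E1 appears three times, E2 twice.
conn.executemany(
    "INSERT INTO employment VALUES (?, ?)",
    [
        ("E1", "Agency A"),
        ("E1", "Agency A"),
        ("E1", "Agency B"),
        ("E2", "Agency C"),
        ("E2", "Agency C"),
    ],
)
conn.executemany(
    "INSERT INTO pensions VALUES (?, ?)",
    [("P1", 2500.0), ("P2", 1800.0)],
)
conn.executemany(
    "INSERT INTO convictions VALUES (?, ?)",
    [("P1", "fraud"), ("P1", "theft"), ("P3", "bribery")],
)

# Query 1: employees whose row count is odd.
# An employee with a single row also has an odd count; whether to exclude
# those depends on the data layout, which this sketch does not model.
odd_appearances = conn.execute(
    """
    SELECT employee_id, COUNT(*) AS appearances
    FROM employment
    GROUP BY employee_id
    HAVING COUNT(*) % 2 = 1
    ORDER BY employee_id
    """
).fetchall()

# Query 2: pension recipients who also appear in the convictions table.
# DISTINCT keeps one row per person even with several convictions.
convicted_pensioners = conn.execute(
    """
    SELECT DISTINCT p.person_id, p.monthly_benefit
    FROM pensions AS p
    JOIN convictions AS c ON c.person_id = p.person_id
    ORDER BY p.person_id
    """
).fetchall()

print(odd_appearances)
print(convicted_pensioners)
```

Either result is only a list of leads. As with the misconduct story, each match would still need to be checked against local records before publication.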
No random drug testing for cops
I collected policies from hundreds of police departments across the state and produced a map of the areas where police officers weren’t randomly drug tested. This reporting prompted New Jersey to mandate random drug testing for all cops.
Dirty water
I assisted a colleague with data analysis of water utility violations. I used SQL to group violations by water provider, identify the providers with multiple violations, and mapped the results.
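A minimal sketch of that grouping step, run through Python's built-in sqlite3 module. The violations table, its columns, the provider names and the violation types are hypothetical; only the GROUP BY and HAVING pattern reflects the approach described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE violations (provider TEXT, violation_type TEXT)")

# Hypothetical sample rows: one row per recorded violation.
conn.executemany(
    "INSERT INTO violations VALUES (?, ?)",
    [
        ("Utility A", "lead"),
        ("Utility A", "coliform"),
        ("Utility B", "lead"),
        ("Utility C", "nitrate"),
        ("Utility C", "lead"),
        ("Utility C", "coliform"),
    ],
)

# Count violations per provider and keep only providers with more than one.
repeat_violators = conn.execute(
    """
    SELECT provider, COUNT(*) AS violation_count
    FROM violations
    GROUP BY provider
    HAVING COUNT(*) > 1
    ORDER BY violation_count DESC, provider
    """
).fetchall()

print(repeat_violators)
```

The resulting provider list, with its counts, is the kind of table that can then be joined to service-area boundaries for mapping.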