The bulk of the data required for this project is public and provided by DataSF. Because of the large amount of data available, we focused our analysis at the neighborhood level.

Our two data sets from DataSF:

  • SF311 Cases – Street Light Repair Requests
  • Police Department Incidents – Crime reports

Map of San Francisco Neighborhoods


Preliminary Crime and Ticket Analysis

We explored reports of quality-of-life, property, and violent crime from the Police Department Incidents data set, as well as street light repair requests from the 311 Cases data set. The following histograms show the 10 neighborhoods with the most crime occurrences and the most street light repair requests:

As seen above, many of the neighborhoods with the most crime also have the highest number of street light repair requests.
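The ranking behind these histograms can be sketched with a simple frequency count. This is a minimal illustration, not our actual script: the record lists and the helper name `top_neighborhoods` are hypothetical, and real input would come from the neighborhood column of each DataSF export.

```python
from collections import Counter

# Hypothetical records: each report is reduced to its neighborhood label.
crime_reports = ["Mission", "Tenderloin", "Mission", "Bayview", "Mission", "Tenderloin"]
light_requests = ["Mission", "Bayview", "Bayview", "Outer Sunset"]


def top_neighborhoods(records, n=10):
    """Return the n most frequent neighborhoods as (name, count) pairs."""
    return Counter(records).most_common(n)


top_crime = top_neighborhoods(crime_reports)
top_lights = top_neighborhoods(light_requests)

# Neighborhoods that appear in both top-n lists.
overlap = {name for name, _ in top_crime} & {name for name, _ in top_lights}
```

With real data, `overlap` would list the neighborhoods that rank highly on both measures.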

Based on these results, we focused our subsequent analysis on the neighborhoods with the most crime:

  • South of Market
  • Tenderloin
  • Mission
  • Bayview
  • Downtown / Union Square
  • Civic Center
  • Potrero Hill
  • Lower Nob Hill
  • Western Addition
  • Outer Sunset

We chose these neighborhoods for two reasons: 1) we hypothesized that crime rate and street light outages may be more strongly correlated in areas with more crime, and 2) these neighborhoods provide more crime data, which should improve the accuracy of our findings.


Determining Weighted Crime Rate for Each Neighborhood

To quantify the statistical relationship between street light outages and crime in San Francisco, we wrote a Python script that counts the crimes that occurred around a specific street light while it was on and while it was off.

We included only crime reports that occurred at night, when there was no daylight. For each street light, we used a fixed year window to compare crime when the light was off versus on, which avoids seasonal bias.
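The filtering described above might look like the following sketch. Several pieces are assumptions made for illustration: night is approximated by fixed clock hours (a real pipeline would use sunrise and sunset times for each date), the outage interval is taken as the span between a repair request being opened and closed, and the names `is_night` and `light_state` are hypothetical.

```python
from datetime import datetime

# Assumption: "night" is approximated by fixed clock hours; sunrise and
# sunset times per date would be more faithful to "no daylight".
NIGHT_START_HOUR = 20
NIGHT_END_HOUR = 6


def is_night(ts):
    """True if the timestamp falls in the assumed night-time hours."""
    return ts.hour >= NIGHT_START_HOUR or ts.hour < NIGHT_END_HOUR


def light_state(ts, outage_start, outage_end, window_start, window_end):
    """Classify a crime timestamp as 'off', 'on', or None (excluded)."""
    # Drop crimes outside the fixed comparison window or during daylight.
    if not (window_start <= ts < window_end) or not is_night(ts):
        return None
    return "off" if outage_start <= ts < outage_end else "on"


# Hypothetical dates for a single street light.
window = (datetime(2016, 1, 1), datetime(2017, 1, 1))
outage = (datetime(2016, 3, 1), datetime(2016, 3, 15))

during_outage = light_state(datetime(2016, 3, 5, 23, 0), *outage, *window)
while_lit = light_state(datetime(2016, 6, 5, 2, 0), *outage, *window)
daytime = light_state(datetime(2016, 3, 5, 12, 0), *outage, *window)
```

Using the same window for both states means the "on" and "off" counts for a light are drawn from the same year.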

We considered only crimes that occurred within a 12-meter (m) or 24 m radius of the street light. Crimes within 12 m were weighted more heavily than those between 12 m and 24 m. Finally, we obtained a weighted and scaled crime rate for each neighborhood.
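As a rough sketch of the distance weighting, the example below measures great-circle distance with the haversine formula and assigns each crime a weight by radius band. The weight values (1.0 and 0.5), the scaling by number of nights, and all function names are assumptions chosen for illustration; the values we actually used are documented in the final report.

```python
import math

EARTH_RADIUS_M = 6_371_000
INNER_RADIUS_M, OUTER_RADIUS_M = 12, 24
# Assumed weights: the inner band counts more than the outer band.
INNER_WEIGHT, OUTER_WEIGHT = 1.0, 0.5


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def crime_weight(distance_m):
    """Weight a crime by its distance band; zero beyond the outer radius."""
    if distance_m <= INNER_RADIUS_M:
        return INNER_WEIGHT
    if distance_m <= OUTER_RADIUS_M:
        return OUTER_WEIGHT
    return 0.0


def weighted_rate(light, crimes, nights):
    """Weighted crime count around one light, scaled per night (assumed scaling)."""
    total = sum(crime_weight(haversine_m(light[0], light[1], lat, lon)) for lat, lon in crimes)
    return total / nights


# Hypothetical coordinates: roughly 6 m, 17 m, and 111 m north of the light.
light = (37.7749, -122.4194)
crimes = [(37.77495, -122.4194), (37.77505, -122.4194), (37.7759, -122.4194)]
rate = weighted_rate(light, crimes, nights=10)
```

Dividing by the number of nights makes the "off" and "on" rates comparable even though a light is usually on for far longer than it is off.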


Statistical Analysis: Street Light Outages and Crime Rate Correlation

After running the code for our ten selected neighborhoods, we completed a paired-sample hypothesis test on the calculated crime rates to determine whether the relationship between crime and street light outages is significant. Specifically, we tested whether there was a statistically significant difference between the average crime rate when a light is off and when it is on in a given neighborhood. Our results are shown below.
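One common form of paired-sample test is the paired t-test, sketched here with only the standard library. This assumes a t-test for illustration, and the rates are made-up numbers, not our results. The function returns the t statistic and degrees of freedom; a p-value would then come from the t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(off_rates, on_rates):
    """Return (t, degrees_of_freedom) for paired off/on crime rates."""
    # Each pair is the same street light measured in both states.
    diffs = [off - on for off, on in zip(off_rates, on_rates)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


# Made-up rates for four street lights, for illustration only.
off_rates = [3.0, 5.0, 4.0, 6.0]
on_rates = [2.0, 3.0, 3.0, 3.0]
t_stat, dof = paired_t_statistic(off_rates, on_rates)
```

Pairing by street light controls for location: each light serves as its own baseline, so the test looks only at the change between the two states.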

Below are the neighborhoods with a statistically significant relationship between crime and street light outages:

  • South of Market
  • Mission
  • Bayview
  • Downtown / Union Square
  • Civic Center
  • Lower Nob Hill


To learn more about our technical approach and code, refer to sections 3, 4 and 5 of our final report.