Prioritizing Inspections Based on Fire Risk

The mission of the Chattanooga Fire Department (CFD) is to protect life, property, and community resources through prevention, preparation, response, and mitigation. As part of that mission, CFD conducts regular building inspections to determine if applicable fire code regulations are being followed and to ensure safety in the event of a fire. 
Currently, it is not possible to inspect all commercial properties on an annual basis, so choosing which properties to inspect is left to the fire prevention bureau. To determine which properties to inspect, the first considerations are 1) looking at what is required by city code and 2) weighing CFD's priorities. Many other variables also factor into where inspections are performed after the first two criteria are considered.
A data-driven variable, such as a fire risk score, would help determine where to direct resources for additional inspections. The fire risk scoring process:
  1. collects data about an address, including things like the number of non-fire emergency calls to that address
  2. looks at where fire events have occurred in the past
  3. attempts to determine whether there is any correlation between the address data and fire events
  4. assigns a fire risk score to properties based on the relevant address data

More about our fire risk model

Code for the model can be found on our GitHub page.
The model is based on the assumption that fires do not occur in random locations, and that certain information about each property can tell us something about the likelihood of future fires.
We first determine where fires occurred between 2015 and 2018. For this model, a fire is a call with a generic incident code of 100. Some fires that were not building or parcel related were removed based on their incident code (code 131 'passenger vehicle fire,' code 136 'self-propelled motor home or recreational vehicle fire,' code 137 'camper or recreational vehicle (RV) fire,' and code 138 'off-road vehicle or heavy equipment fire').
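The filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the model's repository: it assumes that "a generic incident code of 100" means any code in the 100 to 199 range, and the function name and sample codes are hypothetical.

```python
# Vehicle and equipment fire codes removed from the model (from the text above).
EXCLUDED_CODES = {131, 136, 137, 138}


def is_parcel_fire(incident_code: int) -> bool:
    """Return True if the code counts as a fire for the model.

    Assumption: the fire family covers codes 100 through 199.
    """
    return 100 <= incident_code <= 199 and incident_code not in EXCLUDED_CODES


# Hypothetical incident codes, for illustration only.
codes = [111, 131, 745, 138, 100]
parcel_fires = [c for c in codes if is_parcel_fire(c)]
```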
To determine what properties looked like before a fire event, we first look at which non-fire calls were made to an address in the two years before a fire, based on the code of those calls. For addresses without any fires, the model looks at what occurred in the previous two years (2017 and 2018). The model currently uses the following non-fire codes:
  • 745 Alarm system activation no fire unintentional
  • 743 Smoke detector activation no fire
  • 651 Smoke scare odor of smoke
  • 522 Water or steam leak
  • 531 Smoke or odor removal
  • 412 Gas leak (natural gas or LPG)
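As a rough sketch of the windowing and counting described above, the Python below tallies the listed non-fire codes for one address. It is not the model's actual code: the record layout, function names, and sample calls are hypothetical, and it assumes the two-year window for an address with a fire ends the day before that fire.

```python
from collections import Counter
from datetime import date, timedelta

# Non-fire codes the model currently uses (from the list above).
NON_FIRE_CODES = {745, 743, 651, 522, 531, 412}


def feature_window(fire_date=None):
    """Return the (start, end) dates of the two-year lookback window."""
    if fire_date is None:
        # Addresses without a fire use calendar years 2017 and 2018.
        return date(2017, 1, 1), date(2018, 12, 31)
    try:
        start = fire_date.replace(year=fire_date.year - 2)
    except ValueError:
        # Feb 29 has no counterpart two years earlier; fall back to Feb 28.
        start = fire_date.replace(year=fire_date.year - 2, day=28)
    return start, fire_date - timedelta(days=1)


def count_non_fire_calls(calls, address, fire_date=None):
    """Count tracked non-fire codes at one address inside its window."""
    start, end = feature_window(fire_date)
    counts = Counter()
    for call in calls:
        if (call["address"] == address
                and call["code"] in NON_FIRE_CODES
                and start <= call["date"] <= end):
            counts[call["code"]] += 1
    return counts


# Hypothetical call records, for illustration only.
calls = [
    {"address": "100 Main St", "code": 745, "date": date(2017, 3, 1)},
    {"address": "100 Main St", "code": 745, "date": date(2018, 7, 9)},
    {"address": "100 Main St", "code": 651, "date": date(2016, 5, 2)},  # before 2017
    {"address": "100 Main St", "code": 311, "date": date(2018, 1, 5)},  # code not tracked
    {"address": "200 Oak Ave", "code": 743, "date": date(2018, 2, 2)},
]
counts = count_non_fire_calls(calls, "100 Main St")
```

The resulting per-code counts are the kind of values that would fill one row of the aggregated dataset for an address.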
We also use parcel information, such as parcel size and value, that might correlate with fires.
Once the dataset is ready, we use a machine learning algorithm to look for correlations between address data and the prevalence of fires. We can use that same machine learning algorithm to assign a fire risk score to parcels in 2019 based on what happened at that parcel in 2017 and 2018. Finally, the fire risk score generated for 2019 can be compared to what actually occurred in 2019 to see how well the algorithm performed. 
Below are three tables to help visualize how the data and results are compiled. The fire risk dataset shows the aggregated data, including the 2019 fire indicator and the number of non-fire codes from 2017 and 2018 at a single address. The fire incident dataset shows the recorded 2019 fire incident (19-020101), and the bottom table shows the 2017 and 2018 non-fire codes at that same address. As you can see, there were two 743 codes, six 745 codes, and one 651 code.
Fire Risk Dataset
Fire Incidents Dataset


Below you'll see a first look at how well the algorithm did at assigning a 2019 risk score to addresses based on data from 2017 and 2018. The 0 and 1 along the x-axis are the values of the fire indicator for 2019: 0 means no fire event occurred at the address in 2019, and 1 means a fire did occur. The y-axis is the average fire risk score given to properties based on the 2017 and 2018 data.

The average fire risk score our algorithm assigned to properties that had a 2019 fire is about 4.5 times higher than the average for properties that did not. This is evidence that the algorithm is picking up on some pattern that exists between address information and the prevalence of fires.
In other words, the algorithm does not know which parcels had fires in 2019. Instead, it uses the set of indicators discussed above to determine the fire risk score for each parcel. To check whether those indicators were actually helpful at predicting fire risk, we compared the average risk score the algorithm produced for parcels where fires later happened with the average for parcels where they did not. The average risk score is substantially higher for parcels where fires did happen, so the parcels the algorithm scored as higher risk were also more likely to see actual fires.
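The comparison described above amounts to averaging the risk score separately for each value of the fire indicator. Here is a minimal sketch in Python, using made-up scores rather than the model's output (the function name and sample rows are hypothetical):

```python
from statistics import mean


def mean_score_by_indicator(rows):
    """Average risk score for no-fire (0) and fire (1) properties.

    rows: iterable of (risk_score, fire_indicator) pairs.
    """
    rows = list(rows)
    no_fire = [score for score, fire in rows if fire == 0]
    had_fire = [score for score, fire in rows if fire == 1]
    return mean(no_fire), mean(had_fire)


# Hypothetical (risk_score, 2019 fire indicator) pairs, for illustration only.
rows = [(0.10, 0), (0.20, 0), (0.15, 0), (0.60, 1), (0.75, 1)]
avg_no_fire, avg_fire = mean_score_by_indicator(rows)
ratio = avg_fire / avg_no_fire  # how many times higher the fire group scores
```

A ratio well above 1 suggests the scores separate the two groups on average; it does not by itself say how many individual fires the scores would have flagged.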

Here's another way to test the results. For this visual, properties are placed into different risk categories based on their fire risk score and the following rules.

  • Fire risk score from 0 to 0.24 = Low
  • Fire risk score from 0.25 to 0.49 = Med Low
  • Fire risk score from 0.5 to 0.74  = Med High
  • Fire risk score from 0.75 to 1 = High
The number of fires occurring at properties in a category (red) is then shown as a percentage of the total number of properties in that category.
As you can see, out of the 133 properties in the highest risk score category, about 1 in 5 experienced a fire in 2019. 
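The category rules and the per-category fire percentage can be sketched as follows. This is an illustration under stated assumptions, not the model's code: the published ranges leave small gaps (for example between 0.24 and 0.25), so the sketch treats each boundary as the start of the next category, and the function names and sample rows are hypothetical.

```python
from collections import Counter


def risk_category(score: float) -> str:
    """Map a fire risk score in [0, 1] to one of the four categories."""
    if not 0 <= score <= 1:
        raise ValueError(f"score out of range: {score}")
    if score < 0.25:
        return "Low"
    if score < 0.5:
        return "Med Low"
    if score < 0.75:
        return "Med High"
    return "High"


def fire_share_by_category(rows):
    """Share of properties in each category that had a fire.

    rows: iterable of (risk_score, fire_indicator) pairs, indicator 0 or 1.
    """
    totals, fires = Counter(), Counter()
    for score, fire in rows:
        category = risk_category(score)
        totals[category] += 1
        fires[category] += fire
    return {category: fires[category] / totals[category] for category in totals}


# Hypothetical rows, for illustration only.
rows = [(0.05, 0), (0.10, 0), (0.30, 0), (0.80, 1), (0.90, 0)]
shares = fire_share_by_category(rows)
```

Multiplying each share by 100 gives the percentage plotted for that category.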

Making Use of The Model

Now we know the model is able to find a reasonable correlation between property indicators and real-world fires. How does that compare to the inspections the fire department is currently conducting? Is there any existing agreement between the model and CFD?
Turns out there is. The graph below once again breaks properties into four categories based on their fire risk score. Categories are then grouped based on how recently an inspection was conducted. The black color indicates no inspection was found for this property in the inspections data.
The results show that more inspections are already being conducted on properties that are at a higher fire risk according to the model.