In the end, we did not produce a single ‘final’ algorithm. Instead, we arrived at a general algorithm built from the components that proved most effective among the methods we developed over the semester. Specifically, we are confident that a Euclidean distance algorithm with added modifications is a strong candidate for a viable contact tracing algorithm. The components included in our strongest iteration are:

  • Normalizing the Euclidean distance by a factor of n in order to compare environments with different concentrations of access points.
  • Replacing the missing RSSI value of each unmatched network with a penalty value, namely the grand average of RSSI values taken across both phones.
  • Dropping networks whose recorded RSSI values have a relatively high standard deviation over a small time interval.
  • Dropping networks whose averaged RSSI values over a small time interval fall below a defined threshold.
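
The four components above can be combined roughly as in the following minimal Python sketch. It is an illustration under stated assumptions, not our exact implementation: the function names, the input layout (a dictionary mapping each network to its RSSI readings over one short window), and the default thresholds `max_std` and `min_rssi` are all hypothetical, and we assume here that n is the number of networks compared and that normalizing means dividing by n.

```python
import math
from statistics import mean, pstdev


def summarize(scans, max_std, min_rssi):
    """Average each network's RSSI readings over one short time window.

    scans maps a network identifier to a list of RSSI readings in dBm.
    Networks that are too noisy or too weak are dropped.
    """
    kept = {}
    for network, readings in scans.items():
        average = mean(readings)
        if pstdev(readings) > max_std:
            continue  # drop: standard deviation is too high
        if average < min_rssi:
            continue  # drop: averaged signal is below the threshold
        kept[network] = average
    return kept


def hybrid_distance(scans_a, scans_b, max_std=5.0, min_rssi=-85.0):
    """Normalized Euclidean distance between two phones' filtered scans.

    The default thresholds are placeholders, not tuned values.
    Returns None when no network survives filtering on either phone.
    """
    a = summarize(scans_a, max_std, min_rssi)
    b = summarize(scans_b, max_std, min_rssi)
    networks = set(a) | set(b)
    if not networks:
        return None

    # Penalty for unmatched networks: the grand average across both phones.
    grand_average = mean(list(a.values()) + list(b.values()))

    total = 0.0
    for network in networks:
        rssi_a = a.get(network, grand_average)
        rssi_b = b.get(network, grand_average)
        total += (rssi_a - rssi_b) ** 2

    # Assumption: n is the number of networks compared.
    return math.sqrt(total) / len(networks)
```

For example, `hybrid_distance({"ap1": [-50, -50], "ap2": [-60, -60]}, {"ap1": [-53, -53], "ap2": [-64, -64]})` compares two fully matched networks with differences of 3 and 4 dBm, giving a raw distance of 5 and a normalized distance of 2.5.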

We produced many plots and performed extensive analysis on a ‘hybrid’ algorithm that incorporates these specific conditions, and it yielded the highest R² values of any single algorithm we wrote over the semester. These results, which showed consistently positive slopes and clear correlation, lead us to believe that an algorithm combining these individual conditions could be very effective.
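
For readers who want to reproduce this kind of evaluation, the slope and R² of a plot can be obtained from an ordinary least-squares line fit, as in the Python sketch below. The function name `fit_line` is hypothetical, and we assume here that the x-values are true distances between the phones and the y-values are the corresponding algorithm outputs; no measured data is included.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys against xs.

    Returns (slope, intercept, r_squared). Assumes at least two
    distinct x-values and that ys are not all identical.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```

A positive slope with R² close to 1 indicates that the algorithm's output grows consistently with true distance, which is the behavior we looked for when comparing algorithms.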

However, one major question remains: while we have established that these data inclusion rules are essential, the thresholds at which they should operate have yet to be finalized. We believe that the optimal standard deviation and RSSI thresholds differ depending on the circumstances of the testing environment, which leaves room for further research into choosing these values dynamically using machine learning analysis. Once these thresholds can be determined in a universal way that allows the algorithm to perform optimally in all environments, it would serve as a viable solution for large-scale implementation. At that point, work could also be done to establish a proper conversion from Euclidean distance to true distance, which would allow for correct identification of exposures.