This post has been partly rewritten as a Medium post (and expanded on the Bluetooth considerations).
The Oxford team has recently released a paper comparing the epidemiological advantages of decentralized and centralized systems. This is very welcome, as it will enable better evidence-based comparison of protocols.
That paper includes the following paragraph:
This says essentially:
- Whatever the system, you will need to make algorithmic risk assessments based on [Bluetooth-measured distance and] data about the infector and/or the infectee.
- Decentralized systems need to make these assessments based on [Bluetooth-measured distance and] data about the potential infectee.
- Centralized systems need to make these assessments based on [Bluetooth-measured distance and] data about the potential infector.
- The three best epidemiological predictors of actual transmission [conditioned on a given distance measurement, with Bluetooth signal strength as a proxy] are:
- time before or since onset of symptoms of the index case (infector side)
- severity of symptoms of the index case (infector side)
- age of the contact (infectee side)
It ends with “This gives a small accuracy advantage to the centralized system, since two of the three predictors depend on the transmitter, and the effect sizes are larger”. The paper does not offer any data to back up this quantitative comparison, but the breakdown aligns with the broader public health discussion.
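To make the asymmetry concrete, here is a toy sketch in Python (purely illustrative: the field names and the split of who sees what are my reading of the argument, not anything specified by the paper) of which inputs a risk calculation can draw on in each architecture.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Encounter:
    """Inputs a risk assessment could in principle use (field names are illustrative)."""
    bluetooth_attenuation_db: float                      # distance proxy, available to both designs
    days_since_infector_symptom_onset: Optional[float]   # infector-side predictor
    infector_symptom_severity: Optional[float]           # infector-side predictor
    infectee_age: Optional[int]                          # infectee-side predictor

def decentralized_view(e: Encounter) -> Encounter:
    # Scoring runs on the infectee's phone: it knows its owner's age, but only
    # whatever infector-side data the infected person chose (or was able) to upload.
    return Encounter(e.bluetooth_attenuation_db, None, None, e.infectee_age)

def centralized_view(e: Encounter) -> Encounter:
    # Scoring runs on the server: it can use the infector-side data collected by
    # the health system, but typically does not know the infectee's age.
    return Encounter(e.bluetooth_attenuation_db,
                     e.days_since_infector_symptom_onset,
                     e.infector_symptom_severity,
                     None)
```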
Within the context of the original article (narrow comparison between the two protocols), this argument could arguably make sense, and even be convincing.
In this post I want to highlight what is left unsaid in the article, and show that the resulting effects would trump absolutely every other consideration. It also suggests that focusing on protocols is a distraction. If you step back and look more broadly, you see that contact tracing apps are trying to infer transmission risk from Bluetooth signals. The Oxford paper highlights some sources of uncertainty, tied to the biological conditions of either side. But of course there is another source of uncertainty: inferring distance from Bluetooth signals. This is presumably very important, since the main public intervention implemented so far - after washing hands - is social/physical distancing (with a sharp drop-off in risk at 2m; presumably one can dig up papers backing up this claim).
So, how good is BLE at inferring distance?
This is a screenshot taken from here, showing the impact of a human body obstructing a Bluetooth channel.
This is a video showing a 20dB difference between carrying a phone in your front or back pocket.
I have confirmed this myself, playing around with iStumbler on my laptop and the “Pally BLE Scanner” app on my phone (with different experimental setups): the shadowing due to a body is at least 15dB (disclaimer: I should exercise more).
A delta of 20dB corresponds to a corrective factor of 10 on distance (rule of thumb: every 6dB corresponds to a doubling/halving of distance).
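As a sanity check on that rule of thumb, here is a minimal sketch (assuming a free-space path-loss exponent of 2; real indoor environments can deviate substantially) of how a dB offset translates into a multiplicative error on the inferred distance:

```python
def distance_factor(delta_db: float, path_loss_exponent: float = 2.0) -> float:
    """Multiplicative error on inferred distance caused by a delta_db offset in signal strength.

    Under a log-distance path-loss model, received power falls by
    10 * n * log10(d) dB, so an offset of x dB shifts the inferred
    distance by a factor of 10 ** (x / (10 * n)).
    """
    return 10 ** (delta_db / (10 * path_loss_exponent))

print(distance_factor(6))   # ~2.0  -> 6 dB is roughly a doubling/halving of distance
print(distance_factor(20))  # 10.0  -> 20 dB is a factor of ten
```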
You have to take into account that this dampening of the signal could be present on the infector side, on the infectee side, or both! This seems like a much more significant factor in deciding which protocol to use (or at least it would be much more critical to pay attention to how this noise is removed in either protocol, and who gets to decide what counts as noise in the first place!).
Now, Bluetooth is a very complex protocol. In fact there are many additional sources of information that could be used to improve the distance measurement, such as the fact that it is not just one channel but at least three (see again here and other research we are currently analyzing).
This being said:
- All these improvements to the distance measurement exploit features of the Bluetooth channels (or side channels) that are also risks from a privacy point of view. For instance, which of the three subchannels are used, and in what order, can also help determine which device model is being used. In a sense, saying that a side channel is exploited only highlights gaps in the privacy analysis.
- There is also huge variation (up to 20dB, again a factor of ten) between smartphone models, and on average 10dB between devices of the same model (a factor of 3; it is not actually clear that this variation is not already present for the same device at different times). This was measured by the Singapore team in an anechoic chamber:
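To give a feel for what such a spread does to distance inference, here is a hedged sketch: the reference signal strength at 1m and the free-space exponent are assumptions chosen for illustration, not measured values.

```python
def infer_distance_m(rssi_db: float, rssi_at_1m_db: float = -60.0,
                     path_loss_exponent: float = 2.0) -> float:
    """Naive log-distance inversion: the distance implied by a single RSSI reading."""
    return 10 ** ((rssi_at_1m_db - rssi_db) / (10 * path_loss_exponent))

measured = -72.0  # example reading, nominally implying ~4 m
for calibration_error_db in (-10, 0, +10):
    d = infer_distance_m(measured + calibration_error_db)
    print(f"device reads {calibration_error_db:+d} dB off -> inferred {d:.1f} m")
# With a +/-10 dB device-to-device spread, the same true distance can be
# read as anywhere between roughly 1.3 m and 12.6 m.
```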
Signal attenuation does enter into the risk calculation of the Google/Apple APIs. Apps are offered access to an attenuation value (from the API documentation, v1.2):
This value is calculated as the difference between transmission power and the maximum strength of the signal received:
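In code, that calculation boils down to something like the following (a sketch; the function name and types are mine, not the actual API surface):

```python
def attenuation_db(reported_tx_power_dbm: float, received_rssi_dbm: list[float]) -> float:
    """Attenuation as described in the Exposure Notification documentation:
    the reported transmit power minus the strongest RSSI observed during
    the scan (both in dBm, so the result is in dB)."""
    return reported_tx_power_dbm - max(received_rssi_dbm)

# Example numbers only: advertised at -20 dBm, best reading during the scan was -75 dBm.
print(attenuation_db(-20, [-82, -75, -79]))  # 55 dB of attenuation
```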
You would be forgiven for wondering how the transmission power is known to the receiving device. Well, it turns out it is transmitted automatically, but encrypted, so only Google and Apple devices can decode it once a person declares themselves infected (you have to read through both the Google/Apple Cryptography Specification and the Bluetooth Specification). Note that there are bytes reserved for future use (and that it is hopeless to measure the distance to a device without knowing the power AND the orientation of the emitting device).
This metadata is encrypted using the temporary keys (so it can be decrypted on the at-risk person’s phone once the infected person declares their status through the server, which hands out those keys).
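For the curious, here is my reading of those two specifications as a Python sketch (the function name is mine, and the details should be checked against the specs; this is an illustration, not the shipped implementation):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def recover_tx_power_dbm(tek: bytes, rpi: bytes, aem: bytes) -> int:
    """Recover the transmit power byte from the Associated Encrypted Metadata,
    once the Temporary Exposure Key (tek) has been published via the server.

    rpi: the 16-byte Rolling Proximity Identifier seen in the advertisement.
    aem: the 4-byte Associated Encrypted Metadata from the same advertisement.
    """
    # AEMK = HKDF-SHA256(TEK, salt=None, info="EN-AEMK"), 16 bytes (per the crypto spec)
    aemk = HKDF(algorithm=hashes.SHA256(), length=16, salt=None,
                info=b"EN-AEMK").derive(tek)
    # Metadata = AES-128-CTR(AEMK, IV = Rolling Proximity Identifier, AEM)
    metadata = Cipher(algorithms.AES(aemk), modes.CTR(rpi)).decryptor().update(aem)
    # Byte 0: versioning, byte 1: transmit power (signed dBm), bytes 2-3: reserved for future use
    return int.from_bytes(metadata[1:2], "big", signed=True)
```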
Conclusion
This post was written as the author came to understand the comparative significance of the hardware and epidemiological factors. Indeed, it is clear from looking at the engineering that technical sources of error, and the systematic biases they might introduce, will dominate everything else, and should be addressed before even considering:
- making a comparison between de/centralized protocols.
- computing how those systematic biases translate into systemic outcomes (e.g. false positives systematically affecting users of Android phones sold under a Chinese brand)
- thinking about false positives/false negatives