Idea for a decentralised COVID-19 distance (non-)tracking aepp

doktorbastian.chain · April 2, 2020, 2:14pm

Hi all,

ML guru Yoshua Bengio proposed a decentralised app which supposedly enables people to compute their own probability of infection.

However, it seems to rely solely on point-to-point bluetooth visibility for determining distance. It says “the apps could talk to each other via bluetooth within 10 meters”, but I don’t know how this would work, because bluetooth range is highly variable, and you wouldn’t know if there is, e.g., a lightweight wall in between.

But we could refine it into an aepp to enable anonymised after-the-fact triangulation like this:

Each client aepp could randomly create a private-public key pair and broadcast the public key via bluetooth (but see caveats below). It would log all public keys it sees from other aepps, with timestamp, location (coordinates) and signal strength.
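
A minimal sketch of what such a sighting log could look like. It assumes a random 32-byte token as a stand-in for the broadcast public key (a real aepp would generate an actual key pair), and `Sighting` and `SightingLog` are hypothetical names, not part of any existing aepp SDK.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Sighting:
    """One broadcast observed from another aepp."""
    public_key: bytes   # ID the other device broadcast
    timestamp: float    # seconds since the epoch
    lat: float          # own latitude when the broadcast was seen
    lon: float          # own longitude when the broadcast was seen
    rssi_dbm: int       # received signal strength


@dataclass
class SightingLog:
    # Assumption: a random token stands in for the real public key.
    own_id: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    entries: List[Sighting] = field(default_factory=list)

    def record(self, public_key: bytes, lat: float, lon: float,
               rssi_dbm: int, timestamp: Optional[float] = None) -> None:
        if public_key == self.own_id:
            return  # ignore our own broadcast
        ts = time.time() if timestamp is None else timestamp
        self.entries.append(Sighting(public_key, ts, lat, lon, rssi_dbm))

    def seen(self, public_key: bytes) -> List[Sighting]:
        """All logged sightings of one particular ID."""
        return [s for s in self.entries if s.public_key == public_key]
```

Everything stays on the device at this stage; nothing in the sketch publishes or uploads the log.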

Then diagnosed patients could make the aepp announce their own infection by calling a suitable function on a smart contract, or this could be done by a doctor for the patient’s broadcast public key.

Then the aepp would regularly check the list of infected persons on that smart contract and search its own logs to see whether it came near any of those IDs recently enough. If so, the aepp could publish its logs, encrypted with the public keys of the other aepps it came near at around the same time. Those other aepps would do the same in turn, so potentially more readings from different locations would be available to triangulate the actual distance to the infected person.
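
The matching step could be sketched roughly as follows. This is only an illustration: the function name `co_present_keys` and the parameters `max_age_s` and `window_s` are assumptions, the infected list is passed in as a plain set instead of being read from a contract, and the encryption and publishing of the logs is left out.

```python
from typing import Iterable, Set, Tuple

# (public_key, timestamp) pairs, as they might be read from the local log
LogEntry = Tuple[bytes, float]


def co_present_keys(log: Iterable[LogEntry], infected_ids: Set[bytes],
                    max_age_s: float, window_s: float, now: float) -> Set[bytes]:
    """Keys of other aepps seen around the same time as an infected ID.

    These are the only keys the local readings would be encrypted for.
    """
    entries = list(log)
    # Times at which we saw an infected ID recently enough.
    hits = [ts for key, ts in entries
            if key in infected_ids and now - ts <= max_age_s]
    recipients: Set[bytes] = set()
    for key, ts in entries:
        if key in infected_ids:
            continue
        if any(abs(ts - hit) <= window_s for hit in hits):
            recipients.add(key)
    return recipients
```

If no infected ID appears in the log, the result is empty and nothing would be shared.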

This way, the sharing of each user’s own location readings would be limited to those people who were also near an infected person around the same time.

Note that this means that in places where there are many people, there would arguably be some sufficient degree of anonymity. Not so much in places where there were few people (if there was only this one patient with you in the same place, you’ll have seen them and possibly know them).

But it could all be opt-in, the reporting and the data exchange for triangulation.

Then we could still add an ML-based service which determines one’s own probability of being infected given the triangulation results from the known cases. It would work completely anonymously.

But there are some caveats:

I don’t know whether broadcasting a public key via bluetooth to unpaired phones is actually possible. There seem to be ways to link point-to-point without pairing, but they depend on the hardware; another option is to set the key as the bluetooth device name (which might be too short for a public key, though, and would make using BT for anything else highly inconvenient). Does anyone here know more?
Allowing third parties to mark someone as infected can be abused if it is not permissioned somehow, e.g. by allowing only vetted doctors or medical staff to do so. But disallowing it would risk under-reporting, i.e. patients simply not marking themselves as infected. So a vetting protocol is probably needed.

Thoughts?

nikitafuchs.chain · April 2, 2020, 2:31pm

Damn, this is great.

For your questions: There is some sort of public identification you can get from a bluetooth device when scanning the devices around you. I’m not sure, though, whether devices generally limit how often they respond.

False accusations might become a bigger problem, though, as I could take the “bluetooth ID” of someone else and give it the status “infected” on the contract. The only way to prevent this would be to have the devices send signed identification messages, whose public key could be matched against the data in a smart contract for that particular device (or its owner).

Difficult, difficult.

doktorbastian.chain · April 2, 2020, 3:10pm

I just don’t know about the bluetooth issues, need to learn more (links to short and good intros would be appreciated).

Good point about the additional vector for false accusations using a “spoofed” public key. Clearly this needs some refinements…

doktorbastian.chain · April 6, 2020, 8:22am

Meanwhile I have learned that Bluetooth Low Energy allows the required kind of non-paired broadcasting of IDs. It also enables distance estimation via RSSI, so triangulation might not be necessary.

This, as well as a mechanism to shuffle the broadcast public keys in order to prevent tracking of users by snooping and logging (tracking that would probably allow de-anonymisation, particularly when combined with e.g. cameras), is tackled by the WeTrace approach by the Airgap team.

Interestingly, PEPP-PT, the official EU approach, also proposes frequent ID regeneration, but then seems to botch this by having the app of infected persons send the “secret” seed of these IDs to the central server, and then even to all users from there:

The app opens an encrypted TLS connection to the server and sends the authorization code and its seed SK, a compact representation of the EphIDs it has broadcast during the infectious window, to the backend.

Several times a day, the backend server sends newly received EphID seeds of infected patients to all installed proximity tracing apps via the notification service.
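
To illustrate why distributing the seed matters, here is a rough sketch. It is not the PEPP-PT construction: deriving the IDs with HMAC-SHA256 over a counter is an assumption made purely for illustration, and `ephemeral_ids` and `matches_published_seed` are hypothetical names.

```python
import hashlib
import hmac
from typing import Iterable, List


def ephemeral_ids(seed: bytes, count: int, id_len: int = 16) -> List[bytes]:
    """Derive `count` ephemeral IDs from one secret seed (illustrative only)."""
    ids = []
    for i in range(count):
        # Assumed derivation: HMAC of a 4-byte counter, truncated to id_len.
        digest = hmac.new(seed, i.to_bytes(4, "big"), hashlib.sha256).digest()
        ids.append(digest[:id_len])
    return ids


def matches_published_seed(published_seed: bytes,
                           observed_ids: Iterable[bytes],
                           count: int) -> List[bytes]:
    """Anyone holding a published seed can recompute the IDs and match them."""
    derived = set(ephemeral_ids(published_seed, count))
    return [eid for eid in observed_ids if eid in derived]
```

Under these assumptions, the IDs look unrelated to an observer only as long as the seed stays secret. Once the seed is sent out, every recipient can link all of that person's IDs in whatever they have logged.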

bruteforce.chain · April 6, 2020, 10:57am

I have some experience with BLE4.0 (Eddystone, iBeacons, etc.).

Correct, in BLE you constantly broadcast packets, which can contain some payload in the form of minor/major IDs, RSSI, Tx power and some other parameters (up to 31 bytes in a legacy advertising packet, as far as I remember).

Triangulation is not that hard if you need an accurate location. Here is some info from a research project of mine (using BLE beacons a couple of years ago):

da = d0a * 10 ^ ((RSSI0a-RSSIa) / (10*cappa))
db = d0b * 10 ^ ((RSSI0b-RSSIb) / (10*cappa))
dc = d0c * 10 ^ ((RSSI0c-RSSIc) / (10*cappa))

d0 - reference distance at which RSSI0 was measured
RSSI0 (dBm) - power of the signal, measured at the reference distance d0 (here 1m)
RSSI (dBm) - power of the signal, measured by the mobile phone of the user/client
cappa - path-loss exponent, the value of which depends on the conditions of the area in which the measurements are taken (such as geometrical dimensions, obstacles etc.).
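
As a small sketch, the distance formula above translates directly into code. The function name `rssi_to_distance` and the default `cappa=2.0` (the free-space value) are assumptions; in practice cappa would have to be calibrated for the environment.

```python
def rssi_to_distance(rssi_dbm: float, rssi0_dbm: float,
                     d0_m: float = 1.0, cappa: float = 2.0) -> float:
    """Estimate distance in metres: d = d0 * 10 ^ ((RSSI0 - RSSI) / (10 * cappa)).

    rssi0_dbm is the signal power measured at the reference distance d0_m.
    cappa=2.0 is an assumed default, not a calibrated value.
    """
    return d0_m * 10 ** ((rssi0_dbm - rssi_dbm) / (10 * cappa))
```

For example, with RSSI0 = -59 dBm at 1m and cappa = 2, a reading of -79 dBm gives 10 ^ (20 / 20) = 10m. With cappa = 4 the same reading would give only about 3.2m, which shows how sensitive the estimate is to the environment.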

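(xa, ya), (xb, yb), (xc, yc) - known coordinates of the three beacons a, b, c
(x, y) - position being solved for

va = ((db*db-dc*dc) - (xb*xb-xc*xc) - (yb*yb-yc*yc)) / 2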
vb = ((db*db-da*da) - (xb*xb-xa*xa) - (yb*yb-ya*ya)) / 2

temp1 = vb*(xc-xb) - va*(xa-xb)
temp2 = (ya-yb)*(xc-xb) - (yc-yb)*(xa-xb)
y = temp1 / temp2
x = (va - y*(yc-yb)) / (xc-xb)
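
The position formulas above can be wrapped into one function, sketched here. The name `trilaterate` and the error handling are assumptions; the arithmetic follows the formulas as posted, so it inherits their requirement that xc differs from xb and that the three beacons are not on one line.

```python
from typing import Tuple

Point = Tuple[float, float]


def trilaterate(a: Point, b: Point, c: Point,
                da: float, db: float, dc: float) -> Tuple[float, float]:
    """Solve for (x, y) from three beacon positions and distances to them."""
    (xa, ya), (xb, yb), (xc, yc) = a, b, c
    va = ((db*db - dc*dc) - (xb*xb - xc*xc) - (yb*yb - yc*yc)) / 2
    vb = ((db*db - da*da) - (xb*xb - xa*xa) - (yb*yb - ya*ya)) / 2
    temp1 = vb*(xc - xb) - va*(xa - xb)
    temp2 = (ya - yb)*(xc - xb) - (yc - yb)*(xa - xb)
    # temp2 == 0 means the beacons are collinear; xc == xb breaks the
    # division used for x in this particular form of the solution.
    if temp2 == 0 or xc == xb:
        raise ValueError("beacon layout not solvable with these formulas")
    y = temp1 / temp2
    x = (va - y*(yc - yb)) / (xc - xb)
    return x, y
```

With beacons at (0, 0), (4, 0) and (0, 4) and exact distances to the point (1, 1), the function returns (1, 1) up to rounding. With noisy RSSI-based distances the three circles will not meet in one point, so the result is only an approximation.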

My two cents: people are not used to having their Bluetooth constantly turned on, which may be a challenge for such apps.

doktorbastian.chain · April 6, 2020, 11:22am

Thanks, great info. However, just using the distance info without triangulation might actually be better, because less info is acquired about other users.

people are not used to having their Bluetooth constantly turned on which may be a challenge for such apps.

This is another good point.

doktorbastian.chain · April 10, 2020, 4:35pm

I have written up my current understanding of other approaches (including WeTrace and the EU’s PEPP-PT) in a working doc, together with how to combine the best elements of each and replace the central communication server with a smart contract.
There are some issues to solve, namely the validation of reports and how smart contract calls can be made without compromising anonymity (I still need to update the doc on the latter point).

Mogley · April 11, 2020, 12:43pm

Something you might find interesting:

doktorbastian.chain · April 16, 2020, 7:52am

I’ve updated the working doc with the new approaches from MIT and Apple/Google (spoiler: they’re not much better than PEPP-PT), as well as added some more “Open Problems” and thoughts on mitigations.