Anonymity game
Project
This classroom activity is part of the Disease Dynamics collection
When data are collected about a population, people may be concerned that details about their personal or health status are not secure. As researchers, we must therefore take steps to secure any personal details shared with us, such as social interactions, vaccination status and infections contracted, as well as the participant's name, age, address, etc., so that they cannot be identified. This is often done through anonymising the data so that it cannot be attributed to an individual.
However, sometimes data breaches occur if data is not stored securely. This activity shows how easy it is to de-anonymise data from only a small piece of information.
How many people can you identify if Jack and Lucy's list of mutual contacts is leaked?
Resources: Slides (PowerPoint or PDF), Printable Network (PowerPoint or PDF)
Aims
- To teach students about how important anonymity is during data collection.
- To understand how identities can be deduced retrospectively from a data set.
- To understand the importance of data security.
Activity (Small Groups)
Ask students to examine the network (on screen or printed out).
Invite students to try and de-anonymise the network, based on the leaked information of Jack and Lucy's mutual contacts.
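For classes that want to see the reasoning automated, the following is a minimal Python sketch of one way the de-anonymisation could work. It is not the network used in the slides: the node labels (n1 to n7), the ties between them and the three leaked names are all invented for illustration, and ties are assumed to be undirected. The idea is that if the leak names three people linked to both Jack and Lucy, we can look for pairs of anonymous nodes that share exactly three contacts.

```python
from itertools import combinations

# Hypothetical anonymised network (NOT the one in the slides):
# each node ID maps to the set of node IDs it is linked to.
network = {
    "n1": {"n3", "n4", "n5"},
    "n2": {"n3", "n4", "n5", "n6"},
    "n3": {"n1", "n2"},
    "n4": {"n1", "n2", "n7"},
    "n5": {"n1", "n2"},
    "n6": {"n2", "n7"},
    "n7": {"n4", "n6"},
}

# Hypothetical leak: the names of everyone linked to both Jack and Lucy.
leaked_mutual_contacts = {"Priya", "Sam", "Tom"}


def candidate_pairs(network, n_mutual):
    """Return every pair of nodes that shares exactly n_mutual contacts."""
    pairs = []
    for a, b in combinations(sorted(network), 2):
        shared = network[a] & network[b]
        if len(shared) == n_mutual:
            pairs.append((a, b))
    return pairs


candidates = candidate_pairs(network, len(leaked_mutual_contacts))
print("Pairs that could be Jack and Lucy:", candidates)

# If only one pair fits, their shared contacts must be the leaked names.
if len(candidates) == 1:
    a, b = candidates[0]
    exposed = network[a] & network[b]
    print("Nodes that must be the named contacts:", sorted(exposed))
```

In this toy network only one pair of nodes shares exactly three contacts, so a single leaked list pins down five of the seven nodes to a small set of names. A real network would usually need extra clues, such as how many links each person has, to separate the remaining candidates.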
Questions for thought:
What if we had coded the network with boys and girls?
What if a different pair of mutual links were leaked? What if only the single-directional choices were leaked?
What are the dangers of being able to re-identify people in this way?
Other networks may have sensitive data that people would not want to be made public. Does it matter what sort of information we are able to identify about someone?
Case Study: Governments losing data.
The personal details of every child in the UK (name, date of birth and address), along with the bank details and National Insurance numbers of the parents claiming child benefit, went missing in the post in 2007.
Her Majesty's Revenue and Customs (HMRC) apologised for this loss, which was caused by a junior member of staff sending the data on two unencrypted discs by standard Royal Mail to an audit office elsewhere in the UK. The loss could potentially affect 25 million Britons and cost £145m.
In response, the Government asked parents to remain vigilant for any suspicious behaviour on their bank accounts, although the full extent of the data breach may not be known until all of these children reach the age of 18, by which time some may have suffered identity fraud.