A growing number of security and data privacy experts are warning that proposed NHS Digital plans to scrape medical data on 55 million patients in England into a new database create unacceptable levels of security risk.
The plan was officially announced earlier in May, and of particular note is the fact that patients have only until 23 June 2021 to opt out of the scheme by filling out a paper-based form and handing it to their GP. If they do not do so, their data will become part of the data store and they will not be able to remove it, although they will be able to stop data yet to be generated from being added.
The General Practice Data for Planning and Research (GPDPR) database will contain swathes of sensitive personally identifiable information (PII), which will be pseudonymised, and will include data on diagnoses, symptoms, observations, test results, medications, allergies, immunisations, referrals, recalls and appointments. It will also include information on physical, mental and sexual health, data on gender, ethnicity and sexual orientation, and data on staff who have treated patients.
It is proposed that the data store will be shared by multiple bodies, including academic and commercial organisations such as pharmaceutical companies, in the interests of research and forward health planning, to analyse inequalities in healthcare provision, and to research the long-term impact of Covid-19 on the population.
David Sygula, a senior cyber security analyst at CybelAngel, conceded that, taken at face value, the plans provided some “strong benefits” from the perspective of an academic researcher, and agreed that – as NHS Digital hopes – an initiative such as GPDPR could be highly valuable in controlling the magnitude of the pandemic’s impact on the UK.
“However,” he added, “data collection on this scale is creating a new set of risks for individuals, where their personal health information is exposed to third-party data breaches.
“The extent of the unsecured database problem is growing. It is not simply an NHS issue, but the NHS’s third, fourth or further removed parties too, and how they will ensure the data is securely handled by all suppliers involved. These security policies and processes absolutely need to be planned well in advance and details shared with both third parties and individuals.”
Sygula recommended several mechanisms that might usefully be put in place – such as the full anonymisation, not pseudonymisation, of data – on the basis that a leak of data from the system is practically inevitable.
“Security researchers, attackers and rogue states have all put in place processes to identify unsecured databases and will rapidly find leaked information,” he said. “That is the default assumption we should start with. It is about making sure patients are not personally exposed in case of a breach, while setting up the appropriate monitoring tools to look for exposed data among the supply chain.”
Timelines too short?
Beyond the risk from third-party breaches and cyber criminals tempted by valuable personal data, IntSights chief compliance officer Chris Strand said that in his view, NHS Digital had failed to give people long enough to assess their personal risk position and opt out if desired.
“The opt-out plan could introduce complexities for some people who aren’t actively involved in how their data is used or who understand the implications of how their data may be used for research,” he said. “In the course of less than a month, how can they ensure that every individual included had an adequate opportunity to be informed on the data use and also had the opportunity to understand the implications of their data being used by third parties?
“I would be concerned about the legality of proving that people had a fair opportunity to opt out of the ‘data collection’. There could be challenges presented after the database is released to those who want to use it for research.
“Having dealt with the process of ensuring data use is disclosed to data owners, there may be legal consequences as it could be difficult to prove that all the individuals included in the database had an adequate opportunity to opt out of its use, especially given the nature of the sensitive data involved in this database.”
History repeating itself
Keystone Law technology and data partner Vanessa Barnett was also among those who pointed out risks. She said previous data-sharing health initiatives, such as an arrangement between the Royal Free Hospital NHS Trust and Google DeepMind, had been ruled non-compliant with the UK’s Data Protection Act (DPA) by the Information Commissioner’s Office (ICO).
“This is one of those times where one of the less famous bits of the GDPR [General Data Protection Regulation] comes to mind – that the processing of personal data should be designed to serve mankind,” she said. “The right to protection of personal data is not an absolute right; it must be considered in relation to its function in society and be balanced against other fundamental rights, in accordance with the principle of proportionality.
“This processing of health data could quite rightly serve mankind – but it all depends on what data, who it is given to, and what they do with it.”
In the Royal Free-DeepMind case, the ICO found shortcomings in the way patient records were shared, notably that patients would not have reasonably expected their data to be shared, and that the Trust should have been more transparent over its intentions.
“To me, this new mass sharing proposed by the NHS could well be history repeating itself,” said Barnett. “Most people would not expect their GP records to be shared in this way, have no awareness of it, and will not opt out because they had no awareness.
“It is noteworthy to see that the data will be pseudonymised rather than anonymised – so it is possible to reverse-engineer the identity of the patients in some circumstances. If the data lake being created is genuinely for research, analysing healthcare inequalities and research for serious illness, what is the reason this cannot be done on a true anonymised basis?”
Barnett warned that while using personal data in this way was not in itself illegal, failure to put in the necessary legwork to enable the data subjects – the general public – to understand what is happening and to have a “real and proper” opportunity to withdraw consent could ultimately prove a breach of some of the more administrative aspects of the DPA.
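The distinction Barnett draws between pseudonymised and anonymised data can be illustrated with a minimal Python sketch. It is not NHS Digital's actual method, which the article does not describe: the record fields, the key and the function names below are all hypothetical, and a keyed hash (HMAC) is used here simply as one common way of producing a pseudonym.

```python
import hashlib
import hmac

# Hypothetical secret held by the data controller; not a real NHS key.
SECRET_KEY = b"example-key-held-by-data-controller"


def pseudonymise(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Replace the direct identifier with a keyed token; keep all other fields."""
    token = hmac.new(key, record["patient_id"].encode(), hashlib.sha256).hexdigest()
    out = {k: v for k, v in record.items() if k != "patient_id"}
    out["token"] = token
    return out


def relink(patient_id: str, rows: list, key: bytes = SECRET_KEY) -> list:
    """Anyone holding the key can recompute the token and find a known patient's rows."""
    token = hmac.new(key, patient_id.encode(), hashlib.sha256).hexdigest()
    return [row for row in rows if row["token"] == token]


def anonymise(record: dict) -> dict:
    """Drop the identifier entirely and coarsen the remaining quasi-identifiers."""
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "region": record["postcode"].split()[0],  # keep only the outward code
        "diagnosis": record["diagnosis"],
    }


# Entirely made-up example record.
record = {"patient_id": "P-0001", "age": 47, "postcode": "AB1 2CD", "diagnosis": "asthma"}
pseudo = pseudonymise(record)
anon = anonymise(record)
```

In this sketch the pseudonymised row can be tied back to a known patient by whoever holds the key, and it still carries the exact age and full postcode, which may be enough to identify someone without the key. The anonymised row has no token to recompute, although coarsening fields in this way does not by itself guarantee that re-identification is impossible.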
What NHS Digital says
According to outgoing NHS Digital CEO Sarah Wilkinson, GP data is particularly valuable to the health services because of the volume of illnesses treated in primary care. “We want to ensure that this data is made available for use in planning NHS services and in clinical research,” she said.
But Wilkinson acknowledged that it was critical this was done in such a way that patient confidentiality and trust are prioritised and uncompromised.
“We have therefore designed technical systems and processes which incorporate pseudonymisation at source, encryption in transit and in situ, and rigorous controls around access to data to ensure appropriate use,” she said. “We also seek to be as transparent as possible in how we manage this data, so that the quality of our services are constantly subject to external scrutiny.”
NHS Digital says it has consulted with patient and privacy groups, clinicians and technology experts, as well as multiple other bodies including the British Medical Association (BMA), the Royal College of GPs (RCGP) and the National Data Guardian (NDG) on the GPDPR system.
Arjun Dhillon, Caldicott guardian and clinical director at NHS Digital, said: “This dataset has been designed with the interests of patients at its heart.
“By reducing the burden of data collection from general practice, together with simpler data flows, increased security and greater transparency, I am confident as NHS Digital’s Caldicott guardian that the new system will protect the confidentiality of patients’ information and make sure that it is used properly for the benefit of the health and care of all.”
NHS Digital’s GPDPR transparency notice, including further details of how the data will be used and by whom, and information on how to opt out, is available on the NHS Digital website.