Abstract
Introduction Privacy protection is a core principle of genomic research but needs further refinement for high-throughput proteomic platforms.
Methods We identified independent single nucleotide polymorphism (SNP) protein quantitative trait loci (pQTL) in COPDGene and the Jackson Heart Study (JHS), then calculated genotype probabilities by protein level for each protein-genotype combination (training). Using the 100 most significant proteins, we applied a naïve Bayesian approach to match proteomes to genomes for 2,812 independent subjects from COPDGene, JHS, the SubPopulations and InteRmediate Outcome Measures In COPD Study (SPIROMICS), and the Multi-Ethnic Study of Atherosclerosis (MESA) with SomaScan 1.3K proteomes, and for 2,646 COPDGene subjects with SomaScan 5K proteomes (testing). We also tested whether subtracting the mean genotype effect for each pQTL SNP would obscure genetic identity.
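The matching step described in the Methods can be illustrated with a minimal sketch on simulated data. This is not the authors' implementation: it assumes a Gaussian likelihood of protein level given genotype (the study instead tabulates genotype probabilities by protein level), one pQTL SNP per protein, and a uniform prior over candidate genomes. All function names, sample sizes, allele frequencies, and effect sizes are hypothetical.

```python
import numpy as np

def simulate(n, maf, beta, rng):
    """Simulated cohort: one SNP per protein, additive genotype effect plus unit noise."""
    genotype = rng.binomial(2, maf, size=(n, maf.size))   # minor-allele counts 0/1/2
    protein = genotype * beta + rng.normal(size=genotype.shape)
    return protein, genotype

def fit_genotype_gaussians(protein, genotype):
    """Training: per-protein mean level for each genotype and a pooled residual SD."""
    n, p = protein.shape
    mu = np.zeros((p, 3))
    sd = np.ones(p)
    for j in range(p):
        resid = protein[:, j].copy()
        for g in range(3):
            mask = genotype[:, j] == g
            # Fall back to the overall mean if a genotype is absent in training.
            mu[j, g] = protein[mask, j].mean() if mask.any() else protein[:, j].mean()
            resid[mask] -= mu[j, g]
        sd[j] = resid.std() + 1e-8
    return mu, sd

def log_likelihood_matrix(protein, genotype, mu, sd):
    """scores[i, k] = naive Bayes log-likelihood (up to a constant) of proteome i under genome k."""
    cols = np.arange(protein.shape[1])
    scores = np.empty((protein.shape[0], genotype.shape[0]))
    for k, g in enumerate(genotype):
        z = (protein - mu[cols, g]) / sd
        # Independence across proteins: per-protein log-likelihoods are summed.
        scores[:, k] = -0.5 * (z ** 2).sum(axis=1)
    return scores

rng = np.random.default_rng(0)
maf = rng.uniform(0.1, 0.5, size=100)    # assumed allele frequencies
beta = rng.uniform(0.8, 1.5, size=100)   # assumed effect sizes, in noise SD units
train_x, train_g = simulate(500, maf, beta, rng)
test_x, test_g = simulate(200, maf, beta, rng)

mu, sd = fit_genotype_gaussians(train_x, train_g)
scores = log_likelihood_matrix(test_x, test_g, mu, sd)

n_test = scores.shape[0]
true_scores = scores[np.arange(n_test), np.arange(n_test)]
rank = (scores > true_scores[:, None]).sum(axis=1)   # 0 means the true genome ranks first
accuracy = float((rank == 0).mean())
top_1pct = float((rank < max(1, n_test // 100)).mean())
print(f"exact match: {accuracy:.3f}, true genome in top 1%: {top_1pct:.3f}")
```

Because the pooled SD is shared across genotypes, the normalising term is the same for every candidate genome and can be dropped from the score. The simulated effect sizes are chosen for illustration only and say nothing about the accuracy reported in the study.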
Results In the four testing cohorts, we correctly matched 90%-95% of proteomes to their correct genome, and for 95%-99% of proteomes the correct genome was among the 1% most likely genomes. With larger profiling (SomaScan 5K), correct identification was > 99%. Matching accuracy in subjects with African ancestry was lower (∼60%) unless training included diverse subjects. Mean genotype effect adjustment reduced identification accuracy nearly to that of random guessing.
Conclusion Large proteomic datasets (> 1,000 proteins) can be accurately linked to a specific genome through pQTL knowledge and should not be considered deidentified. These findings suggest that large-scale proteomic data should be given the same privacy protections as genomic data, or that bioinformatic transformations (such as adjustment for genotype effect) should be applied to obfuscate identity.
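One way to read the genotype effect adjustment mentioned above is as a per-protein recentering: within each genotype group of the pQTL SNP, protein levels are shifted so that every group has the same mean. The sketch below follows that reading on simulated data. It is an assumption about the transformation, not the authors' code; the function name, sample size, allele frequencies, and effect size are hypothetical.

```python
import numpy as np

def subtract_genotype_means(protein, genotype):
    """Shift each genotype group to the overall mean, protein by protein."""
    adjusted = protein.astype(float).copy()
    for j in range(protein.shape[1]):
        overall = protein[:, j].mean()
        for g in np.unique(genotype[:, j]):
            mask = genotype[:, j] == g
            adjusted[mask, j] += overall - protein[mask, j].mean()
    return adjusted

def genotype_correlation(protein, genotype):
    """Pearson correlation between each protein and its paired SNP."""
    return np.array([
        np.corrcoef(protein[:, j], genotype[:, j])[0, 1]
        for j in range(protein.shape[1])
    ])

rng = np.random.default_rng(1)
maf = rng.uniform(0.2, 0.5, size=50)                      # assumed allele frequencies
genotype = rng.binomial(2, maf, size=(300, 50))           # minor-allele counts 0/1/2
protein = genotype * 1.0 + rng.normal(size=(300, 50))     # assumed effect of 1 SD per allele

adjusted = subtract_genotype_means(protein, genotype)
corr_before = genotype_correlation(protein, genotype)
corr_after = genotype_correlation(adjusted, genotype)
print(f"mean |r| before: {np.abs(corr_before).mean():.3f}")
print(f"max  |r| after:  {np.abs(corr_after).max():.2e}")
```

After recentering, the mean protein level no longer differs by genotype, so a matcher that relies on genotype-specific means has no signal left to use. In this sketch the adjustment is computed by a data holder who has access to the genotypes before release; differences in variance between genotype groups, if any, are left untouched.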
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
The work for this manuscript was funded by NIH R01 HL137995 Supplement.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This manuscript proposal was approved by oversight committees from all four participating cohorts: COPDGene, SPIROMICS, MESA, and JHS.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
hilla{at}njhealth.org
elizabeth.litkowski{at}cuanschutz.edu
am3xa{at}virginia.edu
LESLIE.LANGE{at}CUANSCHUTZ.EDU
prattek{at}njhealth.org
KATERINA.KECHRIS{at}CUANSCHUTZ.EDU
MATTHEW.DECAMP{at}CUANSCHUTZ.EDU
MARILYN.COORS{at}CUANSCHUTZ.EDU
Ortega.Victor{at}mayo.edu
ssr4n{at}virginia.edu
jrotter{at}lundquist.org
rgerszte{at}bidmc.harvard.edu
clary{at}broadinstitute.org
jlcurtis{at}med.umich.edu
xh6dx{at}virginia.edu
debby.ngo{at}novartis.com
wanda_o'neal{at}med.unc.edu
dameyers{at}arizona.edu
erbleecker{at}email.arizona.edu
rebdh{at}channing.harvard.edu
remhc{at}channing.harvard.edu
FARNOUSH.BANAEI-KASHANI{at}UCDENVER.EDU
Data Availability
All data produced in the present study are available upon reasonable request to the authors.