Evaluation of SARS-CoV-2 Antibody Point of Care Devices in the Laboratory and Clinical Setting ============================================================================================== * Kirsty McCance * Helen Wise * Jennifer Simpson * Becky Bachelor * Harriet Hale * Lindsay McDonald * Azul Zorzoli * Elizabeth Furrie * Charu Chopra * Frauke Muecksch * Theodora Hatziioannou * Paul D. Bieniasz * Kate Templeton * Sara Jenks ## Abstract SARS-CoV-2 Antibody tests have been marketed to diagnose previous SARS-CoV-2 infection and as a test of immune status. There is a lack of evidence on the performance and clinical utility of these tests. We aimed to carry out an evaluation of 14 point of care (POC) SARS-CoV-2 antibody tests. Serum from participants with previous RT-PCR (Real-Time Polymerase chain reaction) confirmed SARS-CoV-2 infection and pre-pandemic controls were used to determine specificity and sensitivity of each POC device. Changes in sensitivity with increasing time from infection were determined on a cohort of participants. Corresponding neutralising antibody status was measured to establish whether the detection of antibodies by the POC device correlated with immune status. Paired capillary and serum samples were collected to ascertain whether POC devices performed comparably on capillary samples. Sensitivity and specificity varied between the POC devices and in general did not meet the manufacturers reported performance characteristics signifying the importance of independent evaluation of these tests. The sensitivity peaked at >20 days following symptoms onset however sensitivity of 3 POC devices evaluated at extended time points showed that sensitivity declined with time and this was particularly marked at >140 days post infection onset. This is relevant if the tests are to be used for sero-prevelence studies. Neutralising antibody data showed positive antibody results on POC devices did not necessarily confer high neutralising antibody titres and these POC devices cannot be used to determine immune status to the SARS-CoV-2 virus. Comparison of paired serum and capillary results showed that there was a decline in sensitivity using capillary blood. This has implications in the utility of the test as they are designed to be used on capillary blood by the general population. ## Introduction The emergence of SARS-CoV-2 has triggered a race by laboratories and manufacturers to bring commercial diagnostic tests to market. The unusually rapid pace of this development risks potential compromises on quality in the absence of rigorous independent evaluation and validation. It is therefore essential to ensure adequate performance of tests for population wide or individual use to prevent roll out of tests which add no, or at best, minimal clinical value to individual patients or the wider population. The current most widely used diagnostic test for SARS-CoV-2 is based on real-time polymerase chain reaction (RT-PCR) amplification of viral RNA from an upper respiratory tract sample (1,2). However, due to a limited time window of active infection, capacity constraints and restriction of access to symptomatic patients, cases determined using this method underestimate the true burden of infection. In contrast, serological assays test for previous infection and are therefore a key additional tool for monitoring prevalence of infection within the population. Antibody tests can also be used as an aid in diagnosis where COVID-19 is suspected clinically but the PCR time window has passed (3) and significant interest exists in the potential for use of these tests at an individual level to provide an indication of immune status and act as an ‘immunity passport’. For countries where vaccine availability is limited pre-screening the population with antibody tests to identify individuals who may either not require vaccination or be suitable for a reduced vaccine dosing regimen may help optimise the use of limited vaccine resources. Of particular interest is the use of monoclonal antibody treatment in hospitalised patients with SARS-CoV-2. Ronapreve, a combination of Casirivimab and Imdevimab, is a monoclonal antibody treatment directed at the Spike protein receptor binding domain on SARS-CoV-2. This has been shown to significantly reduce 28 day mortality in seronegative hospitalised patients (4).Therefore a well performing antibody tests has the potential to identify patients who would benefit from this treatment. A vast number of commercially available immunoassays have been developed to detect SARS-CoV-2 IgG, IgM, IgA and total antibodies (5). Although the majority of antibody production is directed towards the more abundant N (nucleocapsid) protein, the S (spike) protein contains the receptor-binding domain responsible for host cell attachment and antibodies to the S protein are therefore predicted to be neutralising (6). In contrast to laboratory-based immunoassays, which require venous sampling and transport to centralised testing sites, lateral flow immunoassays (LFA) offer the potential to allow rapid, cheap, mass population antibody testing on capillary samples in the home environment. This relief on clinical pressure and its use at a population level relies on the LFA working on capillary whole blood samples and the tests offering sufficient ease of use, interpretation and acceptability to the wider population. If LFAs are to be used for population sero-surveillance they will have to be sensitive enough to detect the presence of antibodies in those who only suffered from mild disease or were asymptomatic. Of even greater importance is the test specificity when testing at a population level where the pre-test probability is low. Without adequate specificity the chance of a positive result being a false positive is considerable and this is of particular concern if these tests were to be used as evidence of immune status or ‘immunity passports’. It is currently unclear if the detection of antibodies to the COVID 19 virus by LFA confers immunity to the virus and some manufactures fail to disclose whether the assay is directed towards the N or S protein. Worldwide, over 200 LFAs have been produced to date (5), for the majority of which only manufacturer reported performance data is available. There is no requirement for external validation to obtain CE marking and the manufacturer’s validation process is often not publically available. The UK Medicines and Healthcare Products Regulatory Agency (MHRA) advise that any SARS-CoV-2 LFA should have sensitivity and specificity > 98% (95 % CI 96-100 %) in samples collected over 20 days post symptom onset irrespective of whether they are performed as a home self test kit or by a trained health care professional (7). Despite claims by many manufactures to have achieved this target (see S1), UK external validations of LFAs have so far failed to reproduce this (8–10). As part of a Scottish evaluation of SARS-CoV-2 diagnostic tests, we carried out an independent validation of 14 point of care SARS-CoV-2 antibody tests including 13 LFAs. We assessed their specificity and sensitivity against manufacturer’s claims and determined compatibility with MHRA criteria. For tests that performed well against the MHRA criteria, we studied changes in sensitivity with increasing time from infection. To investigate suitability for home use, we evaluated ease of use and interpretation of these tests when performed directly by patients. Neutralising antibody status was determined for samples to establish whether the detection of antibodies by the LFA correlated with immune status. ## Methods ### Point of care (POC) device selection and the evaluation process Companies approached National Services Scotland (NSS) with their POC device. The manufacturer claims and costing were reviewed and manufacturers who passed these initial checks were invited to send up to 100 POC test kits for initial evaluation. 14 point-of-care devices designed to detect antibodies to the SARS-CoV-2 virus were evaluated. The tests included in the evaluation were AbC-19, Alpha Pharma, Biomerica, Biozek, Fortress, Jiangsu, Lepu, Menarini Healgen, Mologic, Pharmact, Roche, Wuhan Easy Diagnosis, Wuhan Life Origin Biotech (Syzbio) and LumiraDX. 13 of the POC devices were LFAs, consisting of immunochromatography based cassettes, with the LumiraDX assay using a point of care microfluidic immunofluorescence assay. Each device measured IgG and IgM except Abc19 which only measured IgG and LumiraDX which gave an antibody result of unspecified subclass. Target antigens (either S or N) were not always disclosed by the manufacturers. The tests were performed as per the manufacturer’s instructions. Typically, 2.5µL to 10µL of serum or capillary sample was pipetted into the sample well followed by a pre-specified volume of buffer supplied by the manufacturer (see S1). Serum samples were run at room temperature in the laboratory at the Royal Infirmary of Edinburgh and the read time varied from 10 to 20 minutes depending on manufacturer’s instructions. Photographic records of serum results were taken. Capillary samples were run as part of a research clinic. Antibody results were visually read and recorded as positive, weak positive, or negative for IgG and IgM. Positive and weak positive results were deemed an overall positive result. In the first stage of the process 50 serum samples from RT-PCR confirmed SARS-CoV-2 positive participants and 50 negative controls were run. Initial sensitivity and specificity were calculated and if the test kit performed well (typically defined as an IgG specificity of >98% and sensitivity of >95% at >day 21 post infection) the company was taken forward to the second stage of evaluation and asked to provide further test kits. Further stages of the evaluation process which were carried out for a subset of kits included more stringent specificity testing, determination of batch to batch variation, sensitivity as time from infection increased, comparison of capillary to serum results and evaluation of ease of use by study participants in a research clinic. ### Overview of participants and samples Serum samples from patients with RT-PCR confirmed SARS-CoV-2 were collected between March and November 2020. Excess serum was stored at the point of discard from hospitalised COVID-19 patients using ethical permissions obtained through NRS BioResource. Out-patients with previous RT-PCR confirmed SARS-CoV-2 infection were also recruited to two COVID convalescent research studies where participants were invited to donate serial blood samples (SR1407 BioResource study, n=112 participants) and to provide capillary and serum blood samples (COVID-19 Antibody Test Evaluation (CATE) Study, n= 82 participants). For both studies the date of symptom onset and positive RT-PCR test result were recorded. Inclusion criteria were those aged >16 years old with previous RT-PCR confirmed diagnosis of COVID-19. Only 5 study participants were diagnosed with COVID-19 in hospital with the remainder being diagnosed in the community. Due to the limited access to COVID-19 testing at the start of the pandemic when most study recruitment occurred many study participants were health care workers. Exclusion criteria were any frail or shielding individuals. Each participant provided between 1 and 5 serum samples. For participants providing serial samples these were collected between 24 and 216 days post PCR confirmed infection. Capillary samples were collected as part of the CATE study research clinic between September and November 2020. Capillary samples were tested in the clinic setting with paired serum samples subsequently run on the same POC devices in the laboratory to assess concordance on capillary and serum samples. A subset of CATE study participants were asked to carry out one of the POC tests themselves including finger prick testing with the aid of the manufacturers printed instructions. Health care workers intervened if the participant needed support in completing the testing and any intervention was recorded. Afterwards participants were asked to complete a questionnaire on their experience of performing the test. For a subset of other test kits capillary results were interpreted by both the participant (with the aid of a labelled diagram) and the researcher. Negative controls consisted of venous samples collected prior to December 2019. For more stringent specificity testing samples positive for rheumatoid factor, seasonal coronaviruses, other respiratory pathogens, CMV (cytomegalovirus) or EBV (Epstein-Barr Virus) were used as negative controls along with samples from the national antenatal screening programme and BTS (The British transfusion service). Wherever possible the same panel of samples were run on each POC device but due to limitations in sample volume not all devices were assessed with the same panel of samples. However, a similar variety of samples types were used to test each kit. ### Neutralisation assay Neutralising antibody assays were performed on serum samples using a pseudotyped chimeric virus, expressing the SARS-CoV-2 spike protein, as described in (11). ### Statistical analysis The primary outcome was the sensitivity and specificity of each POC test. For sensitivity, tests were compared against PCR-confirmed SARS-CoV-2 infection. Specificity of each POC test was evaluated against pre-pandemic negative samples, with all positives counting as false positives. The analysis included all available data for the relevant outcome and are presented with the corresponding 95% confidence intervals (Clopper Pearson). For samples where neutralising antibody levels are available positive predictive value (PPV) and negative predictive value (NPV) are calculated at two NT50 thresholds; ≥ 50 and ≥ 160. For the comparison of the performance of each POC test between clinic capillary and serum samples we calculated the sensitivity and 95% confidence interval. The McNemar test was used to assess for significant difference between dependent groups. Agreement between serum and capillary samples was measured using the Cohen’s Kappa. Results of the Kappa were interpreted as previously described (<0, poor agreement; 0.00–0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and >0.8, almost perfect agreement) (12). All data were analysed using PRISM version 9 and SPSS, and a p value<0.05 was considered significant. ### Ethics Ethical approval for access to pre-pandemic stored samples and hospitalised COVID-19 patient samples was obtained through the NRS BioResource (Ref: SR1410). Outpatient COVID-19 convalescent patient samples were obtained through the NRS BioResource (Ref: SR1407) and the COVID-19 Antibody Test Evaluation (CATE) study, with ethical approval from London-Brent Research Ethics Committee (REC ref: 20/HRA/3764 IRAS: 286538 ## Results ### Comparison of POC device performance using serum samples 14 POC devices were evaluated for sensitivity and specificity including 13 LFAs and one microfluidic immunofluorescence assay (Lumira DX). In the first stage the sensitivity of the LFA was assessed using a panel of 50 serum samples from RT-PCR positive individuals. The specificity of the LFA was assessed using a panel of 50 serum samples collected prior to December 2019 for routine virological investigations. Adequately performing LFA then progressed to stage 2, where they were evaluated against an expanded panel of serum samples. In particular, a more stringent specificity panel was used, including serum samples taken from individuals with a positive PCR result for the endemic coronaviruses, other respiratory pathogens, and samples with positive rheumatoid factor, CMV or EBV serology. The results are summarised in **Table 1** and **Fig 1**. For each sample, and each device, the IgG and IgM result was scored as negative, weak positive or positive (the Abc19 LFA only had an IgG window, and the LumiraDX platform only returned an antibody result). For the purposes of the sensitivity and specificity calculations, weak positive and positive results were counted as positive. View this table: [Table 1:](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/T1) Table 1: Summary of POC test sensitivity and specificity For each manufacturer the n, overall sensitivity, sensitivity at ≥ 20 days post symptom onset, and specificity is shown, with 95% confidence intervals (Clopper-Pearson). The LFA are grouped according to which stage of the evaluation they reached. For kits that progressed further than stage 1, the additional analyses performed are indicated (Y, performed). For Abc19, only IgG data was available, and for LumiraDX, only the overall antibody result was available. The results from two separate batches of Menarini kits are also shown. ![Fig 1:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F1.medium.gif) [Fig 1:](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F1) Fig 1: Sensitvity vs specificity at >20 days Sensitivity against specificity for the kits tested (IgG only, and positive or negative result for the LumiraDx assay). For each kit the ≥ 20 days post symptom onset sensitivity is shown. The MHRA targets of 98% sensitivity and specificity are shown as dotted lines. Pharmact was not included on the graph due to low specificity. The LFA that did not progress beyond this stage are shown in red. Only the LumiraDX POCT assay reached the MHRA criteria for sensitivity and specificity (> 98% for both), although a number of the LFA (Biomerica, Biozek, Fortress, Menarini and Roche) surpassed the specificity criteria, even when tested against the high stringency specificity serum samples. One empirical finding was that there was extensive variation in the strength of the staining between the kits, which could affect ease of interpretation. Another more worrying observation was that batch-to-batch variation in performance was observed for the Menarini assay with batch 2005156 giving a sensitivity of 76.3% on samples ≥20 days post infection versus batch 2003288 giving a sensitivity of 94.9% on the same samples (n=99). The Abc-19, Biomerica, Biozek, Fortress, Menarini (batch 2003288) and Roche LFA and the LumiraDX assay underwent further analysis using an expanded panel of RT-PCR positive samples. Initially, sensitivity was examined at proximal time points to infection for the Biomerica, Biozek, Menarini, Roche and LumiraDX LFAs and the results are shown in **Table 2** and **Fig 2**. All the LFA showed an initial increase in sensitivity, peaking at ≥ 21 days post symptom onset, with sensitivities of ≥ 90% obtained for all LFA from this time point onwards. Between 8-20 days post symptom onset, LFA performance was more variable, with Roche showing the lowest sensitivity, and Menarini the highest. View this table: [Table 2](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/T2) Table 2 Sensitivity against time ![Fig 2](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F2.medium.gif) [Fig 2](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F2) Fig 2 Sensitivity against time Sensitivity against time post symptom onset for selected kits (IgG for all LFA, with the exception of LumiraDX, where overall antibody result was used). An equally important question was what the sensitivity of the POCT LFA was at later time points post symptom onset, since it has been shown that antibody levels wane with time following the primary antibody response to SARS-CoV-2 (13–15). Sensitivity was analysed across three time windows from 20 days post symptom onset for the Abc19, Biomerica, Roche and LumiraDX predominantly using samples from the convalescent cohort, and the results are shown in **Table 3** and **Fig 3**. View this table: [Table 3](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/T3) Table 3 Sensitivity against time post symptom onset ![Fig 3](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F3.medium.gif) [Fig 3](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F3) Fig 3 Sensitivity against time post symptom onset Sensitivity against time post symptom onset for selected kits (IgG for all LFA, with the exception of LumiraDX, where overall antibody result was used). The LumiraDX assay was the only assay evaluated that maintained its sensitivity at 98% across the time interval studied (up to 224 days post symptom onset). The three LFA showed variable drops in sensitivity with increasing time from symptom onset, with Biomerica dropping from 91% to 54% over the time points studied. For the convalescent cohort samples, neutralising titres were also available, and the POCT result (for Abc19, Biomerica, Roche and LumiraDX) could be compared with the half maximal neutralising titre (NT50). Initially, the NT50 result for each serum sample was plotted, dividing each LFA by positive and negative IgG (or for LumiraDX, antibody) result (**Fig 4**). ![Fig 4](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F4.medium.gif) [Fig 4](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F4) Fig 4 Neutralising antibody titres compared to POC test result NT50 against binary POC test result. The NT50 results for the positive and negative LFA result are plotted by kit (the mean and SEM are denoted by black lines). Dotted lines represent the NT50 ≥ 50 and ≥ 160 thresholds, and the p values from a Kruskal-Wallis test are shown. There was a statistically significant difference in NT50 level between the positive and negative LFA result for all assays. However, for the LFA positive result, a large range of NT50 values was observed. To examine this further, the sensitivity and specificity of the LFA was evaluated at two NT50 thresholds; ≥ 50 and ≥ 160 (**Table 4**). These two thresholds were selected based on a detectable NT50 (≥ 50) and a threshold which may confer protection from reinfection (≥ 160) (16). View this table: [Table 4.](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/T4) Table 4. Performance of POC devices compared to NT50 ≥ 50 and ≥ 160 The sensitivity of all the POCT for detecting an NT50 ≥ 50 (of the 280 serum samples, 222 had an NT50 ≥ 50 and 58 had an NT50 < 50) was at least 90%, and for Roche and LumiraDX it was > 98%. However, the NPV varied between 40 and 80%, as a result of the small number of samples that were negative by the POCT, relative to the number of positive samples (**Fig 4)**. Specificity performance was also poor with a large number of false positive results (POCT positive, but NT50 < 50). In particular, the LumiraDX assay had a specificity of < 5%, whereas the Abc19 assay had the highest specificity of 62%. These results impacted the PPV in this high prevalence setting, which ranges from 90% for the Abc19 assay and 84% for the LumiraDX assay. When the NT50 threshold was raised to ≥ 160 (where 142 of the serum samples had an NT50 ≥ 160 and 138 had an NT50 < 160), the sensitivity for all the POCT rose to at least 96%, with a consequential increase in NPV to at least 90%. However, the specificity of all of the POCT fell, with the Abc19 kit having the highest specificity at 42%. This affected the PPV, which also fell from the lower NT50 threshold, to between 55 and 64%. ### Comparison of POCT device performance in serum and capillary samples A major advantage of the POCT devices over immunoassays performed within accredited laboratories is the lack of requirement for phlebotomy. However, many evaluations of POCT performance have not compared serum and capillary samples. Furthermore, it has been suggested that LFA could be suitable for home use, outside of a healthcare setting, and this would require the participant to interpret their test result. In this work, paired serum and capillary samples were available for a number of participants in the convalescent cohort, and the performance of seven POCT assays (Abc19, Biomerica, Biozek, Fortress, Menarini, Roche and LumiraDX) was compared. The serum sample was processed in the laboratory, and read by a health care worker, and the capillary sample was read by the study participant (with the aid of a labelled diagram or instruction leaflet) and a second health care worker in the research clinic. Sensitivity findings for the 3 matched interpretations are shown in **Table 5 and Fig 5**. View this table: [Table 5.](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/T5) Table 5. Comparison of paired serum and capillary results on selected POC devices ![Fig 5:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F5.medium.gif) [Fig 5:](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F5) Fig 5: Sensitivity comparison between paired serum and capillary samples Sensitivity for LFA devices comparing serum, and HCW and participant interpreted capillary data. Sensitivity and 95% CI (Clopper-Pearson) were plotted. A common feature of the LFA results in capillary samples was a reduction in sensitivity compared to the serum samples. This was seen in five out of the 7 LFA analysed, however, the magnitude of the effect varied. Both Biozek and Roche showed significant decreases between serum and capillary sensitivity when interpreted by either HCW or participant. This decrease in sensitivity ranged from 19.1 to 34.2%. ABC 19 showed a significant decrease in sensitivity between serum and participant read capillary results decreasing from 73.3% to 33.3%. The Fortress LFA showed a small increase in sensitivity in capillary samples compared to serum but this was not statistically significant. The LumiraDX assay showed 100% sensitivity for both samples (n=11 but a large number of test fails (n=18) occurred on capillary samples at the research clinic. The concordance between serum, HCW and participant read capillary results are displayed in table 5. Concordance between LFA varied. For example of the 45 Abc19 results, 24 were interpreted as positive by a HCW and 15 by the participant. In contrast, for the Fortress LFA, there was a single discrepant positive result between the HCW and participant interpretation, where the participant reported a positive, and the HCW a negative, result. Participants who performed self-testing using the AbC-19 LFA in the clinic also completed a questionnaire that addressed aspects of self testing using the kits (n = 35 participants). Their responses are summarised in **Fig 6**. Overall, 89% of participants reported it was “very easy” or “easy’ to understand the leaflet explaining the results, and 75% reported it was “very easy” or “easy” to understand the instructions. Any issues appeared to be predominantly associated with taking the capillary samples and using the test kit, with 20% of participants reporting it was “difficult” or very difficult’ to take the sample. ![Fig 6:](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2021/12/16/2021.12.16.21267703/F6.medium.gif) [Fig 6:](http://medrxiv.org/content/early/2021/12/16/2021.12.16.21267703/F6) Fig 6: Ease of use Summary of ease of use questionnaire. Participant ease of use responses to the indicated categories were expressed as a percentage of responses. ## Discussion The SARS-CoV-2 pandemic has generated unprecedented demand for testing both to confirm acute infection and past virus exposure. In particular, serological assays measure prior exposure to the virus and are being used in high volume settings to measure seroprevalence at a population level. The role for these tests at an individual level is still poorly understood, and there are huge risks of using poorly performing tests. For example, using a test with a poor specificity in a low prevalence population would mean a high ratio of people with a false positive result compared to a true positive result, leading people to believe they were antibody positive, when they are not. This study was designed as part of a national evaluation of SARS-CoV-2 antibody POCTs within NHS Scotland, and aimed to address several questions regarding the performance of the POCT. To be effective, POCT should have similar performance in samples from patients who experience relatively mild illness compared to patients requiring hospitalisation. This was addressed using a cohort of convalescent patients, the majority of whom did not require hospital treatment during their illness. This cohort was also used to assess the performance of the POCT with increasing time from infection, and to examine how well they were able to predict neutralising antibody titres. Another major requirement is for POCT to perform well in capillary samples compared to serum. This point is critical since a major strength of the POCT devices is that they could be used on capillary samples, thus reducing the requirement for phlebotomy. In this work, 14 POCTs were evaluated on serum samples, and a number of these tests underwent further evaluation on capillary samples. From the serum evaluation, it became clear that specificity performance of many of the POCT was good – with eleven out of fourteen reaching the MHRA specificity target of > 98% for IgG, or in the case of LumiraDX total Ab. However, only a single device (LumiraDX) reached the MHRA sensitivity criteria of >98% ≥ 20 days post symptom onset. The sensitivity panel that was used to assess these POCT in the first instance consisted predominantly of hospitalised patients, who tend to have higher antibody titres than patients with milder disease (17–22), making disease severity a less likely explanation for these observations. The reduced sensitivity of the LFA compared to manufacturer’s claims is not unique to the LFA; the sensitivity of a laboratory analyser immunoassay on this cohort was 93% at ≥ 20 days post symptom onset. Therefore at relatively early time points post symptom onset, a negative result cannot rule out SARS-CoV-2 exposure. Indeed, for the subgroup of LFA where sensitivity with time was examined it was clear that, at early time points, sensitivity increased with time reaching maximal sensitivity after 21 days post symptom onset. This was in keeping with other reports that have studied the time to reach maximal antibody titres (17,23–25). An equally important question regarding timing of testing was how long post infection the POCT could detect an antibody response. The issue of sero-reversion is important for seroprevalence studies, and the ideal test for this purpose would have a high sensitivity for a long period of time. Otherwise, sero-reversion may result in underestimation of seroprevalence, and models to account for this trend of waning antibodies themselves risk over-or under-correcting for this effect (26,27). To examine this, a cohort of SARS-CoV-2 convalescent individuals for whom longitudinal serum samples were available were used. Crucially, these patients were most representative of the majority of SARS-CoV-2 infections in so far as only 5 participants providing serial samples required hospitalisation, and none required treatment in intensive care. In this work, the LumiraDX assay was the only assay to maintain a high sensitivity at time points from 21 to 244 days post symptom onset; the other assays showed variable levels of declining sensitivity. This was most marked for the Biomerica LFA, which had a sensitivity of 91% at 21-79 days post symptom onset, dropping to 54% at ≥ 140 days post symptom onset. The relative importance of specificity compared to sensitivity at least in part depends on the intended use. For example, for serosurveillance, particularly in low prevalence populations, specificity should be maximised to reduce the false positive rate. However, in clinical settings, drops in specificity with increased sensitivity might be acceptable if there were adequate follow up tests to identify true positives. A major question remains over what a positive antibody test result means in terms of protection from reinfection, or protection from infection following vaccination. This is pertinent given the continued discussions regarding the issue of “immunity passports” and a possible role for antibody tests in assisting populations where vaccine supply is scarce through identifying antibody negative individuals who should be prioritise for vaccination and/or antibody positive individuals who may be suitable for reduced vaccine dosing regimens. In this work, we compared the LFA result with the NT50 level (**Fig 4** and **Table 4**). It was apparent that a positive LFA was associated with a wide range of neutralising antibody levels, and that a number of LFA positive samples had low NT50 levels (< 50). This adds to the concern that a positive LFA result may not inform about the likelihood of protection from infection. Alongside specificity concerns and the increased circulation of variants of concern that may be able to evade existing humoral responses (28,29), this highlights the risk of using these tests for the purpose of immunity passports. Whilst some of the better performing LFA may be suitable for sero-prevalence studies, they should not be used to draw conclusions on an individual’s protection from reinfection. A major advantage of POCT stems from their potential to relieve requirement for phlebotomy and to be performed outside of a healthcare setting. However, for this potential to be realised, additional factors including POCT performance in capillary samples and ease of use should be considered. In this work, a head to head comparison of POCT performance on serum and capillary samples was performed, and for the LFAs HCW and participant interpretation of the test result was also compared. There were differences in performance on serum and capillary samples, with the majority of LFA showing reduced sensitivity on capillary samples compared to serum. Concordance between serum and capillary results was sometimes poor indicating that that the performance of a test on serum under laboratory conditions cannot be assumed to equate to performance on a capillary sample. Only the LumiraDX device was equally sensitive on both capillary and serum samples. There was good agreement between capillary results interpreted by HCW and participants for the Roche, Menarini, Biozek and Fortress kits studied indicating that these LFA tests may be best suited for use as home test kits as interpretation at home is likely to match interpretation in a hospital setting. For other LFAs, such as AbC-19, many participants had difficulty noticing the faint lines produced by the test kit and interpreted the result as being negative when it was interpreted as positive by a HCW. A disadvantage of the POCT devices compared to laboratory immunoassays is that they are strictly qualitative (with the exception of LumiraDX where a numeric value is converted to a qualitative result), meaning the possibility of introducing equivocal zones is not available. Further issues that came to light during the course of this evaluation work included the possibility of batch-to-batch variation in kits from the same manufacturer. For example two different batches of Menarini kits showed marked differences in sensitivity (76.7% on one batch verus 94.9% on another batch). Whilst we did not observe any failed tests during our evaluation of the LFA using serum or capillary samples, a high failure rate was seen using the LumiraDX device with capillary samples with 18 out of 29 participants having a failed test result with lumiraDX. Unpublished data available from another POCT evaluation site in Scotland (NHS Tayside) where a newer version of the LumiraDX device was being trialled indicated that there were no issues with failed SARS-CoV-2 antibody capillary tests with the newer version of the LumiraDX device suggesting that this issue has now been resolved by the company. However, strategies to identify such issues should continue to be employed by the manufacturer and/or the end user to ensure accurate performance on capillary samples. The strength of this work lies in its contribution to our understanding of POCT performance – in particular performance at different times post infection, confirmed evidence of batch to batch variation with some POCT kits and through demonstrating the general lack of correlation between a positive POCT result and the presence of neutralising antibodies. Our data also demonstrates that the results obtained in the lab using serum cannot automatically be assumed to be representative of finger-prick capillary test results. Furthermore, a good proportion of this work used a convalescent cohort of patients with relatively mild disease, making the findings applicable to community based studies. However, the study does have limitations. Due to sample availability constraints and the large number of POCTs evaluated not all could be evaluated on the same panel of positive and negative samples. In particular, the numbers of samples where paired serum and capillary sample results were available to assess concordance was limited for the majority of the POCT examined. For one of the test kits ease of use was assessed but as many of the study participants providing capillary samples in this study were healthcare workers the data obtained through this may not be representative of the general population. In summary, our results highlight a wide variation in performance of SARS-CoV-2 antibody test kits and illustrates the importance of evaluating multiple different aspects of test performance including checking for batch to batch variation, changes in sensitivity as time from infection increases, correlation with neutralising antibodies and performance on capillary samples. Thorough evaluation of all these aspects is essential prior to considering the utilisation of these tests for antibody or ‘immunity’ passports and identification of hospitalised patients who would benefit from monoclonal antibody treatment. ## Supporting information Supporting table 1 [[supplements/267703_file02.pdf]](pending:yes) ## Data Availability All data produced in the present study are available upon reasonable request to the authors ## Supplementary information S1 Table. **Manufacturers published product specifications and performance data** ## Acknowledgements Menarini test kits (batch batch 2003288) were supplied by Annette Thomas from Cardiff & Vale University Health board. Funding and support for this study was provided by Lothian NRS BioResource. We would also like to thank members of the National Service Scotland team for their support with co-ordinating this work. ## Abbreviations LFA : (lateral flow assay) POC : (point of care) PPV : (positive predictive value) NPV : (negative predictive value) RT-PCR : (real time polymerase chain reaction) NT50 : (half maximal neutralising titre) hCoV : (Human Coronavirus) SARS-CoV-2 : (severe acute respiratory syndrome coronavirus 2) COVID-19 : (coronavirus disease 2019). * Received December 16, 2021. * Revision received December 16, 2021. * Accepted December 16, 2021. * © 2021, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/) ## References 1. 1.Guidance and Standard Operating Procedure. COVID-19 Virus Testing in NHS Laboratories [Internet]. NHS England and NHS Improvement; Available from: [https://www.rcpath.org/uploads/assets/90111431-8aca-4614-b06633d07e2a3dd9/Guidance-and-SOP-COVID-19-Testing-NHS-Laboratories.pdf](https://www.rcpath.org/uploads/assets/90111431-8aca-4614-b06633d07e2a3dd9/Guidance-and-SOP-COVID-19-Testing-NHS-Laboratories.pdf) 2. 2.Interim Guidelines for Collecting, Handling, and Testing Clinical Specimens for COVID-19 [Internet]. Centers for Disease Control and Prevention; Available from: [https://www.cdc.gov/coronavirus/2019-nCoV/lab/guidelines-clinicalspecimens.html](https://www.cdc.gov/coronavirus/2019-nCoV/lab/guidelines-clinicalspecimens.html) 3. 3.Guo L, Ren L, Yang S, Xiao M, Chang D, Yang F, et al. Profiling Early Humoral Response to Diagnose Novel Coronavirus Disease (COVID-19). Clin Infect Dis. 2020 Mar 21;ciaa310. 4. 4.RECOVERY Collaborative Group, Horby PW, Mafham M, Peto L, Campbell M, Pessoa-Amorim G, et al. Casirivimab and imdevimab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial [Internet]. Infectious Diseases (except HIV/AIDS); 2021 Jun [cited 2021 Oct 20]. Available from: [http://medrxiv.org/lookup/doi/10.1101/2021.06.15.21258542](http://medrxiv.org/lookup/doi/10.1101/2021.06.15.21258542) 5. 5.SARS-CoV-2 diagnostic pipeline [Internet]. Available from: [https://www.finddx.org/covid-19/pipeline/](https://www.finddx.org/covid-19/pipeline/) 6. 6.Sethuraman N, Jeremiah SS, Ryo A. Interpreting Diagnostic Tests for SARS-CoV-2. JAMA. 2020 Jun 9;323(22):2249. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1001/jama.2020.8259&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F12%2F16%2F2021.12.16.21267703.atom) 7. 7.Target product profile antibody tests to help determine if people have immunity to SARS-CoV-2, 2020. [Internet]. MHRA; Available from: [https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/883897/Target\_Product\_Profile\_antibody\_tests\_to\_help\_determine\_if\_people\_have\_immunity\_to\_SARS-CoV-2\_Version\_2.pdf](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment\_data/file/883897/Target\_Product\_Profile\_antibody\_tests\_to\_help\_determine\_if\_people\_have\_immunity\_to_SARS-CoV-2_Version_2.pdf) 8. 8.Flower B, Brown JC, Simmons B, Moshe M, Frise R, Penn R, et al. Clinical and laboratory evaluation of SARS-CoV-2 lateral flow assays for use in a national COVID-19 seroprevalence survey. Thorax. 2020 Aug 12;thoraxjnl-2020-215732. 9. 9.Adams ER, Anand R, Andersson MI, Auckland K, Baillie JK, Barnes E, et al. Evaluation of antibody testing for SARS-Cov-2 using ELISA and lateral flow immunoassays [Internet]. Infectious Diseases (except HIV/AIDS); 2020 Apr [cited 2020 Apr 20]. Available from: [http://medrxiv.org/lookup/doi/10.1101/2020.04.15.20066407](http://medrxiv.org/lookup/doi/10.1101/2020.04.15.20066407) 10. 10.1. Fouchier RAM Pickering S, Betancor G, Galão RP, Merrick B, Signell AW, Wilson HD, et al. Comparative assessment of multiple COVID-19 serological technologies supports continued evaluation of point-of-care lateral flow assays in hospital and community healthcare settings. Fouchier RAM, editor. PLOS Pathog. 2020 Sep 24;16(9):e1008817. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1371/journal.ppat.1008817&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32970782&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F12%2F16%2F2021.12.16.21267703.atom) 11. 11.Muecksch F, Wise H, Batchelor B, Squires M, Semple E, Richardson C, et al. Longitudinal analysis of serology and neutralizing antibody levels in COVID19 convalescents. J Infect Dis. 2020 Nov 3;jiaa659. 12. 12.Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159–74. [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.2307/2529310&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=843571&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F12%2F16%2F2021.12.16.21267703.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1977CY39700012&link_type=ISI) 13. 13. Antonin Bal, Trabaud M-A, Fassier J-B, Rabilloud M, Saker K, Langlois-Jacques C, et al. Six-month antibody response to SARS-CoV-2 in healthcare workers assessed by virus neutralisation and commercial assays. Clin Microbiol Infect. 2021 Jan;S1198743X21000057. 14. 14.L’Huillier AG, Meyer B, Andrey DO, Arm-Vernez I, Baggio S, Didierlaurent A, et al. Antibody persistence in the first six months following SARS-CoV-2 infection among hospital workers: a prospective longitudinal study. Clin Microbiol Infect. 2021 Jan;S1198743X21000318. 15. 15.Dan JM, Mateus J, Kato Y, Hastie KM, Yu ED, Faliti CE, et al. Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science. 2021 Feb 5;371(6529):eabf4063. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6Mzoic2NpIjtzOjU6InJlc2lkIjtzOjE3OiIzNzEvNjUyOS9lYWJmNDA2MyI7czo0OiJhdG9tIjtzOjUwOiIvbWVkcnhpdi9lYXJseS8yMDIxLzEyLzE2LzIwMjEuMTIuMTYuMjEyNjc3MDMuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 16. 16.1. McAdam AJ, editor Addetia A, Crawford KHD, Dingens A, Zhu H, Roychoudhury P, Huang M-L, et al. Neutralizing Antibodies Correlate with Protection from SARS-CoV-2 in Humans during a Fishery Vessel Outbreak with a High Attack Rate. McAdam AJ, editor. J Clin Microbiol. 2020 Aug 21;58(11):e02107-20, /jcm/58/11/JCM.02107-20.atom. 17. 17.Long Q-X, Liu B-Z, Deng H-J, Wu G-C, Deng K, Chen Y-K, et al. Antibody responses to SARS-CoV-2 in patients with COVID-19. Nat Med [Internet]. 2020 Apr 29 [cited 2020 Apr 29]; Available from: [http://www.nature.com/articles/s41591-020-0897-1](http://www.nature.com/articles/s41591-020-0897-1) 18. 18.The UCSF COMET Consortium, Combes AJ, Courau T, Kuhn NF, Hu KH, Ray A, et al. Global absence and targeting of protective immune states in severe COVID-19. Nature. 2021 Mar 4;591(7848):124–30. 19. 19.Tan AT, Linster M, Tan CW, Le Bert N, Chia WN, Kunasegaran K, et al. Early induction of functional SARS-CoV-2 specific T cells associates with rapid viral clearance and mild disease in COVID-19 patients. Cell Rep. 2021 Jan;108728. 20. 20.1. Schultz-Cherry S Guthmiller JJ, Stovicek O, Wang J, Changrob S, Li L, Halfmann P, et al. SARS-CoV-2 Infection Severity Is Linked to Superior Humoral Immunity against the Spike. Schultz-Cherry S, editor. mBio. 2021 Jan 19;12(1):e02940-20, /mbio/12/1/mBio.02940-20.atom. 21. 21.Wang P, Liu L, Nair MS, Yin MT, Luo Y, Wang Q, et al. SARS-CoV-2 neutralizing antibody responses are more robust in patients with severe disease. Emerg Microbes Infect. 2020 Jan 1;9(1):2091–3. 22. 22.Wajnberg A, Amanat F, Firpo A, Altman DR, Bailey MJ, Mansour M, et al. Robust neutralizing antibodies to SARS-CoV-2 infection persist for months. Science. 2020 Oct 28;eabd7728. 23. 23.Borremans B, Gamble A, Prager K, Helman SK, McClain AM, Cox C, et al. Quantifying antibody kinetics and RNA shedding during early-phase SARS-CoV-2 infection [Internet]. Open Science Framework; 2020 May [cited 2020 May 19]. Available from: [https://osf.io/evy4q](https://osf.io/evy4q) 24. 24.Lou B, Li T-D, Zheng S-F, Su Y-Y, Li Z-Y, Liu W, et al. Serology characteristics of SARS-CoV-2 infection after exposure and post-symptom onset. Eur Respir J. 2020 Aug;56(2):2000763. [Abstract/FREE Full Text](http://medrxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiZXJqIjtzOjU6InJlc2lkIjtzOjEyOiI1Ni8yLzIwMDA3NjMiO3M6NDoiYXRvbSI7czo1MDoiL21lZHJ4aXYvZWFybHkvMjAyMS8xMi8xNi8yMDIxLjEyLjE2LjIxMjY3NzAzLmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 25. 25.Zhao J, Yuan Q, Wang H, Liu W, Liao X, Su Y, et al. Antibody Responses to SARS-CoV-2 in Patients With Novel Coronavirus Disease 2019. Clin Infect Dis. 2020 Nov 19;71(16):2027–34. [PubMed](http://medrxiv.org/lookup/external-ref?access_num=http://www.n&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2021%2F12%2F16%2F2021.12.16.21267703.atom) 26. 26.Sabino EC, Buss LF, Carvalho MPS, Prete CA, Crispim MAE, Fraiji NA, et al. Resurgence of COVID-19 in Manaus, Brazil, despite high seroprevalence. The Lancet. 2021 Jan;S0140673621001835. 27. 27.Kadelka S, Bouman JA, Ashcroft P, Regoes RR. Comment on Buss et al., Science 2021: An alternative, empirically-supported adjustment for sero-reversion yields a 10 percentage point lower estimate of the cumulative incidence of SARS-CoV-2 in Manaus by October 2020. ArXiv210313951 Q-Bio [Internet]. 2021 Mar 25 [cited 2021 May 25]; Available from: [http://arxiv.org/abs/2103.13951](http://arxiv.org/abs/2103.13951) 28. 28.Hoffmann M, Arora P, Groß R, Seidel A, Hörnich BF, Hahn AS, et al. SARS-CoV-2 variants B.1.351 and P.1 escape from neutralizing antibodies. Cell. 2021 Mar;S0092867421003676. 29. 29.Vidal SJ, Collier AY, Yu J, McMahan K, Tostanoski LH, Ventura JD, et al. Correlates of Neutralization Against SARS-CoV-2 Variants of Concern by Early Pandemic Sera. J Virol. 2021 Apr 23;JVI.00404-21, jvi;JVI.00404-21v1.