Author Response:
Reviewer 1:
The deficiencies of this study are:
- This is a very specific cohort, largely urban, with - presumably - relatively higher levels of education. It is hard to see how this might translate into a general statement about the population
We agree with the reviewer that this is a very specific cohort, largely urban, and with higher levels of education than average. We further agree that the utility of this cohort is not in making general statements about the population, but rather in deriving specific insights for which the cohort is best suited. We enumerate some of them that are present in this manuscript.
a) It is equally important to understand the relative degree of spread between Indian cities, where a combination of denser population and indoor lives has led to the greatest spread of disease. Since pandemics are typically self-limiting, regions with greater spread are further along the course and can expect declines sooner. This provides useful insight for public health strategy. While our cohort does not necessarily represent the average population, it is similar between cities, something that is not true for any other survey. The ICMR national serosurvey is a random selection of districts and is heavily biased towards rural areas.1 While that is important, fast-growing outbreaks are less likely there, given a largely outdoor life and lower population density. Other city-wise serosurveys vary in target population as well as methodology and cannot be easily compared.2-5 Thus our data are the first to permit comparison between many important urban regions of India, showing which regions were more advanced along the course and where future outbreaks were still likely. We note here that some of the regions identified by this survey as high risk, such as Kerala and interior Maharashtra, amongst others, are where the outbreaks continued until much later.
b) The CSIR cohort has the added advantage of greater baseline data and repeated access, so we are able to determine antibody stability, as shown, and possible correlates.
c) The cohort is well suited to understanding clinical associations of SARS-CoV-2 infections, such as symptom rate and severity amongst its participants, as well as associations of infection risks (using seropositivity as an imperfect surrogate).
- The presentation of Figure 1 was quite confusing, especially the colour coding
Figure 1 shows the cities with CSIR labs where the serosurvey was carried out, colour-coded to give a quick view of prevalence. Cities with seroprevalence greater than 10 percent are coded green, and cities with seroprevalence between 5 and 10 percent are coded yellow. Cities with less than 5 percent seropositivity are shown in red because, when seropositivity is used as an indirect surrogate of infection, these cities may later turn into hotspots or see a larger rise in cases. Although these cities may not truly represent the state population, each state is shaded in a blue gradient according to its city seropositivity, with a darker shade reflecting higher seropositivity.
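The colour rule described above can be written as a small function. This is only a sketch of the thresholds as stated in this response, not the code used to draw Figure 1; the function name is hypothetical, and placing values of exactly 5 or 10 percent in the yellow band is an assumption, since the text does not say where the boundaries fall.

```python
def colour_for_city(seroprevalence_percent: float) -> str:
    """Map a city's seroprevalence (in percent) to its Figure 1 colour.

    Assumption: values of exactly 5 or 10 percent are treated as yellow.
    """
    if seroprevalence_percent > 10:
        return "green"   # more than 10 percent seropositive
    if seroprevalence_percent >= 5:
        return "yellow"  # between 5 and 10 percent
    return "red"         # below 5 percent: possible future hotspot


# Illustrative values only, not survey results.
for value in (3.0, 7.5, 12.0):
    print(value, colour_for_city(value))
```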
- It is surprising that the state of Maharashtra shows only intermediate to low levels of seropositivity, given that the impact of the pandemic was largest there and especially in the city of Pune. There have been alternative serosurveys for Pune which found much higher levels of seropositivity from about the same period.
The Pune city sero-surveillance pointed out by the reviewer was a survey of Pune’s five most affected sub-wards and not of the Pune population in general.6 Despite all the limitations, which we accept in our response to the prior comment, our overall crude positivity rate of 10% is very similar to that of the ICMR national serosurvey, and in general the patterns we see are in line with what is known about the severity of outbreaks. Thus, there is no real evidence to the contrary that would establish inaccuracy of the trends we see, and we respectfully note that surprising findings may be the most valuable ones. In fact, given the current trend of rising cases in Maharashtra, including in Pune, relative to other cities, our survey values may have been the more accurate.
- The statement "Seropositivity of 10% or more was associated with reductions in TPR which may mean declining transmission": For a disease with R of about 2, this would actually be somewhat early in the epidemic, so you wouldn't expect to see this in an indicator such as TPR. TPR is also strongly correlated with amounts of testing which isn't accounted for.
We agree that for R of about 2, one would not expect a decline at seropositivity of about 10%. However, it is worth noting that general seropositivity during the declining phase of the outbreak has been in this range not just in India, but also in major western European cities and New York, amongst others.7 There are three possible explanations. First, the highly exposed community containing the high-contact spreaders gets infected first, with higher seropositivity, thus effectively shortening or blocking transmission chains. We too note a much higher seropositivity amongst public transport users, who may better represent this sub-population. Second, R0 of 2-3 is the potential of this virus. R-effective after measures are put in place may be much lower.8 9 Better compliance with masking in India may have been important. Last, the fraction of the population immune at baseline is unknown but has been variably estimated at 20-30% from T cell reactivity studies as well as closed-area outbreaks such as those on ships. This is a speculative area but may help in understanding the results.
We agree that we do not directly account for testing rate, which is difficult to adjust for and can affect TPR despite the fact that TPR is already one way of adjusting for different levels of testing. Since our data show a trend across different geographies, but for the 30 days bracketing the sample collection, different testing rates would not in themselves explain the very strong inverse association of seropositivity with TPR. Given that high-seropositivity areas are likely more advanced in the course of the pandemic, we favour that as the explanation. This is after noting the issues with overall seropositivity as a surrogate of population immunity, as above.
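The dependence of the expected turning point on R can be illustrated with the textbook threshold 1 - 1/R, which holds under a simple homogeneous-mixing assumption. The sketch below is not a model fitted to our data; the function name is hypothetical, and the values of R other than 2 are illustrative assumptions, not estimates from the manuscript.

```python
def herd_immunity_threshold(r: float) -> float:
    """Immune fraction above which each case infects fewer than one other.

    Uses the classical 1 - 1/R relation for a homogeneously mixing population.
    """
    if r <= 1:
        return 0.0  # transmission already declines without further immunity
    return 1.0 - 1.0 / r


# R = 2 is the value raised by the reviewer; the lower values are
# illustrative stand-ins for an effective R reduced by control measures.
for r in (2.0, 1.5, 1.25, 1.1):
    print(f"R = {r}: threshold = {herd_immunity_threshold(r):.0%}")
```

Under this relation, R of 2 requires 50 percent immunity, whereas an effective R of about 1.1 requires roughly 9 percent, which is why a lower effective R is one candidate explanation for declines at seropositivity near 10 percent.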
- The correlation with vegetarianism is unusual - you might have argued that this could potentially protect against disease but that it might protect against infection is hard to credit. Much of South Asia is not particularly vegetarian but has seen significantly less impact
We agree with the statement that much of South Asia is not particularly vegetarian. When we started analyzing our data, we observed that our cohort had a 70:30 ratio of non-vegetarians to vegetarians, which is in agreement with what nationwide surveys have concluded in the past; hence our cohort was not biased in terms of sampling for this variable.10 In this work we have used seropositivity as an indirect surrogate of infection. The data were not analyzed by zonal distribution but for the entire cohort, and that is where we obtained the said observation. At this stage, we cannot speculate on the role a vegetarian diet may play in the lower seropositivity amongst vegetarian individuals, but it could possibly relate to anti-inflammatory effects and the effect of a high-fibre diet in protecting the gut mucosa against viral invasion. Existing studies have only speculated on the role diet could play, and there are no affirmative or largely biochemical studies to provide further evidence on this cause-effect relationship.11 12 We also did a multi-collinearity analysis to study whether diet was related to any other variable being studied, but we didn’t find any such association.
- On the same point above, it is possible that social stratification associated with diet - direct employees being more likely to be vegetarian than contract workers - might be a confounder here, since outsourced staff seem to be at higher risk.
When we analyzed the data, we also considered the bias stated above: a person’s occupation, which indirectly reflects socio-economic status, can influence diet preferences. In our cohort too, outsourced staff were more often non-vegetarian than staff. Against a 70:30 ratio of non-vegetarians to vegetarians for the entire cohort, 83 percent of outsourced staff and 66 percent of staff were non-vegetarian. However, seropositivity was higher amongst non-vegetarians in both groups, at 17.25 and 8.77 percent respectively, against 11.89 and 6.05 percent amongst vegetarians. We also did a logistic regression and a collinearity assessment through VIF scores but did not observe any such association; hence occupation was not acting as a confounder. For females, we did not find this association with diet and found only transport and occupation to be significant; hence, to a certain extent, it is the crowded environment and occupational exposure that stand as the major exposure variables when both genders are taken into consideration.
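As a simple arithmetic check on the percentages quoted above, the ratio of seropositivity in non-vegetarians to that in vegetarians can be computed within each employment group. This sketch uses only the four reported percentages; it is not the logistic regression or VIF analysis itself, and the variable and function names are hypothetical.

```python
# Seropositivity (percent) by employment group and diet, as quoted above.
seropositivity = {
    "outsourced staff": {"non_vegetarian": 17.25, "vegetarian": 11.89},
    "staff": {"non_vegetarian": 8.77, "vegetarian": 6.05},
}


def prevalence_ratio(group: dict) -> float:
    """Ratio of non-vegetarian to vegetarian seropositivity in one group."""
    return group["non_vegetarian"] / group["vegetarian"]


ratios = {name: prevalence_ratio(g) for name, g in seropositivity.items()}
for name, ratio in ratios.items():
    print(f"{name}: non-vegetarian / vegetarian = {ratio:.2f}")
```

The ratio is about 1.45 in both groups, so the difference by diet is of similar relative size within each employment category rather than arising only from the mix of staff and outsourced staff.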
- There may be correlations to places of residence that again act as confounders. If direct employees are provided official accommodation, they may simply have had less exposure, being more protected.
That was a standing hypothesis for this work, as CSIR provides on-campus accommodation at most of its labs. We could not study it, because no variable was available recording whether a person resides in office-provided accommodation or elsewhere in the city. As we did not study this explicitly, it remains speculative rather than affirmative, but it is consistent with the hypothesis that outsourced personnel and staff who have to travel, specifically by public transport, are exposed to a higher risk.
- The correlations with blood group don't seem to match what is known from elsewhere.
Data on blood group type and serological status were available for 7496 individuals. The blood group (BG) distribution amongst the total samples collected was similar to the national reference based on a recent systematic review.13 Hence the sample characteristics of our cohort were similar to the national population reference. The available literature indicates that BG type O carries a lower risk of infection, which was also observed in our study.14-19 In our study, BG type O was associated with a lower seropositivity rate, with an OR of 0.76 (95% CI 0.64-0.91, p=0.018) vs non-O blood group types, and an overall seropositivity of 7.09 percent, which was less than the cohort-wide seropositivity. The available literature also reports that BG types AB and B have higher chances of testing seropositive, which was corroborated by our findings.17 According to the available literature, BG type A carries a higher risk of infection; this was contrary to our finding, where we obtained an OR in favour of BG type A, albeit not significant on statistical testing.14-16 18
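For readers unfamiliar with how an odds ratio and its 95% confidence interval arise from a 2x2 table, a minimal sketch follows. It uses the standard Wald interval on the log odds ratio; whether the manuscript's estimate was obtained this way or from a regression model is not stated here, so the method is an assumption, and the counts are hypothetical and chosen only for illustration, not taken from the cohort.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed and seropositive      b: exposed and seronegative
    c: unexposed and seropositive    d: unexposed and seronegative
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not the study data).
example = odds_ratio_wald_ci(10, 90, 20, 80)
print("OR = %.2f, 95%% CI %.2f-%.2f" % example)
```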
- The statement that "declining cases may reflect persisting humeral immunity among sub-communities with higher exposure" is unsupported. What sub-communities?
The wording has been corrected; it simply refers to sub-groups of the population with high levels of exposure.
Reviewer 2:
Weaknesses:
- The extrapolation of the study results to the country may not be completely acceptable with the basic difference from the country's urban rural divide and a largely agricultural economy. The female gender is underrepresented in the study cohort, and no children have been included.
We agree with the reviewer that this is a specific cohort. We agree that the female gender is underrepresented in the cohort, and hence all variable-based associations were assessed separately for males and females. Because of the low number of female samples in the cohort, associations with smoking etc. could not be assessed, while the diet variable was not significant on model testing. As the ethical approval did not permit us to collect data on children, we could not provide the same, but this is complemented by the ICMR survey, which has provided data for younger individuals. We further agree that the utility of this cohort is not in making general statements about the population, but rather in deriving specific insights for which the cohort is best suited. We enumerate some of them that are present in this manuscript.
a) It is equally important to understand the relative degree of spread between Indian cities, where a combination of denser population and indoor lives has led to the greatest spread of disease. Since pandemics are typically self-limiting, regions with greater spread are further along the course and can expect declines sooner. This provides useful insight for public health strategy. While our cohort does not necessarily represent the average population, it is similar between cities, something that is not true for any other survey. The ICMR national serosurvey is a random selection of districts and is heavily biased towards rural areas.1 While that is important, fast-growing outbreaks are less likely there, given a largely outdoor life and lower population density. Other city-wise serosurveys vary in target population as well as methodology and cannot be easily compared.2-5 Thus our data are the first to permit comparison between many important urban regions of India, showing which regions were more advanced along the course and where future outbreaks were still likely. We note here that some of the regions identified by this survey as high risk, such as Kerala and interior Maharashtra, amongst others, are where the outbreaks continued until much later.
b) The CSIR cohort has the added advantage of greater baseline data and repeated access, so we are able to determine antibody stability, as shown, and possible correlates.
c) The cohort is well suited to understanding clinical associations of SARS-CoV-2 infections, such as symptom rate and severity amongst its participants, as well as associations of infection risks (using seropositivity as an imperfect surrogate).
- The observations regarding corelates of sero-positivity such as diet smoking etc would need specifically designed adequately powered studies to confirm the same. The sample size for the three and six months follow up to conclude stability of the humoral immunity, is small and requires further follow-up of the cohort. The role of migration of labour helping the spread of the pandemic simultaneously to all parts of the country though attractive may not explain lower rates in states like UP and Bihar where maximum migrants moved to.
We agree that the observations in regard to diet and smoking are only hypothesis generating and need specifically designed studies to confirm the findings. We have also mentioned in the manuscript that associations found between seropositivity and some of the parameters should be confirmed with studies specifically designed for this purpose. We are following up more individuals at three and six months to ascertain the stability of the antibodies. Most migrants in the early phase moved to UP and Bihar, and it would indeed be expected that seeding would be higher there. While known cases were low for these states, the seropositivity data support that seeding did occur but may have gone undetected. The ICMR Aug-Sept serosurvey data, for example, show seropositivity in districts of these states to be higher than in those of Gujarat or Rajasthan.
- A large chunk of seropositive data set has been removed representing the big cities of Delhi and Bengaluru while correlating Test Positivity Rate citing duration as the reason. However, these cities also had different testing strategies and health infrastructure and hence are important.
We agree that data for these cities were removed because sample collection was extended in these labs; for Delhi, however, only IGIB was removed, and the data from all other Delhi labs remain in the analysis. Although the data were removed for the reason mentioned above, the graph directionality and trends remain the same when the analysis includes the excluded data. On keeping the Bengaluru data, R square does not change to the second decimal place, while on adding the data from CSIR-IGIB, R square is 0.32, maintaining the directionality and trend. We would like to state that for IGIB, the collection spanned a considerable duration, from the time when seropositivity was low to the time when it had reached the mentioned levels in Delhi.
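The kind of sensitivity check described above, recomputing R square with and without the excluded sites, can be sketched as follows. The function name and all data points are hypothetical placeholders for illustration only; they are not the site-level values from the survey, and the sketch assumes a simple linear regression of TPR change on seropositivity.

```python
def r_squared(x, y):
    """R square of a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical (seropositivity %, change in TPR) pairs, illustration only.
included_sites = [(4.0, 2.0), (8.0, 0.5), (12.0, -1.0), (16.0, -2.5)]
excluded_sites = [(10.0, 1.0)]

for label, sites in (
    ("without excluded sites", included_sites),
    ("with excluded sites", included_sites + excluded_sites),
):
    xs = [s[0] for s in sites]
    ys = [s[1] for s in sites]
    print(label, round(r_squared(xs, ys), 2))
```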
- Test positivity rate depends on testing strategy and type of test used; whether RTPCR or the Rapid Antigen Test and the ratio of the two tests was different in different parts of the country.
This is a very well taken point. The data were taken as a surrogate from a third-party website for calculation purposes only, and the results obtained were as logically expected. Ideally the analysis should be done with one type of test only, and the mix of tests could influence the outcome and interpretation. However, when we compared the data to the observed real trends, the graph directionality was in agreement; hence these data could work well as a surrogate in such scenarios, specifically in the context of a large, heterogeneous population.
Reviewer 3:
Weaknesses: While it is a pan-India survey, the population is not quite representative of general population of the country. CSIR labs are mostly in cities, and most of the employees use private transport. So the results cannot be generalized to the country as a whole. Restricting to people using public transport would be a better representation, although it still would not be fully representative.
We agree with the reviewer that this is a specific cohort, largely urban. We also agree that a cohort of people using public transport would be more representative, and we are following these individuals, as the cohort enables us to do so and gain further insights. We further agree that the utility of this cohort is not in making general statements about the population, but rather in deriving specific insights for which the cohort is best suited. We enumerate some of them that are present in this manuscript.
a) It is equally important to understand the relative degree of spread between Indian cities, where a combination of denser population and indoor lives has led to the greatest spread of disease. Since pandemics are typically self-limiting, regions with greater spread are further along the course and can expect declines sooner. This provides useful insight for public health strategy. While our cohort does not necessarily represent the average population, it is similar between cities, something that is not true for any other survey. The ICMR national serosurvey is a random selection of districts and is heavily biased towards rural areas.1 While that is important, fast-growing outbreaks are less likely there, given a largely outdoor life and lower population density. Other city-wise serosurveys vary in target population as well as methodology and cannot be easily compared.2-5 Thus our data are the first to permit comparison between many important urban regions of India, showing which regions were more advanced along the course and where future outbreaks were still likely. We note here that some of the regions identified by this survey as high risk, such as Kerala and interior Maharashtra, amongst others, are where the outbreaks continued until much later.
b) The CSIR cohort has the added advantage of greater baseline data and repeated access, so we are able to determine antibody stability, as shown, and possible correlates.
c) The cohort is well suited to understanding clinical associations of SARS-CoV-2 infections, such as symptom rate and severity amongst its participants, as well as associations of infection risks (using seropositivity as an imperfect surrogate).