A topic landscape study for adolescent depression ================================================= * Tianran Li * Liang Guo * Xiaoqiang Wang * Stijn Decoster ## Background Literature on adolescent depression is not only rich in content and diverse in form, but also scattered and complex in structure. However, there is no dynamic development analysis and evolution analysis related to the topic. Therefore, this study drew a topic landscape map and predicted the prevalence of topics in the future. **Methods** A systematic review was made to collect research publications on adolescent depression and describe the development of this field in the last five decades. We adopted visualization strategy and Herfindahl-Hirschman Index to uncover the latent structure of different topics from literature. The ARIMA model was utilized to predict the prevalence of each topic in the time dimension. **Results** By using content analysis technique, 21 topics are extracted from 1,595 articles of adolescent depression. These 21 topics were further divided into four hot topics, seven stable topics and ten cold topics based on the results of the ARIMA model. In particular, we predict that “family environment and parenting styles”, “diagnosis, treatment and interventions” and “mental disorder and behavior problems” will receive much attention in the next five years. **Conclusion** This paper provides temporal characteristics of each topic. This has an important implication to choose promising research topics for researchers and journal editors. Keywords * adolescent depression * topic landscape * content analysis * ARIMA * prevalence of topics ## Background Over the years, the World Health Organization (WHO) has warned that depression is a major cause of disease and disability, affecting more than 264 million people worldwide. For example, depression is linked with a higher risk of suicide [1]. Studies have shown that depression increases significantly during adolescence and its course is frequently recurrent[1]. Depressive symptoms appearing in adolescence are related to more costly and frequent physical health problems[2], and increase the risk of suicide attempts[3]. Prior studies have investigated the explanatory theories of depression. Bernaras et al. (2019)[4] summarize a variety of causal factors, such as (1) biological factors including noradrenalin deficits, endocrine disorders, sleep-related disorders, alterations in brain structure, and genetics; (2) psychological factors including attachment theories, behavioral models, cognitive models, the self-control model, interpersonal theory, stressful life events, and sociocultural models. Extensive evidence illustrates that parenting styles or family situations are highly associated with adolescent depression[5, 6]. Parents with high conflicts, families with less warmth as well as aversiveness were found to augment the adolescent’s risk of depression. Parents also play an important role in prevention[7]. In addition, social networking sites can damage personal relationships and cause interpersonal-relationship conflicts. For instance, a mediation model was established to construct the relationship between social networking sites and adolescent depression [8]. Based on the classical theory of depression, a great number of studies have empirically explored the consequences as well as the determinants of adolescent depression. Multidisciplinary knowledge for adolescent depression is not only profound in content and diverse in form, but also scattered and complex in structure. To the best of our knowledge, there is no the topics of adolescent depression have been systematically reviewed across multiple disciplines in the past 50 years. There is also no a prediction of future prevalence of the topics in this literature is conducted. Hence, a clear mapping of this increasingly complex landscape is urgently established to help researchers make effective literature review. This extensive description of the field of adolescent depression will be valuable because it provides an understanding of how this field has evolved over time. Moreover, it reveals the consensus and differences among scholars, and reveals the blank of research on knowledge structure in this area. In this study, we aim to predict the prevalence of topics that might generate groundbreaking research for adolescent depression in the future. A computerized overview of the literature in the past five decades is firstly carried out. We further uncover a latent structure which consisting of 21 topics and their development tracks. Finally, we predict the popularity of these 21 topics based on the temporal characteristics of each topic, which will help researchers and journal editors to choose promising research topics. ## Materials and Methods ### Data processing Web of Science (WoS) core collection is the most authoritative citation index for scientific and scholarly research. We first collected the abstracts of all the articles that appeared in WoS. Notice that the first paper on adolescent depression appeared in 1967. Next, After removing duplicates, in total 2,072 articles were selected according to the following three criteria: 1) redaction in English; 2) including “adolescent depression” in title, abstract or keywords; 3) SCIE or SSCI indexed journals. After eliminating the articles which did not include an abstract and which were obviously unrelated to adolescent depression, we obtained a valid sample of 1,595 articles. The PRISMA flowchart in Appendix I illustrate the process of selecting or excluding the articles. ### Topic extraction methodology We followed the conceptual analysis approach outlined by Hsieh & Shannon[9]. Firstly, we randomly split the sample articles into three sets: 600 articles in the first set as the “training sample”, 500 articles in the second set as the “validation sample” and the remaining 495 articles in the third set as the “test sample”. We invited 30 student volunteers who majored in social sciences to read the full text of the first set of articles. The students and the author team worked together to develop sets of terms and of coding rules. Next, the students and the author team developed a set of *n* topics based on these terms. Finally, each student and author read the abstracts of the 1,595 articles and assigned a topic loading (i.e. a score between zero to one that refers to how likely does a particular article belong to a particular topic) to every topic. If an article does not belong to a particular topic, then its topic loading to this topic is zero. We generated an article-topic matrix with a size of 1,595**n*, in which each row represented an article and each column represented a topic. We assigned each article to the dominant topic with the highest topic loading. In order to verify the effect of conceptual analysis and considering that there is no high linear correlation among the feature vectors, we employed the t-SNE approach to carry out nonlinear dimension reduction on the data[10]. A t-SNE algorithm can ensure that the distribution of data in low dimension has high similarity with the distribution of original feature space. Most importantly, it can visualize the distance between topics in two-dimensional space, so that we can have an intuitive idea about whether the classification of *n* topics makes sense or not. Then, we followed the approach of Guo et al.[11]. adopted Herfindahl-Hirschman Index (HHI) to evaluate the level of topic diversity. HHI [12] is frequently used by economists and government regulators to measure the changes in market share, namely the dispersion degree of firm size in the market. Assuming that there are *N* enterprises in the industry, the value of HHI varies 1/*N* to 1. The higher the value, the higher the degree of market concentration and monopoly. HHI refers to the sum of squares of the percentage of total assets of each market competitor in an industry. Analogically, HHI in our case could be defined as the sum of squares of the ratio of the score of each article to the total score. The value of HHI for each topic further varies between 1/*N**i* and 1, where *N**i* is the number of articles with a score of not zero in topic *i*. However, the sample size of *N**i* is different for each topic. In order to make a fair horizontal comparison of concentration distribution between different topics, we normalized HHI for each topic as follows ![Formula][1] where *Y**ij* represents the score of article *j* in the topic *i* and *Y**i* the total score for the topic *i*. This normalized HHI takes value from zero to one. The larger the value of *H**i*, the higher the degree of concentration of topic *i*. ### Topic dynamics methodology We firstly classified each article according to the topic with the highest score, and counted the number of articles for each assigned topic. We further conducted time series analysis to predict the development trend of each topic in the next five years, and compared them with each other. The exponential smoothing model and the Autoregressive Integrated Moving Average (ARIMA) model are the two most widely used time series forecasting methods. Based on these two prediction methods, many other prediction methods have been born. In our case, most of the article count series of 21 topics are observed to involve autocorrelation. This fact will lead to the correlated residuals when fitting data by the regression model. Obviously, the phenomenon of autocorrelation hinder the performance of regression model to fit our data. Such defect of autocorrelation can be solved by the I (“integrated”) part in ARIMA. The article count sequence values will be differentiated between the current data and the previous one or several data, so that non-stationary series can be modeled. In the ARIMA model, the AR part is responsible for the linear regression of the previous time series value; the MA part is responsible for the linear regression of the previous random error series. ARIMA clearly meets a set of standard structures in time series data (i.e. article count series data) and provides a simple and powerful method for proficient time series forecasting [13]. According to *Time Series Analysis Forecasting and control* [14], the timing data predicted by the ARIMA model must be stable, which, by definition, has no trend and no periodicity (that is, its mean value has a constant amplitude on the time axis, and its variance tends to the same stable value on the time axis). We used Dickey-Fuller Test to judge the stability of time series data, which is a common unit root Test method. For the time series data that do not meet the requirements of stationarity, we made a difference to eliminate the periodic factors. The degree of differences to make the data stable was taken as parameter *d*. If the preprocessed time series was determined to be a stable non-white noise sequence, we determined the autocorrelation function (ACF) and partial autocorrelation function (PACF) as parameters *p* and *q*. We conducted a grid-search to calculate the scores of RMSE, AIC and BIC under different combinations of orders, and chose the order specifications with the lowest scores. To ensure the validity of the model, we drew the residual to check if it is completely random white noise, and performed the Ljung-Box test to formally check whether the error is uncorrelated among many lags [15]. Otherwise, we will improve the model by removing all the remaining trends. ## Results ### Data description The top line in Figure 1 depicts the annual growth rate of article count from 1967 to 2020. ![Figure 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/12/24/2020.12.22.20248745/F1.medium.gif) [Figure 1.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/F1) Figure 1. The percentage of publications (‰, top) and the percentage of the growth rate (%, bottom) in the field of “adolescent depression” It can be seen that after a period of volatility from 1980 to 1995, the growth rate of papers has tended to be flat in the past two decades. In 1987, the growth rate jumped from −50.00% of the previous year to 200.00%, and reached the highest level of 250.00% in 1990. However, as shown by the bottom line in Figure 1, the proportion of the articles in the field of adolescent depression to the total SCIE or SSCI indexed articles has become larger and larger since 1990. It means that over the past five decades, the field has been greatly developed. Especially, only one article (0.003‰) related to adolescent depression is retrieved in 1967 in WoS, but this number has grown to 140 (0.061‰) in 2019. The average annual growth rate in this field reached 18.7%. The papers related to the adolescent depression were found to be published in 586 journals. The authors of these papers are distributed in 71 countries, of which the United States accounts for 61.15%, Britain 9.56%, Canada 7.42%, Australia 6.95%, and China 4.54%. One twentieth of them come from the University of Pittsburgh. Table 1 summarizes the ten most frequent organizations, ten most frequent fields and ten most frequent outlets. We notice that the top ten fields listed in Table 1 account for about 92.13% of all articles. Additionally, the articles are spread across 79 WoS research fields. View this table: [Table 1](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/T1) Table 1 Top 10 organizations, research fields and research outlets ### Topic extraction Following the method described in Subsection *Topic extraction methodology*, the students and the author team initially classified the second set of 500 articles into 20 topics. However, we found that the optimal number of topics should be 21. Finally, the students and the author team classified the third set of articles into 21 topics (see Table 2). We believe that these 21 topics allowed to correctly categorize all the sample articles. In Figure 2, we find that the topics are evenly distributed, which validate our conceptual analysis and topic distribution. View this table: [Table 2.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/T2) Table 2. The 21 topics and their Herfindahl-Hirschman Index (HHI) ![Figure 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/12/24/2020.12.22.20248745/F2.medium.gif) [Figure 2.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/F2) Figure 2. t-SNE clustering of the 1,595 articles in two-dimensional maps. The analysis was statistically constrained to 21 clusters for distinct topics as colored nodes. Table 2 lists the score of normalized HHI for all the 21 topics. Some topics exhibit a high degree of concentration. For example, “psychosocial functioning and religiosity” (Topic 4), “internet and digital technology” (Topic 20) and “self-report and self-efficacy” (Topic 7) exhibit the three highest HHI, and they are all topics with strong topicality, resulting in a relatively concentrated distribution of scores. Topics exhibiting the lowest HHI include “diagnosis, treatment and interventions” (Topic 5), “family environment and parenting styles” (Topic 16) and “brain function, hormone changes and body parameters” (Topic 11) discussed “treatment” and “interventions”, “family environment” and “physiological factors”, etc., which are easy to be mentioned incidentally, so the score distribution is relatively uniform. Notice that we obtained consistent results between t-SNE clustering and degree of concentration (HHI in Table 2). For instance, the lower HHI scores are more scattered in the topic area shown in Figure 2 (e.g. Topic 5, 6, 11, 16). Conversely, the higher HHI score shows a larger cluster in the topic area shown in Figure 2 (e.g. Topic 4, 7, 20). These results confirm the rationality of our selected topics. ### Topic dynamics Table 2 displays the article count data in each topic. The article count for each topic ranges from 24 to 206 with mean 75.95 and standard deviation 47.77. “diagnosis, treatment and interventions” (Topic 5), “family environment and parenting styles” (Topic 16) and “screening, assessment and forecast” (Topic 6) are the three most prevalent topics. In contrast, “self-report and self-efficacy” (Topic 7), “psychosocial functioning and religiosity” (Topic 4), and “Internet and digital technology” (Topic 20) are the three least prevalent topics. In order to study the figures and publications that have contributed the most to each topic in the field of “adolescent depression”, we listed the top authors and publications that appear the most frequently on each topic in AppendixII, which can provide reference for research direction and popular trends in this field. Note that we listed multiple authors or publications if their frequencies of appearance were the same. We found that Peter M. Lewinsohn from Oregon Research Institute was one of the most productive and influential authors in Topic 1, 4, and 19. We also found that the Journal of The American Academy of Child and Adolescent Psychiatry was the most popular outlet for nine topics. Finally, we selected three most representative articles (i.e. with the highest topic loadings) of each topic and summarize the information in AppendixII. Notice that the number of articles published in the field of adolescent depression increased from one in 1967 to 140 in 2019, which indicate that more and more scholars have participated and made contributions in this field. Further, we expected to explore the rapid development period and the dynamic development law for each topic in the time dimension. In order to show the process more intuitively, we drew the ThemeRiver (see Appendix III), which indicated that Topic 2 (social support, primary care and mental health services), Topic 5 (diagnosis, treatment and interventions), Topic 6 (screening, assessment and forecast), Topic 11 (brain function, hormone changes and body parameters), and Topic 16 (family environment and parenting styles) experienced rapid growths during the last decade. Then, we built 22 time series comprising of the field integrally and 21 topics. An optimized ARIMA model is finally adopted to predict the proportion of publications in this field for each topic in the next five years (i.e. 2021–2025), respectively. The predicted average annual growth rate will be used as the indicator of the prevalence of future topic (see Table 3). In the future, the field is likely to expand steadily because its annual growth rate will reach 49.03%, but this figure is far lower than the peak annual growth rate of 250% in 1990. This fact indicates that adolescent depression will continue to be a hotspot of academic research, but it is unlikely to attract explosive attention in professional research communities. View this table: [Table 3.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/T3) Table 3. The results of ARIMA and prediction. We utilize the following criteria to divide 21 topics into three categories: hot topics with predicted annual growth rate greater than or equal to the field growth rate (i.e. 49.03%); stable topics with growth rate greater than zero or equal to zero but less than the field growth rate; cold topics with negative topic rate. As a result, we find four hot topics, seven stable topics and ten cold topics (see Table 3 and Table 2). ## Discussion Our ARIMA model predicted that “School education and school-based preventions” (Topic 12, - 131.74%), “Social stratification, socioeconomic statues and racial inequality” (Topic 3, −131.1%) and “Suicidal ideation, sleep disorders and other externalized symptoms” (Topic 9, −46.41%) might dramatically shrink. Likewise, “Obesity, BMI and body shame” (Topic 10, −35.79%) and “Psychosocial functioning and religiosity” (Topic 4, −12.35%) also showed a significant downward trend. That is probably because these topics have been addressed extensively over decades, which makes these fields of investigation saturated. On the contrary, the “Family environment and parenting styles” (Topic 16, 106.2%) and “Diagnosis, treatment and interventions” (Topic 5, 98.75%) showed a significantly positive trend of increase. In the past studies in this field, these two topics also showed a high degree of popularity (22.38% of the articles). Our ARIMA findings suggested that given the omnipresence of work-life imbalance in today’s industrial relations, it is reasonable that the research focus of the field might turn to family environment and treatment diagnosis. In addition, “Brain function, hormone changes and body parameters” (Topic 11, 68.24%) and “Prevalence, co-morbidity and probability of onset” (Topic 19,45.22%) also presented a sign of popularity. This might suggest that more and more researchers will develop and employ new scientific methods. It is worth mentioning that although “Internet and digital technology” (Topic 20, 43.64%) was a small topic in the past, our ARIMA analysis predicted that this topic might attract attention in the future, probably because on the one hand, more and more adolescents are addicted to the Internet, which may lead to depression; on the other hand, future research may investigate how to use the Web and digital technology to prevent adolescent depression. This study constructed all of the WoS literature in the field of adolescent depression over the last five decades. By describing the entire knowledge system at a relatively detailed level, we contribute to a rich understanding of the topic landscape of the field so that researchers and journal editors can appreciate the entire range of topics and choose the topics they wish to study in depth. In addition, because we help instructors identify important topics and assign groundbreaking articles related to each topic, the topic landscape provides information for related teaching and curriculum design. However, it must be pointed out that we only collected sample articles from WoS. Although WoS is the most comprehensive and authoritative source for high-impact publications, we may run the risk of missing articles published in emerging journals or books. Consequently, our findings may be biased. Future studies may also collect publication records from other freely-accessible online publications. ## Conclusions We firstly established a systematic overview of the research on adolescent depression in the last five decades: its evolution, topic landscape, and dynamics. We uncovered the latent structure of 21 different topics from 1,595 articles related to adolescent depression. Secondly, we utilized the ARIMA model to predict the prevalence of 21 topics in the next five years. Our forecasting results suggested that although this field might continue to be a hot research direction and attract more attention, ten topics might lose their popularity. ## Data Availability Some or all data, models, or code generated or used during the study are available from the corresponding author by request. ## Disclosure statement - Conflict of interest and funding The authors report no conflicts of interest. ## Ethics and consent N/A ## Funding information This work was supported by the National Natural Science Foundation of China under Grant 11601286; Key Research & Development Project of Shandong Province under Grant 2018GGX101007. ## Appendix I ![Figure.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/12/24/2020.12.22.20248745/F3.medium.gif) [Figure.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/F3) Figure. PRISMA flowchart of literature search and identification. ## Appendix II View this table: [Table 1.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/T4) Table 1. The most popular authors,outlets and the three most representative articles per topic. ## Appendix III ![Figure.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/12/24/2020.12.22.20248745/F4.medium.gif) [Figure.](http://medrxiv.org/content/early/2020/12/24/2020.12.22.20248745/F4) Figure. ThemeRiver drawing for topics developmen * Received December 22, 2020. * Revision received December 22, 2020. * Accepted December 24, 2020. * © 2020, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/) ## References 1. 1.Thapar A, Collishaw S, Pine DS, Thapar AK. Depression in adolescence. Lancet. 2012; 379(9820): 1056–1067. [https://doi.org/10.1016/s0140-6736(11)60871-4](https://doi.org/10.1016/s0140-6736(11)60871-4) PMID:12685676 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/S0140-6736(11)60871-4&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=22305766&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000301712300034&link_type=ISI) 2. 2.Wright DR, Katon WJ, Ludman E, McCauley E, Oliver M, Lindenbaum J, et al. Association of Adolescent Depressive Symptoms With Health Care Utilization and Payer-Incurred Expenditures. Academic Pediatrics. 2016; 16(1): 82–89. [https://doi.org/10.1016/j.acap.2015.08.013](https://doi.org/10.1016/j.acap.2015.08.013) PMID:26456002 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.acap.2015.08.013&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=26456002&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) 3. 3.Lewinsohn PM, Rohde P, Seeley JR. PSYCHOSOCIAL RISK-FACTORS FOR FUTURE ADOLESCENT SUICIDE ATTEMPTS. Journal of Consulting and Clinical Psychology. 1994; 62(2): 297–305. [https://doi.org/10.1037/0022-006x.62.2.297](https://doi.org/10.1037/0022-006x.62.2.297) PMID:8201067 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1037/0022-006X.62.2.297&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=8201067&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1994NF75000012&link_type=ISI) 4. 4.Bernaras E, Jaureguizar J, Garaigordobil M. Child and Adolescent Depression: A Review of Theories, Evaluation Instruments, Prevention Programs, and Treatments. Frontiers in Psychology. 2019; 10: 24. [https://doi.org/10.3389/fpsyg.2019.00543](https://doi.org/10.3389/fpsyg.2019.00543) PMID:30949092 [PubMed](http://medrxiv.org/lookup/external-ref?access_num=30949092&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) 5. 5.Yap MBH, Pilkington PD, Ryan SM, Jorm AF. Parental factors associated with depression and anxiety in young people: A systematic review and meta-analysis. Journal of Affective Disorders. 2014; 156: 8–23. [https://doi.org/10.1016/j.jad.2013.11.007](https://doi.org/10.1016/j.jad.2013.11.007) PMID:24308895 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1016/j.jad.2013.11.007&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=24308895&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) 6. 6.Samek DR, Wilson S, McGue M, Iacono WG. Genetic and Environmental Influences on Parent-Child Conflict and Child Depression Through Late Adolescence. Journal of Clinical Child and Adolescent Psychology. 2018; 47: S5–S20. [https://doi.org/10.1080/15374416.2016.1141357](https://doi.org/10.1080/15374416.2016.1141357) PMID:27043719 [PubMed](http://medrxiv.org/lookup/external-ref?access_num=27043719&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) 7. 7.Yap MBH, Mahtani S, Rapee RM, Nicolas C, Lawrence KA, Mackinnon A, et al. A Tailored Web-Based Intervention to Improve Parenting Risk and Protective Factors for Adolescent Depression and Anxiety Problems: Postintervention Findings From a Randomized Controlled Trial. Journal of Medical Internet Research. 2018; 20(1): 19. [https://doi.org/10.2196/jmir.9139](https://doi.org/10.2196/jmir.9139) PMID:29351895 [PubMed](http://medrxiv.org/lookup/external-ref?access_num=29351895&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) 8. 8.Wang PC, Wang XY, Wu YQ, Xie XC, Wang XC, Zhao FQ, et al. Social networking sites addiction and adolescent depression: A moderated mediation model of rumination and self-esteem. Personality and Individual Differences. 2018; 127: 162–167. [https://doi.org/10.1016/j.paid.2018.02.008](https://doi.org/10.1016/j.paid.2018.02.008) 9. 9.Hsieh HF, & Shannon, S. E. Three approaches to qualitative content analysis. Qualitative health research. 2005; 15(9): 1277–1288. PMID16204405 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1177/1049732305276687&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=16204405&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=000232660800009&link_type=ISI) 10. 10.van der Maaten L, Hinton G. Visualizing Data using t-SNE. Journal of Machine Learning Research. 2008; 9: 2579-2605. 11. 11.Guo L, Li, S., Lu, R., Yin, L., Gorson-Deruel, A., & King, L.. The research topic landscape in the literature of social class and inequality. PloS one. 2018; 13(7). PMID:29965983 12. 12.Hannan TH. Market share inequality, the number of competitors, and the HHI: An examination of bank pricing. Review of Industrial Organization. 1997; 12(1): 23–35. [https://doi.org/10.1023/a:1007744119377](https://doi.org/10.1023/a:1007744119377) PMID:32790471 [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1023/A:1007744119377&link_type=DOI) [PubMed](http://medrxiv.org/lookup/external-ref?access_num=32790471&link_type=MED&atom=%2Fmedrxiv%2Fearly%2F2020%2F12%2F24%2F2020.12.22.20248745.atom) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1997WQ00100003&link_type=ISI) 13. 13.Hyndman RJ, Athanasopoulos GJLB-SP. Forecasting: Principles and Practice. 2014. 14. 14.Box G JG, Reinsel G, Ljung G. Time series analysis: Forecasting and control. New York (USA): John Wiley & Sons; 2015. 15. 15.Ljung GM, Box GEP. MEASURE OF LACK OF FIT IN TIME-SERIES MODELS. Biometrika. 1978; 65(2): 297–303. [https://doi.org/10.2307/2335207](https://doi.org/10.2307/2335207) [CrossRef](http://medrxiv.org/lookup/external-ref?access_num=10.1093/biomet/65.2.297&link_type=DOI) [Web of Science](http://medrxiv.org/lookup/external-ref?access_num=A1978FL40100006&link_type=ISI) [1]: /embed/graphic-1.gif