Cellular and plasma proteomic determinants of COVI...

Study design and clinical cohorts

First, we considered individuals who presented with respiratory illness symptoms and had a physician-ordered SARS-CoV-2 test performed at the Barnes Jewish Hospital between 26 March 2020 and 28 August 2020 (Washington University 350 (WU350) cohort). Based on nasopharyngeal testing by PCR with reverse transcription (RT–PCR), participants were defined as SARS-CoV-2 positive (CV; 140 females and 173 males) or SARS-CoV-2 negative (NCV; 98 females and 40 males; Fig. 1a). The population was heterogeneous for body mass index (BMI), where nearly half of individuals were moderately or severely obese (BMI > 33; Extended Data Fig. 1a). Given that obesity is a recognized risk factor for severe COVID-19 (ref. 16) and known to strongly impact immune and proteomic homeostasis23, we chose to minimize these confounding factors in our analysis and excluded participants with moderate and severe obesity. Our selected CV and NCV cohorts consisted of 80 individuals, with age and sex distributions proportional to those of the nonobese individuals (53 CV individuals: median BMI, 25.5; interquartile range (IQR), 21.9–28.4; 27 NCV individuals: median BMI, 27.3; IQR, 25.6–29.8; Extended Data Fig. 1b,c). We cannot conclusively rule out SARS-CoV-2 infection in participants with negative SARS-CoV-2 tests because the false-negative rate of the nasopharyngeal RT–PCR test is reported to be 0.018–0.33 (ref. 24); however, 13 of 27 NCV individuals were retested, and none of the retests was positive for SARS-CoV-2, and none of the 27 individuals had a subsequent hospital readmission. The most common diagnoses at discharge were pneumonia and chronic obstructive pulmonary disease (Supplementary Table 1). The majority of nonobese individuals with COVID-19 were males (~70%), and the average age was 71 years. The age of individuals without COVID-19 was distributed more broadly, with an average age of 55 years old (Fig. 1a and Extended Data Fig. 1c).
We divided the participants with COVID-19 into three subgroups based on admission to an intensive care unit (ICU) and survival criteria: (1) CV_moderate, including individuals who were not admitted to the ICU during treatment, (2) CV_severe, including individuals who were admitted to the ICU, and (3) CV_deceased, including individuals who did not survive the illness (Supplementary Tables 2 and 3 and Fig. 1a). Most individuals admitted to the ICU were assigned a severity score based on a time-weighted average of discharge readiness25. Of note, our ICU-based definition of severity correlated well with known inflammation characteristics such as C-reactive protein levels (Extended Data Fig. 1e,f) and other common parameters of disease severity such as intubation and severity score (Extended Data Fig. 1d). Consistent with the known increase in COVID-19 severity with age, the average age of the deceased cohort was higher compared to individuals with moderate or severe COVID-19 (Supplementary Table 2 and Fig. 1a).

Fig. 1: Study outline and clinical characterization of healthy and COVID-19/non-COVID-19 cohorts.

Blood panels were performed for the following cohorts: A (25–34 years), n = 36; B (35–44 years), n = 21; C (45–54 years), n = 16; D (55–65 years), n = 24; E (>65 years), n = 25; CV, COVID-19 (32–91 years, 70.8 mean, 11.2 s.d.), n = 53; NCV, non-COVID-19 (32–87 years, 52.8 mean, 17 s.d.), n = 17. See Extended Data Fig. 1 for statistics related to b–d. a, Study outline. An asterisk represents four patients who had a BMI < 33. b–d, Selected WBC differentials (b); RBC, hemoglobin and platelet differentials (c); and clinical blood values (d) for cohorts A–E and CV/NCV cohorts. The lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to values that are no further than 1.5 times the IQR from either the upper or the lower hinge. RDW, RBC distribution width; ER, emergency room.
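The box plot convention used in the figure legends (hinges at the 25th and 75th percentiles, whiskers reaching the most extreme value within 1.5 times the IQR of a hinge) can be illustrated with a minimal Python sketch. The function name `box_stats`, the quartile interpolation method and the example values are assumptions made for illustration; they are not taken from the study or its plotting code.

```python
import statistics


def box_stats(values):
    """Summarize one group using the box plot convention described in the legends."""
    data = sorted(values)
    # Hinges and median. The "inclusive" quartile method is an assumption;
    # plotting libraries differ slightly in how they interpolate quartiles.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers stop at the most extreme observations within 1.5 * IQR of a hinge.
    lower_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)
    # Anything beyond the whiskers would be drawn as an individual point.
    outliers = [v for v in data if v < lower_whisker or v > upper_whisker]
    return {
        "lower_hinge": q1,
        "median": median,
        "upper_hinge": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical measurements, not data from the study.
example = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(example)
```

In this toy example the value 30 lies more than 1.5 times the IQR above the upper hinge, so the upper whisker ends at 9 and 30 is reported as an outlier.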

Age is a known susceptibility factor for COVID-19, and it also significantly affects the immune and proteomic homeostasis in healthy individuals20,26. Therefore, to discriminate the effect of aging from disease-associated changes, we expanded our study to include a cohort of 148 healthy nonobese individuals aged 25 to 80 years, divided into five age groups (ABF300 cohort; Fig. 1a and Extended Data Fig. 1d). These blood samples were collected before the COVID-19 pandemic as part of an ongoing study of healthy human aging. In total, we analyzed 219 samples using clinical blood tests, complete blood count differentials, mass cytometry immunostaining (CyTOF) and plasma proteomics. Joint analysis of the healthy ABF300 cohort and the WU350 COVID-19 and non-COVID-19 cohorts revealed unique age-specific and disease-specific features of immune and physiological responses to COVID-19.

Clinical laboratory characteristics

Complete blood count differential analysis showed a statistically significant increase in the absolute numbers of white blood cells (WBCs) in NCV and non-moderate CV groups (Fig. 1b; see Extended Data Fig. 1f for statistical evaluation between all groups). This increase was attributed to a statistically significant increase in numbers of neutrophils (adjusted P value (Padj.) < 0.001), while changes in lymphocyte and monocyte numbers did not reach statistical significance when comparing NCV and CV groups to the age-matched healthy control groups (Fig. 1b and Extended Data Fig. 1g). This observation is consistent with previous reports27,28, including the increase in immature granulocytes with disease severity14,29 (Fig. 1b).

We observed that red blood cell (RBC) count decreased within the oldest age group (A versus E; Padj. < 0.001) and that RBC count in NCV participants and individuals with moderate COVID-19 did not statistically differ from corresponding age-matched values, while individuals with severe COVID-19 had a statistically lower RBC count compared to healthy individuals of any age (Fig. 1c; see Extended Data Fig. 1g for statistical evaluation between all groups). Similar alterations were observed for hemoglobin levels (Fig. 1c and Extended Data Fig. 1g). Strikingly, RBC distribution width was distinctly associated with COVID-19 at all severity levels relative to both healthy people and individuals without COVID-19 (Fig. 1c), consistent with previous works30,31. Lastly, platelet counts demonstrated a decreasing trend that appeared specific to individuals with COVID-19, although it did not reach significance in our cohorts (Extended Data Fig. 1g).

Several biochemical parameters changed dramatically in an inflammation and/or COVID-19-specific manner. Albumin concentration, indicative of liver health, did not decrease with age, but it significantly decreased during inflammation, particularly in COVID-19 groups of all severity levels (Fig. 1d; see Extended Data Fig. 1h for statistical evaluation between all groups). Calcium significantly decreased in individuals with COVID-19 compared to all ages of healthy controls and individuals without COVID-19, consistent with previous reports32, yet our data show that individuals without COVID-19 demonstrated only a nonsignificant decreasing trend compared to healthy individuals (Fig. 1d). Of note, unlike other blood ions (potassium, sodium and chloride), calcium levels did not increase with age (Extended Data Fig. 2a,b). Biochemical measures indicative of kidney function showed patterns that were strikingly specific to individuals with COVID-19 and correlated with disease severity. Specifically, creatinine and urea nitrogen levels did not differ between healthy individuals and participants without COVID-19, while they increased progressively in individuals with COVID-19, with the highest levels reached in the deceased cohort (Fig. 1d). Notably, urea nitrogen levels, but not creatinine levels, were age dependent—increasing with age within the healthy range (Extended Data Fig. 2c,d). However, the significant urea nitrogen level increase in severe and deceased COVID-19 groups was not attributed to age, as the COVID-19-dependent increase was significant even when compared to the oldest age group (Padj. < 0.05, CV_severe versus cohort E; Padj. < 0.001, CV_deceased versus cohort E; Extended Data Fig. 1i). Other age-dependent biochemical properties observed in the healthy control cohort included C-peptide levels33, lactic acid dehydrogenase levels34, glucose35, thyrotropin36 and DHEA37 (Extended Data Fig. 2).

CyTOF analysis of peripheral blood mononuclear cells

To understand changes in immune cell populations with the disease, we performed mass cytometry (CyTOF) on PBMCs of 219 blood samples from the healthy and disease cohorts using 28 myeloid and lymphoid markers (Methods). A subset of target proteins was selected based on single-cell RNA sequencing (scRNA-seq) of PBMCs to maximize cellular subset resolution. Specifically, we included mucosal-associated invariant T (MAIT) cell and γδ T cell markers (TCRVA7.2 and TCRγδ, respectively) and antibodies to granzymes GZMK and GZMB because we38 and others39 have shown that these proteins discriminate two major effector memory T (TEM) CD8+ cell subpopulations. We identified the major cell populations such as T cells (CD4+ T cells, CD8+ T cells, γδ T cells and MAIT cells), B cells, natural killer (NK) cells and myeloid cells (Fig. 2a) using unsupervised clustering and distribution of key lineage markers (Extended Data Fig. 3b and Methods).

Fig. 2: Defining major immune subsets in PBMCs by CyTOF for healthy and COVID-19/non-COVID-19 groups.

CD4+ T cell activation in participants with COVID-19 comes from age and inflammation signatures. Cohorts: A, n = 38; B, n = 28; C, n = 20; D, n = 29; E, n = 33; NCV, n = 17; CV_moderate, n = 18; CV_severe, n = 18; CV_deceased, n = 12. a, Uniform manifold approximation and projection (UMAP) plot of all cell profiles with CyTOF, colored according to identified cell types. b, Cell proportions of each cluster across cohorts. c, UMAP plot of CD4+ T cells, colored by the cluster. d, Heat map of normalized gene expression for all genes used for CD4+ T cell analysis, per cluster. e, UMAP plots with the expression of selected markers. f, UMAP density plots characterizing the distribution of CD4+ T cells across conditions. g, MDS projection for all samples, colored by cohort. For each sample, cluster percentages were used to perform MDS. h, Cell proportions of each CD4+ T cell cluster across cohorts. In b and h, the lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to the values that are no further than 1.5 times the IQR from either the upper or the lower hinge. See Extended Data Fig. 3 for statistics related to b and Extended Data Fig. 4 for statistics related to h.

Differences between the major cell subpopulations can be appreciated directly from the distributions seen in cell density plots (Extended Data Fig. 3c). B cell proportions significantly increased in both SARS-CoV-2-positive and SARS-CoV-2-negative disease groups in line with previously reported results13 (Fig. 2b; see Extended Data Fig. 3d for statistical evaluation between all groups), indicating that this increase is a general characteristic of the immune response to pulmonary disease. Proportions of CD4+ T cells for NCV, CV_moderate and CV_deceased groups were decreased relative to age-matched healthy controls. A similar decrease in CD4+ T cell proportions during SARS-CoV-2 and influenza infection was recently reported40,41 (Fig. 2b and Extended Data Fig. 3d). Proportions of CD8+ T cells were increased in the group with moderate COVID-19 compared to the age-matched healthy group (group E), while there was no statistically significant difference for severe and deceased individuals relative to healthy individuals of any age. Of note, within the healthy cohort, CD8+ T cell proportions were significantly decreased in the oldest donors (group E; >65 years old) relative to younger groups. Next, we analyzed major immune cell populations individually (Fig. 2b).

CD4+ cells

We performed dimensionality reduction and clustering based on the relevant subset of markers (Methods) and identified 12 CD4+ T cell subpopulations (Fig. 2c,d). They included three subsets of CD4+ TEM cells (that is, CCR7−CD45RO+) divided based on EOMES and TBET expression, two subpopulations of central memory T (TCM) CD4+ cells (that is, CCR7+CD45RO+) distinguished by the level of CD45RO expression (medium or low), two subpopulations of regulatory T (Treg) CD25+ CD4+ cells (CD45RA positive and CD45RO positive), three subpopulations of naïve CD4+ T cells based on the combinatorial expression of CD25 and SELL (CD62L) and two subpopulations with generally low levels of both CD45RA and CD45RO surface markers, which we denoted as RAlowRO (Fig. 2c,d and Extended Data Fig. 4a). Changes in population structure associated with age and disease were evident from the density plots of individual groups (Fig. 2f). Multidimensional scaling (MDS), computed based on the cluster percentages, also demonstrated distinct age-dependent and disease-dependent sample separation (Fig. 2g).
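How per-sample cluster percentages become the input to MDS can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the cluster names, sample labels, helper function names and the use of Euclidean distance are hypothetical and are not taken from the Methods. The sketch stops at the pairwise distance matrix, which is what an MDS routine would then embed in two dimensions.

```python
from collections import Counter
from math import dist


def cluster_percentages(cell_labels, clusters):
    """Convert per-cell cluster labels of one sample into percentages per cluster."""
    counts = Counter(cell_labels)
    total = len(cell_labels)
    return [100.0 * counts[c] / total for c in clusters]


def distance_matrix(profiles):
    """Pairwise Euclidean distances between samples (assumed metric)."""
    return [[dist(p, q) for q in profiles] for p in profiles]


# Hypothetical clusters and samples, not data from the study.
clusters = ["naive", "TCM", "TEM"]
samples = {
    "young_1": ["naive"] * 6 + ["TCM"] * 3 + ["TEM"] * 1,
    "old_1": ["naive"] * 2 + ["TCM"] * 3 + ["TEM"] * 5,
    "old_2": ["naive"] * 3 + ["TCM"] * 3 + ["TEM"] * 4,
}

profiles = [cluster_percentages(cells, clusters) for cells in samples.values()]
distances = distance_matrix(profiles)

for name, row in zip(samples, distances):
    print(name, [round(d, 1) for d in row])
```

Samples with similar cluster composition (here the two hypothetical older donors) end up with small pairwise distances and would therefore sit close together in an MDS projection.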

A decrease in naïve CD4+ T cells was one of the most prominent age-associated features, and this population was further diminished in individuals with pulmonary disease, both in SARS-CoV-2-positive and SARS-CoV-2-negative groups (Fig. 2h; see Extended Data Fig. 4b for statistical evaluation between all groups). Interestingly, the population of naïve CD4+ T cells lacking SELL surface expression was distinctly upregulated (see naïve SELL− population in Fig. 2h; Extended Data Fig. 4b) in disease cohorts, likely comprising a transient population associated with an active immune response. A similar pattern was observed for a subset of TCM cells characterized by low levels of CD45RO expression (CM ROlow), which increased specifically in the disease conditions. Among the three subsets of TEM cells, the subpopulation lacking both TBET and EOMES expression (TBET−EOMES−CD4+) significantly increased in disease groups, likely indicating effector cells associated with the immune response. Proportions of CD4+ TEM cells that expressed both TBET and EOMES were specifically increased in moderate but not severe or deceased COVID-19-infection cohorts. This subpopulation of CD4+ T cells expresses cytotoxicity markers (GZMB and GZMK), which might be beneficial in disease progression. This population also appeared to accumulate with age, although the difference did not reach statistical significance (Fig. 3f and Extended Data Fig. 4b). This population likely corresponds to recently reported cytotoxic CD4+ T cells that dramatically increase in supercentenarians42.

Fig. 3: CD8+ T cells in COVID-19/non-COVID-19 groups lose the conventional effector memory phenotype, with a COVID-19-specific increase in HLA-DR+CD38+ CD8+ T cells.

Cohorts: A, n = 38; B, n = 28; C, n = 20; D, n = 29; E, n = 33; NCV, n = 17; CV_moderate, n = 18; CV_severe, n = 18; CV_deceased, n = 12. a, UMAP plot of all CD8+ T cells, colored by the cluster. b, Heat map of normalized gene expression for all genes used for CD8+ T cell analysis, per cluster. c, UMAP plots with the expression of selected markers. d, UMAP density plots characterizing the distribution of CD8+ T cells across conditions. e, MDS projection for all samples, colored by cohort. For each sample, cluster percentages were used to perform MDS. f, Cell proportions of each CD8+ T cell cluster across cohorts. See Extended Data Fig. 5 for statistics related to f. The lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to values that are no further than 1.5 times the IQR from either the upper or the lower hinge.

Additionally, we identified a distinct CD4+ T cell subpopulation, RAlowROCD25low, which progressively accumulated with age (Fig. 3f and Extended Data Fig. 4b). To our knowledge, this is the first time this cell population has been defined as age dependent. Interestingly, this population was increased in individuals with severe COVID-19 but not in those with moderate or no COVID-19, compared to younger controls (that is, group A or B; Extended Data Fig. 4b).

Taken together, the CD4+ T cell compartment demonstrates age-associated (increase in RAlowROCD25low, loss of naïve cells, increasing trend of TBET+EOMES+ and central memory populations) and inflammation-associated remodeling, where its key features (further loss of conventional naïve cells, increase in TBET−EOMES−, CD45ROlow and naïve SELL− cells) appear to be associated with the respiratory pathology immune response rather than COVID-19-specific responses, with the possible exception of the TEM TBET+EOMES+ subpopulation, which increases strongly in individuals with moderate COVID-19.

CD8+ cells

CD8+ T cells demonstrated the most striking remodeling in healthy aging and inflammatory contexts (Fig. 3). In total, we identified ten CD8+ T cell clusters (Fig. 3a–c and Extended Data Fig. 5a). In addition to naïve and CD8+ TCM cells, we defined eight distinct subpopulations of the CD8+ TEM cells—five subpopulations in healthy individuals and three subpopulations that arise during disease conditions (Fig. 3d–f and Extended Data Fig. 5b). MDS plots and density plots demonstrated distinct CD8+ compartment remodeling associated with aging and disease (Fig. 3d,e). Consistent with the published scRNA-seq data39 and our previous observations38, CD8+ TEM cells can be divided into two major populations based on expression of GZMK and GZMB (Fig. 3c). In healthy individuals, GZMB-expressing CD8+ TEM cells were mostly CD45RA positive, identifying them as TEMRA, and were divided into CD27+ (4.1% ± 3.7% of total CD8+ T cells) and CD27− (9.4% ± 10.8% of total CD8+ T cells) subpopulations (Fig. 3b,c). We recently demonstrated that proportions of GZMK+CD8+ T cells among the total CD8+ T cells increase during healthy aging38. However, surface markers distinguishing this population remained unclear. Here, we find that GZMK+CD8+ TEM cells can be identified by the surface expression of CCR5 and are predominantly CD57 negative (Fig. 3b,c and Extended Data Fig. 5d). These data further extend our previous observation to highlight the gradual age-dependent increase in GZMK+CD8+ TEM cells. Additionally, healthy aging was accompanied by a substantial decrease in naïve cells, a significant progressive increase in TCM cells and an increasing trend of TEMRA cells, although the latter did not reach statistical significance (Fig. 3f; see Extended Data Fig. 5b for statistical evaluation between all groups).
This observation extends our previous work, in which the proportion of GZMK+CD8+ T cells among the total CD8+ T cell population was shown to increase with age based on a comparison of young and old populations38. In addition to these age-dependent cell populations, two distinct PD-1-positive subsets were present in the healthy individuals, each at ~5% of total CD8+ T cells: GZMB+GZMK− and GZMB+GZMK+ TEM cells (Fig. 3f). These cell subpopulations were characterized by a PD1+CD57+CD45RA− phenotype, yet they differed in the expression of CD27 (Fig. 3c). These cell subpopulations were present at steady levels across the aging subgroups (Fig. 3f).

The disease-associated inflammatory response was accompanied by a pronounced remodeling of the CD8+ T cell compartment. Three major cell populations emerged in disease groups (Fig. 3f). The largest increase was observed for inflammatory GZMB+GZMK− and GZMB+GZMK+ T cells that differed from the corresponding healthy counterparts (TEMRA and TEM GZMK+ T cells, respectively) in that they lost CD45RA and CD27 surface expression (Fig. 3b,c). Lack of surface expression of CD45RA, CD27, CD28 and PD-1 proteins indicated that these could be effector cells43. Appearance of these cell populations was a shared feature of all individuals independent of COVID-19 status. However, an additional inflammatory cell population characterized by expression of HLA-DR, CD38 and PD-1 was found almost exclusively in individuals with COVID-19. The appearance of this cell population was recently reported13,44, but specificity to the COVID-19 immune response versus non-COVID-19 respiratory pathology immune response has not yet been established. The increase in these three inflammation-specific cell populations was paralleled by a decrease in the conventional steady-state subpopulations: TEMRA subpopulations and GZMK-expressing TEM subpopulations decreased to very low levels in all inflammatory groups (Fig. 3f). Interestingly, unlike in CD4+ T cells, naïve CD8+ T cells did not significantly decrease compared to corresponding age-matched controls (CV/NCV groups compared with E cohort; Fig. 3f and Extended Data Fig. 4b). This result suggests that, in this context, effector CD8+ T cells may arise from TEM subpopulations, for example, GZMK+ TEM cells acquiring the GZMK+GZMB+ inflammatory T (TINFLAM) phenotype and GZMB+ TEMRA cells acquiring the GZMK−GZMB+ TINFLAM phenotype.

Taken together, we find that peripheral blood CD8+ T cells undergo major remodeling during both healthy aging and inflammatory contexts. During aging, there is a loss of naïve cells and an increase of TCM and GZMK+ TEM cells. Inflammatory remodeling is characterized by a decrease in conventional TEM subpopulations and an increase in inflammatory effector-like subpopulations and HLA-DR+CD38+PD-1+ CD8+ T cells, which are specific to individuals with COVID-19.

NK cells, B cells and myeloid cells

NK cells were split into 11 subpopulations based on the expression of CD16, CD57, CD56, GZMK and SELL (Fig. 4a–c and Extended Data Fig. 6a). There was major inflammatory-associated remodeling of NK cells (Fig. 4d,e), as seven clusters demonstrated a difference between the healthy group and at least one inflamed group: CD56+CD57−GZMK+ (enriched in CV_moderate group), CD56−CD57−CD16− (enriched in disease groups except for CV_severe), CD56dimCD57+CD16− (enriched in NCV group) and CD56dimCD57low (enriched in NCV and CV_moderate groups; Fig. 4f and Extended Data Fig. 6b). Two clusters did not change with age but significantly decreased across all disease cohorts: CD56+CD57low and CD56+CD57+ (Fig. 4f; see Extended Data Fig. 6b for statistical evaluation between all groups). The CD56+CD57+SELL+ cluster showed a similar decreasing pattern but did not reach statistical significance. Only one cluster changed significantly with age: the CD56+CD57−SELL+ cluster decreased with age (cohort E was significantly lower than cohort A; Extended Data Fig. 6b), yet it did not change with inflammation. This observation is consistent with previous reports of a decrease in CD56+ NK cells with age45 (Extended Data Fig. 6c).

Fig. 4: Inflammatory remodeling of NK and B cells.

Cohorts: A, n = 38; B, n = 28; C, n = 20; D, n = 29; E, n = 33; NCV, n = 17; CV_moderate, n = 18; CV_severe, n = 18; CV_deceased, n = 12. a, UMAP plot of all NK cells, colored by the cluster. b, Heat map of normalized gene expression for all genes used for NK cell analysis, per cluster. c, UMAP plots with the expression of selected markers. d, UMAP density plots characterizing the distribution of NK cells across conditions. e, MDS projection for all samples, colored by cohort. For each sample, cluster percentages were used to perform MDS. f, Cell proportions of each NK cell cluster across cohorts. g, UMAP plot of all B cells, colored by the cluster. h, Heat map of normalized gene expression for all genes used for B cell analysis, per cluster. i, UMAP plots with the expression of selected markers. j, UMAP density plots characterizing the distribution of B cells across conditions. k, MDS projection for all samples, colored by cohort. For each sample, cluster percentages were used to perform MDS. l, Cell proportions of each B cell cluster across cohorts. In f and l, the lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to the values that are no further than 1.5 times the IQR from either the upper or the lower hinge. See Extended Data Fig. 6 for statistics related to f and l.

Our panel included a limited number of markers to resolve B cell subpopulations. B cells separated into six clusters (Fig. 4g–i) with no significant change detected in these subpopulations across age subgroups (Extended Data Fig. 6e), and there was no clear separation between samples in the MDS plot (Fig. 4k). However, the density plots indicated some inflammation-associated remodeling (Fig. 4j). Specifically, consistent with previous reports13, we observed an increase in CD27+CD38+ plasmablasts in individuals with severe COVID-19 (in comparison with age-matched healthy E cohort; Fig. 4l; see Extended Data Fig. 6e for statistical evaluation between all groups). This cell subpopulation is specific to individuals with severe COVID-19 and was not significantly different between healthy individuals and those without COVID-19. The B cell memory population, defined as CD27+CD38−SELL+, demonstrated a COVID-19-specific decrease in proportions among the B cells (statistically significant for individuals with severe COVID-19 versus participants in all age groups).

Myeloid cells demonstrated remodeling associated with infection (Extended Data Fig. 7a–d): proportions of classical monocytes and dendritic cells significantly decreased while proportions of HLA-DRlow monocytes significantly increased in the disease cohorts relative to healthy controls (Extended Data Fig. 7e,f). This DRlow subset was previously associated with an immunosuppressive monocyte phenotype46, consistent with the general features of immunosuppression reported for COVID-19 recently47.

Protein signatures of disease linked to healthy aging

Next, we used the SomaScan assay to analyze the proteomic signature from CV and NCV groups (WU350) and the healthy aging cohort (ABF300). SomaScan quantifies ~4,700 proteins in relative units of intensity, allowing data comparison within homogeneously collected and processed samples (Supplementary Tables 4 and 5). One caveat of our study was that samples for the cohorts were collected using different collection approaches: WU350 samples were collected in EDTA tubes, and ABF300 samples were collected in heparin tubes. While this did not affect the measurement of cellular proportions, the proteomic data had to be analyzed first within each cohort, and only then could the individual aging and disease signatures be compared across cohorts.

The comparison of CV and NCV groups identified 435 upregulated proteins in individuals with COVID-19 and 464 upregulated proteins in individuals without COVID-19 (Fig. 5a). Most of these differences were driven by the severe and lethal cases of COVID-19 (Fig. 5b). Overall, the up/down COVID-19-specific signatures demonstrated a progressive increase/decrease with disease severity (Fig. 5c,d). The same pattern emerged when each COVID-19 cohort was compared to individuals without COVID-19 (Extended Data Fig. 8a–c). A relatively small number of proteins were differentially expressed between the NCV and CV_moderate disease groups (20 CV-specific and 7 NCV-specific upregulated proteins; Fig. 5b and Extended Data Fig. 8d). Proteins upregulated in the CV group (Fig. 5c) included complement protein C9; interferon response markers MX1, ISG15 and IFIT3; ferritin subunits FTL and FTH1; heparin-binding growth factors pleiotrophin (PTN) and midkine (MDK); growth factors CLEC11A, HAMP, TINAGL1 and SFRP1; inflammation-associated soluble factors serum amyloid A1 (SAA1), fibrinogen-like protein 1 (FGL1) and granulin (GRN); soluble forms of surface receptors FOLR2 and members of the CD85 family (LILRB2 and LILRA3); and two additional proteins, CHST12 and DKK3 (Fig. 5d). Notably, FGL1 and LILRA3 have the potential to directly negatively impact CD8+ T cell activity by engaging with LAG3 or interfering with human leukocyte antigen (HLA) class I/II accessibility48,49. The proteins upregulated in NCV groups (Fig. 5e,f) compared to individuals with COVID-19 included AHSG (fetuin-A), KLRC4, CLEC3B, afamin (AFM) and others.

Fig. 5: SomaLogic plasma protein profiling demonstrates age-specific and inflammation-specific signatures in individuals with COVID-19.

Cohorts: A, n = 42; B, n = 27; C, n = 18; D, n = 29; E, n = 34; NCV, n = 27; CV_moderate, n = 18; CV_severe, n = 21; CV_deceased, n = 14. a,b, Volcano plot for differential expression of 4,801 proteins between NCV and all CV cohorts (a) or CV_moderate, CV_severe and CV_deceased cohorts separately (b). Protein names for the top ten upregulated and downregulated genes are shown. c,e, Box plot of average expression per sample of proteins upregulated (c) or downregulated (e) in CV cohorts compared to NCV cohort, across CV/NCV cohorts. d,f, Box plot with the scaled expression of selected proteins, upregulated (d) or downregulated (f) in the CV cohort compared to NCV, across CV/NCV cohorts. Genes that are differentially expressed with age are marked in red. g, Volcano plot for differential expression of 4,801 proteins between cohorts A and E. Protein names for the top ten upregulated and downregulated genes are shown. h,i, GSEA of all proteins upregulated (h) or downregulated (i) with age (cohorts E versus A) in proteins ranked according to differential expression between CV/NCV cohorts. j,k, Overlap between proteins upregulated (j) or downregulated (k) with age (cohorts E versus A) compared to proteins upregulated in COVID-19-related inflammation (CV versus NCV comparison). P values are one-sided and adjusted for multiple testing using the Benjamini–Hochberg method (Padj.). NES, normalized enrichment scores. l, Box plot with the scaled expression of selected genes in cohorts A–E. Genes that are differentially expressed with age are marked in red. In c–f and l, the lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to the values that are no further than 1.5 times the IQR from either the upper or the lower hinge. In a, b and g, P values and log fold change values were calculated using the limma package (two-sided test). Significant genes were selected after correction for multiple testing using the Benjamini–Hochberg method.
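The Benjamini–Hochberg correction mentioned in the legend can be illustrated with a short, self-contained Python sketch. It is a generic textbook version of the procedure, not the authors' code (the legend states that P values came from the limma package); the function name `benjamini_hochberg` and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that adjusted values
    # never decrease with rank (the step-up monotonicity constraint).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four proteins, not data from the study.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in padj]
print(padj, significant)
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone; proteins are then called significant when the adjusted value falls below the chosen threshold (0.05 in this sketch, an assumed cutoff).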

Given the different distribution of ages between the pulmonary disease cohorts, we next examined the degree to which age-related proteomic changes shape this behavior. Comparison of young (A) versus old (E) subgroups of the aging cohort revealed 241 proteins that were statistically upregulated with age and 140 downregulated proteins (Fig. 5g). Our data are consistent with the results previously published from our group and others26,50,51,52,53,54: proteins most upregulated with age were GDF15, SOST and ADAMTS5, as well as PTN, TAGLN, TREM2, WISP2, MYL3 and MLN, while most downregulated proteins included RET, SELL and KIT, as well as MSMP, CILP2, CTSV and CR2 (Extended Data Fig. 8e). Because we also characterized our cohorts using clinical blood tests, we compared proteomics data with the blood biochemistry analyses obtained for the same individuals from the healthy aging cohort (Fig. 1a and Supplementary Tables 6–10). A number of measured proteins strongly correlated with the clinical blood test results (Extended Data Fig. 8f): (1) creatine kinase strongly correlated with plasma levels of SLC26A7, CKB, ACTN2, TNNI2 and MYBPC1; (2) clinical alanine aminotransferase levels correlated with plasma levels of UGDH, ALDH1A1, ASL, ALDOB, PSAT1, ACY1, FBP1 and DCXR1; (3) C-peptide and insulin levels strongly anti-correlated with IGFBP1 and ADIPOQ (as expected, insulin levels measured by clinical blood test strongly correlated with insulin levels analyzed via SomaScan profiling); (4) clinical measurements of direct high-density lipoprotein cholesterol levels positively correlated with EHMT2 protein levels and anti-correlated with WNT5A protein levels, while the latter (5) also correlated with general triglyceride levels; (6) clinical osteocalcin levels were strongly correlated with plasma levels of CHAD protein; (7) clinical thyrotropin hormone levels strongly correlated with the corresponding protein (CGA/TSHB) levels in the proteomic data; and (8) lastly, clinically measured unsaturated iron binding capacity was strongly correlated with FTL/FTH1 and NEO1 protein levels. While this high level of concordance does not imply that SomaScan-based profiling can substitute for clinical measurements, it demonstrates the capability of unbiased profiling in characterizing the physiological state.

Gene-set enrichment analysis (GSEA) demonstrated that the COVID-19 versus non-COVID-19 differentially expressed proteins were strongly associated with the up/down aging signatures, consistent with the differences in the age distribution of those cohorts (Fig. 5h,i). Furthermore, we found that the COVID-19 versus non-COVID-19 signatures significantly overlapped with the up/down aging signatures (Fig. 5j,k) but not vice versa (Extended Data Fig. 8g,h), underscoring the importance of taking age into account when considering determinants of COVID-19. We found 337 unique proteins that were upregulated and 421 proteins that were downregulated in individuals with COVID-19 compared to those without COVID-19 and that were not age dependent. Age-associated proteins that were also significantly different in the COVID-19 versus non-COVID-19 comparison included PTN, SFRP1 and DKK3, which increased with age, and CLEC3B, which decreased with age. It is interesting to note the dissimilar age-associated behavior of two heparin-binding proteins (MDK and PTN) that were both upregulated in the COVID-19 group relative to the non-COVID-19 group (Fig. 5l). Consistent with our data, PTN was previously associated with aging52, whereas MDK does not change with age, although serum concentrations of MDK are linked to heart injury conditions55. Another protein associated with age and COVID-19, SFRP1, a soluble mediator of WNT signaling, has also been linked to modulation of cardiac function56. Another WNT signaling modulator, DKK3, was previously linked to aging and is considered a major indicator of muscle atrophy57. A small number of proteins behaved in the opposite manner between aging and COVID-19 (11 upregulated with CV and downregulated with age, and 7 vice versa); these included inflammatory mediators (CCL21 and SEMA4A) and apolipoproteins (APOA4 and APOE2; Extended Data Fig. 8g,h).
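One common way to ask whether two protein signatures share more members than expected by chance is a hypergeometric test. The sketch below assumes that approach; this section does not specify the test used for the overlap analysis, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_overlap_pvalue(universe, size_a, size_b, overlap):
    """Upper-tail probability P(X >= overlap).

    X is the number of signature-A proteins found when size_b proteins
    are drawn at random, without replacement, from a universe of
    measured proteins that contains size_a signature-A proteins.
    """
    upper = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts (illustrative only): 1,000 measured proteins,
# signatures of 100 and 80 proteins sharing 30 members.
p_value = hypergeom_overlap_pvalue(1000, 100, 80, 30)
```

Because the test conditions on the set of proteins that could have been detected, the choice of `universe` (here, all measured proteins) matters as much as the overlap itself.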

COVID-19 protein profile linked to hepatocytes and muscle secretomes

To understand the broad-level differences between individuals with COVID-19 and individuals without COVID-19, we performed pathway enrichment analysis on the differential proteins. Several pathways were upregulated or downregulated in a disease-specific manner (Fig. 6a). The pathways most upregulated in individuals with COVID-19 were associated with extracellular matrix proteins (for example, WISP2 and FBLN5) and were also profoundly associated with age (Fig. 6b,c). Similarly, soluble forms of TREM2 and IGFBP2 were increased in individuals with COVID-19 and older healthy individuals. Several COVID-19-specific pathways were independent of aging signatures and included inflammatory processes (interferon, IL-6 and IL-2/STAT5), complement pathways and glycosaminoglycan metabolism (Fig. 6d,e). Conversely, proteins from MAP kinase-associated pathways were downregulated in the plasma of individuals with COVID-19 relative to that of individuals without COVID-19. These proteins were mostly independent of age and included MAP2K3, BRAF, HRAS and MAP2K4 (Fig. 6f,g).

Fig. 6: Pathway enrichment analysis distinguishes COVID-19 from non-COVID-19 inflammation.

Cohorts: A, n = 42; B, n = 27; C, n = 18; D, n = 29; E, n = 34; NCV, n = 27; CV_moderate, n = 18; CV_severe, n = 21; CV_deceased, n = 14. a, Volcano plot for GSEA for CV/NCV comparison. The top 20 upregulated and downregulated pathways, grouped by function, are shown. Pathways differentially expressed with age are marked in red. P values and log fold change values were calculated using the limma package (two-sided test). Significant genes were selected after correction for multiple testing using the Benjamini–Hochberg method. b,d,f, GSEA for selected pathways upregulated with CV (Padj. < 0.05) and with age (P < 0.05), or for pathways upregulated in CV and age (b), in CV but not age (d) or downregulated with CV but not with age (f; Padj. < 0.05). P values are one-sided and adjusted for multiple testing using the Benjamini–Hochberg method (Padj.). NES are also shown. c,e,g, Box plots with the scaled expression of selected genes, upregulated in CV and age (c), in CV but not age (e) or downregulated with CV but not with age (g). Genes that were differentially expressed with age are marked with red. The lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to the values that are no further than 1.5 times the IQR from either the upper or the lower hinge. ECM, extracellular matrix; ES, enrichment score.
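The Benjamini–Hochberg adjustment named in the legend can be sketched as follows. This is a minimal plain-Python illustration of the standard step-up procedure, not the limma or fgsea implementation; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for position in range(m, 0, -1):
        idx = order[position - 1]
        running_min = min(running_min, pvalues[idx] * m / position)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values (illustrative only).
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in padj]
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum ensures that an adjusted value never exceeds that of a larger raw P value.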

We next evaluated whether cell-type-specific signatures of PBMC subpopulations are enriched in the COVID-19-specific proteome. None of the individual cell types were enriched; however, a myeloid signature (monocytes and neutrophils) was upregulated in individuals with COVID-19 (Extended Data Fig. 8). To further investigate cell-type specificities, we extracted tissue-specific transcriptional signatures from the Genotype-Tissue Expression (GTEx) database (Fig. 7a; see Methods for details and Supplementary Table 11 for list of genes) and evaluated these signatures against the proteomic data ranked by the comparison of CV versus NCV groups or by the aging comparison (A versus E cohorts; Fig. 7b). Individuals with COVID-19 had a pronounced increase in liver-specific proteins accompanied by a significant decrease in muscle-specific proteins. These tissue-associated changes were unique to the COVID-19 cohort and did not vary with age. In contrast, artery/aorta-specific proteins were highly upregulated with age (Fig. 7b).
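To make the idea of evaluating a tissue signature against a ranked proteome concrete, the following minimal sketch computes a weighted running-sum enrichment score in the style of preranked GSEA. It is an illustration under stated assumptions, not the fgsea implementation, and it omits the permutation-based P values and normalization; the function name, protein labels and statistics are hypothetical.

```python
def enrichment_score(ranked, signature, weight=1.0):
    """Running-sum enrichment score for one signature.

    ranked: list of (protein, statistic) pairs sorted by descending
            statistic (for example, a CV versus NCV test statistic).
    signature: set of protein names (for example, a tissue signature).
    Returns the running-sum value with the largest absolute deviation
    from zero.
    """
    hits = [abs(stat) ** weight if name in signature else 0.0
            for name, stat in ranked]
    n_hits = sum(1 for name, _ in ranked if name in signature)
    n_miss = len(ranked) - n_hits
    hit_total = sum(hits)
    if n_hits == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("signature must partly overlap the ranked list")
    running, best = 0.0, 0.0
    for (name, _), hit in zip(ranked, hits):
        if name in signature:
            running += hit / hit_total   # step up on a signature member
        else:
            running -= 1.0 / n_miss      # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list and signatures (illustrative only).
ranked = [("P1", 3.0), ("P2", 2.0), ("P3", 1.0), ("P4", -1.0)]
es_top = enrichment_score(ranked, {"P1", "P2"})
es_bottom = enrichment_score(ranked, {"P3", "P4"})
```

A signature concentrated at the top of the ranking yields a positive score (as for the liver-specific proteins in the CV versus NCV comparison), whereas one concentrated at the bottom yields a negative score (as for the muscle-specific proteins).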

Fig. 7: COVID-19 plasma protein signatures are linked to hepatocytes and skeletal muscle secretomes.

Cohorts: A, n = 42; B, n = 27; C, n = 18; D, n = 29; E, n = 34; NCV, n = 27; CV_moderate, n = 18; CV_severe, n = 21; CV_deceased, n = 14. a, Outline of GTEx-based analysis of SomaScan data. b, Enrichment of GTEx-derived tissue signatures in NCV versus CV and cohort A versus E comparisons, performed using fgsea R package. P values are one-sided. c, UMAP of liver atlas cells (GSE124395) with outlined cell types (left) and with the mean expression of 54 liver-related genes from GTEx, upregulated in the CV group (right). d, Heat map of normalized gene expression for 54 liver-related genes from GTEx, upregulated in CV group, for each condition. e, Box plot with the scaled expression of selected liver-related genes, upregulated in COVID-19, across CV/NCV cohorts. f, UMAP of human aortic cells (GSE155468) with outlined cell types (left) and mean expression of artery/aorta-related genes from GTEx, upregulated in cohorts E versus A. g, Heat map of normalized gene expression for artery/aorta-related genes from GTEx, upregulated in cohorts E versus A, for each condition. h, Box plot with the scaled expression of selected artery/aorta-related genes, upregulated in cohort E versus A, across A–E cohorts. In e and h, the lower and upper hinges of all box plots represent the 25th and 75th percentiles. Horizontal bars show the median value. Whiskers extend to the values that are no further than 1.5 times the IQR from either the upper or the lower hinge. DC, dendritic cell; NS, not significant.

Given the distinct enrichment of these tissues, we mined public scRNA-seq data for the liver58 and aorta59 to determine whether any specific cell type drives these signatures. When projecting the 54 liver-specific genes enriched in the comparison of CV and NCV groups, we observed a very strong specificity to hepatocytes (Fig. 7c–e), indicating an important role for hepatocytes in regulating the plasma protein level alterations seen in COVID-19. The artery/aorta-specific signature enriched in aging also demonstrated cell-type-specific enrichment, in this case in smooth muscle cells (Fig. 7f–h).
