Introduction

The ontology of lung cancer is a multi-gene involved, multi-stage, long-term and extremely complex pathological process (Greaves & Maley, 2012). Emerging evidence indicates that multistep tumorigenesis develops through a series of progressive pathologic changes known as preneoplastic or precursor lesions, which present prior to the onset of cancer (Pich et al., 2022). With the development of imaging technology and the application of molecular biomarkers for lung cancer screening in recent years, the detection rate of early lung cancer, especially precancerous lesions, has increased significantly. Some of the abnormalities in precancerous lesions are reversible, early diagnosis and guiding precancerous patients away from carcinogens or early intervention and blocking may reverse their further development to minimize the risk of cancer (Yatabe, Borczuk, & Powell, 2011).

As the most common histologic subtype of lung cancer, lung adenocarcinoma (LUAD) predominantly arises from alveolar type 2 (AT2) cells (Zacharias et al., 2018). For the classification of LUAD and its precursors, atypical adenomatous hyperplasia (AAH) is considered to be the first step in a continuum of histomorphologic changes in malignant adenocarcinomas, and is in continuity with preinvasive adenocarcinoma in situ (AIS) in terms of morphological alterations, then microinvasive lesions termed minimally invasive adenocarcinoma (MIA), and eventually invasive adenocarcinoma (IA) (Sivakumar et al., 2017; Travis et al., 2011). However, the scarcity and difficult accessibility of adequate samples limit the investigation of the cellular and molecular landscape of LUAD precursors. Alternatively, studies also found AAH-like lesions in mouse models, as well as adenomas and AIS, prior to the appearance of LUAD (Nemanja Despot Marjanovic et al., 2020; Weichert & Warth, 2014). These models accurately mimic human lung precancerous lesions at the molecular and histopathological levels, providing well-suited materials for our study. Gradually shifting the target of lung cancer treatment from middle and advanced patients with clinical symptoms to asymptomatic patients with early or precancerous lesions is precisely the new concept of oncotherapy in the future.

The tumor microenvironment (TME) homeostasis is determined by close crosstalk within and across all cellular components, including malignant, immune, endothelial, and stromal cells (Vitale, Manic, Coussens, Kroemer, & Galluzzi, 2019). Unveiling potential communications between malignant cells and TME underlying oncogenesis are critical for the future development of mechanism-informed, subset-targeted tumor immunotherapy strategy (Feng & Gao, 2021; Wen, Lambrecht, Ju, & Tacke, 2021). Tumor-associated macrophages (TAMs) are a cell population with plasticity and heterogeneity in the TME. As an important player of tumor pathobiology, on the one hand macrophages may be educated by neoplastic cells to provide a favorable microenvironment, supporting the malignant transformation, disease progression in spite of immunosurveillance, metastasis, and resistance to treatment; and on the other they may play a role in anti-tumor immunity (Mantovani, Marchesi, Malesci, Laghi, & Allavena, 2017; Ruffell, Affara, & Coussens, 2012; Sica & Mantovani, 2012). Part of such heterogeneity may be attributed to the diversity of phenotypic characteristics, metabolic patterns, and functional profiles of TAMs in response to environmental perturbations (Cassetta & Pollard, 2018).

Tumorigenesis relies on reprogramming of cellular metabolism, and metabolic mining of immune cells in the TME have expanded our understanding of tumor-associated metabolic alterations at different stages of tumor development (Bian et al., 2020; W. Li et al., 2018). Metabolic changes of TAMs in TME have been reported to be accompanied by phenotypic and functional changes (Vitale et al., 2019; Xia et al., 2020). The metabolic landscape of TAMs in the context of tumor initiation and the interdependent relationship of development, metabolism, and functional plasticity in TAMs remain largely unraveled. Exploring the major metabolic circuits by which TAMs remodel the TME and digging metabolic clues that affect the functional polarization of TAMs will contribute to the proposal of immunometabolic strategies that utilize TAMs for tumor prevention and therapy.

In order to simulate the process of human lung adenocarcinogenesis, we established a spontaneous LUAD mouse model and used single-cell RNA sequencing (scRNA-seq) to parse the metabolic changes of malignant epithelial cells and macrophages during precancerous period, explore the metabolic heterogeneity and pro-tumor mechanism of alveolar macrophage, so as to probe the key links of metabolic pattern shifts whereby alveolar macrophage subset condition epithelial cell transformation. In depth understanding of the molecular and cellular mechanisms underlying the acquisition of aberrant phenotypes at precancerous and further carcinogenesis stage may accelerate the identification of precancerous metabolic intervention targets. Therefore, the diagnosis and prevention of lung cancer can be advanced to the stage of precancerous lesions, and measures can be taken to prevent or reverse the further development of precancerous lesions.

Results

Histopathology profiling and scRNA-seq of precancerous lesions in mice

A/J mice have the highest incidence of spontaneous lung tumors among various mouse strains (Landau, Wang, Yang, Ding, & Yang, 1998). To more comprehensively mirror tumor initiation and progression process of human lung cancer, A/J mice were feed for 12-16 months, which resulted in three recognizable precancerous and cancerous lesions in the lung. A total of 19 tissues were histologically categorized by two pathologists into four subtypes, including eight normal tissues, three AAHs, three adenomas, and five AISs, which confirmed by Ki67 staining, spanning the cascade from normal to tumor (Figure 1B and 1C). To characterize cell diversity along tumorigenesis, we conducted a time-ordered single-cell transcriptomic profiling starting with normal tissue and ending with AIS (Figure 1A). For each sample, we isolated single cells without prior selection for cell types and used the 10x Chromium platform to generate RNA-seq data. After removing low quality cells, a total of 119698 cells that passed quality control were retained for subsequent analysis, which yielded a median of 910 detected genes per cell. The cell count and genes expressed in each sample were provided in Figure S1A and S1B.

Histopathological grading and single-cell transcriptome profiles of mouse LUAD precancerous lesions.

A. Analysis and experimental flow chart of this study.

B. H&E images of normal tissue and three stages of precancerous lesions (AAH, adenoma, and AIS) in mice. Scale bar: 100 µm.

C. Immunofluorescence staining of ki67 at the four histopathological stages. Scale bar: 100 µm.

D. UMAP plot of 13 cell types from mouse scRNA-seq data.

E. Dotplot of marker genes in all cell types.

F. Cell distribution at the four stages.

G. Reduced transcriptional homogeneity with progression of precancerous lesions. Transcriptional heterogeneity between cells is inversely proportional to normalized mutual information (NMI).

AAH: atypical adenomatous hyperplasia; AIS: adenocarcinoma in situ; MIA: minimally invasive adenocarcinoma; IA: invasive adenocarcinoma; CM: conditioned medium; DC: dendritic cell; SMC: smooth muscle cell; ***p < 0.001.

To identify distinct cell populations based on gene expression patterns, we performed dimensionality reduction and unsupervised cell clustering using methods implemented in Seurat, followed by removing batch effects among multiple samples. As shown using Uniform Manifold Approximation and Projection (UMAP), profiles along the cascade were derived, and a total of 13 main cell clusters were finally identified, which we defined as the single-cell transcriptome atlas of mouse lung in premalignant and early-malignant lesions (Figure 1D). Based on the expression of canonical markers, we classified immune cells into T cells (Cd3e, Cd3d, Cd3g), B cells (Cd79a, Ms4a1, Igkc), neutrophils (Retnlg, S100a8, S100a9), macrophages (Mrc1, Cd68, C1qa), monocytes (F13a1, Ms4a6c), dendritic cells (H2-Eb1, H2-Aa, H2-Ab1), and mast cells (Csf1, Il6), and identified five clusters of non-immune cells including epithelial cells (Sftpc, Scgb1a1, Sftpb), fibroblasts (Col1a1, Dcn, Gsn), endothelial cells (Cdh5, Ramp2, Cldn5), mesothelials (Gpm6a, Upk3b, Msln), and smooth muscle cells (Acta2, Tagln, Myh11). In addition, a cluster expressing prolifering markers (Top2a, Mki67) was identified, named cell cycle cells (Figure 1E). Each of these populations was captured from different pathological stages of different mice (Figure 1F). The proportion of cell types across the four stages showed a dynamic trend, while transcriptional homogeneity was reduced with precancerous progression, illustrating the importance of heterogeneity in tumorigenesis and also proving the reliability of the sampling in this study (Figure S1C, S1D and 1G).

Identification of initiation-associated epithelial marker and resolution of metabolic pattern in malignant epithelial cells

Tumorigenesis relies on reprogramming of cellular metabolism. To obtain comprehensive cell type-specific metabolic profiles, metabolism-related genes were extracted from Kyoto Encyclopedia of Genes and Genomes (KEGG) database and metabolic clustering was performed on scRNA-seq data. The metabolic subsets and cell types corresponded well, indicating metabolic heterogeneity among cell types (Figure S2A and S2B). Through Gene Set Enrichment Analysis (GSEA) against metabolic pathways, 16 subsets manifested distinct metabolic activities, with neutrophils, epithelial cells, and macrophages showing relatively higher metabolic activities (Figure S2C).

Generally, malignant transformation of epithelial cells is the starting point for the development of precancerous lesions. AT2 cells, ciliated cells, and AT1 cells were identified in our epithelial transcriptome (Figure 2A). At the stages of LUAD evolution, despite possible sample heterogeneity and other interference, we observed increased interactions between epithelial cells and surrounding stromal and immune cells in the microenvironment, indicating gradually frequent cell-cell communication and showing a tendency to malignant transformation (Figure 2B). It was worth noting that the Spp1-Cd44 ligand-receptor complex showed more interactions in AAH stage than AIS stage, between epithelial cells and other cell types, especially myeloid cells (macrophages, monocytes, and neutrophils). SPP1-CD44 axis was reported to be involved in tumor initiation and growth, and CD44 level was found to higher in AAH, compared to AIS and IA (Kerr et al., 2004; Pietras et al., 2014). In addition, the pattern of the App-Cd74 signaling suggested incremental communications between epithelial cells and immune cells during the occurrence of LUAD, and it was reported to be implicated in tumor progression and metastasis (Anderson et al., 2023). Then we sought to identify initiation-associated epithelial markers underlying LUAD pathogenesis. Ldha was gradually enriched in the early occurrence of lung cancer, and the increased expression trend along with tumor initiation process were verified by immunofluorescent staining (Figure S2D and 2C). Moreover, analysis of The Cancer Genome Atlas (TCGA) database indicated that the initiation-associated gene LDHA was highly expressed in LUAD and showed an increasing trend in the early stages of the disease, and its high expression was correlated with poor prognosis of LUAD patients (Figure 2D-F). Thus, it deserves to be further evaluated as potentially promising biomarker during lung tumorigenesis.

Identification of initiation-associated epithelial marker Ldha and analysis of malignant epithelial cells.

A. UMAP plot of epithelial cell subtypes.

B. Dotplot showing the significance (P-value) and strength (communication probability) of specific interactions between epithelial cells with other cell types at the four stages. Data was obtained by cell chat analysis.

C. Immunofluorescence staining of Ldha at the four stages. Scale bar: 50 µm.

D. LDHA expression in LUAD and normal tissues from the TCGA database.

E. LDHA expression in stage I-IV LUAD and normal tissues from the TCGA database.

F. Correlation of LDHA expression with overall survival of LUAD patients in TCGA database.

G. Copy number variations (CNVs) (red, amplifications; blue, deletions) across the chromosomes (columns) inferred from the scRNA-seq of each epithelial cell (rows).

H. scMetabolism analysis of malignant epithelial cells at precancerous stages.

I. GSVA enrichment analysis of malignant epithelial cells at precancerous stages. LUAD: lung adenocarcinoma; *p < 0.05.

To distinguish malignant cells from non-malignant cells based on karyotypes, chromosomal copy number variations (CNVs) from each epithelial cell’s profile were inferred (Figure 2G and S2E). There were some CNV variations in the normal group, which may be related to the nature of the algorithm, the sample collection operation or the late processing of the cells; these cells were excluded from subsequent analyses. To explore the phasic metabolic changes in malignant epithelial cells, metabolic activity was quantified using the scMetabolism package. The results indicated that malignant epithelial cells in adenoma stage were enriched in lipid metabolism-related pathways, while in AIS stage, significant enrichment of glycometabolism-related pathways was observed (Figure 2H). The Gene Set Variation Analysis (GSVA) enrichment analysis showed a concordant variation trend (Figure 2I). These data suggested that epithelial cells manifest diverse metabolic patterns during malignant transition.

S100a4+ alv-macro exhibited active fatty acid metabolism in the AAH phase

With the development of tumor metabolism research, it has been found that the metabolism of other cell types in the microenvironment besides epithelial cells can also regulate tumorigenesis. Macrophage was the most prominent cell type that interacted with malignant epithelial cells, as shown by Cell chat analysis (Figure S3A). Macrophages were composed of alveolar macrophages (Itgax and Siglecf) and interstitial macrophages (Itgam and Cx3cr1) in subclustering (Figure S3B and S3C). Further analysis revealed a higher correlation between alveolar macrophages and malignant epithelial cells (Figure 3A and S3D). To resolve the metabolic pattern of alveolar macrophages at a higher resolution, we subclassified them into four subpopulations based on gene expression (Figure S3E and S3F). Enrichment analysis was performed for each of the four subpopulations, and scMetabolism analysis revealed that one of the alveolar macrophage subpopulations already showed a metabolically active state at the AAH stage, especially lipid metabolism, which we hereafter termed as S100a4+ alv-macro (Figure 3B and C). As a representative, the overall abundance trends of glycerolipid metabolism, glycerophospholipid metabolism, and biosynthesis of unsaturated fatty acids were displayed in Figure 3D. Next, we used the Compass algorithm to model the metabolic flux of cells for S100a4+ alv-macro, quantitatively analyzed the metabolic activity of cells, and then focused on metabolic enzymes and genes related to lipid metabolism for in-depth analysis. Fatty acid metabolism was found to be relatively more active in the AAH phase, which was represented by changes in palmitoyl-CoA desaturase, palmitoyl-CoA hydrolase, and carnitine palmitoyl-transferase (Figure 3E). Consistently, the expression levels of related genes Cpt1a and Acot2 were also higher in precancerous lesions relative to normal circumstance (Figure 3F). In our preliminary validation, the presence of F4/80+/Cd11c+/S100a4+ alv-macro was identified in the corresponding tissues by multiplexed immunohistochemistry staining, and their number and proportion were observed to be significantly greater in AAH samples than in normal samples (Figure S3G, S3H and 3G). Cpt1a is responsible for the fatty acid β-oxidation in mitochondria (Nakamura, Yudell, & Loor, 2014), and its positivity was found more frequently in the S100a4+ alv-macro of AAH group (Figure 3G), suggesting fatty acid metabolism in this subpopulation was activated at the AAH stage.

S100a4+ alv-macro was active in lipid metabolism at the AAH stage.

A. Malignant epithelial cells showed the strongest communication weight with alveolar macrophages, as shown by Cell chat analysis.

B. UMAP plot of S100a4 expression in alveolar macrophages.

C. scMetabolism analysis of S100a4+ alv-macro at the four stages.

D. Changes in scores of representative lipid metabolism-related gene sets across the four stages.

E. Compass analysis showing the reaction activities of fatty acid metabolism across the four stages.

F. Density plots of Cpt1a and Acot2 at the four stages. x-axis represents the gene expression level, and y axis represents the density of numbers of cells.

G. Multiplex immunofluorescence validation of F4/80+/Cd11c+/S100a4+ alv-macro in mouse normal and AAH tissues, and comparison of tissue expression of Cpt1a in this subpopulation. Scale bar: 20 µm.

Angiogenesis in S100a4+ alv-macro was associated with fatty acid metabolism

Metabolic profile is coupled with phenotype and functional program of macrophages. Among all pathological stages, although there was no significant difference in the cell proportion of S100a4+ alv-macro in macrophages, its proportion in alveolar macrophages was obviously the highest in AAH stage (Figure S4A and 4A). To evaluate changes in the functional programs of this macrophage subpopulation in the context of precancerous lesions, we performed a GSEA enrichment analysis of the well-known macrophage phenotypes. It was revealed that the capacities for angiogenesis and M2-like polarization were found to be higher in AAH or other precancerous stages relative to normal stage (Figure 4B). Correlation analysis was used to measure the relationship between these functional programs and metabolic activity, where angiogenic and M2-like phenotypes of S100a4+ alv-macro were proved to be positively correlated with lipid metabolism (Figure S4B). Further investigation showed that both of them were positively correlated with palmitic acid related metabolic reactions (DESAT16_2_pos, C160CPT1_pos, and RE0577C_pos, Figure 4C). Next, we queried the angiogenesis gene signature in S100a4+ alv-macro, and the expression levels of the pro-angiogenic genes Anxa2 and Ramp1 were detected to be higher in AAH group than normal group (Figure 4D). Anxa2 supports angiogenesis in specific tumor-related settings (Huang et al., 2022), and its expression level was positively correlated with that of Cpt1a in S100a4+ alv-macro (Figure 4E). What’s more, the correlation between Cpt1a and Anxa2 was verified in the immunostaining of AAH tissues, revealing more F4/80+/Cd11c+/S100a4+/Cpt1a+/Anxa2+ alv-macro compared to normal tissues (Figure 4F). These results suggested that the angiogenic function of S100a4+ alv-macro might be related to its fatty acid metabolic status.

The angiogenic function of S100a4+ alv-macro was related to fatty acid metabolism.

A. Cell proportion comparison of S100a4+ alv-macro in alveolar macrophages.

B. Macrophage functional program analysis of S100a4+ alv-macro across the four stages.

C. Pearson correlation analysis of angiogenesis and M2-like function with fatty acid metabolic reactions in S100a4+ alv-macro.

D. Density plots of Anxa2 and Ramp1 at the four stages.

E. Correlation analysis of Cpt1a and Anxa2 expression in S100a4+ alv-macro.

F. Multiplex immunofluorescence validation of the correlation between Cpt1a and Anxa2 expression in F4/80+/Cd11c+/S100a4+ alv-macro of mouse normal and AAH tissues. Scale bar: 20 µm.

S100a4+ alv-macro displayed the potential to promote the early malignant transformation of lung epithelial cells in vitro

To further validate our findings from scRNA-seq and histological staining, we established S100a4-overexpressed (OE) alveolar macrophages with murine MH-S cell line, which was verified by quantitative real-time polymerase chain reaction (qPCR) and western blotting (Figure 5A and 5B). Then coculture assay was conducted to functionally prove the pro-tumorigenic characteristics of S100a4+ alv-macro in vitro. After culture with conditioned medium (CM) of S100a4-OE MH-S, the proliferation and migration ability of mouse lung epithelial MLE12 cells were significantly promoted, and the cell cycle was more arrested in the G2/M phase (Figure 5C-F). Reactive oxygen species (ROS) produced within cells can lead to genetic mutations that trigger malignant transformation of cells (Santibáñez-Andrade et al., 2023), and we found markedly elevated ROS level in OE-CM cocultured epithelial cells (Figure 5G). Mitochondrial morphological change is one of the organelle indications of malignant transformation of cells (Missiroli, Perrone, Genovese, Pinton, & Giorgi, 2020). As shown in the electron microscope images, compared with the cocultured epithelial cells in the negative control (NC) group, we observed the acquisition of irregular shapes, obvious vacuoles, and disorganized cristae in mitochondria of OE group, which might be associated with energy metabolism in the process of cell transformation (Figure 5H). In addition, we performed reverse verification with siRNA and demonstrated that the functional phenotype of epithelial cells did not change after S100a4-knockdown MH-S coculture (Figure S5).

S100a4-OE MH-S promoted malignant transformation of MLE12 epithelial cells in vitro.

A. S100a4 mRNA expression level in MH-S after plasmid transfection.

B. S100a4 protein expression level in MH-S after plasmid transfection.

C. CCK8 assay showing cell proliferation of MLE12 after coculture with S100a4-OE MH-S.

D. Colony forming ability of MLE12 after coculture, as shown by colony formation assay.

E. Cell cycle distribution of MLE12 after coculture, as shown by cell cycle analysis.

F. Wound healing assay showing cell migration of MLE12 after coculture.

G. Intracellular ROS level of MLE12 after coculture.

H. Transmission electron microscopy showing the morphological changes of mitochondria in MLE12 after coculture.

I. Western blotting of DNA damage marker p-γH2ax, EMT markers (E-cadherin, N-cadherin, and Vimentin), and stem-like markers (Cd44 and Cd133) in MLE12 after coculture.

J. Western blotting of tumorigenesis associated proteins (c-Myc, p-Erk, Sftpa, Vegfa, and Hif-1α) in MLE12 after coculture.

K. Western blotting of macrophage pro-tumor indicators (Vegfa, Mmp9, Tgf-β, and Hif-1α) in S100a4-OE MH-S.

L. Western blotting of fatty acid metabolism-related proteins (Cpt1a and Acot2) and angiogenesis-related proteins (Anxa2 and Ramp1) in S100a4-OE MH-S, and angiogenesis-related proteins (Anxa2 and Ramp1) in MLE12 after coculture.

OE: overexpression; NC: negative control; ROS: reactive oxygen species; *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001

During the in vitro carcinogenesis process, a battery of assays allow for the identification of the cells in the initiation, promotion or aggressive stages of tumorigenesis (Barguilla et al., 2023). We found increased expression of DNA damage marker p-γH2ax and decreased expression of adhesion protein E-cadherin, but no marked changes in N-cadherin, Vimentin, and stem-like markers (Cd44 and Cd133) (Figure 5I). The expression of tumorigenesis associated proteins (c-Myc, p-Erk, Sftpa, Vegfa, and Hif-1α) were elevated in OE-CM-cultured MLE12 (Figure 5J). Combined with the above observations and the absence of difference in invasion potential, we speculated that the coculture partially activated the epithelial-mesenchymal transition (EMT) program, and the transformation of lung epithelial cells was still in the early or intermediate stage and had not yet reached the advanced stage, which validated our findings to a certain extent and might correspond to the AAH stage in histology. In addition, the expression of pro-tumor indicators in macrophages (Vegfa, Mmp9, Tgf-β, and Hif-1α) were also higher in S100a4-OE MH-S (Figure 5K). Next, we sought to identify the metabolic and functional changes found in the AAH phase mentioned above. As expected, fatty acid metabolism-related genes (Cpt1a and Acot2) and angiogenesis-related genes (Anxa2 and Ramp1) were up-regulated in S100a4-OE MH-S, and angiogenesis-related upregulation was also found in the corresponding cocultured MLE12 (Figure 5L). In summary, we demonstrated in vitro that S100a4+ alv-macro promotes the early malignant transformation of lung epithelial cells by promoting angiogenesis through up-regulation of fatty acid metabolism.

S100A4+ alv-macro in similar patterns was present in human precancerous AAH lesions

Finally, we dedicated to corroborate the correspondence between the results analyzed in human samples and our observations in the mouse model. A total of 23 human tissues were included in our study to proceed aforementioned analysis, including nine normal lung tissues, three AAHs, four AISs, three MIAs, and four IAs, all have been pathologically confirmed by hematoxylin and eosin (H&E) staining and pathologist assessment (Figure 6A). The annotation of scRNA-seq profiles yielded 15 cell types, in which macrophages and AT2 cells similarly exhibited significant enrichment of metabolic activities (Figure S6A-C). Malignant cells were extracted from epithelial cells by inferring chromosomal copy number alterations, and GSVA enrichment analysis of malignant epithelial cells was performed across the four pathological stages (Figure S6D and S6E). Along with scMetabolism analysis, the results collectively suggested that human malignant epithelial cells showed relatively active lipid metabolism during the AAH phase, as for the IA phase, glycometabolism and oxidative phosphorylation pathways were obviously enriched (Figure S6F). Besides, the enrichment of macrophage migration and chemotaxis indicated the vital role of macrophages throughout the whole malignant transformation process of LUAD.

S100A4+ alv-macro with similar pattern in human AAH stage

A. H&E images of five stages of human LUAD development (normal, AAH, AIS, MIA, and IA). Scale bar: 100 µm.

B. UMAP plot of subclusters of alveolar macrophages.

C. UMAP plot of S100A4 expression in alveolar macrophages.

D. Compass analysis of the reaction activities of fatty acid metabolism in S100A4+ alv-macro across the five stages.

E. Macrophage functional analysis of S100A4+ alv-macro across the five stages.

F. Dotplot of expression of CPT1A, ACOT2, and ANXA2 in the five stages.

G. Multiplex immunofluorescence staining of CPT1A and ANXA2 expression in CD68+/CD11C+/S100A4+ alv-macro of human normal and AAH tissues. Scale bar: 20 µm.

In order to figure out whether phenotypically and functionally similar subset of alveolar macrophages also exist in human LUAD precancerous lesions, human alveolar macrophages were further subdivided into four cell subsets, which led us to find that the same was true for human S100A4+ alv-macro (Figure 6B and 6C). In addition, the activities of palmitoyl-CoA hydrolase and carnitine palmitoyl-transferase were also verified to be elevated during the AAH phase, corresponding to metabolic reactions RE0577C_pos and C160CPT1_pos in the Compass analysis (Figure 6D). For the macrophage phenotype, several functional programs with higher abundance in the AAH phase in the mouse scRNA-seq followed a consistent trend in human data, including angiogenesis (Figure 6E). In terms of gene expression, the expression of fatty acid metabolism-related genes CPT1A and ACOT2 and angiogenesis-related gene ANXA2 in the AAH stage was also in line with expectations (Figure 6F). Furthermore, we also noticed greater amounts of S100A4+ alv-macro in human AAH tissues and verified the correlation between CPT1A and ANXA2 in this subset (Figure 6G). Taken together, these results illustrated that human S100A4+ alv-macro promotes their angiogenic function by modulating fatty acid metabolism, thereby accelerating the occurrence of precancerous AAH lesions.

Discussion

Here, our scRNA-seq data provided a comprehensive resource to investigate cell state changes alongside tumor initiation in a mouse model of spontaneous tumor simulating the oncogenic transformation in clinical LUAD. We extensively mapped the transcriptional landscape of cell subpopulations and switching cell interactions, some of which were essential in reshaping microenvironment favoring tumor evolution. In the course of precancerous lesions, the transcriptional heterogeneity was gradually enriched, indicating underlying progressive molecular and cellular changes during carcinogenesis. We identified an initiation-associated gene LDHA, it was suggested that this glycolysis-related gene was significantly upregulated in LUAD and could serve as an independent prognostic indicator of unfavorable overall survival and recurrence-free survival (Yu et al., 2018), illustrating the potentially vital role of metabolic alternations in tumorigenesis.

Based on the understanding that TME can affect the state of epithelial cells, metabolic fluctuations in this context are closely linked to cell phenotype and function (Buck, Sowell, Kaech, & Pearce, 2017; de Visser & Joyce, 2023; J. Wang et al., 2021). Macrophages are no exception to this rule, their metabolic profile can be shaped by specific histological TME, and the cellular state is thus activated and exhibits remarkable plasticity (Kumar et al., 2019; Mazzone, Menga, & Castegna, 2018). Besides, it has been demonstrated that macrophages contribute to the induction of early lung cancer lesions (Casanova-Acebes et al., 2021), and purine metabolism has been confirmed to determine the pro-tumor phenotype of macrophage population (S. Li et al., 2022). In our exploration of malignant epithelial cells in the initiation of LUAD, alveolar macrophage was identified as the cell population with which they interacted most strongly. The scRNA-seq approaches enable the resolution of macrophage heterogeneity, assisting us in understanding how these transcriptionally defined subsets connect to macrophage metabolic status and functional phenotypes. Individual macrophage subsets possess distinct metabolic profiles, our analysis highlighted a specific cell subset, which persisted throughout lung carcinogenesis and observably defined by the expression of S100a4 that have previously not been implicated in tumor-associated alveolar macrophages. The most intriguing finding about this cellular state was that the fatty acid metabolism became active during the AAH phase, accompanied by angiogenetic like properties. S100A4, which was regarded as an angiogenesis regulator, has been stated to enhance macrophage protumor phenotype by control of fatty acid oxidation (Kazakova et al., 2023; S. Liu et al., 2021). What is more convincing was that S100A4+ alv-macro with similar characteristics was extended to scRNA-seq profiles of human precancerous AAH lesions. CPT1A was thought to play a crucial role in this metabolic paradigm shift, and CPT1A-mediated fatty acid oxidation has been proved to be conducive to tumor metastasis, radiation resistance, and immune escape (Z. Liu et al., 2023; Tan et al., 2018; Y.-N. Wang et al., 2018). Moreover, the metabolic pattern of existing subcluster may also be reprogramed in the subsequent phases of precancerous lesions, possibly undergo a transition from lipid metabolism to glucose metabolism, based on the results so far. In terms of subset functional phenotype, S100a4+ alv-macro enriched for fatty acid metabolism held restricted M1-like and antigen presentation capacities at the AAH stage, but exhibited M2-like and angiogenic features, creating a permissive environment for neoplasia. The role of ANXA2 in angiogenesis has been widely recognized and it has been implicated in the progression and metastasis of aggressive cancers (Huang et al., 2022; Sharma, 2019; T. Wang, Wang, Niu, & Wang, 2019). Additionally, the elevated expression of Cpt1a and Anxa2 found in our subgroup analysis were also confirmed in mice, as well as in human AAH tissues.

The in vitro cell transformation assays were also carried out, indicating that S100a4 overexpression in alveolar macrophages acquired molecular features associated with fatty acid metabolism and angiogenesis, and induced lung epithelial cells to an initiating or promoting state of transformation. Nevertheless, the molecular mechanisms that drive the acquisition of metabolic and functional switching properties specific to this cell state still require further characterize in the context of precancerous lesions. Continued study into the interplays between S100a4+ alv-macro and other cellular compartments resided in TME that support tumor progression will be key to developing interventional strategies.

In this study, our results provided the unique patterns of cellular components during the initiation process of LUAD, and gave evidence of S100a4+ alv-macro enhances the function of angiogenesis by promoting fatty acid metabolism, thereby accelerating the process of precancerous lesions. Early intervention in solid cancers at the precancer stage may effectively prevent their progression (Prieto et al., 2023). In the cascade of tumor initiation, AAH represented as a probable forerunner of LUAD, serving as a potential hub in carcinogenesis (Mori, Rao, Popper, Cagle, & Fraire, 2001). On the basis of the newly identified pro-tumor subpopulation and relevant metabolic and functional elements, the development of novel immunotherapies targeting the immunometabolism will provide alterative choice for interventions at AAH stage, paving the way for potential benefits for patients.

Materials and methods

A/J mouse model and human samples

A/J mice were obtained from Shanghai Model Organisms Center (Shanghai, China). The 200 mice were divided into groups and housed in an air-conditioned room with a 12-h light/dark cycle and standard chow diet. Mice were killed by decapitation at different months of age. A gross examination of all organs was conducted, the lungs were removed and bisected for tissue fixation and the preparation of single cell suspensions. All the animal surgical and experimental procedures were approved by the Ethics Board of Experimental Animals, West China Hospital, Sichuan University (Approval No. 20230208003).

Human clinical specimens from ten patients were enrolled with informed consent from West China Hospital. All samples were obtained during surgery and prior to any other treatment. Histological classification of precancerous lesions and tumors was determined according to the WHO guidelines (Travis et al., 2015). Fresh specimen was instantly divided into two pieces for single cell suspension preparation and histopathological staining. The utilization of human samples was approved by the Medical Ethics Committee and Institutional Review Board of West China Hospital, with all participants providing written informed consents.

H&E staining

Lung tissue was further analyzed by histopathology to confirm pathological stage. H&E staining was performed on 4 μm paraffin sections according to routine staining protocol. Microscopic H&E-stained slides were independently reviewed by clinical or veterinary pathologists from Department of Pathology, West China Hospital.

Single cell suspension preparation

Resected samples were washed in Hanks’ balanced salt solution (HBSS, GIBCO, Grand Island, USA) on ice immediately, minced into small pieces with sterile surgical scalpel, then transferred into a conical tube. The mechanical dissociation was followed by an enzymatic digestion. The tissue fragments were digested in HBSS with 2 mg/mL collagenase I and 1 mg/mL collagenase IV (Washington, Lakewood, USA) for 30-45 min at 37 °C. The suspension was then filtered using 70 μm and 30 μm cell strainers (Corning, NY, USA) successively, and the cell pellet was resuspended with red blood cell (RBC) lysis buffer (Invitrogen, Carlsbad, USA) to discard RBCs. Following cell sorting, dead cells were removed from the cell suspension based on exclusion of 7-aminoactinomycin D (7-AAD, Life technologies corporation, Gaithersburg, USA), and live cells were sorted into phosphate buffer solution (PBS) with 0.04% bovine serum albumin (BSA) for subsequent scRNA-seq.

Single-cell RNA library construction and sequencing

Single cell capturing and cDNA library generation were performed using the Single Cell 3’ Chip Kit and 10x Chromium 3’ Library & Gel Bead Kit v3 (10x Genomics, Pleasanton, USA) following the manufacturer’s instructions. The barcoded cDNA was purified with Dynabeads MyOne SILANE bead (Thermo Fischer Scientific, Waltham, USA) before amplification by PCR. Amplified cDNA product was size-selected using SPRIselect Reagent (Thermo Fisher Scientific). After adaptor ligation and sample index PCR, libraries were sequenced on an Illumina NovaSeq 6000.

Single-cell RNA sequencing analysis and identification of marker genes

Raw gene expression matrices were generated per sample using CellRanger (version 4.0.0). These matrices were then combined by R (version 4.0.5) and converted into a Seurat data object using the Seurat R package (version 4.3.0.1). We excluded cells based on any one of following criteria: (1) more than 10,000 or fewer than 500 unique molecular identifiers (UMIs); (2) more than 5,000 or fewer than 300 expressed genes; (3) over 10% UMIs that were derived from the mitochondrial genome; (4) genes which were expressed fewer than 10 cells. Subsequently, we carried out downstream analyses by Seurat workflow (Hao et al., 2021) as described below:

Firstly, we normalized the counts on the filtered matrix for each gene to the total library size by the ‘NormalizeData’ function; Secondly, we employed the ‘FindVariableGenes’ function to identify most highly variable genes for unsupervised clustering; Thirdly, these highly variable genes were centered to a mean of zero and were scaled by the standard deviation by the ‘ScaleData’ function of Seurat; Fourthly, principal component analysis (PCA) was performed by the function ‘RunPCA’; The ‘RunHarmony’ function was used to integrate samples and remove batch effects; the ‘FindClusters’ function was used to calculate the clusters of the cells; UMAP plots were used to visualize clusters of cells localized in the graph-based clusters by ‘RunUMAP’ function; Markers for each cluster were identified by finding differentially expressed genes between cells from the individual cluster versus cells from all other clusters by ‘FindAllMarkers’ function.

Estimating heterogeneity within stages

Heterogeneity of single cell profiles within a stage was quantified by examining the average pairwise Normalized Mutual Information (NMI) (N. D. Marjanovic et al., 2020) between the profiles for each stage. Firstly, we selected 100 differentially expressed genes per each of 13 cell type clusters and top 100 genes per each of the top 20 components in harmony reduction. Next, we discretized expression per gene into 10 bins. Considering the differences in the number of cells across samples, we subsampled 100 cells from each stage 500 times and calculated the median NMI across each within-stage sampled pair. NMI was calculated between each pair of cells x and y by first calculating the mutual information (Eq. 1) and then normalizing it by the entropy of each cell (Eq. 2).

where p(x) and p(y) represent the probability distributions of cell x and cell y over all discretized gene expression values, respectively. p(x, y) denotes the joint probability distribution of cell x and cell y over all discretized gene expression values. H(x) and H(y) correspondingly indicate the entropy of cell x and cell y across all discretized gene expression values.

Single-cell metabolic state analysis

According to the Molecular Signatures Database (Subramanian et al., 2005), we extracted 2,710 genes associated with the metabolic pathways for dimensional reduction and clustering analysis. We employed PCA to process the metabolic genes, and use UMAP to do dimensional reduction for the top 20 components. After that, we employ the ‘FindClusters’ function of Seurat for clustering analysis. Next, we used the ‘FindAllMarkers’ function of Seurat to detect the specific expression of metabolic genes for each metabolic state and use Single-sample GSEA to explore the dynamic pattern of metabolic pathways by SingleSeqGset (Cillo, 2021) (version 0.1.2.9) of R.

Pathway activity analysis

GSVA (Hänzelmann, Castelo, & Guinney, 2013) is a widely used approach to assess pathway enrichment of samples for gene expression data. We used the GSVA package (version 1.51.0) to carry out enrichment analysis for the annotated KEGG and Gene Ontology_Biological Process (GOBP) gene sets in the msigdbr package (Subramanian et al., 2005) (version 7.5.1), followed by differential pathway analysis using the limma package (Ritchie et al., 2015) (version 3.54.2). We not only visualized such pathways that have significant differences (adj. p.val<0.05), but also employed the scMetabolism package (Wu et al., 2022) (version 0.2.1) to do enrichment analysis and visualization for KEGG and Reactome metabolic pathways.

Cell–cell communication analysis

We employed CellChat (version 1.6.1) algorithm (Jin et al., 2021) to analyze and quantify cell-cell communication between all major cell types. CellChat identifies differentially over-expressed ligands and receptors for each cell group, and models the probability of communication between cell groups by the law of mass action. Significant interactions are identified based on permutation test (Good, 2013) which can randomly permutes the group labels of cells. We then used chord plots and circle plots to describe the number of cell-cell interactions and communication strength, and visualized significant interactions (ligand-receptor pairs) from some cell groups to other cell groups by the ‘netVisual_bubble’ function of CellChat.

InferCNV analysis

Malignant epithelial cells were identified using inferCNV (Tickle et al., 2019) (version 1.14.2). For mouse dataset, we used all epithelial cells, 300 fibroblasts, and 300 endothelial cells as inputs. And then, we used 500 fibroblasts and 500 endothelial cells as controls. For the human dataset, 1000 normal stage epithelial cells were used as controls and the remaining epithelial cells were used as inputs. We scored each cell for the extent of CNV signal and plotted the cells on a dendrogram, and then we cut the dendrogram into 6 clusters. We labeled all cells clustered with normal cells in the control group as “non-mag” and the remaining clusters as “mag”.

Correlation analysis between cell groups

We calculated the correlation between different cell groups based on the expression of each cell on the genes. First, the mean value of gene expression was taken as the representative value for each group of cells on each gene. Then, the top 1000 genes with the greatest standard deviation were extracted to calculate the correlation between groups by Spearman’s (Spearman, 1987) method, and finally we used the corrplot (Wei et al., 2013) (version 0.92) function to visualize the correlation matrix.

Cellular metabolic activity modeling

We modeled the metabolic state of individual cells by Compass algorithm (Wagner et al., 2021). We extracted the normalized gene expression matrix as input, and then run Compass with default settings. The output of Compass is a reactions.tsv file, where the numbers correspond to penalties for each reaction per cell. We employ negative log to have scores. If the score is great, the reaction is more active.

Gene set module score

The AddMouduleScore function of Seurat can be used to score any gene set. Its workflow is to firstly calculate the average value of all genes in the target geneset on single cell level, secondly divide the expression matrix into several bins based on the average value, thirdly randomly select the control genes (genes outside the target geneset, default 100 genes) as the background value from each bin. Finally, we count all target and background genes as an average value, respectively. The score for the gene set is determined by subtracting the average background value from the average target gene value. Here we used this function to score gene sets of interest at the single-cell level.

Correlation analysis between gene sets

For correlation analysis between gene sets, score of each gene set was defined as above. Here, we used the Pearson (Pearson, magazine, & science, 1901) method for correlation analysis and employ ggpubr package (Kassambara, 2020) (version 0.6.0) to plot scatter plots.

Macrophage function analysis

We collected below gene sets that correspond to well-known macrophage functions.

We then scored these gene sets by AddModuleScore function with default settings and carried out functional enrichment analysis by SingleSeqGset(Cillo, 2021) package.

Immunofluorescence staining

The sections of mouse lung tissue were first blocked with normal goat serum (10%). Primary antibodies were diluted and incubated with the sections at 4 °C overnight: anti-Ki67 (Abcam, Cambridge, USA), anti-Ldha (Proteintech, Wuhan, China). The slides were then washed three times with PBS, and incubated with Alexa conjugated secondary antibodies (Jackson ImmunoResearch Laboratories, West Grove, USA) for 1 h at room temperature. The nuclei were finally counterstained with DAPI (4’, 6-diamidino-2-phenylindole, Sigma-Aldrich, St. Louis, USA) and the sections were mounted. Images were captured by a fluorescence microscope (Olympus, Tokyo, Japan). The pictures presented represent the results of three sections, each of which was collected for nine fields.

Multiplex immunohistochemistry staining

Multiplex immunohistochemistry staining is a tyramine amplification-based method for in situ labelling of tissue proteins. Combined with spectroscopic imaging technology, the rich protein information contained in the tissue can be obtained. The New Opal Polaris 7-color Manual IHC Kit (Akoya Biosciences, Massachusetts, USA) was used for multiple staining. The Opal multi-labeling method used indirect fluorescent labeling in which tissues were incubated with primary antibody and then labeled with horseradish peroxidase (HRP) secondary antibody and fluorescence amplification working solution. The non-covalently bound antibody was then washed by thermal repair, then replaced with another primary antibody and fluorescein substrate, and so on to achieve multiple labeling, followed by DAPI staining and anti-fluorescence quencher sealing. Detailed experimental procedures were described in the kit instructions. The multispectral tissue imaging system (PerkinElmer Vectra®) and analysis software (Phenochart 1.0) were used for image processing. The following antibodies were used in the staining: F4/80 (Cell Signaling Technology, Danvers, USA), Cd11c (Abcam), S100a4 (Abcam), Cpt1a (Abcam), Anxa2 (Abcam), CD68 (Abcam).

Transient transfection in MH-S cells

The murine alveolar macrophage MH-S cell line was purchased from Procell Life Science&Technology (Wuhan, China) and cultured in RPMI-1640 medium (GIBCO) supplemented with 10% fetal bovine serum (FBS, GIBCO) and 0.05 mM 2-mercaptoethanol (Sigma-Aldrich). The plasmids of mS100a4 and NC were commercially obtained from Hanbio (Shanghai, China) and transiently transfected into MH-S cells with Lipofectamine 3000 (Invitrogen, Carlsbad, CA) respectively, according to the manufacturer’s instructions. S100a4 siRNAs (OBiO, Shanghai, China) were also transfected in the manner described above.

Coculture assay in vitro

The mouse alveolar epithelial MLE12 cell line was kindly provided by Dr. Yi Li and cultured in DMEM/F12 medium (GIBCO) with 10% FBS. For coculture experiment, the transfected MH-S cells were incubated with serum-free medium overnight. The CM was centrifuged at 200 g for 5 min and filtered through a 0.22 μm syringe filter. The collected CM was then applied to MLE12 cells.

qPCR

Total RNA was extracted by SevenFast Total RNA Extraction Kit (Seven, Beijing, China). Reverse transcription for cDNA conversion was conducted with PrimeScript RT Master Mix (TAKARA, Tokyo, Japan). qPCR analysis of mRNA expression was measured by TB Green Premix Ex Taq (TAKARA) and normalized to β-actin. All reactions were performed in triplicate. The primers for target genes are list below:

S100a4-F: TCCACAAATACTCAGGCAAAGAG

S100a4-R: GCAGCTCCCTGGTCAGTAG

β-actin-F: GGCTGTATTCCCCTCCATCG

β-actin-R: CCAGTTGGTAACAATGCCATGT

Western blotting

Total protein was extracted using lysis buffer containing protease inhibitor cocktail (MedChem Express, Monmouth Junction, USA) and quantitated by BCA protein assay kit (KeyGEN BioTECH, Beijing, China). Proteins were separated by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) and then transferred to a polyvinylidene fluoride (PVDF) membrane. Immunoblotting was performed with the following primary antibodies: S100a4 (Abcam), β-actin (Cell Signaling Technology), Sftpa (Boster, Wuhan, China), E-cadherin (Proteintech), N-cadherin (ABclonal, Wuhan, China), Vimentin (ABclonal), Cd44 (Affinity, Jiangsu, China), Cd133 (Proteintech), p-Erk (Santa Cruz Biotechnology, Santa Cruz, USA), Mmp9 (Cell Signaling Technology), p-γH2ax (Abcam), Vegfa (Santa Cruz Biotechnology), Tgf-β (Cell Signaling Technology), Hif-1α (Abcam), c-Myc (Abcam), Cpt1a (Abcam), Acot2 (Novus Biologicals, Littleton, USA), Anxa2 (Abcam), Ramp1 (Invitrogen).

Cell proliferation and migration assay

Cell viability was assessed by using Cell Counting Kit-8 (CCK8, 4A Biotech Co., Beijing, China). Cell proliferation was measured by colony formation assay. Cell migration was determined by wound-healing assay.

Cell cycle assay

PI/RNase Staining Solution (Sungene Biotech Co., Tianjin, China) was used for cell cycle distribution measurement.

Intracellular ROS measurement

The intracellular ROS level was detected using the Reactive Oxygen Species Assay Kit (Beyotime, Shanghai, China). After coculture, MLE12 cells were loaded with DCFH-DA probe in situ and incubated in a 37 °C incubator for 30 min. Cells were harvested and the fluorescence intensity was measured by flow cytometry.

Transmission electron microscopy

Cell samples were prefixed with 2.5% glutaraldehyde, refixed with 1% osmium tetroxide, dehydrated step by step with acetone, and then subjected to osmosis and embedded with Epon-812 embedding agent. 60-90 nm ultrathin sections were made by an ultrathin microtome and transferred to copper mesh, then stained with uranyl acetate for 10-15 min, followed by lead citrate for 1-2 min at room temperature. Image acquisition was carried out with JEM-1400FLASH transmission electron microscope (JEOL, Japan).

Statistics

Statistical significance was determined using two-tailed unpaired Student’s t-test and p-value <0.05 was considered significant (*p <0.05, **p <0.01, ***p <0.001, ****p <0.0001). The correlation coefficient (R) was calculated using the Pearson method with 95% confidence intervals (CI), and corresponding p-values were added to the plots. Statistical analyses were performed using R, Python, and GraphPad Prism.

Data availability

The scRNA-seq data of mouse and human in this study have been deposited to codeocean (https://codeocean.com/capsule/4741227/tree).

Additional information

Author contribution

Hong Huang: Conceptualization; Methodology; Validation; Investigation; Resources; Writing – original draft preparation; Writing – review & editing; Visualisation

Ying Yang: Methodology; Validation; Investigation; Resources

Qiuju Zhang: Software; Formal analysis; Data curation; Visualisation

Yongfeng Yang: Conceptualization; Methodology; Writing – review & editing

Zhenqi Xiong: Software; Formal analysis; Data curation

Shengqiang Mao: Software; Formal analysis

Tingting Song: Software; Formal analysis

Yilong Wang: Validation; Investigation

Zhiqiang Liu: Validation; Resources

Hong Bu: Conceptualization; Writing – review & editing; Supervision; Project administration

Li Zhang: Conceptualization; Resources; Writing – review & editing; Supervision; Project administration; Funding acquisition

Le Zhang: Software; Writing – review & editing; Supervision; Project administration; Funding acquisition

Competing interests

The authors declare no conflict of interest.

Funding

This work was supported by the National Natural Science Foundation of China [92159302 and 62372316], National Science and Technology Major Project [2021YFF1201200], Sichuan Science and Technology Program Key Project [2024YFHZ0091], Sichuan Science and Technology Project [2022ZDZX0018], and 1.3.5 Project for Disciplines of Excellence, West China Hospital, Sichuan University [ZYGD2200].