<p>Parkinson’s disease (PD) is the second most prevalent neurodegenerative disorder, characterized by the loss of dopaminergic neurons (mDA) in the midbrain. The complex underlying mechanisms are only partly understood and there is no treatment able to reverse PD progression. Here, we aim to understand the disease mechanism using mDA neurons obtained through differentiation of human induced pluripotent stem cells (hiPSCs) carrying the ILE368ASN mutation within the PINK1 gene. Single-cell RNA sequencing (RNAseq) and gene expression analysis of a PINK1 and a control cell line identified genes consistently differentially expressed during mDA neuron differentiation. Network analysis revealed that these genes form a core network, members of which interact with all known 19 protein-coding Parkinson’s disease-associated genes and encompasses key PD-associated pathways, including ubiquitination, mitochondrial, protein processing, RNA metabolism, and vesicular transport. Subsequent proteomics analysis showed a consistent alteration in proteins of dopamine metabolism validating the manifestation of the affected neuronal phenotype. Our findings suggest the existence of a network onto which pathways associated with PD pathology converge and offers new interpretation of the phenotypic heterogeneity of PD.</p>
<p>Parkinson’s disease (PD) is the second most prevalent neurodegenerative disorder, characterized by the loss of dopaminergic neurons (mDA) in the midbrain. The complex underlying mechanisms are only partly understood and there is no treatment able to reverse PD progression. Here, we aim to understand the disease mechanism using mDA neurons obtained through differentiation of human induced pluripotent stem cells (hiPSCs) carrying the ILE368ASN mutation within the PINK1 gene. Single-cell RNA sequencing (RNAseq) and gene expression analysis of a PINK1 and a control cell line identified genes consistently differentially expressed during mDA neuron differentiation. Network analysis revealed that these genes form a core network, members of which interact with all known 19 protein-coding Parkinson’s disease-associated genes and encompasses key PD-associated pathways, including ubiquitination, mitochondrial, protein processing, RNA metabolism, and vesicular transport. Subsequent proteomics analysis showed a consistent alteration in proteins of dopamine metabolism validating the manifestation of the affected neuronal phenotype. Our findings suggest the existence of a network onto which pathways associated with PD pathology converge and offers new interpretation of the phenotypic heterogeneity of PD.</p>
<p><strong>Figure 2: Classification of iPSC status. a.</strong> Immunocytochemistry (ICC). Staining for the iPSC markers OCT3/4 and TRA-1-80 of iPSC colonies, prior to differentiation. DAPI was used to
<p><strong>Figure 2: Classification of iPSC status. a.</strong> Immunocytochemistry (ICC). Staining for the iPSC markers OCT3/4 and TRA-1-80 of iPSC colonies, prior to differentiation. DAPI was used to
stain cell nuclei as a reference. <strong>b.</strong> Expression of genes known to indicate iPSC status (MYC, POU5F1-OCT3/4) and of genes identified by a differential expression analysis between iPSCs and differentiating cells (also see Supplement Fig. 12). TDGF-1 is expressed in iPS cells of high stemness60; L1TD1, USP44, POLR3G, and TERF1 are essential for the maintenance of pluripotency in human stem cells 61–64; IFITM1, DPPA4, and PRDX1 are associated with
stain cell nuclei as a reference. <strong>b.</strong> Expression of genes known to indicate iPSC status (MYC, POU5F1-OCT3/4) and of genes identified by a differential expression analysis between iPSCs and differentiating cells (also see Supplement Fig. 12). TDGF-1 is expressed in iPS cells of high stemness60; L1TD1, USP44, POLR3G, and TERF1 are essential for the maintenance of pluripotency in human stem cells 61–64; IFITM1, DPPA4, and PRDX1 are associated with
stemness 65–67. c. Results of Scorecard analysis of iPSCs and embryonic bodies (EBs). iPSCs
stemness 65–67. c. Results of Scorecard analysis of iPSCs and embryonic bodies (EBs). iPSCs
...
@@ -27,7 +27,7 @@ mesoderm, ectoderm and endoderm markers. EBs are cells at an early stage of spon
...
@@ -27,7 +27,7 @@ mesoderm, ectoderm and endoderm markers. EBs are cells at an early stage of spon
differentiation. Scorecard analysis of EBs determines the iPSC line’s potential to
differentiation. Scorecard analysis of EBs determines the iPSC line’s potential to
differentiate into the three germ layers, hence, EBs are expected to express few or no selfrenewal genes and to show expression of some mesoderm, ectoderm and endoderm
differentiate into the three germ layers, hence, EBs are expected to express few or no selfrenewal genes and to show expression of some mesoderm, ectoderm and endoderm
PITX3, LMX1A and DAT in control (top) and PINK1 cell line (bottom), at D35. <strong>b.</strong> Table on the left shows that, based on our SC-RNAseq data, cell lines cluster according to differentiation stage, indicating that gene expression is very homogenous in the control and the PINK1 cell lines, which allows for the detection of even subtle alteration by the presence of the PINK1 mutation. Tables on the right show trajectory of expression of TH and KCNJ6 (GIRK2), two mDA markers. At D21 neurons begin to show TH expression, together with expression of mDA markers, this indicates that they are becoming early postmitotic mDA neurons. Similar observations can also be made from an expression heatmap shown in Supplement Figure</p>
PITX3, LMX1A and DAT in control (top) and PINK1 cell line (bottom), at D35. <strong>b.</strong> Table on the left shows that, based on our SC-RNAseq data, cell lines cluster according to differentiation stage, indicating that gene expression is very homogenous in the control and the PINK1 cell lines, which allows for the detection of even subtle alteration by the presence of the PINK1 mutation. Tables on the right show trajectory of expression of TH and KCNJ6 (GIRK2), two mDA markers. At D21 neurons begin to show TH expression, together with expression of mDA markers, this indicates that they are becoming early postmitotic mDA neurons. Similar observations can also be made from an expression heatmap shown in Supplement Figure</p>
<strong>Figure 5: Differentialy expressed genes (DEGs)</strong> in a cell line homozygous for a mutation in the PINK1 gene, compared to a control cell line, at three timepoints during the
<strong>Figure 5: Differentialy expressed genes (DEGs)</strong> in a cell line homozygous for a mutation in the PINK1 gene, compared to a control cell line, at three timepoints during the
differentiation of mDA neurons (D6, D15 and D21). <strong>a.</strong> Heatmap of the top DEGs. Each
differentiation of mDA neurons (D6, D15 and D21). <strong>a.</strong> Heatmap of the top DEGs. Each
column corresponds to a timepoint for either control (left 3) or PINK1 cells (right 3); each
column corresponds to a timepoint for either control (left 3) or PINK1 cells (right 3); each
...
@@ -59,9 +59,9 @@ analysis performed using the STRING69 database. The top KEGG pathway associated
...
@@ -59,9 +59,9 @@ analysis performed using the STRING69 database. The top KEGG pathway associated
this dataset is Parkinson’s disease. Other three KEGG pathways identified were Spliceosome,
this dataset is Parkinson’s disease. Other three KEGG pathways identified were Spliceosome,
Huntington's disease and Thermogenesis, in order of decreasing significance. Details are
Huntington's disease and Thermogenesis, in order of decreasing significance. Details are
<strong>Figure 6. Network analysis. a.</strong> Protein-protein interaction network based on known interactions available through the STRING69 and GeneMANIA70 databases. Only strong interactions were retained, predicted interactions or text associations were omitted (see Methods). Betweenness centrality was used to illustrate the relative importance of each node within the network through the level of its connectedness to other proteins. In the larger the circle, the more partners the node is connected to. The colours represent the DEG sets, with the top 56 DEGs (group A) in light blue, group B in purple, group C in dark green and group D in light green. CHCHD2 (pink, part of group B) is a DEG which has recently also been identified as a PARK gene. <strong>b.</strong> DEGs which play a role in ubiquitination Additional functional pathways are listed in Supplement Fig. 7 and Supplement Table 9. Specific connection to ubiquitination are shown in Supplement Fig.8. <strong>c.</strong> Based on literature, 68% of the DEGs of this network are already known to be associated with PD (for references see Supplement Table 10). Supplement Fig.9 shows which genes/proteins of the network directly interact with PARK genes through known protein-protein interactions. The topology of all three networks is the same, the different appearance is a result of a separate analysis run, but the connections and size of the nodes remain the same.</p>
<strong>Figure 6. Network analysis. a.</strong> Protein-protein interaction network based on known interactions available through the STRING69 and GeneMANIA70 databases. Only strong interactions were retained, predicted interactions or text associations were omitted (see Methods). Betweenness centrality was used to illustrate the relative importance of each node within the network through the level of its connectedness to other proteins. In the larger the circle, the more partners the node is connected to. The colours represent the DEG sets, with the top 56 DEGs (group A) in light blue, group B in purple, group C in dark green and group D in light green. CHCHD2 (pink, part of group B) is a DEG which has recently also been identified as a PARK gene. <strong>b.</strong> DEGs which play a role in ubiquitination Additional functional pathways are listed in Supplement Fig. 7 and Supplement Table 9. Specific connection to ubiquitination are shown in Supplement Fig.8. <strong>c.</strong> Based on literature, 68% of the DEGs of this network are already known to be associated with PD (for references see Supplement Table 10). Supplement Fig.9 shows which genes/proteins of the network directly interact with PARK genes through known protein-protein interactions. The topology of all three networks is the same, the different appearance is a result of a separate analysis run, but the connections and size of the nodes remain the same.</p>
<strong>Figure 7. Comparative proteomics analysis between CTRL and Pink1 cell line at D25 and
<strong>Figure 7. Comparative proteomics analysis between CTRL and Pink1 cell line at D25 and
D40 validates the manifestation of the transcriptional phenotype.</strong> Results of proteomic
D40 validates the manifestation of the transcriptional phenotype.</strong> Results of proteomic
analysis at D25 and D40 of the differentiation protocol. <strong>a.</strong> Protein levels of protein that were detected as both top differentially abundant at the protein level by the proteomics analysis, and by SC-RNAseq, as differentially expressed at the mRNA level. The data shows results at two timepoints, D25 and D40, in two biological replicates per timepoint. <strong>b.</strong> This figure shows a network of proteins differentially expressed between a control and a PINK1 mutationcarrying cell line, at D25 and D40. Proteins which are differentially expressed at both D25 and D40 are highlighted in green and point to a dysfunction of the dopaminergic system. D25 differentially abundant proteins are in purple, D40 in blue, proteins also identified as by SC-RNAseq differentially expressed at the mRNA level are in pink. For a table of proteins see
analysis at D25 and D40 of the differentiation protocol. <strong>a.</strong> Protein levels of protein that were detected as both top differentially abundant at the protein level by the proteomics analysis, and by SC-RNAseq, as differentially expressed at the mRNA level. The data shows results at two timepoints, D25 and D40, in two biological replicates per timepoint. <strong>b.</strong> This figure shows a network of proteins differentially expressed between a control and a PINK1 mutationcarrying cell line, at D25 and D40. Proteins which are differentially expressed at both D25 and D40 are highlighted in green and point to a dysfunction of the dopaminergic system. D25 differentially abundant proteins are in purple, D40 in blue, proteins also identified as by SC-RNAseq differentially expressed at the mRNA level are in pink. For a table of proteins see
<strong>Supl.Figure1:</strong> Quality control Plots of control sample Day 06. a) Cumulative Total number of counts. The red vertical lines represent the down and upper bound of the expected elbow. The blue dot represent the transitional point calculated using ecp r package. b) Histograms of the three criteria that used for low quality cell filtering. c,d) Violin plots of the three criteria. c) Cell score before filtering. Red dots are the cells that filtered after the quality control. d) The overview of the three criteria after filtering step. </p>
<strong>Supl.Figure1:</strong> Quality control Plots of control sample Day 06. a) Cumulative Total number of counts. The red vertical lines represent the down and upper bound of the expected elbow. The blue dot represent the transitional point calculated using ecp r package. b) Histograms of the three criteria that used for low quality cell filtering. c,d) Violin plots of the three criteria. c) Cell score before filtering. Red dots are the cells that filtered after the quality control. d) The overview of the three criteria after filtering step. </p>
<h2id="data-intergration">Data Intergration</h2>
<h2id="data-intergration">Data Intergration</h2>
<p>The integration of the filtered matrices of the different datasets was performed using scTransform [2] on a Seurat object [3] based on the treatment. The final gene expression matrix which used for the downstream analysis, consist of 4495 cells and 39194 genes. Principal component analysis (PCA) was computed using the 5000 most variable genes on the integrated data. </p>
<p>The integration of the filtered matrices of the different datasets was performed using scTransform [2] on a Seurat object [3] based on the treatment. The final gene expression matrix which used for the downstream analysis, consist of 4495 cells and 39194 genes. Principal component analysis (PCA) was computed using the 5000 most variable genes on the integrated data. </p>