Visualising the Cross-Level Relationships between Pathological and Physiological Processes and Gene Expression: Analyses of Haematological Diseases


The understanding of pathological processes is based on the comparison between physiological and pathological conditions, and transcriptomic analysis has been extensively applied to various diseases for this purpose. However, the way in which the transcriptomic data of pathological cells relate to the transcriptomes of normal cellular counterparts has not been fully explored, and may provide new and unbiased insights into the mechanisms of these diseases. To achieve this, it is necessary to develop a method to simultaneously analyse components across different levels, namely genes, normal cells, and diseases. Here we propose a multidimensional method that visualises the cross-level relationships between these components at three different levels based on transcriptomic data of physiological and pathological processes, by adapting Canonical Correspondence Analysis, which was developed in ecology and sociology, to microarray data (CCA on Microarray data, CCAM). Using CCAM, we have analysed transcriptomes of haematological disorders and those of normal haematopoietic cell differentiation. First, by analysing leukaemia data, CCAM successfully visualised known relationships between leukaemia subtypes and cellular differentiation, and their characteristic genes, which confirmed the relevance of CCAM. Next, by analysing transcriptomes of myelodysplastic syndromes (MDS), we have shown that CCAM was effective in both generating and testing hypotheses. CCAM showed that among MDS patients, high-risk patients had transcriptomes that were more similar to those of both haematopoietic stem cells (HSC) and megakaryocyte-erythroid progenitors (MEP) than low-risk patients, and provided a prognostic model. Collectively, CCAM reveals hidden relationships between pathological and physiological processes and gene expression, providing meaningful clinical insights into haematological diseases, and these could not be revealed by other univariate and multivariate methods. Furthermore, CCAM was effective in identifying candidate genes that are correlated with cellular phenotypes of interest. We expect that CCAM will benefit a wide range of medical fields.
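
As a rough illustration of the kind of computation that Canonical Correspondence Analysis involves, the following is a minimal sketch of the textbook formulation of CCA in NumPy. It is not the CCAM implementation. The function name `cca_sketch`, the argument layout (genes as rows, disease samples as the main table, normal cell types as explanatory variables) and the assumption that expression values are non-negative with positive row and column totals are all illustrative choices made here, not details taken from the abstract.

```python
import numpy as np


def cca_sketch(Y, X, n_axes=2):
    """Textbook-style canonical correspondence analysis (illustrative only).

    Y : (genes x disease samples) non-negative expression table (assumed layout).
    X : (genes x normal cell types) explanatory variables (assumed layout).
    """
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)

    P = Y / Y.sum()            # table rescaled to sum to 1
    r = P.sum(axis=1)          # row (gene) masses
    c = P.sum(axis=0)          # column (sample) masses

    # Chi-square standardised residuals, as in correspondence analysis.
    expected = np.outer(r, c)
    Q = (P - expected) / np.sqrt(expected)

    # Centre the explanatory variables by their weighted means, then weight rows.
    Xw = np.sqrt(r)[:, None] * (X - r @ X)

    # Weighted regression: project the residuals onto the column space of Xw.
    B, *_ = np.linalg.lstsq(Xw, Q, rcond=None)
    Q_fit = Xw @ B

    # Singular value decomposition of the fitted (constrained) residuals.
    U, s, Vt = np.linalg.svd(Q_fit, full_matrices=False)
    k = n_axes

    # Weighted correlations between explanatory variables and the axes.
    Xn = Xw / np.linalg.norm(Xw, axis=0)
    return {
        "eigenvalues": s[:k] ** 2,
        "gene_scores": U[:, :k] / np.sqrt(r)[:, None],
        "sample_scores": Vt[:k].T / np.sqrt(c)[:, None],
        "cell_scores": Xn.T @ U[:, :k],
    }
```

Plotting the first two columns of `gene_scores`, `sample_scores` and `cell_scores` on shared axes would give one map covering all three levels, which is the general idea of a cross-level visualisation. The scaling conventions used here are one common choice among several.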

PLoS ONE, vol. 8, no. 1, p. e53544, Jan. 2013
Reiko Tanaka
Reader in Computational Systems Biology & Medicine