Identification of Cancer Types from Gene Expressions using Learning Techniques

Swati B Bhonde*, Sharmila K Wagh and Jayashree R Prasad

Abstract

Cancer is a leading cause of death around the globe. Early detection and prediction of the cancer type are important for a patient's wellbeing. Functional genomic data has recently been used for effective and early detection of cancer. Previous research shows that using microarray data for cancer prediction raises two main problems: high dimensionality and limited sample size. Researchers have applied numerous statistical and machine learning-based methods to classify cancer types, but the remaining limitations still make cancer classification a difficult task. Deep Learning (DL) and Convolutional Neural Networks (CNN) have proven effective in analyzing a wide range of unstructured data, including gene expression data. In the proposed method, gene expression data for five types of cancer is collected from The Cancer Genome Atlas (TCGA). Prominent features are selected using a hybrid Particle Swarm Optimization (PSO) and Random Forest (RF) algorithm, followed by Principal Component Analysis (PCA) for dimensionality reduction. Finally, a blend of a CNN and Bidirectional Long Short-Term Memory (Bi-LSTM) is used to predict the target type of cancer. We demonstrate that the accuracy of the proposed method is 96.89%, and that it outperforms existing work.
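The dimensionality problem and the PCA step mentioned above can be illustrated with a minimal sketch. It assumes NumPy and uses synthetic random data as a stand-in for gene expression values; the function name `pca_reduce`, the matrix sizes, and the number of retained components are hypothetical choices for illustration, not the authors' implementation or settings. The PSO and RF feature selection and the CNN and Bi-LSTM classifier are not shown.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components using an SVD."""
    # Center each feature (gene) at zero mean across samples.
    X_centered = X - X.mean(axis=0)
    # Thin SVD: rows of Vt are principal directions, ordered by variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Sample variance captured by each principal component.
    explained_var = (S ** 2) / (X.shape[0] - 1)
    # Coordinates of every sample in the reduced space.
    Z = X_centered @ Vt[:n_components].T
    return Z, explained_var


# Synthetic stand-in for expression data (assumption: not TCGA data).
rng = np.random.default_rng(0)
n_samples, n_genes = 20, 500  # far more features than samples
X = rng.normal(size=(n_samples, n_genes))

Z, explained_var = pca_reduce(X, n_components=10)
print(Z.shape)  # (20, 10)

# Count components that carry non-negligible variance.
n_informative = int(np.sum(explained_var > 1e-10 * explained_var[0]))
print(n_informative)
```

After centering, a matrix with n samples has rank at most n - 1, so no more than n - 1 principal components can carry variance regardless of how many genes are measured. This is one way to see why a small sample size limits what can be learned from high-dimensional expression data.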
