47
0

Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data

Main:7 Pages
6 Figures
Bibliography:2 Pages
3 Tables
Abstract

In this study, we propose a machine learning-based method for noise reduction and disease-causing gene feature extraction in gene sequencing DeepSeqDenoise algorithm combines CNN and RNN to effectively remove the sequencing noise, and improves the signal-to-noise ratio by 9.4 dB. We screened 17 key features by feature engineering, and constructed an integrated learning model to predict disease-causing genes with 94.3% accuracy. We successfully identified 57 new candidate disease-causing genes in a cardiovascular disease cohort validation, and detected 3 missed variants in clinical applications. The method significantly outperforms existing tools and provides strong support for accurate diagnosis of genetic diseases.

View on arXiv
@article{si2025_2505.19740,
  title={ Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data },
  author={ Weichen Si and Yihao Ou and Zhen Tian },
  journal={arXiv preprint arXiv:2505.19740},
  year={ 2025 }
}
Comments on this paper