Mutation prediction in the SARS-CoV-2 genome using attention-based neural machine translation
- Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved rapidly since causing havoc worldwide in 2020. Its frequent mutations have made the virus very hard to contain. Changes in its genome drive viral evolution, rendering it more resistant to existing vaccines and drugs. Predicting viral mutations in advance will help prepare for more infectious and virulent variants of the virus, in turn reducing the damage they cause. In this paper, we propose several neural machine translation (NMT) architectures based on recurrent neural networks (RNNs) to predict mutations in selected SARS-CoV-2 non-structural proteins (NSPs), namely NSP1, NSP3, NSP5, NSP8, NSP9, NSP13, and NSP15. First, we created and pre-processed pairs of sequences from two languages, using k-means clustering and nearest neighbors, to train an NMT model. We also provide insights into training NMT models on long biological sequences. In addition, we evaluated and benchmarked our models to demonstrate their efficiency and reliability.
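
The abstract does not say how k-means clustering and nearest neighbors are combined to build the training pairs. The following is a minimal sketch of one plausible reading, not the authors' pipeline: sequences are embedded as k-mer count vectors, grouped with a plain k-means, and each sequence (source) is paired with its nearest neighbor in the same cluster (target). The function names, the k-mer featurization, the `ACGT` alphabet, and the within-cluster pairing rule are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: nucleotide input; swap in an amino-acid alphabet if needed


def kmer_vector(seq, k=2):
    """Count overlapping k-mers of `seq` over a fixed vocabulary (hypothetical featurization)."""
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [float(counts[v]) for v in vocab]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20):
    """Lloyd's algorithm, initialised deterministically with the first n_clusters points."""
    centers = [list(p) for p in points[:n_clusters]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest center.
        labels = [min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_pairs(seqs, n_clusters=2, k=2):
    """Pair every sequence with its nearest neighbor inside the same cluster (assumed rule)."""
    vecs = [kmer_vector(s, k) for s in seqs]
    labels = kmeans(vecs, n_clusters)
    pairs = []
    for i, vec in enumerate(vecs):
        candidates = [j for j in range(len(seqs)) if j != i and labels[j] == labels[i]]
        if candidates:  # singleton clusters yield no pair
            nearest = min(candidates, key=lambda c: sq_dist(vec, vecs[c]))
            pairs.append((seqs[i], seqs[nearest]))
    return pairs
```

Calling `make_pairs` on a list of sequences returns `(source, target)` tuples that could then be tokenized and fed to a sequence-to-sequence model. In practice a library implementation of k-means and a proper distance for biological sequences would likely replace these toy helpers.
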
Author: | Darrak Moin Quddusi, Sandesh Athni Hiremath, Naim Bajcinca |
---|---|
URN: | urn:nbn:de:hbz:386-kluedo-85051 |
Parent Title (English): | Mathematical Biosciences and Engineering |
Publisher: | AIMS Press |
Editor: | Pedro Carmona Sáez |
Document Type: | Article |
Language of publication: | English |
Date of Publication (online): | 2024/05/20 |
Year of first Publication: | 2024 |
Publishing Institution: | Rheinland-Pfälzische Technische Universität Kaiserslautern-Landau |
Date of the Publication (Server): | 2024/11/21 |
Issue: | 2024, 21(5): 5996-6018 |
Source: | 10.3934/mbe.2024264 |
Faculties / Organisational entities: | Kaiserslautern - Fachbereich Maschinenbau und Verfahrenstechnik |
DDC-Classification: | 0 General works, computer science, information science / 004 Computer science |
Collections: | Open-Access-Publikationsfonds |
Licence (German): | Secondary publication (Zweitveröffentlichung) |