Professor-cum-researcher Dr. Rudra Prasad Saha discusses how bioinformatics is playing a critical role in the development of a Covid-19 vaccine.
Coronavirus Disease 2019, or Covid-19, has spread to more than 200 countries, infecting more than 2.3 million people and claiming over 150,000 lives as of April 19, 2020, less than four months after it was first detected in Wuhan, China.
The world today is in deep shock due to Covid-19, which has forced a majority of the world's population to stay indoors to minimize the spread of the virus in the absence of any drug or vaccine. World economic activity has declined sharply, and the IMF and the WTO have already indicated that the world is going to face a major economic crisis, much bigger than the one in 2008. The disease has caught us all on the wrong foot, and extraordinary scientific and regulatory efforts are required to develop a vaccine.
Severe Acute Respiratory Syndrome Coronavirus 2, or SARS-CoV-2, is the causative agent of Covid-19. It belongs to a large family of Coronaviruses that infect a wide variety of animals, birds and humans. Shifting from an animal host to a human host is not easy for a virus, as it requires genetic changes that enable the virus to survive and proliferate in humans. SARS-CoV-2 is believed to have acquired those essential mutations when it jumped from a bat to a human via a pangolin, as the scientific community has predicted. The virus mutates in such a way that it can evade the human immune system, survive, reproduce and then, being highly infectious, jump to another human. The virus is believed to spread via respiratory droplets and aerosols when an infected person comes into close contact with a healthy individual. It causes mild to severe symptoms in humans, including fever, dry cough, respiratory distress and pneumonia, and may result in death.
In the absence of a vaccine, scientists worldwide immediately started collaborating and sharing data so that the development of a vaccine can be accelerated. Even though many scientists are stuck at home because their laboratories have been closed due to the outbreak, their minds are not stuck: they have started working from home, analyzing the available data and coming up with ideas that can help us fight this deadly disease.
In this crisis, Bioinformatics has emerged as one of the major fields for analyzing data and helping in the development of drugs and vaccines. Sequence data is the ‘holy grail’ in Bioinformatics. Bioinformaticians use computational tools or write their own programs to analyze biological data and come up with hypotheses that can then be tested in the laboratory for validation. Faced with a large amount of data, scientists have to spend a great deal of time and money to get any fruitful results. In this scenario, Bioinformatics can be used to analyze the data, discard what is redundant, and come up with a few significant predictions that can be tested in the laboratory quickly.
In this way, Bioinformatics can reduce the workload and expenditure significantly, which in turn helps us address a problem quickly and efficiently. Today, Bioinformaticians are working closely with various drug and vaccine development teams to find a cure for Covid-19. To identify a candidate drug against Covid-19, the workflow of a Bioinformatician can be roughly divided into five broad areas: (a) retrieving nucleic acid data from the primary sequence databases, (b) analysis and comparison of the data using various sequence comparison tools, (c) phylogenetic analysis of the sequences to find relationships between various species, (d) homology modeling of target viral proteins, and (e) docking of ligand molecules to viral proteins to facilitate the design of suitable inhibitors.
Sequence Retrieval from the Database
The Covid-19 outbreak started in Wuhan, China, in late December 2019. Scientists immediately sequenced the viral genome and deposited it in the Nucleotide database in January 2020 for further analysis by other scientists. The Nucleotide database is a freely available public sequence repository where scientists worldwide deposit DNA and RNA sequences to share data with the community. Immediately upon accessing the SARS-CoV-2 sequence data, Bioinformaticians started analyzing it to understand what types of genes are present in the genome of the virus and what kinds of proteins these viral genes can encode. Knowing this is the first step in understanding the viral defenses that can be targeted for drug development.
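As a rough illustration of the retrieval step, the sketch below builds a download request and parses the FASTA text that such a request returns. It assumes the Nucleotide database mentioned above is the one hosted by NCBI and that its E-utilities `efetch` endpoint is used; the helper names (`build_efetch_url`, `parse_fasta`, `fetch_fasta`) are hypothetical, and the demo record is made up so that the example runs without network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public NCBI E-utilities efetch endpoint is the access route.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession):
    """Build a URL that requests one nucleotide record in FASTA format."""
    params = {"db": "nuccore", "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def fetch_fasta(accession):
    """Download and parse a record (needs network access; not called below)."""
    with urlopen(build_efetch_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))


# Offline demo with a made-up record; the sequence is NOT real viral data.
demo_text = ">toy_isolate_1 hypothetical example\nATGGCTAGC\nTTAGGCTAA\n"
demo_records = parse_fasta(demo_text)
```

With network access, calling `fetch_fasta("NC_045512")` should return the SARS-CoV-2 reference genome record; that accession is not given in this article and is supplied here only as an example.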
Comparative Analysis of the Sequence Data 
Once the viral sequence was available and retrieved, scientists started comparing it with similar viruses to find out how SARS-CoV-2 originated. They found high similarity with another Coronavirus, SARS-CoV, which was responsible for the SARS outbreak of 2003. By analyzing and comparing the sequence data, they also found that SARS-CoV-2 originated from a bat Coronavirus. In addition, comparing SARS-CoV-2 isolates from different countries gave us an idea of how the virus mutates as it is transmitted from one region to another, and whether these changes in the viral genome would have any effect on subsequent vaccine development.
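The idea of comparing isolates can be sketched in a few lines of code. The example below assumes the two sequences have already been aligned to the same length (real studies use dedicated alignment tools for that step) and simply reports the percent identity and the positions that differ. The function name `compare_aligned` and both toy sequences are hypothetical.

```python
def compare_aligned(ref, other):
    """Return (percent identity, differences) for two pre-aligned sequences.

    Columns where both sequences carry a gap ('-') are skipped.
    Each difference is a tuple: (1-based column, reference base, other base).
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    differences = []
    for column, (a, b) in enumerate(zip(ref.upper(), other.upper()), start=1):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
        else:
            differences.append((column, a, b))
    identity = 100.0 * matches / compared if compared else 0.0
    return identity, differences


# Made-up, pre-aligned toy sequences (not real viral data).
reference = "ATGGCTAGCTTAGGCTAACG"
isolate = "ATGGCTAGCTTGGGCTA-CG"
identity, differences = compare_aligned(reference, isolate)
```

Here `differences` lists one substitution and one deletion relative to the reference, which is the kind of per-isolate change list that can then be tracked across regions.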
Phylogenetic Analysis of the Sequences 
Phylogenetic analysis is a very important methodology by which evolutionary relationships between species, i.e., how one sequence has evolved from another, can be predicted. At present, sequences of more than 150 isolates of SARS-CoV-2 are available in the Nucleotide database, and through phylogenetic analysis scientists have found that at least three distinct strains of the virus (types A, B, and C) are in circulation in different regions of the world. Significant proportions of types A and C are found in European and American populations, while type B is prevalent in Asia. These findings are very important for understanding how the virus is evolving in different areas.
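To make the idea of grouping isolates by evolutionary distance concrete, here is a minimal sketch that clusters pre-aligned sequences with UPGMA, a simple textbook method chosen only for illustration; it is not necessarily the method used in the studies mentioned above. The isolate names and sequences are made up, and the output is a Newick-style topology without branch lengths.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(sequences):
    """Cluster aligned sequences with UPGMA; return a Newick-style topology."""
    # Each cluster is keyed by its Newick label and stores its member count.
    sizes = {name: 1 for name in sequences}
    dist = {
        frozenset((a, b)): p_distance(sequences[a], sequences[b])
        for a, b in combinations(sequences, 2)
    }
    while len(sizes) > 1:
        # Pick the closest pair; sorting the labels makes ties deterministic.
        a, b = min(
            (tuple(sorted(pair)) for pair in dist),
            key=lambda pair: (dist[frozenset(pair)], pair),
        )
        merged = "(" + a + "," + b + ")"
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted average distance from the merged cluster to 'other'.
            dist[frozenset((merged, other))] = (
                dist[frozenset((a, other))] * sizes[a]
                + dist[frozenset((b, other))] * sizes[b]
            ) / (sizes[a] + sizes[b])
        # Drop every distance that still refers to the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes)) + ";"


# Made-up aligned toy sequences (not real viral data).
toy_isolates = {
    "isolate_1": "ATGGCTAGCT",
    "isolate_2": "ATGGCTAGCA",
    "isolate_3": "ATGACTTGCA",
    "isolate_4": "TTGACTTGCA",
}
tree = upgma(toy_isolates)
```

In this toy data set, isolates 1 and 2 differ at a single position, as do isolates 3 and 4, so the sketch groups them into two pairs before joining the pairs at the root.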
