SARS-CoV-2: The biology of coronavirus

Coronaviruses are a group of enveloped viruses with a positive-sense, single-stranded RNA genome. Under an electron microscope, a coronavirus has a crown-like appearance (corona is Latin for crown) because of the club-shaped spike glycoproteins on its envelope. With genomes ranging from 26 to 32 kilobases (kb) in length, coronaviruses have among the largest genomes of all RNA viruses. Apart from afflicting a range of economically important vertebrates (such as pigs and chickens), seven coronaviruses are known to infect humans and cause respiratory disease. Among them, severe acute respiratory syndrome coronavirus (SARS-CoV), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and Middle East respiratory syndrome coronavirus (MERS-CoV) are zoonotic, highly pathogenic coronaviruses that have caused regional and global outbreaks.

According to the International Committee on Taxonomy of Viruses, coronaviruses are classified under the order Nidovirales, family Coronaviridae, subfamily Orthocoronavirinae. This subfamily comprises four genera of CoVs: α-coronavirus, β-coronavirus, γ-coronavirus, and δ-coronavirus. The genus β-coronavirus is further divided into four distinct subgenera or lineages: lineage A (Embecovirus), lineage B (Sarbecovirus), lineage C (Merbecovirus), and lineage D (Nobecovirus). A fifth subgenus, Hibecovirus, has also been added. Among the seven known human coronaviruses (HCoVs), HCoV-229E and HCoV-NL63 belong to α-coronavirus, whereas HCoV-OC43 and HCoV-HKU1 belong to lineage A, SARS-CoV and SARS-CoV-2 to lineage B, and MERS-CoV to lineage C of β-coronavirus [Trends in microbiology, 2013].
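The classification of the seven human coronaviruses described above can be laid out as a small lookup table. The following is only an illustrative sketch: the names `Classification`, `HUMAN_CORONAVIRUSES` and `viruses_in_lineage` are invented for this example, and the table contains nothing beyond the genus, lineage and subgenus assignments stated in the paragraph.

```python
from typing import List, NamedTuple, Optional


class Classification(NamedTuple):
    genus: str                # "alpha" or "beta" coronavirus
    lineage: Optional[str]    # lineage letter; only given for beta-coronaviruses
    subgenus: Optional[str]   # subgenus name matching the lineage


# Assignments exactly as stated in the text above (hypothetical table name).
HUMAN_CORONAVIRUSES = {
    "HCoV-229E":  Classification("alpha", None, None),
    "HCoV-NL63":  Classification("alpha", None, None),
    "HCoV-OC43":  Classification("beta", "A", "Embecovirus"),
    "HCoV-HKU1":  Classification("beta", "A", "Embecovirus"),
    "SARS-CoV":   Classification("beta", "B", "Sarbecovirus"),
    "SARS-CoV-2": Classification("beta", "B", "Sarbecovirus"),
    "MERS-CoV":   Classification("beta", "C", "Merbecovirus"),
}


def viruses_in_lineage(lineage: str) -> List[str]:
    """Return the human coronaviruses in a given beta-coronavirus lineage."""
    return sorted(
        name
        for name, info in HUMAN_CORONAVIRUSES.items()
        if info.lineage == lineage
    )


if __name__ == "__main__":
    print(viruses_in_lineage("B"))  # ['SARS-CoV', 'SARS-CoV-2']
```

Grouping by lineage in this way makes it easy to see that SARS-CoV and SARS-CoV-2 share a subgenus, while MERS-CoV sits in a separate one.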

SARS-CoV-2 virions are spherical or pleomorphic particles with a diameter of approximately 60–140 nm. They contain positive-sense, single-stranded RNA associated with a nucleoprotein within a capsid comprised of matrix protein. The genome of an RNA virus is said to be positive-sense if the viral RNA can be translated directly into viral proteins, without the need for an RNA polymerase in the virion. The envelope is decorated with club-shaped projections composed of trimeric spike (S) glycoprotein. Some coronaviruses carry additional, shorter projections made up of dimeric hemagglutinin-esterase (HE), which is absent in SARS-CoV-2. Like other coronaviruses, SARS-CoV-2 is sensitive to ultraviolet rays and heat. Furthermore, these viruses can be effectively inactivated by lipid solvents including ether (75%), ethanol, chlorine-containing disinfectant, peroxyacetic acid and chloroform, but not by chlorhexidine. Bioinformatics analysis conducted by Chan et al. revealed that the genome of SARS-CoV-2 has 89% nucleotide identity with bat SARS-like CoVZXC21 and 82% with human SARS-CoV [Emerging microbes & infections, 2020]. Its single-stranded RNA contains 29891 nucleotides, encoding 9860 amino acids, with a guanine-cytosine content (G + C ratio) of 38%. SARS-CoV-2, like SARS-CoV and MERS-CoV, probably evolved from a strain found in bats through sequential mutations and recombination of bat coronaviruses, underwent further mutations during the spillover to intermediate hosts, and finally acquired the ability to infect human cells.
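The two sequence statistics quoted above, nucleotide identity and G + C content, are simple ratios. The sketch below shows how each is computed on short made-up sequences; the function names and the toy sequences are assumptions for illustration only, not real viral data, and a real comparison between genomes would first need an alignment step that is not shown here.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return sum(base in "GC" for base in seq) / len(seq)


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences.

    Both sequences must have the same length; a gap character ('-')
    never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences are empty")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Toy RNA fragments, invented for this example.
    toy_a = "AUGGCUAAUC"
    toy_b = "AUGGCAAAUU"
    print(f"G + C content: {gc_content(toy_a):.0%}")               # 40%
    print(f"Identity:      {percent_identity(toy_a, toy_b):.0f}%")  # 80%
```

Applied to the figures in the text, a G + C content of 38% means that roughly 38 of every 100 bases in the 29891-nucleotide genome are guanine or cytosine.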

Similar to other β-coronaviruses, the genome of SARS-CoV-2 contains two flanking untranslated regions (UTRs) and a long open reading frame encoding a polyprotein. The SARS-CoV-2 (formerly 2019-nCoV) genome is arranged in the order 5′-replicase (ORF1a/b)-structural proteins [Spike (S)-Envelope (E)-Membrane (M)-Nucleocapsid (N)]-3′. The first two-thirds of the genome typically codes for nonstructural proteins, from two open reading frames (ORFs) whose products form the replicase complex. The last third of the genome primarily encodes structural proteins. The four major structural proteins across coronaviruses are the spike (S) protein, membrane (M) protein, envelope (E) protein, and nucleocapsid (N) protein. The S protein is responsible for binding to host cell receptors and for the subsequent fusion of the viral and host cell membranes that allows the virus to enter the host cell [Nature, 2016]. The N protein primarily binds the viral RNA genome in a beads-on-a-string fashion, forming the helically symmetric nucleocapsid [Advances in virus research, 2005]. The M protein, the most abundant structural protein, defines the shape of the viral envelope and is also considered the central organizer of coronavirus assembly, interacting with all the other major structural proteins [Journal of Structural Biology, 2011]. The E protein is the smallest of the major structural proteins and is present in the envelope only in very small amounts. Recombinant coronaviruses lacking E show significantly reduced viral titres, crippled viral maturation, or propagation-incompetent progeny, demonstrating the importance of E in virus production and maturation [Virology, 2007].

Another article [Journal of Microbiology, Immunology and Infection, 2020], seeking to explain the pathogenicity of the virus, contends: “In order for the virus to complete entry into the cell following this initial process, the spike protein has to be primed by an enzyme called a protease. Similar to SARS-CoV, SARS-CoV-2 uses a protease called TMPRSS2 to complete this process. In order to attach the virus receptor (spike protein) to its cellular ligand (ACE2), activation by TMPRSS2 as a protease is needed.” Transmembrane protease, serine 2 (TMPRSS2) is an enzyme associated with physiological and pathological processes such as digestion, tissue remodelling, blood coagulation, fertility, inflammatory responses, tumour cell invasion, and apoptosis; in humans, it is encoded by the TMPRSS2 gene [Advances in Cancer Research, 2011]. TMPRSS2 is expressed in the respiratory tract and is already known to activate a variety of respiratory viruses. After entering the host cell, the SARS-CoV-2 virion releases its RNA and takes over the cell's machinery to produce and disseminate copies of the virus, which go on to infect more cells. Yet another investigation has shown that a clinically proven drug directed against TMPRSS2 can block SARS-CoV-2 cell entry, a finding expected to prompt the discovery of effective therapeutics against SARS-CoV-2 infection [Cell, 2020]. As scientific investigations reveal more molecular details about the virus, better insight into viral transmission will follow, enabling the development of treatment options.

By Sauvik Raha

Former lecturer in chemistry at Ramanuj Gupta Junior College, Silchar; Indian Association for the Cultivation of Science