
Analysis of the genetic characteristics of hepatitis B virus nucleic acid and synonymous codons


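  Summary: Hepatitis B virus is the causative agent of a viral infectious disease that seriously threatens public health. This paper analyzes the genetic characteristics of all coding sequences in the open reading frames of the hepatitis B virus genome and also examines how the synonymous codon usage pattern of the virus differs from the codon usage patterns of several model organisms. The results show that hepatitis B virus genes do not use synonymous codons uniformly: TCC, TCT, AGT and AGA are over-used, whereas ACC, CGG, GCC, ATA, CGT, CCC, TCG and GGT are used at very low frequency. Comparison of the codon usage patterns of the virus and the model organisms indicates that the nucleotide variation of the virus itself and the selective pressure exerted by the host have brought the genetic evolution of hepatitis B virus to a relatively balanced state.
  Key words: hepatitis B virus; genome; synonymous codon; evolution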
  Abstract: Hepatitis B virus (HBV) is the causative agent of an important human disease. The synonymous codon usage pattern of the HBV open reading frames was analyzed using relative synonymous codon usage values and principal component analysis, the viral synonymous codon usage distribution was compared with those of five organisms (E. coli, mouse, human, Gorilla gorilla and Pan troglodytes), and the overall codon usage bias was estimated with the codon adaptation index. The genetic diversity of 58 HBV strains indicates that the compositional constraint arising from nucleotide composition plays an important role in shaping the synonymous codon usage pattern of HBV. Four codons (TCC, TCT and AGT for Ser and AGA for Arg) are over-represented in HBV and eight codons (ACC for Thr, CGG for Arg, GCC for Ala, ATA for Ile, CGT for Arg, CCC for Pro, TCG for Ser and GGT for Gly) are under-represented; the usage of these codons, apparently under strong translational selection, may serve as a potential genetic marker for HBV. The codon usage bias of HBV suggests that a balance between mutation pressure from the virus itself and translational selection from the host acts on its synonymous codon usage pattern. The candidate systems for expressing proteins encoded with the HBV codon usage profile are human, mouse and Gorilla gorilla, whereas the overall codon usages of cells derived from E. coli and Pan troglodytes are, to some degree, not adapted to the synonymous codon usage distribution of this virus. These conclusions offer insight into the synonymous codon usage pattern of HBV and may also help in understanding the divergence of HBV evolution.
  Keywords: Hepatitis B virus; synonymous codon usage pattern; principal component analysis; codon adaptation index; genetic marker
  1. INTRODUCTION
  It is well known that the genetic code uses 64 codons to represent the 20 standard amino acids and the stop signals. The alternative codons for the same amino acid are termed synonymous codons. Synonymous mutations tend to occur at the third base position, and synonymous codons can be interchanged without altering the primary sequence of the protein product. Some reports indicate that synonymous codons are not chosen equally and randomly in viruses (Zhao et al., 2003; Zhi et al., 2010; Zhou et al., 2012b; Zhou et al., 2011). In general, translational selection and mutation pressure are thought to be the two major factors accounting for codon usage variation among the genomes of various organisms (Saunders and Deane, 2010; Shackelton et al., 2006; Zhou and Li, 2009). In some DNA viruses, mutation pressure plays a more important role in shaping the synonymous codon usage pattern than translational selection from the host (Gu et al., 2012). Hepatitis B virus (HBV) infection is one of the most prevalent viral infections in humans and is an acute public health problem worldwide. HBV is an unusual enveloped, partially double-stranded DNA virus that depends on an error-prone reverse transcriptase as part of its replication process, which has resulted in large genetic variability over the years of virus evolution within its hosts (Araujo et al., 2011). HBV genotypes and subgenotypes have been increasingly associated with differences in clinical and virological characteristics, such as severity of liver disease and response to antiviral therapies (Chu and Lok, 2002; Kao, 2002; McMahon, 2009). As for the role of the natural hosts in the evolutionary process of HBV, some adaptive advantages are reflected in the highly efficient dissemination of the virus by different modes of transmission, which has resulted in the worldwide distribution of HBV.
Therefore, following the principles of natural selection, the synonymous codon usage pattern and nucleotide composition of a virus represent the footprint of its evolution. Previous reports have pointed out that the synonymous codon usage pattern and nucleotide composition of a virus reflect the co-evolution between the virus and its natural hosts (Lobo et al., 2009; Ma et al., 2012; Mueller et al., 2006; Sanchez et al., 2003; Weaver, 2006; Wong et al., 2010; Zhou et al., 2012a; Zhou et al., 2013; Zhou et al., 2011). Here, we analyzed the synonymous codon usage pattern and nucleotide composition of HBV to look for genetic markers of this virus and to estimate, in theory, how well expression systems derived from different organisms are adapted to the expression of the viral products.
  2. MATERIAL AND METHODS
  Coding sequence information for the HBV strains
  1. The 58 complete HBV genomes were downloaded from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/Genbank/). Their accession numbers are: AF405706, X04615, AY741795, U87747, AY123041, AF282917, AY233287, AY233280, AY233283, AY233291, DQ448620, AY373432, AY373430, DQ448620, DQ448621, DQ448622, DQ448625, AY233273, DQ448628, DQ448627, AY233275, AY233278, AY233277, AY233274, AY233276, AY233274, DQ448623, AY233281, AY233279, AY233282, AY233290, AY233288, AY233289, AY233285, AY233284, AY233286, AF282918, AF068756, M57663, AF100308, AB033554, AY741797, U87746, AY741798, AY741796, AY741794, AF100309, GQ872210, AY23329, AY233294, AY233293, AY233296, GQ161799, U95551, GQ161805, AY796031, AY796032, GQ161818, AY796030.
  2. The overall nucleotide composition (T%, A%, C% and G%) and the nucleotide composition at the third codon position (T3%, A3%, C3% and G3%) of the HBV open reading frames (ORFs) were calculated with DNAStar 7.0 for Windows. The codon adaptation index (CAI) was used to estimate the extent of bias in codon usage. The index ranges from 0.0 to 1.0; a higher value indicates stronger bias and a higher expected expression level (Sharp and Li, 1987). CAI has been reported to be the best predictor of gene expression level (Naya et al., 2001).
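  As a rough illustration of the two quantities described above, the following Python sketch computes the overall and third-position nucleotide percentages of one ORF, and a CAI value as the geometric mean of per-codon relative adaptiveness weights, the form given by Sharp and Li (1987). It is not the DNAStar procedure used in the study: the function names and the toy weights are hypothetical, and the third-position count here simply takes every codon in the frame.

```python
import math
from collections import Counter

def split_codons(seq):
    """Split an in-frame sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def percent_composition(bases):
    """Percentage of T, A, C and G among the given bases (string or list)."""
    counts = Counter(bases)
    total = sum(counts[b] for b in "TACG")
    return {b: 100.0 * counts[b] / total for b in "TACG"}

def third_position_composition(seq):
    """T3%, A3%, C3% and G3%: composition of the third base of every codon."""
    return percent_composition([codon[2] for codon in split_codons(seq)])

def cai(seq, weights):
    """Geometric mean of the relative adaptiveness w of each codon.

    `weights` maps codon -> w (0 < w <= 1), derived from a reference set.
    Codons missing from `weights` (e.g. ATG, TGG, stops) are skipped.
    """
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs))

# Toy data, for illustration only (not taken from the study).
orf = "ATGGCCGCT"
toy_weights = {"GCC": 1.0, "GCT": 0.5}
print(percent_composition(orf))
print(third_position_composition(orf))
print(round(cai(orf, toy_weights), 4))
```

  In practice the weights would be derived from a reference codon usage table for the expression host, which this sketch does not attempt to reproduce.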
  3. Calculation of the relative synonymous codon usage and the codon usage bias of HBV
  1. To investigate the characteristics of synonymous codon usage without the confounding influence of amino acid composition among different sequences, the relative synonymous codon usage (RSCU) values of the codons in each ORF were calculated. The RSCU value of the jth codon for the ith amino acid was calculated according to the published equation (Sharp et al., 1986):
  $$\mathrm{RSCU}_{ij} = \frac{g_{ij}}{\sum_{k=1}^{n_i} g_{ik}}\, n_i$$
  where g_ij is the observed number of the jth codon for the ith amino acid, which has n_i synonymous codons. Codons with RSCU values above 1.0 are used more often than expected (positive bias), whereas codons with values below 1.0 are used less often than expected (negative bias). An RSCU value of 1.0 means that the codon is chosen equally and randomly among its synonyms. Additionally, the codon usage frequencies of E. coli, mouse, human, Gorilla gorilla and Pan troglodytes were obtained from the codon usage database (Nakamura et al., 2000), and the RSCU values of the 59 synonymous codons were calculated for these organisms with the same formula.
  2. To calculate the codon usage bias (CUB) numerically, we took the statistically equal and random choice of all available synonymous codons as the "neutral point" (RSCU0 = 1.00) for the development of serotype-specific codon usage, namely, CUB = RSCU - RSCU0.
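  The RSCU and CUB definitions above can be sketched in a few lines of Python. This is a minimal illustration assuming the standard genetic code and an in-frame coding sequence; the function names and the toy sequence are hypothetical and are not part of the original analysis.

```python
from collections import Counter, defaultdict

# Standard genetic code, bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Synonymous families: drop stop codons, Met (ATG) and Trp (TGG),
# which leaves the 59 codons that have at least one synonym.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa not in "*MW":
        FAMILIES[aa].append(codon)

def rscu(seq):
    """RSCU = observed count * family size / total count for that amino acid."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    values = {}
    for family in FAMILIES.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence: RSCU undefined
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values

def cub(rscu_values, neutral=1.0):
    """CUB = RSCU - RSCU0, with RSCU0 = 1.00 as the neutral point."""
    return {c: v - neutral for c, v in rscu_values.items()}

# Toy sequence: four Ala codons (GCC twice, GCT once, GCA once, GCG never).
values = rscu("GCCGCCGCTGCA")
print(values)
print(cub(values))
```

  Amino acids that do not occur in a sequence are left out of the result rather than assigned a value, which is one possible convention; another is to set them to the neutral value.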
  4. Principal component analysis
  Principal component analysis (PCA) is a commonly used multivariate statistical method that reduces the dimensionality of data by performing a covariance analysis between factors (Zhou et al., 2013). Here, PCA was applied to estimate the discrepancy between the overall codon usage pattern of each HBV strain and those of the five organisms (E. coli, mouse, human, Gorilla gorilla and Pan troglodytes). The method provides a direct way to visualize the link between the synonymous codon distribution of the viral strains and the overall codon usage patterns of the target organisms. The HBV data set was represented as a 58-dimensional vector, each dimension corresponding to one HBV strain; only codons with synonymous alternatives for a particular amino acid were included, that is, AUG, UGG and the three stop codons were excluded. Finally, a two-dimensional map was established that shows the evolutionary distance between the virus and its host in terms of codon usage, and the relationship between the synonymous codon distribution of the virus and the overall codon usage patterns of the organisms. PCA was carried out with the statistical software SPSS 11.5 for Windows. Graphs were plotted using Sigmaplot 10.0 (Systat Software Inc.).
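  To make the dimensionality reduction concrete, here is a minimal NumPy sketch of PCA by singular value decomposition of the mean-centered data. It is not the SPSS procedure used in the study. The layout is an assumption: each row is one sample (for example one HBV strain or one organism) and each column one variable (for example the RSCU value of one codon); a data set arranged the other way round can be transposed first. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def pca_scores(matrix, n_components=2):
    """Project the rows of `matrix` onto the first principal components.

    Returns (scores, explained), where `scores` has one row per sample and
    `explained` is the fraction of total variance carried by each component.
    """
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)           # remove the per-column mean
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)   # variance is proportional to s^2
    return scores, explained[:n_components]

# Toy matrix: three samples described by two variables.
toy = [[2.0, 0.0],
       [0.0, 2.0],
       [1.0, 1.0]]
scores, explained = pca_scores(toy)
print(scores)      # coordinates for a two-dimensional map
print(explained)
```

  The first two columns of `scores` are the coordinates that would be plotted to obtain a two-dimensional map of the kind described above.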
  5. Nucleotide composition of HBV ORF
  Based on the overlay scatter plot, the content of each nucleotide at the synonymous third position of sense codons fluctuates with the total content of the corresponding nucleotide, especially for A3%, C3% and G3% (Fig. 1), suggesting that the synonymous codon usage pattern may be directly correlated with the nucleotide content shaped by mutation pressure. Turning to the expression level of viral products in host cells, the relationship between CAI and GC3% indicates that varying degrees of codon usage bias have arisen during the evolution of HBV (Fig. 2), suggesting that the synonymous codon usage pattern might play an important role in the expression level of HBV products. Previous reports have pointed out that HBV genotypes and subgenotypes are increasingly associated with differences in virological and clinical features, such as response to antiviral therapies and severity of liver disease (Chu and Lok, 2002; Kao, 2002; McMahon, 2009).
  6. Relationship between amino acids and codon usage pattern in HBV
  To analyze whether the evolution of CUB was controlled by mutation pressure or by translational selection, the CUB data were displayed as a numerical representation of the translational machinery (Fig. 3). The transition from the most negative to the most positive values was smooth, and there was no obvious or unambiguous border between so-called dominant and prohibited codons; in other words, all possible codons were used. Furthermore, some synonymous codons are over-represented (UCC, UCU and AGU for Ser and AGA for Arg) and others are under-represented (ACC for Thr, CGG for Arg, GCC for Ala, AUA for Ile, CGU for Arg, CCC for Pro, UCG for Ser and GGU for Gly) (Fig. 3). This result indicates that translational selection has an effect on the pattern of synonymous codon usage and on the evolutionary pattern of HBV.
  7. The synonymous codon usage distribution of HBV ORF
  Figure 4 shows the relationship between the overall codon usage patterns of the five organisms and the synonymous codon distribution of HBV. Based on the standard for identifying candidate expression systems for heterologous proteins (Gustafsson et al., 2004), the candidate systems for expressing proteins encoded with the HBV codon usage profile are human, mouse and Gorilla gorilla, whereas the overall codon usages of cells derived from E. coli and Pan troglodytes are, to some degree, not adapted to the synonymous codon usage distribution of this virus (Fig. 4). This might be explained by the fact that all HBV strains in this study were isolated from human beings, so the human cellular environment has influenced the evolutionary process of HBV. Furthermore, Fig. 5 indicates that the codon usage bias can stand for the synonymous codon usage pattern in the evolutionary process of HBV, suggesting that codon usage bias plays an important role in the formation of the synonymous codon usage pattern and is influenced by the balance between mutation pressure from the virus itself and translational selection from human beings.
  8. DISCUSSION
  Codon usage patterns can reflect the genetic diversity of many organisms (Duret, 2002) and have been noted in the relationships between some viruses and their hosts (Barrai et al., 2008; Liu et al., 2011; Pinto et al., 2007). Viruses are ubiquitous cellular parasites with a strong tendency to replicate quickly and evolve rapidly. As HBV depends on the host cell’s machinery for its replication, its codon usage pattern could take part in the host adaptation and evolution of this virus. More specifically, adaptation would refer to the use by the virus of the highly abundant tRNAs within the host cell, so that the virus would be optimally adapted when its codon usage matched that of the host. By maximizing the speed of viral protein synthesis, efficient viral production can impair the immune response inside a virus-infected cell (Bonhoeffer and Nowak, 1994; Dupas et al., 2003). Although over-represented codons correlate with the abundant cognate tRNAs available within the cell (Ikemura, 1982) and codon bias is an important factor in gene expression (Burgess-Brown et al., 2008), heterologous proteins with proper biological functions must also fold correctly. For codons whose usage bias contrasts with that of the corresponding codons in the three hosts, this genetic characteristic may reflect a specific evolutionary process in HBV. Among these codons, UCC, UCU and AGU for Ser and AGA for Arg are used at high frequency in every HBV gene and may regulate the translation speed of the corresponding genes so that the target protein folds properly. This functional advantage can be explained by the fact that, when the pools of cognate tRNAs are smaller, the waiting time until the correct tRNA attaches at the ribosome increases, which improves the chance of correct protein folding.
Comparative genomic analysis has focused on the ongoing evolution of codon usage patterns in different organisms (Gustafsson et al., 2004; Knight et al., 2001; Santos et al., 2004), and the differences in synonymous codon usage patterns among organisms are a barrier to heterologous expression.
  9. ACKNOWLEDGMENTS
  We would like to thank Ms. Xiao-xia Ma from Northwest University for Nationalities for her kind discussion, suggestions, and efforts in language editing.
  REFERENCES:
  Araujo, N.M., Waizbort, R. and Kay, A. (2011) Hepatitis B virus infection from an evolutionary point of view: how viral, host, and environmental factors shape genotypes and subgenotypes. Infect Genet Evol 11(6), 1199-207.
  Barrai, I., Salvatorelli, G., Mamolini, E., De Lorenzi, S., Carrieri, A., Rodriguez-Larralde, A. and Scapoli, C. (2008) General preadaptation of viral infectors to their hosts. Intervirology 51(2), 101-11.
  Bonhoeffer, S. and Nowak, M.A. (1994) Intra-host versus inter-host selection: viral strategies of immune function impairment. Proc Natl Acad Sci U S A 91(17), 8062-6.
  Burgess-Brown, N.A., Sharma, S., Sobott, F., Loenarz, C., Oppermann, U. and Gileadi, O. (2008) Codon optimization can improve expression of human genes in Escherichia coli: A multi-gene study. Protein Expr Purif 59(1), 94-102.
  Chu, C.J. and Lok, A.S. (2002) Clinical significance of hepatitis B virus genotypes. Hepatology 35(5), 1274-6.
  Dupas, S., Turnbull, M.W. and Webb, B.A. (2003) Diversifying selection in a parasitoid’s symbiotic virus among genes involved in inhibiting host immunity. Immunogenetics 55(6), 351-61.
  Duret, L. (2002) Evolution of synonymous codon usage in metazoans. Curr Opin Genet Dev 12(6), 640-9.
  Gu, Y.X., Zhang, J., Zhou, J.H., Zhao, F., Liu, W.Q., Wang, M., Chen, H.T., Ma, L.N., Ding, Y.Z. and Liu, Y.S. (2012) Comparative analysis of ovine adenovirus 287 and human adenovirus 2 and 5 based on their codon usage. DNA Cell Biol 31(3), 360-6.
  Gustafsson, C., Govindarajan, S. and Minshull, J. (2004) Codon bias and heterologous protein expression. Trends Biotechnol 22(7), 346-53.
  Ikemura, T. (1982) Correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. Differences in synonymous codon choice patterns of yeast and Escherichia coli with reference to the abundance of isoaccepting transfer RNAs. J Mol Biol 158(4), 573-97.
  Kao, J.H. (2002) Clinical relevance of hepatitis B viral genotypes: a case of deja vu? J Gastroenterol Hepatol 17(2), 113-5.
  Knight, R.D., Freeland, S.J. and Landweber, L.F. (2001) Rewiring the keyboard: evolvability of the genetic code. Nat Rev Genet 2(1), 49-58.
  Liu, Y.S., Zhou, J.H., Chen, H.T., Ma, L.N., Pejsak, Z., Ding, Y.Z. and Zhang, J. (2011) The characteristics of the synonymous codon usage in enterovirus 71 virus and the effects of host on the virus in codon usage pattern. Infect Genet Evol 11(5), 1168-73.
  Lobo, F.P., Mota, B.E., Pena, S.D., Azevedo, V., Macedo, A.M., Tauch, A., Machado, C.R. and Franco, G.R. (2009) Virus-host coevolution: common patterns of nucleotide motif usage in Flaviviridae and their hosts. PLoS One 4(7), e6282.
  Ma, M.R., Hui, L., Wang, M.L., Tang, Y., Chang, Y.W., Jia, Q.H., Wang, X.H., Wei, Y. and Ha, X.Q. (2012) Analysis of Codon Contribution Between the Pestivirus Genus and Their Natural Hosts. Journal of Animal and Veterinary Advances 11(21), 3999-4004.
  McMahon, B.J. (2009) The influence of hepatitis B virus genotype and subgenotype on the natural history of chronic hepatitis B. Hepatol Int 3(2), 334-42.
  Mueller, S., Papamichail, D., Coleman, J.R., Skiena, S. and Wimmer, E. (2006) Reduction of the rate of poliovirus protein synthesis through large-scale codon deoptimization causes attenuation of viral virulence by lowering specific infectivity. J Virol 80(19), 9687-96.
  Nakamura, Y., Gojobori, T. and Ikemura, T. (2000) Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28(1), 292.
  Naya, H., Romero, H., Carels, N., Zavala, A. and Musto, H. (2001) Translational selection shapes codon usage in the GC-rich genome of Chlamydomonas reinhardtii. FEBS Lett 501(2-3), 127-30.
  Pinto, R.M., Aragones, L., Costafreda, M.I., Ribes, E. and Bosch, A. (2007) Codon usage and replicative strategies of hepatitis A virus. Virus Res 127(2), 158-63.
  Sanchez, G., Bosch, A. and Pinto, R.M. (2003) Genome variability and capsid structural constraints of hepatitis a virus. J Virol 77(1), 452-9.
  Santos, M.A., Moura, G., Massey, S.E. and Tuite, M.F. (2004) Driving change: the evolution of alternative genetic codes. Trends Genet 20(2), 95-102.
  Saunders, R. and Deane, C.M. (2010) Synonymous codon usage influences the local protein structure observed. Nucleic Acids Res 38(19), 6719-28.
  Shackelton, L.A., Parrish, C.R. and Holmes, E.C. (2006) Evolutionary basis of codon usage and nucleotide composition bias in vertebrate DNA viruses. J Mol Evol 62(5), 551-63.
  Sharp, P.M. and Li, W.H. (1987) The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15(3), 1281-95.
  Sharp, P.M., Tuohy, T.M. and Mosurski, K.R. (1986) Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 14(13), 5125-43.
  Weaver, S.C. (2006) Evolutionary influences in arboviral disease. Curr Top Microbiol Immunol 299, 285-314.
  Wong, E.H., Smith, D.K., Rabadan, R., Peiris, M. and Poon, L.L. (2010) Codon usage bias and the evolution of influenza A viruses. Codon Usage Biases of Influenza Virus. BMC Evol Biol 10, 253.
  Zhao, K.N., Liu, W.J. and Frazer, I.H. (2003) Codon usage bias and A+T content variation in human papillomavirus genomes. Virus Res 98(2), 95-104.
  Zhi, N., Wan, Z., Liu, X., Wong, S., Kim, D.J., Young, N.S. and Kajigaya, S. (2010) Codon optimization of human parvovirus B19 capsid genes greatly increases their expression in nonpermissive cells. J Virol 84(24), 13059-62.
  Zhou, J.H., Gao, Z.L., Zhang, J., Chen, H.T., Pejsak, Z., Ma, L.N., Ding, Y.Z. and Liu, Y.S. (2012a) Comparative [corrected] codon usage between the three main viruses in pestivirus genus and their natural susceptible livestock. Virus Genes 44(3), 475-81.
  Zhou, J.H., Gao, Z.L., Zhang, J., Chen, H.T., Pejsak, Z., Ma, L.N., Ding, Y.Z. and Liu, Y.S. (2012b) Comparative the codon usage between the three main viruses in pestivirus genus and their natural susceptible livestock. Virus Genes 44(3), 475-81.
  Zhou, J.H., Gao, Z.L., Zhang, J., Ding, Y.Z., Stipkovits, L., Szathmary, S., Pejsak, Z. and Liu, Y.S. (2013) The analysis of codon bias of foot-and-mouth disease virus and the adaptation of this virus to the hosts. Infect Genet Evol.
  Zhou, J.H., Zhang, J., Chen, H.T., Ma, L.N., Ding, Y.Z., Pejsak, Z. and Liu, Y.S. (2011) The codon usage model of the context flanking each cleavage site in the polyprotein of foot-and-mouth disease virus. Infect Genet Evol 11(7), 1815-9.
  Zhou, M. and Li, X. (2009) Analysis of synonymous codon usage patterns in different plant mitochondrial genomes. Mol Biol Rep 36(8), 2039-46.
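  About the author: Yuan Hao (b. 1986), from Qingyang, Gansu; master's student; research area: animal science.
  Corresponding author: Zhang Yong (b. 1970), from Xi'an, Shaanxi; professor and doctoral supervisor; research area: animal reproductive physiology.
  Funding: Gansu Province Special Fund for Infectious Diseases (No. 3293GS2)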