Sequence to Structure Analysis of SOD1 and SOD2 from Freshwater Turtles

Superoxide dismutase (SOD), which is responsible for the dismutation of the reactive oxygen species (ROS) produced in cells, influences the aging and longevity of animals. This study reports a sequence-to-structure analysis of the SOD1 and SOD2 genes and proteins of two freshwater turtles, Pelodiscus sinensis and Mauremys reevesii. Analysis of the gene and protein sequences of these SODs, retrieved from the NCBI database, showed minor variations in the molecular weight of the gene sequences, melting temperature, folding, aliphatic index and isoelectric point. The gene sequences were all AT rich. SOD1 of both turtles and SOD2 of Pelodiscus sinensis had 5 restriction sites each, while SOD2 of Mauremys reevesii had 8. SOD1 was dominated by β strands, whereas SOD2 was dominated by α helices. Homology models generated with MODELLER 9.12 were all within the acceptable range. Solvent accessible surface area (SASA) and active site analysis of the refined models indicated that the SOD proteins were acidic, had 5 to 11 active sites each, and had a high percentage of exposed aliphatic residues. It could therefore be inferred that these models have the potential to be used for understanding the aging process.

Corresponding Author: Dhirendra Kumar Sharma, School of Biological Sciences, University of Science and Technology, Meghalaya, Techno city, Kiling Road, Baridua-793101, India. Tel: +91 9435147412; E-mail: dksgu@yahoo.co.uk
Received Date: June 17, 2015
Accepted Date: September 28, 2015
Published Date: October 06, 2015


Introduction
Superoxide dismutase (SOD) is regarded as the most effective enzyme influencing aging and longevity; it regulates the reactive oxygen species (ROS) produced during the metabolic and physiological events of animals [1][2][3]. An unbalanced concentration of ROS often contributes to diseases such as cancer, diabetes, premature aging, inflammation and hypertension [4]. SOD1, found in the cytoplasm and the outer mitochondrial space, protects cells against the lethal effects of radiation, drugs or ROS toxicity [5], while SOD2, found in the inner mitochondrial space, has been implicated in cellular differentiation, apoptosis, tumorigenesis and hypoxia-induced pulmonary disease [6][7][8][9]. Among all the superoxide dismutases found in organisms, only SOD1 (Cu, Zn SOD) and SOD2 (Mn SOD) have been sequenced so far in a limited number of chelonians, and without structural information [10]. Turtles, many of which survive for prolonged periods while remaining very active, are an interesting model for understanding aging [11,12]. Reliable 3-D structural predictions of these proteins by homology modeling might therefore be significant for understanding the aging process [13]. Pelodiscus sinensis Wiegmann, 1835 and Mauremys reevesii Gray, 1831 are two freshwater turtles [14] that have often been used as models in studies of turtle evolution and development, and their SOD1 and SOD2 have been sequenced recently [15][16][17]. Therefore, an attempt has been made to analyze and characterize the SOD1 and SOD2 gene and protein sequences of Pelodiscus sinensis and Mauremys reevesii at the structural and functional level. Secondary structure prediction, analysis of gene and protein sequence properties, restriction site analysis, homology modeling and evaluation, and Solvent Accessible Surface Area (SASA) and active site prediction were carried out for the SODs of both species to understand their structural and functional status.

Homology Modeling of Protein and Evaluation
The 3D structures of SOD1 and SOD2 were generated by the comparative method in MODELLER 9.12 [20]. The models were validated with PROCHECK [21] and RAMPAGE [22]. Energy minimization and refinement of the protein structures were carried out with the Discovery Studio package (Accelrys) [23] and Chiron [24]. The refined structures were evaluated using PROSESS [25], PROCHECK, RAMPAGE and ProFunc [26].

Solvent Accessible Surface Area (SASA) and Active Site Prediction
SASA and active site predictions were carried out on the best refined model structures of the proteins. Get Area [27] and Discovery Studio Client 4.0 were used to determine the SASA percentage. Active sites and clefts of SOD1 and SOD2 were predicted with the Active Site Prediction and Analysis Server, DoG Site Scorer [28] and ProFunc. DoG Site Scorer identifies all cavities in a protein and analyses the amino acid composition of each cavity. It scores the cavities according to the physicochemical properties of the residues lining them. Amino acid residue types were evaluated at the largest pockets and clefts.

Nucleotide Sequence Analysis
Sequence Retrieval and Sequence Analysis:
SOD1 and SOD2 nucleotide sequences of Pelodiscus sinensis and Mauremys reevesii were downloaded from the NCBI database. The molecular weight of SOD1 was 236.155 kDa and 243.483 kDa, whereas that of SOD2 was 463.464 kDa and 544.515 kDa, in Pelodiscus sinensis and Mauremys reevesii respectively. The genes were found to be AT rich (Table 1).
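The base composition check behind the "AT rich" statement can be sketched as follows. This is a minimal illustration only: the function names and the 50% threshold are assumptions made for the example, not the tool or criterion used in the study.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, G and C in a DNA sequence.

    Characters other than A/T/G/C (e.g. N or whitespace) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ATGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/G/C bases")
    return {base: 100.0 * counts[base] / total for base in "ATGC"}


def is_at_rich(seq, threshold=50.0):
    """Assumed criterion: A + T exceeds `threshold` percent of all bases."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"] > threshold


# Hypothetical toy sequence, not a real SOD gene.
toy_gene = "ATGGCAATTAAAGCTGTATTTTAA"
print(base_composition(toy_gene), is_at_rich(toy_gene))
```

In practice the sequences retrieved from NCBI would be read from FASTA files and passed to `base_composition` in place of the toy string.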

Restriction Enzyme Digestion
The SOD nucleotide sequences were digested in silico with 8 restriction enzymes, namely BglII, EcoRI, FokI, HindIII, MspI, PstI, SmaI and XbaI, chosen as 4-base and 6-base cutters, and a virtual gene map of the gene sequences was prepared (Table 2). The results showed that the gene sequences of SOD1 of both turtles and SOD2 of Pelodiscus sinensis had 5 restriction sites each, while SOD2 of Mauremys reevesii presented 8 restriction sites.
... 3. The net formal charge indicated that the SOD1 of P. sinensis, as well as the SOD1 and SOD2 of M. reevesii, had a higher number of anionic atoms, whereas the SOD2 model of P. sinensis (MODELLER) had a larger number of cationic charges. All the SODs were single stranded. The size of the exposed protein groups was higher than the size of the buried protein groups in all the SODs. Hydrophilic residue size was higher than hydrophobic residue size in the SOD1s, while the reverse was detected in the SOD2s.
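To illustrate the in-silico restriction digestion reported in this section, the sketch below locates the recognition sequences of the eight named enzymes in a DNA string. The recognition sequences are the standard ones for these enzymes; the function names and the toy sequence are hypothetical, and the code reports recognition-site positions rather than exact cut positions, so it is not necessarily how the study's software counted sites.

```python
import re

# Standard 5'->3' recognition sequences of the enzymes named in the text.
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
    "FokI": "GGATG",  # non-palindromic, so both strands must be searched
    "HindIII": "AAGCTT",
    "MspI": "CCGG",
    "PstI": "CTGCAG",
    "SmaI": "CCCGGG",
    "XbaI": "TCTAGA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, site):
    """Sorted 0-based start positions of `site` on either strand.

    Positions are given in top-strand coordinates; a zero-width
    lookahead is used so that overlapping occurrences are found.
    """
    seq = seq.upper()
    positions = set()
    for pattern in {site, reverse_complement(site)}:
        for match in re.finditer(f"(?={pattern})", seq):
            positions.add(match.start())
    return sorted(positions)


def restriction_map(seq):
    """Map each enzyme name to the positions of its recognition sites."""
    return {name: find_sites(seq, site) for name, site in RECOGNITION_SITES.items()}


# Hypothetical toy sequence containing EcoRI, MspI and (reverse-strand) FokI sites.
toy = "AAGAATTCTTCCGGAACATCCTT"
for enzyme, positions in restriction_map(toy).items():
    print(enzyme, positions)
```

Summing the lengths of the position lists gives a total site count per gene that could be compared with the values in Table 2.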

Solvent Accessible Surface Area (SASA) and Active site predictions
Get Area server analysis predicted that the total solvent accessibility of the SOD1s was 7508.24 ... The pocket properties are tabulated in Table 4. The largest active site volume (Å³) in SOD1 of P. sinensis was 579.50 against 333.06 in M. reevesii, whereas in SOD2 the largest active site volumes were 559.17 and 991.94 for the models of P. sinensis and M. reevesii respectively. Gly was the dominant residue in the largest active sites of both SOD1 models, followed by Val. In the largest active site, the hydrophobicity ratio of 0.38 of the SOD1 model of M. reevesii was better than the 0.52 of P. sinensis. In the SOD2 models, the largest pockets of both P. sinensis and M. reevesii had Gly and Leu as the most numerous residues, with hydrophobicity ratios of 0.49 and 0.65 respectively. ProFunc results showed that the SOD1 models of P. sinensis and M. reevesii had largest cleft sizes of 2045.67 Å and 2234.67 Å, while the SOD2 models had 1666.41 Å and 2122.88 Å respectively. Comparison of residue types in the largest clefts suggested that the SOD1 model of P. sinensis had equal numbers of dominant positive, neutral and aliphatic residues (10 residues each), while the model of M. reevesii was dominated by aliphatic residues (11 residues) followed by negative residues (10 residues). In SOD2, positive and neutral residues (6 residues each) were dominant in the model of P. sinensis, while aliphatic residues (11 residues) followed by aromatic residues (10 residues) were dominant in the model of M. reevesii.

Discussion
Nucleotide base composition analysis of the SOD1 and SOD2 genes suggested that all were AT rich, with adenine dominating (Table 1). DNA replication starts at AT-rich regions, and these regions are universally the most conserved regions found in replicons [29][30][31]. In both SOD1 genes, adenine was followed by guanine and thymine at almost the same frequencies, with a low percentage of cytosine. In the SOD2 genes, adenine was marginally more frequent than thymine, followed by guanine and a low percentage of cytosine. Analysis of the SOD1 and SOD2 nucleotide sequences showed minor variations in melting temperature, and no coding regions were defined in the sequences. In silico analysis of restriction site variability suggested that differences in the restriction sites may lead to a high degree of polymorphism. Further, the similarity in the conserved sequences is in fact suggestive of their identical longevity.
The aliphatic index of SOD1 of Mauremys reevesii was higher than that of SOD1 of Pelodiscus sinensis, indicating greater stability over different temperature ranges. The isoelectric points indicated that both SOD1 proteins carry no net electrical charge at an acidic pH. The results also indicated that SOD2 of Mauremys reevesii was more stable than SOD2 of Pelodiscus sinensis. The half-life of all the SODs was > 20 hrs. SOD1 of both turtles had similar frequencies of hydrophobic residues, but the frequencies of hydrophilic residues, both negatively and positively charged, varied. In SOD2, by contrast, the charged residue frequencies were the same, but the frequencies of hydrophobic and hydrophilic residues varied. Glycine (G) was the dominant residue in both SOD1 proteins, while leucine (L) was dominant in the SOD2 proteins. Because of the dominance of glycine, helix-forming probabilities were low in SOD1, whereas helix formation was evident in SOD2 with its leucine dominance. Moreover, all four SOD sequences might be highly conserved, since glycine and leucine have low mutability and are more frequent in conserved sequence elements [32].
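As an illustration of how an aliphatic index and the residue frequencies discussed above can be computed, the sketch below assumes the standard Ikai (1980) definition, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. This chunk does not state which tool produced the reported values, so the formula choice and the function names are assumptions.

```python
def residue_percentages(protein):
    """Mole percent of every residue type present in a protein sequence."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty protein sequence")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in sorted(set(seq))}


def aliphatic_index(protein):
    """Aliphatic index under the assumed Ikai (1980) definition.

    AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)),
    with X expressed as mole percent of the whole sequence.
    """
    pct = residue_percentages(protein)
    return (
        pct.get("A", 0.0)
        + 2.9 * pct.get("V", 0.0)
        + 3.9 * (pct.get("I", 0.0) + pct.get("L", 0.0))
    )


# Hypothetical peptide, not a real SOD sequence.
toy_protein = "MGLVAGIGLK"
print(round(aliphatic_index(toy_protein), 2))
print(residue_percentages(toy_protein)["G"])  # glycine mole percent
```

A higher value reflects a larger relative volume of aliphatic side chains, which is the basis for the thermostability interpretation given in the paragraph above.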
Prediction of secondary structure locates the positions of the amino acid residues, that is, whether they lie in helices, strands or coils [33]. Secondary structure prediction indicated that SOD1 of both Pelodiscus sinensis and Mauremys reevesii had the same number of β strands (12) and no α helices. Although the number of β strands was the same, variations were noticed at the 2nd, 3rd and 4th strand positions of the two structures. SOD2 of Pelodiscus sinensis had 13 α helices and 4 β strands, compared with 12 α helices and 5 β strands in SOD2 of Mauremys reevesii. The structures were suggestive of the dominance of α helices in both SOD2 proteins. Further, sequence alignment suggested that the SOD1 and SOD2 of the two turtles had 90.32% and 97.78% conserved sequence similarity respectively.
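A similarity percentage of the kind quoted above can be derived from a pairwise alignment roughly as follows. This is a sketch under stated assumptions: it computes percent identity over aligned, non-gap columns, which is only one of several conventions, and the text does not specify the alignment tool or the convention behind the 90.32% and 97.78% figures. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two equal-length aligned sequences.

    Columns where either sequence has a gap are excluded from the
    denominator (an assumed convention; other tools divide by the
    full alignment length or by the shorter sequence length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 5 ungapped columns, 4 identical.
print(percent_identity("MKTA-G", "MKSAVG"))
```

Because the denominator convention changes the result, values from different tools should be compared only when the same convention is used.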
The ProFunc evaluation indicated that all the proteins were associated with cellular oxygen and reactive oxygen species and with metabolic processes. SASA prediction helps in understanding the probable binding-oriented conformational changes that may occur in protein structures [34]. The results suggested that the models have a greater potential for binding ligands in solvent.
Active site predictions are essential for prediction of functions, classification and drug binding ability of proteins [35] . It has been found that SOD1 of P. sinensis had 6 numbers of pockets against the model of M. reevesii, which had 7 pockets; on the other hand, the SOD2 had 11 and 5 numbers of pockets for the models of P. sinensis and M. reevesii respectively.
www.ommegaonline.org Structure Analysis of SOD1 and SOD2 The aromatic residues at the active sites stabilize the monomers in a hydrophobic core. Since the binding site affinities and specificities are mainly achieved by hydrogen bond interactions [36] , the results of active sites and clefts predictions, it could well be inferred that both the SOD1 and SOD2 proteins of the two turtles had acceptable and applicable range of active sites and cavity size.

Conclusion
Analysis of the nucleotide sequences of the SODs suggested that the genes were AT rich and had minor differences in melting temperature, with a good number of restriction sites indicating a high degree of polymorphism. Analysis of the protein sequences of SOD1 and SOD2, and evaluation of the predicted secondary structures and the refined tertiary structures generated by MODELLER 9.12, indicated that all 4 SOD structures were within the acceptable range. Comparison of the models with their counterparts suggested that, although they had a sequence similarity of around 90%, each SOD had its own individual structural characteristics. The SOD structures were submitted to the PMDB database under the IDs PM0079765 (SOD1 of P. sinensis), PM0079766 (SOD2 of P. sinensis), PM0079772 (SOD1 of M. reevesii) and PM0079773 (SOD2 of M. reevesii). Thus it could be assumed that these models and data have the potential to be used as a source for further understanding of the aging process and for drug binding related studies.