Loading

Original Research Open Access
Volume 4 | Issue 1 | DOI: https://doi.org/10.33696/Signaling.4.089

A Computational Investigation on Rho-related GTP-binding Protein RhoB through Molecular Modeling and Molecular Dynamics Simulation Study

  • #Authors contributed equally
  • 1Department of Biochemistry and Molecular Biology, Life Science Faculty, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh
  • 2Department of Chemistry and Biochemistry, Kent State University, 800 E Summit St, Kent, OH 44240, USA
  • 3Department of Fisheries and Marine Bioscience, Faculty of Biological Science, Jashore University of Science and Technology, Jashore 7408, Bangladesh
  • 4State Key Laboratory of Sustainable Dryland Agriculture, Institute of Wheat Research, Shanxi Agricultural University, Linfen 041000, People’s Republic of China
+ Affiliations - Affiliations

*Corresponding Author

Md. Lutful Kabir, kabir.bmb@bsmrstu.edu.bd;

Md. Mostofa Uddin Helal, mostofa.agr12@gmail.com

Received Date: February 14, 2023

Accepted Date: March 19, 2023

Abstract

Background: An indispensable member of the Rho family, RhoB is an isoprenylated small GTPases that modulate the cellular cytoskeletal organization. While DNA gets damaged, it takes part in the neoplastic apoptotic mechanism. In this study, we evaluated the structure of Rho-related GTP-binding protein RhoB due to the unavailability of 3D structure in the protein data bank database.

Results: The expected pI value of RhoB was 5.10 (acidic). The target–template alignment was computed using the GMQE value meanwhile 6hxu.1.A from Homo sapiens was selected as the template structure. The Swiss model was exploited to complete the model construction task. The structural compatibility and stability were revealed after a 100ns molecular dynamics simulation using GROMACA employing the OPLS-AA force field. Based on their fluctuating activity and their location between 100 and 110 and 140 and 150, PCA analysis discovered relevant residues.

Conclusion: By providing an insight into the biophysical phenomenon of Rho-related GTP-binding protein RhoB inhibitors, this study will assist future investigations addressing the relationship between gene mutation and abnormalities produced by protein Rho-related GTPbinding protein RhoB in apoptotic events.

Keywords

Homology modeling, Molecular dynamics simulation, PCA, RhoB

Introduction

Rho is a small monomeric GTP binding protein that belongs to the Ras superfamily weighing about ~21 kDa. It is characterized as “binary switches” in a diverse signaling pathway [1]. There are 3 subtypes of the Rho family including RhoA, RhoB, and RhoC proteins that are 85% homogenous in their amino acid sequence identity, yet they each play distinctive biological roles in cell migration, and wound healing/fibroelastosis or immune surveillance, cell division [2,3]. Approximately, 20 signaling intracellular molecules are constituted by this family in humans and the dysregulation of their function has been linked to different human pathologies [4-8]. The cycle of RhoB continues through binding and non-binding of GTP/GDP which is similar to the other Rho GTPases. When GTP is in a bound state RhoB is considered an active state whereas a GDP-bound state is considered an inactive state [1,9,10].

RhoB can play both positive and negative regulatory roles in different physiology and pathophysiology. It remains at low steady-state levels in normal cell conditions and abnormal cells responding to several stimuli, including UV irradiation, growth factors, cytokines, and phosphorylation during the cell cycle it can be rapidly and transiently upregulated [11,12]. Because of their involvement in controlling cell migration and proliferation, Rho GTPases, especially RhoB, have been widely researched but least understood and still a mystery for their role in cancer progression [8,13,14]. RhoB regulates DNA damage responses, apoptosis, cell cycle progression, migration, and invasion, all of which contribute to cancer progression. During the progression of certain cancers, RhoB rates decrease. RhoB deficiency has been linked to increased cell proliferation, invasion, and metastasis [15].

RhoB is thought to be the most varied protein of the Rho subgroup, with multiple distinct post-translational modifications from RhoA and RhoC, possibly contributing to its bipolar role in cancer [8]. RhoB's prominent involvement in vascular and endothelial cells likely leads to involvement in cancer vasculature, which seems to have consequences regarding angiogenesis [16-19]. The disparity in these findings has sparked controversy about whether RhoB promotes or inhibits tumor growth. These contradictory findings show that RhoB's roles in cancer are highly contextual and cell-type specific. In the context of tumorigenesis vs tumor development and severity, numerous findings implied that RhoB performs two separate and opposing functions [3,20-23].

Hence, the RhoB protein (mouse model) grabbed our attention to continue our findings. Unlike other organisms, mice (Mus musculus) model has become a potent prop over many years in oncology research and our prior investigation of literature searching showed that homology modeling of the RhoB in mouse protein had not yet been investigated and well understood regarding the tumor and cancers. As it has mimicked human structure or physiology to know the RhoB protein expression using the mouse model is favorable. Considering the ethics of producing pharmaceuticals and medical treatments to treat diseases and illnesses, animals particularly mice are anticipated in scientific studies. Scientists may be able to dig up more prominent and similar medications, and treatments utilizing non-animal research approaches. If a new therapy appears to be promising, it is next tested in the human or human environment to see if it is both safe and effective. Human volunteers are asked to participate in a clinical trial if the outcomes of the animal studies are coming out positive. Animal studies are conducted initially to provide medical researchers with a better understanding of the benefits and drawbacks they might expect to see in people.

Since X-ray crystallography is considered the most trustable structure and its related study is time-consuming, costly, and tiresome which needed extensive effort and equipment; on the other hand, in silico study gives some opportunities within a short period considering a few disadvantages. Therefore, to characterize the structural and functional properties of Rho-related GTP-binding protein RhoB and it is encoded by the gene RhoB (Uniprot ID: P62746) consists of 196 amino acid long sequences to investigate its activity. Its subcellular location is the late endosome membrane, Cell membrane, Nucleus. For this purpose, we retrieved desired protein from Universal Protein Resource Knowledge Base (UniprotKB) [24] as a FASTA file format and due to the unavailability of a 3-D structure in the Research Collaboratory for Structural Bioinformatics (RCSB) protein data bank (PDB) [25].

We analyzed the primary structure by ExPASy’s ProtParam tool [26] which showed its physicochemical features including molecular weight, GRAVY, theoretical pI, instability index (II), etc. For predicting the secondary structure elements SOPMA server was implied. Then the 3-D structure by homology modelling from SWISS-MODEL [27] relying on a central script-based software platform, ProMod3 [28], and this structure was qualified and analyzed from the Ramachandran plot by PROCHECK [29] program from SAVES v6.0, for identifying the compatibility of the atomic model (3D) with its amino acid sequences by VERIFY-3D. The ERRAT tool has been used to understand the non-bond interaction between each protein. Besides, we used ProSA-web [30] program for validating and checking the native protein folding energy of the model which was compared with the energy (potential mean force). Afterward, to visualize the similarity between the template and model we used UCSF Chimera [31]. In addition, the last part of this computational investigation was accomplished by performing 100 ns (100,000 ps) Molecular Dynamics (MD) simulation in GROMACS platform with OPLS AA force field [32] and flexible water model spc216 that generated RMSD value, RMSF value and Radius of gyration (Rg) to identify the compactness of the model and reliability. Moreover, to visualize and identify the PC that grabs the attention in terms of analyzing the characteristics of the protein residue. Furthermore, SASA gave information regarding the accessible surface area of the protein. Hierarchical clustering of the RMSD gave the idea of the clusters that we assumed as PC and demonstrated in the time of 100,000ps through the graphical presentation. To accomplished these tasks, we used the Bio3D function mktrj.pca.

Materials and Methods

Data retrieval

We retrieved our desired protein RhoB FASTA file from the UniProtKB database to understand the amino acid level of the protein identified in the UniProtKB ID P62746 consist of 196 amino acid long sequence and due to the unavailability of 3-D structure in the Research Collaboratory for Structural Bioinformatics (RCSB) protein data bank (PDB), this uncharacterized protein was taken up for predicting the 3-D structure by homology modelling approach as it is challenging and time-consuming to obtain experimental structures from X-ray crystallography or protein NMR method for every query protein. Proteins may fold in varied manners depending on energy conformation, steric factors, temperature, pH, concentration, etc. Hence, structure prediction is crucial to learning about the structural and functional characteristics of an uncharacterized protein.

Primary sequence analysis

The folding and intramolecular binding of the linear amino acid chain, which essentially defines the protein's distinctive 3-D form, is driven by the primary structure of a protein from its amino acid sequence. The function of a protein is determined by its shape. The three-dimensional structure of a protein can be altered due to an alteration in the structure of the amino acid sequence, that protein becomes denatured and fails to fulfill its function as intended. ExPASy’s ProtParam tool; is a web-based server that computationally helps to determine the characteristic of a query protein. Physiochemical properties, include amino acid composition, hydropathicity, Molecular weight, number of atoms, Grade average of hydropathicity (GRAVY), Extinction coefficients, Theoretical pI, and Aliphatic index, which can give a clear definition and scenario of a protein. ExPASy’s ProtParam is a reliable tool which provides a clear understanding using its parameter about the physicochemical properties of Rho-related GTP-binding protein RhoB. Besides, the missing residue is also considered a significant provision to knowing about a protein. PSIPRED 4.0 and DISOPRED3 [33] program.

Secondary structure analysis

The secondary structure evaluation of the query protein has been carried out with the help of the Self Optimized Prediction Method with Alignment (SOPMA) [34]. SOPMA is a tool that can predict the secondary structure of a query protein depending on the primary sequence of a protein, using Window width: 17, Similarity thresholds: 8, Number of states: 4; parameters. According to this methodology, a short homologous sequence of amino acids tends to form a similar secondary structure which shows whether it lies in a helix, strand, or coil.

Homology modeling

Homology modelling harnesses a sophisticated method to elucidate the experimental data with high accuracy in the computational biology, pharmacological, and biomedical research fields [35]. Since the experimentally solved structure of Rho-related GTP-binding protein RhoB is not available in (RCSB) PDB, to elucidate in-silico biophysical characteristics, the 3-D structure has been predicted by using homology modelling became utmost. The structure prediction of our protein of interest was accomplished by the SWISS-MODEL [27] program.

Validation and quality check

The quality of the model was validated to examine its credibility and consistency with a few tools. PROCHECK analysis quantifies the available zone of residues presented by checking the Ramachandran plot [36]. Along with, the VERIFY-3D [37] program to identify the compatibility of the atomic model (3D) with its amino acid sequence (1D). All of that analysis was brought out based on Structure Analysis and Verification Server (SAVES) v6.0. The overall quality factor of the query protein was being checked with the non-bond interaction between each protein, ERRAT [38] tools had been used, and subsequently, PorSA-web [39] program was engrossed in validating and refining to check the native protein folding energy of the model which compared with the energy (potential mean force) derived from a long set of protein structures. Superimposition was checked by UCSF Chimera [31] applied to visualize the similarity the template and model.

Molecular dynamics simulation

All-atom molecular dynamics (MD) simulations are a powerful tool for investigating biomolecular motion on the pico- to nanosecond time scale. While visualizing an MD trajectory provides qualitative insights into the system, quantitative data that describes or supports these insights are often difficult to obtain [40]. So, It is necessary to know the nutshell of the stability, endurance, and dynamics of the modelled protein MD simulations were led using OPLS-AA force field and flexible water model in GROMACS 2021.2 [41] package continued in Linux OS (Ubuntu20.04 distribution). At that time all atoms of 6hxu.1.A protein were enclosed by a water box called the genion of spc216 water model. When it was being solvated, the system was neutralized with 4Na+ ions. That solvated system then minimized the energy through 5000 steps of the steepest invasion to eradicate the van der Waals contacts initiating a favourable structure for MD simulation. Using the LINCS algorithm [42], all bonds were constrained. Furthermore, the electrostatic bond interactions were elucidated by the particle mesh Ewald (PME) algorithm [43].

Essential dynamics analysis

Principal component analysis (PCA) was applied to study conformational flexibility by employing a basic statistical technique called PCA to understand the collective motions of Rho-B. PCA is the process of computing the principal components and using them to change the basis of data, often simply using the first few and disregarding the rest. [40]. PCA is a method for detecting and identifying patterns in high-dimensional data and displaying them graphically to highlight their similarities and differences. The Bio3D function mktrj.pca was used to do the PC analysis [44]. Visual inspection of interpolated atomic displacement trajectories along with the first three PCs. Least-squares fitting to the first frame reduced the trajectory's total translational and rotational motions. Using Cartesian coordinates of C atoms, a 3 N 3 N covariance matrix was created. The covariance matrix was diagonalized, resulting in 3 N eigenvectors, each with its eigenvalue. The covariance matrix is specified in terms of deviations from the averaged coordinates of the trajectory, and the RMSF plot can be shown using eigenvectors. When computing principal component scores, it is preferable to decrease their number to an independent set, whereas eigenvectors are the weights in a linear transformation. The amount of variance explained by each primary component or factor is expressed in eigenvalues. The number of components recovered is equal to the number of observed variables in the analysis, and each Principal Component has the same amount of variance. The first principal component identified accounts for most of the variance in the data. The second component identified accounts for the second-largest amount of variance in the data and is uncorrelated with the first principal component and so on [40] For PC analysis, we employed a 10ns trajectory with 1000 frames. We have followed the PCA analysis protocol described on by Paul et al. for this approach [45]. To reveal concerted motions, the trajectory was projected onto a certain eigenvector. K-means and hierarchical clustering methods were used to cluster the trajectory in the PC space. By reducing the mean squared distance between each observation and its nearest cluster center, k-means splits the observations into k clusters. The trajectory can be projected on eigenvectors to give the principal components pi(t) [44,46]:

pit=RT M12 xi t-[xi]) …….. …………… (1)

System details

The Primary structure from ExPASy’s ProtParam tool, the Disopred plot from PSIPRED 4.0; DISOPRED3 program, the Secondary structure from SOPMA, Clustal Omega [47] for Multiple sequence alignment 3-D structure from SWISS-MODEL through ProMod3 3.2.0 web-server and structure validation analysis using PROCHECK v.3.5.4 [29], [48], Verify 3-D, ERRAT and PorSA-web tool from SAVES v6.0 [38], were carried out in online. Superimpose structure visualization in UCSF Chimera was been performed on Hp ProBook 645 G4; AMD RAYZEN 7pro 2700U. Molecular dynamics simulation was carried out in Science Outreach Server’s GROMACS program via SSH login mode on the local windows subsystem for the Linux command line interface. The Bio3D function mktrj.pca for calculating PC, SASA, and cross-validation to earn the knowledge to earn the total surface area, hydrophobicity index, and residue x residue index [44,46].

Result and Discussion

Primary sequence analysis

The amino acid sequence of Rho-related GTP-binding protein RhoB was retrieved from the UniprotKB database in FASTA format which was exploited for computing physicochemical properties from ExPASy‘s ProtParam tool using its parameter and the subsequent retrieval of Molecular weight, theoretical, Isoelectric Point (pI), GRAVY, and aliphatic Index of query protein was were calculated that are tabulated in Table 1. The estimated molecular weight of the questioned protein was 22123.39 Da. The stability of a protein is measured in a test tube depending on its instability index [49]. The instability index of a protein and its stability isn’t complimentary, if a protein shows an instability index of more than 40, it can’t be able to stable in an in vivo method of more than 5 hrs but a protein with less than 40 shows its stability approximately 16 hrs [50]. Table 1 showed that the instability index of our query protein is 46.35 [51] which classified it as unstable but merging with the threshold line which indicates the probability of a significant level of stability. The theoretical pI of this query protein is 5.10 which indicates its acidic nature whereas the pI of a protein less than 7 is supposed to be acid and shows its acidic nature [49]. A statistical analysis experimented upon the proteins of thermophilic bacteria that showed a higher aliphatic index than that of ordinary proteins. The aliphatic index was defined as the relative volume of a protein occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine). This index may be regarded as a positive factor for the increase of the thermostability of globular proteins. An aliphatic index of 87.96 demonstrated that protein had a positive factor in the increase of thermostability [52]. The enumerated GRAVY was -0.60 as shown in Table 1 conveying its hydrophobicity nature shows the possibility of its better interaction with water [53,54]. Molecular weight and theoretical pI are calculated as in Compute pI/Mw while, the amino acid and atomic compositions are self-explanatory. It is possible to estimate the molar extinction coefficient of a protein from knowledge of its amino acid composition. From the molar extinction coefficient of tyrosine, tryptophan, and cystine (cysteine does not absorb appreciably at wavelengths >260 nm, while cystine does) at a given wavelength, the extinction coefficient of a denatured protein can be computed. Two tables are produced by ProtParam, the first one showing the computed values based on the assumption that all cysteine residues appear as half cystines, and the second one assuming that no cysteine appears as half cystine.

The extinction coefficient is calculated using the equation:

 E(Prot) = Numb (Tyr) × Ext (Tyr) + Numb (Trp) × Ext (Trp) + Numb (Cystine) × Ext (Cystine)

 The absorbance (optical density) can be calculated using the following formula:

Absorb (Prot) = E(Prot)/Molecular_weight

The conditions at which these equations are valid are: pH 6.5, 6.0 M guanidium hydrochloride, 0.02 M phosphate buffer.

ProtParam estimates the half-life by looking at the N-terminal amino acid of the sequence under investigation. The N-terminal residue plays a major role in the process of ubiquitin-mediated proteolytic degradation. The authors have, by site-directed mutagenesis, created beta-galactosidase proteins with different N-terminal amino acids. The β-gal proteins thus designed have strikingly different half-lives in vivo, from more than 100 h to less than 2 min, depending on the nature of the amino acid at the amino terminus and on the experimental model (yeast in vivo; mammalian reticulocytes in vitro, E. coli in vivo). The set of individual amino acids can thus be ordered with respect to the half-lives that they confer when present at the amino terminus of a protein (this is called the “N-end rule”).

to compute an instability index (II), which is defined as:

i = L-1

 II = (10/L) × Sum DIWV(x[i]x[i+1])

 i=1

where: L is the length of sequence

 DIWV(x[i]x[i+1]) is the instability weight value for the dipeptide starting in position i.

The aliphatic index of a protein is calculated according to the following formula:

Aliphatic index = X(Ala) + a × X(Val) + b × [X(Ile) + X(Leu)]

where X(Ala), X(Val), X(Ile), and X(Leu) are mole percent (100 × mole fraction) of alanine, valine, isoleucine, and leucine. The coefficients a and b are the relative volume of valine side chain (a = 2.9) and of Leu/Ile side chains (b = 3.9) to the side chain of alanine.

The grand average of hydropathy (GRAVY) value for a peptide or protein is calculated as the sum of hydropathy (7) values of all the amino acids, divided by the number of residues in the sequence [55].

Table 1. Physicochemical properties of RhoB protein computed using EXPASY’s Protparam tool.

Properties

Number

Number of Amino Acids

196

Molecular weight

22123.39

Theoretical pI

5.10

Total Number of atoms

3,103

Aliphatic index

87.96

Total number of negatively charged residues (Asp + Glu)

32

Total number of positively charged residues (Arg + Lys)

26

Extinction Coefficient; Assuming all pairs of Cys residues form cystines (M-1 Cm-1)

21,930

Extinction Coefficient; Assuming all Cys residues are reduced (M-1 Cm-1)

21,430

Instability Index (II)

46.35

Grade average of hydropathicity (GRAVY)

-0.60

Half-life; mammalian reticulocytes, in vitro (hours)

30

From Figure 1A, among all the amino acids, Valine (Val) residue’s participation was higher around 25% than Aspartic acid (Asp) and Glutamic acid (Glu) contributed more than 15% respectively. The configuration of a protein is generally derived from X-ray diffraction data. If the experimentalists are unable to solve the structure of that area by using X-ray diffraction data, those regions will mark as missing residues in the PDB file. The long-chain amino acid sequence contains more than thirty amino acids, around 1/5th of a typical eukaryote proteome, and 1/3rd of eukaryotic proteins comprise the nearby disordered region and they comprehend a few important regulatory processes which were suggested by conservative estimates of the disorder frequencies in complete genomes [56].

Secondary structure analysis and model building

The secondary structure of query protein was predicted from SOPMA. The success rate in the prediction of the secondary structure of proteins of 69.5% accuracy of amino acids for a three dimensional-state. Description of the secondary structure contains α-helix, β-sheet, and coil in a whole protein [57]. SOPMA prediction uses the parameters of window width 17, the number of states 4, and similarity threshold 8 that were used for the secondary structure prediction. According to Figure 1B, secondary structure prediction by SOPMA shows each amino acid’s contribution in each position to the secondary structure 34.18% of residues as a random coil(cc) compare to 78% to Alpha helix, extended strand (Ee) 19.9% and β turn (Tt) is 6.12% which estimated it as α helix dominated protein.

Multiple sequence analysis (MSA) can provide inferred and phylogenetic analysis which can be elucidated to assess the sequences' shared evolutionary origins. The multiple sequence alignments can predict the sequences are homologous, they descend from a common ancestor. The algorithms will try to align homologous positions or regions with the same structure or function. Clustal Omega [58] uses seeded guide trees and Hidden Markov Model (HMM) profile-profile techniques to generate alignments. Furthermore, The retrieval of a homologous sequence to construct an alignment via Basic Local Alignment Tool (BLAST) Programs [59,60]. The sequence alignment Figure 1C target-template; with 100% identity; indicated the amino acid residues from target and template sequences shared identical amino acid regions and it imposed perfectly with our query protein that will show the physicochemical properties as like the template protein. Secondary structure information and feature data helped to make an accurate alignment and it served as the model of the protein domain. In Figure 1D white color represents disordered residue and there are no existing disordered amino acid residues according to DISOPRED prediction.

Figure 1: Primary sequence, secondary structure elements, disorder residues, and multiple sequence analysis. (A) Among all the amino acids, Valine (Val) residue’s participation shows the higher value that reaches around 25% consequently Aspartic acid (Asp) and Glutamic acid (Glu) shares more than 15% of residues respectively. (B) Contribution of secondary structures are showed in percentages, 34.18% of residues as a random coil (cc) compare to 78% to Alpha helix, extended strand (Ee) 19.9% and β turn (Tt) is 6.12% which estimated it as α helix dominated protein. (C) Multiple sequence alignment (MSA) analysis performed between query protein sequence and template protein sequence, 6HXU.1. A by Clustal Omega tool which possesses the 100% identity between two sequences. (D) Disordered residues of the protein have been depicted here while white color box represents disordered residue and there are no existing disordered amino acid residues according to DISOPRED prediction.

Homology modelling is one of the premise processes to predict the 3-D structure of an unknown protein also known as comparative modelling of protein. A central script-based software platform, ProMod3 [61] has been used to generate the model through the SWISS-MODEL web server [27]. It was performed under a few steps, firstly template structure selection before submission of query protein sequence in FASTA format. The top-ranked templates and alignments are further analysed and sorted according to the expected quality of the resulting models, as estimated by GMQE and, if the target model is predicted to be an oligomer, QSQE. The predicted protein was selected depending on the GMQE score, and identification. From the SWISS-MODEL template library, we got more than 30 templates found by the HHblits method [62] which is profiled by Hidden Markov models (HMMs) the fastest sequence search tool [63] among them 5 most favorable templates is 2fv8.1.A, 6sge.1.A, 2fv8.1.A, 6sge.1.A, 6hxu.1.A (Table 2). In detail, the default template ranking is according to the descending lexicographic order of (is_full_biounit, bin, gmqe + qs_value), where: is_full_biounit is only used for heteromers and is set to 1, if all chains from the template biounit are included for modelling, or 0 otherwise; bin is computed as ceil((gmqe - max_gmqe) / 0.1), where max_gmqe is the best gmqe observed in the templates; gmqe is the GMQE of the template; qs_value is set to QSQE of the template if the target model is predicted to be an oligomer, or 0 otherwise. In table 2, where we manipulated the 5 best temples from the SWISS-MODEL template library with descriptions as well as their GMQE, Identity (%), and coverage out of 100 [61,64].

Table 2. Template selection for modeling RhOB (found by HHblits) through EXPASY’S SWISS-MODEL tool.

Description

Organism

PDB ID

GMQE

Coverage

Identity(%)

Method

The crystal structure of RhoB in the GDP-bound state

Homo sapiens

2fv8.1.A

0.81

0.94

100

X-ray

1.90 Å

Crystal structure of Human RHOB-GTP in complex with nanobody B6

Homo sapiens

6sge.1.A

0.84

0.93

99.45

X-ray        

1.50 Å

The crystal structure of RhoB in the GDP-bound state

Homo sapiens

2fv8.1.A

0.81

0.93

100

X-ray,

1.90 Å

Crystal structure of Human RHOB-GTP in complex with nanobody B6

Homo sapiens

6sge.1.A

0.84

0.93

99.45

X-ray,

1.50 Å

Crystal structure of Human RHOB Q63L in complex with GTP

Homo sapiens

6hxu.1.A

0.85

0.93

99.45

X-ray,

1.19 Å

Considering the Global Model Quality Estimation (GMQE) is a method of quality estimation that calculates both the target–template alignment and the template structure ranging from 0 to 1 indicating the predicted accuracy of a model designed with that alignment and template, as well as the target 6hxu.1.A, coverage score, 0.93 with 99.45% identity considering GMQE value of 0.85, possesses reasonably higher score among those 5 templates. We put forward our work with 6hxu.1. A protein template from Homo sapiens. Furthermore, we proceeded with our work with a favorable GMQE score, identity, and method (X-ray, 1.19A) as an ideal protein prediction template for our purpose. Modelling a protein based on predicted template structure relies on the thread method close to the 3D structure of a stranded protein where a complete structure of a protein atom, as well as folds protein, is also being collected for a possible set of templates [65]. To Visualize the structure of qualified protein, we built the model depicted in Figure 2A from SWISS-MODEL.

For moving forward, identify the template and the model whether they were superimposed or not (Figure 2B). For this purpose, UCSF Chimera gave better visualization for observing the superimposed structure for the query protein. Template of the protein (golden color) and model (light blue color) both which were required from the SWISS-MODEL, entirely showed superimpose structure in UCSF Chimera that was seemingly favorable. The sequence of amino acid residues comprising a binding pocket or cavity site defines its physicochemical properties, which along with its form and position in a protein, describe its functionality. Protein mobility allows for the opening, closure, and adaptation of binding pockets to control binding processes and basic protein function [66-68].

Figure 2. Predicted structure of modeled protein and the superimposed structure with template, 6hxu.1.A. (A) The depicted modeled structure representing from N-terminal to C-terminal where red color dignifies Helix, green color Coil and Strands by yellow color. (B) Superimposed structure between modeled protein of RhoB and the template protein 6hxua.1.A, showed by golden color and Model by light blue color. This structure indicates that the selected template and predicted model superimposed perfectly. Visualization of this structure performed by UCSF Chimera program.

Validation and quality check

PROCHECK; Saves v6.0, an online server tool that Checked the stereochemical quality of a protein structure by analyzing residue-by-residue Figure 3A and 3B gives overall structure geometry, the same resolution and also highlighted the regions that may need further investigation. The number of residues in allowed and generously allowed regions [A, B, L] was 149; 92.0% (model) and 149; 91.4% (template) respectively, and none of the residues were present in the disallowed region of the plot (Table 3). Residues in additional allowed regions [a,b,l,p] 13 (8.0%) for the model whereas 14 (8.6%) for template. No greater than 20%, a good quality model would be expected to have over 90% in the most favored regions. The template structure 6hxu.1A has 91.4% (149/163) of all residues favored regions out of 100.0% and the model (149/162) of all residues were in allowed (92%) regions, and there were no outliers (Figure 3A and 3B).

Figure 3: Ramachandran plot, ERRAT value and Z-Score. Structure validation through Ramachandran plot analyzing (PROCHECK; Saves 6.0), (A) dignifies the modeled protein Ramachandran plot and (B) depicted the template, 6hxua.1.A. The number of residues in allowed and generously allowed region [A, B, L] 149/162; 92.0% (model) and 149/163; 91.4% (template) respectively. The quality of our structure again verified by ERRAT score (98.802) (C) for the model from and the template shows an overall quality factor of 94.737 (D); Based on the condition it can say that our template and model have less error or rejection value. Z score of the Template and Model. The score -6.54 (E) for the template -6.27 (F) analyzed by the ProSA-web program where the negative ProSA energy reflects the reliability of the model.

Table 3. Ramachandran plot statistics of model and template structures.

 

Plot Statistics

Model

Template

Region

Residues in most favored regions [A,B,L]

149 (92.0%)

149 (91.4%)

Residues in additional allowed regions [a,b,l,p]

13 (8.0%)

14 (8.6%)

Residues in generously allowed regions [~a,~b,~l,~p]

0 (0%)

0 (0%)

Residues in disallowed regions

0 (0%)

0 (0%)

Total

0 (0%)

100%

Residues

Number of non-glycine and non-proline residues

162

163

Number of end-residues (excl. Gly and Pro)

2

2

Number of glycine residues (shown as triangles)

7

7

Number of proline residues

10

10

Total number of residues

181

182

So, we assumed our selected template can probably give a well-structured prediction model. The predicted 3D structure was validated on the studied Protein 6hxu.1.A using PROCHECK program [69], which estimates phi/psi angles, and constructed Ramachandran plots. The quality of our structure again verified by the ERRAT score was 98.802 (Figure 3C) for the model and the template shows an overall quality factor of 94.737 (Figure 3D); indicating an acceptable environment for protein. Moreover, after building the model from the selected template we got a few parameters such as Z-score known as a standard score that indicates how far a data point deviates from the baseline. A protein's Z-score is defined as the energy separation between the native fold and the average of an ensemble of misfolds in units of the ensemble's standard deviation [70,71]. ProSA-web program that profiled the energy of the template and model and the Z-score value, is a parameter to measure the quality of a model as it quantifies the total energy of the structure. This Z-score value was obtained from by ProSA-web program that measured the interaction energy of each residue and applied a distance-based pair potential. The score of -6.27 for the template and -6.54 for the model (Figures 3E and 3F) was analyzed by the ProSA-web program where the negative ProSA energy reflects the reliability of the model [49]. The VERIFY- 3D tool (Figure 4) showed that the model passed the requirements and it demonstrated that most of the amino acids had positive values and resides above the value of 0.5 both for the model (Figure 4A) and template protein (Figure 4B) indicating endurance and reliability.

Figure 4. Verified 3D tool fixed the compatibility of an atomic model in 3D structure level with its own amino acid sequence in primary form. Using the VERIFY- 3D tool resulting graph shows that the model passed the requirements and it demonstrated that most of the amino acids had positive value and resides above value 0.5 both for the template and modelled protein indicating endurance and reliability.

Molecular dynamics simulation

The structure was further validated against energy minimization by MD simulation using GROMACS on Ubuntu-2021.2 applying an all-atom optimized potential for liquid systems (OPLS-AA) force field [32]. MD simulation trajectories were analyzed to obtain GROMACS energies, root square deviation (RMSD), root-mean-square fluctuation (RMSF), and the radius of gyration (Rg); values are depicted in Figure 5. The protein's potential energy was -7e106 Kj mol-1, but it reduced during the simulation to nearly -1x106 depicted in Figure 5A. Electrostatic cut-off and Vander Waals cut off value were 1.0 and 1.0 respectively. In the equilibration state NVT and NPT; NVT started to run for 50,000 steps and for 100.0 ps to restrain the position of the protein and based on the parameter of leap-frog integrator, where the initial temperature was 298.948K. NVT ( N expressed as a constant number, V volume and energy (E); and it is the sum of kinetic (KE) and potential energy (PE) which is conserved and T and P are unregulated equilibrations generated temperature (Figure 5B) showed the higher and lower temperature peak in that particular graph and the highest value was around 303K. In NPT (constant number N; Pressure P and Temperature T) equilibrium process it conducted the pressure until it came to the proper density. In Figure 5C the x-axis is considered as time in ps and y-axis as pressure showing the highest and lowest value was nearly 300 bar and -220 bar respectively whereas Figure 5D shows the density: highest peak along y-axis was around 1023 Kg m-3 and the lowest value was around 1,010 Kg m-3.

A molecular dynamics simulation was running for performing 100 nanoseconds (100,000ps). The lowest and highest RMSD value (Figure 5E) is 0.1 nm and nearly 0.36 nm accordingly. During the period of 0 to 30 ns, the trajectory raises steadily and it fluctuates in an orderly for the rest of the period till it reaches. The Radius of gyration (Rg) graph shown in Figure 5F depicts those atoms fluctuated highest at nearly 1.65 nm at around 95ps and the lowest fluctuated value was almost 1.54 nm at around 30ps. That showed atoms of our query protein did not fluctuate much and evaluated that this structure was in a compact condition in contrast the RMSD value represented the reliability of the structure. The RMS Fluctuations, of each atom echo, generated showed in Figure 5G showed that residues range between 120 and 140 (amino acid sequences: DLRSDEHVRTELARMKQEPVR) and fluctuated mostly towards 0.5 nm. The RMS fluctuation figure reveals also the C-alpha backbone variance during the simulation process. The tolerable fluctuations in the backbone proved the reliability of protein models [72].

Figure 5. 100 ns molecular dynamics (MD) simulation of RhoB protein. (A) Protein's potential energy was -7e106 Kj mol-1, but it reduced during the simulation nearly to -1x106. (B) The equilibrations generated temperature showed the higher and lower temperature peak in that particular graph and the highest value was around 303K. (C) The pressure graph showed the highest and lowest value of pressure fluctuation in the entire simulation which was nearly 300 bar and -220 bar respectively. (D) the density trajectory shows highest peak which was around 1,023 Kg m-3 and the lowest value was around 1,010 Kg m-3. (E) A molecular dynamics simulation was running for performing 100 nanoseconds (100,000ps) where the lowest and highest RMSD value is 0.1 nm and nearly 0.36 nm accordingly. (F) The Radius of gyration (Rg) graph depicts that atoms were fluctuated highest nearly 1.65 nm at around 95 ps and the lowest fluctuated value was almost 1.54 nm around 30 ps. (G) The RMS Fluctuations, of each atom echo, generated shows that residues range between 120 and 140 (amino acid sequences: DLRSDEHVRTELARMKQEPVR) fluctuated mostly towards 0.5 nm. The RMS fluctuation figure reveals also the C-alpha backbone variance during the simulation process.

Principal component analysis

A molecular dynamics simulation was run for 100 ns before the actual principal component analysis (PCA). Components that account for the most variance are kept, while those that account for a small amount of variance are removed. In molecular dynamics (MD) simulations, PCA and clustering are used to reveal substantial conformational changes [40]. Only a small part of these PCs describes a great majority of the whole atomic movement [46,73] and these motions are often essential for protein function [74].

As a result, it seemed sensible to use the PCA method to reduce the phase space of proteins for long-term molecular dynamics [75]. For this purpose, PCA is used here to identify a small number of important modes and then project the equations of motion onto the resulting low-dimensional vector space. Due to this crystal structure of RhoB's and immersion in water, the model required a significant amount of equilibration time. The system's RMSD and radius of gyration; Rg were monitored for 100 ps to establish equilibration.

In addition, both the entire 10,000 frames in the 100 ns time frame as well as for the last 1,000 frames were considered to capture principal components. The resulting data matrix was subjected to principal component analysis. Initially, we considered whole 10,000 frames of entire hundred nanoseconds of the alpha-carbon chain (181 atoms, 181 residues) to calculate the maximum pairwise RMSD for clustering (Figure 6A) and it was calculated as 4.905646 nm. RMSD hierarchical clustering demonstrates the hierarchical distribution of the total clusters by color codes ranging from red, blue, sky blue, purple, black, green, and yellow. In this hierarchical clustering, we can observe the yellow clustering which appeared also in the Pairwise PC analysis and the Cartesian coordinate indicates the location of the clustering residues in the last stage of the entire 100, 000ps of simulation. Therefore, cartesian coordinate PCA analysis (Figure 6B) reflects some extent the dominant overall motion rather than the much smaller internal motion of the modeled RhoB protein [76]. Figure 6B describes the cartesian movements of PC that are manipulated as the overall motion of the protein. Whereas, pairwise distance PCA (Figure 6C) in the distance pairwise PC analysis, we were able to predict the position of atoms over time. In this 2D graph, a color range was included and we observed in the starting they were scattered between the 1st and 4th coordinates but in the terminate state, they started to appear in the 2nd and 3rd coordinates which were marked in yellow color.

Figure 6. Principal component analysis for entire 10,00 frames. (A) RMSD Average linkage hierarchical clustering depicts the movements of atoms with respect to time from starting to termination state in a whole for 100 ns. (B) The cartesian coordinate PCA analysis reflects the dominant overall motion to some extent rather than the much smaller internal motion of the modelled RhoB protein. (C) The prediction of atom positions over time using pairwise distance PCA. (D) The fluctuation of Solvent Accessible Surface Area (SASA) during the simulation at 300 K for 100 ns.

To further understand Rho-B correlated dynamics, we considered, Solvent accessibility (SASA) is as a key feature of proteins for determining their folding and stability [77]. To compute the solvent-accessible surface area of each atom or residue in each simulation frame. In our analytical part, SASA analysis revealed the solvent-exposed area, which may decrease the solubility of the protein and can modulate protein-protein interactions [3,78-80]. Prediction of protein solubility is gaining importance with the growing use of protein molecules as therapeutics, and ongoing requirements for high-level expression [74]. In Figure 6D we can see the fluctuation of total atoms; 181 residues plotted against 10,000 frames for 100 ns (100,000 ps). The number of points representing the surface of each atom, higher values lead to more accuracy. Assuming that the points are evenly distributed, the number of points is directly proportional to the accessible surface area. From the GetArea web tool [76] we calculated the SASA and it showed a total surface of 9514.27 (Table 4).

Table 4. Calculated properties from Solvent Accessible Surface Area (SASA).

Parameters

Numbers

POLAR area/energy

3,756.87

APOLAR area/energy

5,757.40

UNKNOW area/energy

0.00

 Number of surface atoms

 797

 Number of buried atoms

 653

Total area/energy

9,514.27

However, PC analysis for the whole simulation time frame can miss relevant motions since major collective dihedral transitions do not usually correspond to major transitions in Cartesian space. Therefore, principal component (PC) analysis was performed using Cα position covariance while we considered terminal 1,000 frames and analyzed principal components using the Bio3D package of R [44,46] to look for further insight into the last 1,000 frames. The atomic coordinates of the Cα atoms (181 total atoms) were used in the analysis to reduce statistical noise and to avoid artificial apparent correlations between slow side-chain fluctuations and backbone motions. A PCA in dihedral angle space is based on internal coordinates which naturally provide a correct separation of internal and overall motion.

Figure 7A is elucidated through an average fluctuation graph of a total of 181 components and a principal component. In this graph along the x-axis the position of residues and along the y-axis the PC was subjected and the rate of fluctuation of PC was higher in the position ranging from 100-150 (Å) with blue color and the average fluctuation rate is in between (0.10). So, residues that hold the position between 100-110 and 140-150 are phenylalanine, glutamate, valine, lysine, proline, histidine, cysteine, asparagine, arginine threonine, aspartic acid, glycine, alanine, methionine, alanine, valine, and arginine that we got from our multiple sequence alignment. These residues take into account. From the colored graph (Figure 7B) we observed that clusters are occupied within all 4 coordinates and in some places, they are highly dense and on the other side, they are not much dense as previously. This dense place is considered as the cluster with the same features or influence of subspace. Variables contributing similar information are grouped, that is, they are correlated. The numerical value of the other variable often changes in the same way when the positively correlated numerical value of one variable fluctuates. When variables are negatively (“inversely”) correlated, they are positioned on opposite sides of the plot origin, in diagonally opposed quadrants [81]. Furthermore, the distance to the origin also conveys information. Further away from the plot origin a variable lies which has the stronger impact on the model. In Figure 7C, it is shown the contributions of the first PC in a structural form with blue and red color through a ribbon structure.

Figure 7. PC1 components and cross correlation analysis. (A) depicts an average fluctuation graph of all 181 components and the principal component. (B) Cross Correlation and Residue-wise loadings for PC1 (black) and PC2 (blue). Correlation analysis (Cij) of the motion during a 100 ns MD simulation of the RhoB. (C) Contributions for first PC in a structural form and from the graph we acquired the knowledge of the contribution of pc1 was around 23% and one fourth in total.

Observing the scree plot of Figure 8, by ranking eigenvectors in order of their eigenvalues, highest to lowest, we got the principal components in order of significance [74]. The first PC accounts for more than one-fourth; of 23.34% of the overall variance. The second PC accounts for 17.95%. Altogether, the first three elements account for 50.88% of the total. From Figure 8 we got pictures que information of three principal components against each other. The conformer plot of all selects is defined by PC1-PC3. Each point represents a structure and the point color indicates the cluster id from a conformational clustering. Here, we can identify 4 coordinates and components clustered together. Highly correlated cells clustered together and such PCA plots are often used to find potential clusters [74]. The extent to which the atomic fluctuations/displacements of a system are correlated with one another can be assessed by examining the magnitude of all pairwise cross-correlation coefficients.

Figure 8. Principal component analysis. For the PCA, the last 100 ns of each trajectory using the Bio3D package implemented in R. The first three eigenvectors, based on complex dominant motion, are extracted and compared. The variance captured by eigenvectors is also shown. Each point represents a structure and the point color indicates the cluster id from a conformational clustering. Projection of the trajectory onto the planes formed by the first three principal components. Conformers are colored according to the k-means clustering: PCA results for our Rho-B trajectory with instantaneous conformations (i.e. trajectory frames) colored from blue to red in order of time. The continuous color scale (from blue to whit to red) indicates that there are periodic jumps between these conformers throughout the trajectory. Below we perform a quick clustering in PC-space to further highlight these distinct conformers. In left side, PCA results for our Rho-B trajectory with instantaneous conformations (i.e. trajectory frames) colored from blue to red in order of time and in right side, Simple clustering in PC subspace.

Conclusion

In this investigation, we studied the physicochemical properties of the Rho-related GTP-binding protein RhoB. The determination of its primary sequence, gave us cognition about its physicochemical characteristics, functional activities, etc. It was acidic by nature as its predicted theoretical PI was 5.10 in the pH range. Furthermore, it was unstable because the value of the Instability index (II) was 46.35. Prediction of the secondary structure of this protein showed it was alpha helix dominating with 39.8% contribution in the construction of the secondary structure. The 3-D structure of a protein is biologically active in nature since our query protein was uncharacterized along with no prior experimental structural information listed in PDB, we predicted its 3-D structure with a computational approach. An appropriate template 6hxu.1.A was selected among more than 30 templates from SWISS-MODEL based on its GMQE, QMAEN value with 100% identity, with 149 long amino acid sequences of the query protein on the other side, Rho-related GTP-binding protein RhoB had 182 long amino acids in sequence. For validating the model from the Ramachandran plot showed around 92% of amino acids occupied the most favored region indicating its reliability. In addition, to affirm the RhoB protein’s stability and compatibility, 100 nanoseconds (100,000 ps) MD simulation was performed to know its atomic behavior, conformational space, energies, RMSD, Radiation of gyration (Rg), RMSF with energies by using spc216 water model, OPLS AA force field and leap-frog integrator. Moreover, the RMSD value was reliable, and can claim based on these values the structure is suitable for advanced analysis. As we also analyzed and visualized the motion and movements of Principle components ignoring the rest reducing the noise and surface so that we can get over from complexity. These principal components can describe the characteristics of all the residues. Along with this we also calculated the SASA from webtool and trustable software that expresses the accessibility of solvents of the protein residues. The hydrophobicity curve, residue x residue index, Cartesian coordinate of PC, and pairwise analysis of PC that was able to give use a better and valid understanding of the intramolecular characteristics of this protein residue. In conclusion, this investigation will assist further studies involving the relation of the gene mutation and abnormalities induced by protein Rho-related GTP-binding protein RhoB in the progression apoptosis and the development of RhoB inhibitors. We performed a computational study on the Rho-related GTP-binding protein RhoB and this study brought some limitations as well, for instance, homology-modelled protein can’t reveal the exact structural features as the structure solved by the X-ray crystallographic method. Statistical knowledge comes before interpreting the extrapolated events of the protein and so further investigation is needed for the exact determination of physicochemical properties of this protein yet this computational study will be helpful for the future study of its clinical purpose.

Declarations

Author contribution

SKP, MLK, and MMUH designed the outlines. SKP, CLNM, and SKS analyzed and interpreted the data. SKP, MS, and BP performed the molecular dynamics simulation. MLK provided scientific guidance. SKP, CLNM, SKS, MS, and BP wrote the manuscript. All authors read and approved the final submitted version of the manuscript.

Acknowledgement

High-performance computing support for this research was provided by Science Outreach Servers.

Funding information

This project is partially funded by the grant from the Ministry of Science and Technology, People’s Republic of Bangladesh (Fund receiver: Md Lutful Kabir).

Data availability statement

Data will be made available on reasonable request.

Declaration of interest’s statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this project.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

References

1. Parri M, Chiarugi P. Rac and Rho GTPases in cancer cell motility control. Cell Communication and Signaling. 2010 Dec;8:1-4.

2. Prendergast GC. Actin'up: RhoB in cancer and apoptosis. Nature Reviews Cancer. 2001 Nov 1;1(2):162-8.

3. Vega FM, Ridley AJ. The RhoB small GTPase in physiology and disease. Small GTPases. 2018 Sep 3;9(5):384-93.

4. Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, et al. The GeneCards suite: from gene data mining to disease genome sequence analyses. Current Protocols in Bioinformatics. 2016 Jun;54(1):1-30.

5. Kroon J, Tol S, van Amstel S, Elias JA, Fernandez-Borja M. The small GTPase RhoB regulates TNFα signaling in endothelial cells. PLoS One. 2013 Sep 26;8(9):e75031.

6. Kersey PJ, Allen JE, Armean I, Boddu S, Bolt BJ, Carvalho-Silva D, et al. Ensembl Genomes 2016: more genomes, more complexity. Nucleic Acids Research. 2016 Jan 4;44(D1):D574-80.

7. Huang M, DuHadaway JB, Prendergast GC, Laury-Kleintop LD. RhoB regulates PDGFR-β trafficking and signaling in vascular smooth muscle cells. Arteriosclerosis, Thrombosis, and Vascular Biology. 2007 Dec 1;27(12):2597-605.

8. Ju JA, Gilkes DM. RhoB: team oncogene or team tumor suppressor?. Genes. 2018 Jan 30;9(2):67.

9. Saito K, Ozawa Y, Hibino K, Ohta Y. FilGAP, a Rho/Rho-associated protein kinase–regulated GTPase-activating protein for Rac, controls tumor cell migration. Molecular Biology of the Cell. 2012 Dec 15;23(24):4739-50.

10. Haga RB, Ridley AJ. Rho GTPases: Regulation and roles in cancer cell biology. Small GTPases. 2016 Oct 1;7(4):207-21.

11. Liu M, Tang Q, Qiu M, Lang N, Li M, Zheng Y, et al. miR-21 targets the tumor suppressor RhoB and regulates proliferation, invasion and apoptosis in colorectal cancer cells. FEBS Letters. 2011 Oct 3;585(19):2998-3005.

12. Tan Y, Yin H, Zhang H, Fang J, Zheng W, Li D, et al. Sp1-driven up-regulation of miR-19a decreases RHOB and promotes pancreatic cancer. Oncotarget. 2015 Jul 7;6(19):17391-403.

13. Ueyama T. Rho-family small GTPases: from highly polarized sensory neurons to cancer cells. Cells. 2019 Jan 28;8(2):92.

14. Tian Z, Wong W, Wu Q, Zhou J, Yan K, Chen J, et al. Elevated Expressions of BTN3A1 and RhoB in Psoriasis Vulgaris Lesions by an Immunohistochemical Study. Applied Immunohistochemistry & Molecular Morphology. 2022 Feb 1;30(2):119-25.

15. Ridley AJ. RhoA, RhoB and RhoC have different roles in cancer cell migration. Journal of Microscopy. 2013 Sep;251(3):242-9.

16. Hancock JF, Cadwallader K, Paterson H, Marshall CJ. A CAAX or a CAAL motif and a second signal are sufficient for plasma membrane targeting of ras proteins. The EMBO Journal. 1991 Dec;10(13):4033-9.

17. Pittayapruek P, Meephansan J, Prapapan O, Komine M, Ohtsuki M. Role of matrix metalloproteinases in photoaging and photocarcinogenesis. International Journal of Molecular Sciences. 2016 Jun 2;17(6):868.

18. Schimmel L, de Ligt A, Tol S, de Waard V, van Buul JD. Endothelial RhoB and RhoC are dispensable for leukocyte diapedesis and for maintaining vascular integrity during diapedesis. Small GTPases. 2020 May 3;11(3):225-32.

19. Wojciak-Stothard B, Zhao L, Oliver E, Dubois O, Wu Y, Kardassis D, et al. Role of RhoB in the regulation of pulmonary endothelial and smooth muscle cell responses to hypoxia. Circulation Research. 2012 May 25;110(11):1423-34.

20. Adini I, Rabinovitz I, Sun JF, Prendergast GC, Benjamin LE. RhoB controls Akt trafficking and stage-specific survival of endothelial cells during vascular development. Genes & Development. 2003 Nov 1;17(21):2721-32.

21. Chen Z, Sun J, Pradines A, Favre G, Adnane J, Sebti SM. Both farnesylated and geranylgeranylated RhoB inhibit malignant transformation and suppress human tumor growth in nude mice. Journal of Biological Chemistry. 2000 Jun 16;275(24):17974-8.

22. Meyer N, Peyret-Lacombe A, Canguilhem B, Médale-Giamarchi C, Mamouni K, Cristini A, et al. RhoB promotes cancer initiation by protecting keratinocytes from UVB-induced apoptosis but limits tumor aggressiveness. Journal of Investigative Dermatology. 2014 Jan 1;134(1):203-12.

23. Kazerounian S, Gerald D, Huang M, Chin YR, Udayakumar D, Zheng N, et al. RhoB differentially controls Akt function in tumor cells and stromal endothelial cells during breast tumorigenesis. Cancer Reserach. 2013 Jan;73(1):50-61.

24. UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Research. 2021 Jan 8;49(D1): D480-D489.

25. wwPDB consortium. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Research. 2019 Jan 8;47(D1):D520-D528.

26. Duvaud S, Gabella C, Lisacek F, Stockinger H, Ioannidis V, Durinx C. Expasy, the Swiss Bioinformatics Resource Portal, as designed by its users. Nucleic Acids Research. 2021 Jul 2;49(W1):W216-27.

27. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research. 2018 Jul 2;46(W1):W296-303.

28. Studer G, Tauriello G, Bienert S, Biasini M, Johner N, Schwede T. ProMod3—A versatile homology modelling toolbox. PLoS Computational Biology. 2021 Jan 28;17(1):e1008667.

29. Laskowski RA, Jabłońska J, Pravda L, Vařeková RS, Thornton JM. PDBsum: Structural summaries of PDB entries. Protein Science. 2018 Jan;27(1):129-34.

30. Wiederstein M, Sippl MJ. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Research. 2007 Jul 1;35(suppl_2):W407-10.

31. Pettersen EF, Goddard TD, Huang CC, Meng EC, Couch GS, Croll TI, et al. UCSF ChimeraX: Structure visualization for researchers, educators, and developers. Protein Science. 2021 Jan;30(1):70-82.

32. Siu SW, Pluhackova K, Böckmann RA. Optimization of the OPLS-AA force field for long hydrocarbons. Journal of Chemical Theory and Computation. 2012 Apr 10;8(4):1459-70.

33. Jones DT, Cozzetto D. DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics. 2015 Mar 15;31(6):857-63.

34. Combet C, Blanchet C, Geourjon C, Deleage G. NPS@: network protein sequence analysis. Trends in Biochemical Sciences. 2000 Mar 1;25(3):147-50.

35. Barazorda-Ccahuana HL, Valencia DE, Aguilar-Pineda JA, Gómez B. Art v 4 protein structure as a representative template for allergen profilins: Homology modeling and molecular dynamics. ACS Omega. 2018 Dec 13;3(12):17254-60.

36. Parchaňský V, Kapitán J, Kaminský J, Sebestik J, Bouř P. Ramachandran plot for alanine dipeptide as determined from Raman optical activity. The Journal of Physical Chemistry Letters. 2013 Aug 15;4(16):2763-8.

37. Eisenberg D, Lüthy R, Bowie JU. VERIFY3D: assessment of protein models with three-dimensional profiles. Methods in Enzymology. 1997 Jan 1;277:396-404.

38. Colovos C, Yeates TO. Verification of protein structures: patterns of nonbonded atomic interactions. Protein Science. 1993 Sep;2(9):1511-9.

39. Sippl MJ. Recognition of errors in three-dimensional structures of proteins. Proteins: Structure, Function, and Bioinformatics. 1993 Dec;17(4):355-62.

40. Wolf A, Kirschner KN. Principal component and clustering analysis on molecular dynamics data of the ribosomal L11· 23S subdomain. Journal of Molecular Modeling. 2013 Feb;19:539-49.

41. Abraham MJ, Murtola T, Schulz R, Páll S, Smith JC, Hess B, et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX. 2015 Sep 1;1:19-25.

42. Hess B, Bekker H, Berendsen HJ, Fraaije JG. LINCS: A linear constraint solver for molecular simulations. Journal of Computational Chemistry. 1997 Sep;18(12):1463-72.

43. Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth particle mesh Ewald method. The Journal of Chemical Physics. 1995 Nov 15;103(19):8577-93.

44. Skjærven L, Yao XQ, Scarabelli G, Grant BJ. Integrating protein structural dynamics and evolutionary analysis with Bio3D. BMC Bioinformatics. 2014 Dec;15(1):399.

45. Paul SK, Saddam M, Rahaman KA, Choi JG, Lee SS, Hasan M. Molecular modeling, molecular dynamics simulation, and essential dynamics analysis of grancalcin: An upregulated biomarker in experimental autoimmune encephalomyelitis mice. Heliyon. 2022 Oct 1;8(10):e11232.

46. Grant BJ, Skjærven L, Yao XQ. The Bio3D packages for structural bioinformatics. Protein Science. 2021 Jan;30(1):20-30.

47. Sievers F, Higgins DG. Clustal omega. Current Protocols in Bioinformatics. 2014 Dec;48(1):3-13.

48. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. Journal of Applied Crystallography. 1993 Apr 1;26(2):283-91.

49. Dehury B, Sahu M, Sarma K, Sahu J, Sen P, Modi MK, et al. Molecular phylogeny, homology modeling, and molecular dynamics simulation of race-specific bacterial blight disease resistance protein (xa 5) of rice: a comparative agriproteomics approach. Omics: a Journal of Integrative Biology. 2013 Aug 1;17(8):423-38.

50. Gamage DG, Gunaratne A, Periyannan GR, Russell TG. Applicability of instability index for in vitro protein stability prediction. Protein and Peptide Letters. 2019 May 1;26(5):339-47.

51. Satyanarayana SD, Krishna MS, Kumar PP, Jeereddy S. In silico structural homology modeling of nif A protein of rhizobial strains in selective legume plants. Journal of Genetic Engineering and Biotechnology. 2018 Dec 1;16(2):731-7.

52. Ikai A. Thermostability and aliphatic index of globular proteins. The Journal of Biochemistry. 1980 Oct;88(6):1895-8.

53. Bellingham J, Foster RG. Opsins and mammalian photoentrainment. Cell and Tissue Research. 2002 Jul;309:57-71.

54. Delahaije RJ, Wierenga PA. Hydrophobicity Enhances the Formation of Protein-Stabilized Foams. Molecules. 2022 Apr 6;27(7):2358.

55. Gasteiger E, Hoogland C, Gattiker A, Duvaud SE, Wilkins MR, Appel RD, et al. Protein identification and analysis tools on the ExPASy server. In: Walker JM. (eds) The Proteomics Protocols Handbook. Springer Protocols Handbooks. Humana Press. 2005; pp. 571-607.

56. Mei Y, Su M, Soni G, Salem S, Colbert CL, Sinha SC. Intrinsically disordered regions in autophagy proteins. Proteins: Structure, Function, and Bioinformatics. 2014 Apr;82(4):565-78.

57. Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics. 2019 Dec;20(1):473.

58. Sievers F, Higgins DG. Clustal Omega for making accurate alignments of many protein sequences. Protein Science. 2018 Jan;27(1):135-45.

59. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. Journal of Molecular Biology. 1990 Oct 5;215(3):403-10.

60. O’Driscoll A, Belogrudov V, Carroll J, Kropp K, Walsh P, Ghazal P, et al. HBLAST: Parallelised sequence similarity–A Hadoop MapReducable basic local alignment search tool. Journal of Biomedical Informatics. 2015 Apr 1;54:58-64.

61. Studer G, Tauriello G, Bienert S, Biasini M, Johner N, Schwede T. ProMod3—A versatile homology modelling toolbox. PLoS Computational Biology. 2021 Jan 28;17(1):e1008667.

62. Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics. 2019 Dec;20(1):1-5.

63. Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods. 2012 Feb;9(2):173-5.

64. Bertoni M, Kiefer F, Biasini M, Bordoli L, Schwede T. Modeling protein quaternary structure of homo-and hetero-oligomers beyond binary interactions by homology. Scientific Reports. 2017 Sep 5;7(1):1-15.

65. Deng H, Jia Y, Zhang Y. Protein structure prediction. International Journal of Modern Physics. B. 2018 Jul 20;32(18):1840009.

66. Amaral M, Kokh DB, Bomke J, Wegener A, Buchstaller HP, Eggenweiler HM, et al. Protein conformational flexibility modulates kinetics and thermodynamics of drug binding. Nature Communications. 2017 Dec 22;8(1):2276.

67. Stank A, Kokh DB, Fuller JC, Wade RC. Protein binding pocket dynamics. Accounts of Chemical Research. 2016 May 17;49(5):809-15.

68. de Boer M, Gouridis G, Vietrov R, Begg SL, Schuurman-Wolters GK, Husada F, et al. Conformational and dynamic plasticity in substrate-binding proteins underlies selective transport in ABC importers. Elife. 2019 Mar 22;8:e44652.

69. Jehangir M, Ahmad SF. Structural studies of aspartic endopeptidase pep2 from neosartorya fisherica using homolgy modeling techniques. International Journal on Bioinformatics & Biosciences. 2013;3(1):7-20.

70. Zhang LI, Skolnick J. What should the Z-score of native protein structures be?. Protein Science. 1998 May;7(5):1201-7.

71. Sobolev OV, Afonine PV, Moriarty NW, Hekkelman ML, Joosten RP, Perrakis A, et al. A global Ramachandran score identifies protein structures with unlikely stereochemistry. Structure. 2020 Nov 3;28(11):1249-58.

72. Condic-Jurkic K, Subramanian N, Mark AE, O’Mara ML. The reliability of molecular dynamics simulations of the multidrug transporter P-glycoprotein in a membrane environment. PLoS One. 2018 Jan 25;13(1):e0191882.

73. Makarewicz T, Kaźmierkiewicz R. Improvements in GROMACS plugin for PyMOL including implicit solvent simulations and displaying results of PCA analysis. Journal of Molecular Modeling. 2016 May;22:109.

74. Hayward S, De Groot BL. Normal modes and essential dynamics. Molecular Modeling of Proteins. 2008:89-106.

75. David CC, Jacobs DJ. Principal component analysis: a method for determining the essential dynamics of proteins. Protein Dynamics: Methods and Protocols. 2014:193-226.

76. Sittel F, Jain A, Stock G. Principal component analysis of molecular dynamics: On the use of Cartesian vs. internal coordinates. The Journal of Chemical Physics. 2014 Jul 7;141(1):07B605_1.

77. Savojardo C, Manfredi M, Martelli PL, Casadio R. Solvent accessibility of residues undergoing pathogenic variations in humans: from protein structures to protein sequences. Frontiers in Molecular Biosciences. 2021 Jan 7;7:626363.

78. Chen H, Panagiotopoulos AZ. Molecular modeling of surfactant micellization using solvent-accessible surface area. Langmuir. 2019 Jan 9;35(6):2443-50.

79. Ausaf Ali S, Hassan I, Islam A, Ahmad F. A review of methods available to estimate solvent-accessible surface areas of soluble proteins in the folded and unfolded states. Current Protein and Peptide Science. 2014 Aug 1;15(5):456-76.

80. Dong T, Gong T, Li W. Accurate Estimation of Solvent Accessible Surface Area for Coarse-Grained Biomolecular Structures with Deep Learning. The Journal of Physical Chemistry B. 2021 Aug 12;125(33):9490-8.

81. Pandey B, Grover A, Sharma P. Molecular dynamics simulations revealed structural differences among WRKY domain-DNA interaction in barley (Hordeum vulgare). BMC Genomics. 2018 Dec;19(1):132.

Author Information X