Family of proteins which attach to other proteins to modify them
In molecular biology, SUMO (Small Ubiquitin-like Modifier) proteins are a family of small proteins that are covalently attached to and detached from other proteins in cells to modify their function. This process is called SUMOylation (pronounced soo-muh-lā-shun and sometimes written sumoylation). SUMOylation is a post-translational modification involved in various cellular processes, such as nuclear-cytosolic transport, transcriptional regulation, apoptosis, protein stability, response to stress, and progression through the cell cycle.[1] In human proteins, there are over 53,000 SUMO binding sites, making it a substantial component of fundamental biology.[2]
SUMO proteins are similar to ubiquitin and are considered members of the ubiquitin-like protein family. SUMOylation is directed by an enzymatic cascade analogous to that involved in ubiquitination. In contrast to ubiquitin, SUMO is not used to tag proteins for degradation. Mature SUMO is produced when the last four amino acids of the C-terminus have been cleaved off to allow formation of an isopeptide bond between the C-terminal glycine residue of SUMO and an acceptor lysine on the target protein.
SUMO family members often have dissimilar names; the SUMO homologue in yeast, for example, is called SMT3 (suppressor of mif two 3). Several pseudogenes have been reported for SUMO genes in the human genome.
Function
SUMO modification of proteins has many functions. Among the most frequent and best studied are protein stability, nuclear-cytosolic transport, and transcriptional regulation. Typically, only a small fraction of a given protein is SUMOylated and this modification is rapidly reversed by the action of deSUMOylating enzymes. SUMOylation of target proteins has been shown to cause a number of different outcomes including altered localization and binding partners. The SUMO-1 modification of RanGAP1 (the first identified SUMO substrate) leads to its trafficking from cytosol to nuclear pore complex.[3][4] The SUMO modification of ninein leads to its movement from the centrosome to the nucleus.[5] In many cases, SUMO modification of transcriptional regulators correlates with inhibition of transcription.[6] One can refer to the GeneRIFs of the SUMO proteins, e.g. human SUMO-1,[7] to find out more.
There are four confirmed SUMO isoforms in humans: SUMO-1, SUMO-2, SUMO-3 and SUMO-4. At the amino acid level, SUMO-1 is about 50% identical to SUMO-2.[citation needed] SUMO-2/3 show a high degree of similarity to each other and are distinct from SUMO-1. SUMO-4 shows similarity to SUMO-2/3 but differs in having a proline instead of glutamine at position 90. As a result, SUMO-4 is not processed and conjugated under normal conditions, but is used for modification of proteins under stress conditions such as starvation.[8] During mitosis, SUMO-2/3 localize to centromeres and condensed chromosomes, whereas SUMO-1 localizes to the mitotic spindle and spindle midzone, indicating that SUMO paralogs regulate distinct mitotic processes in mammalian cells.[9] One of the major SUMO conjugation products associated with mitotic chromosomes arises from SUMO-2/3 conjugation of topoisomerase II, which is modified exclusively by SUMO-2/3 during mitosis.[10] SUMO-2/3 modifications seem to be involved specifically in the stress response.[11] SUMO-1 and SUMO-2/3 can form mixed chains; however, because SUMO-1 does not contain the internal SUMO consensus sites found in SUMO-2/3, it is thought to terminate these poly-SUMO chains.[12]
Serine 2 of SUMO-1 is phosphorylated, raising the concept of a 'modified modifier'.[13]
DNA damage response
Cellular DNA is regularly exposed to DNA-damaging agents. A well-regulated and intricate DNA damage response (DDR) is usually employed to deal with the potentially deleterious effects of the damage. When DNA damage occurs, SUMO protein has been shown to act as a molecular glue to facilitate the assembly of large protein complexes in repair foci.[14] SUMOylation can also alter a protein's biochemical activities and interactions. SUMOylation plays a role in the major DNA repair pathways of base excision repair, nucleotide excision repair, non-homologous end joining and homologous recombinational repair.[14] SUMOylation also facilitates error-prone translesion synthesis.
Structure
SUMO proteins are small; most are around 100 amino acids in length and 12 kDa in mass. The exact length and mass vary between SUMO family members and depend on which organism the protein comes from. Although SUMO has very little sequence identity with ubiquitin (less than 20%) at the amino acid level, it has a nearly identical structural fold. SUMO protein has a unique N-terminal extension of 10-25 amino acids which other ubiquitin-like proteins do not have. This N-terminal extension has been linked to the formation of SUMO chains.[15]
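The relationship between chain length and mass quoted above can be checked with a rough calculation. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; that figure and the function name are illustrative assumptions, not values from the cited sources, and the result is only an order-of-magnitude estimate.

```python
# Rule-of-thumb assumption: an "average" amino acid residue weighs ~110 Da.
# This constant is not taken from the article; it is a textbook approximation.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate the mass of a protein in kDa from its residue count."""
    return n_residues * residue_mass_da / 1000.0


# A protein of about 100 residues comes out at roughly 11 kDa,
# the same order as the ~12 kDa quoted for SUMO.
print(approx_mass_kda(100))
```

The exact mass of any given SUMO protein depends on its actual sequence, so the estimate is only a sanity check on the quoted figures.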
The structure of human SUMO1 is depicted on the right. It shows SUMO1 as a globular protein with both ends of the amino acid chain (shown in red and blue) sticking out of the protein's centre. The spherical core consists of an alpha helix and a beta sheet. The diagrams shown are based on an NMR analysis of the protein in solution.
Prediction of SUMO attachment
Most SUMO-modified proteins contain the tetrapeptide consensus motif Ψ-K-x-D/E, where Ψ is a hydrophobic residue, K is the lysine conjugated to SUMO, x is any amino acid, and D/E is an acidic residue (aspartate or glutamate). Substrate specificity appears to be derived directly from Ubc9 and the respective substrate motif. Currently available prediction programs are:
SUMOplot - online free access software developed to predict the probability for the SUMO consensus sequence (SUMO-CS) to be engaged in SUMO attachment.[16] The SUMOplot score system is based on two criteria: 1) direct amino acid match to the SUMO-CS observed and shown to bind Ubc9, and 2) substitution of the consensus amino acid residues with amino acid residues exhibiting similar hydrophobicity. SUMOplot has been used in the past to predict Ubc9 dependent sites.
SUMOsp - uses a position-specific scoring matrix (PSSM) to score potential SUMOylation peptide sites. It can predict sites that follow the ψKXE motif as well as unusual SUMOylation sites that contain other non-canonical motifs.[18]
JASSA - online free access predictor of SUMOylation sites (classical and inverted consensus) and SIMs (SUMO interacting motif). JASSA uses a scoring system based on a Position Frequency Matrix derived from the alignment of experimental SUMOylation sites or SIMs. Novel features were implemented towards a better evaluation of the prediction, including identification of database hits matching the query sequence and representation of candidate sites within the secondary structural elements and/or the 3D fold of the protein of interest, retrievable from deposited PDB files.[19]
SumoPred-PLM (SUMOylation site Prediction using Protein Language Model) - a deep learning tool that predicts sites based on known biological rules around SUMO2 and SUMO3 binding in human proteins, incorporating knowledge from ProtT5-XL-UniRef50, a separate pretrained protein language model (PLM) developed in 2021 by Elnaggar et al.[2] Such combination of multidisciplinary AI tools is becoming common practice.
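The Ψ-K-x-D/E consensus motif described above can be searched for with a simple pattern match. The following is a minimal sketch and not the method used by any of the listed predictors. The set of residues counted as hydrophobic (`HYDROPHOBIC`), the function name and the example sequence are all assumptions made for illustration.

```python
import re

# Assumption: which residues count as "hydrophobic" (Ψ). The article does not
# list them; this set is a placeholder and can be changed by the caller.
HYDROPHOBIC = "VILMF"


def find_consensus_sites(sequence, hydrophobic=HYDROPHOBIC):
    """Return (lysine_position, motif) pairs for every Ψ-K-x-[D/E] match.

    Positions are 1-based and refer to the candidate acceptor lysine.
    A lookahead is used so that overlapping motifs are all reported.
    """
    pattern = re.compile(rf"(?=([{hydrophobic}]K[A-Z][DE]))")
    hits = []
    for match in pattern.finditer(sequence.upper()):
        # match.start() is the 0-based index of Ψ; the lysine is the next residue.
        hits.append((match.start() + 2, match.group(1)))
    return hits


# Hypothetical toy sequence, not a real protein.
print(find_consensus_sites("MAVKEEGLKPDQIKSE"))
# [(4, 'VKEE'), (9, 'LKPD'), (14, 'IKSE')]
```

A match only indicates that a lysine sits in a consensus context. As noted in the text, non-consensus sites can also be modified, and not every consensus lysine is SUMOylated, which is why the dedicated predictors use scoring schemes rather than a bare pattern.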
SUMO attachment (SUMOylation)
SUMO attachment to its target is similar to that of ubiquitin (as it is for the other ubiquitin-like proteins such as NEDD8). The SUMO precursor carries extra amino acids at its C-terminus that must be removed: a protease (the SENP proteases in humans, or Ulp1 in yeast) cleaves a C-terminal peptide from the precursor to reveal a di-glycine motif. The mature SUMO then becomes bound to an E1 enzyme, the SUMO-activating enzyme (SAE), which is a heterodimer (subunits SAE1 and SAE2). It is then passed to an E2 conjugating enzyme (Ubc9). Finally, one of a small number of E3 ligating proteins attaches it to the target protein.
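The maturation step described above, in which a C-terminal peptide is removed to expose the di-glycine motif, can be mimicked on a sequence string. This is a simplified sketch: real proteases recognise the folded protein rather than a text pattern, and the function name and the toy precursor sequence are hypothetical.

```python
def mature_sumo(precursor):
    """Split a precursor sequence at its last 'GG' motif.

    Returns (mature_sequence, removed_peptide), where the mature sequence
    ends in the exposed di-glycine motif.
    """
    index = precursor.upper().rfind("GG")
    if index == -1:
        raise ValueError("no di-glycine motif found in sequence")
    cut = index + 2  # keep both glycines
    return precursor[:cut], precursor[cut:]


# Hypothetical toy precursor, not a real SUMO sequence.
mature, removed = mature_sumo("MKVAGQEQTGGHSTV")
print(mature)   # MKVAGQEQTGG
print(removed)  # HSTV
```

In this toy example the four residues after the di-glycine are removed, so that the C-terminal glycine is available to form the isopeptide bond with the acceptor lysine.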
In budding yeast, there are four SUMO E3 proteins: Cst9,[20] Mms21, Siz1 and Siz2. While in ubiquitination an E3 is essential to add ubiquitin to its target, evidence suggests that the E2 is sufficient in SUMOylation as long as the consensus sequence is present. It is thought that the E3 ligase promotes the efficiency of SUMOylation, and in some cases it has been shown to direct SUMO conjugation onto non-consensus motifs. E3 enzymes can be largely classed into PIAS proteins, such as Mms21 (a member of the Smc5/6 complex) and PIAS-gamma, and HECT proteins. Some E3s, such as RanBP2, however, are neither.[21] On chromosome 17 of the human genome, SUMO2 is near SUMO1+E1/E2 and SUMO2+E1/E2, among various others.
Recent evidence has shown that PIAS-gamma is required for the SUMOylation of the transcription factor YY1, but that this requirement is independent of the zinc-RING finger (identified as the functional domain of the E3 ligases). SUMOylation is reversible: SUMO is removed from targets by specific SUMO proteases. In budding yeast, the Ulp1 SUMO protease is found bound at the nuclear pore, whereas Ulp2 is nucleoplasmic. The distinct subnuclear localisation of deSUMOylating enzymes is conserved in higher eukaryotes.[22]
DeSUMOylation
SUMO can be removed from its substrate in a process called deSUMOylation. Specific proteases mediate this process (SENP in humans, or Ulp1 and Ulp2 in yeast).[15]
In yeast, SMT3 encodes the SUMO protein, and SUMO E3 ligase attaches SUMO to target proteins. In cell cycle regulation, the default state is that SUMO ligation is constantly taking place, leading to polySUMOylation of eligible target proteins. This is countered by the SUMO protease Ulp2, which cleaves polySUMO groups, leaving the protein in a monoSUMOylated state. As shown in the Biorender figure, there is a feedback mechanism in which Ulp2 maintains the monoSUMOylated state by continually cleaving SUMO, so that the polySUMOylated state is never stabilized enough to be acted upon by downstream actors. This deSUMOylation is critical to prevent precocious advancement of the cell cycle, as discussed in several studies.[23]
DeSUMOylation may be arrested by the inhibitory phosphorylation of the Ulp2 SUMO protease by the Polo-like kinase Cdc5. With the deSUMOylating activity of Ulp2 inhibited, polySUMOylation becomes the new stable state of target proteins, which are often but not always bound to other proteins in order to regulate major changes within the cell. Cdc5 is countered by the Rts1-PP2A phosphatase, which maintains the active state of the Ulp2 SUMO protease by removing the phosphate group added by Cdc5 kinase.[23]
Disrupting the counteracting deSUMOylation has the following consequences. First, the target protein becomes polySUMOylated. Second, a SUMO-targeted ubiquitin ligase, or STUbL (SLX5 or SLX8 in the case of yeast), may then bind the polySUMOylated target and attach ubiquitin groups (often polyubiquitinating the already polySUMOylated protein).[24] Third, segregases such as Cdc48 may then dissociate the SUMOylated and ubiquitinated target from its bound partner. Fourth, the released partner is now free to carry out functions it could not perform while bound, and the dissociated target may be degraded by the canonical ubiquitin-proteasome pathway.[23]
As studied in budding yeast, in the case of Tof2-Cdc14, Cdc14 release from the nucleolus allows the Mitotic Exit Network to commence, but this release is regulated by the binding of Tof2, a protein subject to SUMOylation. Likewise, the cohesin protein, which binds sister chromatids in metaphase, can be targeted by SUMOylation so that the Cdc48 segregase can remove cohesin and allow sister chromatid separation in early anaphase.[23]
As is often the case in research, scientists test drugs known to have significant effects on living systems; one such example is rapamycin (known pharmaceutically as sirolimus), the well-known inhibitor of the mechanistic target of rapamycin, or mTOR. With respect to SUMOylation, rapamycin may be thought of as having a "sledgehammer" effect: the drug promotes cellular autophagy, part of which includes broad-spectrum promotion of nonspecific SUMOylation of many proteins. This may be beneficial in some circumstances, as it supports the breakdown of accumulated waste products.[25]
The importance of these studies in models such as yeast lies in their potential to inform scientists in the research and development of precise biomedical interventions that can translate to the improvement of human health in an array of clinical aspects.
Role in human pathology
SUMO protein is implicated in the etiology of many disease states, including but not limited to cancer, atherosclerosis, cardiovascular disease, neurodegenerative disease, diabetes, liver disease, intestinal disorders, and even infectious disease.[2][26][27]
In the case of the well-studied tumor suppressor p53, a regulatory ubiquitin ligase in humans called Mouse Double Minute 2 protein, or MDM2, acts to remove p53 from the cell. MDM2 regulates itself through self-ubiquitination by way of a RING finger domain, targeting itself for proteasomal destruction. When it is SUMOylated at the RING finger domain, MDM2 no longer limits its own function in the cell. Protected from self-destruction, it instead ubiquitinates p53, marking the protective p53 for destruction; the absence of p53 is understood to promote cancer. Here again, the default state is SUMOylation, which is actively undone by the newly discovered SUMO protease SUSP4 and by the SUMO protease SMT3IP1/SENP3, which is understood to deSUMOylate both MDM2 and p53.[28][29] One of the ways p53 functions is as a DNA-binding tetramer; SUMOylation of p53 delocalizes it from the nucleus, which prevents such activity.[30] The importance of p53 is illustrated by Li-Fraumeni syndrome, a hereditary cancer predisposition that results when a person carries even one non-functioning copy of the p53 gene. Beyond p53, many oncogenes and tumor suppressors have been found to be SUMOylated in cancer, with each SUMOylation event having one of a variety of effects on whether the cancer progresses.[31] When IκB is SUMOylated, the modification outcompetes ubiquitination and protects IκB from degradation; by extension, the transcription factor NF-κB remains bound in a complex with IκB, preventing the expression of genes that might otherwise cause cells with DNA damage to undergo apoptosis.
In hypoxic conditions, such as arise in some cancers, HIF-1α, which is usually SUMOylated and then ubiquitinated and degraded through the ubiquitin ligase activity of the von Hippel-Lindau tumor suppressor, is instead deSUMOylated, thereby promoting survival of the tumorigenic cells.[32] The consequences of HIF-1α deSUMOylation include promotion of matrix metalloproteinases (MMPs), which are understood to contribute to the progression of the epithelial-mesenchymal transition (EMT), a hallmark of cancer.[33]
In atherosclerosis, both p53 and ERK5 are SUMOylated in response to disturbed blood flow. The stimulus is transduced by the activation of a serine/threonine kinase called p90RSK, which phosphorylates the human SUMO protease SENP2 at threonine 368. That phosphorylation is sufficient for the delocalization of SENP2 from the nucleus. The effects of this phosphorylation-dependent inhibition of SENP2 by nuclear export include the SUMOylation of p53, which leads to endothelial cell apoptosis, and the SUMOylation of ERK5, which leads to inflammation. Nuclear export of SENP2 additionally downregulates endothelial nitric oxide synthase (eNOS) while upregulating inflammatory adhesion molecules.[34] As eNOS is required for healthy vascular physiology, pathological oxidative stress ensues in vascular endothelial cells.[35] With the oxidative stress comes accumulation of cellular lipids; this results in the inflammatory foamy cell state that typifies atherosclerosis, as well as the similarly inflammatory myelin-laden macrophages known to produce chronic inflammation in spinal cord injury (SCI).[36][37]
In cardiovascular disease, many proteins are subject to SUMOylation. To say SUMOylation itself is bad or good regarding this or any other class of disease is to overlook the role of the multiple proteins in question. One common denominator among many conditions is fibrosis; in myocardial fibrosis, PPARγ1 is understood to have a role in regulating expression of some key genes, and its transcriptional activity is generally inhibited by SUMOylation. Therefore, one possible therapeutic intervention in the case of cardiac hypertrophy may be countering the SUMOylation of PPARγ1.[38]
In neurodegenerative disease, pathological accumulation of proteins is often observed. Inclusion bodies form when, for example, the Huntington's disease protein, aptly named huntingtin, accumulates and folds into a form that is impervious to the proteasome. In Huntington's disease, sufficient SUMOylation of the anomalous huntingtin protein prior to such refolding could perhaps delay the progression of the disease state by enabling timely destruction of the protein while the polypeptide chains are still accessible to the protease subunits within the proteasome. Other accumulating proteins associated with neurodegenerative disorders include α-synuclein (associated with Parkinson's) and amyloid β (associated with Alzheimer's); if these were acted upon early enough, disease could perhaps be better mitigated.[39]
Use in protein purification
Recombinant proteins expressed in E. coli may fail to fold properly, instead forming aggregates and precipitating as inclusion bodies.[42] This insolubility may be due to the presence of codons read inefficiently by E. coli, differences in eukaryotic and prokaryotic ribosomes, or lack of appropriate molecular chaperones for proper protein folding.[43] In order to purify such proteins it may be necessary to fuse the protein of interest with a solubility tag such as SUMO or MBP (maltose-binding protein) to increase the protein's solubility.[43] SUMO can later be cleaved from the protein of interest using a SUMO-specific protease such as Ulp1 peptidase.[43]
[5] Cheng TS, Chang LK, Howng SL, Lu PJ, Lee CI, Hong YR (February 2006). "SUMO-1 modification of centrosomal protein hNinein promotes hNinein nuclear localization". Life Sciences. 78 (10): 1114–1120. doi:10.1016/j.lfs.2005.06.021. PMID 16154161.
[6] Gill G (October 2005). "Something about SUMO inhibits transcription". Current Opinion in Genetics & Development. 15 (5): 536–541. doi:10.1016/j.gde.2005.07.004. PMID 16095902.
[8] Wei W, Yang P, Pang J, Zhang S, Wang Y, Wang MH, et al. (October 2008). "A stress-dependent SUMO4 sumoylation of its substrate proteins". Biochemical and Biophysical Research Communications. 375 (3): 454–459. doi:10.1016/j.bbrc.2008.08.028. PMID 18708028.
[13] Matic I, Macek B, Hilger M, Walther TC, Mann M (September 2008). "Phosphorylation of SUMO-1 occurs in vivo and is conserved through evolution". Journal of Proteome Research. 7 (9): 4050–4057. doi:10.1021/pr800368m. PMID 18707152.
[18] Ren J, Gao X, Jin C, Zhu M, Wang X, Shaw A, et al. (2009). "Systematic study of protein SUMOylation: Development of a site-specific predictor of SUMOsp 2.0". Proteomics. 9 (12): 3409–3412. doi:10.1002/pmic.200800646. PMID 19504496. S2CID 4900031.
Ulrich HD (October 2005). "Mutual interactions between the SUMO and ubiquitin systems: a plea of no contest". Trends in Cell Biology. 15 (10): 525–532. doi:10.1016/j.tcb.2005.08.002. PMID 16125934.
Li M, Guo D, Isales CM, Eizirik DL, Atkinson M, She JX, et al. (July 2005). "SUMO wrestling with type 1 diabetes". Journal of Molecular Medicine. 83 (7): 504–513. doi:10.1007/s00109-005-0645-5. PMID 15806321. S2CID 29252987.
Peroutka RJ III, Orcutt SJ, Strickler JE, Butt TR (2011). "SUMO fusion technology for enhanced protein expression and purification in prokaryotes and eukaryotes". Heterologous Gene Expression in E. coli. Methods in Molecular Biology. Vol. 705. pp. 15–30. doi:10.1007/978-1-61737-967-3_2. ISBN 978-1-61737-966-6. PMID 21125378.