Structural Insights into Protein-Ligand Interactions of Small Leucine Rich Repeat Proteoglycans with a Large Number of Binding Partners: An Overview

J Cell Signal. 2023;4


Introduction
Small leucine rich repeat proteoglycans (SLRPs) exist in the extracellular matrix [1][2][3][4][5][6][7]. They are divided into five distinct classes: class I consists of biglycan, decorin, asporin (PLAP-1), and ECM2; class II consists of fibromodulin, lumican, PRELP, keratocan, and osteomodulin/osteoadherin; class III consists of osteoglycin/mimecan, epiphycan, and opticin; class IV consists of chondroadherin-like protein, nyctalopin, and Tsukushi; and class V consists of podocan and podocan-like protein 1 (Table S1) [8,9]. SLRPs contain tandem arrays of LRRs flanked by cysteine clusters at both the N- and C-termini [8,9]. The disulfide bonds of the Cys clusters stabilize a capping structure that shields the hydrophobic core of the first LRR unit at the N-terminus and of the last unit at the C-terminus [10,11]. In some SLRPs, the extreme N- and/or C-termini contain low complexity sequences, glycosaminoglycan (GAG) chains, and/or sulfated tyrosine residues. The LRRs, which adopt short β-strands at positions 3-5, form a parallel β-sheet and assemble into a solenoid structure with a superhelical arrangement [10][11][12]. The LRR solenoid structure may be divided into four parts: a concave face, an ascending face, a convex face, and a descending face (Figure 1) [10,13]. LRRs are characterized by a common molecular architecture adapted to protein-protein interactions [11].

Dermatopontin (Tyrosine-rich acidic matrix protein):
Decorin interacts with dermatopontin [109,110]. The core protein of the decorin molecule binds to dermatopontin, and the interaction is probably ionic [110]. The dermatopontin-decorin complex binds more TGF-β1 than does either component individually [111]. In silico analysis predicts that the entire concave face of decorin interacts with dermatopontin [112].
Cytosine-phosphate-guanine (CpG) dinucleotide motif (CpG-DNA): Lumican competes with CD14 to bind CpG-DNA in vitro [136]. Biglycan binds CpG-DNA and suppresses the TLR9 response [136]. TLR9, with 26 LRRs, recognizes bacterial and viral DNA containing CpG-DNA [141]. The structure of the TLR9-CpG-DNA complex reveals that CpG-DNA is recognized by both protomers, in particular by the N-terminal LRRNT-LRR10 fragment from one protomer and the C-terminal fragment (LRR20-LRR22) from the other [142,143]. Baumann et al. [144] suggested that CD14 binds to CpG-DNA directly, while Li et al. [145] disputed the claim that CD14 is involved in CpG-DNA capture. We infer that CpG-DNA may interact with the concave face of the LRR domain in biglycan.
TGF-α: TGF-α belongs to the EGF family. Biglycan binds to TGF-α [154,155]. In the structure of TGF-α, a third, N-terminal strand (residues 4-6) aligns with the large β-ribbon (residues 19-33) to form a three-stranded β-sheet, and the C terminus is ordered. The structure of the TGF-α-EGFR complex is available [156,157]; the TGF-α molecule is clamped between the concave faces of the L1 and L2 LRR domains of the EGFR molecule. We infer that the binding site of TGF-α may be the concave face of the LRR domain in biglycan.

Low-affinity nerve growth factor receptor (p75NTR):
p75NTR is a type I transmembrane protein and acts as a tyrosine kinase co-receptor. PRELP binds directly to p75NTR, as well as to IGF1R, with low micromolar affinities [166].

Ascending face
Fibronectin: Decorin interacts with the cell-binding domain of fibronectin [167] and also binds to the N-terminal fibronectin type III repeat in collagen XIV [50]. Because heparin competes with decorin, the binding of decorin to fibronectin likely occurs at a heparin-binding region [168]. The sequence NKISK in LRR3 of decorin (forming a part of the ascending loop) is possibly involved in the interaction between the proteoglycan and fibronectin [169]. Fibromodulin also interacts with fibronectin [112]. In silico analysis predicts that the fibromodulin-fibronectin interaction occurs on the entire concave face of fibromodulin [112].

Vascular endothelial growth factor receptor 2 (VEGFR-2):
Decorin binds VEGFR-2 [170,171]. Osteoglycin interacts with VEGFR-2 [172,173], but not with VEGF-A. Decorin binds to the N terminus of VEGFR-2 in a region overlapping with the binding site of its natural ligand VEGF-A [170]. The binding site on the decorin core protein includes the 12-amino-acid sequence LGTNPLKSSGIE in LRR5; the most avid binding was shown by LGTNPLK at the proximal end [170]. This sequence constitutes an ascending loop in the LRR solenoid structure.

Complement factor H (FH) and complement factor H-related protein-1 and -5 (FHR1 and FHR5):
The complement system is a part of the innate immune system that enhances the ability of antibodies and phagocytic cells [174,175] [176] bind to FH. The Tyr-384/402 variant of FH binds fibromodulin better than the His-384 form [177]. The side chain of Tyr/His at position 384/402 is exposed to solvent [178]. Thus, we infer that π-π stacking interaction between neutral histidine in fibromodulin and aromatic amino acid Tyr-384 in FH occurs on the ascending loop face [102,179]. Fibromodulin, osteomodulin, and PRELP bind to complement factor H-related protein-1 and -5 (FHR1 and FHR5) [180]. FHR1 binds to these ECM components through its CCP domains 4-5, whereas FHR5 binds via its middle region, CCPs 3-7. Both FHRs competitively inhibit the binding of FH. Biglycan and decorin do not bind FH, FHR1, and FHR5 [180].

The N-terminal region
Decorin and biglycan have the extreme N-terminal region with GAG chains [7]. Fibromodulin and osteomodulin have N-terminal extensions with a variable number of O-sulfated tyrosine residues [30,31]. Strong ionic interactions are expected between GAGs and proteins. The main contribution to binding affinity comes from ionic interactions between the highly acidic sulphate groups and the basic side chains of arginine and lysine [181]. The interactions of GAGs with proteins also involve a variety of different types of interactions, including van der Waals (VDW) forces, hydrogen bonds, and hydrophobic interactions with the carbohydrate backbone [181].
The heparin-binding mycobacterial surface protein (HBHA): HBHA binds to decorin [182]. A truncated C-terminal HBHA fragment which contains Lys-Pro-Ala-rich repeats binds to decorin. This interaction likely occurs between the sulfated GAG extending from the decorin core protein and the Lys-Pro-Ala repeats at the C terminal side.

Tenascin X (TN-X):
TN-X is an extracellular matrix protein whose absence results in an alteration of the mechanical properties of connective tissue [193]. From the N- to the C-terminal part, TN-X consists of a tenascin assembly domain (TAD), a series of 18.5 repeats of the EGF-like motif, a large number of fibronectin type III modules, and a fibrinogen-like globular domain. The DS chains of decorin bind to the heparin-binding site included within the fibronectin type III domains 10 and 11 of TN-X [194]. Interestingly, a binding site that interacted with the decorin core protein could be assigned to the N-terminal fibronectin type III repeat of collagen XIV [50]. In addition, an auxiliary binding site located C-terminal to this fibronectin type III repeat interacted with the GAG of decorin [50]. [195]. The GAG side chain of decorin is essential for LDL binding [196].
Multiple sequence alignment of the five homologs of Dbps shows that Lys-80 in DbpA and Lys-79 in DbpB are conserved, which indicates their importance in Dbp proteins [206]. Lys-78 and Lys-82 of DbpA, in contrast, are part of the second potential binding site. The protein core of decorin may be required for detectable binding by DbpA [198,202]. However, there is as yet no evidence of direct interactions between the decorin core protein and Dbps.
von Willebrand factor (vWF), matrilin-1, and chordin: vWF is a large protein with 2,813 amino acids and contains three types of vWF domains (vWFA 1-3, vWFC 1-3, and vWFD 1-4). Matrilin-1 contains two vWFA domains and one EGF-like domain. Chordin contains one vWFC domain. Decorin binds to vWF [208]. Decorin and biglycan interact with matrilin-1 [209]. Tsukushi binds to chordin [93]. Biglycan binds chordin and BMP-4 in Xenopus embryos [210]. The GAG side chains of decorin mediate the interaction with vWF [208]. The same binding mode may occur in biglycan. However, Tsukushi has no GAG chain. The structure of the complex between the vWF A1 domain and the extracellular LRR domain of GP1bα reveals that the concave face is involved in the interaction [211,212]. We infer that the vWF domain within chordin interacts directly with Tsukushi via its concave face.
p65NF-κB: Nuclear factor-kappa B (NF-κB) is an essential transcription factor in the control of expression of genes involved in cell growth, differentiation, inflammation, and neoplastic transformation. Biotin hbd PRELP and p65NF-κB physically interact; the GAG-binding domain of PRELP acts as a cell type-specific NF-κB inhibitor that impairs osteoclastogenesis [213].
α-Dystroglycan: α-Dystroglycan is an extracellular peripheral membrane glycoprotein anchored to the cell membrane by binding to a transmembrane glycoprotein. Torpedo biglycan, in a fashion dependent on its CS side chains, binds to the protein core of the C-terminal third of α-dystroglycan [214].
Tumor necrosis factor-α (TNF-α): TNF-α is a cytokine that plays a central role in inflammation, immune system development, apoptosis, and lipid metabolism. TNF-α binds to both biglycan and decorin with Kds of 0.81 μM and 1.23 μM, respectively [215]. The binding occurs via both the core protein and the DS GAG chain.
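To put the two dissociation constants in perspective, the following minimal Python sketch converts a Kd into a fractional occupancy using the single-site relation θ = [L] / ([L] + Kd). It is an illustration only: the 1:1 binding model, the assumption that the free ligand concentration equals the concentration supplied, and the function name `fraction_bound` are assumptions made here, not part of the cited study, which reports binding through both the core protein and the DS chain.

```python
# Kd values reported for TNF-alpha binding [215], in micromolar.
KD_UM = {"biglycan": 0.81, "decorin": 1.23}


def fraction_bound(ligand_conc_um: float, kd_um: float) -> float:
    """Fractional occupancy for a hypothetical single-site (1:1) model.

    Assumes the free ligand concentration equals ligand_conc_um.
    """
    if ligand_conc_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc_um / (ligand_conc_um + kd_um)


if __name__ == "__main__":
    tnf_um = 1.0  # hypothetical free TNF-alpha concentration, in micromolar
    for slrp, kd in KD_UM.items():
        print(f"{slrp}: {fraction_bound(tnf_um, kd):.2f}")
```

Under this simplified model, half of the binding sites are occupied when the free ligand concentration equals the Kd, so the lower Kd of biglycan corresponds to a somewhat higher occupancy at any given TNF-α concentration.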
C4b-binding protein (C4BP): C4BP is a potent soluble complement inhibitor and contains many CCP domains [216]. Osteomodulin, chondroadherin, fibromodulin, and PRELP bind to C4BP [217]. The major interaction site on C4BP is localized to the central core, including CCP8. The binding of osteomodulin, fibromodulin, and PRELP to C4BP is concentration-dependent and ionic in nature, while the binding of C4BP to chondroadherin shows both ionic and hydrophobic character. PRELP and osteomodulin have overall basic and acidic properties, respectively, which are likely to contribute to their binding properties [217]. A cluster of tyrosine sulfate residues in the N terminus of fibromodulin contributes to the anionic character of this SLRP, which may be important for the interactions [217]. Being basic, chondroadherin may in contrast use hydrophobic patches as well as clusters of charged residues to bind C4BP [217].

Heparin-binding proteins:
The fibromodulin N-terminal domain binds motifs of basic clusters in heparin-binding proteins such as basic FGF-2, TSP-1, MMP-13, the NC4 domain of collagen IX, interleukin-10, and PRELP [121]. Despite the differences in the tyrosine sulfate domain, binding to osteomodulin was the same as that to the fibromodulin tyrosine sulfate domain, with the interesting exception of MMP-13 [121]. The binding of the NC4 domain of collagen IX to fibromodulin and osteomodulin was also indicated by Kalchishkova et al. [218].
Thrombospondin-1 (TSP-1): TSP-1 contains a heparin-binding domain, vWFC, laminin G-like, and TSP type 1 and 2 domains, and a region of basic and acidic residues. Decorin interacts with TSP-1, which inhibits cell adhesion to TSP-1 [219,220]. The binding sites of decorin for TSP-1 are the GAG chains and the core protein [219,220]. Brain-specific angiogenesis inhibitors (BAIs) contain 4 to 5 TSP type-1 repeats (TSRs), while RTN4 (Nogo) receptors contain an LRR domain with nine LRRs. The structure of the BAI1 TSR3 domain in complex with the RTN4 receptor revealed that a single TSR domain binds to the LRR domain of the RTN4 receptor [221]. Thus, we infer that the LRR domain as well as the GAG chain participates in the binding to TSP-1.

CD44:
The CD44 antigen is a cell-surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. Biglycan interacts with CD44, which increases M1 autophagy [222]. Extracellular secreted asporin binds to CD44 to activate Rac1 [223,224]. The GAG chains of biglycan and lumican may interact with CD44, because CD44 interacts with the CS side chain of Serglycin [225].
Heparin and heparan sulfate (HS): PRELP binds heparin and HS [227]. This interaction is mediated through the basic parts of highly sulfated sequences of heparin and heparan sulfate. Opticin binds to type XVIII collagen via its HS chains [228]. Opticin binds to heparin, HS, CS, and DS; the binding affinity is dependent on sulfation pattern and oligosaccharide chain [228]. We infer that the binding site of opticin is the arginine clusters of 153-RRTAYLYARFNRISRIR-159.

Platelet-derived growth factor (PDGF):
Decorin binds PDGF [229,230]. DS from the extractable pool of decorin binds both growth factors (PDGF-BB and FGF-2) and fibronectin, most probably even irreversibly, as judged from the very low Kd values characterizing all of these interactions. In turn, biglycan DS displays particularly high affinity for FGF-2 [230].

The N-terminal capping region
Zn2+ and Ca2+: Decorin is a Zn2+ metalloprotein [231]. The Zn2+-binding sites are localized to the N-terminal domain of the core protein that contains 4 Cys residues. This likely results in a large conformational change of the N-terminal capping structure. The N-terminal polyaspartate domain of asporin binds calcium and regulates hydroxyapatite formation in vitro [66].
Fibrinogen: Fibrinogen is a glycoprotein complex that circulates in the blood of all vertebrates. Decorin binds to the globular D domain of fibrinogen in a Zn2+-dependent interaction [232,233]. Taken together, the N-terminal capping region of decorin likely participates in the interaction with the fibrinogen D domain.

α- and γ-sarcoglycan:
The sarcoglycans are a family of transmembrane proteins (α, β, γ, δ or ε) involved in the protein complex responsible for connecting the muscle fiber cytoskeleton to the extracellular matrix. Biglycan binds to α- and γ-sarcoglycan but not to β- or δ-sarcoglycan [238]. The binding sites on the polypeptide core of biglycan for α- and γ-sarcoglycan are distinct. α-Sarcoglycan binds to the N-terminal cysteine-rich domain of biglycan that forms a capping structure [238].

Lysyl oxidase (LOX):
Lysyl oxidase (LOX) enzymes oxidize lysyl and hydroxylysyl residues from collagen and elastin chains [239]. Fibromodulin interacts with LOX and acts as a modulator of its activity fostering a site-specific cross-linking of collagen fibrils [74]. This interaction was mapped to the N-terminal 12 amino acids of fibromodulin with no apparent effect of tyrosine sulfation of fibromodulin [74].
The C-terminal region
CCN2/CTGF: CCN2 is a member of the CCN protein family, which is composed of four distinct domains connected in tandem, i.e., IGF-binding protein-like (IGFBP), von Willebrand type C, thrombospondin type 1 repeat (TSP-1), and C-terminal (CT) domains. Mouse Tsukushi binds to the CT and IGFBP domains of CCN2 [83]. Decorin interacts with CCN2 [236,240]. The interaction is saturable, with a Kd of 4.4 nM, and LRRs 10-12 are important for the interaction with CCN2 [240]. A peptide derived from the VS part of LRR12 (i.e., Gln335-Lys359) inhibits CCN2-decorin complex formation [240]. This part maps to an α-helix in the C-terminal capping structure. Thus, we suggest that the C-terminal capping structure participates in the interaction with CTGF.
Integrin α2β1: The binding site for integrin α2β1 maps to an α-helix in the C-terminal heparin binding region of chondroadherin (307-CQLRGLRRWLEAK-318) [192], which constitutes a part of the C-terminal capping structure. The core protein of lumican directly interacts with the I domain of α2 integrin subunit in the α2β1 integrin [189].

Transforming growth factor-β receptor 1 (ALK5):
Lumican binds to ALK5 [243]. In silico analysis proposed that the interaction occurs between the C-terminal 50 amino acid region (L EKFDIKSFCKILGPLSYSK IKHLRLDGNRI SETSLPPDMYECLRV ANEVTLN) of lumican and the GS domain of ALK5 [243]; the lumican C-terminal region comprises a capping structure.
Apolipoprotein(a): Apolipoprotein(a) binds via its C-terminal domain to the protein core of decorin [244]. The binding of Lp(a) to decorin involves both electrostatic and hydrophobic interactions.

Core protein
Microfibril-associated glycoprotein-1 (MAGP-1) and fibrillin-1: MAGP-1, with 183 residues, contains a disordered region in its central part, while fibrillin-1, with 2871 amino acids, contains 48 EGF-like domains; these proteins are components of extracellular microfibrils. Decorin interacts with each protein individually and, with both proteins together, forms a ternary complex [245]. The decorin core rather than its GAG side chain mediates the interaction. MAGP-1 interacts with biglycan but not decorin in the solution phase [246]. An EGF-like domain in fibrillin-1 might interact with the concave face of decorin.
Tropoelastin: Tropoelastin is the basic building block of elastin, which makes up the majority of elastic fibers [247,248]. Tropoelastin is the soluble precursor of elastin, with a molecular weight of about 60 kDa. Biglycan and decorin bind to tropoelastin [246]. The binding sites are contained in the protein cores of the proteoglycans rather than in the GAG side chains [246]. Biglycan forms a ternary complex with tropoelastin and MAGP-1 [246]. Like hydroxyproline-rich collagen, elastin contains about one-third glycine and approximately one-ninth proline, and is characterized by repetitive sequences. We therefore think that elastin and tropoelastin partially adopt a collagen-like helix. If so, tropoelastin adopting a collagen-like helix might interact with the concave face of biglycan and decorin, as seen in the collagen interactions.

Filamin-A:
Filamins are a family of actin-binding proteins composed of filamin A, B, and C [249]. The LRR region of decorin interacts with filamin-A (ABP-280) [250]. This interaction is dependent on the 288 carboxyl-terminal amino acids of filamin-A, which correspond to repeats 22-24 of its conserved β-sheet structure [250].
Tubulin: Tubulin consists of α- and β-subunits. α- and β-tubulins polymerize into microtubules, a major component of the eukaryotic cytoskeleton. Lumican interacts with tubulin [255,256]. The N-terminal part of lumican and fragments spanning LRR1-LRR4, LRR5-LRR7, and LRR8-LRR10 colocalize with microtubules [256]. Lumican core proteins interact with tubulins. Taken together, we infer that the binding sites might be on the concave face of the LRR domain.

Resistin:
Resistin is a cysteine-rich peptide hormone derived from adipose tissue [257]. Decorin lacking the glycation site binds to resistin [257]. This suggests that the decorin core protein interacts with resistin.

Covalent interaction
Aggrecan (Chondroitin sulfate proteoglycan 1): The aggrecan core protein is depicted with three disulphide-bonded globular domains (G1-3), an interglobular domain (IGD), and attachment regions for keratan sulphate (KS) and chondroitin sulphate (CS1 and CS2). Aggrecan participates in covalent and nonreducible interactions with lumican in a high-molecular-weight complex in the aging human sclera [258]. A theoretical model shows that lumican is covalently linked to aggrecan through both disulfide bonding and the transglutaminase linkage of Gln-Lys (Q-K) [258].

Unknown binding sites
SOX2 is a transcription factor that is essential for maintaining self-renewal or pluripotency of undifferentiated embryonic stem cells [260]. Tsukushi interacts with SOX2 and BMP-4, which controls stereocilia formation in the inner hair cells [92].
Delta protein from the African clawed frog mediates segmentation of the paraxial mesoderm in Xenopus embryos [261]. It is 721 residues long and contains four EGF-like domains (UniProtKB: Q91902). Tsukushi interacts with Delta [83].
Fas ligand (FasL/CD95L) is a type II transmembrane protein that belongs to the TNF family. Lumican has been suggested to bind FasL/CD95L [262].
Frizzled is a family of atypical G protein-coupled receptors that serve as receptors in the Wnt signaling pathway and other signaling pathways. Chick Tsukushi binds directly to the cysteine-rich domain of frizzled 4 with an affinity of 2.3 × 10^-10 M and competes with Wnt2b [263]. Tsukushi also binds to frizzled 3 [264].
Wnt proteins are secreted glycoproteins that activate different intracellular signal transduction pathways. Podocan interacts directly with Wnt4 [265].
The receptor muscle-specific kinase (MuSK) is indispensable for nerve-muscle synapse formation and maintenance [266]. Biglycan binds directly to the ectodomain of mouse MuSK [266]. Both the Ig and Frizzled (CRD/Fz) domains of MuSK are required for biglycan binding.
26S proteasome non-ATPase regulatory subunit 2 (PSMD2) is a component of the 26S proteasome, a multiprotein complex involved in the ATP-dependent degradation of ubiquitinated proteins [267]. Asporin strongly interacts with PSMD2 in gastric cancer (GC) cells [267].
Endostatin is a proteolytically released fragment of the C-terminal domain NC1 of collagen XVIII [263]. Endostatin binds biglycan and LDL [268]. Endostatin and biglycan interact with each other directly [268]. The crystal structure of endostatin reveals a globular form [269]. The LRR domain of biglycan might interact with endostatin.
p120 catenin regulates cell-cell adhesion with cadherins. Lumican interacts with nuclear p120 catenin [253,270].
Opticin binds retinal growth hormone in the embryonic vitreous [271].
Nyctalopin is located on the surface of the photoreceptor-to-ON bipolar cell synapse in the retina [272]. Nyctalopin interacts directly with transient receptor potential cation channel subfamily M member 1 (TRPM1) [273,274] and additionally with the glutamate receptor mGluR6 [274]. Nyctalopin forms complexes with both TRPM1 and mGluR6 [274].

Discussion
The concave face, the ascending loop, the N- or C-terminal capping regions, the GAG chains, and/or sulfated tyrosine residues are involved in protein-protein interactions. Combinations of these regions have also been shown or predicted. In contrast, the descending lateral face and the convex face were not observed to participate in these interactions.
The structures of the EGF-EGFR complex (PDB ID: 3NJP and 1IVO) [275,276] are available. In addition, the structures of the IGF-1-IGF1R and IGF-2-IGF1R complexes (PDB ID: 5U8Q, 7S0Q, 6PYH, and 6VWI) have been determined [277][278][279][280]. To characterize the spatial arrangement of the two L domains in EGFR and IGF1R, Miyashita et al. [144] proposed two parameters: the distance between the two L domains (L) and the angle between the two axes showing the direction of the β-sheet stacking of the LRRs in the L domains (Ψ). The values of these two structural parameters (L and Ψ) in the free and complexed states demonstrate that EGF binding to EGFR and IGF-1 or IGF-2 binding to IGF1R bring about large structural changes. Thus, we infer that similar structural changes occur in interactions between SLRPs (decorin, biglycan, asporin and PRELP) and EGFR or IGF1R.
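As an illustration of how two such parameters could be evaluated from atomic coordinates, the following minimal Python sketch computes a distance and an angle for a pair of domains. It rests on assumptions that are not taken from the cited work: the distance L is taken here as the distance between the centroids of the two coordinate sets, each stacking axis is supplied by the caller as a direction vector, and the function names `centroid` and `domain_distance_and_angle` are hypothetical. The original definition of L and Ψ may differ in detail.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def domain_distance_and_angle(coords_1, axis_1, coords_2, axis_2):
    """Return (L, psi) for two domains.

    L   : distance between the centroids of the two coordinate sets
          (an assumed definition), in the units of the input coordinates.
    psi : angle in degrees between the two stacking-axis direction vectors.
    """
    distance = math.dist(centroid(coords_1), centroid(coords_2))
    dot = sum(a * b for a, b in zip(axis_1, axis_2))
    norm = math.hypot(*axis_1) * math.hypot(*axis_2)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_psi = max(-1.0, min(1.0, dot / norm))
    return distance, math.degrees(math.acos(cos_psi))


if __name__ == "__main__":
    # Toy coordinates (not taken from any PDB entry).
    l1_atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    l2_atoms = [(1.0, 0.0, 30.0), (1.0, 0.0, 50.0)]
    L, psi = domain_distance_and_angle(
        l1_atoms, (1.0, 0.0, 0.0), l2_atoms, (0.0, 1.0, 0.0)
    )
    print(f"L = {L:.1f}, psi = {psi:.1f} degrees")
```

Comparing the (L, Ψ) pair computed for a free receptor with the pair computed for its ligand-bound form would then give a simple numerical measure of the kind of domain rearrangement described above.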
The functions of SLRPs, including decorin, biglycan, and lumican, are known to be altered in human diseases such as cancers [16,281]. Lumican-derived peptides that interact with MMP-14 inhibit melanoma cell growth and migration [236]. A decorin-derived peptide that interacts with CCN2 inhibits its biological activity [240]. Therefore, it would be significant to discuss the possibilities of blocking disease-related SLRP-ligand interactions as a targeted therapy. Drug delivery systems might be useful [282,283].

Conclusion
We undertook a comprehensive literature search to compile a list of the ligands of all members of the SLRPs. We discussed the sites on SLRPs that interact with their binding partners. The protein-ligand interactions occur not only on the concave face but also on the ascending face and the N- or C-terminal capping regions. In addition, the extreme N- and/or C-terminal regions with the GAG chains or sulfated tyrosine residues participate in ligand interactions.

Conflict of Interest
The authors of this manuscript declare that they have no conflicts of interest.

Funding Statement
This work received no funding support.

Competing Interests
The authors declare no competing interest.

Table S1 shows the repeat number of LRRs of SLRPs and the helix parameters of the LRR solenoid structures.