Socket2
Woolfson Group, University of Bristol

Socket2 finds knobs-into-holes (KIH) packing between α-helices of protein structures (Crick, 1953; Walshaw and Woolfson, 2001). KIH packing is the structural hallmark of coiled-coil (CC) assemblies. Socket2 unambiguously defines the beginning and end of CC regions from PDB-format structure files, and presents these graphically and interactively. It also assigns heptad repeats and registers to the identified CCs, which are the sequence signatures of CC proteins (Lupas & Bassler, 2017).

This webserver uses MAXIT to convert uploaded mmCIF-format files to PDB format, and the NGL viewer (Rose et al., 2018) to display the identified CCs. It gives a series of outputs, including a PyMOL (Schrödinger, LLC) script that allows users to visualise and manipulate the CC structures on their own devices.

For a single PDB file:
- Enter a 4-letter PDB ID. Optionally select "Use Biological Assembly". The biological assembly, also known as the biological unit, is the functional form of the molecule; for example, the biological unit of hemoglobin has four chains. For more information, please visit PDB 101.
- Or upload a file containing atomic coordinates in PDB/mmCIF format.

Socket Options:
- Packing-Cutoff: 5.5 to 8.0 Å, selectable in steps of 0.1 Å. The packing-cutoff represents the tightness of the knobs-into-holes interactions; the smaller it is, the more ideal the packing. The default packing-cutoff is 7.0 Å.
- Helix-Extension: 0, 1 or 2 residue(s). This option extends the DSSP-defined α-helices in the input protein structure. For example, if the option selected is 1, each end of a helix is extended by 1 residue, so the helix is 2 residues longer than its DSSP assignment.

If you haven't already done so, please read the following: Please be aware that use of this facility involves uploading a file of PDB-format coordinates onto our server.
If you feel uneasy about uploading your unpublished protein structures, then please don't use it; get a copy of the Socket2 program from the Downloads section and run it locally instead. Of course, we do not wish to compromise the confidentiality of unpublished data; uploaded files are processed only by automatic scripts, are automatically deleted within 12 hours, and are not examined manually. Uploaded files cannot be browsed by any third party.

KIH packing is arguably the simplest and most regular kind of tertiary and quaternary interaction that occurs in protein structures. Nonetheless, despite sharing similar packing characteristics, the sequences and architectures of CCs are reasonably diverse. A better understanding of the relationship between the sequences and structures of CCs has implications for the prediction of quaternary interactions between proteins (Lupas et al., 2017), and for the design of new CCs (Woolfson, 2017) for applications in cell biology, synthetic biology and biotechnology (Dawson et al., 2019; Beesley & Woolfson, 2019).

Specifically, the purposes of Socket2 are to provide a single and interactive web-based tool that:
- objectively and unambiguously defines the location of CC regions in protein structures;
- automatically assigns the heptad-repeat positions (a, b, c, d, e, f, g) of the component sequences of these CC regions; and
- provides an initial structural bioinformatic characterisation of the identified CC regions.
In turn, we anticipate that Socket2 and data from it will be used in a variety of ways, such as:
- to gather CC sequence statistics and structural parameters to improve sequence-to-structure relationships for this important and ubiquitous class of protein structure;
- to create sequence and structure databases (Testa et al., 2009) that can be used to test CC prediction algorithms, to improve CC modelling (Wood et al., 2017; Wood and Woolfson, 2018), and to develop rules for CC design (Woolfson, 2017);
- to highlight unusual assemblies of α-helices that go beyond the traditional CC, and thus expand our understanding of this prolific, varied and versatile protein fold.

In short, we hope that the emerging folding and design principles, founded on KIH packing between α-helices, will enable us to create novel and useful protein assemblies.

References
Beesley, J.L. & Woolfson, D.N. (2019) Curr. Opin. Biotech. 58, 175-182
Crick, F.H.C. (1953) Acta Cryst. 6, 689-697
Dawson, W.M. et al. (2019) Curr. Opin. Chem. Biol. 52, 102-111
Lupas, A.N. & Bassler, J. (2017) Trends Biochem. Sci. 42 (2), 130-140
Testa, O.D. et al. (2009) Nucleic Acids Res. 37, D315-322
Walshaw, J. & Woolfson, D.N. (2001) J. Mol. Biol. 307 (5), 1427-1450
Wood, C.W. et al. (2017) Bioinformatics 33 (19), 3043-3050
Wood, C.W. & Woolfson, D.N. (2018) Protein Sci. 27 (1), 103-111
Woolfson, D.N. (2017) Subcell Biochem. 82, 35-61

Socket2 is an update and upgrade of Socket (Walshaw & Woolfson, 2001), which identifies knobs-into-holes (KIH) packing in protein structures.

Input
The required input is either a 4-letter PDB ID or an uploaded atomic coordinate file in PDB/mmCIF format. Optional parameters are:

packing-cutoff
This parameter, given in angstroms (Å), essentially represents the tightness of the KIH interactions; the smaller it is, the closer the packing of side chains. Socket2 searches for 'knobs' (single side chains from one helix) that fit into 'holes' formed by four side chains of a neighbouring helix.
The 'packing-cutoff' parameter specifies how close the centres of mass of the knob and hole side chains must be to constitute a KIH interaction. The four distances, from the knob to each of the four hole side chains, must all be within this cutoff for the interaction to be considered KIH. The default packing-cutoff is 7.0 Å.

Helix extension
Socket2 finds KIH interactions to assign CC regions in α-helical assemblies. DSSP (Kabsch & Sander, 1983) is used to identify the α-helices. However, many helices deviate slightly from regular helical character because of factors such as solvent-induced distortions, peptide-bond distortions, or the presence of residues such as glycine, proline, serine and threonine. Any resulting small breaks in the helices can be avoided by extending the helices with the helix-extension parameter. The default helix-extension value is 0.

Major updates from Socket
There are two major updates:
- We have removed the upper limit on the number of helices "allowed" in a CC (previously 6), so that CCs with any number of helices and in any orientation can be identified.
- Many PDB entries have glycine residues in helical regions, but Socket did not consider these fully when identifying KIH packing. To rectify this, Socket2 adds dummy Cβ atoms to glycine residues. This allows Socket2 to identify CCs in which glycine participates in KIH interactions. One example is PDB ID: 4V1F, a membrane protein (Preiss et al., 2015).

New features of the Socket2 Webserver
The webserver now allows users to run Socket2 either by providing a 4-letter PDB ID or by uploading a file containing 3D coordinates with the file extension .pdb or .mmCIF. It uses MAXIT to convert mmCIF to PDB file format. An added advantage of the webserver is the option to use biological assemblies by checking the box provided. However, this option is not available for user-uploaded files.
It checks for the presence of modified residues and, if the corresponding MODRES records are missing, adds them to the input file.

It employs the NGL viewer (Rose et al., 2018) to allow immediate inspection of CCs, giving users an edge over the standalone version. Each participating helix of the CC is displayed in a different colour, and knob residues can be highlighted in ball-and-stick representation. Residues in the helices can also be rainbow-colour-coded according to their heptad positions.

Figure 1: Output page with NGL visualizer and PDB ID: 6G67 (Rhys et al., 2018) as an example.

After a successful run of Socket2, the webserver creates a separate tab, "Coiled-Coil #", for each identified CC to allow easy analysis. Each CC tab contains three sub-tabs: (i) Register, (ii) Angle between Helices and (iii) Packing Angle.

Register: The webserver tabulates the name, number and heptad position of each residue of each helix in a CC. However, the tables do not illustrate or imply any interhelical interactions.

Figure 2: A snippet of tabulated information for each residue of the participating α-helices of a CC.

Angle between Helices: Socket2 computes the angle between each pair of helices in an identified CC and uses these angles to determine the CC orientation, parallel or antiparallel. Using Matplotlib (Hunter, 2007), the webserver plots the angle for each helical pair.

Figure 3: Angle distribution for pairs of helices in the identified CC in PDB ID: 6G67 (biological assembly).

Packing Angle: Socket2 calculates the packing angle for every knob packing into the base of its hole, as defined by Harbury et al. (1993) and illustrated in Walshaw & Woolfson (2001). Using Matplotlib, the webserver plots the packing angle of each knob for each chain.

Figure 4: A snippet of the packing angle distribution of the knobs of each helix.

Output
There are three types of output:
- The standard output, describing each coiled coil (if any). Its features are described in the linked documentation.
- A long output (optional), which gives more details of each KIH interaction (see the linked example). All residues in the α-helices are listed, one residue per line. The residue number and chain identifier (if any) are given. Residues that are part of KIH packing regions have the following additional information:
  - the register(s) (preceded by 'R') and topology(s) (in square brackets);
  - the knobtype (preceded by 'T'); and
  - details of the hole into which the knob fits: the helix (preceded by 'H'); the core-packing angle; the register of the 4 hole residues; the chain identifier (if any); the residue name and number of the 4 hole residues; and the dimensions of the hole, i.e. the lengths of the 4 'sides' (distances between the centres of mass of adjacent hole side chains).
  N.B. in some unusual helical assemblies based on KIH packing, a residue can have several different registers.
- An optional PyMOL (Schrödinger, LLC) script, which highlights the CCs along with the knob and corresponding hole residues. Upon clicking, the PyMOL script loads the input file with its 3D coordinates, allowing detailed analysis of knobs and their corresponding holes. The script generates three images of the protein in which the residues are colour-coded according to their position in the heptad repeat.

References
Harbury, P.B. et al. (1993) Science 262, 1401-1407
Hunter, J.D. (2007) Comput Sci Eng 9 (3), 90-95
Kabsch, W. & Sander, C. (1983) Biopolymers 22, 2577-2637
Preiss, L. et al. (2015) Sci Adv 1 (4), e1500106
Rhys, G.G. et al. (2018) Nat Commun 9 (1), 4132
Rose, A.S. et al. (2018) Bioinformatics 34, 3755-3758
The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC
Walshaw, J. & Woolfson, D.N. (2001) J. Mol. Biol. 307 (5), 1427-1450

C Source Code
- Socket2.zip
- Socket2.tar.gz

Documentation
- README
- manual.txt
- standard.txt (example output)
- long.txt (example long-format output)

The CC+ Database (Testa et al., 2009) is a detailed, searchable repository of CC assignments, which is freely available at http://coiledcoils.chm.bris.ac.uk/ccplus. CCs have been identified using Socket2, which locates CCs based on KIH packing of side chains between α-helices. There are two points of entry into the CC+ Database: the "Periodic Table of Coiled-coil Structures", which presents a graphical path through CC space based on manually validated data; and the "Dynamic Interface", which allows queries of the database at different levels of complexity and detail. The latter enables the efficient and rapid compilation of subsets of coiled-coil structures. These can be created and interrogated with increasingly sophisticated pull-down, keyword and sequence-based searches to return detailed structural and sequence information. Also provided are means for outputting the retrieved coiled-coil data in various formats, including PyMOL scripts and Position-Specific Scoring Matrices.

The SPIRICOIL database (Rackham et al., 2010) is a unique resource for the study of CCs, a protein super-secondary structure, and is freely available at https://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/spiricoil/. It provides CC annotations of all currently sequenced genomes and of sequence databases such as UniProt. It also contains structural and evolutionary information on known coiled-coil structures. Input sequences can also be checked for the presence of coiled coils.

CCBuilder is a web-based interactive tool that uses ISAMBARD in the background to generate complete (backbone and side chain) atomistic models of CCs in conformations specified by the user. Models are scored for feasibility with a measure of backbone strain, a test for KIH packing, and two atom-based forcefields. A basic 'Builder' mode is capable of modelling homo- and hetero-oligomeric coiled coils in parallel and antiparallel conformations. CCBuilder is available at http://coiledcoils.chm.bris.ac.uk/ccbuilder2/builder.
Please consider citing these articles if you use results from Socket2:
Kumar, P. & Woolfson, D.N. (2021) Bioinformatics
Walshaw, J. & Woolfson, D.N. (2001) J. Mol. Biol. 307 (5), 1427-1450
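
As an illustration of the packing-cutoff criterion described in the help text (a knob counts as packing into a hole only if its side-chain centre lies within the cutoff of all four hole side-chain centres), the following is a minimal Python sketch. It is not the Socket2 source code, which is written in C. The names `centroid`, `is_knob_in_hole` and `DEFAULT_PACKING_CUTOFF` are hypothetical, the unweighted mean of atom positions stands in for a true centre of mass, and treating "within" as less than or equal to the cutoff is an assumption.

```python
import math

# Default packing-cutoff in Å, as stated in the help text.
DEFAULT_PACKING_CUTOFF = 7.0


def centroid(atoms):
    """Unweighted mean of (x, y, z) atom positions.

    Assumption: used here as a simple stand-in for the side-chain
    centre of mass; no atomic masses are applied.
    """
    n = len(atoms)
    return tuple(sum(atom[i] for atom in atoms) / n for i in range(3))


def is_knob_in_hole(knob_atoms, hole_residues, cutoff=DEFAULT_PACKING_CUTOFF):
    """Return True if the knob centre is within `cutoff` Å of all four
    hole side-chain centres (hypothetical helper, not the Socket2 API).

    knob_atoms    -- list of (x, y, z) side-chain atoms of the knob residue
    hole_residues -- list of four lists of (x, y, z) side-chain atoms
    """
    if len(hole_residues) != 4:
        raise ValueError("a hole is formed by exactly four side chains")
    knob_centre = centroid(knob_atoms)
    # All four knob-to-hole distances must satisfy the cutoff.
    return all(
        math.dist(knob_centre, centroid(residue)) <= cutoff
        for residue in hole_residues
    )


if __name__ == "__main__":
    # Toy coordinates, not taken from any PDB entry.
    knob = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # centre at (0.5, 0, 0)
    hole = [
        [(5.5, 0.0, 0.0)],
        [(0.5, 5.0, 0.0)],
        [(0.5, -5.0, 0.0)],
        [(0.5, 0.0, 7.5)],  # 7.5 Å from the knob centre
    ]
    print(is_knob_in_hole(knob, hole))              # default 7.0 Å cutoff
    print(is_knob_in_hole(knob, hole, cutoff=8.0))  # looser cutoff
```

With these toy coordinates the fourth hole side chain lies 7.5 Å from the knob centre, so the interaction is rejected at the default cutoff and accepted at 8.0 Å, which mirrors the statement that a smaller cutoff demands closer packing.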