BCB 590                                                                       

Lab 6                                                                                 Name _____________________________

Function Prediction and Annotation

 

Objectives

  1. Learn how to infer the function of a protein on the basis of sequence similarity to proteins of known function
  2. Predict functional residues on the basis of conservation using ConSurf
  3. Learn to use some advanced features of PyMOL to infer functional residues from structural models

 

Exercises

 

Required questions are in red.  

 

Note: If you were not able to attend the regularly scheduled lab section, it may help to review the background lecture slides, which can be downloaded from the course webpage. Please feel free to ask a TA if you have any questions regarding these slides.

In the previous lab, we saw that

 

1)   ConSurf (http://consurf.tau.ac.il/)

One of the simplest ways to predict which residues are functionally important is on the basis of conservation. One webserver that can predict functional residues from sequence alone is ConSeq, which calculates sequence conservation scores for a user-provided sequence. In today’s exercise, we will instead use the ConSurf server, designed by the same group, because it allows us to view the conserved residues on a structure that we provide. The conservation calculations are the same.

In this exercise, we will continue to use the Myb protein from the previous lab. As we noticed during yesterday’s exercise, the model produced by SWISS-MODEL is not necessarily useful to us if our end goal is to determine the DNA-binding residues, because the template it selected (PDB ID: 2aje) is not one of the structures in which the protein is complexed with DNA. In today’s exercise, we will resubmit the Myb sequence and specify the particular structure we wish to use as a template.

 

>MYB305

MDKKPCNSQDVEVRKGPWTMEEDLILINYIANHGEGVWNSLAKSAGLKRTGKSCRLRWLNYLRPDVRRGNITPEEQLLI

MELHAKWGNRWSKIAKHLPGRTDNEIKNYWRTRIQKHIKQAENMNGQAANSEQNDHQEGSSSHMSSAGPTETYSPTS

YSANIDTTFQGPFLTETNDNIWSMEDIWSMQLLNGD
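
To make the idea of a per-position conservation score concrete, here is a minimal Python sketch. It is not the method used by ConSurf or ConSeq, which estimate evolutionary rates from a phylogenetic tree. It swaps in a much simpler stand-in, Shannon entropy of each alignment column, and the toy alignment and the function name column_conservation are made up for illustration.

```python
import math
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column from 0.0 (highly variable) to 1.0 (fully conserved).

    Simplified stand-in for a real conservation method: the score is
    1 - (Shannon entropy of the column / log2(20)). Gap characters are ignored.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # entropy if all 20 amino acids were equally common
    scores = []
    for col in range(n_cols):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # an all-gap column carries no information
            continue
        total = len(residues)
        entropy = 0.0
        for count in Counter(residues).values():
            p = count / total
            entropy -= p * math.log2(p)
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Hypothetical toy alignment (NOT real Myb homologs), one string per sequence.
toy_alignment = ["KWGNRW", "KWGSRW", "RWGNKW", "KWGTRW"]

scores = column_conservation(toy_alignment)
for position, score in enumerate(scores, start=1):
    print(position, toy_alignment[0][position - 1], round(score, 2))
```

Columns in which every sequence has the same residue score 1.0, and the score drops as more residue types appear. A real server also has to find homologs, align them, and correct for how closely related the sequences are, none of which this sketch attempts.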

 

 

a.    Submit the sequence to SWISS-MODEL as in the previous lab but this time type 1h8aC in the “Use a specific template” field.

b.    Download the model structure and save it in a folder for later use.

c.    Go to the ConSurf website (http://consurf.tau.ac.il/) and enter the PDB file you’ve just downloaded under the “Enter your own PDB file” option. SWISS-MODEL does not provide chain information in its PDB files, so make sure to type “none” in the Chain Identifier box. Scroll down and click “Submit”.

d.    Click on the link to download all results.

e.    Click on the link near the bottom of the page for displaying conserved residues in PyMOL, and follow the instructions on the resulting page.

f.     If you’ve done this correctly, you should now have a PyMOL session with a group of selections corresponding to each of the different conservation classes, with blue colors representing least conserved residues and magenta representing those that are most conserved.

g.   Display this molecule using the spheres representation, then save an image of it and submit this image with your assignment

 

 

 

2)   Advanced PyMOL commands

Now, we want to see if the residues predicted on the basis of sequence appear to actually be the functionally important residues.

a.    Download the PDB file for the template we used: 1h8a

b.    Load this file into your current PyMOL session by using the Open command under the File menu.

c.    We only want to use the DNA from this file, so we will create a separate object for the DNA atoms, and then delete the rest. Use the resn selector from the last lab to create a selection containing only the DNA. Remember that you can use the Boolean OR operator to select multiple bases at once. The residue names for the nucleic acid bases are DA, DT, DG, and DC.

d.    Under your DNA selection, click on the A (Action) button and select “create object”.

e.    Now we can delete the 1h8a object by clicking A next to it and selecting “delete object”.

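f.     If our modeled protein is not already aligned with the appropriate region of the actual structure, we can align them using the command align <modelname>, 1h8a. Note that this command needs the 1h8a object to be present, so run it before deleting that object in step e.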

g.    You should now have only the dna molecule and our modeled protein structure displayed. Do the conserved residues appear to coincide well with the ones in the protein-DNA interface?

h.   In addition to visual inspection, we can use PyMOL to define which residues are in the interface, by using the around operator to select those atoms within a specified distance of the dna molecule. One simple definition of an interface is to consider any atoms within 5 angstroms of each other to be interacting. Use the command select interface, dna around 5 AND (NOT dna) to create a selection of all of the atoms that fit this definition.

i.      This will only select the atoms that are within this specified distance, but we want to label the entire residue for any residue that has any atoms that meet the criteria. We can do this by using the command select interface, byres interface. Using this definition, how well does the set of highly conserved residues overlap with the set of actual interface residues? Why do you think some of the highly conserved residues may be far from the interface?

j.     We can also draw any hydrogen bonds between the two molecules by using a particular mode of the distance command. First, we need to create a selection called “protein” by defining protein as everything that is NOT DNA (try to figure out how to do this)

k.    Draw the bonds by typing dist hbonds, protein, dna, mode=2

l.      Use the appropriate set of commands to display the dna molecule in the sticks representation, and the protein in the cartoon representation with only the interface residues displayed as sticks. Save a copy of this image and submit it along with your assignment.
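
The interface definition used in steps h and i (any protein atom within 5 angstroms of a DNA atom, then expanded to whole residues) can also be written out in plain Python, outside PyMOL. The sketch below is only an illustration under stated assumptions: the atom records, coordinates, and the set of "conserved" residue numbers are invented toy data, residues are identified by number only (chains are ignored), and the function names are hypothetical. It is not how PyMOL implements around or byres internally.

```python
import math

DNA_RESIDUE_NAMES = {"DA", "DT", "DG", "DC"}


def split_atoms(atoms):
    """Separate atoms into DNA and non-DNA lists using the residue name."""
    dna = [a for a in atoms if a["resn"] in DNA_RESIDUE_NAMES]
    protein = [a for a in atoms if a["resn"] not in DNA_RESIDUE_NAMES]
    return dna, protein


def interface_residues(protein, dna, cutoff=5.0):
    """Return residue numbers having at least one atom within `cutoff` of any DNA atom."""
    close = set()
    for p in protein:
        if any(math.dist(p["xyz"], d["xyz"]) <= cutoff for d in dna):
            # One close atom is enough to count the whole residue.
            close.add(p["resi"])
    return close


# Invented toy coordinates in angstroms (NOT taken from 1h8a).
atoms = [
    {"resn": "DA", "resi": 1, "xyz": (0.0, 0.0, 0.0)},
    {"resn": "DT", "resi": 2, "xyz": (3.0, 0.0, 0.0)},
    {"resn": "LYS", "resi": 10, "xyz": (0.0, 4.0, 0.0)},   # close to DA
    {"resn": "LYS", "resi": 10, "xyz": (0.0, 9.0, 0.0)},   # far atom, same residue
    {"resn": "TRP", "resi": 11, "xyz": (3.0, 0.0, 4.5)},   # close to DT
    {"resn": "GLY", "resi": 12, "xyz": (20.0, 20.0, 20.0)},
    {"resn": "ASN", "resi": 13, "xyz": (0.0, 0.0, 12.0)},
]

dna, protein = split_atoms(atoms)
interface = interface_residues(protein, dna, cutoff=5.0)

# Hypothetical set of highly conserved residue numbers, for comparison only.
conserved = {10, 12}
shared = interface & conserved

print("interface residues:", sorted(interface))
print("conserved and in interface:", sorted(shared))
print("fraction of interface that is conserved:", len(shared) / len(interface))
```

Comparing the two sets this way gives a number to go with the visual impression from step i. Conserved residues that fall outside the interface set are the ones to think about when answering why conservation and DNA contact do not always coincide.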

 

 

3)   Other servers

Each year, the journal Nucleic Acids Research (NAR) produces an issue dedicated solely to new and updated webservers.

Go to this year’s issue (http://nar.oxfordjournals.org/content/vol36/suppl_2/index.dtl) and find 2 or 3 servers that appear to be relevant to predicting an aspect of function relevant to a research question you are interested in. Using one of the sequences we’ve provided in the previous labs, or a sequence of particular interest to you, attempt to make predictions on your sequence using the sites you have chosen.

Briefly explain why you chose the particular sites and sequences you used.

 

Attach a copy of your results with your assignment submission

 

 

Please email your completed assignment to Peter.

Email: petez

Domain: iastate.edu