3. Examples for Using GenomeComp
Chapter Contents
3.1 Two Sequences Comparison (a step-by-step tutorial)
3.2 Three Sequences Comparison
In this part we use the program to compare the two example sequences
(both in GenBank format) found in the "example" directory that ships
with the program itself. If you are still unsure how the program
works after the description in the previous part, please follow the
steps in this section. You will find that it is very easy to use
this program to make a comparison!
Here we use two input sequences:
one is Escherichia coli K12 MG1655, section 2 of 400 of the complete
genome, with the accession number AE000112,
and the other is Escherichia coli O157:H7 EDL933 genome, contig
1 of 3, section 2 of 155, with the accession number AE005178,
both from NCBI
(Note: these two example sequences already exist in the "example"
directory in the GenomeComp installation path).
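Before loading the files it can help to confirm which record each one holds. The following is a minimal sketch, not part of GenomeComp, that reads the LOCUS and ACCESSION lines from the header of a GenBank flat file. The function name `read_genbank_header` and the demo header are assumptions made for illustration; only the accession number comes from the text above.

```python
import io


def read_genbank_header(handle):
    """Return (locus_name, accession) from the first record of a GenBank flat file."""
    locus = None
    accession = None
    for line in handle:
        if line.startswith("LOCUS"):
            fields = line.split()
            locus = fields[1] if len(fields) > 1 else None
        elif line.startswith("ACCESSION"):
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else None
        elif line.startswith(("FEATURES", "ORIGIN", "//")):
            break  # the header section is over, stop reading
    return locus, accession


# Hand-written placeholder header (hypothetical, not copied from the example file).
demo = io.StringIO(
    "LOCUS       DEMO                   10000 bp    DNA     linear   BCT\n"
    "DEFINITION  placeholder record for illustration.\n"
    "ACCESSION   AE000112\n"
    "FEATURES             Location/Qualifiers\n"
)
print(read_genbank_header(demo))
```

To check a real file, open it with `open(path)` and pass the file object in place of `demo`, for instance for "sequence1.gbk" in the "example" directory.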
The steps are as follows
(all screenshots were taken on Microsoft Windows):
-
In the configuration
window you may find that the entries for locating the external programs
are blank, which means you might not have them installed on your
system. Do not worry, because we do not need them
in this example. Leave the "Project name" entry
at its default. Click the "Browse..." button that follows
the "Working directory" entry and set it to the "example"
directory in the GenomeComp installation directory. For example,
it might be "C:\Program Files\GenomeComp1.2\example".
Make sure this setting is correct because it is very important
for the result of this example.
-
Click the "OK"
button after specifying the "Working directory" properly
to hide the dialog window and raise the main window.
-
In the main
window click the "Browse..." button after the "The
1st sequence:" entry and find the "example" folder
in the GenomeComp installation folder and choose "sequence1.gbk"
as the first sequence.
-
Click the "Start"
button in the main window to start the comparison.
-
We customize the 1st sequence
name to "Ecoli K12" and the 2nd to "Ecoli
O157" in the confirm window,
and leave the other settings at their defaults. Then click the "OK"
button in the confirm window.
-
Then the figure in
the canvas will be shrunk to half of its original size, and it will look like
the image below. We can see that these two sequences are very similar
and that the only difference is that Ecoli K12 has an insertion
of IS186 at about 4.8 Kb.
- Since the text in
the canvas is not shrunk, the title of each sequence is usually
covered by other text. One way to avoid this
is to redraw the figure by clicking the 'Start' button again. A better
way might be to remove the title text from the canvas by unchecking the
'Title' item in the view menu.
Now we use three input
sequences: the 1st is the Shigella flexneri 2a 301 complete genome,
newly sequenced by the Chinese
National Human Genome Center, Beijing, with the accession number
AE005674; the 2nd is the Escherichia coli O157:H7 EDL933 complete
genome with the accession number AE005174;
and the reference sequence is the Escherichia coli K-12 MG1655 complete
genome with the accession number U00096.
The main steps are almost
the same as in the former comparison; the differences are as follows:
-
Set the compare
type in the options menu to Three
sequences since it is not the default value.
-
Also set the anchor
style in the options menu to "Left" for better visualization
of the left part of those three inputs.
-
We set the "Figure
scale" to 200 to get a more general view of the comparison.
-
We set the "Lower
limit" to 1500 in order to filter out matches from multi-copy
transposon sequences.
-
We set the megablast
expectation value (E-value) to 1e-10 in the confirm window to speed up
the BLAST comparison process.
-
Since the input sequences
are very long (4~5 Mbp) we have to wait several minutes for GenomeComp
to carry out the comparison and display the results in the canvas.
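The effect of the "Lower limit" and megablast E-value settings in the list above can be illustrated with a small filter over BLAST hits. This is only a sketch under stated assumptions: it assumes the hits are available in the standard 12-column tabular BLAST output, that "Lower limit" applies to alignment length in base pairs, and that the limit itself is included; GenomeComp's internal handling may differ. The names `Hit`, `parse_tabular` and `filter_hits`, and the demo rows, are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # percent identity
    length: int       # alignment length in bp
    evalue: float


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse 12-column tabular BLAST output, skipping blank and comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Columns: 0 query, 1 subject, 2 % identity, 3 alignment length, 10 E-value
        hits.append(Hit(cols[0], cols[1], float(cols[2]), int(cols[3]), float(cols[10])))
    return hits


def filter_hits(hits: Iterable[Hit], lower_limit: int = 1500,
                max_evalue: float = 1e-10) -> List[Hit]:
    """Keep hits at least lower_limit bp long with E-value at or below max_evalue."""
    return [h for h in hits if h.length >= lower_limit and h.evalue <= max_evalue]


# Made-up rows for illustration only; they are not real comparison results.
demo_rows = [
    "seqA\tseqB\t99.10\t5200\t40\t2\t1\t5200\t101\t5300\t0.0\t9500",
    "seqA\tseqB\t98.50\t1300\t18\t1\t9000\t10299\t400\t1699\t0.0\t2300",
    "seqA\tseqB\t80.00\t2000\t390\t10\t20000\t21999\t7000\t8999\t1e-5\t60",
]
for hit in filter_hits(parse_tabular(demo_rows)):
    print(hit.query, hit.subject, hit.length, hit.evalue)
```

With these settings a short, repeated match (such as the 1300 bp demo row) is dropped by the length limit, and a weak match is dropped by the E-value threshold, which leaves fewer lines to draw in a whole-genome view.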
The following image is
a screenshot of the comparison results.