Program seaview was published in:
Gouy, M., Guindon, S. & Gascuel, O. (2010) SeaView version 4:
a multiplatform graphical user interface for sequence alignment and phylogenetic tree building.
Molecular Biology and Evolution 27(2):221-224.
Galtier, N., Gouy, M. & Gautier, C. (1996)
SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny.
Comput. Appl. Biosci., 12:543-548.
Version 4.3.3
Binaries and full source code available from http://pbil.univ-lyon1.fr/software/seaview.html
© 1996-2011 Manolo Gouy
Laboratoire de Biometrie et Biologie Evolutive
CNRS / Universite Lyon I
Licensed under the GNU General Public Licence.
Seaview drives the Muscle, ClustalW2, Gblocks, and PhyML programs and uses code from the PHYLIP package for parsimony. Please quote:
Some seaview versions use the PDFlib Lite library for pdf output under the "Open Source Developer Exemption" of the PDFlib Lite License Agreement.
Mouse Use:
Open, Open Mase, Open Phylip, Open Clustal, Open MSF, Open Fasta, Open NEXUS: to load an alignment in one of these formats. Formats Phylip, Clustal and Fasta use _ instead of space in names. Only the interleaved version of the Phylip format is supported. Mase and Nexus formats have the useful feature of allowing extra data beyond sequences and names (comments, accession numbers are really useful!). They can also store trees, site sets [see Sites Menu], species sets, footers, genetic code information.
Import from DBs:
to import one or several sequences from databases (EMBL, GenBank,
SwissProt/UniProt) into the alignment window naming them either
by their ID or LOCUS record name or by their species. Sequences
can be identified by two means:
1. By name, accession number or keyword. Protein IDs, without their extension
(e.g., CAA06608 from /protein_id="CAA06608.1"),
are processed as keywords attached to sequences, so can also be used. Because imported sequences are
named with information taken from their ID/LOCUS record, the protein ID won't be used to name the sequence.
2. By a local text file of names or accession nos (one per line, strict text file, e.g., not
a .doc or .odt file).
In case 1. and for nucleotide databases, it is also possible to import
sequence fragments corresponding to a specified feature key (e.g., CDS, rRNA, tRNA) and,
optionally, to require a given string to be present in the feature's
annotations. If no matching string is specified, only the first matching
feature key is imported.
CDSs are imported with their correct genetic code and reading frame.
Imports require an internet connection that allows outbound access to
port # 5558. Access to a series of other databases is also possible.
Save: when active, saves the alignment in the current file (whose name appears as the window title). The shortcut for this operation is ctrl-S (cmd-S on Mac).
Save as...: saves the alignment under a name and in a format chosen in the file selector that appears next. This item is unavailable when a nucleotide alignment is translated to proteins. Use "Save prot alignmnt" below instead.
Save selection: only active when some sequences are selected or when a site line is displayed; saves to a file the selected sites, the selected sequences, or the selected sites of the selected sequences.
Save prot alignmnt: only active when a nucleotide alignment is translated to proteins; saves the resulting protein sequences in any format.
Prepare pdf/ps: Writes the alignment as a pdf/PostScript file with optional choices set through the File/pdf-ps options menu item (or the file dialog on Mac OS):
Concatenate: To add one alignment (source) to the end of another (target). Can be done "by name" (seqs with same name are concatenated) or "by rank" (seqs with same rank in alignments are concatenated). Option "add gaps" replaces names absent in either alignment by gap-only sequences ("by name" only).
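The "by name" mode with the "add gaps" option can be sketched as follows. This is only an illustration in Python, not seaview's own code: the function name concatenate_by_name and the representation of an alignment as a dict (sequence name -> aligned string) are assumptions made for the example.

```python
def concatenate_by_name(target, source):
    """Append the source alignment to the target, matching sequences by name.

    Both alignments are dicts mapping sequence name -> aligned string
    (a hypothetical representation chosen for this sketch). Names absent
    from either alignment are completed with gap-only segments, as the
    "add gaps" option does.
    """
    target_len = len(next(iter(target.values()))) if target else 0
    source_len = len(next(iter(source.values()))) if source else 0
    result = {}
    for name, seq in target.items():
        # Name missing from source: pad with a gap-only block.
        result[name] = seq + source.get(name, "-" * source_len)
    for name, seq in source.items():
        if name not in target:
            # Name missing from target: gap-only block comes first.
            result[name] = "-" * target_len + seq
    return result
```

For instance, concatenating a target holding sequences "a" and "b" with a source holding "a" and "c" yields three sequences, where "b" ends with gaps and "c" starts with gaps.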
New window: Opens a new, empty alignment window.
Close window: Closes the alignment window.
Quit: guess what?
Copy selected seqs: only active when some sequences are selected; copies
these sequences to the clipboard, or only their selected sites if
a site line is displayed.
Paste alignment data: To paste previously copied alignment data to the end of
the current alignment window.
Select all: Selects all sequences from the alignment.
Rename sequence: To rename the currently selected (= name in black
background) sequence.
Edit comments: To see or change comments of the currently selected
sequence (Comments can only be saved in mase/NEXUS formats).
Edit sequence: To edit the selected sequence, typically by pasting
external data, or by opening two edit sequence
windows and transferring sequence data between them.
Delete sequence(s): Deletes all selected sequences from the alignment.
Create sequence: Creates a new, empty sequence in the alignment;
set "Allow seq. edition" from Props menu ON
to be able to type the sequence in.
Load sequence: Loads a new sequence into the alignment.
The sequence can be typed in or pasted from a
selection made in another window.
Duplicate sequence: Duplicates the currently selected sequence
with prefix D_ in its name.
Complement sequence: Creates a new sequence equal to the complementary
strand of the currently selected sequence with prefix
C_ in its name.
Reverse sequence: Creates a new sequence by reading the currently
selected sequence 3' -> 5'; the new sequence is named with prefix R_.
Exchange Us and Ts: Exchanges U and T bases in all currently selected
sequences.
Dot plot: Performs a dot plot analysis of the two selected
sequences.
Consensus sequence: Computes the consensus of all currently selected
sequences. At any site, the consensus residue is the
most frequent one if its frequency is above a threshold
value. This threshold (60 % by default) can be changed
through item "Consensus options>
Edit threshold" of menu Props.
Below threshold, N or X is used.
Del gap-only sites: Deletes all gap-only sites from the alignment.
Set genetic code: Specifies the genetic code used to translate
the selected sequence(s) to protein. Active only with
nucleotide alignments. Genetic codes are saved if the
mase or NEXUS file formats are used.
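The "Consensus sequence" and "Del gap-only sites" operations described above can be illustrated with a short Python sketch. It is not seaview's implementation: the function names are hypothetical, and the way gaps are counted in the consensus (like any other character, with frequencies taken over all selected sequences) is an assumption.

```python
from collections import Counter


def consensus(seqs, threshold=0.60, protein=False):
    """Consensus of equal-length aligned sequences.

    A site gets its most frequent residue when that residue's frequency
    is above the threshold, and N (nucleotides) or X (proteins) otherwise.
    Assumption: gaps are counted like any other character.
    """
    unknown = "X" if protein else "N"
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(column) > threshold else unknown)
    return "".join(out)


def delete_gap_only_sites(seqs):
    """Remove every column that contains a gap in all sequences."""
    kept = [col for col in zip(*seqs) if any(c != "-" for c in col)]
    if not kept:
        return ["" for _ in seqs]
    return ["".join(chars) for chars in zip(*kept)]
```

With the default 60 % threshold, a site where two of three sequences agree (67 %) keeps the majority residue; raising the threshold to 70 % would turn that site into N or X.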
Align all
Align selected sequences:
Runs the chosen alignment program on all or on selected sequences. Alignment program is chosen with Alignment options of menu Align.
With MSWindows and Unix, the chosen alignment program is searched in the directory
of the seaview program and in directories of your PATH.
Align selected sites:
Runs the chosen alignment program on the block of
selected sites and the set of selected sequences.
A window asks you to choose the reference sequence:
gaps present before alignment in the chosen
sequence will be preserved in the new alignment.
Profile alignment: Align selected sequences against a profile, that is, a group
of pre-aligned sequences.
Possible operations for making a profile alignment:
Fontsize: 8, 10, ..., 36: Sets the font size used to display sequences.
View as proteins:
When ON, DNA sequences are displayed as translated to protein sequences.
This makes it possible to align them as proteins and to go back to
the DNA level by unselecting this item. Items Save/Save as of menu File
and sequence editing are unavailable when ON.
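To make the idea of displaying aligned DNA as protein concrete, here is a minimal Python sketch that translates an aligned coding sequence codon by codon with the standard genetic code. It is an illustration only: the function name translate_aligned is hypothetical, and the handling of special cases (an all-gap codon becomes one gap, any other codon not in the table becomes X) is an assumption, not a description of seaview's behavior.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate_aligned(dna):
    """Translate an aligned, in-frame DNA/RNA string to protein."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper().replace("U", "T")
        if codon == "---":
            protein.append("-")  # assumed: gap codon -> gap residue
        else:
            # assumed: partial gaps or ambiguity codes give X
            protein.append(STANDARD_CODE.get(codon, "X"))
    return "".join(protein)
```

Because each codon maps to exactly one residue column, an alignment edited at the protein level can be mapped back to the DNA level by expanding each protein column into three nucleotide columns.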
Colors:
IUPAC nucleotide ambiguity symbols:

| Symbol | M | R | W | S | Y | K | V | H | D | B |
|---|---|---|---|---|---|---|---|---|---|---|
| Nucleotides | AC | AG | ATU | GC | CTU | GTU | ACG | ACTU | AGTU | CGTU |
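The ambiguity symbols in this table can be expressed as a lookup table. The Python sketch below is an illustration, not part of seaview: the names IUPAC and compatible are hypothetical, U is folded into T for simplicity, and N (any nucleotide) is added from the standard IUPAC alphabet although it is not listed in the table.

```python
# Each symbol maps to the nucleotides it may stand for (U treated as T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "GC", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",  # not in the table above; standard IUPAC "any base"
}


def compatible(sym1, sym2):
    """True when two symbols can stand for at least one common nucleotide."""
    return bool(set(IUPAC[sym1.upper()]) & set(IUPAC[sym2.upper()]))
```

For example, R (A or G) is compatible with A but not with Y (C or T).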
Site sets specify the parts of a multiple alignment to be retained for further analyses (e.g., those parts of the alignment taken as reliably aligned). Retained sites are depicted as series of Xs on a special line at the bottom of the alignment panel. Mouse clicks and drags on this line construct or alter the set (see Mouse use on the site line).
Several sets of sites can be created and stored with the alignment if the Mase or NEXUS formats of alignment files are used. Each set has a name chosen by the user. One set of sites at most can be displayed at any time through this menu.
Item Save selection of menu "File" saves in an alignment file only those sites of the alignment that belong to the currently displayed site set.
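Restricting an alignment to a site set amounts to keeping only the columns that fall inside the set's blocks. A minimal Python sketch, assuming the set is given as (start, end) pairs that are 1-based and inclusive (the function name extract_sites and this convention are assumptions made for the example):

```python
def extract_sites(seqs, segments):
    """Keep only the alignment columns covered by the given segments.

    seqs: list of equal-length aligned strings.
    segments: list of (start, end) pairs, assumed 1-based and inclusive.
    """
    return [
        "".join(seq[start - 1:end] for start, end in segments)
        for seq in seqs
    ]
```

With segments (1,2) and (5,6), an eight-column alignment is reduced to four columns: the first two and the fifth and sixth.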
Create set: Several kinds of sets of sites are possible:
Species sets can be created and stored with the alignment if the Mase or NEXUS formats of alignment files are used.
To select one or several species, click or drag on their names; they will appear in black background.
To memorize the current set of selected species, choose "Create set" from this menu. The program will ask for a name for this set.
Delete set: deletes (just from memory) the current set of species.
name: displays with black background the set of species memorized
under that name.
Comment lines can be created and displayed at the bottom of the screen. These lines can contain any text, and the program maintains the vertical alignment between this text and the sequences. This text can be saved using the mase or NEXUS file formats only.
To edit this text, click on the line name, position the cursor, and type text.
Click again on the line name to stop editing this text.
Show / Hide footers: To show / hide all footer lines
Create footer: To create a new footer line
Delete footer: To delete the currently selected footer line
Type a string in the box at right and press the <return> key or push the button to position the cursor at the next occurrence of this string from its current place.
Push the button to position the cursor at the next occurrence of the current search string.
Sequence gaps are ignored by the search procedure.
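A gap-insensitive search of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not seaview's code: the function name find_ignoring_gaps is hypothetical, positions are 0-based alignment columns, and matching is case-sensitive.

```python
def find_ignoring_gaps(aligned, query, start=0):
    """Return the alignment column where query next occurs, or -1.

    Gaps ('-') in the aligned sequence are skipped while matching.
    start is the 0-based alignment column from which to search.
    """
    # Alignment column of every non-gap residue.
    columns = [i for i, c in enumerate(aligned) if c != "-"]
    ungapped = "".join(aligned[i] for i in columns)
    # Index, in the ungapped string, of the first residue at or after start.
    first = sum(1 for i in columns if i < start)
    hit = ungapped.find(query, first)
    return columns[hit] if hit >= 0 else -1
```

Calling the function again with start set just past the previous hit mimics repeating the search for the next occurrence.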
Moves the cursor to desired position or sequence:
This menu is used to compute, draw, save, and import DNA or protein phylogenetic trees.
Protein-coding DNA sequences displayed as protein sequences
(item "View as proteins"
of menu Props) are treated as protein sequences.
All tree computations apply to selected sequences (or all if none) and selected sites
(or complete alignment if none).
All trees, sequence, and site selections can be saved together with the alignment data in either
the mase or
Nexus formats.
Parsimony:
Computes parsimony trees using PHYLIP's v3.52 dnapars or protpars programs and returns the strict consensus of all equally parsimonious trees found, without branch lengths.
Randomize seq. order: to repeat search for shortest trees with randomized sequence order.
Ignore all gap sites: if on, all gap-containing sites are excluded from analysis.
Gaps as unknown states: if on, gap-containing sites are coded as unknown states so don't
introduce parsimony steps; if off, they are treated as an additional character state. The interface forbids
the meaningless configuration that would ignore gaps and treat them as unknown states.
Equally best trees retained: maximum number of equally best trees retained in search.
Bootstrap: performs bootstrap evaluation of clade statistical support (can be interrupted).
User tree: computes the number of steps of a user-given tree taken from those in the trees
menu. Such trees may have been computed previously or imported from an external source.
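Bootstrap evaluation rests on resampling alignment sites with replacement and rebuilding a tree from each resampled alignment. The resampling step alone can be sketched in Python as below; the function name bootstrap_replicate is hypothetical and the tree-building step is left out.

```python
import random


def bootstrap_replicate(seqs, rng=random):
    """Build one bootstrap replicate of an alignment.

    Columns are drawn with replacement until the replicate has as many
    sites as the original alignment; every sequence uses the same draws.
    """
    n_sites = len(seqs[0])
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in seqs]
```

The support of a clade is then the fraction of replicate trees in which that clade appears.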
Distance methods:
Computes NJ or BioNJ trees on a variety of pairwise phylogenetic distances.
NJ/BioNJ: to select the tree-building algorithm
Save to file: does not compute any tree but saves sequence pairwise distances to a local file.
Distance: select one among a variety of evolutionary distances: J-C: Jukes & Cantor (1969);
K2P: Kimura (1980) JME 16:111;
HKY: Rzhetsky & Nei (1995) MBE 12:131;
LogDet: Lake (1994) PNAS 91:1455;
Lockhart et al. (1994) MBE 11:605;
Ka/Ks: Li (1993) JME 36:96.
ignore all gap sites: if on, all gap-containing sites are excluded from analysis;
if off, not all sequence pairs use the same set of sites for computation of distances.
Bootstrap: performs bootstrap evaluation of clade statistical support (can be interrupted).
User tree: computes least squares branch lengths for selected user tree topology.
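As an illustration of two of the distances listed above, the Python sketch below computes the Jukes & Cantor and Kimura two-parameter distances for one pair of aligned nucleotide sequences using their standard closed-form formulas. It is not seaview's code: the function names are hypothetical, and sites where either sequence has a gap or an ambiguity symbol are skipped for that pair, which corresponds to leaving "ignore all gap sites" off.

```python
import math


def p_and_q(seq1, seq2):
    """Proportions of transitions (P) and transversions (Q) between
    two aligned sequences, over sites where both have A, C, G or T."""
    purines = {"A", "G"}
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity symbol: skip this site
        sites += 1
        if a != b:
            if (a in purines) == (b in purines):
                transitions += 1
            else:
                transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return transitions / sites, transversions / sites


def jukes_cantor(seq1, seq2):
    """J-C distance: d = -3/4 ln(1 - 4p/3), p = fraction of differences."""
    P, Q = p_and_q(seq1, seq2)
    return -0.75 * math.log(1 - 4 * (P + Q) / 3)


def kimura_2p(seq1, seq2):
    """K2P distance: d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    P, Q = p_and_q(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
```

Both formulas are undefined when sequences are too divergent (the argument of the logarithm becomes zero or negative); in this sketch that case raises a ValueError from math.log.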
PhyML:
Computes trees using PhyML v3.0.1 as an external program.
If needed, PhyML can be downloaded from its web site.
Under Unix, file $HOME/.seaviewrc may contain the name of the PhyML executable used by seaview.
Please quote: Guindon S, Gascuel O. (2003)
A simple, fast, and accurate algorithm to estimate
large phylogenies by maximum likelihood. Systematic Biology 52(5):696-704.
Model: select one among a variety of evolutionary models.
Branch support: can be omitted (None) or estimated either by the approximate likelihood
ratio test approach (aLRT) or by bootstrap.
Ts/Tv ratio: (applies to some nucleotide models only) optimize or fix a priori this ratio.
Invariable sites: ignore (None), optimize, or set to an a priori value (Fixed) the fraction
of invariable sites.
Across site rate variation: ignore (None), optimize, or set to an a priori value (Fixed) the
alpha parameter of the gamma distribution of rates across sites. If not ignored, the # of
rate categories must also be set.
Tree searching operations: NNI (nearest-neighbor interchange), SPR (subtree pruning and
regrafting) and 'best of NNI & SPR' are options improving the search for the most likely
tree but requiring increasing computation time.
Starting tree: controls the tree used to start tree-space search; can be BioNJ, a user-given
tree from the trees menu, or a number of random trees (possible only with SPR).
Turn off "Optimize tree topology" to compute the likelihood of a user-given tree.
Import tree: to import an external, Newick-formatted tree.
New tree window: opens a new, empty tree window.
File menu
Save to Trees menu: to save in menu Trees of alignment window a previously computed tree.
Remove from Trees menu: removes a tree from menu Trees of alignment window.
Save rooted (unrooted) (sub)tree: saves displayed tree or subtree to a local file as rooted or unrooted form.
Print: prints displayed tree (see also 'Page count' below).
Save as PDF/PostScript: saves displayed tree to PDF (or PostScript) local file (see Page Count).
Save as SVG: saves displayed (sub)tree to a scalable vector graphics (SVG) local file that
can be edited with appropriate programs (e.g., Inkscape).
A4 - Letter: controls the page format for PDF/PostScript operations.
Page count>#: controls the # of pages used for print/PDF/PostScript operations.
Reorder following tree: reorders sequences of alignment window as in displayed tree.
Select in alignment: selects in alignment window all members of subtree (active when only a subtree
is plotted).
Open tree or alignment: opens a new tree
(Newick format) or alignment file (any format).
New window: opens a new, empty tree window where a Newick-formatted tree can be pasted from clipboard.
Close window: closes the displayed tree, which is lost unless it has been saved to Trees menu.
Edit menu
Copy: (Mac OS and MSWindows only) copies tree plot to clipboard for pasting to external programs.
Paste tree: to paste a Newick-formatted
tree contained in the clipboard (only if window is empty).
Find: finds sequence names in tree that contain a user-given string, and red-colors them in tree
display (case insensitive).
Again: repeats 'Find' operation with same matching rules.
Edit tree header: to change the tree's brief descriptive header line.
Bootstrap threshold: only bootstrap values above the given threshold are displayed.
Root at tree center: roots the tree at its point most equidistant to all leaves (when branch lengths exist).
Get or Set window size: to control the size of the current tree window.
Edit tree shape: to alter tree shape by moving or deleting sequence groups. A new window displays
the tree without branch lengths. Clicking on a square selects a sequence group that appears in red.
It can be moved to another branch of tree by clicking on another square, or deleted ("Delete group"
button). The "Select group" button selects another group for further move or delete operations.
Complete edits by pressing "End edit" button, and, possibly, "File/Save to Trees menu".
Font menu: to control font, style and size of all text in tree display.
Br lengths: toggles display of branch length values next to each branch (very small values are not displayed).
Bootstrap: toggles display of bootstrap (or aLRT) support value next to each branch.
squared/circular/cladogram: toggles display between squared plot convenient for rooted trees, circular plot convenient for unrooted trees, and cladogram which displays trees with all branch lengths a multiple of a unit length and all leaves aligned at right. Cladogram is not available for branch length-free trees because it is equivalent to the squared display for them.
Full: normal, full tree display.
Swap: to swap branches around a node. Click on relevant square that appears.
Re-root: to set tree outgroup. Click on relevant square that appears.
Select: to select/unselect alignment sequences from a tree. Click in the tree on a square at
a sequence name or at a node to select/unselect it in the corresponding alignment window.
Conversely, click on sequence names in the alignment window to select/unselect them in the tree.
Selected sequences appear with a red square in the tree, and with their name on black background
in the alignment.
Subtree: to limit display to a subtree. Click on relevant square that appears.
Subtree up: when a subtree is being displayed, adds one more node towards tree root to display.
Zoom: vertical zoom for tree display; one or two scrollers appear. The tree can be moved by the scrollers, the mouse wheel, or by clicking on and dragging the plot.
Enter the desired values for the window size and # of matches/window, then click on button "Compute"; the dot plot will appear.
Click in the dot plot; the corresponding sequence regions appear in the alignment panel above the dot plot. Use "Magnify" to take a close look.
Click on arrows at left to move the hit point by one residue in any of six directions.
Move the slider below the alignment panel to control the number of displayed residues.
Fit to window, Reduce, Magnify: perform zoom in and out operations
Write PDF/ps: saves the dot plot to a PDF or PostScript file.
Close: closes the dot plot window
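The role of the two dot plot parameters can be shown with a short Python sketch: a dot is drawn at (i, j) when the windows starting at position i of the first sequence and position j of the second share at least the requested number of identical residues. The function name dot_plot and its parameter names are hypothetical, and this naive version is meant to convey the principle, not to reproduce seaview's implementation.

```python
def dot_plot(seq1, seq2, window, min_matches):
    """Return the (i, j) window start positions (0-based) that score
    at least min_matches identical residues over a window of given size."""
    hits = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq1[i:i + window], seq2[j:j + window])
            )
            if matches >= min_matches:
                hits.append((i, j))
    return hits
```

Lowering the number of matches per window makes the plot more sensitive but noisier; raising it keeps only strongly similar regions.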
Zero or more header lines each beginning with ;;
Next, for each sequence in the alignment:
One or more comment lines each beginning with ;
Sequence name alone on a line (may be long and may contain spaces)
Sequence data in free form, possibly with numbers and spaces ignored while
reading the file. Dashes denote gaps.
Header lines may contain any text and also contain descriptions of trees, site sets and species groups when such data have been defined.
Trees are as in this example:
;;$ BioNJ tree
;;[BioNJ 658 sites J-C](((boli_haplo_03:0.00146,boli_haplo_06:0.00159)
;;:0.00159,boli_haplo_05:0.00146):0.02886,boli_haplo_01:0.03142);
Site sets are written as in this example:
;;# of segments=10 mychoice
;; 14,74 221,256 416,449 990,1148 1363,1384 1474,1483 1556,1668
;; 2034,2062 2114,2139 2756,2859
where "mychoice" is the name of the set of sites and where the series of pairs of numbers lists the endpoints of successive blocks of sites.
Species groups are written as in this example:
;;@ of species = 4 distant outgroup
;; 2, 3, 4, 5
where "distant outgroup" is the name of the species group and where the series of numbers lists the ranks of the sequences that belong to the species group.
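The layout described above is simple enough to read with a few lines of code. The following Python sketch parses mase-formatted text into its header lines and sequence entries. It is an illustration written from the description in this section, not seaview's reader: the function name parse_mase and the returned data structure are assumptions, and header lines are returned as raw text without decoding trees, site sets or species groups.

```python
def parse_mase(text):
    """Parse mase-formatted text into (header_lines, records).

    Each record is a dict with keys 'name', 'comments' and 'sequence'.
    """
    header, records = [], []
    current = None          # record currently receiving sequence data
    pending_comments = []   # comment lines read before the next name
    for line in text.splitlines():
        at_top = current is None and not pending_comments and not records
        if line.startswith(";;") and at_top:
            header.append(line[2:])
        elif line.startswith(";"):
            # A comment line opens a new sequence entry.
            current = None
            pending_comments.append(line[1:])
        elif current is None:
            if not line.strip():
                continue
            # First non-comment line: the sequence name, alone on its line.
            current = {"name": line.strip(),
                       "comments": pending_comments,
                       "sequence": ""}
            pending_comments = []
            records.append(current)
        else:
            # Sequence data: digits and white space are ignored.
            current["sequence"] += "".join(
                c for c in line if not (c.isdigit() or c.isspace())
            )
    return header, records
```

The sketch relies on the rule that every sequence has at least one comment line, which is what marks the boundary between one entry's data and the next entry's name.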
Use item "Customize" of menu Props to further customize the program.
catalog of amino acid colors: colors used for each amino acid family. Click on any to control the desired shade, as explained above. White is used for gaps and for unlisted residues.
Example: with the default coloring scheme, groups of amino acids KR and AFILMVW are displayed with the first and second catalog colors starting from left, respectively.
Click "reset" to use default amino acid families and color catalog. "Apply" or "Set changes permanent" to apply new shades to current alignment.