SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides

Besides the interactive 3D visualization capability, SeqVis supports sequence queries, various types of cluster analysis, and the matched-pairs test of symmetry. Full documentation for SeqVis's features can be viewed here. The system's features are summarized as follows:

  • SeqVis reads and writes alignments in the sequential PHYLIP format, the NEXUS format, and the FASTA format. More sequences can be added dynamically either by opening another sequence file, or by entering the sequence through the user interface.



    Fig. 1a. Adding new sequences through the user interface.

    Since version 1.2, SeqVis can also visualize non-sequence data that consist of four attributes summing to one. Non-sequence data are read and written in a simple text file format (Fig. 1b).



    Fig. 1b. An example .tab file for non-sequence data.

  • The tetrahedron can be rotated in all directions, animated, and manipulated interactively; all items (i.e., wireframe, axes, labels, level, point, and background) on display can be changed (e.g., colour, size, and visibility).



    Fig. 2. Examples of different styles of visualization in SeqVis. Visibility and colour of visual items can be changed dynamically through the user interface.

  • By viewing the points orthogonally through one of the surfaces, the distribution of three nucleotides may be assessed while ignoring the fourth nucleotide.



    Fig. 3. Example of an orthogonal view of the tetrahedron model. In this example, the composition of nucleotide T is ignored (the corner T points out of the screen).

  • The nucleotide composition at the codon sites can be surveyed independently and visualized on a single canvas. This can help to assess the heterogeneity of nucleotide composition across the three codon sites.



    Fig. 4. Codon view of an alignment of nucleotide sequences. The nucleotide compositions at the first, second, and third codon positions are visualized simultaneously on the same canvas using three tetrahedron models. The three models can be animated together.

  • Sequence information can be obtained by mouse-clicking on points of interest or by using inbuilt tools that query the data based on the sequences’ names or attributes.



    Fig. 5. Retrieving sequence information. Sequences can be searched by name and nucleotide content. Matching sequences are highlighted in a different colour. Detailed information about a sequence (i.e., name, nucleotide composition, sequence length, and number of unknown nucleotides) is displayed in the bottom text panel.

  • By projection, the tetrahedron can be illustrated as a de Finetti plot or a linear plot.



    Fig. 6. SeqVis's lower-dimensional views. SeqVis can project and visualize the alignment in lower dimensions (e.g., as a de Finetti plot or a linear plot). The user can choose through the user interface which nucleotides are pooled together.

  • A number of analytical tools are provided, e.g., the matched-pairs test of symmetry, hierarchical clustering, and k-means clustering.



    Fig. 7. Analysis tools in SeqVis. A number of common data clustering and analysis tools are available. In this figure, the points inside the tetrahedron are coloured according to the results of k-means clustering. Results from hierarchical clustering and the matched-pairs test of symmetry are shown as well.

  • On-screen images may be saved in the PNG and JPEG formats.



    Fig. 8. A screenshot captured by SeqVis. The on-screen canvas area can be captured and saved in PNG or JPEG format.
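
To make the tetrahedron, codon-site, and pooled views described in the list above more concrete, the following Python sketch shows one way a sequence's nucleotide composition could be turned into a point inside a tetrahedron. It is an illustration only and is not taken from SeqVis: the function names, the corner coordinates, and the assignment of A, C, G, and T to corners are all assumptions.

```python
from collections import Counter

# Corners of a regular tetrahedron centred on the origin.
# Which nucleotide sits at which corner is an assumption made for this sketch.
CORNERS = {
    "A": (1.0, 1.0, 1.0),
    "C": (1.0, -1.0, -1.0),
    "G": (-1.0, 1.0, -1.0),
    "T": (-1.0, -1.0, 1.0),
}


def composition(seq):
    """Return the frequencies of A, C, G and T; other characters are ignored."""
    counts = Counter(base for base in seq.upper() if base in CORNERS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def tetrahedron_point(freqs):
    """Map four frequencies that sum to one to a 3D point (barycentric mix of corners)."""
    return tuple(
        sum(freqs[base] * CORNERS[base][axis] for base in "ACGT")
        for axis in range(3)
    )


def codon_site_compositions(seq):
    """Compositions at the first, second and third codon positions of an in-frame sequence."""
    return [composition(seq[offset::3]) for offset in range(3)]


def pool(freqs, groups):
    """Pool nucleotides into fewer categories, e.g. for a three-component plot."""
    return {name: sum(freqs[base] for base in members) for name, members in groups.items()}
```

For example, `tetrahedron_point(composition("ACGT"))` lies at the centre of the tetrahedron, and `pool(composition(seq), {"AG": "AG", "C": "C", "T": "T"})` yields three values that sum to one, which is the kind of input a de Finetti plot requires.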

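The matched-pairs test of symmetry mentioned among the analytical tools compares two aligned sequences. The sketch below computes a Bowker-style symmetry statistic from the 4 x 4 table of site-by-site nucleotide pairs. It is a minimal illustration under stated assumptions: the function names are hypothetical, sites with gaps or ambiguous characters are simply skipped, and whether SeqVis computes the test in exactly this way is not confirmed by this page.

```python
from itertools import combinations

BASES = "ACGT"


def divergence_matrix(seq1, seq2):
    """Count how often base i in seq1 is aligned with base j in seq2."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    index = {base: i for i, base in enumerate(BASES)}
    n = [[0] * 4 for _ in range(4)]
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguous characters (an assumption of this sketch).
        if a in index and b in index:
            n[index[a]][index[b]] += 1
    return n


def symmetry_statistic(n):
    """Return (statistic, degrees of freedom) for the off-diagonal cell pairs."""
    stat = 0.0
    df = 0
    for i, j in combinations(range(4), 2):
        total = n[i][j] + n[j][i]
        if total > 0:  # pairs with no observations contribute nothing
            stat += (n[i][j] - n[j][i]) ** 2 / total
            df += 1
    return stat, df
```

Under the null hypothesis of symmetry, the statistic is usually compared with a chi-squared distribution with the returned degrees of freedom; a large value suggests that the two sequences have not evolved under the same stationary conditions.
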
© The University of Sydney, 2004-2011. All rights reserved.