Update: 08/26/2008

A multi-threaded  version is available here.  This version has an accelerated Striped Smith-Waterman for both X86 and Cell Broadband Engine, found in the PS3.  The multi threaded version is useful in benchmarking the algorithm.  I have started working a different version for database searches and alignments.

Introduction

The Smith-Waterman [1] algorithm is one of the most sensitive sequencing algorithms in use today.  It is also the slowest due to the number of calculations needed to perform the search.  To speed up the algorithm, it has been adapted to use Single Instruction Multiple Data, SIMD, instructions found on many common microprocessors today.  SIMD instructions are able to perform the same operation on multiple pieces of data parallel.

The program swsse2 introduces a new SIMD implementation of the Smith-Waterman algorithm for the X86 processor using SSE2 instructions.  The weights are precomputed parallel to the query sequence, like the Rognes [2] implementation, but are accessed in the striped pattern [3].  The new implementation reached speeds six times faster than other SIMD implementations.

Below is a graph comparing the total search times of 11 queries, 3806 residues, against the Swiss-Prot 49.1 database, 75,841,138 residues.  The tests were run on a PC with a 2.00GHz Intel Xeon Core 2 Duo processor with 2 GB RAM.  The program is singlely threaded, so the number of cores has no affect on the run times.  The Wozniak[4], Rognes[2] and striped[3] implementations were run with the scoring matrices BLOSUM50 and BLOSUM62 and four different gap penalties, 10-k, 10-2k, 14-2k and 40-2k.  Since the Wozniak's runtime does not change depending on the scoring matrix, one line is used for both scoring matrices.

The final comparison is against the heuristic programs FASTA 34t26b4 [5] and NCBI BLAST 2.2.14 [6]. For a more meaningful comparison, SSEARCH was modified to use the striped Smith–Waterman implementation. All the searches were run using the BLOSUM50 scoring matrix and gap penalties of 10-2k. The options for all programs were to display the top 500 scores with no alignments.


Build Instructions

  • Download the zip file with the swsse2 sources.
  • Unzip the sources.
  • Load the swsse2.vcproj file into Microsoft Visual C++ 2005.
  • Build the project (F7).  For optimized code, be sure to change the configuration to a Release build. 
  • The swsse2.exe file is in the Release directory ready to be run.

Running

To run swsse2 three files must be provided, the scoring matrix, query sequence and the database sequence.  Four scoring matrices are provided with the release, BLOSUM45, BLOSUM50, BLOSUM62 and BLOSUM80.  The query sequence and database sequence must be in the FASTA format.  For example, to run with the default gap penalties 10-2k, the scoring matrix BLOSUM50, the query sequence ptest1.fasta and the sequence database db.fasta use:

     c:\swsse2>.\Release\swsse2.exe blosum50.mat ptest1.fasta db.fasta
     ptest1.fasta vs db.fasta
     Matrix: blosum50.mat, Init: -10, Ext: -2

     Score  Description
        53  108_LYCES Protein 108 precursor.
        53  10KD_VIGUN 10 kDa protein precursor (Clone PSAS10).
        32  1431_ECHGR 14-3-3 protein homolog 1.
        32  1431_ECHMU 14-3-3 protein homolog 1 (Emma14-3-3.1).
        27  110K_PLAKN 110 kDa antigen (PK110) (Fragment).
        26  1432_ECHGR 14-3-3 protein homolog 2.
        25  13S1_FAGES 13S globulin seed storage protein 1
        25  13S3_FAGES 13S globulin seed storage protein 3
        25  13S2_FAGES 13S globulin seed storage protein 2
        23  12S1_ARATH 12S seed storage protein CRA1
        22  13SB_FAGES 13S globulin basic chain.
        21  12AH_CLOS4 12-alpha-hydroxysteroid dehydrogenase
        21  140U_DROME RPII140-upstream protein.
        21  12S2_ARATH 12S seed storage protein CRB
        21  1431_LYCES 14-3-3 protein 1.
        20  1431_ARATH 14-3-3-like protein GF14

     21 residues in query string
     2014 residues in 25 library sequences
     Scan time:  0.000 (Striped implementation)

Options

    Usage: swsse2 [-h] [-(n|w|r|s)] [-i num] [-e num] [-t num] [-c num] matrix query db
        -h       : this help message
        -n       : run a non-vectorized Smith-Waterman search
        -w       : run a vectorized Wozniak search
        -r       : run a vectorized Rognes search (NOT SUPPORTED)
        -s       : run a vectorized striped search (default)
        -i num   : gap init penalty (default -10)
        -e num   : gap extension penalty (default -2)
        -t num   : minimum score threshold (default 20)
        -c num   : number of scores to be displayed (default 250)
        matrix   : scoring matrix file
        query    : query sequence file (fasta format)
        db       : sequence database file (fasta format) 

Note

The Rognes implementation is not released as part of the swsse2 package due to patent concerns. 

References

[1] Smith, T. F. and Waterman, M. S., (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195-197.

[2] Rognes, T. and Seeberg, E., (2000) Six-fold speed-up of the Smith-Waterman sequence database searches using parallel processing on common microprocessorsBioinformatics, 16, 699-706. 

[3] Farrar, M., (2007) Striped Smith–Waterman speeds database searches six times over other SIMD implementations. Bioinformatics, 23:156-161, 2007

[4] Wozniak, A.. (1997) Using video-oriented instructions to speed up sequence comparison.
Comput. Appl. Biosci., 13, 145-150.

[5] Pearson, W. R. and Lipman, D. J.. (1988) Improved tools for biological sequence
comparison
. Proc. Natl. Acad. Sci. USA, 85, 2444-2448.

[6] Altschul S. F., Madden T. L., Schaffer A. A., Zhang J., Zhang Z., MillerW. and Lipman
D. J.. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs
. Nucleic Acids Res., 25, 3389-3402.