Loading ...
Sorry, an error occurred while loading the content.

SSAHA Psuedocode documentation

Expand Messages
  • greenexile4568
    Hi, Is there a pseudocode representation of the SSAHA algorithm ? I m particulary interested in learning what approaches were taken to implement the various
    Message 1 of 2 , Nov 27, 2005
    • 0 Attachment
      Hi,
      Is there a pseudocode representation of the SSAHA algorithm ?

      I'm particulary interested in learning what approaches were taken to
      implement the various key steps in the algorithm. For example did you
      use a decrease and conquer approach when generating the possible kmer
      permutations..

      Be grateful for any information you can provide
      Thanks
      Brian Wilson
    • Adam Spargo
      Hi, The algorithms are very simple. The key set is very small (when compared to the sequence data). We map ALL kmers whether they appear in the sequence or
      Message 2 of 2 , Nov 28, 2005
      • 0 Attachment
        Hi,
        The algorithms are very simple. The key set is very small (when compared
        to the sequence data). We map ALL kmers whether they appear in the
        sequence or not. There is a 2-bit encoding:

        A = 00
        C = 01
        G = 10
        T = 11

        Then each kmer is a 2*k bit 'integer'. Once k is defined you process each
        kmer by looping over the K bases, assign the 2 bits for each bases and
        bit-shift 2 to the left. So for example if k=12 you use 24 bits.

        There are (|A|)^k permuatations for an alphabet A, so for DNA there are
        4^k permutations, for k=12 there are 4^12 = 16777216 keys. The hashtable
        consists of a head, with one entry for each kmer, and a body.

        head[i] points to an entry in body, so that body[head[i]] is the first
        occurrance of kmer i in the subject sequence, body[head[i]+1] is the next
        occurrance etc, until you reach body[head[i+1]] which is the first
        occurrance of the next kmer.

        To process a query you loop over the (overlapping) kmers, once you have
        calculated i for a kmer, body[head[i]] ... body[head[i+1]-1] are all the
        locations of that kmer in the subject.

        Let me know if you need any clarification.

        Cheers,

        Adam.



        On Mon, 28 Nov 2005, greenexile4568 wrote:

        > Hi,
        > Is there a pseudocode representation of the SSAHA algorithm ?
        >
        > I'm particulary interested in learning what approaches were taken to
        > implement the various key steps in the algorithm. For example did you
        > use a decrease and conquer approach when generating the possible kmer
        > permutations..
        >
        > Be grateful for any information you can provide
        > Thanks
        > Brian Wilson
        >
        >
        >
        >
        >
        >
        >
        > Yahoo! Groups Links
        >
        >
        >
        >
        >
        >
        >

        --
        Dr Adam Spargo
        High Performance Assembly Group email: aws@...
        Wellcome Trust Sanger Institute Tel: +44 (0)1223 834244 x7728
        Hinxton, Cambridge CB10 1SA Fax: +44 (0)1223 494919
      Your message has been successfully submitted and would be delivered to recipients shortly.