Results 1 to 6 of 6

Thread: SACA-K vs. divsufsort for computing BWT

  1. #1
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts

    SACA-K vs. divsufsort for computing BWT

    I did a comparison of SACA-K with divsufsort for computing a BWT using suffix arrays.
    SACA-K is by Ge Nong, found here: https://code.google.com/p/ge-nong/
    and described in the second paper, Practical Linear-Time O(1)-Workspace Suffix Sorting for Constant Alphabets,
    I compared with divsufsort-lite from https://code.google.com/p/libdivsufsort/
    using both divbwt() to compute the BWT directly and divsufsort() to compute the suffix array and compute the BWT from that. All algorithms use 5n memory. The times for enwik8 (100 MB) on a 2.0 GHz T3200, g++ 4.8.1 -O3, Win32, in 1 thread:

    divbwt - 28 sec.
    divsufsort - 31 sec.
    saca-k - 70 sec.

    Code:
    // bwt.cpp -- public domain
    // To compute BWT: bwt in out b|d|s
    // b = use divbwt()  https://code.google.com/p/libdivsufsort/  (libdivsufsort-lite.zip)
    // d = use divsufsort()
    // s = use SACA_K()  http://ge-nong.googlecode.com/files/saca-k-tois-20130413.zip
    // To compile: g++ -O3 bwt.cpp divsufsort.c saca-k.cpp -o bwt
    
    #include <stdio.h>
    #include <stdlib.h>
    #include "divsufsort.h"
    
    void SACA_K(unsigned char *s, unsigned int *SA,
      unsigned int n, unsigned int K,
      unsigned int m, int level);
    
    int main(int argc, char** argv) {
    
      // Open file
      if (argc<4) {
        printf("Usage: bwt input output b|d|s\n");
        return 1;
      }
      FILE* in=fopen(argv[1], "rb");
      if (!in) perror(argv[1]), exit(1);
    
      // Get size, read
      fseek(in, 0, SEEK_END);
      int n=ftell(in);
      fseek(in, 0, SEEK_SET);
      unsigned char* s=new unsigned char[n+1];
      if (fread(s, 1, n, in)!=(unsigned)n) perror("fread"), exit(1);
    
      // compute bwt
      int *t=new int[n+1];
      unsigned char* s2=new unsigned char[n];
      int p=0;
      if (argv[3][0]=='b')
        p=divbwt(s, s2, t, n);
      else if (argv[3][0]=='s') {
        s[n]=0;
        SACA_K(s, (unsigned*)t, n+1, 256, n+1, 0);
        int j=0;
        if (n>0) s2[j++]=s[n-1];
        for (int i=1; i<=n; ++i) {
          if (t[i]==0) p=j;
          else s2[j++]=s[t[i]-1];
        }
      }
      else if (argv[3][0]=='d') {
        divsufsort(s, t, n);
        int j=0;
        if (n>0) s2[j++]=s[n-1];
        for (int i=0; i<n; ++i) {
          if (t[i]==0) p=j;
          else s2[j++]=s[t[i]-1];
        }
      }
      printf("p=%d n=%d\n", p, n);
    
      // Output
      FILE* out=fopen(argv[2], "wb");
      if (!out) perror(argv[2]), exit(1);
      fwrite(s2, 1, n, out);
    
      return 0;
    }
    Note that SACA-K inputs n+1 bytes with the last byte set to 0, and outputs n+1 suffixes with the first suffix being the terminator.

  2. #2
    Programmer kvark's Avatar
    Join Date
    Aug 2006
    Location
    Toronto, Canada
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Your results are very close to what authors were getting: http://ge-nong.googlecode.com/files/saca-k-tois.pdf
    Their implementation is a raw material and needs to be optimized. So far it's the only known algorithm to produce the suffix array in linear time using only the space for input and output (total 5N bytes).
    Last edited by kvark; 14th February 2014 at 06:52.

  3. The Following User Says Thank You to kvark For This Useful Post:

    Bulat Ziganshin (14th February 2014)

  4. #3
    Member
    Join Date
    Aug 2013
    Location
    United States
    Posts
    4
    Thanks
    4
    Thanked 4 Times in 2 Posts
    It looks like divsufsort is using OpenMP, but the others are not. Was it enabled when you got those results?

  5. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    I compiled without -fopenmp so all results are single threaded.

    I updated the program to include SAIS, both the original version by Ge Nong and the optimized version by Yuta Mori. For comparison, I also added a naive suffix sort using std::sort().

    Code:
    p=22481309 n=100000000 in 27.151 sec.  (divbwt)
    p=22481309 n=100000000 in 29.989 sec.  (divsufsort)
    p=22481309 n=100000000 in 68.644 sec.  (SACA_K)
    p=22481309 n=100000000 in 84.911 sec.  (SAIS, not optimized)
    p=22481309 n=100000000 in 42.067 sec.  (sais, optimized)
    p=22481309 n=100000000 in 39.454 sec.  (sais_bwt, optimized)
    p=22481309 n=100000000 in 234.940 sec.  (std::sort)
    Repeating the test compiling with -fopenmp (all tests with g++ 4.8.1 -O3 on a 2.0 GHz T3200 (2 cores), Win32).

    Code:
    p=22481309 n=100000000 in 24.134 sec. (divbwt)
    p=22481309 n=100000000 in 26.374 sec. (divsufsort)
    Task manager shows both cores in use for about the first 8 seconds, then 1 core after that.

    Code:
    // bwt.cpp v2 -- public domain
    // To compute BWT: bwt in out b|d|s|j|i|c|q
    // b = use divbwt()  https://code.google.com/p/libdivsufsort/  (libdivsufsort-lite.zip)
    // d = use divsufsort()
    // s = use SACA_K()  http://ge-nong.googlecode.com/files/saca-k-tois-20130413.zip
    // j = use SAIS()  (sais.cpp renamed to sa-is.cpp)
    // i = use sais()  https://sites.google.com/site/yuta256/sais/ (sais-lite-2.4.1.zip)
    // c = use sais_bwt()
    // q = use std::sort()
    // To compile: g++ -O3 -DNDEBUG bwt.cpp divsufsort.c saca-k.cpp sais.c sa-is.cpp -o bwt.exe
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <algorithm>
    #include "divsufsort.h"
    #include "sais.h"
    
    unsigned char* t;  // input[0..n-1]
    unsigned n;
    
    // Compare SA elements
    bool less(unsigned a, unsigned b) {
      if (t[a]<t[b]) return true;
      if (t[a]>t[b]) return false;
      int cmp=memcmp(t+a, t+b, std::min(n-a, n-b));
      if (cmp==0) return a>b;
      return cmp<0;
    }
    
    void SACA_K(unsigned char *s, unsigned int *SA,
      unsigned int n, unsigned int K,
      unsigned int m, int level);
    
    void SAIS(unsigned char *s, int *SA, int n, int K, int cs, int level);
    
    int main(int argc, char** argv) {
    
      // Open file
      if (argc<4) {
        printf("Usage: bwt input output b|d|s|j|i|c|q\n");
        return 1;
      }
      FILE* in=fopen(argv[1], "rb");
      if (!in) perror(argv[1]), exit(1);
    
      // Get size, read
      fseek(in, 0, SEEK_END);
      n=ftell(in);
      fseek(in, 0, SEEK_SET);
      t=new unsigned char[n+1];
      if (fread(t, 1, n, in)!=n) perror("fread"), exit(1);
    
      // compute bwt
      const int type=argv[3][0];
      unsigned *sa=new unsigned[n+1];
      unsigned char* bwt=new unsigned char[n];
      int p=0;
      clock_t start=clock();
      if (type=='b')
        p=divbwt(t, bwt, (int*)sa, n);
      else if (type=='c')
        p=sais_bwt(t, bwt, (int*)sa, n);
      else {
        if (type=='d')
          divsufsort(t, (int*)sa, n);
        else if (type=='i')
          sais(t, (int*)sa, n);
        else if (type=='s') {
          t[n]=0;
          SACA_K(t, sa, n+1, 256, n+1, 0);
          ++sa;
        }
        else if (type=='j') {
          t[n]=0;
          SAIS(t, (int*)sa, n+1, 256, sizeof(char), 0);
          ++sa;
        }
        else if (type=='q') {
          for (unsigned i=0; i<n; ++i) sa[i]=i;
          std::sort(sa, sa+n, less);
        }
        else
          return 1;
        int j=0;
        if (n>0) bwt[j++]=t[n-1];
        for (int i=0; i<n; ++i) {
          if (sa[i]==0) p=j;
          else bwt[j++]=t[sa[i]-1];
        }
      }
      printf("p=%d n=%d in %1.3f sec.\n", p, n,
         double(clock()-start)/CLOCKS_PER_SEC);
    
      // Output
      FILE* out=fopen(argv[2], "wb");
      if (!out) perror(argv[2]), exit(1);
      fwrite(bwt, 1, n, out);
    
      return 0;
    }
    Last edited by Matt Mahoney; 14th February 2014 at 20:09.

  6. The Following 3 Users Say Thank You to Matt Mahoney For This Useful Post:

    encode (15th February 2014),Garen (15th February 2014),samsat1024 (14th February 2014)

  7. #5
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Quote Originally Posted by kvark View Post
    Your results are very close to what authors were getting: http://ge-nong.googlecode.com/files/saca-k-tois.pdf
    Their implementation is a raw material and needs to be optimized. So far it's the only known algorithm to produce the suffix array in linear time using only the space for input and output (total 5N bytes).
    I thought Yuta Mori in some of his openbwt produced a linear time bwt using only 5n bytes?

  8. #6
    Programmer kvark's Avatar
    Join Date
    Aug 2006
    Location
    Toronto, Canada
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Quote Originally Posted by biject.bwts View Post
    I thought Yuta Mori in some of his openbwt produced a linear time bwt using only 5n bytes?
    If you mean "sais-lite" - that's his implementation of Ge Nong's team SAIS algorithm, which (at least originally) consumed more than 5n bytes. Ge Nong's latest draft includes the comparison of divsufsort, sais-lite, their original SAIS, and their new SACA-K. In essence, SACA-K main advantage over SAIS is not requiring more than 5n bytes of memory.

Similar Threads

  1. Computing the BWTS faster
    By nburns in forum Data Compression
    Replies: 4
    Last Post: 20th July 2013, 20:17
  2. GPGPU computing
    By Bulat Ziganshin in forum The Off-Topic Lounge
    Replies: 0
    Last Post: 19th February 2012, 17:10

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •