I've started work on the next entry in my M series of BWT compressors. M19 is the successor to my earlier M03 work and, like M03, it will be a context-aware BWT compressor. Unlike M03, however, M19 uses a more advanced form of context parsing that can skip about 50% of the context transitions M03 requires. The result is much faster context parsing during the encoding phase, although it is not yet clear whether, or by how much, the technique will improve compression. I suspect it will improve on M03, but only time will tell.
So far I have completed only the context parsing component (so it is not yet a compressor), but this is by far the most complicated part of the project. One of the main goals was to achieve fully unbounded context parsing (no limit on the maximum order) while still staying within a 5N operational space, so that the coder demands no more memory than the lightweight suffix array construction stage requires. This goal has been achieved.
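To make the 5N figure concrete, here is a minimal sketch of where such a budget usually comes from. It assumes the common convention that N is the input size in bytes and that the suffix array holds one 32-bit index per input byte (N + 4N = 5N); the post itself does not spell this out. The function names are hypothetical, and the naive sort below is only for illustration. It is not how MSufSort or M19 actually work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <numeric>
#include <string>
#include <vector>

// Assumed accounting: 1 byte per input symbol plus one 32-bit
// suffix index per symbol, i.e. N + 4N = 5N bytes.
std::size_t working_space_bytes(std::size_t n)
{
    return n * sizeof(std::uint8_t) + n * sizeof(std::uint32_t);
}

// Hypothetical helper: naive suffix array built by sorting suffix
// start positions. Slow on repetitive input; illustration only.
std::vector<std::uint32_t> build_suffix_array(const std::string& text)
{
    assert(text.size() <= std::numeric_limits<std::uint32_t>::max());
    std::vector<std::uint32_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), 0u);
    std::sort(sa.begin(), sa.end(), [&](std::uint32_t a, std::uint32_t b) {
        // Lexicographic comparison of the suffixes starting at a and b.
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

// BWT read off the suffix array: the symbol preceding each sorted
// suffix. Assumes the text ends with a unique, smallest sentinel.
std::string bwt_from_suffix_array(const std::string& text,
                                  const std::vector<std::uint32_t>& sa)
{
    std::string out;
    out.reserve(text.size());
    for (std::uint32_t pos : sa)
        out.push_back(pos == 0 ? text.back() : text[pos - 1]);
    return out;
}
```

Under these assumptions, calling `bwt_from_suffix_array(text, build_suffix_array(text))` on a sentinel-terminated string yields its BWT, and `working_space_bytes(text.size())` gives the 5N budget that any later stage would have to stay within.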
As I did with MSufSort 4, I am documenting the progress here as I find time to do the work, which in this case I suspect will not be finished until next year (hence M19). The code for the full context parser is completely functional and is available on GitHub at https://github.com/michaelmaniscalco/m19 for anyone who was ever curious about how M03 context parsing worked (this is similar but more complex).
I will update this thread as more work is completed, starting as soon as possible with a basic overview of how M19 context parsing works.