(first post reserved for topic heading)
Some time ago i started rewriting FreeArc from scratch. Now i'm approaching the first public alpha, so i will briefly describe its current state. Improvements over the existing FreeArc:
- 64-bit versions for Windows and Linux
- global deduplication a-la ZPAQ (but without archive updating/generations)
- Lua-programmable program options
- incompatible archive format (it will be fixed only in the next version)
- archive updating and lots of other features aren't yet implemented (help screen shows the features already implemented)
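For readers unfamiliar with the "global deduplication" item above, here is a minimal sketch of the general idea, not the actual FreeArc or ZPAQ implementation. It assumes fixed-size blocks and SHA-1 as the block identifier. Both are illustrative choices, and real deduplicating archivers commonly pick block boundaries from the content instead. The names `dedup_store` and `restore` are hypothetical.

```python
import hashlib

def dedup_store(streams, block_size=4096):
    """Split each input into fixed-size blocks and keep every distinct block once.

    Returns (store, recipes): store maps a block digest to the block bytes,
    recipes holds, per input, the list of digests needed to rebuild it.
    """
    store = {}
    recipes = []
    for data in streams:
        recipe = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha1(block).digest()
            # A block already seen anywhere in the archive is not stored again.
            store.setdefault(digest, block)
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes

def restore(store, recipe):
    """Rebuild one input from its list of block digests."""
    return b"".join(store[d] for d in recipe)
```

"Global" here means the digest table covers the whole archive, so a repeated block is stored once no matter which file it first appeared in.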
The current agenda:
- april: will be released soon
- may: FreeArc archive format, automatic filetype detection based on contents
- june: simple GUI (a-la HtmlArc)
- july: not determined yet, but probably archive updating/generations
Now you can stare at the program docs and fix my dirty English. Editing of the Wiki should be available to anyone with a GitHub account, but if it doesn't work, i can add you to the list of project contributors.
Still Haskell + C++?
Any chance for a library?
i suggest a couple of features
first: alphanumeric notes when adding, so that in future a version can be extracted by name, not by sequential number
second: append-only format. this makes rsync very easy and fast
third: stdio support
please do not use an advanced but slower hash if it is not necessary.
sha1 is much better than sha256, for example, because it is twice as fast. md5 is even better
and make extraction of a single file fast.
finally, a very fast list function
in other words... zpaq ++
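The speed claim about md5, sha1 and sha256 is easy to check on a given machine. Below is a small timing sketch using Python's standard `hashlib`. The function name `time_hashes` and the 16 MiB buffer are arbitrary choices. The ratios depend on the CPU (for example, hardware SHA instructions) and on the library build, so the numbers may not match the "twice as fast" figure.

```python
import hashlib
import time

def time_hashes(data, names=("md5", "sha1", "sha256"), repeats=3):
    """Return the best wall-clock time in seconds to hash `data` with each algorithm."""
    results = {}
    for name in names:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            hashlib.new(name, data).digest()
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results

if __name__ == "__main__":
    payload = bytes(16 * 1024 * 1024)  # 16 MiB of zero bytes
    for name, seconds in time_hashes(payload).items():
        print(f"{name}: {len(payload) / seconds / 1e6:.0f} MB/s")
```

This only measures throughput. It says nothing about collision resistance, which is the other half of the trade-off.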
Last edited by fcorbelli; 11th April 2015 at 14:53.
is it fast enough? but current version can extract only entire archive
Code:
C:\FB>timer fa.exe create m:\a c:\ d:\ e:\ z:\ --no-data --no-warnings -ds
Scanning: 7,532,018,850,656 bytes in 328,525 folders and 2,700,795 files (RAM 204 MiB, cpu 2.964 sec, real 20.065 sec)
Archive directory: 153,917,518 bytes

Kernel Time = 16.692 = 00:00:16.692 = 82%
User Time = 3.120 = 00:00:03.120 = 15%
Process Time = 19.812 = 00:00:19.812 = 97%
Global Time = 20.281 = 00:00:20.281 = 100%

C:\FB>timer fa.exe l m:\a |tail
172,586,496 2011-02-22 03:39:35 .A.... z:\vs2010\sp1\VS10sp1-KB983509.msp
===============================================================================
7,531,763,218,119 => 0 bytes in 328,546 folders and 2,700,803 files

Kernel Time = 0.421 = 00:00:00.421 = 7%
User Time = 4.929 = 00:00:04.929 = 91%
Process Time = 5.350 = 00:00:05.350 = 99%
Global Time = 5.383 = 00:00:05.383 = 100%

C:\FB>timer fa.exe l m:\a -nNO-SUCH-FILE |tail
===============================================================================
0 => 0 bytes in 328,546 folders and 0 files

Kernel Time = 0.312 = 00:00:00.312 = 13%
User Time = 1.887 = 00:00:01.887 = 84%
Process Time = 2.199 = 00:00:02.199 = 97%
Global Time = 2.247 = 00:00:02.247 = 100%
Last edited by Bulat Ziganshin; 11th April 2015 at 15:51.
Skymmer, I have got used to the fact that data compression specialists are literally Huffman worshipers (see e.g. http://pages.cs.brandeis.edu/~dcc/Pr...rogram2015.pdf ) - because they have written dozens of papers/compressors on it ... but now it just means wasting both space and time ( http://encode.ru/threads/1920-In-mem...entropy-coders ).
I see there is an issue with understanding ANS. I could help, there are also descriptions by a few other people, or one can just use e.g. FSE ... changing the format is the perfect moment for the upgrade.
Last edited by JangoFatXL; 11th April 2015 at 21:45.
Jarek, forum frequenters like me know about ANS very well. but freearc is an archiver, not a compressor program, and it relies on lzma, ppmd and other existing algorithms. i don't plan to change the archive format and don't have time to work on improving these compression algorithms
I apologize, I was thinking about Tornado.
i rewrote the program because it's 10 years old and there are a lot of places that need a rewrite. i believe that programs should be rewritten from scratch every few years in order to keep them tight, so this rewrite was even overdue. It was started as FB, with haskell and C++ code, but finally i switched to C++14 and Boost. i believe it's more portable than haskell
about fast adding/listing/extraction - i especially care about scenarios with millions of files and gigabytes of data. in particular, i plan to provide an alternative archive format that allows faster handling of archives with millions of files. but ATM i'm still implementing more basic things
Once, the best compressors in my experiments were NanoZip, FreeArc, CCM, and CMM. The company ruled out closed-source programs, because they are more likely to have messy and buggy code, and FreeArc won.
Last edited by inikep; 12th April 2015 at 12:01.
i know md5 and sha1 pretty well, and nist does not handle this case
you do not need to take the fingerprint of an ssl certificate, nor do a DH exchange.
you need something with a very low or virtually zero collision rate
do not use a hammer if you need a needle.
turning to notes: i suggest adding alphanumeric text to the version.
notes are almost useless for traditional archives.
but here you can use them to extract versions.
suppose you have the freearc codebase and you want to keep all the versions in one single file.
you can do that pretty easily with zpaq
but how do you extract v27.2 from the archive?
you cannot, because you need to know that, say, version 3 of the zpaq archive contains your source code 27.2
as in a database, i want a table to associate internal version numbers 1, 2, 3, 4... with ASCII text: initial, ok, deployed, newhash
so i can restore with, say, extract -version=deployed
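The label table requested above could look roughly like the following sketch. It is only an illustration of the idea, not a feature of FreeArc or ZPAQ. The class `VersionLabels` and its methods are hypothetical names.

```python
class VersionLabels:
    """Maps optional text labels to sequential internal version numbers."""

    def __init__(self):
        self._by_label = {}
        self._next = 1

    def add_version(self, label=None):
        """Register a new version, optionally under a unique label; return its number."""
        if label is not None and label in self._by_label:
            raise ValueError(f"label already used: {label}")
        number = self._next
        self._next += 1
        if label is not None:
            self._by_label[label] = number
        return number

    def resolve(self, ref):
        """Accept either a version number (int or digit string) or a label."""
        if isinstance(ref, int):
            return ref
        if ref in self._by_label:
            return self._by_label[ref]
        if ref.isdigit():
            return int(ref)
        raise KeyError(ref)
```

With versions added as initial, ok, deployed, newhash, `resolve("deployed")` gives internal version 3, which an extract command could then use the same way as a plain number.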
about working with a very large number of files: in future, i suggest splitting the data block file from the metadata index
so you can have your little database with, say, a binary tree or whatever
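A minimal sketch of what separating the data blocks from the metadata index buys: listing touches only the index, and extracting one file is a single seek and read. This is an assumption-laden toy (an in-memory dict as the index, uncompressed payloads, hypothetical function names), not the FreeArc or ZPAQ format.

```python
import io

def append_file(data_out, index, name, payload):
    """Append payload at the end of the data stream and record where it went."""
    offset = data_out.seek(0, io.SEEK_END)
    data_out.write(payload)
    index[name] = (offset, len(payload))

def list_files(index):
    """Listing reads only the index, never the data stream."""
    return sorted(index)

def extract_file(data_in, index, name):
    """Extract one file with a single seek and read."""
    offset, length = index[name]
    data_in.seek(offset)
    return data_in.read(length)

# Usage with an in-memory stream standing in for the data block file.
data = io.BytesIO()
index = {}
append_file(data, index, "a.txt", b"hello")
append_file(data, index, "b.txt", b"world!")
print(list_files(index), extract_file(data, index, "b.txt"))
```

Because data is only ever appended, the same layout also fits the append-only suggestion made earlier in the thread.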
and yes, i'm a senior (or better) mysql guru, so i like to think in db ways.
about stdio: it does not matter for extraction, but for compression.
because it is very common to dump a database to ascii and restore directly
Last edited by fcorbelli; 12th April 2015 at 12:08.
sorry for my terrible language, but sometimes i use a smartphone with an italian keyboard that does not like english very much
<off topic>Argh! I... Must... Resist...
As I am a real bastard, dirty grammar nazi in my native language, could you please use next time "à la" instead of a-la? Thanks a lot.
Anyway, this is a nice job you try to accomplish here! Congratulations!
open source is nonsense in a non-USA/EU world.
so when someone posts the source code, almost nothing can prevent a commercial closed-source rebrand.
this may be bad, but it's true.
so open source is about launch and forget, like a sidewinder
some care, some do not.
ps back in the old days, the late 80s, there was a gentleman's agreement to credit the original creator.
today... not so common
> turning on notes i suggest to add alphanumeric text to the version.
ok, i got it
> about working with very big file number, in future, i suggest to split data block file from metadata index
> so you can have your little database with say a binary tree or whatever
yes, we have already discussed it. it may be even my own idea
Well, open source also means at least a theoretical possibility that, if one day the original author disappears or abandons the project for various reasons, there is still a chance for users that someone will take over, not to mention bug fixes etc... Of course the chance is bigger with smaller and simpler programs, but still... From a user's point of view, closed source is not good news.
But nice to see you work on this nice/useful program, and the plans. Good luck
It's a shame to steal the work of others and claim it as one's own, I think it is a problem of mentality. Therefore I absolutely respect the decision to publish it as closed source.
> ps back in old days late 80 there is a gentleman agreement to report the original creator.
> today... not so common
Furthermore there is still the possibility to change the licence 15 years later and publish the sources if you like, e.g. like they did with some games (Doom and Duke3D I think)
Last edited by JangoFatXL; 12th April 2015 at 16:08.
http://pskibinski.pl/. Programs that I wrote later are fully owned by my employer with a transfer of copyrights.
There is an additional advantage of open-source programs, people can help you find bugs and introduce some improvements. For example I was working with you on 4x4/tornado issues in 2009. I didn't know that the company didn't buy a license from you. I think they chose another compressor.
Last edited by inikep; 12th April 2015 at 20:44.