
Thread: Filesystem benchmark

  1. #181
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Got a new version.
    The main change is different treatment of small_iters.
    small_iters are a feature meant to compensate for clock inaccuracies: I run the whole test several times to get a long enough running time. Up to now the code would use memcpy to estimate the speed of the fastest codecs and guess the number of small_iters from it, the same for all codecs. However, when the slowest codecs are a million times slower than the fastest ones, finding a good number is impossible. From now on, codecs don't iterate a fixed number of times but until a fixed time passes. This time is controlled by the -s switch, which used to mean the number of small_iters.
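A minimal sketch of the "iterate until a fixed time passes" idea, written in Python for brevity rather than in fsbench's own C/C++. The name `run_for` and its parameters are hypothetical and are not taken from fsbench.

```python
import time
import zlib


def run_for(work, min_seconds):
    """Call work() repeatedly until at least min_seconds have elapsed.

    Returns (iterations, elapsed_seconds). The clock is read after every
    call, so slow and fast workloads both run for roughly the same time.
    """
    iterations = 0
    start = time.perf_counter()
    while True:
        work()
        iterations += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return iterations, elapsed


# Illustrative use: checksum a 32 KiB block for about 50 ms.
block = bytes(32 * 1024)
iters, elapsed = run_for(lambda: zlib.crc32(block), 0.05)
print(f"{iters} iterations in {elapsed:.3f} s")
```

Because the stop condition is a duration, a codec a million times slower than another simply completes fewer iterations instead of stretching the run.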
    I delayed retests of fsbench overhead because I knew this feature would increase it and I'd have to do it anyway. I did.
    On Core2Duo/Debian the overhead increased to a little less than 400 ticks/loop, which with L1-sized blocks amounts to 0.014 ticks/byte.
    On Beagle/Timesys it went up to nearly 1200 clocks/loop which proved too much.
    So I added another layer of iters, overhead-reducing ones, controllable with -o switch (default just 1).
    With 10000 iters on Core2Duo/Debian I got to 7 ticks/loop.
    On Beagle it was 11 ticks.
    At 1000 loops it was 12 ticks and at 100 - 22.
    So I recommend doing 100 for L1 measurements, increased to 1000 if you feel generous.
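The extra layer of overhead-reducing iterations can be pictured as reading the clock once per batch of calls instead of once per call. This is only a sketch under that assumption; `run_batched` and `overhead_iters` as a Python parameter are illustrative names, not fsbench code.

```python
import time


def run_batched(work, min_seconds, overhead_iters=1):
    """Run work() in batches of overhead_iters calls until min_seconds pass.

    The clock is read once per batch, so the fixed cost of timing is
    spread over overhead_iters calls. Returns (calls, elapsed_seconds).
    """
    if overhead_iters < 1:
        raise ValueError("overhead_iters must be at least 1")
    calls = 0
    start = time.perf_counter()
    while True:
        for _ in range(overhead_iters):
            work()
        calls += overhead_iters
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return calls, elapsed


# Compare the apparent cost of an empty call with and without batching.
for batch in (1, 100):
    calls, elapsed = run_batched(lambda: None, 0.05, overhead_iters=batch)
    print(f"overhead_iters={batch}: {elapsed / calls * 1e9:.0f} ns per call")
```

The trade-off mentioned in the post applies here too: with a large batch, a small block is processed many times in a row and stays in cache between calls.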

    I considered finding the right number of overhead iters automatically, so users don't even have to know about it, but it would break an important use case: in-memory transforms on small blocks would turn into in-cache ones unless explicitly set, or into something in between (e.g. 2 overhead loops, the 1st in RAM, the 2nd in L1). This made me decide not to change the current behaviour.

    As to other changes, I added yet another deflate implementation, updated many codecs, added CityHash32 which appeared in the update, added a pair of MurmurHashes that I got implemented but forgot to connect to the frontend.

    Also, up to now I used the fast LZ4 decoder. Now I optionally enabled testing the safe one too. I called it LZ4safe.

    CityHash added a piece of crappy platform-specific code that broke it on FreeBSD and a ton of other systems. I fixed it, but they don't seem to have a way to contact them w/out publishing my email address. No thx, I get too much spam already, so I'm not going to tell them.

    Changelog:
    Code:
    0.14
    [+] added z3lib 1.3
    [+] added CityHash32
    [+] added 128-bit variants of murmur hash
    [+] added support for safe lz4 decoder. Use LZ4 for the fast one (like before) and LZ4safe for the safe one
    [~] removed bcl-lz from the list of all codecs. It's so extremely slow it was disturbing testing. And lzfast is still there.
    [~] the number of small_iters is not fixed; duration is
    [~] updated CityHash64/128 to 1.1.0
    [~] updated zlib to 1.2.8
    [~] updated snappy to 1.1.0
    [~] updated lzo to 2.06
    [~] updated blosc to 1.2.3
    [~] updated zopfli to 1.0.0
    [~] updated SpookyHash to V2 2012-08-05
    [~] updated LZ4 to r97
    [~] cleanup
    [!] restoring default console colour would fail on some systems
    [!] several minor bugs
    I did some L1 and L2 benches (gcc 4.6-4.7 used in all tests except Phenom 2 - gcc 4.2.1 there):

    Phenom II X4 955, 4 threads @ 3.2 GHz, L1 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    xxhash256                   1              73.98         0.16
    fletcher2                   2010           60.85         0.20
    FNV1a-Tesla3                2013-05-12     49.91         0.24
    FNV1a-Tesla                 2013-05-12     47.18         0.25
    SpookyHash                  V2 2012-08-05  34.57         0.34
    FNV1a-YoshimitsuTRIAD       2013-05-12     31.45         0.38
    FNV1a-Yorikke               2013-05-12     31.40         0.38
    FNV1a-Yoshimura             2013-05-12     31.25         0.38
    vhash                       2007-04-17     23.70         0.50
    FNV1a-Jesteress             2013-05-12     23.65         0.50
    FNV1a-Mantis                2013-05-12     23.65         0.50
    FNV1a-Meiyan                2013-05-12     23.65         0.50
    vmac                        2007-04-17     23.48         0.51
    CityHash128                 1.1.0          20.43         0.58
    xxhash                      r29            19.89         0.60
    CityHash64                  1.1.0          19.41         0.61
    murmur3_x64_128             2012-02-29     15.78         0.76
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     13.80         0.86
    fletcher4                   2010           13.54         0.88
    CityHash32                  1.1.0          9.49          1.26
    murmur3_x86_32              2012-02-29     9.49          1.26
    murmur3_x86_128             2012-02-29     9.48          1.26
    uhash                       2007-04-17     4.92          2.42
    umac                        2007-04-17     4.90          2.43
    SipHash24                   reference      4.25          2.81
        
    Core2Duo, 2 threads @ 2.33 GHz, L1 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    fletcher2                   2010           24.80         0.18
    xxhash256                   1              21.72         0.20
    FNV1a-Tesla                 2013-05-12     16.69         0.26
    FNV1a-Tesla3                2013-05-12     16.69         0.26
    FNV1a-YoshimitsuTRIAD       2013-05-12     14.45         0.30
    SpookyHash                  V2 2012-08-05  13.80         0.31
    CityHash64                  1.1.0          13.51         0.32
    FNV1a-Yorikke               2013-05-12     13.47         0.32
    FNV1a-Yoshimura             2013-05-12     13.36         0.33
    CityHash128                 1.1.0          12.31         0.35
    xxhash                      r29            8.51          0.51
    FNV1a-Mantis                2013-05-12     8.28          0.52
    FNV1a-Meiyan                2013-05-12     8.19          0.53
    FNV1a-Jesteress             2013-05-12     8.16          0.53
    vhash                       2007-04-17     7.38          0.59
    vmac                        2007-04-17     7.37          0.59
    murmur3_x64_128             2012-02-29     6.80          0.64
    fletcher4                   2010           6.64          0.65
    CrapWow                     2012-06-07     6.35          0.68
    CityHash32                  1.1.0          5.45          0.80
    murmur3_x86_128             2012-02-29     4.52          0.96
    murmur3_x86_32              2012-02-29     3.96          1.10
    uhash                       2007-04-17     2.55          1.70
    umac                        2007-04-17     2.52          1.73
    SipHash24                   reference      1.54          2.82
        
    Core2Duo, 1 thread @ 2.33 GHz, L1 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    fletcher2                   2010           12.79         0.17
    FNV1a-Penumbra              2013-06-16     11.07         0.20
    xxhash256                   1              11.05         0.20
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     9.51          0.23
    FNV1a-Tesla                 2013-05-12     8.52          0.25
    FNV1a-Tesla3                2013-05-12     8.49          0.26
    FNV1a-YoshimitsuTRIAD       2013-05-12     7.35          0.30
    SpookyHash                  V2 2012-08-05  6.99          0.31
    CityHash64                  1.1.0          6.85          0.32
    FNV1a-Yorikke               2013-05-12     6.83          0.32
    FNV1a-Yoshimura             2013-05-12     6.78          0.32
    CityHash128                 1.1.0          6.27          0.35
    xxhash                      r29            4.31          0.50
    FNV1a-Jesteress             2013-05-12     4.12          0.53
    FNV1a-Meiyan                2013-05-12     4.12          0.53
    FNV1a-Mantis                2013-05-12     4.12          0.53
    vhash                       2007-04-17     3.72          0.58
    vmac                        2007-04-17     3.72          0.58
    murmur3_x64_128             2012-02-29     3.45          0.63
    fletcher4                   2010           3.36          0.65
    CrapWow                     2012-06-07     3.22          0.68
    CityHash32                  1.1.0          2.76          0.79
    murmur3_x86_128             2012-02-29     2.29          0.95
    murmur3_x86_32              2012-02-29     2.00          1.09
    uhash                       2007-04-17     1.27          1.71
    umac                        2007-04-17     1.25          1.74
        
    Core2Duo, 1 thread @ 2.33 GHz, L2 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    xxhash256                   1              11.01         0.20
    fletcher2                   2010           9.75          0.22
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     8.95          0.24
    FNV1a-Tesla3                2013-05-12     8.52          0.26
    FNV1a-Tesla                 2013-05-12     8.43          0.26
    SpookyHash                  V2 2012-08-05  7.06          0.31
    FNV1a-YoshimitsuTRIAD       2013-05-12     6.93          0.31
    CityHash64                  1.1.0          6.84          0.32
    FNV1a-Yoshimura             2013-05-12     6.68          0.33
    FNV1a-Yorikke               2013-05-12     6.67          0.33
    CityHash128                 1.1.0          6.28          0.35
    vhash                       2007-04-17     4.48          0.48
    vmac                        2007-04-17     4.48          0.48
    xxhash                      r29            4.28          0.51
    FNV1a-Jesteress             2013-05-12     4.08          0.53
    FNV1a-Meiyan                2013-05-12     4.08          0.53
    FNV1a-Mantis                2013-05-12     4.08          0.53
    murmur3_x64_128             2012-02-29     3.42          0.63
    fletcher4                   2010           3.33          0.65
    CrapWow                     2012-06-07     3.17          0.69
    CityHash32                  1.1.0          2.75          0.79
    murmur3_x86_128             2012-02-29     2.29          0.95
    uhash                       2007-04-17     2.27          0.96
    umac                        2007-04-17     2.27          0.96
    murmur3_x86_32              2012-02-29     1.99          1.09
        
    CortexA8, 1 thread @ 720 MHz, L1 cache
    Codec                       Version        Speed (MB/s)  Ticks/B
    FNV1a-YoshimitsuTRIAD       2013-05-12     1021.79       0.67
    FNV1a-Yoshimura             2013-05-12     989.00        0.69
    FNV1a-Yorikke               2013-05-12     988.45        0.69
    fletcher2                   2010           834.93        0.82
    FNV1a-Jesteress             2013-05-12     682.55        1.01
    FNV1a-Meiyan                2013-05-12     682.55        1.01
    FNV1a-Mantis                2013-05-12     681.20        1.01
    xxhash                      r29            520.19        1.32
    FNV1a-Tesla                 2013-05-12     445.30        1.54
    CrapWow                     2012-06-07     390.54        1.76
    fletcher4                   2010           389.82        1.76
    FNV1a-Tesla3                2013-05-12     388.88        1.77
    xxhash256                   1              354.69        1.94
    murmur3_x86_128             2012-02-29     303.36        2.26
    SpookyHash                  V2 2012-08-05  229.25        3.00
    murmur3_x86_32              2012-02-29     136.94        5.01
    murmur3_x64_128             2012-02-29     133.48        5.14
    uhash                       2007-04-17     121.58        5.65
    umac                        2007-04-17     120.15        5.72
    vhash                       2007-04-17     113.45        6.05
    vmac                        2007-04-17     113.24        6.06
    CityHash32                  1.1.0          84.42         8.13
    CityHash128                 1.1.0          82.57         8.32
    CityHash64                  1.1.0          79.14         8.68
        
    CortexA8, 1 thread @ 720 MHz, L2 cache
    Codec                       Version        Speed (MB/s)  Ticks/B
    FNV1a-YoshimitsuTRIAD       2013-05-12     523.35        1.31
    FNV1a-Yorikke               2013-05-12     504.94        1.36
    FNV1a-Yoshimura             2013-05-12     488.28        1.41
    fletcher2                   2010           459.78        1.49
    FNV1a-Mantis                2013-05-12     412.40        1.67
    FNV1a-Meiyan                2013-05-12     412.40        1.67
    FNV1a-Jesteress             2013-05-12     408.95        1.68
    xxhash                      r29            346.30        1.98
    FNV1a-Tesla                 2013-05-12     307.87        2.23
    CrapWow                     2012-06-07     282.90        2.43
    FNV1a-Tesla3                2013-05-12     281.92        2.44
    fletcher4                   2010           278.06        2.47
    xxhash256                   1              260.83        2.63
    murmur3_x86_128             2012-02-29     231.63        2.96
    SpookyHash                  V2 2012-08-05  184.26        3.73
    uhash                       2007-04-17     157.10        4.37
    umac                        2007-04-17     156.50        4.39
    murmur3_x86_32              2012-02-29     119.03        5.77
    murmur3_x64_128             2012-02-29     115.93        5.92
    vmac                        2007-04-17     101.90        6.74
    vhash                       2007-04-17     101.64        6.76
    CityHash32                  1.1.0          76.97         8.92
    CityHash128                 1.1.0          74.91         9.17
    CityHash64                  1.1.0          72.17         9.51
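The Ticks/B column in the tables above appears to follow from clock rate, thread count and measured speed. The sketch below reproduces a few published values, assuming that GB and MB are binary units (2^30 and 2^20 bytes) and that the clock rate is multiplied by the number of threads. Both assumptions are inferred from the numbers and are not stated in the post; `ticks_per_byte` is a hypothetical helper.

```python
GIB = 2 ** 30  # assumed meaning of "GB" in the tables
MIB = 2 ** 20  # assumed meaning of "MB" in the tables


def ticks_per_byte(clock_hz, threads, bytes_per_second):
    """Clock ticks spent per byte, summed over all threads."""
    return clock_hz * threads / bytes_per_second


# (description, clock in Hz, threads, speed in bytes/s) taken from the tables
samples = [
    ("Phenom II, xxhash256", 3.2e9, 4, 73.98 * GIB),
    ("Core2Duo 1 thread, fletcher2", 2.33e9, 1, 12.79 * GIB),
    ("CortexA8 L1, FNV1a-YoshimitsuTRIAD", 720e6, 1, 1021.79 * MIB),
]
for name, clock, threads, speed in samples:
    print(f"{name}: {ticks_per_byte(clock, threads, speed):.2f} ticks/B")
```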
    Last edited by m^2; 19th June 2013 at 20:48.

  2. #182
    Member Bloax's Avatar
    Join Date
    Feb 2013
    Location
    Dreamland
    Posts
    52
    Thanks
    11
    Thanked 2 Times in 2 Posts
    Quote Originally Posted by m^2 View Post
    CityHash added a piece of crappy platform-specific code that broke it on FreeBSD and a ton of other systems. I fixed it, but they don't seem to have a way to contact them w/out publishing my email address. No thx, I get too much spam already, so I'm not going to tell them.
    Just use something like this or something? :I

  3. #183
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Bloax View Post
    Just use something like this or something? :I
    I've been thinking about such thing but decided not to bother - it's their business to provide a way to contact them that's acceptable for users. I won't be going out of my way to help them.

  4. #184
    Member Bloax's Avatar
    Join Date
    Feb 2013
    Location
    Dreamland
    Posts
    52
    Thanks
    11
    Thanked 2 Times in 2 Posts
    That's a pretty backwards stance for correcting something, since if they don't have it already - how would they correct it if nobody tells them to do so in the first place? :1

  5. #185
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Yep. Life sucks.

  6. #186
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    Quote Originally Posted by m^2 View Post
    Yep. Life sucks.
    I'm intrigued how you can ever leave Windows platform if you refuse to "go out of your way" even to register a throwaway Google account
    Last edited by Black_Fox; 16th June 2013 at 01:05. Reason: typo
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time? I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  7. #187
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Black_Fox View Post
    I'm intrigued how you can ever leave Windows platform if you refuse to "go out of your way" even to register a throwaway Google account
    I fail to see a connection between the two.

    [ADDED]
    OK, I decided to give a more detailed answer.
    You can be a selfish bastard on Unix just as well as on Windows. Nobody requires you to help others in any way especially if you're a noob.

    And when it comes to this case, there's the thing that I just don't like Google.
    They released a mostly free library, good of them, and that's the only thing that made me seek contact in the first place. However, the lack of a good way to contact them annoyed me and exhausted the tiny amount of good will that I had for them.

    As to registering a Google account, I don't intend to do it ever.
    [/ADDED]

    BTW, there's a problem with the above results: YoshimitsuTRIADiiXMM is missing. TODO.
    Last edited by m^2; 16th June 2013 at 10:38.

  9. #188
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Hi Maciej

    I've been trying to compile your version 0.14,

    here is what I got with MinGW :

    Code:
    make all
    [  1%] Building CXX object CMakeFiles/fsbench.dir/simple_codecs.cpp.obj
    In file included from simple_codecs.cpp:864:0:
    codecs/zlib/zlib.h:34:19: fatal error: zconf.h: No such file or directory
    compilation terminated.
    make[2]: *** [CMakeFiles/fsbench.dir/simple_codecs.cpp.obj] Error 1
    make[1]: *** [CMakeFiles/fsbench.dir/all] Error 2
    make: *** [all] Error 2
    zconf.h is indeed missing from zlib directory.

  10. #189
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Heck, it was compiling on my system because it was sucking zconf from the system library.
    Thanks for the report, will fix it ASAP.

  11. #190
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    0.14.1.
    Aside from fixing the zconf bug, I added a simple C implementation of byte swapping.
    Also, there's a new group of codecs 'others' that includes bswaps as well as previously available 'nop' and that group gets tested when you ask fsbench to test all codecs.
    Changelog:
    Code:
    0.14.1
    [+] added byte-swapping transforms (bswap16, bswap32, bswap64)
    [!] added missing zconf.h
    [!] minor fixes
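For readers unfamiliar with the byte-swapping transforms named in the changelog above, here is a minimal Python illustration of what bswap16, bswap32 and bswap64 do to a buffer. It is not the C implementation shipped with fsbench; the function name and the decision to reject buffers whose length is not a multiple of the word size are assumptions made for this sketch.

```python
def bswap_buffer(data, width):
    """Reverse the byte order of every width-byte word in data.

    width is 2, 4 or 8 (matching bswap16, bswap32, bswap64).
    """
    if width not in (2, 4, 8):
        raise ValueError("width must be 2, 4 or 8")
    if len(data) % width != 0:
        raise ValueError("buffer length must be a multiple of width")
    out = bytearray(len(data))
    for i in range(0, len(data), width):
        # Copy one word with its bytes in reverse order.
        out[i:i + width] = data[i:i + width][::-1]
    return bytes(out)


sample = bytes([1, 2, 3, 4, 5, 6, 7, 8])
print(bswap_buffer(sample, 2).hex())
print(bswap_buffer(sample, 4).hex())
print(bswap_buffer(sample, 8).hex())
```

Applying the same swap twice restores the original buffer, which is why such a transform can act as its own decoder in a round-trip test.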

  12. #191
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I have news:
    1. I added FNV1a-YoshimitsuTRIADiiXMM results to the Phenom / Core2 results above (not all; I skipped Core2 in RAM and with 2 threads as these were the least interesting).
    On Phenom, gcc 4.2.1 failed to generate reasonable code and the result was poor; I should retry with a better compiler. On Core2 it was the fastest of the FNV variants. It should be noted that the tests were performed with SSE2; YoshimitsuTRIADiiXMM supports SSE4.1 and AVX too, but my hardware doesn't (and fsbench doesn't either, which will be fixed as soon as there's somebody to test it).
    2. Seeing my recent benchmark results, Georgi Sanmayce unrolled YoshimitsuTRIADiiXMM. The new hash is the fastest in his tests. I shall update fsbench and try it too.
    3. Today I successfully ran fsbench on a PowerPC/big endian/32-bit CPU (AMCC 440). It seemed to work. No results yet though.
    4. I ran some benches on a dual Xeon system, Debian Squeeze / gcc 4.4. I need to update them with FNV1a-Penumbra too. (Update: Done)
    2x Xeon E5335, 8 threads @ 2 GHz, L1 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    fletcher2                   2010           84.09         0.18
    xxhash256                   1              77.30         0.19
    FNV1a-Penumbra              2013-06-16     73.34         0.20
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     65.29         0.23
    FNV1a-Tesla                 2013-05-12     57.69         0.26
    FNV1a-Tesla3                2013-05-12     57.50         0.26
    FNV1a-YoshimitsuTRIAD       2013-05-12     49.77         0.30
    SpookyHash                  V2 2012-08-05  48.01         0.31
    FNV1a-Yoshimura             2013-05-12     45.73         0.33
    CityHash64                  1.1.0          45.20         0.33
    CityHash128                 1.1.0          45.12         0.33
    FNV1a-Yorikke               2013-05-12     39.23         0.38
    xxhash                      r29            29.33         0.51
    FNV1a-Mantis                2013-05-12     29.22         0.51
    FNV1a-Jesteress             2013-05-12     27.76         0.54
    FNV1a-Meiyan                2013-05-12     27.73         0.54
    vhash                       2007-04-17     25.12         0.59
    vmac                        2007-04-17     25.12         0.59
    murmur3_x64_128             2012-02-29     23.65         0.63
    fletcher4                   2010           22.87         0.65
    CrapWow                     2012-06-07     21.80         0.68
    murmur3_x86_128             2012-02-29     15.27         0.98
    murmur3_x86_32              2012-02-29     13.69         1.09
    CityHash32                  1.1.0          10.86         1.10
    uhash                       2007-04-17     8.75          1.70
    umac                        2007-04-17     8.66          1.72
        
    2x Xeon E5335, 8 threads @ 2 GHz, L2 cache
    Codec                       Version        Speed (GB/s)  Ticks/B
    xxhash256                   1              74.14         0.20
    fletcher2                   2010           59.98         0.25
    FNV1a-Tesla3                2013-05-12     56.61         0.26
    FNV1a-Tesla                 2013-05-12     56.00         0.27
    FNV1a-Penumbra              2013-06-16     55.97         0.27
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     53.51         0.28
    SpookyHash                  V2 2012-08-05  47.42         0.31
    FNV1a-YoshimitsuTRIAD       2013-05-12     46.09         0.32
    CityHash128                 1.1.0          44.84         0.33
    CityHash64                  1.1.0          44.45         0.34
    FNV1a-Yoshimura             2013-05-12     44.33         0.34
    FNV1a-Yorikke               2013-05-12     38.77         0.38
    vmac                        2007-04-17     29.90         0.50
    vhash                       2007-04-17     29.88         0.50
    xxhash                      r29            29.12         0.51
    FNV1a-Jesteress             2013-05-12     27.29         0.55
    FNV1a-Meiyan                2013-05-12     27.29         0.55
    FNV1a-Mantis                2013-05-12     27.27         0.55
    murmur3_x64_128             2012-02-29     23.60         0.63
    fletcher4                   2010           22.74         0.66
    CrapWow                     2012-06-07     21.48         0.69
    uhash                       2007-04-17     15.62         0.95
    umac                        2007-04-17     15.61         0.95
    murmur3_x86_128             2012-02-29     15.32         0.97
    murmur3_x86_32              2012-02-29     13.72         1.09
    CityHash32                  1.1.0          10.88         1.37
        
    2x Xeon E5335, 8 threads @ 2 GHz, RAM
    Codec                       Version        Speed (GB/s)  Ticks/B
    xxhash256                   1              9.81          1.52
    FNV1a-Tesla                 2013-05-12     9.76          1.53
    FNV1a-Tesla3                2013-05-12     9.76          1.53
    FNV1a-Yoshimura             2013-05-12     9.72          1.53
    fletcher2                   2010           9.68          1.54
    FNV1a-YoshimitsuTRIADiiXMM  2013-05-12     9.67          1.54
    CityHash64                  1.1.0          9.67          1.54
    FNV1a-YoshimitsuTRIAD       2013-05-12     9.67          1.54
    CityHash128                 1.1.0          9.66          1.54
    xxhash                      r29            9.65          1.54
    FNV1a-Meiyan                2013-05-12     9.63          1.55
    SpookyHash                  V2 2012-08-05  9.63          1.55
    FNV1a-Yorikke               2013-05-12     9.61          1.55
    FNV1a-Jesteress             2013-05-12     9.59          1.55
    murmur3_x64_128             2012-02-29     9.58          1.56
    FNV1a-Mantis                2013-05-12     9.58          1.56
    fletcher4                   2010           9.57          1.56
    vhash                       2007-04-17     9.53          1.56
    vmac                        2007-04-17     9.52          1.57
    CrapWow                     2012-06-07     9.49          1.57
    murmur3_x86_128             2012-02-29     9.43          1.58
    murmur3_x86_32              2012-02-29     9.38          1.59
    uhash                       2007-04-17     9.32          1.60
    umac                        2007-04-17     9.29          1.60
    CityHash32                  1.1.0          9.06          1.65
    Last edited by m^2; 19th June 2013 at 20:45.

  13. #192
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    I tried to compile fsbench 0.14.1

    Under MinGW, compilation works correctly.
    The issue of performance is still there though.

    Did you add -DCMAKE_BUILD_TYPE=Release to cmake command line?
    I'm using a cmake GUI, not a command line. I did not find how to add this instruction.
    Anyway, I'm not sure it's a good idea to require the user to enter this instruction.
    Since it is a benchmark program, I think it's expected that the binary produced gets optimized by default.


    Last but not least, under GCC / Linux Ubuntu 64, compilation fails :

    Code:
    In file included from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/brg_endian.h:49:0,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein_port.h:49,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein.h:36,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein.c:14,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/fsbench_SHA3.cpp:101:
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h: At global scope:
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:4:15: error: ‘uint16’ does not name a type
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:8:15: error: ‘uint32’ does not name a type
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:12:15: error: ‘uint64’ does not name a type
    make[2]: *** [CMakeFiles/fsbench.dir/codecs/SHA3/fsbench_SHA3.cpp.o] Error 1
    make[1]: *** [CMakeFiles/fsbench.dir/all] Error 2
    make: *** [all] Error 2

  14. #193
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Cyan View Post
    I tried to compile fsbench 0.14.1

    Under minGW, compilation works correctly.
    The issue of performance is still there though.


    I'm using a cmake GUI, not a command line. I did not found how to add this instruction.
    In the middle of the screen there's a list of variables, CMAKE_BUILD_TYPE is one of them.
    Quote Originally Posted by Cyan View Post
    Anyway, I'm not sure it's a good idea to require the user to enter this instruction.
    Since it is a benchmark program, I think it's expected that the binary produced get optimized by default.
    I totally agree. Unfortunately, this is not possible with CMake. They reject it because with some targets (e.g. MSVC) the build type is not set upon generation, but selected by the user in the IDE before compilation.
    Funnily, it is possible to set it in the CMake makefile and it sometimes works, e.g. it does on my desktop. But sometimes it doesn't, as you and Bulat (and me too, on one Linux system) witnessed.


    Quote Originally Posted by Cyan View Post
    Last but not least, under GCC / Linux Ubuntu 64, compilation fails :

    Code:
    In file included from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/brg_endian.h:49:0,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein_port.h:49,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein.h:36,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/Skein/skein.c:14,
                     from /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/SHA3/fsbench_SHA3.cpp:101:
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h: At global scope:
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:4:15: error: ‘uint16’ does not name a type
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:8:15: error: ‘uint32’ does not name a type
    /home/yann/Bureau/Dev/fsbench-0.14.1/src/codecs/CityHash/byteswap.h:12:15: error: ‘uint64’ does not name a type
    make[2]: *** [CMakeFiles/fsbench.dir/codecs/SHA3/fsbench_SHA3.cpp.o] Error 1
    make[1]: *** [CMakeFiles/fsbench.dir/all] Error 2
    make: *** [all] Error 2
    I found it too; gcc appeared to prefer local includes over system ones even with the <header> type of inclusion... or maybe I misunderstand the -I switch.
    Thanks for the report anyway. Will be fixed this evening.

    ADDED:
    Will be fixed even sooner.
    Code:
    0.14.2
    [+] added FNV1a-penumbra
    [!] fixed compilation issues on some Linuces
    Last edited by m^2; 19th June 2013 at 08:07.

  15. #194
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    OK, we are starting to get there

    Latest version (0.14.2) compiles properly with both MinGW (Windows 32 bits) and GCC Ubuntu (Linux 64 bits).
    Specifying "Release" in the editable field also solves the performance issue (thanks for the tip, I thought up to now that only checkboxes could be used in this area).

    My only (minor) suggestion is that it would be a nice option to force 32-bit compilation on a 64-bit Linux distro.
    This way, it would be easier to compare 32-bit and 64-bit performance.

    [Edit] : The benchmark results published on https://code.google.com/p/lz4/ have been updated, using your latest version (0.14.2)
    Last edited by Cyan; 19th June 2013 at 18:56.

  16. #195
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Cyan View Post
    OK, we are starting to get there

    Latest version (0.14.2) compiles properly with both MingGW (Windows 32 bits) and GCC Ubuntu (Linux 64 bits)
    Specifying "Release" into the editable field also solve the performance issue (thanks for the tip, I thought up to now that only checkboxes could be used in this area).

    My only (minor) suggestion is that it would be a nice option to force 32-bits compilation on a Linux 64-bits distro.
    This way, it would be easier to compare 32-bits and 64-bits performances.

    [Edit] : The benchmark results published on https://code.google.com/p/lz4/ have been updated, using your latest version (0.14.2)
    I've never done such a thing, but assuming it's no different from regular cross-compilation, the steps are like this:
    Code:
    export CC=path_to_c_compiler
    export CXX=path_to_c++_compiler
    cmake -DCMAKE_BUILD_TYPE=Release .
    make
    If you use the GUI, just follow the guide after pressing 'configure' button to select the right compiler.

    I updated the results above:
    * Added Penumbra to C2 L1 results (due to a mistake I double tested L1 instead of L1+L2, TOFIX)
    * Added Penumbra to Xeon L1, L2 results.
    Overall, it's clearly an improvement over TRIADiiXMM, but it hasn't scored a win. Though I suspect that gcc 4.4 used on Xeon doesn't do it any favours, just like 4.2.1 on Phenom... got to use something better.

  17. #196
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I have AMCC 440EPx (a.k.a. Sequoia) results for you.

    And a major change: I intend to keep a single spreadsheet with all the results that I have. For me this is much more maintainable. I think for you it's more searchable and analysable.
    However, it's less accessible (requires a download) and, for now, bloated and therefore less readable; it's still very raw.
    What do you think about it?

    As for the test... it's the first big-endian system that I tried. There have been a couple of high-profile failures: Snappy and LZMAT failed to decompress their own output.
    Also, Sequoia is rather unusual. It's a fairly fat core, with 32+32 KB of highly associative L1 cache and no L2 at all, so as long as you stay within the limit you're fine, but exceeding it is extremely costly.
    While testing hashes on it, I noticed an interesting thing. On 32 KB blocks the results were quite bad. I thought the stack was competing with the data, so I halved the buffer and it was fine. I checked 24 KB and the results were still bad, so I don't think it's the stack; with a cache so highly associative it should fit easily. On 8 KB they were a little lower.
    And I've seen this effect on other CPUs too: half the cache size appears to be the sweet spot for speed. Can anyone explain that?
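The block-size experiment described above can be repeated as a simple sweep. The sketch below shows only the shape of such a test and does not explain the effect. It uses Python's `zlib.crc32` as a stand-in workload, and interpreter overhead may well hide cache effects that a native benchmark like fsbench would show. `throughput_mib_s` and `sweep` are hypothetical names.

```python
import time
import zlib


def throughput_mib_s(block_size, min_seconds=0.05):
    """Checksum one block repeatedly for min_seconds; return MiB/s."""
    data = bytes(block_size)
    processed = 0
    start = time.perf_counter()
    while True:
        zlib.crc32(data)
        processed += block_size
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return processed / elapsed / 2 ** 20


def sweep(sizes, min_seconds=0.05):
    """Measure every block size in turn; returns {size: MiB/s}."""
    return {size: throughput_mib_s(size, min_seconds) for size in sizes}


# Block sizes mentioned in the post: 8, 16, 24 and 32 KB.
for size, speed in sweep([8 * 1024, 16 * 1024, 24 * 1024, 32 * 1024]).items():
    print(f"{size // 1024:2d} KB: {speed:.0f} MiB/s")
```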
    Last edited by m^2; 24th June 2013 at 21:29.

  18. #197
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    I couldn't open the file "results.ods".
    Neither MS Office nor Google docs would accept it.
    I guess Open Office might do the trick, but it's not installed on my system.
    If you wish your document to be easily read by anyone, I would suggest using a more widespread format, typically .csv.

    It's always good to have more reference points.
    I feel it's a good idea to report performance on a wide range of systems.
    Most of them are not easily accessible by regular programmers, so it will help.

  19. #198
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Thanks for the feedback.
    I'd rather keep it in some format richer than csv because this enables me to add some features like:
    * colouring top results
    * charts
    * separation of source and presented data (far future)
    * integrated macro suite to make colouring / separation from the previous point automated (and this is a thing that can be run offline, so users won't have to run them)

    However, I may push it in several formats simultaneously, so there's minimal trouble caused.
    BTW, does Google have troubles only with this spreadsheet or did they drop odf support?

  20. #199
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    OpenOffice can read it.

  21. #200
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    A crashing bug was found, so I post an update despite not having anything else.
    Code:
    0.14.3
    [!] fixed integer underflow

  22. #201
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by m^2 View Post
    A crashing bug was found, so I post an update despite not having anything else.
    Code:
    0.14.3
    [!] fixed integer underflow
    m^2 - it might also be useful if you updated your original post with the new versions as they are updated, so there is one "known good" place to get the recent version, rather than looking through the thread for the last upload.

    I had one question - when I tried to use some algorithm like lz4hc, it said something to the effect of "I'm an encoder only, you must pair me with some decoder!" Naturally I'd like to pair that one with plain lz4 for decoding, but I couldn't find the syntax (not even when I checked the code).

  23. #202
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by PSHUFB View Post
    m^2 - it might also be useful if you updated your original post with the new versions as they are updated, so there is one "known good" place to get the recent version, rather than looking through the thread for the last upload.

    I had one question - when I tried to use some algorithm like lz4hc, it said something to the effect of "I'm an encoder only, you must pair me with some decoder!" Naturally I'd like to pair that one with plain lz4 for decoding, but I couldn't find the syntax (neither when I checked the code).
    lz4hc/lz4
    Thx for pointing this out, somehow I missed that neither --help nor README mention it.
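To make the `encoder/decoder` notation concrete, here is a small hypothetical parser for such a specification. It is not fsbench's actual argument handling; in particular, treating a bare name as "use the same codec for both directions" is an assumption made for this sketch.

```python
def parse_codec_spec(spec):
    """Split 'encoder/decoder' into a pair of codec names.

    A bare name such as 'lz4' is assumed to mean the same codec on
    both sides.
    """
    encoder, sep, decoder = spec.partition("/")
    if not encoder or (sep and not decoder) or "/" in decoder:
        raise ValueError(f"malformed codec spec: {spec!r}")
    return encoder, (decoder if sep else encoder)


print(parse_codec_spec("lz4hc/lz4"))
print(parse_codec_spec("lz4"))
```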

    I intend to push fsbench to chiselapp like Bulat suggested a long time ago. I tried it back then and failed, but just last month I tried again and succeeded with a simpler case.

  24. #203
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    Thanks m^2!

    A bug report - when I try to measure the algorithm "nop" it crashes immediately.

    It would also be great if you could add a plain memcpy implementation, so an upper bound on the machine bandwidth could be determined. If you accept patches, I could submit one.

  25. #204
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Thanks for the bug report, I can reproduce the issue. compression_rate_buf in printTime is too small. Though really I should have used snprintf + new.
    I would be happy to receive a patch, though it would land in my private repo and stay there until I move to chiselapp. I don't know when that will happen; not this month for sure, maybe (but hopefully not) much later.

  26. #205
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I pushed fsbench to chiselapp.

    To check it out:
    Code:
    fossil clone https://chiselapp.com/user/Justin_be_my_guide/repository/fsbench fsbench
    fossil open fsbench

  28. #206
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Tested it. Worked like a charm, flawless compilation on 1st attempt.
    Doc is very complete and precise.

    The only remaining issue I could find is that general performance is below expectation, for all codecs.
    Maybe an optimization flag (-O3?) is missing somewhere?

  29. #207
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Thanks for testing.
    Could you please look into CMakeFiles/fsbench.dir/flags.make (if it doesn't exist on your system, just grep CMake metadata for -fno-tree-vectorize) and test your benchmark with the flags found in there?

    Also, what are you comparing it with? An older version of fsbench or something entirely different?
    Last edited by m^2; 15th March 2014 at 13:47.

  30. #208
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    Found it (see attached file).
    Almost all flags are related to include directories management.
    I noticed almost no performance related flag.

    As a test, I manually added -O3 to CFLAGS, which mostly solved the performance issue, except for Snappy (probably because it is C++, not C).
    Last edited by Cyan; 15th March 2014 at 14:23.

  31. #209
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Have you used GUI, "cmake ." or some other CMake invocation?

  32. #210
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    856
    Thanks
    447
    Thanked 254 Times in 103 Posts
    cmake gui, linux mint
