10 Apr, 2017

2 commits


08 Dec, 2016

1 commit


07 Dec, 2016

1 commit


23 Nov, 2016

1 commit

  • Gf32 mul silence warning
    
    silence warning like
    
    ```
    /slow/kchai/ceph/src/erasure-code/jerasure/gf-complete/src/gf_w32.c: In function ‘gf_w32_cfmgk_multiply_region_from_single’:
    /slow/kchai/ceph/src/erasure-code/jerasure/gf-complete/src/gf_w32.c:410:5: warning: ‘a’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       g = _mm_insert_epi64 (a, g_star, 0);
         ^
    ```
    
    See merge request !19
    Loic Dachary
     

18 Nov, 2016

1 commit

  • In gf_w32_cfmgk_multiply_region_from_single(), the following warning is
    reported by gcc:
    
    gf-complete/src/gf_w32.c:410:5: warning: ‘a’ may be used uninitialized
    in this function [-Wmaybe-uninitialized]
       g = _mm_insert_epi64 (a, g_star, 0);
         ^
    
    Actually, we are using `a` as a dummy parameter for initializing `g` and
    `q`, and only the lower 64 bits of them are used in the calculation.
    Their lower 64 bits are always initialized using _mm_insert_epi64(), so
    this is a false alarm.
    
    We can still silence this warning by moving the statement initializing `a`
    up before passing it to _mm_insert_epi64(). This change does not hurt
    performance.
    
    Signed-off-by: Kefu Chai <kchai@redhat.com>
    Kefu Chai
     

14 Sep, 2016

1 commit

  • Support for runtime detection of SIMD
    
    This merge request adds support for runtime SIMD detection. The idea is that you would build gf-complete with full SIMD support, and gf_init will select the appropriate function at runtime based on the capabilities of the target machine. This would eliminate the need to build different versions of the code for different processors (you still need to build for different archs). Ceph, for example, has 3-4 flavors of jerasure on Intel (and does not support PCLMUL optimizations as a result of using too many binaries). Numerous libraries, including zlib, have followed a similar approach.
    
    When reviewing this merge request I recommend that you look at each of the 5 commits independently. The first 3 commits don't change the existing logic. Instead they add debugging functions and test scripts that facilitate testing of the 4th commit. The 4th commit is where all the new logic goes along with tests. The 5th commit fixes build scripts.
    
    I've tested this on x86_64, arm, and aarch64 using QEMU. Numerous tests have been added that help this code and could help with future testing of gf-complete. Also I've compared the functions selected with the old code (prior to runtime SIMD support) with the new code and all functions are identical. Here's a gist with the test results prior to SIMD extensions: https://gist.github.com/bassamtabbara/d9a6dcf0a749b7ab01bc2953a359edec.
    
    See merge request !18
    bassamtabbara
     

13 Sep, 2016

14 commits

  • Bassam Tabbara
     
  • Bassam Tabbara
     
  • ax_ext.m4 no longer performs any CPU checks. Instead it just checks
    if the compiler supports SIMD flags.
    
    Runtime detection will choose the right methods based on the CPU
    instructions available.
    
    Intel AVX support is still done through the build since it would
    require a major refactoring of the code base to support it at runtime.
    For now I added a configuration flag --enable-avx that can be used
    to compile with AVX support.
    
    Also use cpu intrinsics instead of __asm__
    Bassam Tabbara
     
  • This commit adds support for runtime detection of SIMD instructions. The idea is that you would build once with all supported SIMD functions and the same binaries could run on different machines with varying support for SIMD. At runtime gf-complete will select the right functions based on the processor.
    
    gf_cpu.c has the logic to detect SIMD instructions. On Intel processors this is done through cpuid. For ARM on Linux we use getauxval.
    
    The logic in gf_w*.c has been changed to check for runtime SIMD support and fallback to generic code.
    
    Also a new test has been added. It compares the functions selected by gf_init when we enable/disable SIMD support through build flags, with runtime enabling/disabling. The test checks if the results are identical.
    Bassam Tabbara
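
The runtime selection described in this commit can be illustrated with a minimal sketch. It is not the code in gf_cpu.c: the kernel names, the `select_xor_region` helper, and the choice of SSE4.2 as the probed feature are assumptions made for illustration, and the "wide" kernel is plain C standing in for a real SIMD routine. The probe uses the GCC/Clang builtin `__builtin_cpu_supports` on x86 and reports no SIMD elsewhere, so the generic path is always available as a fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region kernels; both compute dst[i] ^= src[i]. */
typedef void (*xor_region_fn)(uint8_t *dst, const uint8_t *src, size_t n);

static void xor_region_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

/* Stand-in for a SIMD kernel: plain C that handles 8 bytes per step,
 * so the sketch builds on any target. A real kernel would use intrinsics. */
static void xor_region_wide(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t d, s;
        memcpy(&d, dst + i, 8);
        memcpy(&s, src + i, 8);
        d ^= s;
        memcpy(dst + i, &d, 8);
    }
    for (; i < n; i++)          /* tail bytes */
        dst[i] ^= src[i];
}

/* Returns nonzero if the running CPU reports SSE4.2 (assumed feature). */
static int cpu_has_sse42(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") ? 1 : 0;
#else
    return 0;                   /* unknown platform: use generic code */
#endif
}

/* Pick a kernel once at init time; allow_simd mimics a runtime switch. */
xor_region_fn select_xor_region(int allow_simd)
{
    xor_region_fn fn = xor_region_generic;
    if (allow_simd && cpu_has_sse42())
        fn = xor_region_wide;
    assert(fn != NULL);
    return fn;
}
```

Whichever kernel is chosen, the results must be identical, which is the same property the new test in this commit checks for the functions selected by gf_init.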
     
  • This commit adds a couple of scripts that help test SIMD functionality
    on different machines through QEMU.
    
    tools/test_simd_qemu.sh will automatically start qemu, run tests
    and stop it. It uses the Ubuntu cloud images, which are built for
    x86_64, arm and arm64.
    
    tools/test_simd.sh runs a number of tests, including compiling
    with different flags, unit tests, and gathering the functions
    selected in gf_init (when compiled with DEBUG_FUNCTIONS).
    Bassam Tabbara
     
  • There is currently no way to figure out which functions were selected
    during gf_init and as a result of SIMD options. This is not even possible
    in gdb since most functions are static.
    
    This commit adds a new macro SET_FUNCTION that records the name of the
    function selected during init inside the gf_internal structure. This macro
    only works when DEBUG_FUNCTIONS is defined during compile. Otherwise the
    code works exactly as it did before this change.
    
    The names of selected functions will be used during testing of SIMD
    runtime detection.
    
    All calls such as:
    
    gf->multiply.w32 = gf_w16_shift_multiply;
    
    need to be replaced with the following:
    
    SET_FUNCTION(gf,multiply,w32,gf_w16_shift_multiply)
    
    Also added a new flag to tools/gf_methods that will print the names of
    functions selected during gf_init.
    Bassam Tabbara
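
As a rough illustration of the idea behind SET_FUNCTION, the sketch below records the name of the selected function next to the function pointer when DEBUG_FUNCTIONS is defined. It is not the macro from gf-complete: `demo_gf_t`, its `multiply_name` field, and `demo_shift_multiply` are hypothetical stand-ins, and this version is written as a `do { } while (0)` statement, so it needs a trailing semicolon.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t (*mult_w32_fn)(uint32_t a, uint32_t b);

/* Hypothetical, cut-down stand-in for the gf structures. */
typedef struct {
    union { mult_w32_fn w32; } multiply;
    const char *multiply_name;  /* only meaningful with DEBUG_FUNCTIONS */
} demo_gf_t;

#define DEBUG_FUNCTIONS 1       /* normally passed as -DDEBUG_FUNCTIONS */

#ifdef DEBUG_FUNCTIONS
/* Assign the pointer and keep the stringified function name. */
#define SET_FUNCTION(gf, method, size, func)      \
    do {                                          \
        (gf)->method.size = (func);               \
        (gf)->method##_name = #func;              \
    } while (0)
#else
/* Without the flag the macro is a plain assignment. */
#define SET_FUNCTION(gf, method, size, func)      \
    do { (gf)->method.size = (func); } while (0)
#endif

/* Carry-less shift-and-XOR product without reduction (demo only). */
static uint32_t demo_shift_multiply(uint32_t a, uint32_t b)
{
    uint32_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a <<= 1;
        b >>= 1;
    }
    return p;
}

void demo_init(demo_gf_t *gf)
{
    SET_FUNCTION(gf, multiply, w32, demo_shift_multiply);
}
```

After `demo_init`, a test tool can print `gf->multiply_name` to see which implementation was chosen, even though the function itself is static.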
     
  • .gitignore to ignore some autotools files and tests.
    Bassam Tabbara
     
  • enable valgrind for tests
    
    See merge request !9
    Loic Dachary
     
  • NEON fixes/tweaks
    
    This merge request fixes some issues and adds some tweaks to NEON code:
    
    * SPLIT(16,4) ALTMAP implementation was broken as it only processed half the amount of data. As such, this fixed implementation is significantly slower than the old code (which is to be expected). Fixes #2
    * SPLIT(16,4) implementations now merge the ARMv8 and older code path, similar to SPLIT(32,4). This fixes the ALTMAP variant, and also enables the non-ALTMAP version to have consistent sizing
    * Unnecessary VTRN removed in non-ALTMAP SPLIT(16,4) as NEON allows (de)interleaving during load/store; because of this, ALTMAP isn't so useful in NEON
      * This can also be done for SPLIT(32,4), but I have not implemented it
    * I also pulled the `if(xor)` conditional from non-ALTMAP SPLIT(16,4) to outside the loop. It seems to improve performance a bit on my Cortex A7
      * It probably should be implemented everywhere else, but I have not done this
    * CARRY_FREE was incorrectly enabled on all sizes of w, when it's only available for w=4 and w=8
    
    See merge request !16
    Loic Dachary
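
The `if(xor)` change mentioned above is ordinary loop-invariant branch hoisting. The sketch below shows the shape of that transformation in scalar C. It is not the NEON SPLIT(16,4) code: the function names are hypothetical, and `times2` (multiplication by 2 in GF(2^8) with polynomial 0x11d) is only a placeholder for the real per-element work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder per-byte operation: multiply by 2 in GF(2^8), poly 0x11d. */
static uint8_t times2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/* Before: the invariant flag is tested on every iteration. */
void region_branch_inside(uint8_t *dst, const uint8_t *src, size_t n, int do_xor)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t p = times2(src[i]);
        if (do_xor)
            dst[i] ^= p;
        else
            dst[i] = p;
    }
}

/* After: the flag is tested once and each loop body is branch-free. */
void region_branch_hoisted(uint8_t *dst, const uint8_t *src, size_t n, int do_xor)
{
    if (do_xor) {
        for (size_t i = 0; i < n; i++)
            dst[i] ^= times2(src[i]);
    } else {
        for (size_t i = 0; i < n; i++)
            dst[i] = times2(src[i]);
    }
}
```

Both functions produce the same output. Whether the hoisted form is faster depends on the compiler and the core, which matches the cautious wording in the merge request.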
     
  • Workaround until issue #13 is dealt with.
    
    Signed-off-by: Loic Dachary <loic@dachary.org>
    Loic Dachary
     
  • Signed-off-by: Loic Dachary <loic@dachary.org>
    Loic Dachary
     
  • If --enable-valgrind is given to ./configure, all tests are run with
    valgrind set to fail if an error is reported ( --error-exitcode=1 )
    
    Signed-off-by: Loic Dachary <loic@dachary.org>
    Loic Dachary
     
  • This is harmless really but triggers a valgrind error.
    
    Signed-off-by: Loic Dachary <loic@dachary.org>
    Loic Dachary
     
  • HTML manual fixes
    
    Fixes to HTML manual, for mistakes I've noticed.
    I'm sure there's more, but this is a start...
    
    See merge request !14
    Loic Dachary
     

14 Nov, 2015

1 commit


12 Nov, 2015

4 commits


02 Nov, 2015

1 commit


04 Sep, 2015

1 commit


02 Sep, 2015

3 commits


18 Jun, 2015

4 commits

  • Fix for Coverity issue (from Ceph):
    
    CID 1193089 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
     overflow_before_widen: Potentially overflowing expression 1 << g_r with type
     int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
     a context that expects an expression of type uint64_t (64 bits, unsigned).
    
    CID 1193090 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
     overflow_before_widen: Potentially overflowing expression 1 << g_s with type
     int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
     a context that expects an expression of type uint64_t (64 bits, unsigned).
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
     
  • Fix for Coverity issue (from Ceph):
    
    CID 1193088 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
     overflow_before_widen: Potentially overflowing expression 1 << g_s with type
     int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
     a context that expects an expression of type uint64_t (64 bits, unsigned).
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
     
  • Fix for Coverity issue (from Ceph):
    
    CID 1193087 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
     overflow_before_widen: Potentially overflowing expression 1 << g_r with type
      int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used
      in a context that expects an expression of type uint64_t (64 bits, unsigned).
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
     
  • Fix for Coverity issue (from Ceph):
    
    CID 1193086 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
     overflow_before_widen: Potentially overflowing expression 1 << g_r with type
      int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
      a context that expects an expression of type uint64_t (64 bits, unsigned).
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
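
The four OVERFLOW_BEFORE_WIDEN reports above share one cause: in `1 << g_r` the literal `1` is an `int`, so the shift is done in 32-bit arithmetic before the result is converted to `uint64_t`. A minimal sketch of the usual remedy follows. The function name is hypothetical and the actual patches may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Problematic form (shown only as a comment because it is undefined
 * behaviour once g_r reaches 31 for a 32-bit int):
 *
 *     uint64_t mask = 1 << g_r;
 *
 * Widening the operand first makes the shift itself 64 bits wide.
 */
uint64_t bit_mask_widened(int g_r)
{
    assert(g_r >= 0 && g_r < 64);   /* shift count must fit the type */
    return (uint64_t)1 << g_r;
}
```

Writing the constant as `1ULL` or `UINT64_C(1)` has the same effect as the cast.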
     

17 Jun, 2015

2 commits

  • Fix for Coverity issue:
    
    CID 1297812 (#1 of 1): Constant variable guards dead code (DEADCODE)
     dead_error_begin: Execution cannot reach this statement: fprintf(stderr,
      "Code conta....
     Local variable no_default_flag is assigned only once, to a constant
      value, making it effectively constant throughout its scope. If this
      is not the intent, examine the logic to see if there is a missing
      assignment that would make no_default_flag not remain constant.
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
     
  • Fix for Coverity issue:
    
    CID 1297852 (#1 of 1): Constant variable guards dead code (DEADCODE)
     dead_error_begin: Execution cannot reach this statement:
      fprintf(stderr, "Code conta....
     Local variable no_default_flag is assigned only once, to a constant value,
      making it effectively constant throughout its scope. If this is not the
      intent, examine the logic to see if there is a missing assignment that
      would make no_default_flag not remain constant.
    
    Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
    Danny Al-Gaaf
     

14 Jan, 2015

1 commit


08 Jan, 2015

1 commit


29 Dec, 2014

1 commit