29 Dec, 2014

3 commits


25 Dec, 2014

3 commits


15 Dec, 2014

1 commit

  • When a fatal error (unaligned memory etc.) is detected, gf-complete should
    assert(3) instead of exit(3) to give the calling program a chance to
    catch the resulting SIGABRT and display a stack trace. Although gdb can
    break on exit and display the stack trace, libraries are not usually
    expected to terminate the calling program in this way.
    
    Signed-off-by: Loic Dachary <loic@dachary.org>
    (cherry picked from commit 29427efac2ce362fce8e4f5f5f1030feba942b73)
    Loic Dachary
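
    The difference between the two failure modes can be sketched in C as
    follows. This is only an illustration, not gf-complete's actual code:
    region_is_aligned and check_region are hypothetical names, and the
    16-byte alignment is an assumed value chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gf-complete): true when both
 * pointers are multiples of `align` bytes. */
static int region_is_aligned(const void *src, const void *dest, size_t align)
{
    return ((uintptr_t)src % align) == 0 && ((uintptr_t)dest % align) == 0;
}

/* Hypothetical check run before a region operation.
 *
 * Old style:  fprintf(stderr, "unaligned memory\n"); exit(1);
 *             The process ends normally, so the caller gets no signal
 *             and no core dump.
 * New style:  assert() calls abort() on failure, which raises SIGABRT.
 *             The caller can install a SIGABRT handler, and a debugger
 *             stops at the failing line with the stack intact. */
static void check_region(const void *src, const void *dest)
{
    assert(region_is_aligned(src, dest, 16)); /* 16 is an assumed alignment */
}
```

    Note that assert() compiles to nothing when NDEBUG is defined, so a
    build that relies on this check has to leave NDEBUG unset.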
     

06 Dec, 2014

1 commit


02 Dec, 2014

1 commit

  • When a fatal error (unaligned memory etc.) is detected, gf-complete should
    assert(3) instead of exit(3) to give the calling program a chance to
    catch the resulting SIGABRT and display a stack trace. Although gdb can
    break on exit and display the stack trace, libraries are not usually
    expected to terminate the calling program in this way.
    
    Signed-off-by: Loic Dachary <loic@dachary.org>
    Loic Dachary
     

24 Oct, 2014

6 commits

  • ARM NEON optimisations
    KMG
     
  • Optimisations for 4,64 split table region multiplications. Only used on
    ARMv8-A since it is not faster on ARMv7-A.
    Janne Grunau
     
  • Optimisations for 4,32 split table multiplications.
    
    Selected time_tool.sh results on a 1.7 GHz Cortex-A9:
    Region Best (MB/s):   346.67   W-Method: 32 -m SPLIT 32 4 -r SIMD -
    Region Best (MB/s):    92.89   W-Method: 32 -m SPLIT 32 4 -r NOSIMD -
    Region Best (MB/s):   258.17   W-Method: 32 -m SPLIT 32 4 -r SIMD -r ALTMAP -
    Region Best (MB/s):   162.00   W-Method: 32 -m SPLIT 32 8 -
    Region Best (MB/s):   160.53   W-Method: 32 -m SPLIT 8 8 -
    Region Best (MB/s):    32.74   W-Method: 32 -m COMPOSITE 2 - -
    Region Best (MB/s):   199.79   W-Method: 32 -m COMPOSITE 2 - -r ALTMAP -
    Janne Grunau
     
  • Optimisations for the 4,16 split table region multiplications.
    
    Selected time_tool.sh 16 -A -B results for a 1.7 GHz Cortex-A9:
    Region Best (MB/s):   532.14   W-Method: 16 -m SPLIT 16 4 -r SIMD -
    Region Best (MB/s):   212.34   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
    Region Best (MB/s):   801.36   W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
    Region Best (MB/s):    93.20   W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
    Region Best (MB/s):   273.99   W-Method: 16 -m SPLIT 16 8 -
    Region Best (MB/s):   270.81   W-Method: 16 -m SPLIT 8 8 -
    Region Best (MB/s):    70.42   W-Method: 16 -m COMPOSITE 2 - -
    Region Best (MB/s):   393.54   W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
    Janne Grunau
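
    The idea behind a "SPLIT 16 4" region multiplication can be sketched
    in portable scalar C. This is an illustration under stated
    assumptions, not the NEON code from this commit: the function names
    are hypothetical, and the field polynomial 0x1100b is assumed for
    w=16. Each 16-bit word is split into four 4-bit nibbles, each nibble
    indexes its own 16-entry table of precomputed products, and the four
    results are XORed together.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference shift-and-add multiply in GF(2^16).
 * Assumption: primitive polynomial 0x1100b. */
static uint16_t gf16_mul(uint16_t a, uint16_t b)
{
    uint32_t r = 0, aa = a;
    while (b) {
        if (b & 1)
            r ^= aa;
        aa <<= 1;
        if (aa & 0x10000)
            aa ^= 0x1100b;   /* reduce back to 16 bits */
        b >>= 1;
    }
    return (uint16_t)r;
}

/* Build four 16-entry tables for the constant c:
 * tbl[i][n] = c * (n << 4*i), i.e. c times nibble n at position i. */
static void split16_4_init(uint16_t c, uint16_t tbl[4][16])
{
    for (int i = 0; i < 4; i++)
        for (int n = 0; n < 16; n++)
            tbl[i][n] = gf16_mul(c, (uint16_t)(n << (4 * i)));
}

/* Multiply every word of src by c using only lookups and XORs.
 * Multiplication distributes over XOR, so the per-nibble products
 * sum to the full product. */
static void split16_4_region(uint16_t tbl[4][16], const uint16_t *src,
                             uint16_t *dst, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        uint16_t v = src[k];
        dst[k] = tbl[0][v & 0xf] ^ tbl[1][(v >> 4) & 0xf]
               ^ tbl[2][(v >> 8) & 0xf] ^ tbl[3][v >> 12];
    }
}
```

    The 16-entry tables are what make this layout SIMD-friendly: a
    vector table-lookup instruction can resolve many nibbles per
    instruction, which a scalar loop like the one above cannot do.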
     
  • Optimisations for the 4,4 split table region multiplication and
    carry-less multiplication using NEON's polynomial long multiplication.
    arm: w8: NEON carry-less multiplication
    
    Selected time_tool.sh results for a 1.7 GHz Cortex-A9:
    Region Best (MB/s):   375.86   W-Method: 8 -m CARRY_FREE -
    Region Best (MB/s):   142.94   W-Method: 8 -m TABLE -
    Region Best (MB/s):   225.01   W-Method: 8 -m TABLE -r DOUBLE -
    Region Best (MB/s):   211.23   W-Method: 8 -m TABLE -r DOUBLE -r LAZY -
    Region Best (MB/s):   160.09   W-Method: 8 -m LOG -
    Region Best (MB/s):   123.61   W-Method: 8 -m LOG_ZERO -
    Region Best (MB/s):   123.85   W-Method: 8 -m LOG_ZERO_EXT -
    Region Best (MB/s):  1183.79   W-Method: 8 -m SPLIT 8 4 -r SIMD -
    Region Best (MB/s):   177.68   W-Method: 8 -m SPLIT 8 4 -r NOSIMD -
    Region Best (MB/s):    87.85   W-Method: 8 -m COMPOSITE 2 - -
    Region Best (MB/s):   428.59   W-Method: 8 -m COMPOSITE 2 - -r ALTMAP -
    Janne Grunau
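
    For readers unfamiliar with the term, carry-less multiplication for
    w=8 can be sketched in portable C. This is an illustration only: the
    bit loop stands in for the NEON polynomial multiply used in the
    commit, the function names are hypothetical, and the field
    polynomial 0x11d is assumed for w=8.

```c
#include <stdint.h>

/* Carry-less (polynomial) multiply of two 8-bit values.
 * Partial products are combined with XOR instead of addition,
 * so no carries propagate. The result has up to 15 bits. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;
    for (int i = 0; i < 8; i++)
        if (b & (1u << i))
            r ^= (uint16_t)a << i;
    return r;
}

/* Reduce a 15-bit product modulo the field polynomial, clearing
 * bits 14..8 from the top down. */
static uint8_t gf8_reduce(uint16_t p, uint16_t poly)
{
    for (int i = 14; i >= 8; i--)
        if (p & (1u << i))
            p ^= poly << (i - 8);
    return (uint8_t)p;
}

/* GF(2^8) multiply = carry-less multiply followed by reduction.
 * Assumption: primitive polynomial 0x11d. */
static uint8_t gf8_mul(uint8_t a, uint8_t b)
{
    return gf8_reduce(clmul8(a, b), 0x11d);
}
```

    As a quick check, gf8_mul(2, 0x80) shifts 0x80 out to 0x100, and
    the reduction XORs in 0x11d, leaving 0x1d.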
     
  • Optimisations for the single table region multiplication and carry-less
    multiplication using NEON's polynomial multiplication of 8-bit values.
    
    The single polynomial multiplication is not that useful, but the vector
    version is useful for region multiplication.
    
    Selected time_tool.sh results for a 1.7 GHz Cortex-A9:
    Region Best (MB/s):   672.72   W-Method: 4 -m CARRY_FREE -
    Region Best (MB/s):   265.84   W-Method: 4 -m BYTWO_p -
    Region Best (MB/s):   329.41   W-Method: 4 -m TABLE -r DOUBLE -
    Region Best (MB/s):   278.63   W-Method: 4 -m TABLE -r QUAD -
    Region Best (MB/s):   329.81   W-Method: 4 -m TABLE -r QUAD -r LAZY -
    Region Best (MB/s):  1318.03   W-Method: 4 -m TABLE -r SIMD -
    Region Best (MB/s):   165.15   W-Method: 4 -m TABLE -r NOSIMD -
    Region Best (MB/s):    99.73   W-Method: 4 -m LOG -
    Janne Grunau
     

09 Oct, 2014

7 commits


03 Oct, 2014

1 commit


17 Sep, 2014

2 commits


23 Aug, 2014

2 commits


16 Jun, 2014

4 commits


09 Jun, 2014

3 commits


06 Jun, 2014

1 commit


14 May, 2014

5 commits