Issue #16

Open
jerasure/gf-complete#16
Created by zhsj (edited)

Should not compile whole file with -msse*

Although gf-complete supports runtime SSE detection, the SSE functions and the generic functions all live in the same file. When that file is compiled with -msse -msse2 ..., there is no guarantee that gcc won't generate SSE instructions for the generic functions as well.
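
As a rough illustration of the concern, and not gf-complete's actual code: one way to keep SSE code generation confined to the SSE path, even within a single file, is the per-function `target` attribute of GCC-compatible compilers combined with a runtime check. This is a different technique from moving the SSE code into a separate file; the file itself is then built without any -msse* flags. The names `xor_region_generic`, `xor_region_sse2`, `select_xor_region`, and `paths_agree` are hypothetical, and the sketch assumes a reasonably recent GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC-compatible compiler on x86; other targets use the generic path. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
#include <emmintrin.h> /* SSE2 intrinsics */
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Generic path: compiled with the file's baseline flags only. */
static void xor_region_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

#if HAVE_X86_DISPATCH
/* SSE2 path: only this function is compiled with SSE2 enabled. */
__attribute__((target("sse2")))
static void xor_region_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < n; i++) /* leftover bytes that do not fill a 16-byte block */
        dst[i] ^= src[i];
}
#endif

typedef void (*xor_region_fn)(uint8_t *, const uint8_t *, size_t);

/* Pick an implementation at run time based on what the CPU reports. */
static xor_region_fn select_xor_region(void)
{
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return xor_region_sse2;
#endif
    return xor_region_generic;
}

/* Returns 1 when the selected implementation matches the generic one. */
static int paths_agree(void)
{
    uint8_t a[37], b[37], src[37];
    for (size_t i = 0; i < sizeof a; i++) {
        a[i] = b[i] = (uint8_t)(i * 7u);
        src[i] = (uint8_t)(255u - i);
    }
    xor_region_generic(a, src, sizeof a);
    select_xor_region()(b, src, sizeof b);
    return memcmp(a, b, sizeof a) == 0;
}
```

A caller would fetch the function pointer once via `select_xor_region()` and reuse it. The attribute only limits where SSE2 is explicitly enabled; on x86-64 the compiler may still use SSE2 in the generic function, since SSE2 is part of the baseline there.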

Assignee: None
Milestone: None
2 participants
  • zhsj @zhsj

    The preferred fix is to move the SSE functions into their own file, as is already done for the NEON code.

  • Nyan @Nyan

    As you've noticed, the runtime CPU detection feature isn't implemented properly, but in practice it may still work*. In my experience, compilers these days rarely substitute other instructions for the intrinsics you write, so code that uses only SSE2 intrinsics may compile to only SSE2 instructions even when built with -msse4.1, for example.
    Of course, this isn't a guarantee and could change in the future. I'm mostly pointing it out as a 'kinda works' stopgap, since I'm not sure the issue will be properly addressed any time soon.
    Perhaps tests could be set up to check that the compiler doesn't generate unsupported instructions, as a guard against future compilers behaving differently.

    Of course AVX support is never going to work under the current system. I recall the original merge request (!18) having some information in comments, but it seems to have disappeared into the ether.

    * On x86-64, which implies SSE2. i386 may be problematic if the compiler auto-vectorizes some loops, but as of now GF-Complete doesn't really compile on i386.

  • zhsj @zhsj

    mentioned in merge request !23
