this post was submitted on 08 Apr 2024
59 points (91.5% liked)

Rust

[–] zweieuro@lemmy.world 16 points 7 months ago (1 children)

Correct me if I am wrong, but isn't "loop unrolling/unwinding" something that the C++ and Rust compilers do? Why does the loop here not get unrolled?

[–] Giooschi@lemmy.world 14 points 7 months ago (1 children)

Loop unrolling is not really where the speedup comes from; autovectorization is. Loop unrolling often helps with autovectorization, but it is not enough on its own, especially with floating-point numbers. To vectorize a reduction the compiler has to reorder the accumulation, which is only valid if the operation is associative, and floating-point addition is not (i.e. (x + y) + z is not always equal to x + (y + z)). Hence autovectorizing the code would change its semantics, and the compiler is not allowed to do that.
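To make the associativity point concrete, here is a minimal sketch. The function names and the input values are made up for illustration and are not taken from the original benchmark. The second function sums with four independent accumulators, which is roughly the kind of reordering a vectorized reduction would perform; on this input it gives a different answer than the plain left-to-right loop.

```rust
/// Sums left to right with a single accumulator, like a plain loop.
fn sum_sequential_f32(xs: &[f32]) -> f32 {
    let mut acc = 0.0f32;
    for &x in xs {
        acc += x;
    }
    acc
}

/// Sums with four independent accumulators (hypothetical stand-in for
/// SIMD lanes), then combines the lanes at the end.
fn sum_four_lanes_f32(xs: &[f32]) -> f32 {
    let mut lanes = [0.0f32; 4];
    let chunks = xs.chunks_exact(4);
    let tail = chunks.remainder();
    for chunk in chunks {
        for (lane, &x) in lanes.iter_mut().zip(chunk) {
            *lane += x;
        }
    }
    let mut total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    // Leftover elements that did not fill a whole chunk of four.
    for &x in tail {
        total += x;
    }
    total
}

fn main() {
    // Values chosen so that the small terms are absorbed by the large ones
    // in one ordering but not in the other.
    let data = [1.0e20f32, 1.0, -1.0e20, 1.0];
    println!("sequential = {}", sum_sequential_f32(&data)); // prints 1
    println!("four lanes = {}", sum_four_lanes_f32(&data)); // prints 0
}
```

Both results are "correct" floating-point sums for their respective evaluation orders, which is exactly why the compiler cannot silently swap one for the other.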

[–] bonus_crab@lemmy.world 7 points 7 months ago (1 children)

So if (somehow) the accumulator were an integer, this loop would autovectorize and the performance difference would be smaller?

[–] Giooschi@lemmy.world 4 points 7 months ago

Very likely, yes.
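A rough sketch of why the integer case is different, again with made-up function names: wrapping integer addition is associative, so a reordered sum always matches the sequential one, even when it overflows. Whether the compiler actually vectorizes a given loop still depends on the optimization level and target, which this example does not check; it only shows that the reordering is semantically safe.

```rust
/// Sums left to right with a single accumulator, wrapping on overflow.
fn sum_sequential_u32(xs: &[u32]) -> u32 {
    let mut acc = 0u32;
    for &x in xs {
        acc = acc.wrapping_add(x);
    }
    acc
}

/// Sums with four independent accumulators (hypothetical stand-in for
/// SIMD lanes), then combines the lanes at the end.
fn sum_four_lanes_u32(xs: &[u32]) -> u32 {
    let mut lanes = [0u32; 4];
    let chunks = xs.chunks_exact(4);
    let tail = chunks.remainder();
    for chunk in chunks {
        for (lane, &x) in lanes.iter_mut().zip(chunk) {
            *lane = lane.wrapping_add(x);
        }
    }
    let mut total = lanes.iter().fold(0u32, |a, &b| a.wrapping_add(b));
    // Leftover elements that did not fill a whole chunk of four.
    for &x in tail {
        total = total.wrapping_add(x);
    }
    total
}

fn main() {
    // Includes u32::MAX so that the sum overflows along the way.
    let data = [u32::MAX, 1, 2, 3, 4, 5];
    println!("sequential = {}", sum_sequential_u32(&data)); // prints 14
    println!("four lanes = {}", sum_four_lanes_u32(&data)); // prints 14
}
```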