Loop fission and fusion

One of the main benefits of loop fusion is that it avoids temporary allocations, which can yield large performance gains in numerical computing languages such as Julia when performing elementwise operations on arrays. Julia's loop fusion, however, is not technically a compiler optimization but a syntactic guarantee of the language.

Consider MATLAB code that evaluates an elementwise expression on an array x, combining sin(x) with an addition. The same syntax can be achieved in C++ by using function and operator overloading. However, such an implementation unnecessarily allocates a temporary array for the result of sin(x).
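A minimal sketch of what such an overloaded C++ implementation might look like is given here. The `Array` type, the `allocation_count` counter, and the scalar operand are assumptions introduced purely for illustration and do not come from any particular library. The counter makes the extra temporary visible.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical counter, used only to make buffer allocations observable.
int allocation_count = 0;

// Hypothetical array type for illustration; not from any library.
struct Array {
    std::vector<double> data;
    explicit Array(std::size_t n) : data(n) { ++allocation_count; }
};

// Elementwise sine. It cannot know how its result will be used,
// so it allocates a fresh array for the result.
Array sin(const Array& x) {
    Array out(x.data.size());
    for (std::size_t i = 0; i < x.data.size(); ++i) {
        out.data[i] = std::sin(x.data[i]);
    }
    return out;
}

// Elementwise addition of a scalar. It also allocates a fresh result array.
Array operator+(const Array& x, double c) {
    Array out(x.data.size());
    for (std::size_t i = 0; i < x.data.size(); ++i) {
        out.data[i] = x.data[i] + c;
    }
    return out;
}
```

With these definitions, an expression such as `Array y = sin(x) + 4.0;` runs two separate loops and creates two arrays: the temporary holding sin(x) and the final result.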

As of clang 12.0.0 and gcc 11.1, this loop fusion and redundant allocation removal does not occur, even at the highest optimization level.
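For comparison, the fused form that such an optimization would need to arrive at can be written by hand. The sketch below assumes the expression is sin(x) plus a scalar; the function name `sin_plus` and the parameter `c` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hand-fused version: one allocation and one pass over the data.
std::vector<double> sin_plus(const std::vector<double>& x, double c) {
    std::vector<double> y(x.size());  // the only array allocated
    for (std::size_t i = 0; i < x.size(); ++i) {
        // The intermediate sin(x[i]) is never stored in a temporary array.
        y[i] = std::sin(x[i]) + c;
    }
    return y;
}
```

The result is the same as evaluating the two operations separately, but no temporary array is ever created for the intermediate values.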

Currently, to achieve the same syntax in general-purpose languages like C++, the sin and operator+ functions must pessimistically allocate arrays to store their results, since they do not know the context from which they will be called.

This issue can be avoided in C++ by using a different syntax that does not rely on the compiler to remove unnecessary temporary allocations (e.g., using functions and overloads for in-place operations, such as operator+= or std::transform).
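As one possible illustration of the in-place style, the sketch below uses std::transform to overwrite an array with sin(x) plus a scalar. The function name `sin_plus_inplace` and the parameter `c` are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Overwrites each element of x with sin(x[i]) + c.
// std::transform permits the output range to start at the input range,
// so no additional buffer is allocated.
void sin_plus_inplace(std::vector<double>& x, double c) {
    std::transform(x.begin(), x.end(), x.begin(),
                   [c](double v) { return std::sin(v) + c; });
}
```

The trade-off is that the caller gives up the concise expression syntax and must decide explicitly which buffer receives the result.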