A C++20 array and expression template library with some J/APL features
ra-ra is a C++20 header-only library for handling multidimensional dense arrays. These are objects that can be indexed in 0 or more dimensions; the number of dimensions is known as ‘rank’. For example, vectors are arrays of rank 1 and matrices are arrays of rank 2.
ra-ra implements expression templates. This is a C++ technique (pioneered by Blitz++) to delay the execution of expressions involving array operands, and in this way avoid the unnecessary creation of large temporary array objects.
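To make the idea concrete, here is a minimal sketch of an expression template using only the standard library. It is not ra-ra's implementation; Expr, Sum, Vec and demo_sum are hypothetical names invented for this illustration. The point it shows is that a + b + c builds a small tree of references, and the elements are computed in a single loop only when the tree is assigned to a Vec.

```cpp
#include <cstddef>
#include <vector>

// All names here are hypothetical; this is not ra-ra's implementation.

// CRTP base that marks a type as an array expression.
template <class E>
struct Expr
{
    E const & self() const { return static_cast<E const &>(*this); }
};

// Node for the elementwise sum of two expressions. It only stores
// references, so building it allocates nothing and computes nothing.
template <class L, class R>
struct Sum : Expr<Sum<L, R>>
{
    L const & l;
    R const & r;
    Sum(L const & l_, R const & r_) : l(l_), r(r_) {}
    double operator[](std::size_t i) const { return l[i] + r[i]; }
};

struct Vec : Expr<Vec>
{
    std::vector<double> data;
    explicit Vec(std::size_t n, double x = 0.) : data(n, x) {}
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assignment is where the work happens: one loop, no temporary Vec.
    template <class E>
    Vec & operator=(Expr<E> const & e)
    {
        for (std::size_t i = 0; i < data.size(); ++i) {
            data[i] = e.self()[i];
        }
        return *this;
    }
};

// a + b builds a Sum node instead of computing a new Vec.
template <class L, class R>
Sum<L, R> operator+(Expr<L> const & l, Expr<R> const & r)
{
    return Sum<L, R>(l.self(), r.self());
}

// Usage: d = a + b + c runs a single loop over the elements.
inline Vec demo_sum()
{
    Vec a(4, 1.), b(4, 2.), c(4, 3.), d(4);
    d = a + b + c;
    return d;
}
```

One caveat of this simple design: the nodes hold references to temporaries, so the expression must be consumed in the same statement that builds it. Storing it with auto and evaluating it later would leave a dangling reference.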
ra-ra is compact (~5k loc), easy to extend, and generic. There are no arbitrary type restrictions or limits on rank or argument count.
In this example (examples/read-me.cc), we create some arrays, do operations on them, and print the result.
#include "ra/ra.hh"
#include <iostream>

int main()
{
    // run time rank
    ra::Big<float> A = { {1, 2, 3, 4}, {5, 6, 7, 8} };
    // static rank, run time dimensions
    ra::Big<float, 2> B = { {1, 2, 3, 4}, {5, 6, 7, 8} };
    // static dimensions
    ra::Small<float, 2, 4> C = { {1, 2, 3, 4}, {5, 6, 7, 8} };
    // rank-extending op with STL object
    B += A + C + std::vector {100., 200.};
    // negate right half
    B(ra::all, ra::iota(ra::len/2, ra::len/2)) *= -1;
    // shape is dynamic, so will be printed
    std::cout << "B: " << B << std::endl;
}
⇒
B: 2 4
103 106 -109 -112
215 218 -221 -224
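As a cross-check, the same computation can be written out with plain loops over std::array. This is only a sketch of what the printed result implies, not how ra-ra evaluates the expression; Mat and read_me_by_hand are hypothetical names. Judging from the output, the rank-1 operand {100., 200.} is matched against the first axis, so 100 is added to every element of the first row and 200 to every element of the second.

```cpp
#include <array>
#include <cstddef>

using Mat = std::array<std::array<float, 4>, 2>;

// Hypothetical helper: the example above written as explicit loops.
inline Mat read_me_by_hand()
{
    Mat const A = {{ {1, 2, 3, 4}, {5, 6, 7, 8} }};
    Mat B = A;
    Mat const C = A;
    std::array<double, 2> const v = {100., 200.};

    // B += A + C + v. The rank-1 operand v is indexed by the first
    // axis only, so v[i] is reused for every column j of row i.
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 0; j < 4; ++j) {
            B[i][j] += static_cast<float>(A[i][j] + C[i][j] + v[i]);
        }
    }
    // Negate the right half: columns len/2 up to len, with len = 4.
    for (std::size_t i = 0; i < 2; ++i) {
        for (std::size_t j = 4/2; j < 4; ++j) {
            B[i][j] *= -1;
        }
    }
    return B;
}
```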
Please check the manual online at lloda.github.io/ra-ra, or have a look at the examples/ folder.
ra-ra offers:
- […] <ranges>.
- […] len.
- […] (where with bool selector, or pick with integer selector).
- constexpr is supported as much as possible. For example:
constexpr ra::Small<int, 3> a = { 1, 2, 3 };
static_assert(6==ra::sum(a));
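The feature list mentions where with a bool selector and pick with an integer selector. The sketch below shows, with the standard library only, the kind of elementwise selection those descriptions suggest. The semantics are an assumption, and select_where and select_pick are hypothetical stand-ins rather than ra-ra functions; please refer to the manual for the actual interface.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins, not ra-ra's functions.

// Bool selector: take t[i] where w[i] is true, and f[i] otherwise.
template <class T, std::size_t N>
std::array<T, N> select_where(std::array<bool, N> const & w,
                              std::array<T, N> const & t,
                              std::array<T, N> const & f)
{
    std::array<T, N> out {};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = w[i] ? t[i] : f[i];
    }
    return out;
}

// Integer selector: s[i] names which of the K candidates supplies
// element i of the result.
template <class T, std::size_t N, std::size_t K>
std::array<T, N> select_pick(std::array<std::size_t, N> const & s,
                             std::array<std::array<T, N>, K> const & c)
{
    std::array<T, N> out {};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = c[s[i]][i];
    }
    return out;
}
```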
Performance is competitive with hand-written scalar (element by element) loops, but probably not with cache-tuned code such as your platform BLAS, or with code using SIMD. Please have a look at the benchmarks in bench/.
ra-ra is header-only and has no dependencies other than a C++20 compiler and the standard library. I test regularly with gcc ≥ 11.3. If you can test with Clang, please let me know.
The test suite in test/ runs under either SCons (CXXFLAGS=-O3 scons) or CMake (CXXFLAGS=-O3 cmake . && make && make test). Running the test suite will also build and run the examples and the benchmarks.
- […] RA_USE_BLAS=1 in the environment.
- […] -fsanitize=address by default, and this can cause significant slowdown. Disable by adding -fno-sanitize=address to CXXFLAGS at build time.
- […] -O0, but that can take a long time.
- […] () or [] interchangeably. Multi-argument [] requires __cpp_multidimensional_subscript >= 202110L (in gcc 12 with -std=c++2b).

Please see the TODO file for a concrete list of known issues.
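The note on () and [] can be illustrated with a small type that offers both. This is a sketch, not ra-ra code; Mat23 and demo_subscript are hypothetical names. The multi-argument operator[] is guarded by the feature-test macro, so the code also compiles where the C++23 feature is not available (202110L is the value the macro takes when it is).

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size 2 x 3 matrix, used only for illustration.
struct Mat23
{
    std::array<int, 6> data {};

    // Two-index access through the call operator works in any standard.
    int & operator()(std::size_t i, std::size_t j) { return data[i*3 + j]; }

#if defined(__cpp_multidimensional_subscript) && __cpp_multidimensional_subscript >= 202110L
    // Two-index access through [] needs the C++23 feature.
    int & operator[](std::size_t i, std::size_t j) { return (*this)(i, j); }
#endif
};

// Writes through () and reads back through [] when it is available.
inline int demo_subscript()
{
    Mat23 m;
    m(1, 2) = 7;
#if defined(__cpp_multidimensional_subscript) && __cpp_multidimensional_subscript >= 202110L
    return m[1, 2];
#else
    return m(1, 2);
#endif
}
```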
I do numerical work in C++, and I need support for array operations. The built-in array types that C++ inherits from C are insufficient for this, so by the time of C++11, when I started writing ra-ra, a number of libraries were already available. However, most of these libraries seemed to support only vectors and matrices, or small objects for vector algebra.
Blitz++ was a major inspiration as an early generic library. But it was a heroic feat to write such a library in C++ in the late 90s. Variadic templates, lambdas, perfect forwarding, etc. make things much easier, for the library writer as well as for the user.
From APL and J I've taken the rank extension mechanism, and perhaps an inclination for carrying each feature to its logical end.
ra-ra wants to remain simple. I try not to second-guess the compiler and I don't stress performance as much as Blitz++ did. However, I'm wary of adding features that could become an obstacle if I ever tried to make things fast(er). I believe that implementing new traversal methods, or perhaps optimizing specific expression patterns, should be possible without having to turn the library inside out.