A data parallel scientific programming model.
Compiles efficiently to different parallel architectures, such as
distributed memory with message passing [MPI, PVM] and one-sided communication [MPI-2, SHMEM],
shared-memory multiprocessors and multi-core processors [POSIX threads, OpenMP, Boost threads, Intel TBB],
procedure off-loading [NVIDIA CUDA, Cell BE, AMD Brook, OpenCL], SIMD vectorization [SSE and AltiVec],
and sequential C++ code.
Transforms inherently parallel expressions into efficient parallel C++ code:
ForEach(tree *b, up, b->x = b->child(0)->y; )
Grid1IteratorSub it(1, n, grid);
ForEach(int i, it, x(i) += ( y(i+1) + y(i-1) )*.5; )
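
For orientation, the grid expression above might correspond to sequential C++ code along the following lines. This is only a sketch, not the code the compiler actually generates: the function name smooth_step, the use of std::vector, and the assumption that the iterator covers i = 1..n inclusive are all hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential equivalent of
//   ForEach(int i, it, x(i) += ( y(i+1) + y(i-1) )*.5; )
// assuming the iterator visits i = 1..n inclusive.
void smooth_step(std::vector<double>& x,
                 const std::vector<double>& y,
                 std::size_t n) {
    // Cells 0 and n+1 act as boundary cells: y is read there, x is never written.
    assert(x.size() >= n + 2 && y.size() >= n + 2);
    for (std::size_t i = 1; i <= n; ++i) {
        // Each iteration writes only x[i] and reads only y, so the
        // iterations are independent as long as x and y are distinct arrays.
        x[i] += (y[i + 1] + y[i - 1]) * 0.5;
    }
}
```

Because no iteration depends on another, a loop of this shape could in principle be split across threads, processes, or SIMD lanes, which is the property the programming model relies on.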