The Vector API was first introduced as an Incubator feature in Java 16 and incubated again in Java 17. This API is not about the java.util.Vector
class from Java 1.0 but about mathematical vector computation and
its mapping to modern single-instruction-multiple-data (SIMD)
architectures.
The Vector API aims to improve the performance of vector computations. A vector computation consists of a sequence of operations on vectors.
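The API builds on the existing HotSpot auto-vectorizer but adds an explicit user model: vector operations are stated directly in the code rather than left for the JIT compiler to detect in scalar loops, which makes vectorization more predictable and robust.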
At the top of the new API is the abstract class Vector<E>. The type variable E is instantiated as the boxed type of the scalar primitive integral or floating-point element type covered by the vector.
A vector also has a shape which defines the size, in bits, of the vector. The combination of element type and shape determines a vector’s species, represented by VectorSpecies<E>.
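The relationship between element type, shape, and the number of elements (lanes) a vector holds can be sketched in plain Java, without the incubator module. The class LaneCount and its method lanes are hypothetical names used only for illustration; in the real API, the lane count of a species is reported by VectorSpecies.length().

```java
// Plain-Java sketch (hypothetical helper, not part of the Vector API):
// a vector's lane count is its shape size divided by its element size.
final class LaneCount {

    // shapeBits:   size of the vector shape in bits (e.g. 128, 256, 512)
    // elementBits: size of one element in bits (e.g. Float.SIZE == 32)
    static int lanes(int shapeBits, int elementBits) {
        if (shapeBits <= 0 || elementBits <= 0 || shapeBits % elementBits != 0) {
            throw new IllegalArgumentException(
                "shape must be a positive multiple of the element size");
        }
        return shapeBits / elementBits;
    }

    public static void main(String[] args) {
        System.out.println(lanes(256, Float.SIZE));   // float lanes in a 256-bit shape
        System.out.println(lanes(256, Double.SIZE));  // double lanes in a 256-bit shape
        System.out.println(lanes(128, Byte.SIZE));    // byte lanes in a 128-bit shape
    }
}
```

Under these assumptions, a 256-bit shape holds 8 float lanes, so a species combining Float with that shape processes 8 array elements per vector operation.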
As an example, here is a simple scalar computation over elements of arrays:
    void scalarComputation(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = (a[i] * a[i] + b[i] * b[i]) * -1.0f;
        }
    }
This is the counterpart using the Vector API:
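    // Requires the incubator module: --add-modules jdk.incubator.vector
    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorSpecies;

    // The preferred species uses the shape best suited to the current platform
    static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    void vectorComputation(float[] a, float[] b, float[] c) {
        int i = 0;
        // Largest multiple of the species length that fits into a.length
        int upperBound = SPECIES.loopBound(a.length);
        for (; i < upperBound; i += SPECIES.length()) {
            // FloatVector va, vb, vc;
            var va = FloatVector.fromArray(SPECIES, a, i);
            var vb = FloatVector.fromArray(SPECIES, b, i);
            var vc = va.mul(va)
                       .add(vb.mul(vb))
                       .neg();
            vc.intoArray(c, i);
        }
        // Scalar loop for the remaining tail elements
        for (; i < a.length; i++) {
            c[i] = (a[i] * a[i] + b[i] * b[i]) * -1.0f;
        }
    }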
The main loop iterates over the array parameters in strides of the vector length (that is, the species length). In each iteration it loads two FloatVector instances of the given species from arrays a and b at the corresponding index, performs the arithmetic operations fluently, and then stores the result into array c.
If any array elements are left over after the last iteration, the results for those tail elements are computed with an ordinary scalar loop.
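To make the split between the strided main loop and the scalar tail loop concrete, here is a plain-Java sketch that imitates the same control flow without the incubator module. The class StrideSplit, the constant LANES (fixed at 8 here as an assumption; in the real code SPECIES.length() supplies it), and the local loopBound method are hypothetical stand-ins. The local loopBound mirrors the documented behavior of VectorSpecies.loopBound, which returns the largest multiple of the vector length that is less than or equal to the given length.

```java
// Plain-Java sketch of the main-loop / tail-loop split (not the Vector API itself).
final class StrideSplit {

    // Assumed lane count for illustration; the real value comes from SPECIES.length().
    static final int LANES = 8;

    // Largest multiple of LANES that is less than or equal to length.
    static int loopBound(int length) {
        return length - (length % LANES);
    }

    static void compute(float[] a, float[] b, float[] c) {
        int i = 0;
        int upperBound = loopBound(a.length);

        // Main loop: each iteration covers one full stride of LANES elements.
        // The inner loop stands in for a single vector operation over all lanes.
        for (; i < upperBound; i += LANES) {
            for (int lane = 0; lane < LANES; lane++) {
                int k = i + lane;
                c[k] = (a[k] * a[k] + b[k] * b[k]) * -1.0f;
            }
        }

        // Tail loop: the remaining a.length - upperBound elements, one at a time.
        for (; i < a.length; i++) {
            c[i] = (a[i] * a[i] + b[i] * b[i]) * -1.0f;
        }
    }
}
```

With 19 elements and 8 lanes, for instance, the main loop runs twice and covers indices 0 to 15, and the tail loop handles the last three elements.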