np.einsum

Numpy Einsum generalises algebraic operations on multidimensional arrays, using Einstein Notation. It can be used for matrix multiplications, addition etc.

It has form as,

np.einsum("<subscripts>", arrays...)

where subscripts are array axes separated by command for each operand array. For example,

a = np.array([1, 2, 3])
np.einsum("i,i->i", a, a)
 
# array([1, 4, 9])

It returns the array after doing element wise multiplication.

How does it work?

It does multiplication of corresponding array elements by index, does sum on some axes (if required). For example,

matrix multiplication operation,

A = np.array([
    [1, 2],
    [3, 4],
])
 
B = np.array([
    [5, 6],
    [7, 8],
])
 
np.einsum("ij,jk->ik", A, B)

the Einsum equation can be,

  • multiply A[i,j] and B[j,k]
  • Sum over axis j because it is repeated and it is not required on output (explicit mode).

This whole operation can be achieved using loops as shown below (might be happening inside np.einsum but efficiently).

i_size = A.shape[0]
k_size = B.shape[1]
j_size = A.shape[1]
 
result = np.zeros((i_size, k_size), dtype=A.dtype)
 
for i in range(i_size):
    for k in range(k_size):
        for j in range(j_size):
            result[i,k] += A[i,j] * B[j,k]

It works in two modes

  1. implicit mode
  2. explicit mode

In implicit mode we do not have to specify the output axes. For example,

np.einsum("ij,jk", A, B)

It rearranges the axes in alphabetic order for output array. In this case, it will be, ik. However, if the condition is like below,

np.einsum("cd,da", A, B)

It returns the axes in order ac which is transpose of matrix multiplication.

In explicit mode, we specify the output axes. I prefer this mode as it is more clear.

np.einsum("ij,jk -> ik", A, B)

Some general rules

  1. If axes are omitted after ->, sum happens over those axes.
A = np.array([[1, 2],
              [3, 4]])
 
np.einsum("ij->", A) # should sum over i and j axes.

It can shown as,