Reshape

A Dask array can be reshaped with the dask.array.reshape function or with the array's own reshape method.

import dask.array as da

a = da.ones((3, 4), chunks=(2, 2))
a.reshape((3, 2, 2))

Reshaping a Dask array differs from reshaping a NumPy array because a Dask array is split into chunks. The input chunks may not line up with the output shape, so the array has to be re-chunked to make it compatible with the output shape.
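As a rough illustration of that difference, the sketch below reshapes the same data with NumPy and with Dask. The values agree, but the Dask result also carries a chunk layout that reshape had to choose. The variable names and the use of consecutive integers are assumptions made for the example.

```python
import numpy as np
import dask.array as da

x = np.arange(12).reshape(3, 4)        # plain NumPy array, no chunks
a = da.from_array(x, chunks=(2, 2))    # same data split into blocks

print(a.chunks)                        # ((2, 1), (2, 2))

np_result = x.reshape(3, 2, 2)
dask_result = a.reshape((3, 2, 2))

# The values agree with NumPy ...
assert (dask_result.compute() == np_result).all()

# ... but the Dask result also has a chunk layout chosen by reshape.
print(dask_result.chunks)
```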

Fast moving axes

Fast-moving axes are the axes with the smallest strides. In a 2D array, axis=1 is the fast-moving axis. When reshaping, we move across the fastest-moving axis first and then step along the next fastest-moving axis.
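To make the idea of a fast-moving axis concrete, here is a small NumPy sketch. NumPy is used only for illustration, and the example assumes the default C memory order and an explicit 64-bit integer dtype so that the strides are predictable.

```python
import numpy as np

x = np.arange(12, dtype=np.int64).reshape(3, 4)

# Strides are the number of bytes to step to reach the next element
# along each axis. axis=1 has the smaller stride, so it moves fastest.
print(x.strides)    # (32, 8)

# Flattening visits axis=1 first, then steps to the next row on axis=0.
print(x.ravel())    # [ 0  1  2  3  4  5  6  7  8  9 10 11]
```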

Figure: Moving across axes for reshaping

If the chunk size on the slow axis (axis=0) is greater than 1, some of the values in a block are left unused when we move from left to right along the fast axis and cross into the next block.

For example, in the array below the first block has shape (2, 2). After moving left to right across its first row, we cross into the next block while the first block still has two unused values in its bottom row.

Figure: Problem with reshape when the chunk size on the slow axis is greater than 1
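
The same situation can be reproduced in a few lines. This is a sketch that assumes a 3 by 4 array of consecutive integers, chosen so that each value shows its position in the flattened output.

```python
import numpy as np
import dask.array as da

x = np.arange(12).reshape(3, 4)
a = da.from_array(x, chunks=((2, 1), (2, 2)))

# The first block holds the top-left 2x2 corner of the array.
first_block = a.blocks[0, 0].compute()
print(first_block)          # [[0 1]
                            #  [4 5]]

# The flattened output starts with 0, 1, 2, 3. After taking 0 and 1 from
# the first block, the next values (2, 3) live in the neighbouring block,
# while 4 and 5 in the first block are still unused.
print(x.reshape(12)[:4])    # [0 1 2 3]
```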

Reshaping the following Dask array, with the chunks as defined, requires re-chunking.

da.ones((3, 4), chunks=((2, 1), (2, 2))).reshape(12)

dask.array.reshape handles the re-chunking in one of two ways: it either merges chunks, or it splits the chunks on the slow axis down to a chunk size of 1. The merge_chunks parameter selects between the two. By default merge_chunks=True, so reshape merges chunks; passing merge_chunks=False makes it split them instead.

da.ones((3, 4), chunks=((2, 1), (2, 2))).reshape(12, merge_chunks=False)
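
To compare the two strategies side by side, the sketch below reshapes one array both ways and prints the resulting chunk layouts. The names merged and split are illustrative, and the exact chunk tuples printed may vary between Dask versions, so none are shown here.

```python
import numpy as np
import dask.array as da

x = np.arange(12).reshape(3, 4)
a = da.from_array(x, chunks=((2, 1), (2, 2)))

merged = a.reshape(12)                      # default: merge_chunks=True
split = a.reshape(12, merge_chunks=False)   # slow axis rechunked to size 1

# Both give the same values as NumPy; only the chunk layout differs.
assert (merged.compute() == x.reshape(12)).all()
assert (split.compute() == x.reshape(12)).all()

print(merged.chunks, merged.npartitions)
print(split.chunks, split.npartitions)
```

Merging tends to produce fewer, larger blocks at the cost of some data movement. Splitting avoids that movement, but the number of blocks on the slow axis grows with the length of that axis, which can make the task graph much larger.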

References

  1. https://docs.dask.org/en/stable/array-chunks.html