Type annotations

The types of values in a function are generally inferred automatically by the ParPy compiler based on the values of the provided arguments. While this may be convenient, using explicit annotations can help with readability of the code, and they can be used to impose constraints on the relative shapes of parameters. Finally, they are also required in certain situations (e.g., in the declaration of external functions).

This tutorial is built around a naive implementation of matrix multiplication in ParPy. It explains how to add type annotations to its matrix parameters and discusses the benefits of doing so. The example code is found in examples/matmul.py in the main repository.

Adding annotations to a naive matrix multiplication

Below, we present a naive implementation of matrix multiplication in ParPy without type annotations. The full example file includes code that runs it and validates the results. This implementation is intended for presentation purposes only; in practice, we strongly recommend using an existing implementation, either through libraries like NumPy or PyTorch, or by invoking the cuBLAS implementation via ParPy (as discussed in the tutorial on externals).

import numpy as np
import parpy

@parpy.jit
def matmul(A, B, C, M, N):
    parpy.label('M')
    for i in range(M):
        parpy.label('N')
        for j in range(N):
            parpy.label('K')
            C[i, j] = parpy.reduce.sum(A[i, :] * B[:, j])
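
To make explicit what this loop nest computes, the following is a sequential sketch in plain Python and NumPy. It is not ParPy code, and the name matmul_reference, the random inputs, and the sizes are chosen for illustration only; the validation code in examples/matmul.py may well look different.

```python
import numpy as np

def matmul_reference(A, B, C, M, N):
    # Same loop structure as the ParPy kernel, but executed sequentially
    # by the Python interpreter (hypothetical helper, not part of ParPy).
    for i in range(M):
        for j in range(N):
            # Reduce the element-wise product of row i of A and column j of B.
            C[i, j] = np.sum(A[i, :] * B[:, j])

rng = np.random.default_rng(0)
M, N, K = 4, 5, 3
A = rng.random((M, K))
B = rng.random((K, N))
C = np.zeros((M, N))

matmul_reference(A, B, C, M, N)

# Compare against NumPy's built-in matrix product.
matches = np.allclose(C, A @ B)
```

Each entry of C is the sum of an element-wise product over the shared dimension, which is the reduction that the innermost statement of the kernel expresses with parpy.reduce.sum.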

The function operates on three matrices: A of shape M times K, B of shape K times N, and C of shape M times N. As ParPy has no way to access the shapes of parameters in the body of a function, we also pass the shapes M and N as parameters, to specify the number of iterations in the for-loops. As an alternative, the type annotations of ParPy include both the element type and the shapes of non-scalar parameters. Type annotations are specified using functions from the parpy.types module. An annotated version of the function could look like:

T = parpy.types.type_var()
M = parpy.types.shape_var()
N = parpy.types.shape_var()
K = parpy.types.shape_var()

@parpy.jit
def matmul_annot(
        A: parpy.types.buffer(T, [M, K]),
        B: parpy.types.buffer(T, [K, N]),
        C: parpy.types.buffer(T, [M, N])):
    parpy.label('M')
    for i in range(M):
        parpy.label('N')
        for j in range(N):
            parpy.label('K')
            C[i, j] = parpy.reduce.sum(A[i, :] * B[:, j])

The type_var and shape_var functions produce new variables that can be used in the annotations to represent arbitrary element types or shapes, respectively. Our annotations now explicitly enforce that the element types of the three matrices are the same, and that they have the shapes we expect. If we call both versions with a matrix C that has M+1 rows, the unannotated version does not fail (the example program instead fails in the NumPy validation), while the annotated version produces an error message explaining how the argument violates the constraints. A similar error is reported if any matrix has a conflicting element type. Note that this validation takes place at compile-time only, that is, the first time we call a function with a given set of arguments.

Furthermore, we can refer to the shape variables in the body of the function to specify the iteration count of the for-loops. Providing shapes as separate arguments is error-prone, as their connection to the other arguments is neither explicitly declared nor enforced.
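
The idea behind shape constraints can be illustrated with a small sketch in plain Python. This is a conceptual model only, not how the ParPy compiler is implemented: the helper bind_shapes, its arguments, and its error messages are all hypothetical. It binds each shape variable to the size seen in its first occurrence and rejects any later occurrence that disagrees.

```python
def bind_shapes(annotations, arguments):
    """Bind shape-variable names to sizes, rejecting conflicting bindings.

    annotations maps a parameter name to a tuple of shape-variable names;
    arguments maps a parameter name to the concrete shape of the argument.
    (Hypothetical helper for illustration; not part of ParPy.)
    """
    bindings = {}
    for param, dims in annotations.items():
        actual = arguments[param]
        if len(dims) != len(actual):
            raise TypeError(
                f"{param}: expected {len(dims)} dimensions, got {len(actual)}"
            )
        for var, size in zip(dims, actual):
            # The first occurrence of a variable fixes its value;
            # every later occurrence must agree with it.
            if bindings.setdefault(var, size) != size:
                raise TypeError(
                    f"{param}: shape variable {var} is bound to "
                    f"{bindings[var]}, got {size}"
                )
    return bindings

# Shapes mirroring the annotations of the three matrix parameters.
annotations = {"A": ("M", "K"), "B": ("K", "N"), "C": ("M", "N")}

# Consistent shapes: every variable receives a single value.
ok = bind_shapes(annotations, {"A": (4, 3), "B": (3, 5), "C": (4, 5)})

# C has M+1 rows: the binding for M conflicts with the one taken from A.
try:
    bind_shapes(annotations, {"A": (4, 3), "B": (3, 5), "C": (5, 5)})
    error = None
except TypeError as exc:
    error = str(exc)
```

In this model, the values of M, N, and K are derived from the arguments themselves, so there is no separate shape argument that could disagree with the actual matrices.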

Implicit labeling of shape variables

Type annotations can also enable us to remove the use of explicit labels in our code (parpy.label). When a shape variable is used in a parallelizable statement, such as a for-loop, that statement is associated with a label corresponding to the name of the shape variable. For instance, if we have a for statement for i in range(M): ..., and M is a shape variable, the for-loop is implicitly associated with the label M. The example program relies on this to implement a more succinct version of the annotated version above:

T = parpy.types.type_var()
M = parpy.types.shape_var()
N = parpy.types.shape_var()
K = parpy.types.shape_var()

@parpy.jit
def matmul2(
        A: parpy.types.buffer(T, [M, K]),
        B: parpy.types.buffer(T, [K, N]),
        C: parpy.types.buffer(T, [M, N])):
    for i in range(M):
        for j in range(N):
            C[i, j] = parpy.reduce.sum(A[i, :K] * B[:K, j])

This kind of implicit labeling is enabled by default in the compiler, but can be disabled by setting the implicit_shape_labels compiler option to False. This could be useful in case it causes issues, for instance if a user-defined label has the same name as a shape variable. Explicit labeling using parpy.label can be used when we do not want to add type annotations, or when we need more precise control over how statements are parallelized.