A sparsity pattern is the explicit binary mask that specifies which parameters in a neural network are pruned (fixed at zero) and which remain active. It is the direct output of a pruning algorithm and, together with the storage format, determines the model's memory footprint and the structure of the sparse matrix multiplications performed at inference. The pattern's regularity, meaning whether it is structured (e.g., removing entire filters) or unstructured (removing individual weights), dictates the hardware and software support needed for efficient execution.
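
To make the structured versus unstructured distinction concrete, here is a minimal sketch in plain Python that builds both kinds of mask for one small weight matrix. The pruning criteria (smallest magnitude, smallest row L2 norm), the 50% sparsity target, the example values, and the helper names `unstructured_mask`, `structured_row_mask`, and `apply_mask` are all assumptions chosen for illustration; real pruning algorithms may use different criteria and granularities.

```python
def unstructured_mask(weights, sparsity):
    """Prune the individual weights with the smallest magnitudes (assumed criterion)."""
    rows, cols = len(weights), len(weights[0])
    # Collect (magnitude, row, col) and sort ascending so the smallest come first.
    entries = sorted(
        (abs(weights[r][c]), r, c) for r in range(rows) for c in range(cols)
    )
    n_prune = int(sparsity * rows * cols)
    mask = [[1] * cols for _ in range(rows)]
    for _, r, c in entries[:n_prune]:
        mask[r][c] = 0
    return mask


def structured_row_mask(weights, sparsity):
    """Prune whole rows (stand-ins for filters) with the smallest L2 norms."""
    rows, cols = len(weights), len(weights[0])
    norms = sorted(
        (sum(w * w for w in weights[r]) ** 0.5, r) for r in range(rows)
    )
    n_prune = int(sparsity * rows)
    pruned_rows = {r for _, r in norms[:n_prune]}
    return [[0] * cols if r in pruned_rows else [1] * cols for r in range(rows)]


def apply_mask(weights, mask):
    """Element-wise product of the weights with the binary mask."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


# Hypothetical 4x4 weight matrix; each row plays the role of one filter.
W = [
    [0.9, -0.1, 0.4, 0.05],
    [0.02, 0.03, -0.01, 0.04],
    [-0.7, 0.6, 0.2, -0.8],
    [0.1, -0.05, 0.3, 0.15],
]

unstructured = unstructured_mask(W, 0.5)
structured = structured_row_mask(W, 0.5)

for name, mask in (("unstructured", unstructured), ("structured", structured)):
    print(name)
    for row in mask:
        print(" ", row)
```

Both masks zero out half of the weights, but the zeros land differently. In the structured mask every row is either fully kept or fully removed, so the surviving rows could be packed into a smaller dense matrix and run with ordinary dense kernels. The unstructured mask scatters zeros across rows, so exploiting it generally requires storing indices alongside values (a sparse format) or hardware that understands the pattern.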




