Efficient parallel algorithms are derived for the multiplication of a vector by a Kronecker product of matrices. For N square matrices of order ni, i = 1, …, N, it is well-known that the problem can be solved with a total of ∏Ni=1 ni × ∑Ni=1 ni floating point multiplications. The starting point of our investigations is a multidimensional stepwise scheme which respects the previous optimal bound. We give some sequential codes facing the memory space restriction and the sparsity context. Two parallel schedules depending on the choice between data communication and computation redundancies are proposed. In both cases, the number of processors is rather special. However, algorithms are quite efficient as illustrated by some experimental results.
Special Issue Papers