Implementation of Faster Complex Matrix Multiplication via Split-Complex Numbers

In my previous post, I showed that split-complex numbers can be used to speed up complex matrix multiplication.

In this post I will do a quick measurement of exactly how much faster the new approach is in practice compared to the naive implementation. Since the new approach uses 3 real matrix multiplications as opposed to 4 in the naive approach, we would expect up to a 25% improvement in runtime. We can easily implement and test this using Python's NumPy library. In the following listing, the hyperbolic_matrix_mult function implements the new approach based on split-complex numbers, and complex_matrix_mult implements the naive complex matrix multiplication.

# matrix_mult.py

import numpy as np

def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    # Computes (Z1 + i*Z2)(W1 + i*W2) using only 3 real matrix products:
    # (Z1 + Z2)(W1 + W2), (Z1 - Z2)(W1 - W2), and Z2*W2.
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    # Real part: X + Y - 2*Z2*W2 = Z1*W1 - Z2*W2; imaginary part: X - Y = Z1*W2 + Z2*W1.
    return X + Y - 2.0 * Z2.dot(W2), X - Y

def complex_matrix_mult(Z1, Z2, W1, W2):
    # Naive 4-product formula: real part Z1*W1 - Z2*W2, imaginary part Z1*W2 + Z2*W1.
    return Z1.dot(W1) - Z2.dot(W2), Z1.dot(W2) + Z2.dot(W1)
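Before timing anything, it is worth sanity-checking that both functions produce the same result as NumPy's built-in complex matrix multiplication. The following snippet is my own quick check (not part of the benchmark); it duplicates the two functions from the listing above so it runs standalone, and uses small matrices since we only care about correctness here.

```python
import numpy as np

# Copied from matrix_mult.py above.
def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    return X + Y - 2.0 * Z2.dot(W2), X - Y

def complex_matrix_mult(Z1, Z2, W1, W2):
    return Z1.dot(W1) - Z2.dot(W2), Z1.dot(W2) + Z2.dot(W1)

rng = np.random.default_rng(0)
A, B, C, D = (rng.random((50, 50)) for _ in range(4))

# Reference: multiply (A + iB)(C + iD) directly with NumPy's complex dtype.
ref = (A + 1j * B).dot(C + 1j * D)

re_h, im_h = hyperbolic_matrix_mult(A, B, C, D)
re_n, im_n = complex_matrix_mult(A, B, C, D)

assert np.allclose(re_h + 1j * im_h, ref)
assert np.allclose(re_n + 1j * im_n, ref)
print("both implementations agree with NumPy's complex matmul")
```

If the split-complex identity were wrong, the first assertion would fail; agreement up to floating-point rounding is all we need before benchmarking.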

I will be running the test on my MacBook laptop with the following specs:

Hardware Overview:

  Model Name:                   MacBook Pro
  Model Identifier:             MacBookPro13,3
  Processor Name:               Quad-Core Intel Core i7
  Processor Speed:              2.6 GHz
  Number of Processors:         1
  Total Number of Cores:        4
  L2 Cache (per Core):          256 KB
  L3 Cache:                     6 MB
  Hyper-Threading Technology:   Enabled
  Memory:                       16 GB

The test will be as follows. First, we generate four random real square matrices of order 10,000. We use large matrices so that the cost of the matrix multiplications dominates the runtime. Then we spin up a Jupyter notebook, which has convenient features for timing code, and measure both functions with its %timeit magic command. By default, %timeit runs each multiplication 7 times and reports the mean and standard deviation of the collected run-times. This is all done in the following Jupyter notebook snippet.

import numpy as np
from matrix_mult import hyperbolic_matrix_mult, complex_matrix_mult

n = 10000
A = np.random.rand(n, n)
B = np.random.rand(n, n)
C = np.random.rand(n, n)
D = np.random.rand(n, n)

%timeit hyperbolic_matrix_mult(A, B, C, D)
%timeit complex_matrix_mult(A, B, C, D)
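For readers who prefer a plain script over a notebook, roughly the same measurement can be sketched with the standard timeit module. This is my own variant, not the benchmark from the post: it again inlines the two functions so it runs standalone, and uses a much smaller n so it finishes in seconds rather than 45 minutes (the relative timings at small sizes need not match the large-matrix results).

```python
import timeit
import numpy as np

# Copied from matrix_mult.py above.
def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    return X + Y - 2.0 * Z2.dot(W2), X - Y

def complex_matrix_mult(Z1, Z2, W1, W2):
    return Z1.dot(W1) - Z2.dot(W2), Z1.dot(W2) + Z2.dot(W1)

n = 500  # much smaller than 10000 so this sketch runs quickly
rng = np.random.default_rng(0)
A, B, C, D = (rng.random((n, n)) for _ in range(4))

# Mirror %timeit's defaults for long-running statements: 7 runs, 1 loop each.
for f in (hyperbolic_matrix_mult, complex_matrix_mult):
    times = timeit.repeat(lambda: f(A, B, C, D), repeat=7, number=1)
    print(f"{f.__name__}: {np.mean(times):.4f} s ± {np.std(times):.4f} s")
```

timeit.repeat returns the raw per-run times, so the mean and standard deviation are computed explicitly, matching the format of the %timeit output shown below.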

After running this for about 45 minutes on my machine, here are the results. The first line shows the runtime of the method using split-complex numbers, and the second shows the runtime of the naive complex matrix multiplication.

49.9 s ± 388 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1min 2s ± 985 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Already from this simple test we see that the new algorithm improves on naive complex multiplication by about 20% (49.9 s vs. 62 s), not far from the theoretical 25%, which is not bad. Stay tuned for more thorough testing in the next post.