In my previous post, I showed that split-complex numbers can be used to speed up complex matrix multiplication.
In this post I will do a quick measurement of exactly how much faster the new approach is in practice compared to the naive implementation. Since the new approach uses 3 real matrix multiplications instead of the 4 used by the naive approach, we would expect a runtime improvement of up to 25%. We can easily implement and test this using Python's NumPy library. In the following listing, the hyperbolic_matrix_mult function is the new approach that uses split-complex numbers, and complex_matrix_mult implements the naive complex matrix multiplication.
# matrix_mult.py
import numpy as np

def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    return X + Y - 2.0 * Z2.dot(W2), X - Y

def complex_matrix_mult(Z1, Z2, W1, W2):
    return Z1.dot(W1) - Z2.dot(W2), Z1.dot(W2) + Z2.dot(W1)
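Before timing anything, it is worth a quick sanity check that the two functions agree. A minimal sketch (the matrix size, seed, and tolerance here are my choices): we compare hyperbolic_matrix_mult against NumPy's own complex matrix product.

```python
import numpy as np

def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    # same function as in matrix_mult.py above
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    return X + Y - 2.0 * Z2.dot(W2), X - Y

rng = np.random.default_rng(0)
n = 50  # small matrices are enough for a correctness check
Z1, Z2, W1, W2 = (rng.random((n, n)) for _ in range(4))

# Reference result: let NumPy multiply the complex matrices directly.
ref = (Z1 + 1j * Z2) @ (W1 + 1j * W2)
real_part, imag_part = hyperbolic_matrix_mult(Z1, Z2, W1, W2)

print(np.allclose(real_part, ref.real), np.allclose(imag_part, ref.imag))
```

If the algebra from the previous post is right, both checks come back true, with any difference down to floating-point rounding.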
I will be running the test on my MacBook Pro with the following specs:
Hardware Overview:
  Model Name: MacBook Pro
  Model Identifier: MacBookPro13,3
  Processor Name: Quad-Core Intel Core i7
  Processor Speed: 2,6 GHz
  Number of Processors: 1
  Total Number of Cores: 4
  L2 Cache (per Core): 256 KB
  L3 Cache: 6 MB
  Hyper-Threading Technology: Enabled
  Memory: 16 GB
The test will be as follows. First, we generate four random real square matrices of order 10,000. We use large matrices because we want the cost of multiplying them to be the dominant factor in the runtime. Then, since Jupyter notebooks have convenient features for timing code, we spin up a notebook and measure with Jupyter's timeit magic. Calling timeit on each function will, by default, run the multiplication 7 times and report the mean and standard deviation of the collected runtimes. This is all done in the following Jupyter Notebook snippet.
import numpy as np
from matrix_mult import hyperbolic_matrix_mult, complex_matrix_mult

n = 10000
A = np.random.rand(n, n)
B = np.random.rand(n, n)
C = np.random.rand(n, n)
D = np.random.rand(n, n)

%timeit hyperbolic_matrix_mult(A, B, C, D)
%timeit complex_matrix_mult(A, B, C, D)
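If you prefer a plain script over a notebook, the standard library's timeit module can produce comparable numbers. A sketch under my own assumptions: the functions are inlined to keep the snippet self-contained, the matrix order is reduced so it finishes quickly, and repeat=7, number=1 mirrors the "7 runs, 1 loop each" report that %timeit gives for long-running calls.

```python
import timeit
import numpy as np

def hyperbolic_matrix_mult(Z1, Z2, W1, W2):
    # same as matrix_mult.py
    X = 0.5 * (Z1 + Z2).dot(W1 + W2)
    Y = 0.5 * (Z1 - Z2).dot(W1 - W2)
    return X + Y - 2.0 * Z2.dot(W2), X - Y

def complex_matrix_mult(Z1, Z2, W1, W2):
    return Z1.dot(W1) - Z2.dot(W2), Z1.dot(W2) + Z2.dot(W1)

n = 200  # set n = 10000 to reproduce the measurements below
A, B, C, D = (np.random.rand(n, n) for _ in range(4))

# 7 runs of 1 loop each, like %timeit's output for slow calls
hyp_times = timeit.repeat(lambda: hyperbolic_matrix_mult(A, B, C, D),
                          repeat=7, number=1)
naive_times = timeit.repeat(lambda: complex_matrix_mult(A, B, C, D),
                            repeat=7, number=1)

print(f"hyperbolic: {np.mean(hyp_times):.4f} s ± {np.std(hyp_times):.4f} s")
print(f"naive:      {np.mean(naive_times):.4f} s ± {np.std(naive_times):.4f} s")
```

Note that at small n the 25% ceiling will not show up, because the additions and memory traffic are no longer negligible next to the multiplications.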
After running this for about 45 minutes on my machine, here are the results. The first line shows the runtime of the method using split-complex numbers and the second line the runtime of the naive complex matrix multiplication.
49.9 s ± 388 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1min 2s ± 985 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
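Plugging the two means into the usual relative-improvement formula confirms the figure quoted below (quick arithmetic check; 1 min 2 s is taken as 62.0 s):

```python
t_hyp = 49.9    # mean runtime of hyperbolic_matrix_mult, in seconds
t_naive = 62.0  # mean runtime of complex_matrix_mult, in seconds

# relative improvement over the naive implementation
speedup = (t_naive - t_hyp) / t_naive
print(f"{speedup:.1%}")  # → 19.5%
```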
Already from this simple test we see that the new algorithm improves on naive complex multiplication by about 20%, which is not bad. Stay tuned for more thorough testing in the next post.