NumPy arrays from Python to Matlab

Passing large NumPy arrays from Python to Matlab using Matlab's engine for Python.

A few months ago, I started working with Matlab's engine for Python. During that fun time (🤨) of passing data back and forth between Python and Matlab, I found that the transfer can take a very long time with the straightforward solution. Before continuing, let me explain the context a little.

I published a post describing how to call a Matlab script from Python. Later, I ran into a serious issue when working with NumPy arrays. To be clear, it is important to state the versions I am using:

  • Matlab version R2020b which is compatible with Python 3.6 to 3.8
  • Python version 3.8.6

It is also important to know that the Matlab interoperability features only support built-in Python types. NumPy arrays are not part of core Python, so Matlab does not recognize them. However, for several uses of non-built-in Python types, the Matlab equivalent can be used instead: if you want to pass a NumPy array, you can create a Matlab array in Python. In this link, you can find more information about it 😎.

Until this point, everything sounds delightful! Also, according to the official documentation:

"... convert the Python object to a MATLAB array, and then index into the array as needed ..."

It sounds logical! Moreover, a Python container is typically a sequence type (list or tuple) or a mapping type (dict), and it can be received in Matlab as such (always using one of the data types listed in the documentation, e.g. single, double, int8, logical, etc.). You can also consult the conversion table for transferring data between Matlab and Python.

If I have a float NumPy array called matrix, I only have to call matlab.double(matrix.tolist()).
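As a small illustration of what that call receives, the sketch below shows what tolist() produces for a tiny array. The array shape and values are made up for the example, and the matlab.double call is left as a comment because it needs the Matlab Engine API for Python to be installed.

```python
import numpy as np

# Hypothetical small array standing in for "matrix" from the text.
matrix = np.arange(6, dtype=np.float64).reshape(2, 3)

# tolist() turns the ndarray into nested Python lists of Python floats,
# i.e. built-in types that the Matlab engine knows how to convert.
nested = matrix.tolist()
print(nested)                       # [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
print(type(nested[0][0]).__name__)  # float

# With the Matlab Engine API installed, the conversion from the text would be:
#   import matlab
#   matlab_matrix = matlab.double(nested)
```

Note that tolist() creates one Python object per element, so the cost of this step grows with the total number of elements in the array.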


Well, here is the problem: this process is agonizingly slow! That is why I started looking for another solution, and I found a suitable one (at least for my environment and conditions). My goal is to pass a 3D NumPy array of float values to Matlab for processing.

My solution is to save the array to disk and then load it into the Matlab workspace (all of this using Matlab's engine for Python). What? Are you crazy? Write it to disk? Pfff! Indeed, I thought the same, and that's why I ran the following script:

import matlab
import time
import numpy as np

from scipy.io import savemat, loadmat

width, height, depth = 512, 512, 654
data_3d = np.random.random(size=(width, height, depth))

# ndarray --> list --> single
tic = time.perf_counter()
matlab.single(data_3d.tolist())
toc = time.perf_counter()
print(f"time: {toc - tic:0.4f} s")

# ndarray --> list --> double
tic = time.perf_counter()
matlab.double(data_3d.tolist())
toc = time.perf_counter()
print(f"time: {toc - tic:0.4f} s")

# ndarray --> .mat (no compression)
tic = time.perf_counter()
savemat("data_3d.mat", {"array": data_3d}, do_compression=False)
toc = time.perf_counter()
print(f"time: {toc - tic:0.4f} s")

# ndarray --> .mat (compression)
tic = time.perf_counter()
savemat("data_3d_min.mat", {"array": data_3d}, do_compression=True)
toc = time.perf_counter()
print(f"time: {toc - tic:0.4f} s")
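The script times each conversion once per run. To average several runs, a helper along these lines could be used. This is only a sketch: average_seconds and workload are hypothetical names that are not part of the original script, and the workload here is a stand-in so the example runs without Matlab.

```python
import time
from statistics import mean


def average_seconds(func, repeats=10):
    """Call func() `repeats` times and return the mean wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        tic = time.perf_counter()
        func()
        toc = time.perf_counter()
        durations.append(toc - tic)
    return mean(durations)


# Stand-in workload so the sketch runs anywhere. In the benchmark it would be
# something like: lambda: matlab.double(data_3d.tolist())
def workload():
    sum(range(100_000))


avg = average_seconds(workload, repeats=10)
print(f"average time: {avg:0.4f} s")
```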

I ran the script 10 times and averaged the results:

method                                time (seconds)
ndarray --> list --> single           157.2974
ndarray --> list --> double           159.9001
ndarray --> .mat (no compression)     8.3351
ndarray --> .mat (compression)        60.6324

8 seconds vs 157 seconds! 🤓 Of course, we need to add the time the engine takes to load the file from disk in the Matlab code. I ran that experiment 10 times as well, directly from the Matlab environment, as follows:

tic; load('data_3d.mat'); toc

This takes 1.2561 seconds on average. Adding it to the save time gives 8.3351 + 1.2561 ≈ 9.6 seconds, which is much better than the 2.6 minutes of the ndarray --> list --> single approach.
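The comparison can be checked with a few lines of arithmetic. The numbers are the averages reported in this post; the variable names are only illustrative.

```python
# Average timings reported in the post, in seconds.
list_to_single = 157.2974
save_mat_uncompressed = 8.3351
load_in_matlab = 1.2561

# Total cost of the disk route: save from Python, then load in Matlab.
disk_round_trip = save_mat_uncompressed + load_in_matlab
speedup = list_to_single / disk_round_trip

print(f"disk round trip: {disk_round_trip:.4f} s")        # 9.5912 s
print(f"list conversion: {list_to_single / 60:.1f} min")  # 2.6 min
print(f"speedup: {speedup:.1f}x")                         # 16.4x
```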

So, for now, this is my solution: save the array to disk, store the path as a variable in Matlab's workspace, and then load the file from disk. Naturally, there are some details I have glossed over, such as the fact that my drive is an SSD and my processor is an 8th-generation Intel i7, both in a laptop.
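The three steps above could be sketched roughly as follows. This is not the exact code used for the measurements: send_array_via_disk, the workspace variable name mat_path, and the default variable name "array" are assumptions made for the example. The function expects an already started engine object, so the part that needs Matlab is shown only as commented usage.

```python
import os

from scipy.io import savemat


def send_array_via_disk(eng, array, mat_path, var_name="array"):
    """Save `array` to an uncompressed .mat file and load it into the engine's workspace."""
    mat_path = os.path.abspath(mat_path)
    # Step 1: save the array to disk (no compression was the fastest option above).
    savemat(mat_path, {var_name: array}, do_compression=False)
    # Step 2: store the path as a variable in Matlab's workspace.
    eng.workspace["mat_path"] = mat_path
    # Step 3: load the file from disk inside Matlab.
    eng.eval("load(mat_path);", nargout=0)
    return mat_path


# Usage, assuming the Matlab Engine API for Python is installed:
#   import matlab.engine
#   import numpy as np
#   eng = matlab.engine.start_matlab()
#   send_array_via_disk(eng, np.random.random((4, 4, 4)), "data_3d.mat")
#   print(eng.eval("size(array)"))
```

Passing only the path through the engine keeps the large array out of the element-by-element conversion, which is where the time went in the list-based approach.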

I hope this is useful to other developers 🤖

From a geek to geeks

