How to communicate data between Fortran and Python04 Jun 2017
In a previous post, I discussed extending: how to build custom Fortran modules you can run in a Python script. If you just want to post-process or plot your Fortran results, you can opt to simply output your data in a readable format, so that you can read it in other languages (e.g. Python, Matlab, C++).
This post lists three ways to output Fortran data: the quick way, the classic way, and the sustainable way.
Since Fortran 2003, it is possible to perform I/O with streams instead of batch writing.
This allows for a platform-independent binary output format that is easily readable by other applications.
Say that after meticulous calculations, we obtained our precious output array
results. We can write this array to a file as follows:
program create_data implicit none integer, parameter :: n = 10 integer (kind=8), dimension(n, n) :: results integer i, j, out_unit do i = 1, n do j = 1, n results(i, j) = i + j enddo enddo out_unit = 3 open(out_unit, file="output.dat", access="stream") write(out_unit) results close(out_unit) end program
We can subsequently use the file
output.dat as input for Python, using NumPy function
The following Python snippet imports your results.
import numpy as np n = 10 data = np.fromfile('output.dat', dtype=np.int64, count=n*n) results = data.reshape(n, n)
When importing the data into Python, it is important that you know the dimensions of the arrays, as well as the number of elements and the data type. When writing the arrays to file, Fortran only stores the array contents. Consider this example:
>>> np.fromfile('output.dat', dtype=np.int) array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 3, ... ]) >>> np.fromfile('output.dat', dtype=np.int32) array([ 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, ... ], dtype=np.int32)
If you do not correctly supply your array metadata, NumPy will simply consume all the bytes in the file (including header or footer) and convert them to numbers in the default format.
Additionally, don’t forget to clear your output file on subsequent Fortran runs, or add
status='replace' in your
For more information, check the docs.
The classical way
If you don’t have access to a recent Fortran compiler and are stuck working with Fortran 77, the easiest way to output is with the
unformatted file write option.
Using the same example as before:
program create_data implicit none integer, parameter :: n = 15 integer (kind=8), dimension(n, n) :: results integer i, j, out_unit do i = 1, n do j = 1, n results(i, j) = i + j enddo enddo out_unit = 3 open(out_unit, file="output.dat", form="unformatted") ! Adjusted open statement write(out_unit) results close(out_unit) end program
When opening a python script, the results can be read using SciPy’s
from scipy.io import FortranFile import numpy as np n = 15 ff = FortranFile('output.dat', 'r') # Specifying read-only access. data = ff.read_ints(dtype=np.int64) results = data.reshape(n, n)
Along with the caveats from the first method, this way has the additional disadvantage that the data storage is not platform-independent; it may differ depending on your system architecture or Fortran compiler. For some background information, look up Endianness or data headers). If you use gfortran on common 64-bit Linux platforms, you should be fine. Check the details here
The sustainable way
If you are writing code for serious applications and have a lot of data to output, you should consider using a recognized format. The HDF5 standard is used by many applications to store large sets of data, and the HDF5 group released an API library to use in Fortran 90+. For more details, take a look at the example codes.
It is a bit more involved than the other options, but data stored in this format can be reliably ported between many applications, even outside the scope of scientific computing.