How to communicate data between Fortran and Python

In a previous post, I discussed extending: how to build custom Fortran modules you can run in a Python script. If you just want to post-process or plot your Fortran results, you can opt to simply output your data in a readable format, so that you can read it in other languages (e.g. Python, Matlab, C++).

This post lists three ways to output Fortran data: the quick way, the classic way, and the sustainable way.

Quick way

Since Fortran 2003, it is possible to perform I/O with streams instead of batch writing. This allows for a platform-independent binary output format that is easily readable by other applications. Say that after meticulous calculations, we obtained our precious output arrayresults. We can write this array to a file as follows:

program create_data
  implicit none
  integer, parameter :: n = 10
  integer (kind=8), dimension(n, n) :: results
  integer i, j, out_unit

  do i = 1, n
    do j = 1, n
      results(i, j) = i + j
    enddo
  enddo
  out_unit = 3
  open(out_unit, file="output.dat", access="stream")
  write(out_unit) results
  close(out_unit)
end program

We can subsequently use the file output.dat as input for Python, using NumPy function np.fromfile(). The following Python snippet imports your results.

import numpy as np
n = 10
data = np.fromfile('output.dat', dtype=np.int64, count=n*n)
results = data.reshape(n, n)

Caveats

When importing the data into Python, it is important that you know the dimensions of the arrays, as well as the number of elements and the data type. When writing the arrays to file, Fortran only stores the array contents. Consider this example:

>>> np.fromfile('output.dat', dtype=np.int)
array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  3, ... ])
>>> np.fromfile('output.dat', dtype=np.int32)
array([ 2,  0,  3,  0,  4,  0,  5,  0,  6,  0,  7, ... ], dtype=np.int32)

If you do not correctly supply your array metadata, NumPy will simply consume all the bytes in the file (including header or footer) and convert them to numbers in the default format. Additionally, don’t forget to clear your output file on subsequent Fortran runs, or add status='replace' in your open statement. For more information, check the docs.

The classical way

If you don’t have access to a recent Fortran compiler and are stuck working with Fortran 77, the easiest way to output is with the unformatted file write option. Using the same example as before:

program create_data
  implicit none
  integer, parameter :: n = 15
  integer (kind=8), dimension(n, n) :: results
  integer i, j, out_unit

  do i = 1, n
    do j = 1, n
      results(i, j) = i + j
    enddo
  enddo
  out_unit = 3
  open(out_unit, file="output.dat", form="unformatted") ! Adjusted open statement
  write(out_unit) results
  close(out_unit)
end program

When opening a python script, the results can be read using SciPy’s FortranFile:

from scipy.io import FortranFile
import numpy as np
n = 15
ff = FortranFile('output.dat', 'r') # Specifying read-only access.
data = ff.read_ints(dtype=np.int64)
results = data.reshape(n, n)

Caveats

Along with the caveats from the first method, this way has the additional disadvantage that the data storage is not platform-independent; it may differ depending on your system architecture or Fortran compiler. For some background information, look up Endianness or data headers). If you use gfortran on common 64-bit Linux platforms, you should be fine. Check the details here

The sustainable way

If you are writing code for serious applications and have a lot of data to output, you should consider using a recognized format. The HDF5 standard is used by many applications to store large sets of data, and the HDF5 group released an API library to use in Fortran 90+. For more details, take a look at the example codes.

It is a bit more involved than the other options, but data stored in this format can be reliably ported between many applications, even outside the scope of scientific computing.

Good luck!