Parsing and Sorting with Fortran

January 14, 2020

Fortran is a compiled language that is old as hell. It was cooked up by John Backus way back in 1953. According to the Fortran Wikipedia article, it's also structured, imperative, procedural, and object-oriented. The name was derived from Formula Translation and it was built for mathing and sciencing (both words I made up). Wikipedia has the following to say about Fortran:

It is a popular language for high-performance computing and is used for programs that benchmark and rank the world's fastest supercomputers.

Huh. Alright, well let's use it to make a CSV file.

Quick Fortran Primer

If you want to follow along, you'll need the ability to compile Fortran on your computer. There's online REPLs that can run Fortran, but we're doing file I/O, so that won't fly. If you're a macOS user, you can use gfortran, which gets bundled with GCC. You can install it with Homebrew:

brew install gcc

If you are using Windows, godspeed.

Compiling is a piece of cake:

gfortran example.f95 -o example

I'm not going to spend too much time talking about the various keywords and syntax associated with Fortran. I'll let the code do the talking. I will note that I'm using Fortran 95, which added some neat features and DOESN'T REQUIRE ME TO WRITE ALL CODE IN UPPERCASE.

That's being said, I will mention a couple of things I thought were interesting. Declaring variables is a little wonky: you use two colons (::). Most of the examples I've seen do a lot of reference passing to subroutines (reminiscent of C/C++ and Windows API calls). Here's a contrived example:

program some_program
    integer :: i_need_this

    call get_what_i_need(i_need_this)
    print *, i_need_this ! This prints out 10

contains
    subroutine get_what_i_need(i_need_this)
        integer :: i_need_this

        i_need_this = 10

    end subroutine get_what_i_need
end program some_program

I imagine if you're a die-hard functional programming enthusiast, you're breathing into a paper bag right now. It ain't pretty, but it works.

Getting data from files can also be a little tricky. You have to create a "status" integer variable that indicates when you've reached the end of a file. I haven't programmed in enough languages to know if that's weird or normal.

Don't even get me started on strings. Say what you will about JavaScript, but declaring a string is a piece of cake:

const myString = "stringydoodle";
console.log(myString); // <- Prints "stringydoodle"

Let's try that in Fortran:

program test
    character(12) :: my_string
    my_string = "stringydoodle"
    print *, my_string ! <- Prints "stringydoodl"
end program test

The eagle-eyed among you noticed that the Fortran example printed stringydoodl (sans e). That's because you have to either explicitly specify a character length or use dynamically-allocatable strings (another reason for using Fortran 95). So you can do this instead:

program test
    character(len=:), allocatable :: my_string
    my_string = "stringydoodle"
    print *, my_string ! <- Prints "stringydoodle", hooray!
end program test

With all that out of the way, let's get crackin'.

The Scenario

I'm going to use Fortran to take some input as a text file, sort it, and output it to a CSV file. I generated some data on generatedata.com. It's a list of random names with a numeric value separated by a | (pipe) character. If you want to follow along, I created a gist with the Fortran file and input data. Here's a snippet of the data (input.txt):

Vivian Larson|5458
Geraldine Gibson|9207
Violet Price|9745
Hilda Cohen|6835
...

The actual input.txt file in the gist has 200 names. We're going to treat the number to the right of the pipe character as a debt that's owed to...someone. To whom it is owed isn't really important. As mentioned earlier, the program will read it, sort the values, and output it to a CSV file.

The Execution

Let's start with some initialization code. The snippet below represents the start of the program. I'm defining a type, debtor with a name and debt property (which corresponds to the input file). I'm using global variables (gasp) because I didn't want to pass the debtors and debtor_count to every single method.

After all the initialization mumbo jumbo, I'm calling the assign_values_to_globals() and prioritize_debt() methods to actually make the program do stuff. I'll be going over those methods next.

program debt_prioritizer
  implicit none

  type debtor
      character(len=:), allocatable :: name
      integer                       :: debt
  end type debtor

  type(debtor), dimension(:), allocatable :: debtors
  integer                                 :: debtor_count = 0

  call assign_values_to_globals()
  call prioritize_debt()
contains

The assign_values_to_globals() method calls the assign_debtor_count() and assign_debtors() methods.

  subroutine assign_values_to_globals()
      call assign_debtor_count(debtor_count)
      allocate(debtors(debtor_count))
      call assign_debtors()
  end subroutine assign_values_to_globals

The assign_debtor_count() method loops through the file to determine how many people owe money. I know this seems stupid to open the file twice, but I couldn't find a way to dynamically allocate items to an array that I could actually understand.

  subroutine assign_debtor_count(debtor_count)
      integer :: debtor_count, read_status, sum_of_lines = 0

      open(10, file = "input.txt", status = "old", action = "read")
      do
          read(10, *, iostat = read_status)
          if (read_status /= 0) exit
          sum_of_lines = (sum_of_lines + 1)
      end do
      close(10)

      debtor_count = sum_of_lines
  end subroutine assign_debtor_count

The assign_debtors() method below opens the input file (again), loops through the records, and populates the debtors array with the name and debt of the corresponding debtor. The get_name_and_debt() method parses the file line to split the value into a name and debt.

  subroutine assign_debtors()
      character(50) :: line, name
      integer       :: read_status, debt, line_number = 1

      open(10, file = "input.txt", status = "old", action = "read")
      do
          read(10, "(A)", iostat = read_status) line

          if (read_status /= 0) exit

          call get_name_and_debt(line, name, debt)
          debtors(line_number)%name = name
          debtors(line_number)%debt = debt

          line_number = (line_number + 1)
      end do
      close(10)
  end subroutine assign_debtors


  subroutine get_name_and_debt(line, name, debt)
      character(1)  :: DELIM = "|"
      character(50) :: line, name, debt_string
      integer       :: debt, index

      line = trim(line)
      index = scan(line, DELIM)
      name = line(1 : index - 1)
      debt_string = line(index + 1 : )
      read(debt_string, '(I5)') debt
  end subroutine get_name_and_debt

Once the debtors are loaded into the array and their debt is calculated, the prioritize_debt() method reorders them descending by debt (so whoever owes the most money is right at the top of the list).

The sort_descending_by_debt() method is a ridiculously inefficient bubble sort. It loops through each record, and within that loop it does another loop. The higher debt amount is moved before the lower debt amount. This has to be done for each record, which is why there are 2 loops.

  subroutine prioritize_debt()
      call sort_descending_by_debt()
      call write_debtors_to_file()
      print *, "Done!"
  end subroutine prioritize_debt


  subroutine sort_descending_by_debt()
      integer :: i, j

      do i = 1, debtor_count
          do j = debtor_count, (i + 1), -1
              call swap(debtors(j - 1), debtors(j))
          end do
      end do
  end subroutine sort_descending_by_debt


  subroutine swap(left_debtor, right_debtor)
      type(debtor) :: left_debtor, right_debtor, temp

      if (left_debtor%debt < right_debtor%debt) then
          temp = right_debtor
          right_debtor = left_debtor
          left_debtor = temp
      end if
  end subroutine swap

The final step is the write_debtors_to_file() method below. If the file doesn't already exist, I create it in the same directory as the compiled program. Once the file exists, it writes each debtor to the file.

  subroutine write_debtors_to_file()
      character(len=:), allocatable :: name_trimmed
      integer                       :: debtor_number

      print *, "Creating the output file..."
      open(20, file = "output.csv", action = "write")

      write (20, "(A)") "Name,Debt"

      do i = 1, debtor_count
          name_trimmed = trim(debtors(i)%name)
          print *, debtors(i)%debt
          write (20, 2000) name_trimmed, ",", debtors(i)%debt
          2000 format (A, A1, I5)
      end do

      close(20)
  end subroutine write_debtors_to_file
end program debt_prioritizer

Wrap Up

That's all there is to it. If you want to make sure it works, download the files in the gist and run the following command to compile it:

gfortran debt-prioritizer.f95 -o debt-prioritizer

If everything compiled, try running the program:

./debt-prioritizer

You should see an output.csv file with the names sorted descending by debt amount.

I'll admit I wasn't crazy about Fortran at first, but the more I wrote, the more I liked it. I would imagine I'd appreciate it much more if I had to crunch some serious numbers.

Considering it's one of the oldest compiled languages, and it's still being used all the time, I'd venture to guess that it's pretty powerful. If you ever need to create a CSV file using Fortran under threat of serious bodily harm, you can thank me when you succeed.