Parsing and Sorting with Fortran
January 14, 2020Fortran is a compiled language that is old as hell. It was cooked up by John Backus way back in 1953. According to the Fortran Wikipedia article, it's also structured, imperative, procedural, and object-oriented. The name was derived from Formula Translation and it was built for mathing and sciencing (both words I made up). Wikipedia has the following to say about Fortran:
It is a popular language for high-performance computing and is used for programs that benchmark and rank the world's fastest supercomputers.
Huh. Alright, well let's use it to make a CSV file.
Quick Fortran Primer
If you want to follow along, you'll need the ability to compile Fortran on your computer. There's online REPLs that can run Fortran, but we're doing file I/O, so that won't fly. If you're a macOS user, you can use gfortran, which gets bundled with GCC. You can install it with Homebrew:
brew install gcc
If you are using Windows, godspeed.
Compiling is a piece of cake:
gfortran example.f95 -o example
I'm not going to spend too much time talking about the various keywords and syntax associated with Fortran.
I'll let the code do the talking. I will note that I'm using Fortran 95, which added some neat features and
DOESN'T REQUIRE ME TO WRITE ALL CODE IN UPPERCASE
.
That's being said, I will mention a couple of things I thought were interesting. Declaring variables
is a little wonky: you use two colons (::
). Most of the examples I've seen do a lot of reference
passing to subroutines (reminiscent of C/C++ and Windows API calls). Here's a contrived example:
program some_program
integer :: i_need_this
call get_what_i_need(i_need_this)
print *, i_need_this ! This prints out 10
contains
subroutine get_what_i_need(i_need_this)
integer :: i_need_this
i_need_this = 10
end subroutine get_what_i_need
end program some_program
I imagine if you're a die-hard functional programming enthusiast, you're breathing into a paper bag right now. It ain't pretty, but it works.
Getting data from files can also be a little tricky. You have to create a "status" integer variable that indicates when you've reached the end of a file. I haven't programmed in enough languages to know if that's weird or normal.
Don't even get me started on strings. Say what you will about JavaScript, but declaring a string is a piece of cake:
const myString = "stringydoodle";
console.log(myString); // <- Prints "stringydoodle"
Let's try that in Fortran:
program test
character(12) :: my_string
my_string = "stringydoodle"
print *, my_string ! <- Prints "stringydoodl"
end program test
The eagle-eyed among you noticed that the Fortran example printed stringydoodl
(sans e
). That's
because you have to either explicitly specify a character length or use dynamically-allocatable
strings (another reason for using Fortran 95). So you can do this instead:
program test
character(len=:), allocatable :: my_string
my_string = "stringydoodle"
print *, my_string ! <- Prints "stringydoodle", hooray!
end program test
With all that out of the way, let's get crackin'.
The Scenario
I'm going to use Fortran to take some input as a text file, sort it, and output it to a CSV file.
I generated some data on generatedata.com. It's
a list of random names with a numeric value separated by a |
(pipe) character. If you want to follow along, I
created a gist with the Fortran file and input data.
Here's a snippet of the data (input.txt
):
Vivian Larson|5458
Geraldine Gibson|9207
Violet Price|9745
Hilda Cohen|6835
...
The actual input.txt
file in the gist has 200 names. We're going to treat the number to the right of the
pipe character as a debt that's owed to...someone. To whom it is owed isn't really important. As mentioned
earlier, the program will read it, sort the values, and output it to a CSV file.
The Execution
Let's start with some initialization code. The snippet below represents the start of the program.
I'm defining a type, debtor
with a name
and debt
property (which corresponds to the input file).
I'm using global variables (gasp) because I didn't want to pass the debtors
and debtor_count
to
every single method.
After all the initialization mumbo jumbo, I'm calling the assign_values_to_globals()
and
prioritize_debt()
methods to actually make the program do stuff. I'll be going over those
methods next.
program debt_prioritizer
implicit none
type debtor
character(len=:), allocatable :: name
integer :: debt
end type debtor
type(debtor), dimension(:), allocatable :: debtors
integer :: debtor_count = 0
call assign_values_to_globals()
call prioritize_debt()
contains
The assign_values_to_globals()
method calls the assign_debtor_count()
and assign_debtors()
methods.
subroutine assign_values_to_globals()
call assign_debtor_count(debtor_count)
allocate(debtors(debtor_count))
call assign_debtors()
end subroutine assign_values_to_globals
The assign_debtor_count()
method loops through the file to determine how many people owe
money. I know this seems stupid to open the file twice, but I couldn't find a way to dynamically
allocate items to an array that I could actually understand.
subroutine assign_debtor_count(debtor_count)
integer :: debtor_count, read_status, sum_of_lines = 0
open(10, file = "input.txt", status = "old", action = "read")
do
read(10, *, iostat = read_status)
if (read_status /= 0) exit
sum_of_lines = (sum_of_lines + 1)
end do
close(10)
debtor_count = sum_of_lines
end subroutine assign_debtor_count
The assign_debtors()
method below opens the input file (again), loops through the records, and
populates the debtors
array with the name
and debt
of the corresponding debtor
. The
get_name_and_debt()
method parses the file line to split the value into a name
and debt
.
subroutine assign_debtors()
character(50) :: line, name
integer :: read_status, debt, line_number = 1
open(10, file = "input.txt", status = "old", action = "read")
do
read(10, "(A)", iostat = read_status) line
if (read_status /= 0) exit
call get_name_and_debt(line, name, debt)
debtors(line_number)%name = name
debtors(line_number)%debt = debt
line_number = (line_number + 1)
end do
close(10)
end subroutine assign_debtors
subroutine get_name_and_debt(line, name, debt)
character(1) :: DELIM = "|"
character(50) :: line, name, debt_string
integer :: debt, index
line = trim(line)
index = scan(line, DELIM)
name = line(1 : index - 1)
debt_string = line(index + 1 : )
read(debt_string, '(I5)') debt
end subroutine get_name_and_debt
Once the debtors are loaded into the array and their debt is calculated, the prioritize_debt()
method reorders them descending by debt (so whoever owes the most money is right at the top of the
list).
The sort_descending_by_debt()
method is a ridiculously inefficient bubble sort. It loops through
each record, and within that loop it does another loop. The higher debt amount is moved before the
lower debt amount. This has to be done for each record, which is why there are 2 loops.
subroutine prioritize_debt()
call sort_descending_by_debt()
call write_debtors_to_file()
print *, "Done!"
end subroutine prioritize_debt
subroutine sort_descending_by_debt()
integer :: i, j
do i = 1, debtor_count
do j = debtor_count, (i + 1), -1
call swap(debtors(j - 1), debtors(j))
end do
end do
end subroutine sort_descending_by_debt
subroutine swap(left_debtor, right_debtor)
type(debtor) :: left_debtor, right_debtor, temp
if (left_debtor%debt < right_debtor%debt) then
temp = right_debtor
right_debtor = left_debtor
left_debtor = temp
end if
end subroutine swap
The final step is the write_debtors_to_file()
method below. If the file doesn't already
exist, I create it in the same directory as the compiled program. Once the file exists,
it writes each debtor to the file.
subroutine write_debtors_to_file()
character(len=:), allocatable :: name_trimmed
integer :: debtor_number
print *, "Creating the output file..."
open(20, file = "output.csv", action = "write")
write (20, "(A)") "Name,Debt"
do i = 1, debtor_count
name_trimmed = trim(debtors(i)%name)
print *, debtors(i)%debt
write (20, 2000) name_trimmed, ",", debtors(i)%debt
2000 format (A, A1, I5)
end do
close(20)
end subroutine write_debtors_to_file
end program debt_prioritizer
Wrap Up
That's all there is to it. If you want to make sure it works, download the files in the gist and run the following command to compile it:
gfortran debt-prioritizer.f95 -o debt-prioritizer
If everything compiled, try running the program:
./debt-prioritizer
You should see an output.csv
file with the names sorted descending by debt amount.
I'll admit I wasn't crazy about Fortran at first, but the more I wrote, the more I liked it. I would imagine I'd appreciate it much more if I had to crunch some serious numbers.
Considering it's one of the oldest compiled languages, and it's still being used all the time, I'd venture to guess that it's pretty powerful. If you ever need to create a CSV file using Fortran under threat of serious bodily harm, you can thank me when you succeed.