The Forensic Files API, Part 3

Visibility Zero

February 15th, 2020

In Part 2 of this series, I described how I downloaded all of the Forensic Files episodes off of YouTube with a combination of JavaScript scripts, Go mojo, and plucky determination. With that box checked, I'm one step closer to sending something off to a speech-to-text service. The service is expecting audio files, and I don't want to try sending MP4 video files over the wire because that would just be wasteful.

With that in mind, I need some tooling and a technique for extracting the audio out of the video files. In keeping with the motif of naming packages after episodes, I got clever with this one and used the title of season 8, episode 41: Visibility Zero. Get it? I'm removing the video, so there's zero visibility. Ahem, yes, well...here's the synopsis of that episode, taken from the Forensic Files Wiki:

In 1993, the Amtrak Railroad experienced the deadliest train crash in United States history when the Sunset Limited derailed while crossing Alabama's Big Bayou Canot bridge. Forty-seven passengers and crew were killed; scores more were injured. The clues to the cause of the crash lay etched in twisted steel and buried in the mud of the Big Bayou Canot.

Yikes.

Course of Action

I knew right from the get-go that I was going to use ffmpeg for the audio extraction. Aside from being free, it's been around for almost 20 years and works out of the box without having to pass a bunch of flags or perform an elaborate setup process. I broke this project down into the following tasks:

  1. Create a new /assets/audio directory with sub-directories for each season (using the same naming convention as /assets/videos)
  2. Loop through the episodes in the /assets/videos directories and run the ffmpeg command on each one
  3. Output the result to the /assets/audio/season-X directory with an .mp3 file extension

Preparation

I added a new directory, /internal/visibilityzero, to my project and got to work.

Before I dove into the Go code, I installed ffmpeg. If you're on macOS, installation is a piece of cake with Homebrew:

brew install ffmpeg

As soon as the installation finished up, it was time to Go (heh).

The Extraction Code

I started off by writing a function to get the full path to the audio file that ffmpeg will output. Thanks to the functions in Go's path/filepath package, deriving that path was straightforward:

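func audioFilePath(videoPath string) string {
	// Split the video path into its season directory and file name:
	dir, file := filepath.Split(videoPath)
	seasonDir := filepath.Base(dir)
	// Swap only the trailing extension, in case ".mp4" appears elsewhere
	// in the file name:
	mp3FileName := strings.TrimSuffix(file, ".mp4") + ".mp3"
	return filepath.Join(crimeseen.AudioDirPath, seasonDir, mp3FileName)
}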

As mentioned in the previous post, the crimeseen package is used to house utility functions and commonly used paths. It's located in the /internal/crimeseen directory of the repo.

Next up, I wrote a function that will execute the ffmpeg command with the proper arguments, given a source video file path and target audio file path. Here's what I ended up with:

func extractAudioFromEpisode(videoPath string, audioPath string) {
	log.WithFields(logrus.Fields{
		"video": filepath.Base(videoPath),
	}).Info("Extracting audio from video file")

	cmd := exec.Command("ffmpeg", "-i", videoPath, audioPath)
	err := cmd.Run()
	if err != nil {
		log.WithFields(logrus.Fields{
			"error": err,
			"video": filepath.Base(videoPath),
		}).Error("Error extracting audio")
	}
}
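
The command above leans on ffmpeg inferring the output format from the .mp3 extension, and the log entry only gets Go's exit-status error. As a hedged alternative (not what the repo does), the sketch below spells out a few standard ffmpeg options and captures ffmpeg's own output so a failure message includes the reason. The names buildFFmpegArgs and extractAudio are hypothetical:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// buildFFmpegArgs is a hypothetical helper that returns the ffmpeg
// arguments for pulling the audio track out of a video file.
func buildFFmpegArgs(videoPath, audioPath string) []string {
	return []string{
		"-nostdin", // never wait for keyboard input
		"-n",       // exit instead of overwriting an existing output file
		"-i", videoPath,
		"-vn",     // skip the video stream
		audioPath, // the .mp3 extension selects the output format
	}
}

// extractAudio runs ffmpeg and, on failure, returns an error that
// includes whatever ffmpeg printed to stdout and stderr.
func extractAudio(videoPath, audioPath string) error {
	cmd := exec.Command("ffmpeg", buildFFmpegArgs(videoPath, audioPath)...)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("ffmpeg failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: extract <video.mp4> <audio.mp3>")
		return
	}
	if err := extractAudio(os.Args[1], os.Args[2]); err != nil {
		fmt.Println(err)
	}
}
```

Keeping the argument list in its own function makes it easy to check without actually running ffmpeg.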

That handles an individual episode, so the next logical step would be to handle all of the episode files in a season directory. I ended up using Go's filepath.Walk to accomplish this. The second argument to Walk is a function that gets called for each file and directory it visits. In my case, I'm checking whether the audio file already exists and calling extractAudioFromEpisode() if it doesn't:

func videoPathWalkFunc(path string, info os.FileInfo, err error) error {
	// Walk passes a non-nil err (and possibly a nil info) when it can't
	// read a path, so check it before doing anything else:
	if err != nil {
		log.WithFields(logrus.Fields{
			"path":  path,
			"error": err,
		}).Error("Error in walk function")
		return err
	}

	if strings.HasSuffix(path, ".mp4") {
		// Every 10 videos, take a 5 minute breather. ffmpeg makes the
		// fans go bananas on my laptop:
		if processedCount != 0 && processedCount%10 == 0 {
			log.Infoln("Taking a breather or else I'm going to take off")
			time.Sleep(time.Minute * 5)
		}

		audioPath := audioFilePath(path)

		if !crimeseen.FileExists(audioPath) {
			extractAudioFromEpisode(path, audioPath)
			processedCount++
		}
	}

	return nil
}

That processedCount is a global variable that tracks how many videos have been processed. Every 10 videos, I take a 5 minute break. I originally had ffmpeg chugging away non-stop until I noticed that the fans on my MacBook Pro were spinning so fast that I thought it was going to melt. I let ol' Bessie (I just named my laptop Bessie) take a breather so she didn't keel over.

As you may have already guessed, the last thing I need to write (and the first thing I need to run) is the functionality to loop through each season directory so I can walk through the files and process them:

func extractAudioFromAllSeasons() {
	err := crimeseen.Mkdirp(filepath.Join(crimeseen.AudioDirPath))
	if err != nil {
		log.WithField("error", err).Fatal("Error creating audio directory")
	}

	processedCount = 0

	for season := 1; season <= crimeseen.SeasonCount; season++ {
		seasonDir := "season-" + strconv.Itoa(season)
		createSeasonAudioDir(seasonDir)

		err = filepath.Walk(
			filepath.Join(crimeseen.VideosDirPath, seasonDir),
			videoPathWalkFunc,
		)

		if err != nil {
			log.WithFields(logrus.Fields{
				"season": season,
				"error":  err,
			}).Fatal("Error walking season video directory")
		}
	}
}

I make sure I set processedCount to 0 before looping through each season directory and walking through the episodes. Note that I also ensure the /assets/audio directory exists before starting the loop and I create the season directory (e.g. /assets/audio/season-1) before walking through the corresponding /assets/videos directory and running the ffmpeg command.

The only exported function, ExtractAudio, does a quick check to make sure ffmpeg is installed, then calls extractAudioFromAllSeasons:

func ExtractAudio() {
	checkForFFmpeg()
	extractAudioFromAllSeasons()
}
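
The body of checkForFFmpeg isn't shown here. One common way to do this kind of check in Go is exec.LookPath, which searches the PATH for an executable. The sketch below is an assumption about how such a check could work, not the repo's actual code, and requireBinary is a hypothetical name:

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireBinary is a hypothetical stand-in for checkForFFmpeg. It
// returns the resolved path of name, or an error if name can't be
// found on the PATH.
func requireBinary(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s not found on PATH: %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := requireBinary("ffmpeg")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("using", path)
}
```

Failing fast like this beats discovering the missing binary a few hundred log lines into the first season.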

You can see the whole file, with comments and any functions I left out, on the GitHub repo. Finally, I added a command to alibi to call the exported function, then kicked off the long and arduous journey of audio extraction:

go run ./cmd/alibi/alibi.go viszero

I ended up monitoring this job pretty closely, and by "closely" I mean sitting on the couch watching TV while it ran. Whenever the fans really started whirring, I'd cancel the job and give the laptop 10 or 20 minutes to cool down. I was able to get almost everything extracted over the course of two days, but I did hit a bit of a snag.

The Bit of a Snag

While compiling the list of episode URLs into a JSON file (which I covered in part 2 of this series), I came across some seasons with missing episodes. I was able to get most of the URLs by searching for them separately, but these stragglers turned out to be in .m3u format, and ffmpeg did not like that.

I'm sure that I could have dug into the ffmpeg documentation and found out if there was a flag or something I needed to use to get this to work. I did not go that route for the following reasons:

  1. It would take extra time and my goal isn't to become an ffmpeg expert
  2. I would have needed to add code to accommodate the .m3u file extension
  3. There were only 9 episodes in the .m3u format

I already had VLC Player installed, so I used its conversion feature to convert those videos to audio and output the resulting MP3 files to the appropriate directories. I now have all of my audio files ready for speech recognition.

Stay tuned for my next blog post, in which I cover the speech-to-text service integration! Some spoiler alerts: it involves a premium ngrok account, HTTP timeout errors, and surprise at how quickly you can burn through the free minutes of an IBM Watson resource.