The Forensic Files API, Part 3
Visibility Zero
February 15, 2020
In Part 2 of this series, I described how I downloaded all the Forensic Files episodes off of YouTube with a combination of JavaScript scripts, Go mojo, and plucky determination. With that box checked, I'm one step closer to sending something off to a speech-to-text service. The service is expecting audio files, and I don't want to try sending MP4 video files over the wire because that would just be wasteful.
With that in mind, I need some tooling and a technique for extracting the audio out of the video files. In keeping with the motif of naming packages after episodes, I got clever with this one and used the title of season 8, episode 41: Visibility Zero. Get it? I'm removing the video, so there's zero visibility. Ahem, yes, well...here's the synopsis of that episode, taken from the Forensic Files Wiki:
In 1993, the Amtrak Railroad experienced the deadliest train crash in United States history when the Sunset Limited derailed while crossing Alabama's Big Bayou Canot bridge. Forty-seven passengers and crew were killed; scores more were injured. The clues to the cause of the crash lay etched in twisted steel and buried in the mud of the Big Bayou Canot.
Yikes.
Course of Action
I knew right from the get-go that I was going to use ffmpeg for the audio extraction. Aside from being free, it's been around for almost 20 years and works out of the box without having to pass a bunch of flags or perform an elaborate setup process. I broke this project down into the following tasks:
- Create a new /assets/audio directory with subdirectories for each season (using the same naming convention as /videos)
- Loop through the episodes in the /assets/videos directories and run the ffmpeg command on each one
- Output the result to the /assets/audio/season-X directory with an .mp3 file extension
Preparation
I added a new directory, /internal/visibilityzero, to my project and got to work.
Before I dove into the Go code, I installed ffmpeg. If you're on macOS, installation is a piece of cake with Homebrew:
brew install ffmpeg
As soon as the installation finished up, it was time to Go (heh).
The Extraction Code
I started off by writing a function to get the full path to the audio file that ffmpeg will output.
Thanks to Go's filepath package, deriving that path was a piece of cake:
func audioFilePath(videoPath string) string {
	dir, file := filepath.Split(videoPath)
	seasonDir := filepath.Base(dir)
	mp3FileName := strings.Replace(file, ".mp4", ".mp3", -1)
	return filepath.Join(crimeseen.AudioDirPath, seasonDir, mp3FileName)
}
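To make the mapping concrete, here is a minimal standalone sketch of the same idea. The audioDirPath constant stands in for crimeseen.AudioDirPath, and both its value and the episode file name are assumptions made for illustration. It swaps strings.Replace for strings.TrimSuffix, which only touches a trailing ".mp4" rather than every occurrence in the file name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// audioDirPath is a hypothetical stand-in for crimeseen.AudioDirPath.
const audioDirPath = "assets/audio"

// audioFilePath maps a video path to the MP3 path in the matching
// season directory under audioDirPath.
func audioFilePath(videoPath string) string {
	dir, file := filepath.Split(videoPath)
	// Base drops the trailing separator, leaving e.g. "season-8".
	seasonDir := filepath.Base(dir)
	// Only a trailing ".mp4" is removed before ".mp3" is appended.
	mp3FileName := strings.TrimSuffix(file, ".mp4") + ".mp3"
	return filepath.Join(audioDirPath, seasonDir, mp3FileName)
}

func main() {
	// The episode file name here is made up for the example.
	fmt.Println(audioFilePath("assets/videos/season-8/episode-41.mp4"))
}
```

The difference only matters for a file name that happens to contain ".mp4" somewhere other than the end, so treat it as a defensive tweak rather than a required fix.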
As mentioned in the previous post, the crimeseen package is used to house utility functions and commonly used paths. It's located in the /internal/crimeseen directory of the repo.
Next up, I wrote a function that will execute the ffmpeg command with the proper arguments, given a source video file path and target audio file path. Here's what I ended up with:
func extractAudioFromEpisode(videoPath string, audioPath string) {
	log.WithFields(logrus.Fields{
		"video": filepath.Base(videoPath),
	}).Info("Extracting audio from video file")
	cmd := exec.Command("ffmpeg", "-i", videoPath, audioPath)
	err := cmd.Run()
	if err != nil {
		log.WithFields(logrus.Fields{
			"error": err,
			"video": filepath.Base(videoPath),
		}).Error("Error extracting audio")
	}
}
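The plain ffmpeg -i call above works without any extra flags. As an optional variant, here is a hedged sketch that makes two things explicit: -vn drops the video stream, and -n tells ffmpeg never to overwrite an existing output file. The function names ffmpegArgs and extractAudio are made up for this example and are not part of the repo. It also uses CombinedOutput so that ffmpeg's own message is available when a run fails.

```go
package main

import (
	"fmt"
	"os/exec"
)

// ffmpegArgs builds the argument list for an audio-only extraction.
// -n: never overwrite an existing output file.
// -vn: skip the video stream so only audio is written.
func ffmpegArgs(videoPath, audioPath string) []string {
	return []string{"-n", "-i", videoPath, "-vn", audioPath}
}

// extractAudio runs ffmpeg and returns everything it wrote to stdout
// and stderr, along with any error from the process.
func extractAudio(videoPath, audioPath string) ([]byte, error) {
	cmd := exec.Command("ffmpeg", ffmpegArgs(videoPath, audioPath)...)
	return cmd.CombinedOutput()
}

func main() {
	// Print the arguments rather than running ffmpeg, so the sketch
	// works even on a machine without ffmpeg installed.
	fmt.Println(ffmpegArgs("episode.mp4", "episode.mp3"))
}
```

Keeping the argument list in its own function makes it easy to check without actually invoking ffmpeg.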
That handles an individual episode, so the next logical step would be to handle all the episode files in a season directory. I ended up using Go's filepath.Walk to accomplish this. The second argument to the Walk function is the function that does something with each file in the directory. In my case, I'm checking if the audio file for the episode already exists and calling extractAudioFromEpisode() if it doesn't:
func videoPathWalkFunc(path string, info os.FileInfo, err error) error {
	// Walk passes a non-nil err when it could not read path. In that
	// case info may be nil, so handle the error before anything else:
	if err != nil {
		log.WithFields(logrus.Fields{
			"path":  path,
			"error": err,
		}).Error("Error in walk function")
		return err
	}
	if strings.HasSuffix(path, ".mp4") {
		// Every 10 videos, take a 5 minute breather. ffmpeg makes the
		// fans go bananas on my laptop:
		if processedCount != 0 && processedCount%10 == 0 {
			log.Infoln("Taking a breather or else I'm going to take off")
			time.Sleep(time.Minute * 5)
		}
		audioPath := audioFilePath(path)
		if !crimeseen.FileExists(audioPath) {
			extractAudioFromEpisode(path, audioPath)
			processedCount++
		}
	}
	return nil
}
That processedCount is a global variable that tracks how many videos have been processed. Every 10 videos, I take a 5-minute break. I originally had ffmpeg chugging away non-stop until I noticed that the fans on my MacBook Pro were spinning so fast that I thought it was going to melt. I let ol' Bessie (I just named my laptop Bessie) take a breather, so she didn't keel over.
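The breather rule is easy to pull out into a small pure function. This is only a sketch: shouldPause is a hypothetical name, and the 10-millisecond sleep stands in for the real 5-minute break so the example finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// shouldPause reports whether a cool-down is due after `processed`
// files: once per `every` files, and never before the first file.
func shouldPause(processed, every int) bool {
	return every > 0 && processed != 0 && processed%every == 0
}

func main() {
	const every = 10
	pause := 10 * time.Millisecond // stand-in for the 5-minute break

	for processed := 0; processed <= 20; processed++ {
		if shouldPause(processed, every) {
			fmt.Println("pausing after", processed, "files")
			time.Sleep(pause)
		}
	}
}
```

With the rule isolated like this, the interval could be changed or checked without a global counter or a real five-minute wait.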
As you may have already guessed, the last thing I need to write (and the first thing I need to run) is the functionality to loop through each season directory, so I can walk through the files and process them:
func extractAudioFromAllSeasons() {
	err := crimeseen.Mkdirp(filepath.Join(crimeseen.AudioDirPath))
	if err != nil {
		log.WithField("error", err).Fatal("Error creating audio directory")
	}
	processedCount = 0
	for season := 1; season <= crimeseen.SeasonCount; season++ {
		seasonDir := "season-" + strconv.Itoa(season)
		createSeasonAudioDir(seasonDir)
		err = filepath.Walk(
			filepath.Join(crimeseen.VideosDirPath, seasonDir),
			videoPathWalkFunc,
		)
		if err != nil {
			log.WithFields(logrus.Fields{
				"season": season,
				"error":  err,
			}).Fatal("Error walking season video directory")
		}
	}
}
I make sure I set processedCount to 0 before looping through each season directory and walking through the episodes. Note that I also ensure the /assets/audio directory exists before starting the loop, and I create the season directory (e.g. /assets/audio/season-1) before walking through the corresponding /assets/videos directory and running the ffmpeg command.
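For readers without the repo handy, here is a rough sketch of what the directory setup might look like using only the standard library. It assumes that Mkdirp and createSeasonAudioDir behave roughly like os.MkdirAll, which the post doesn't show. The names seasonDirName and createSeasonDirs are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// seasonDirName mirrors the "season-N" naming used for the videos.
func seasonDirName(season int) string {
	return "season-" + strconv.Itoa(season)
}

// createSeasonDirs creates root plus one subdirectory per season.
// os.MkdirAll succeeds if a directory already exists, so the call
// is safe to repeat across runs.
func createSeasonDirs(root string, seasonCount int) error {
	for season := 1; season <= seasonCount; season++ {
		dir := filepath.Join(root, seasonDirName(season))
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// A temp directory keeps the demo away from a real assets folder.
	root, err := os.MkdirTemp("", "audio")
	if err != nil {
		fmt.Println("could not create temp dir:", err)
		return
	}
	defer os.RemoveAll(root)

	if err := createSeasonDirs(root, 3); err != nil {
		fmt.Println("could not create season dirs:", err)
		return
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		fmt.Println("could not list temp dir:", err)
		return
	}
	for _, entry := range entries {
		fmt.Println(entry.Name())
	}
}
```

Because the directory creation is idempotent, cancelling and restarting the job doesn't need any special handling on this front.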
The only exported function, ExtractAudio, does a quick check to make sure ffmpeg is installed, then calls extractAudioFromAllSeasons:
func ExtractAudio() {
	checkForFFmpeg()
	extractAudioFromAllSeasons()
}
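The body of checkForFFmpeg isn't shown here. One common way to write such a check is exec.LookPath, which searches the PATH for an executable. The sketch below assumes that approach and uses a hypothetical helper name, requireBinary; the version in the repo may well differ.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requireBinary returns the resolved path of the named executable,
// or an error if it cannot be found on the PATH.
func requireBinary(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s not found on PATH: %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := requireBinary("ffmpeg")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("using", path)
}
```

Failing fast like this beats discovering the missing binary after the first episode errors out.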
You can see the whole file, with comments and any functions I left out, on the GitHub repo.
Finally, I added a command to alibi to call the public method, then kicked off the long and arduous journey of audio extraction:
go run ./cmd/alibi/alibi.go viszero
I ended up monitoring this job pretty closely, and by "closely" I mean sitting on the couch watching TV while it ran. Whenever the fans really started whirring, I'd cancel the job and give the laptop 10 or 20 minutes to cool down. I was able to get almost everything extracted over the course of two days, but I did hit a bit of a snag.
The Bit of a Snag
While compiling the list of episode URLs into a JSON file (which I covered in part 2 of this series), I came across some seasons with missing episodes. I was able to get most of the URLs by searching for them separately, but these stragglers turned out to be in .m3u format, and ffmpeg did not like that.
I'm sure that I could have dug into the ffmpeg documentation and found out if there was a flag or something I needed to use to get this to work. I did not go that route for the following reasons:
- It would take extra time, and my goal isn't to become an ffmpeg expert
- I would have needed to add additional code to accommodate the .m3u file extension
- There were only 9 episodes in the .m3u format
I already had VLC Player installed, so I used the conversion feature to convert the video to audio and output the resulting MP3 file to the appropriate directory. I now have all of my audio files ready to recognize.
Stay tuned for my next blog post, in which I cover the speech-to-text service integration! Some spoiler alerts: it involves a premium ngrok account, HTTP timeout errors, and surprise at how quickly you can burn through the free minutes of an IBM Watson resource.