Podcasting App Transcribes Shows for Better Search

We called this one a long time ago.

How in-audio search works in Castbox. Courtesy Castbox

We called this one a long time ago. Since then, we’ve been watching the development of natural language processing in the podcasting space. It’s starting to come together.

Podcasts are a product driven by words, spoken words, but, unlike words in text, they are not words that are easily indexed by the crawling robots of Mountain View and Redmond (that is, Google’s and Bing’s). A lot of good information is hiding in podcasts that could be useful to people, but they can’t access it because it’s not searchable.

Searching podcasts is notoriously bad. Never mind the words spoken on the recording, few of the podcasting apps even do a good job searching titles and show notes. It’s sort of embarrassing.

Enter Castbox, the latest podcasting application we’ve found that really focuses on search.

Xiaoyu Wang, CastBox CEO. Courtesy Castbox

“I think audio is kind of a Pandora’s Box,” CEO Xiaoyu Wang told the Observer in a phone call, meaning that it has a lot of potential value inside, but it needs to be more fully opened. A lot of podcasts have answers to questions that people are looking for, even if that might not mean they become regular listeners. 

Castbox uses cloud platforms at major companies (mostly Google, with some Microsoft and IBM thrown in) to run transcriptions of every podcast on the app. Until now, the transcriptions have been used internally to improve its recommendation engine. Over the last two years, the company has watched the quality of those transcriptions steadily improve, from around 90 percent accuracy in mid-2016 up to around 98 percent accuracy now (quality and price are basically the same at all the big providers).

The hard work was turning all that text into effective search results. “We put it into our index to analyze the text paragraph by paragraph to understand what they say,” Wang said. Ultimately, it works much like the Google search engine does with text on the web. It finds keywords and phrases and analyzes how they relate to each other to determine if the phrase “civil war” in a specific episode refers to American history, the Captain America movie or another conflict entirely. 

As of today, if a user searches for a topic or a phrase, Castbox will load relevant channels, episodes (based on the episode title and show notes) and results from the actual transcript of the show. This function is available both in the app and on the web (attention fellow journalists). For results from the transcript, it gives a timestamp, so listeners can go right to that part of the conversation.
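For the technically curious, here is a rough sketch of how transcript search with timestamps could work in principle. It is not Castbox's code. The episode names, the `Segment` type and every function below are made up for illustration, and it assumes each transcript has already been split into short passages that carry a start time. It also only matches keywords; the relevance and disambiguation work Wang describes is not shown.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed passage of an episode (hypothetical structure)."""
    episode: str
    start_seconds: int
    text: str


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each word to the positions of the segments that contain it."""
    index = defaultdict(set)
    for position, segment in enumerate(segments):
        for token in tokenize(segment.text):
            index[token].add(position)
    return index


def format_timestamp(seconds):
    """Render a second count as H:MM:SS."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:d}:{minutes:02d}:{secs:02d}"


def search(query, segments, index):
    """Return (episode, timestamp, text) for segments containing every query word."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return [
        (segments[i].episode, format_timestamp(segments[i].start_seconds), segments[i].text)
        for i in sorted(matches)
    ]


# Made-up sample data, not real transcripts.
SEGMENTS = [
    Segment("Security Hour (hypothetical)", 754, "The Equifax hack exposed millions of records."),
    Segment("Money Talk (hypothetical)", 95, "Equifax shares fell after the breach."),
    Segment("History Pod (hypothetical)", 30, "The civil war reshaped the country."),
]
INDEX = build_index(SEGMENTS)

for episode, timestamp, text in search("Equifax hack", SEGMENTS, INDEX):
    print(f"{episode} @ {timestamp}: {text}")
```

Storing a start time with every passage is what would let a result jump straight to the right moment in the audio, rather than just pointing at the episode.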

We just did a quick search on the web for “Equifax hack” and here’s what it surfaced.

Search demo in Castbox on the web. Castbox

We have spoken to podcasters who don’t love this kind of functionality because they want their shows to be thought of as a whole piece, not a commodity that can be broken down into parts. That makes sense for shows that are constructed in that way, like Radiolab or Hi Phi Nation. But there are a ton of shows out there that are all about basic information, shows that feature nothing but interviews with smart people and newsmakers in areas like cybersecurity or investing. The purpose of these shows is to answer people’s questions, and they’ll be happy to have a listener even for part of one episode if that gives them an ad impression and a little more mind-share.

Castbox has entered a crowded space as a podcast player, made doubly complicated by the fact that podcast downloading is overwhelmingly dominated by Apple’s native podcasting app.

“I think the first of our unique strengths of CastBox is that we can do it across multiple platforms,” Wang said. So a user who is logged into Castbox can start an episode on mobile and pick up where they left off on Google Home or their desktop. “I think this like the first thing we can try to win.”

“We really want to leverage the technology to provide users with a better user experience,” Wang added, saying that the app gets updated two or three times a week, based on observations of users and various A/B tests it’s constantly running.

In its first year, Castbox was able to break even financially on display ads alone. It plans to expand into producing its own shows in collaboration with others, which will be a new source of revenue. Additionally, it has a premium level inside the app that allows users to make their subscriptions the default homepage and subscribe to unlimited podcasts ($0.99 per month). Its growth so far helped it close a recent series A round of $12.8 million, co-led by Qiming Venture Partners and IDG Capital.
