Podcasting’s Search Problem Could be Solved by This Spanish Startup

Smab's web app will automatically transcribe podcasts, giving listeners a way to scan and search their content

Xavier Bassas, Lluc Mayol, Lluís Nacenta and Anna Ramos in the middle of a conversation about podcasting, documentary and fiction
Podcasters in action. (Photo: Gemma Planell / MACBA)

Podcasting is special. It’s an intimate medium for communicating thoughts and ideas directly into the brain of a carefully grown community, one that feels a personal connection with the show’s hosts.

And all that.

Yet podcasts are also a trove of useful, actionable information and perspectives; it’s just not a particularly well-indexed trove. There are lots of people in this world who aren’t especially comfortable writing down their expertise, but podcasters are shoving microphones in their faces while these experts talk out what they know. Still, anyone who isn’t a regular listener of a particular podcast is unlikely to surface that information, even if he or she goes searching for it.

There are podcasts on gardening, drawing, architecture, yoga and philatelism. There might be an expert out there who has answered your burning philately question on a podcast, but you wouldn’t find it via Google, Bing or DuckDuckGo searches, because search engines don’t have a way of indexing information delivered in audio. Podcasting has had a search problem from the start, as the Observer previously reported. A new podcast hosting service out of Spain, called Smab Audio, may be on the way to solving it.

The company takes audio files and generates text files. If those text files are hosted on Smab’s site, a person can click on a word in the transcript and it will take them directly to that part of the recording, because the transcript and the audio are synced. In fact, a second program assesses the audio to determine where sentences begin, making it easier to find chunks of audio. Both functions are uneven, but it’s worth noting here that the company is in a very early stage. There are still a lot of kinks to be worked out (not the least of which being that the site is primarily built in Spanish right now). Don’t expect an ideal user experience just yet, but it is easy to see where the service is going.
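For readers curious how a synced transcript could work in principle, here is a minimal sketch in Python. It is not Smab’s code; the `Word` record, the function names and the sample timings are all hypothetical. It assumes each transcribed word is stored with the time it starts in the recording, and that a separate pass has marked which words begin a sentence.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the recording (hypothetical field)


def seek_time(words, index):
    """Return the playback position for the word that was clicked."""
    return words[index].start


def sentence_start(words, sentence_starts, index):
    """Return the start time of the sentence containing the clicked word.

    sentence_starts is a sorted list of word indices where sentences begin
    (assumed to come from a separate sentence-detection step).
    """
    pos = bisect_right(sentence_starts, index) - 1
    return words[sentence_starts[max(pos, 0)]].start


# Made-up timings for a fragment of the transcript quoted in this article.
words = [
    Word("they", 0.0), Word("would", 0.4), Word("make", 0.7),
    Word("brooms", 1.1), Word("they", 2.0), Word("would", 2.3),
    Word("use", 2.6), Word("bamboo", 2.9),
]
sentence_starts = [0, 4]

print(seek_time(words, 3))                        # jump to "brooms"
print(sentence_start(words, sentence_starts, 7))  # jump to the sentence with "bamboo"
```

A player built this way would only need to hand the returned number of seconds to its audio element when a listener clicks a word.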

“I started this project three years ago,” Adrian Dogar, Smab Audio’s founder, told the Observer in an email. “And finally, I have made it. I am able to bring low-cost, short return time, SEO-ready, interactive transcripts for the large public.” The company is currently working to raise $22,000 on Indiegogo, with just under $5,000 raised so far.

Check out this episode of Unfictional from KCRW. Let’s say someone were doing a research paper or blog post on handmade goods from religious communities. A Google search could potentially surface this nugget (pasted exactly as the transcription shows it): “the village that was closest to the monastery was two kilometers away and the source of their economy they would make brooms they would use bamboo shafts and they would use coconut husks for the bristles mn every week…”

Here’s what the page of that ‘Unfictional’ ep looks like on Smab. (Image: Screenshot Smab.Audio)


If the podcaster wants to make sure that unusual words or names come through correctly, he or she can edit the transcript (the site’s design does not currently make this function obvious, but it’s there). Additionally, Smab is working on a batch edit feature, so that, for example, if a name were consistently misspelled in the automatic transcript, all instances of that misspelling could be corrected.
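A batch edit of that kind is simple to picture. The sketch below is only an illustration, not Smab’s feature: the function name `batch_correct` and the sample misspelling are invented, and it assumes the transcript is plain text in which a misspelled name should be replaced wherever it appears as a whole word.

```python
import re


def batch_correct(transcript, wrong, right):
    """Replace every whole-word occurrence of `wrong` with `right`.

    Matching ignores case, since automatic transcripts are often
    unpunctuated and lowercase. Returns the corrected text and the
    number of replacements made.
    """
    pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", flags=re.IGNORECASE)
    # A function is used as the replacement so that any backslashes in
    # `right` are inserted literally rather than treated as escapes.
    return pattern.subn(lambda match: right, transcript)


# Hypothetical example: a name the transcriber keeps getting wrong.
text, count = batch_correct(
    "mister dougar said the dougars of this world like podcasts dougar added",
    "dougar",
    "Dogar",
)
print(text)
print(count)
```

The whole-word match is a deliberate choice: it leaves longer words that merely contain the misspelling (such as “dougars” above) untouched, so a correction to one name does not mangle another word.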

“All these features leads to the easiest monetization system,” Mr. Dogar wrote. “We have prepared the spoken audio industry for the most popular revenue model on the Internet: visual banners. Each time we display a new synced visual element, there is a sound notification that persuade listener to interact with our app.”

Acast, another startup moving into podcast hosting, is betting that dynamic ad insertion is the right way to monetize the medium, as the Observer previously reported. The two approaches aren’t mutually exclusive, though.

Smab’s audio cues can direct listeners to look at banner ads during playback of recordings as well as other multimedia elements. There are some examples of this loaded into the Unfictional episode linked above, which direct listeners to other sources of information, such as a Flickr page of photos from one of the projects discussed on the show. It’s hard to believe that podcasters, broadly speaking, will like the idea of putting beeps into their shows, but it’s not unprecedented. Former Rap Genius editor Shawn Setaro used to do something similar on his podcast “Outside The Lines” (now known as The Cipher). When the podcast was associated with the annotation site, Mr. Setaro would put audio cues in the podcast to indicate additional information in an annotation on the website.

To get the full suite of features Smab offers, podcasters will need to host their shows on the Smab website. Mobile apps and embeddable transcripts may come later, according to Mr. Dogar. Using the transcript feature to improve sharing is also on its way. Clammr is another company currently working to foster more sharing, as the Observer previously reported.

Smab is still in beta. Podcasters interested in trying it out should reach out to the company. An English version of the user interface is well underway, according to Mr. Dogar, and the site already has an English version of its overview online.
