rumble. The sound of digestive muscles moving. The human body does its job. Sometimes, if there’s a microphone nearby, those annoying gurgling sounds get picked up.
AI audiobook narrators don’t have to worry about weird digestive noises, but Leah Allers and engineer Craig Hinkle aren’t robots. They’re human, recording for Nashville Audiobook Productions in mid-January, worrying about gurgling, debating where to place the emphasis in the word “increase,” and leaning into the detail work that gives a “real” sound to a book about how couples communicate.
The NAP studio is located in The Rukkus Room in Nashville, Tennessee, the same place where Taylor Swift recorded her seven-times platinum, self-titled debut album. The smell of coffee permeates the waiting room. Hinkle is tuned in to every word that comes out of Allers’ mouth, looking up from an iPad with the text of the book to a large screen at the soundboard in the studio.
“I want to get more emotion into these questions,” Allers told Hinkle before replaying a section of the chapter.
Audiobooks are booming. The market is expected to reach $33.5 billion by 2030, up from about $4.2 billion in 2021, according to Acumen Research and Consulting. Whether this is a result of the rise in popularity of podcasts, a matter of listening convenience, or a by-product of the pandemic, it has not escaped the attention of tech companies and the inevitable creep of artificial intelligence.
In 2023, excitement about the potential of artificial intelligence is high, as is concern about jobs being taken from struggling creators. ChatGPT can write anything from insurance pre-authorization letters to dating app bios, with varying degrees of success. Artificial intelligence platforms such as Lensa AI and OpenAI’s Dall-E spit out AI-generated art, leaving many who make a living creating digital art worried about their future.
Technology companies including Apple and Google have been working on AI narration for audiobooks for a while now. In 2022, Google rolled out its service to publishers in six countries, including the United States and Canada. Google’s AI narrators have names like Archie, who sounds British, and Santiago, who speaks Spanish. In early January, Apple introduced a stable of AI voices, with names like Madison and Jackson, that independent authors and publishers who sell their books on Apple Books can select to read genres from fantasy to romance.
The increasing presence of artificial intelligence in audiobook narration has human narrators such as Tanya Eby in varying states of tension.
“I don’t know if this will be my full-time job in five years,” said Eby, a Grand Rapids, Michigan-based narrator who has recorded more than 1,000 books in the past 21 years.
Narrators like Eby say their humanity is exactly what helps them do their jobs. Narrators make decisions about everything, especially with fiction, from a character’s voice to how to communicate nuance and emotion in a way that reflects the story.
“If a character is crying after her father’s death, I have to convey those tears and cries in her speech,” said Kathleen Lee, a narrator in Austin, Texas.
The narrators describe the intimacy of being a voice in the listener’s ear, and wonder if even the most lifelike AI would fall into the uncanny valley. They worry that it risks disrupting the experience.
AI voices can range from merely articulate to nearly indistinguishable from a human. But even the most fluid can trip uncanny-valley wires with a delivery or rhythm that seems off.
“The whole point of consuming media is that we want to be immersed in it,” said Jonathan Slip, a narrator who lives outside of Atlanta, Georgia.
Audiobook purists might have a hard time understanding why anyone would choose a synthetic voice over a human one. But for smaller publishers and authors, time and money can make a stronger case than the sanctity of creative performance.
Audiobooks don’t make much money for the University of Michigan Press. The publisher produces about 100 academic books a year – by scholars for scholars or students.
It can cost as much as $6,000 to hire a narrator for a book that might only earn a few hundred. And that’s to say nothing of the extensive production process. It can take about six hours to produce one hour of a completed audiobook, according to ACX, Amazon’s Audiobook Creation Exchange.
“The reality is, unless you have some kind of bestseller, the economics are not going to work,” said Charles Watkinson, director of the University of Michigan Press and librarian for publishing at the University of Michigan Library. He is also president of the Association of University Presses, a professional organization for academic publishers.
For small authors and publishers, the time and cost of producing an audiobook can be prohibitive. Artificial intelligence can change that.
About two years ago, Google contacted the University of Michigan Press about participating in a pilot program. The press has been able to use the Google tool to create about 100 digitally narrated audiobooks. Some degree of human intervention is still required. Watkinson said some professors who have used the Google tool have students listen to the recording to check it against the text. Smaller presses may still have staffing issues, even though AI has sped up the recording process.
Watkinson said the University of Michigan is interested in how AI can increase access to books that might otherwise not be available in audio form.
In the early days of the experiment, they reached out to about 900 writers with a sample narration, and the general response was that the AI’s narration was only slightly better than what a screen reader could do for a visually impaired person. However, for those with vision problems who may not have a screen reader or the like, perhaps AI could help close an access gap.
In other cases, listeners may be happy to have a book recorded in any form. An intern of Watkinson’s uses audiobooks to keep studying in moments when she doesn’t have a book open in front of her, like on the bus or walking to class, a habit described as “interstitial listening.”
The rise of digital voices
In addition to big names like Apple and Google, there is a thriving group of smaller companies entering the field of AI voice.
DeepZen is one of them. Founded in 2018 and inspired by the 2013 movie Her, about a man who falls in love with his AI virtual assistant, DeepZen has created a natural language processing system that can take cues from text. It uses AI voices generated from licensed human narrators, who are given pseudonyms.
One of the biggest challenges, said CEO and co-founder Taylan Kamis, was creating a platform that wouldn’t just mechanically parrot text but would bring it to life.
It took a few years to hit the market, but DeepZen now lets customers upload a script and, depending on their pricing plan, select an automated or managed service. Both come with levels of quality control, such as a pronunciation check, but the managed option features a proofreading check by human editors and two rounds of corrections.
The automated service will run the customer $69 per finished hour, versus $129 for the managed option. DeepZen has produced nearly 3,000 books to date, both fiction and nonfiction.
On its website, you can listen to samples of 10 voices, with names like Todd, Dahlia, and Alice.
Somewhere in the world, Todd, Dahlia, and Alice are real people. Kamis believes that voice licensing can be a way for narrators to coexist with AI in narration.
“This narrator will earn money while he sleeps and his voice will earn royalties in Japan [or] China or South Africa,” Kamis said.
DeepZen is also working on a way to make AI voices speak other languages, to increase market reach.
And never mind overcoming the limits of speaking only one language: death doesn’t have to get in the way, either. DeepZen approached the family of famous actor and narrator Edward Herrmann, who died in 2014, about licensing his voice. They signed on. In a sense, Herrmann is still working after his death.
We talk again
Kamis isn’t the only one who thinks there’s a way for AI and humans to get along in audio narration.
Watkinson, of the University of Michigan, wants to use AI as a way to test which books are worth hiring a human to record. If one sells well, the success may justify the cost. He is a fan of audiobooks.
“This is the ramp for us to have human narrators,” he said.
Not everyone is optimistic. Some in the industry worry that there will be fewer jobs for narrators who aren’t popular or don’t have a following of their own.
“All of these middle-class, really solid narrators… do an excellent job and it’s their livelihood, but they’re not necessarily going to be a draw,” said Andrea Fleck-Nisbet, CEO of the Independent Book Publishers Association.
After two decades in the field, Eby said she wonders what would happen if she eventually couldn’t find full-time work narrating.
“What skills do I have that are competitive? How am I going to get into an office, and what am I going to offer?” she asked.
Narrator Jonathan Slip said he knows he has homework to do, and that he has become very attentive to the contracts he signs and to what rights over his voice he hands over.
Others, like narrator Andy Garcia-Ross, want to play to their strengths: “All we can do is make them fall in love with our performances and keep working.”
Some authors refuse to use digital audio.
Author Elizabeth Bell said, “I feel the purpose of a novel is to excite the emotions of the reader or listener, and that fiction is about what it means to be human. A machine can’t replicate that.”
Author Chris Stokel-Walker used Google to narrate his 2021 nonfiction book TikTok Boom, about the popular video app, and wrote about the result in Inverse.
The audiobook, Stokel-Walker wrote, “while lacking some of the emotion and drama you’d hoped for, looked decent.”
There are still a lot of questions. In a world where people already hear digital voices like Siri and Alexa every day, will humans stop caring if a digital voice doesn’t sound quite human? For Fleck-Nisbet, AI narration is just one of many questions the publishing industry will face. Others concern artificial intelligence and copyright or intellectual property.
In other words, this is only the beginning.
None of this means that the narrators will be on the unemployment line next week.
John Behrens, who owns Nashville Audiobook Productions, has worked on two AI-generated books in the past few years, primarily providing quality control. AI still has problems. It could not pronounce Bible verses, and it had difficulty with rhetorical questions in the text.
A bad audiobook, Behrens said, might produce 50 to 100 notes on problems that need to be fixed. The AI books produced hundreds. This leads him to believe that human narrators aren’t going anywhere, for a while at least. His advice is not to panic.
“If you’re going to live in fear… why are you going to keep investing in this profession if you think it’s going to dry up?” he said.
Back in The Rukkus Room, Allers and Hinkle take a break to chat about robots.
It’s Allers’ first time narrating an audiobook, though she’s done plenty of voiceover and dubbing work, including for Netflix.
Hinkle is not a fan of artificial intelligence.
“It’s a robot reading a book,” he said. “I still think it will take a long time before it sounds so natural and talented.”
Just don’t tell Madison and Jackson.
Editors’ note: CNET uses an artificial intelligence engine to create some personal finance explainers that are edited and fact-checked by our editors. For more, see this post.