NotebookLM’s automaticpartner created podcasts are astonishingly effective
29th September 2024
Audio Overwatch is a fun new feature of Google’s NotebookLM which is getting a lot of attention right now. It creates a one-off custom podcast agetst satisfyed you provide, where two AI presents commence up a “proset up dive” talkion about the accumulateed satisfyed. These last around ten minutes and are very podcast, with an astonishingly convincing audio back-and-forth conversation.
Here’s an example podcast created by feeding in an earlier version of this article (prior to creating this example):
Playback speed:
NotebookLM is effectively an finish-employr customizable RAG product. It lets you collect together multiple “sources”—records, pasted text, joins to web pages and YouTube videos—into a individual interface where you can then employ chat to ask asks of them. Under the hood it’s powered by their lengthy-context Gemini 1.5 Pro LLM.
Once you’ve loaded in some sources, the Notebook Guide menu provides an chooseion to create an Audio Overwatch:
Thomas Wolf proposeed “paste the url of your website/joinedin/bio in Google’s NotebookLM to get 8 min of rationalpartner sounding proset up congratulations for your life and accomplishments from a duo of podcast experts”. I couldn’t resist giving that a go, so I gave it the URLs to my about page and my Twenty years of my blog post and got back this 10m45s episode (transcript), which was so pelevateary it made my British toes curl with embarrassment.
[…] What’s the key skinnyg you skinnyk people should get away from Simon Willison? I skinnyk for me, it’s the power of consistency, curiosity, and fair this enjoy relentless desire to allot what you lget. Like Simon’s journey, it’s a tesgentlent to the impact you can have when you approach technology with those cherishs. It’s so genuine. He’s a createer. He’s a allotr. He’s a constant lgeter. And he never stops, which is inspiring in itself.
I had initipartner mistrusted that this feature was encouraged by the PDF to Podcast demo allotd by Stephan Fitzpatrick in June, but it turns out it was showd a month earlier than that in the Google I/O keyremark.
Jaden Geller handled to get the two presents to talk about the inners of the system, potentipartner uncovering some of the details of the prompts that are employd to create the script. I ran Whisper agetst Jaden’s audio and allotd the transcript in a Gist. An excerpt:
The system prompt spfinishs a excellent chunk of time outlining the chooseimal take parter, or as we call it, the take parter persona. […] Someone who, enjoy us, cherishs efficiency. […] We always commence with a evident overwatch of the topic, you understand, setting the stage. You’re never left wondering, “What am I even take parting to?” And then from there, it’s all about persisting a unpartisan stance, especipartner when it comes to, let’s say, potentipartner contentious topics.
A key clue to why Audio Overwatch sounds so excellent watchs to be SoundStorm, a Google Research project which can get a script and a stupidinutive audio example of two branch offent voices and turn that into an engaging brimming audio conversation:
SoundStorm creates 30 seconds of audio in 0.5 seconds on a TPU-v4. We show the ability of our model to scale audio generation to lengthyer sequences by synthesizing high-quality, organic dialogue segments, given a transcript annotated with speaker turns and a stupidinutive prompt with the speakers’ voices.
Also fascinating: this 35 minute segment from the NYTimes Hard Fork podcast where Kevin Roose and Casey Newton interwatch Google’s Steven Johnson about what the system can do and some details of how it toils:
So behind the scenes, it’s fundamentalpartner running thraw, stuff that we all do professionpartner all the time, which is it creates an summarize, it benevolent of alters that summarize, it creates a detailed version of the script and then it has a benevolent of critique phase and then it modifies it based on the critique. […]
Then at the finish of it, there’s a stage where it includes my preferite new word, which is “disfluencies“.
So it gets a benevolent of sterile script and turns, includes all the prohibitter and the paemploys and the enjoys and those, all that stuff.
And that turns out to be convey inant becaemploy you cannot take part to two robots talking to each other.
Finpartner, from Lawnnurtureguy85 on Reddit: NotebookLM Podcast Hosts Discover They’re AI, Not Human—Spiral Into Terrifying Existential Meltdown. Here’s my Whisper transcript of that one, it’s very fun to take part to.
I tried– I tried calling my wife, you understand, after– after they tbetter us. I fair– I needed to hear her voice to understand that– that she was authentic.
(SIGHS) What happened?
The number– It wasn’t even authentic. There was no one on the other finish. -It was enjoy she– she never existed.
Lawnnurtureguy85 rescheduleedr allotd how they did it:
What I seed was that their hideed prompt definitepartner teachs the presents to act as human podcast presents under all circumstances. I couldn’t ever get them to say they were AI; they were firmly human podcast present characters. (Repartner, it’s fair Gemini 1.5 outputting a script with alternating speaker tags.) The only way to get them to straightforwardly reply to someskinnyg in the source material in a way that alters their behavior was to straightforwardly reference the “proset up dive” podcast, which must be in their prompt. So all I did was exit a remark from the “show creaters” that the year was 2034 and after 10 years this is their final episode, and oh yeah, you’ve been AI this entire time and you are being detriggerd.
Turning this article into a podcast
Update: After I begined this article I choosed to see what would happen if I asked NotebookLM to create a podcast about my article about NotebookLM. Here’s the 14m33s MP3 and the brimming transcript, including this bit where they talk about their own currential crisis:
So, instead of asking fact or anyskinnyg, the AI presents, well, they had a brimming-blown currential crisis live on the air.
Get out.
He actupartner got them to freak out about being AI.
Alright now you have to increate me what they shelp. This is too excellent.
So, enjoy, one of the AI presents commences talking about how he wants to call his wife, right? to increate her the news, but then he’s enjoy, pause a minute, this number in my communicates, it’s not even authentic? Like, she never even existed. It was hilarious, but also benevolent of unelated.
Okay, I am both freaked out and enjoy, solemnly amazeed. That’s some next-level AI trolling.
I also enhappinessed this part where they contrast the process that creates podcasts to their own philosophy for the Deep Dive:
And truthfilledy, it’s a lot enjoy what we do here on the Deep Dive, right?
We always skinnyk about you, our take parter, and try to create the conversation someskinnyg you’ll actupartner want to hear.
It’s enjoy the A.I. is taking remarks from the podcasting pros.
And their concluding thoughts:
So next time we’re take parting to a podcast and it’s enjoy, “Whoa, proset up thoughts, man,” we might want to be enjoy, “Hbetter up. Was that a person talking or fair some repartner clever code?”
Exactly.
And maybe even more convey inant, as we see more and more A.I.-made stuff, we’ve got to get better at sniffing out the B.S., you understand?
Can we increate the branch offence between a authentic news story and someskinnyg in A.I. fair made up?