Send in the Clones…

March 22, 2023

This month's article is a long read, but you can listen to me narrate the piece below…

It’s a strange feeling when you realise you’ve been cloned. Hearing a voice that sounds just like yours, saying lines you’ve never said, on a website you don’t recognise having any connection to, is disorienting to say the least. And that’s exactly what happened to me a few weeks ago…

I’m a Voiceover Artist, and there I was on a Wednesday morning enjoying my first cup of coffee and checking my email, when I opened a marketing message from a company I didn’t immediately recognise offering voiceover services. “I must have signed up for their mailing list”, I thought, and clicked through to find out more about them. “Any voice! Any language!” read the blurb. My first reaction was to wonder why I wasn’t already working with them, and I noticed that they had a page for prospective talent like me to sign up. Great, I thought. But I decided to do my homework first and see who else they might already have on their roster, so I clicked through to the voice samples page, selected “English (UK)” (because I’m British) then “Male”, and clicked play.

And there I was! It was obviously me, but very much an “Uncanny Valley” version of me: my vocal tones, but with a slightly odd range of prosody, and a cadence that wasn’t my own. In truth, it was more like a version of me who wasn’t that great a voiceover artist and needed some coaching. And it was very obviously machine generated speech, rather than a recording I’d made. But there the voice was: available to buy, and with no obvious mention anywhere that I was listening to a synthetic voice.

My brow furrowed… how was this possible? Who were this company, and how were they using my voice? I had no recollection of ever agreeing to this, and a cascade of feelings and emotions - ranging from puzzlement, to anger, to betrayal - began to wash over me.

There was one clue. The pseudonym they’d given the voice on the site jogged a memory: I’d worked with a semi-regular client for several years, and the scripts they sent across had used the same pseudonym at the top of the page. I’d noticed this in the past, but this isn’t always a signal of nefarious behaviour. Some clients, wary of voice talent who they fear might steal their customers to cut out the middleman, will put this kind of smokescreen in place between the end client and the voice talent to try to make it difficult for the end client to know who the voice really is. It’s not something I like, but personally I have no wish to poach anyone’s customers and - if the material is primarily for use within a company and not for wider public consumption, where I’d want to have the recognition of it being me - it’s something I’ve generally turned a blind eye to. After all, if my clients want to work in a culture of fear, that’s up to them. As long as I get paid, and no one’s profiting by dint of the mislabelling, then I’m still putting food on the table after all - even if I acknowledge that my slightly bruised ego doesn’t particularly like it.

Putting my coffee aside and putting on my deerstalker, I delved into my email archive. A quick search for previous correspondence with, well, let’s call them “Acme Voices” (because an NDA prevents me from giving away their real identity) revealed that they’d been bought out a couple of years ago by an AI company (whom I’ll refer to as “Abominable AI”). The email explaining the merger talked about the exciting new opportunities this would mean for voice talent, and urged anyone with questions or concerns to get in touch. The VP listed their contact details at the bottom of the email, so I immediately sent an email, left a voicemail - saying that I’d sent an email and would appreciate a call to talk about events since the merger - and, because I was connected to the VP on LinkedIn, I sent them a LinkedIn message for good measure. (None of these messages have been replied to at time of writing, despite my email tracking telling me the emails were opened.)

Trawling back further, to our original correspondence back in 2016, I found a contract - and an NDA. And then the penny dropped… I’d basically signed away not only the copyright in the recordings I’d supplied, but the right to reuse them in any form, forever. There was even a clause that specifically mentioned use in “TTS” (Text-to-Speech). The NDA added a well-fitting lid to the pot, sealing it all in and requiring that I didn’t talk about the contract or my relationship with the client. Yes, you’re right: I was an idiot. You might even draw the conclusion that I deserve everything I got.

But let’s back up a little. The truth is (regardless of what you may hear to the contrary) I’m not actually an idiot. I’d looked at the work I was doing for the client, which was almost exclusively short telephone prompts (“Thanks for calling. If you’d like technical support, press 1” - that sort of thing) which only ran for a few sentences. In 2016 the potential to reuse these sorts of brief recordings in any way that would have been exploitative was negligible. It wasn’t like these were broadcast ads for Coca-Cola, for example, where I needed to set time limits and renewal terms.

Let’s be realistic: in an ideal world we’d all be able to negotiate every contract to our satisfaction. But the reality is that the larger the client, the more likely it is they’ll have their own contract for you to sign. It’s usually a contract that their legal department has drafted, and producers and managers are, more often than not, unwilling or unable to tweak it for individual requests. Personal experience, multiple times, has taught me that pushing back and requesting revisions often leads to a “no”, so the decision whether to sign a contract or not comes down to doing a little bit of “risk analysis”, as you decide whether you want to work for the client or not. Basically, most of the time you can take the contract “as is” or move along. So, I signed…

In hindsight, it looks very much like this contract was designed to enable exactly this type of future exploitation. Every point of potential objection around reuse was legally covered, leaving me no grounds for complaint. It was all totally legit legally, even if the ethics sucked.

Crucially, and despite the clause about TTS, in 2016 it wasn’t even possible to create a voice model from small samples of audio like this. Seven years ago, TTS models needed hours of purposely written and painstakingly recorded audio to do anything useful (and usually it still came out sounding slightly robotic and artificial at the end of it). But technology has changed in the interim: it’s now possible to take just a minute or so of anyone’s voice and create a workable model that sounds like the original speaker. It works by layering the tones and timbre of a recorded voice over an AI model that’s already programmed to replicate the pacing and prosody of a human voice. That’s why the sample on the site wasn’t quite me: it had my tones, but not my “flow”.

OK, you might think, but if it’s not a great-sounding model, and it doesn’t really sound like me, then what’s the harm - other than the idea that someone’s pimping out a voice that’s basically mine and not paying me for it?

Well, what happens when that technology gets better (more on that in a moment) and the voice is given a script to read that’s at odds with my own moral code? Something that I’d decline to record myself… something that’s politically extreme… what about hate speech? What if it’s used to “phish” for financial misdeeds? Abominable AI would, of course, claim that they have safeguards in place, and that they wouldn’t allow such misuse. But as we’ve seen with social media companies, expecting corporations to act responsibly as judge and jury around online behaviour and ethics is a little like leaving the fox in charge of the henhouse. Unless we know and trust the company concerned and have explicitly signed away the right to allow what our “clones” say to be policed by them, I’d contend that the final arbiters of what should be spoken in our voices should always be ourselves.

But in some ways the whole argument about assigning rights to recordings is, at this point, moot. Like many narrators, I have countless hours of audiobook material out there that can be harvested. We’re beginning to see untenable justifications for AI developers having used audio that they’re quick to point out is “publicly available”, but we need to remember that publicly available isn’t the same thing as being in the public domain, or free for commercial use. We already have a term for this: it’s called copyright theft. Plagiarism for profit has met its match in court many times, with successful lawsuits around illegal bootlegging and sampling in music being recent examples. We’re also beginning to see fake “auditions” posted on online casting sites that are supposedly for business usage, but where it’s obvious from reading the script that what’s really happening is the poster is looking for clean audio from which to create an AI model. For voiceover artists, it’s hard to see a way to prevent any of these types of misuse.

But it’s not just voice talent like me for whom this is a threat. Remember: it’s possible now to take a minute or so of anyone’s speech and model it for AI purposes, with or without their permission. Whoever you are, your voice can be sampled in a phone call, a Zoom meeting or just about anywhere else at this point and turned into a model of you. If there’s a recording of your voice online, you’re even more of a potential target. What happens when your mother, your brother, your partner gets a call claiming there’s an emergency - from someone who’s apparently you - and in the rush and confusion hands over sensitive information or money to a phishing attack? A schoolfriend of mine, who’s now paid to dream about these sorts of things for a major IT and big data corporation, told me the other day that it’s already possible for someone with the correct lack of scruples to offer “Phishing as a Service” if they wanted to. A Generative AI chatbot, he told me, connected to an AI speech model, can hold a conversation with you in real time. And (here’s the moment where my jaw hit the floor) it can do a better job of it than someone in a foreign call centre who speaks English as a second language.

The argument around artifice - the idea that you can tell it’s an AI - will also soon be moot. A colleague who’s worked with creating legitimate TTS models for some years told me that this new cadre of Generative AI voice models are amazingly realistic and natural sounding. “Forget what you think you know from listening to Siri and Alexa”, he said. The truth is, you’ll never be able to tell, at least in the context of a conversation, that you’re not talking to a real human being.

So, what can we do and what can we learn here? In some ways, I appreciate that this story - apart from being a cautionary tale - raises more questions than there are currently answers. I can almost hear the pennies dropping in the minds of some readers, who may be realising they’ve signed away their rights in similar circumstances to myself in the past. Anyone who’s signed up with an online casting site, voice directory or production company in the last few years ought to be checking those contracts very carefully at this point. And obviously, anyone who’s asked to sign a contract for voiceover services going forward should, at the very least, be checking the terms and - with the benefit of knowing what’s now possible - pushing back more firmly against terms that might enable later misuse. Those periodic updates to Terms and Conditions, which we’re primed to gloss over and disregard, might just be worth reading after all.

Some of us are beginning to add an “AI rider”, like the one that NAVA (the National Association of Voice Actors) has on its website, to our paperwork. Part of that conversation will likely need to include educating the client about where this leaves talent like us exposed. But there’s also a line to tread here, to avoid pointing fingers pre-emptively at innocent people who have no desire to do anything iniquitous with our recordings, or running around sounding like Chicken Licken (or Henny Penny, depending on where you grew up) and claiming that the sky is falling in. After all, most clients are decent; not everyone is out to steal your voice; and we do need to guard against allowing ourselves to live in fear.

Then again, what happens when - as in my case - the company you sign away your rights to gets bought out by another company with other ideas? We do also need to be aware that there are forces at work which aren’t adhering to the gentleman’s agreement. I’ve called Abominable AI “developers” here, but the truth is that many of these companies are much larger than the couple of enterprising nerds the word might initially suggest. These startups are often funded by venture capitalists, with the money on the table – sometimes running to millions of dollars – enough to make all but the most principled of voiceover production company and voice directory owners abandon the moral high ground. What’s happened to me, and to the other people on Abominable’s site, has doubtless happened already to others, and will continue to happen.

It's said that hindsight is 20:20 vision, and looking again at my original contract it’s hard to discount the idea that Acme Voices were setting themselves up to be taken over. For all I know they were tying everything up neatly ahead of time, then hawking themselves around AI companies (of whom there are thousands) offering their library for exploitation by them. And on the other side, there are doubtless companies out there buying up similar libraries of voice recordings, like Acme’s, with their rights already assigned and their talent blissfully unaware of what they’ve unwittingly enabled. An audiobook producer, Findaway Voices, was recently called out by members of its own narrator pool for a clause in its terms and conditions that many had missed, which allowed Apple to use recordings “for machine learning training and models”. Findaway was acquired by Spotify last June (remember what I said about big money?). Even users of popular audio recording software are beginning to notice clauses which permit their supposedly private recordings to be harvested for such purposes – particularly if the audio is stored on the cloud or processed remotely. We should all, it seems, be taking more trouble to read those pesky T&Cs…

From a legal point of view, how do we consider the legitimacy of a contract where one party may have knowingly misled the other party – the second party making a judgement based on the state of technology at the time of signing versus technology a few years down the line, and which the first party knew was coming? (It’s not unlike the world of insider trading…) Is an NDA that prevents someone like me from “whistleblowing” - i.e., telling my colleagues about who the client really is, so that they can check whether their own voice has been taken and modelled, so that we might collectively organise to challenge it in the courts - really a fair contract? And as the technology reaches a point where telling genuine speech from AI speech becomes difficult, if not impossible, who would be liable if someone made my clone read hate speech or slandered someone publicly – and how might I prove in court that I hadn’t made the recording myself?

It’s clear to me that where we are now is just the tip of the iceberg regarding AI in relation to moral conduct, copyright theft and more. My friend and voiceover colleague, Bev Standing, settled out of court with TikTok after the social media giant began selling a model of her voice without consent. Getty Images is, at time of writing, suing a company called Stability AI, claiming it unlawfully scraped millions of images from its site for reuse by generative AI. And AI developer, ElevenLabs, is fighting a rear-guard action after deepfakes generated by its technology made an AI version of actor, Emma Watson, read Adolf Hitler’s “Mein Kampf”, while another made an AI version of President Biden make sexist and transphobic comments. And this is before we even get into deepfake videos…

It seems we’re in the Wild West here, and in a territory where things are moving very fast indeed. What happens when developers begin offering “blended” voices, by taking different samples and mixing them, so it’s no longer clear whose voice the model was based on? (In some ways this might actually help, as it would create potentially fewer conflicts over attribution and liability.) I foresee a time where you’ll be able to go onto a website and – using something akin to the graphic equaliser on your old hi-fi system – move sliders for pitch, pace, prosody, projection, accent and more, to generate a completely new voice in real time – and get it to say whatever you want it to.

This morning, as I was preparing to sit down and write this piece, another AI developer (with whom I’ve openly been working for some time and on equitable terms) sent me a clip from yet another website. And there I am again, or at least another slightly drunken-sounding version of me. So far, I have no idea of the provenance of this one or how it got there. But having been at this voiceover thing for some time, it seems that having Google surface my name whenever someone searches for “British male voiceover artist” may have become as much a curse as it is a blessing in terms of my voice clips being “found”.

One thing is clear: when it comes to AI, copyright, and ethics, the horse is very much out of the stable. In case of doubt, and as a horse owner, I can tell you that a loose horse is a very dangerous thing…

Send in the clones? Don’t bother, they’re here.

Starboy

May 18, 2021

Happy Audiobook Release Day to author and illustrator, Jami Gigot! Her new children's book, "Starboy", is narrated by me and is released today by Lantern Audio.

Based on the life of the young David Bowie, Starboy's message for young readers is that it's OK to grow up feeling different – and that you can turn the things that make you different into your superpowers (just like David did!)

I was thrilled to be asked to bring my voice to this title, because the subject matter is so important and so close to my heart. As a kid who grew up feeling out-of-step and misunderstood, I'd have loved to find a book like this, to help reassure me that the things I was feeling were normal.

Here's hoping Jami's book finds its way to those young minds who need to hear its message! Get it wherever you get your audiobooks.

#audiobooks #nonfiction #narrator #britishnarrator #nonfictionnarrator #instakids #starboy #releaseday #illustrator #youngreaders #davidbowie

A Brain for Business - A Brain for Life

May 4, 2021

Happy Audiobook Release Day to Professor Shane O'Mara, whose title, "A Brain for Business - A Brain for Life", is narrated by me and is released today by Blackstone.

The book's subtitle is "How Insights from Behavioural and Brain Science Can Change Business and Business Practice for the Better" – and that's really a great capsule description of what's inside. O'Mara makes a great job of showing how a combination of instinctual, inherited and learned habits create biases, heuristics, and predilections that can distort behaviour and decision making. He also explores how some of the things we're now learning from neuroscience can help business leaders make smarter decisions.

It's a great and entertaining listen for anyone who leads a team, and it's available now - wherever you get your audiobooks (click here to find it on Audible).

Waterlog

April 22, 2021

It’s a privilege to be asked to narrate any book, and bringing a new title to the world is always an honour. But then there are the books which already have a loyal audience - sometimes built over years, or decades.

Roger Deakin’s “Waterlog” is one such book. It charts the author's 18-month journey across Britain, swimming in open water through rivers, canals, moats, lochs, and whatever else piqued his interest along the way. But more than that, it’s a love letter to the water and the countryside, and to the quirky and idiosyncratic ways of the British islanders themselves. Think an English Bill Bryson, but in Speedos. (Or then again, perhaps not.)

Originally published in 1999, and largely credited with starting the (sometimes subversive/submersive?) “free swimming” movement in the UK, the original audiobook was abridged to just under three hours (onto two audio cassettes, no less!) and recorded by Michael Kitchen, an actor whom I’ve always held in high regard. I wasn’t easily able to find a copy of Michael’s original, sadly, but I’m very much enjoying bringing my own voice to the 2021 version. Fittingly, for a book about open air swimming, it's "unabridged"…

“Waterlog: A Swimmer’s Journey Through Britain” will be out soon, wherever you get your audiobooks. Thanks to Tantor Audio.

"The Divine Spark"

November 25, 2020

Could psychedelics be the saviour of humanity? Is our evolution - at least in part - a result of our ancestors’ exposure to the mind-expanding compounds contained in “magic” mushrooms? And, if so, how does all of that square with the so-called “War on Drugs” that we’ve been fighting these last few decades?

My latest audiobook has just been published by Dreamscape Media, and its authors have some very interesting theories and arguments on all of the above.

An anthology of essays, curated by Graham Hancock (author of international bestsellers “The Sign and the Seal” and “Fingerprints of the Gods”), “The Divine Spark” asks some very thought-provoking questions, and offers some very plausible answers in return.

The “Difficult” Third Book

You’ll probably be familiar with the concept of the “difficult second album”. Well, for me, this was the “difficult third book”…

Don’t get me wrong, it’s a great book: 27 essays on hallucinogens and their relationship with mankind, written by no fewer than 24 authors. I learned a huge amount from narrating it, although I don’t mind admitting it was by far the most challenging piece of narration - audiobook or otherwise - I’ve done to date.

The challenge for the narrator is the 24 authors. If you read often, you’ll know that it takes somewhere from a few pages, to a chapter or two, before you begin to effortlessly follow the writer’s pattern of speech, their word order and so on: in other words, to get used to their “voice”. With a book like “The Divine Spark”, that voice changes every dozen or so pages, whereupon you have to “reset” everything you just got comfortable with and start again.

If I tell you that some of the writers are academics (writing in language that wouldn’t be out of place in a thesis); some are journalists; some are experienced “psychonauts” writing about their experiences; and some are Russell Brand, then you can begin to imagine the variety of writing styles and approaches the book contains. My hope is that a single “narrator voice” will lend a sense of coherence to the collection for the listener.

I found myself bringing in a researcher for a couple of the chapters (the wonderful Ken Schmidt, whom I’d highly recommend!) and also tracking down their author, somewhere in the depths of the Brazilian interior, to help with context and pronunciation issues. In short, this was quite a project!

A New Dawn for Psychedelics?

Since the backlash against psychedelics in the late 1960s and early 1970s, interest in them has never really gone away. Experimentation and research have continued - largely underground and in secret until fairly recently - and we’re now at a point where the medical community and science in general seem prepared to consider them seriously, separating out both the hype and the fear-mongering in order to do so.

With more than half of the United States now some way towards decriminalising the use of substances like marijuana and THC, and Washington D.C.’s vote - during the 2020 Election - to effectively decriminalise psilocybin, I think it’s a great time to revisit these fascinating compounds and their potential for humanity.

“The Divine Spark” is published by Dreamscape Media, and is available now from the usual audiobook outlets.

“The Robots are Coming!”

November 10, 2020

The robots are coming! But can they replace voice talent? I see potential, but some significant challenges…

"The Magician's Way"

November 5, 2020

This well-thumbed tome has just become my latest completed audiobook!

William Whitecloud's "The Magician's Way" has been helping people find their treasure and teaching them how to live from an intuitive/magical orientation for over fifteen years. I first read it about eight years ago, before William and I met and became friends.

I'm delighted to be the voice that's bringing this well-loved story to audio. It should be available on all the usual platforms shortly… hopefully, just in time for Christmas!

William Whitecloud's "Secrets of Natural Success"

November 5, 2020

Here’s the background on how I came to narrate my first audiobook. You can click the play button below to listen to this content as a Podcast!

Delta flight 31 (LHR-ATL)

As the plane cruised homeward, westbound over the North Atlantic, Mike sipped his airline-cold red wine and turned the final pages of his friend’s latest book: a “how to” introduction manual for William’s life’s work of over thirty years.
How to discover what you really wanted, and then create it intuitively. That was the essence of the thing: William Whitecloud’s “Secrets of Natural Success”. One of its central themes was that of listening to your quiet, inner voice, and then - rather than forcing anything - taking the obvious creative actions as and when the opportunities presented themselves. As long as you took those “Bridges”, they’d bring you closer and closer to your End Result.
“I wonder if William has a narrator lined up for this”, Mike pondered. Probably. William had a great circle of LA production people to call on at this point, after all. Why would he not have that covered? It seemed almost silly to ask, and Mike was worried about looking foolish.
Then again, as a graduate of William’s work, Mike also knew a Joseph Campbell-style “Threshold Guardian” when he saw one. He recognised that his ego’s doubt - its fear of rejection - might be the one thing standing between him and what he’d just realised he’d love to create moments before. Derailing your dreams - just moments after deciding what it was you wanted - was a familiar trap for the uninitiated, if ever there was one!
Besides, they’d known each other for years. William had taken Mike and his husband Marc to Africa on one of his “Soul Safaris”. And William had stayed at their house in London on many occasions. They were friends. No harm in asking…
This was Mike’s first flight on a plane that offered complimentary in-flight messaging via the internet, so he fired off a quick message, and imagined it (via the marvels of finally living in the future he’d always known he belonged in) arriving seconds later in Santa Monica. Some pleasantries, and then straight to that obvious action…
“Forgive me if this sounds presumptuous, but I thought, at various points while I was reading it, that I’d love to narrate the audiobook, if there isn’t one already, of course.”
The reply came back less than a minute later:
“You know what, I’d love you to. Why didn’t I think of it before??? I’ve been looking for someone and there you are in plain sight!!!”

The Fourth Step of Natural Success: Following Through

A few weeks later, following exactly the principles outlined in the book - and with the obvious actions consistently taken - the End Result was the audio version, ready for publication: my first full-length audiobook!

Although I'd narrated short stories, long-form audiobook narration was always something I'd been prepared to leave for other voice actors with longer attention spans.

This was a great learning experience for me. I learned that I have more patience for long narration projects than I'd given myself credit for, and that I enjoyed the structure of having a long job that I could break down into milestones. And as someone who loses track of what eventually happens to most of my work the moment I hit "send", it was great to see the End Result of this as a lasting and tangible piece of work, ready for download, for everyone to enjoy.

I also learned a lot about the craft of audiobook narration and production: narrating, producing and editing the entire thing, and mastering it for distribution. It’s fair to say I developed a newfound respect for my audiobook colleagues, whose ranks I’m honoured to join. (I’ve since learned to leave the later parts to a valued team member, to improve quality control and further streamline the process.)

Truth, love and wisdom

The book is also timely: it deals with how to shift your focus from your current reality to what you’d love to create, and then move forward on that path. At time of writing, in 2020 of all years, I think a lot of us could use some of that right now.

— Mike Cooper, November 2020

#voiceover #voiceoverartist #narration #audioproduction #audiobooks