Building a Pronunciation Guide for Your Audiobook
An audiobook pronunciation guide is a short list that pairs every tricky word in your book (character names, places, invented words, and jargon) with a phonetic respelling of how you want it said, such as KAY-leen for Caelin. Build it before narration so that the narrator, human or AI, gets names right the first time.
Names and special terms are where audiobooks most often go wrong, because there is no single "correct" way to read an invented word and a narrator can only guess from spelling. A pronunciation guide removes the guesswork by stating your intended pronunciation up front, so the narration matches the voice you have heard in your head while writing. This is the upfront, prevention-first approach: you decide how every difficult word should sound before a single chapter is generated, rather than catching mistakes after the fact.
This guide covers what to put in a pronunciation guide, how to write clear phonetic respellings, the difference between handling names in fiction and jargon in nonfiction, and how the guide feeds into AI narration. If you have already produced an audiobook and need to correct words that came out wrong, the companion post on how to fix AI mispronunciations covers the remediation side; this post is about preventing those errors before they happen.
What is an audiobook pronunciation guide?
An audiobook pronunciation guide is a reference list that maps each word in your book that could be said more than one way to a single intended pronunciation. In its simplest form it is a two-column table: the word as written on the left, and a phonetic respelling on the right that spells it the way it sounds. The guide exists because spelling alone is ambiguous — a narrator reading "Siobhan" or "Niesos" cold has no way to know you mean shiv-AWN or nye-SOSS rather than the literal letter-by-letter reading.
The guide serves the same purpose for an AI narrator as the pronunciation notes an author would hand a human narrator before a recording session. It is part of the same preparation step as cleaning up your manuscript formatting; the broader checklist of what to fix before narration is covered in how to prepare your manuscript for AI narration. A good pronunciation guide is short, specific, and unambiguous: every entry has exactly one intended pronunciation, written so any reader would say it the same way.
Which words belong in your pronunciation guide?
Include in your pronunciation guide any word that a careful reader could plausibly say more than one way. That includes character names and place names, invented words in fantasy and science fiction, foreign or non-English words, brand and product names, acronyms that might be spoken as letters or as a word, and field-specific jargon in nonfiction. Ordinary English words do not need an entry; a practical test is to list any word where you would correct a friend who read it aloud incorrectly.
The categories that cause the most trouble differ by genre. Fiction, especially fantasy and sci-fi, is dominated by invented names — characters, places, languages, and objects that exist nowhere else and have no dictionary entry to fall back on. Nonfiction is dominated by jargon: scientific terms, medical or legal vocabulary, foreign loanwords, and proper nouns such as researchers or company names. Memoir and history sit in between, with real place names and personal names that are often pronounced differently from how they look. Whichever genre you write, the goal is the same — collect every such word into one list so none is left to chance.
How do you write a phonetic respelling?
A phonetic respelling rewrites a word using plain English syllables that read the way the word should sound, with the stressed syllable in capital letters. To write one, break the word into syllables, spell each syllable using the most obvious English letters for its sound, join them with hyphens, and capitalise the syllable that carries the stress. For example, the name Caelin becomes KAY-leen and the place Niesos becomes nye-SOSS. This approach is deliberately low-tech because it is readable by anyone, with no special symbols to learn or mistype.
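The respelling recipe above is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function name `respell` and its parameters are hypothetical, and you still have to choose the syllable spellings and the stress yourself.

```python
def respell(syllables: list[str], stressed: int) -> str:
    """Join syllables with hyphens and capitalise the stressed one.

    `stressed` is the zero-based position of the syllable that carries
    the stress. Both names are hypothetical, chosen for this sketch.
    """
    if not syllables:
        raise ValueError("need at least one syllable")
    if not 0 <= stressed < len(syllables):
        raise ValueError("stressed index is out of range")
    parts = [
        s.upper() if i == stressed else s.lower()
        for i, s in enumerate(syllables)
    ]
    return "-".join(parts)


print(respell(["kay", "leen"], 0))  # KAY-leen
print(respell(["nye", "soss"], 1))  # nye-SOSS
```

The code does nothing clever, and that is the point: the format is simple enough that anyone can produce it by hand.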
Plain respelling is usually a better choice than the International Phonetic Alphabet for an audiobook guide. IPA is precise, but it is easy to get wrong, hard for a non-specialist to read, and unforgiving if a single symbol is mistyped. A respelling such as nye-SOSS conveys the same sound to any reader without that risk. If you are fluent in IPA you can add it in a second column as a backup, but the plain respelling should always be present because it is what the review process will lean on.
| Word (as written) | Type | Phonetic respelling | Note |
|---|---|---|---|
| Caelin | Character name (fantasy) | KAY-leen | Stress on first syllable, soft "ee" ending |
| Niesos | Invented place name | nye-SOSS | Long "i", stress on second syllable |
| Siobhan | Irish given name | shiv-AWN | Not "see-oh-ban" |
| Worcestershire | Place name | WUSS-ter-sher | Three syllables, not five |
| quinoa | Nonfiction loanword | KEEN-wah | Two syllables; not "KEEN-oh-ah" |
| GIF | Acronym | jiff | Spoken as a word, not letters |
| read (past tense) | Heteronym | red | Context-dependent: "she had read it" |
The takeaway from the table is that one consistent format handles every category. Invented names, real-world names that defy their spelling, loanwords, acronyms, and even context-dependent heteronyms all reduce to the same pattern: a plain-syllable respelling with the stressed syllable capitalised. The note column does the rest of the work by catching the specific trap for each word, such as "not see-oh-ban" or "spoken as a word," which is exactly the information a narrator needs to avoid the obvious wrong reading.
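If you keep the guide in a spreadsheet or script, a small format check can catch entries that break the pattern. The sketch below assumes one rule that this post implies but does not state as a hard requirement: a multi-syllable respelling has exactly one all-capital syllable, and a single-syllable respelling such as jiff may stay lowercase. The function name `check_respelling` is hypothetical.

```python
import re

# Assumption for this sketch: a syllable is written with plain letters only.
SYLLABLE = re.compile(r"^[A-Za-z]+$")


def check_respelling(respelling: str) -> list[str]:
    """Return a list of format problems; an empty list means the entry looks fine."""
    problems = []
    syllables = respelling.split("-")
    for s in syllables:
        if not SYLLABLE.match(s):
            problems.append(f"unexpected characters in syllable {s!r}")
        elif not (s.isupper() or s.islower()):
            problems.append(f"mixed case in syllable {s!r}")
    stressed = [s for s in syllables if s.isupper()]
    if len(syllables) > 1 and len(stressed) != 1:
        problems.append(
            f"expected exactly one stressed syllable, found {len(stressed)}"
        )
    return problems


# Entries taken from the table above.
guide = {
    "Caelin": "KAY-leen",
    "Niesos": "nye-SOSS",
    "Siobhan": "shiv-AWN",
    "Worcestershire": "WUSS-ter-sher",
    "quinoa": "KEEN-wah",
    "GIF": "jiff",
}
for word, respelling in guide.items():
    print(word, check_respelling(respelling))
```

An entry such as kay-leen, with no capitalised syllable, would be reported because the stress is left to guesswork.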
How do you build the guide for a fantasy or sci-fi novel?
For a fantasy or sci-fi novel, build the guide around your invented vocabulary, because that is where listeners will notice an error fastest. Read through the manuscript and collect every invented name — characters, places, races, languages, magic systems, technologies, and objects — into one list, then write a phonetic respelling for each. Many authors already keep a "story bible" or naming document while drafting; that document is the natural starting point, and turning it into respellings is mostly a matter of formatting.
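If your manuscript is available as plain text, a rough first pass at collecting candidates can be automated. The sketch below is only a starting point under stated assumptions: `candidate_words` is a hypothetical helper, and `known_words` stands for whatever list of ordinary words you supply (for example, a dictionary word list). It simply counts every word that is not in that list, so you still review the result by hand.

```python
import re
from collections import Counter


def candidate_words(text: str, known_words: set[str]) -> Counter:
    """Count words whose lowercase form is not in the supplied known-word set."""
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    return Counter(t for t in tokens if t.lower() not in known_words)


sample = "Caelin rode north to Niesos. In Niesos, Caelin rested."
known = {"rode", "north", "to", "in", "rested"}  # stand-in for a real word list

for word, count in candidate_words(sample, known).most_common():
    print(word, count)
```

Sorting by frequency puts recurring names at the top, which are the ones listeners will hear most often.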
The hardest part is consistency rather than the respelling itself. A recurring character name must sound identical in chapter 2 and chapter 22, so fix one pronunciation per name and do not let it drift. It also helps to resolve names you have been saying two different ways in your own head — many authors discover, when forced to write the respelling, that they have never settled on whether a name is KAY-leen or kay-LEEN. Settling that before narration is far easier than discovering the inconsistency in a finished recording.
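A drifting pronunciation is easy to detect once your notes are in a list. The following sketch, with the hypothetical name `find_conflicts`, groups entries by word and reports any word that has picked up more than one respelling.

```python
from collections import defaultdict


def find_conflicts(entries: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each word that has more than one respelling to its sorted respellings."""
    seen = defaultdict(set)
    for word, respelling in entries:
        # casefold() so that "Caelin" and "caelin" count as the same word
        seen[word.casefold()].add(respelling)
    return {word: sorted(r) for word, r in seen.items() if len(r) > 1}


notes = [
    ("Caelin", "KAY-leen"),
    ("Niesos", "nye-SOSS"),
    ("Caelin", "kay-LEEN"),  # the same name, stressed differently
]
print(find_conflicts(notes))  # {'caelin': ['KAY-leen', 'kay-LEEN']}
```

Any word that appears in the output needs a decision before the guide is submitted.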
How do you build the guide for a nonfiction book?
For a nonfiction book, build the guide around jargon, proper nouns, and loanwords, since those carry the credibility of the narration. Scan for technical terms in your field, names of researchers or organisations you cite, foreign words, and any vocabulary a general listener might not know. Mispronounced jargon undermines authority quickly: a business audiobook that says "KEEN-oh-ah" for quinoa or fumbles a cited author's name signals carelessness to exactly the expert listeners you most want to reach.
Pay special attention to acronyms and initialisms, because they are a frequent source of error. Decide for each whether it should be spoken as letters — such as "F-B-I" — or as a word, such as "NASA," and note it explicitly, since a narrator cannot infer this from the capital letters alone. For recurring technical terms, the same consistency rule from fiction applies: pick one pronunciation and keep it identical throughout, so the term sounds authoritative every time it appears.
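One way to make the letters-or-word decision explicit is to record it as a flag and generate the guide entry from it. This is a minimal sketch; the function `speak_acronym` and its `as_letters` parameter are hypothetical names used only for illustration.

```python
def speak_acronym(acronym: str, as_letters: bool) -> str:
    """Render an acronym the way it should be noted in the guide.

    as_letters=True  -> letters separated by hyphens, e.g. F-B-I
    as_letters=False -> written as a single word, e.g. NASA
    """
    # Drop periods or spaces so that "F.B.I." and "FBI" are treated alike.
    letters = [c for c in acronym if c.isalnum()]
    if as_letters:
        return "-".join(c.upper() for c in letters)
    return "".join(letters)


print(speak_acronym("FBI", as_letters=True))    # F-B-I
print(speak_acronym("NASA", as_letters=False))  # NASA
```

An acronym spoken as a word may still need its own respelling in the guide if the pronunciation is not obvious, as with GIF.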
How does TomeVox apply your pronunciation guide?
With TomeVox you submit your pronunciation guide alongside your manuscript, and the intended pronunciations are applied during narration so names and terms come out the way you specified. Every audiobook is automatically checked for technical quality before delivery, which adds a second check on the words most likely to slip. The output is a downloadable M4B file with chapter markers plus per-chapter MP3 files, usually within 48 hours, and a free first-chapter preview lets you hear the actual voice on your actual text before paying, with no credit card required.
If a word still comes out wrong, the per-chapter structure makes the fix targeted rather than disruptive. You can re-generate the affected chapter at no extra cost — there is no cap on regenerations — so correcting a single mispronounced name does not mean re-recording the whole book. TomeVox supports 13 languages at the same flat early-bird price of $49 up to 60,000 words, $79 up to 100,000 words, and $99 up to 150,000 words, with a small $0.0005-per-word add-on only above 150,000 words. Pairing a solid pronunciation guide with the free preview and free regeneration is the most reliable path to a clean recording; for the broader picture of how the production runs end to end, see the AI audiobook production guide.
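To see how the price tiers work out for a given manuscript length, here is a rough estimate in Python. It rests on one assumption that this post does not spell out: the $0.0005-per-word add-on is charged only on the words beyond 150,000 and is added to the $99 tier. The function name `estimate_price` is hypothetical and this is not an official calculator.

```python
from decimal import Decimal

# Early-bird tiers as quoted above: (maximum word count, flat price in USD).
TIERS = [
    (60_000, Decimal("49")),
    (100_000, Decimal("79")),
    (150_000, Decimal("99")),
]
ADD_ON_PER_WORD = Decimal("0.0005")


def estimate_price(word_count: int) -> Decimal:
    """Estimate the flat price in USD for a manuscript of `word_count` words."""
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    for limit, price in TIERS:
        if word_count <= limit:
            return price
    # Assumption: the add-on applies only to the words above the top tier.
    top_limit, top_price = TIERS[-1]
    return top_price + ADD_ON_PER_WORD * (word_count - top_limit)


print(estimate_price(85_000))   # 79
print(estimate_price(200_000))  # 124.0000
```

Under that assumption, a 200,000-word book comes to $99 plus 50,000 extra words at $0.0005 each, or $124.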
How do you build a pronunciation guide step by step?
Building a pronunciation guide takes five straightforward steps, and most authors finish it in an afternoon. Follow them in order so nothing is missed and every entry ends up unambiguous.
1. List every word that could be mispronounced. Read through the manuscript and collect character names, place names, invented words, foreign words, brand names, and technical jargon into a single list. Your existing naming document or story bible is a good starting point if you have one.
2. Decide on one correct pronunciation for each. Choose a single intended pronunciation for every entry, including which syllable is stressed, and resolve any names you have been saying two different ways. One word, one pronunciation.
3. Write a phonetic respelling for each word. Spell each word the way it sounds using plain English syllables joined by hyphens, with the stressed syllable in capitals — for example KAY-leen or nye-SOSS. Add an optional note column for traps like "not see-oh-ban."
4. Flag context-dependent and repeated cases. Mark heteronyms such as read or live with the pronunciation for each context, and confirm a single pronunciation for recurring proper nouns so they stay consistent across every chapter.
5. Submit the guide and review the preview. Send the guide with your manuscript, then listen to the free first-chapter preview and request a chapter regeneration for any word that still comes out wrong. Once a chapter sounds right, the same pronunciations carry through the rest of the book.
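If you prefer to keep the finished guide as a file rather than a hand-typed table, the steps above can end in a simple export. The sketch below writes entries, including one row per context for a heteronym, to CSV text. Both the CSV layout and the `Entry` structure are assumptions made for illustration; this post does not specify a file format for the guide.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Entry:
    word: str
    respelling: str
    note: str = ""


def to_csv(entries: list[Entry]) -> str:
    """Serialise guide entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "respelling", "note"])
    for e in entries:
        writer.writerow([e.word, e.respelling, e.note])
    return buffer.getvalue()


entries = [
    Entry("Caelin", "KAY-leen"),
    # A heteronym gets one row per context.
    Entry("read (present tense)", "reed", "as in: they read every night"),
    Entry("read (past tense)", "red", "as in: she had read it"),
]
print(to_csv(entries))
```

A plain three-column layout mirrors the table earlier in this post, so the same file can be read by a person or pasted into a spreadsheet.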
Frequently asked questions
What is an audiobook pronunciation guide?
An audiobook pronunciation guide is a short list that pairs each tricky word in your book — character names, place names, invented words, foreign terms, brand names, and technical jargon — with the way you want it said. The clearest format is a phonetic respelling that spells the word the way it sounds using plain English syllables, such as KAY-leen for Caelin, with the stressed syllable capitalised. The guide tells a narrator, human or AI, exactly how each word should sound so it stays correct and consistent across the whole book.
Which words should go in a pronunciation guide?
Include any word a reader could plausibly say more than one way: character and place names, invented words in fantasy or sci-fi, foreign or non-English words, brand and product names, acronyms that might be spoken as letters or as a word, and field-specific jargon in nonfiction. You do not need to list ordinary English words. A useful rule is to add any word where you would correct a friend who read it aloud incorrectly.
How does TomeVox use a pronunciation guide?
You submit your pronunciation guide alongside your manuscript, and TomeVox applies the intended pronunciations during narration so names and terms come out the way you specified. Every audiobook is automatically checked for technical quality before delivery, and you can hear a free first-chapter preview before paying with no credit card. If a word still comes out wrong, you can re-generate that chapter at no extra cost, so the fix is targeted rather than a full re-record.
Should I use IPA or plain phonetic respelling?
Plain phonetic respelling is usually the safer choice for an audiobook pronunciation guide. The International Phonetic Alphabet is precise but easy to mistype and hard for a non-specialist to read, whereas a respelling like nye-SOSS communicates the same sound to any reader. If you are comfortable with IPA you can add it in a second column, but the plain respelling is what most narrators and review processes will rely on.
What if a name is pronounced two different ways in my book?
Decide on one pronunciation per word for the audiobook and note it in the guide. Listeners cannot see spelling, so a name that drifts between two pronunciations sounds like two different characters and breaks immersion. If a word genuinely changes by context — for example a heteronym such as read or live — list both pronunciations with a note on which context uses which, so the narration matches the meaning.
Hear your first chapter free before you pay
Send your manuscript and pronunciation guide to TomeVox, choose a voice, and get a free first-chapter preview with no credit card. Like it? Get the full audiobook as an M4B + per-chapter MP3 within 48 hours for a flat $49–$99, with full rights, no exclusivity, and free chapter regeneration if any word needs another pass.