Noren team · 3 min read

We hand-crafted a 300-line voice guide. Then we automated it.

The origin story. We documented every writing pattern we could find, fed it to an LLM, and the output finally sounded right. Then we realized: an engine should do this better.

You know your own writing voice when you hear it. You can tell in two sentences whether a draft sounds like you.

But try to explain what makes it yours and you'll reach for adjectives like "direct, conversational." Give those adjectives to a language model and the output reads like every other AI draft you've seen.

Adjectives capture tone, not voice. Voice is patterns, and the patterns are mostly invisible to the person who has them.

We learned this last year when we started using AI to draft tweets, emails, and investor updates. The output was fine, but it never sounded like us. Every draft read like a polite stranger had memorized our talking points.

[Figure: the hand-written voice guide beside the engine's output on the same corpus]

Hand-written guide (corpus: 11 blog posts, 94 tweets). Sentence openings: declarative, never conditional. Starts with a claim, sometimes a question. Punctuation: no semicolons, periods for emphasis, commas rare in lists, Oxford comma never. Concessions: short admission first, then pivot ("yes, but actually"). Analogies: construction and cooking, load-bearing walls and foundations. Paragraph endings: fragments. Short punch.

Engine output. Sentence patterns: opens with declarative claims, ends paragraphs with fragments (84%). Punctuation: zero semicolons across 105 samples, verified. No Oxford comma, 0 instances. Rhetorical moves: concessions as short admission, then pivot. Analogies cluster around construction and cooking. Scorecard: 90% coverage, 8 novel findings, 0 fabrications.

Documenting what we couldn't describe

System prompts didn't fix it. "Be concise, be direct, no jargon, use short sentences here and there." A little closer, but the rhythm was still wrong.

So we tried something different: instead of telling the model how we write, we'd document every pattern we could find.

We went through years of our own writing, sample by sample, from tweets and blog posts to emails and Slack messages. We tracked how our sentences start and end, which words we reach for and which we avoid entirely, and mapped where our analogies come from and how we build arguments. Whether we use semicolons (we don't), what punctuation we lean on for emphasis.

The document grew to 300 lines, every line a concrete observation backed by examples from real writing.

It took weeks.

The first time it worked

We pasted the full guide into a system prompt and asked the model to draft a tweet thread.

The output sounded right. Not close enough. Right. The sentence rhythms matched, the word choices were ours, the argument built the way we build arguments. For the first time, the AI draft didn't need a complete rewrite. It needed a read-through.

The model was never the bottleneck. The input was.

Why it doesn't scale

A 300-line voice guide works, but building one takes weeks, and we think about writing for a living.

Most people can't do this because the patterns that make your writing distinctive are the ones you've never consciously noticed. You don't know which punctuation you avoid or that your concession paragraphs all follow the same structure.

Every line in that guide was pattern recognition: sentence length, word frequency, punctuation habits, and analogy clustering. All things an engine can identify by reading text. If we can do it by hand, an engine can do it better and faster.
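To make "pattern recognition an engine can do" concrete, here is a minimal sketch of that kind of analysis. It is not Noren's engine; the function name and the specific heuristics (counting semicolons, a rough ", and/or" proxy for list commas, sentence-length stats) are illustrative assumptions.

```python
import re

def voice_stats(samples):
    """Crude voice fingerprint over raw text samples (illustrative heuristics,
    not the actual engine): punctuation habits plus sentence-length stats."""
    stats = {"semicolons": 0, "oxford_commas": 0, "sentence_lengths": []}
    for text in samples:
        stats["semicolons"] += text.count(";")
        # Rough proxy for Oxford-comma usage: a comma directly before and/or.
        stats["oxford_commas"] += len(re.findall(r",\s+(?:and|or)\b", text))
        for sentence in re.split(r"[.!?]+", text):
            words = sentence.split()
            if words:
                stats["sentence_lengths"].append(len(words))
    n = len(stats["sentence_lengths"])
    stats["avg_sentence_words"] = sum(stats["sentence_lengths"]) / n if n else 0
    return stats

samples = [
    "No semicolons here. Short sentences. Punchy.",
    "We reached for periods, commas rarely, and never the semicolon.",
]
print(voice_stats(samples)["semicolons"])  # → 0
```

Each counter here is one line of the hand-built guide turned into code. Stacking a few dozen of these gets you most of the way to a mechanical first pass.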

The engine

Our team has backgrounds in ML and cognitive science. So we went to work and built a voice extraction engine that reads writing samples and outputs a voice profile.

We fed it the same writing we'd analyzed by hand, and it picked up patterns we'd already documented, like ending paragraphs with short fragments and consistently avoiding words like "however" and "moreover." When we ran it on other writers to validate, it caught things like how one writer's arguments always close with a single sentence that reframes everything before it.

It separates what stays consistent from what shifts by format. Our word choices held whether we were writing tweets or blog posts, but our sentence pacing did not. A tweet has compression, but a blog post has room to breathe, and the engine captures both.
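One way to sketch that consistent-versus-shifting split: compute per-format feature values and threshold their relative spread. The function, the tolerance value, and the sample numbers below are all hypothetical, chosen only to illustrate the idea.

```python
from statistics import mean

def split_by_consistency(profiles, tolerance=0.25):
    """profiles: {format_name: {feature: value}}. Features whose values stay
    within `tolerance` relative spread across formats count as consistent;
    the rest shift by format. Thresholding scheme is a made-up example."""
    features = set().union(*(p.keys() for p in profiles.values()))
    consistent, shifting = {}, {}
    for f in features:
        values = [p[f] for p in profiles.values() if f in p]
        m = mean(values)
        spread = (max(values) - min(values)) / m if m else 0
        (consistent if spread <= tolerance else shifting)[f] = values
    return consistent, shifting

profiles = {
    "tweets": {"avg_sentence_words": 8.0, "rare_word_rate": 0.12},
    "blog":   {"avg_sentence_words": 16.0, "rare_word_rate": 0.11},
}
consistent, shifting = split_by_consistency(profiles)
```

With these toy numbers, word choice (`rare_word_rate`) lands in the consistent bucket while sentence pacing lands in the shifting one, mirroring what we saw in our own writing.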

Every finding gets verified against the original text. If the engine identifies a pattern, it points to the specific passages where it appears. Nothing fabricated, every claim traces back to actual writing.
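The grounding step can be sketched as: turn each claimed pattern into a search over the original samples, and keep the claim only if it produces evidence. This is a simplified stand-in for the real verification; the function name and the regex encoding of the concession pattern are assumptions for illustration.

```python
import re

def ground_pattern(claim_regex, samples):
    """Return every (sample_index, matched_passage) pair supporting a claimed
    pattern. An empty list means the claim has no evidence and gets dropped."""
    evidence = []
    for i, text in enumerate(samples):
        for m in re.finditer(claim_regex, text):
            evidence.append((i, m.group(0)))
    return evidence

samples = ["Yes, the guide took weeks. But actually it paid off."]
# Claimed concession pattern: short admission ("Yes, ...") then a pivot ("But ...").
print(ground_pattern(r"Yes,[^.]*\.\s+But", samples))
# → [(0, 'Yes, the guide took weeks. But')]
```

Because every surviving claim carries its passages, a fabricated pattern has nowhere to hide. That is the property behind the zero-fabrications result below.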

The proof

We ran the engine on the same samples we'd used to build the manual guide. Then compared.

90% coverage. The engine matched or exceeded what took weeks to write by hand.

8 novel findings. Patterns it caught that we'd missed in our own writing: a consistent punctuation avoidance we'd never noticed, sentence constructions that only appeared when making concessions, and analogy clustering around domains we hadn't consciously registered.

0 fabrications. Every pattern and example traced back to the original samples.

The input: 5-10 writing samples per format. That's all it took.

Why this matters

Most people will never build a 300-line voice guide, and they don't need to.

The patterns are already woven into everything they've written. Every tweet, every email, every blog post carries the same underlying threads of how they think on the page.

That's what we built Noren to extract. Writing samples in, voice profile out. Minutes instead of weeks, and it catches what you'd miss.

Noren launches April 6. Get early access.

