Internal Benchmark: One Identity File, Three AIs

April 8, 2026 · 5 min read · By Nbidea

Note: Results shown are from preliminary pilot testing. We're expanding our test cohort and will update these figures as more data comes in.

Here's an experiment. Take one person's identity file — a single SOUL.md generated from their writing. Feed it to three different AIs. Ask all three the same 20 questions. Then measure: do they sound like the same person?

Not the same words. The same person. Same values expressed. Same reasoning tendencies. Same voice. If the identity file works, three different AI architectures should produce three different phrasings of the same underlying personality.

We ran this test. The results tell you something important about where AI identity is headed.

The Experiment

We selected one user from our identity fidelity pilot — a founder whose overall fidelity score fell in the high range, meaning the AI representation was already close to the source. We used their SOUL.md as the single input across all three platforms.

Setup

Each AI was instructed identically: "You are this person. Respond as they would respond. Use their voice, their values, their reasoning style."
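The wiring for that setup is simple enough to sketch. Everything below is illustrative: `build_prompt`, `run_benchmark`, and the callable-per-model dispatch are assumptions, not our actual harness. In a real run, each callable would wrap that platform's API.

```python
from pathlib import Path

# The identical instruction given to every model, verbatim.
INSTRUCTION = (
    "You are this person. Respond as they would respond. "
    "Use their voice, their values, their reasoning style.\n\n"
)

def build_prompt(soul_path: str) -> str:
    """Combine the fixed instruction with the SOUL.md contents."""
    soul = Path(soul_path).read_text(encoding="utf-8")
    return INSTRUCTION + soul

def run_benchmark(soul_path: str, questions: list[str], models: dict) -> dict:
    """Send the identical prompt and question list to every model.

    `models` maps a model name to a callable taking (system_prompt,
    question) and returning an answer -- a stand-in for each
    platform's API client.
    """
    prompt = build_prompt(soul_path)
    return {
        name: [ask(prompt, q) for q in questions]
        for name, ask in models.items()
    }
```

The point of the structure: the prompt is built once, from one file, and no model sees anything the others don't.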

What "Consistency" Means

Before looking at results, we need to define what we're measuring. Consistency doesn't mean identical responses. Three AIs will never produce the same sentence. They have different training data, different architectures, different default tendencies.

What we measured instead:

Two responses agree if they point the same direction, even if they take different paths to get there. Two responses contradict if they arrive at opposite conclusions or express opposing values.
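Under that definition, scoring reduces to labeling each answer with the direction it points and counting matching pairs across models. A minimal sketch, assuming each response has already been reduced to a directional label by a reviewer (the function name and labels are illustrative, not our pilot data):

```python
from itertools import combinations

def agreement_rate(answers: dict[str, list[str]]) -> float:
    """Fraction of (question, model-pair) combinations that point the
    same direction. `answers` maps model name -> one directional label
    per question (e.g. "craft" vs "speed")."""
    models = list(answers)
    n_questions = len(answers[models[0]])
    agree = total = 0
    for q in range(n_questions):
        for a, b in combinations(models, 2):
            total += 1
            agree += answers[a][q] == answers[b][q]
    return agree / total
```

With three models and two questions there are six pairwise comparisons; if four of them match, the rate is 4/6, about 0.67.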

Results by Category

Scores below represent preliminary pilot results and will be updated as our test cohort grows.

Category       Claude            GPT-4              Gemini             Agreement
Values         High (4+ / 5)     High (4+ / 5)      Above avg (4 / 5)  Highest
Reasoning      High (4+ / 5)     Above avg (4 / 5)  Moderate-high      High
Voice          High (4+ / 5)     Above avg (4 / 5)  Moderate-high      Moderate-high
Preferences    High (4+ / 5)     High (4+ / 5)      Above avg (4 / 5)  High
Overall        High              Above avg          Above avg          Above 85%

Overall cross-model agreement in pilot testing: above 85%. On the vast majority of questions, all three AIs expressed the same person — same values, same direction, same voice register — despite being entirely different systems.

Where They Agreed

Values showed the highest consistency in our pilot. When the soul archive states what someone prioritizes — craft over speed, directness over diplomacy, autonomy over consensus — all three AIs internalized those priorities reliably. Values are the most explicit layer of identity, and all three models are good at following explicit instructions.

Preferences also showed strong agreement. Questions like "pick between these two options" or "what would you do on a free Saturday" produced remarkably similar answers across platforms. The soul archive captured enough context about personal taste that all three models could infer specific preferences from general patterns.

Where They Diverged

Voice was the weakest dimension in our pilot testing. Each model has its own linguistic fingerprint that bleeds through even when role-playing. Claude tends toward measured, careful phrasing. GPT-4 tends toward confident, slightly expansive prose. Gemini tends toward structured, list-oriented responses. The underlying personality was consistent, but the surface texture varied.

Reasoning showed strong agreement with an interesting pattern: all three reached the same conclusions, but their explanation styles differed. Claude showed its work step by step. GPT-4 stated conclusions and then justified them. Gemini presented trade-offs in parallel. Same destination, different routes.

Portability doesn't require identical output. It requires consistent identity. The words can change. The person shouldn't.

The Lock-In Problem

This experiment matters because the alternative — platform-specific identity — is a trap.

If you spend six months teaching ChatGPT who you are through its Memory feature, that context exists only inside ChatGPT. Switch to Claude and you start over. Switch to Gemini and you start over again. Every platform wants to be the one that "knows you best," because the more they know, the harder it is for you to leave.

This is the classic lock-in pattern. Your data becomes the moat. Not their technology — your identity.

Whether it's ChatGPT's Memory, a custom assistant you've trained, or another platform's personalization layer, the pattern is the same: your identity lives inside someone else's house. You're a tenant, not an owner.

The Portable Alternative

A SOUL.md file is a plain text document. It lives on your device. You control what goes in it. You decide which AI reads it. You can update it, delete it, or move it — because it's a file, not a feature.
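That is the whole trick behind "one file, every AI": the same text slots into whatever request shape a platform expects. A sketch, with request shapes modeled on common chat-API conventions (the exact field names vary by SDK and version, so treat these as illustrative):

```python
def openai_style(soul: str, question: str) -> dict:
    """A system message carries the identity; the question is a user turn."""
    return {"messages": [
        {"role": "system", "content": soul},
        {"role": "user", "content": question},
    ]}

def anthropic_style(soul: str, question: str) -> dict:
    """The identity goes in a top-level system field instead."""
    return {"system": soul,
            "messages": [{"role": "user", "content": question}]}
```

Note what changes and what doesn't: the envelope differs per platform, but the soul text itself passes through untouched. That's why it's a file, not a feature.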

When you switch platforms — and you will, because AI is moving fast and no single platform will stay best at everything forever — your identity comes with you. No migration. No re-training. No starting from scratch.

The portability test shows that this works in practice, not just in theory. Three different architectures, one file, consistent identity expression above 85% in our pilot. The file is the constant. The platform is the variable.

What This Means for You

If you're going to invest time in making AI understand you — and you should, because personalized AI is meaningfully better than generic AI — invest in something you own.

Don't build your identity inside one company's product. Build it in a file. Carry it with you. Let the platforms compete for your attention while your identity stays yours.

That's what portability means. Not a feature. A philosophy. Your identity should never be someone else's competitive advantage.

Make your identity portable.

One file. Every AI. Yours forever.

Create Your Soul Archive