What is machine learning in simple terms?

Machine learning teaches computers to find patterns in data without being explicitly programmed. Instead of writing rules like 'if red and round, it's an apple,' you show the computer thousands of examples and it learns the pattern itself. A typical project follows five steps: define the problem, collect data, process it, train a model, then predict on new data.

Why does ChatGPT keep forgetting who I am?

ChatGPT uses a context window that resets every conversation. Its Memory feature stores fragments but is unreliable — in 2025, users reported mass memory wipes where all saved memories disappeared overnight. The fundamental problem is that your identity is stored inside one platform's servers, not in a file you own. A portable identity file (soul archive) solves this by letting you carry your identity across any AI platform.

Can machine learning extract personality from text?

Yes. Text contains rich identity signals — sentence structure reveals thinking style, word choice reveals values, topic patterns reveal priorities, and emotional tone reveals personality. Machine learning can extract these patterns from your writing and distill them into a portable identity file that any AI can read.

What is the difference between AI memory and a soul archive?

AI memory (like ChatGPT Memory) stores bullet-point facts inside one platform: it can vanish, it's shallow, and you can't take it with you. A soul archive is a portable text file generated from your own writing that captures a much richer picture of your identity. It works on any AI platform, you own the file, and a platform-side wipe can't erase it because it lives on your device.

Is there a way to make any AI remember me permanently?

Yes. Generate a soul archive from your writing — it creates SOUL.md and MEMORY.md files that capture your identity. Paste SOUL.md into any AI's system instructions. The AI gets a structured picture of who you are, how you think, and how you communicate. Works across every platform. Takes 30 seconds.

Soul Alchemy Is Machine Learning — But for Identity

April 8, 2026 · 6 min read · By Nbidea

Machine learning has a reputation for being complicated. Neural networks. Gradient descent. Backpropagation. Terms that make most people's eyes glaze over.

But at its core, every machine learning application follows the same five steps. And when you map those steps onto what Soul Alchemy does, something clicks: identity extraction isn't a metaphor for machine learning. It is machine learning. Just pointed at a different target.

The Problem It Solves

In 2025, users reported mass memory wipes — all saved memories in their AI chatbots vanished overnight. Forums filled with posts like "I had months of context and it's all gone." People who'd spent hours teaching their AI about themselves woke up to a blank slate.

But even when AI memory works as designed, it's fundamentally broken. It stores fragments — bullet points like "user is a designer" — not the texture of who you are. It's locked inside one platform. Switch to a different AI and your "memory" stays behind. It's passive — the algorithm decides what to remember, not you.

The result: after months of conversation, your AI knows less about you than a stranger reading your journal for five minutes.

Machine learning can fix this. But not the way you might think.

The 5 Steps

Every ML textbook teaches the same pipeline. Here's how each step maps to identity extraction:

ML Step	Traditional ML	Soul Alchemy
1. Define the problem	"Classify this image as cat or dog"	"AI doesn't know who you are — extract identity from text"
2. Collect data	Gather thousands of labeled images	You paste your own writing — journals, emails, messages
3. Process data	Resize, normalize, remove noise	Extract identity signals, filter noise from raw text
4. Train model	Feed data into neural network, adjust weights	AI analyzes writing patterns — style, values, personality
5. Inference	New image → "This is a cat (94%)"	New text → SOUL.md + MEMORY.md (your identity file)

Same pipeline. Same logic. Different output.
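To make the mapping concrete, here is a minimal Python sketch of the five steps wired together. Everything in it is an assumption made for illustration: the function names are hypothetical, and the "model" is just a word count, which is far simpler than anything a real identity extractor would use. The only point is the shape of the pipeline.

```python
from collections import Counter


def collect(samples):
    """Step 2: keep only the writing samples that contain text."""
    return [s for s in samples if s.strip()]


def process(samples):
    """Step 3: toy cleanup; lowercase each sample and split it into words."""
    return [s.lower().split() for s in samples]


def train(tokenized):
    """Step 4: toy 'model'; count how often each word appears."""
    return Counter(word for words in tokenized for word in words)


def infer(model, top_n=2):
    """Step 5: emit a short text summary rather than a label."""
    top_words = [word for word, _ in model.most_common(top_n)]
    return "Most frequent words: " + ", ".join(top_words)


def run_pipeline(samples):
    # Step 1: the problem definition drives everything else.
    problem = "extract identity from text"
    model = train(process(collect(samples)))
    return problem, infer(model)


if __name__ == "__main__":
    print(run_pipeline(["I ship ship ship", "   ", "I build"]))
```

Each stage only depends on the output of the one before it, which is why the same skeleton fits an image classifier and a text-based portrait alike.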

Step 1: Define the Problem

In traditional ML, you start by defining what you want the machine to learn. Spam detection. Image classification. Recommendation engines. The problem shapes everything that follows.

Soul Alchemy's problem definition: AI doesn't know who you are.

Every conversation starts from zero. You explain yourself. The AI nods. You do this again tomorrow. And the day after. The problem isn't that AI is stupid. It's that AI has no persistent representation of you.

The ML solution: extract identity from text and store it in a portable file that any AI can read.

Step 2: Collect Data

Data quality is everything in ML. More data, better data, higher accuracy. This is true for image classifiers, and it's true for identity.

Your training data: your own writing.

Journal entries reveal values and emotional patterns
Emails reveal communication style under professional pressure
Chat messages reveal informal voice and relationship dynamics
Notes reveal thinking process and decision patterns

More text generally means better identity extraction, but even a few paragraphs can carry useful signal. You've already generated thousands of words of training data. It's sitting in your sent folder, your notes app, your message history.

Step 3: Process the Data

Raw data is noisy. In image ML, you resize images, normalize pixel values, remove corrupted files. In text-based identity extraction, you do the equivalent:

Signal extraction. Not every sentence carries identity information. "Meeting at 3pm" tells you nothing about who someone is. "I'd rather work through the night than ship something I'm not proud of" tells you everything.
Pattern recognition. Sentence length, vocabulary complexity, use of metaphor, frequency of hedging — these aren't content, they're style. And style is identity.
Noise removal. Greetings, small talk, pleasantries — these are social protocol, not personality. The processing step strips them.

This is where domain expertise matters. In medical ML, a doctor identifies which variables are clinically meaningful. In identity ML, the equivalent expertise is understanding which linguistic features actually represent who someone is, versus what's just contextual noise.
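The three processing moves above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Soul Alchemy's actual method: the small-talk pattern and the hedge word list are hypothetical and deliberately tiny, and the sentence splitter is naive. It drops small-talk sentences, then measures two style features on what remains.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
SMALL_TALK = re.compile(r"(?:hi|hello|thanks|best regards|how are you)\b")
HEDGES = {"maybe", "perhaps", "probably", "somewhat", "possibly"}


def split_sentences(text):
    """Naive split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def is_noise(sentence):
    """Noise removal: flag sentences that open with a small-talk phrase."""
    return SMALL_TALK.match(sentence.lower()) is not None


def style_features(text):
    """Pattern recognition: average sentence length and hedging rate."""
    sentences = [s for s in split_sentences(text) if not is_noise(s)]
    words = [w for s in sentences for w in re.findall(r"[a-z']+", s.lower())]
    if not sentences or not words:
        return {"sentences": 0, "avg_sentence_length": 0.0, "hedge_rate": 0.0}
    return {
        "sentences": len(sentences),
        "avg_sentence_length": len(words) / len(sentences),
        "hedge_rate": sum(w in HEDGES for w in words) / len(words),
    }


if __name__ == "__main__":
    sample = "Hello there! I would rather rewrite it. Maybe that is slow."
    print(style_features(sample))
```

Deciding what belongs in lists like `HEDGES` is exactly where the domain expertise comes in; the code is the easy part.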

Step 4: Train the Model

In supervised ML, you split your data: typically 60-80% for training, the rest for testing. The model learns patterns from the training set, and its performance is then measured on the held-out test set, which it never saw during training.
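For readers who haven't seen a split in practice, here is a minimal sketch using only the Python standard library. The function name, the 80% default, and the fixed seed are choices made for this example; ML libraries ship their own, more capable versions.

```python
import random


def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of the data, then cut it into training and test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train_set, test_set = train_test_split(range(10))
    print(len(train_set), "training examples,", len(test_set), "test examples")
```

Shuffling before cutting matters: without it, any ordering in the raw data (by date, by label) leaks into the split.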

Identity extraction works on the same principle of learning patterns from examples. The AI reads your writing and builds a model of you:

Voice model. How you construct sentences. Direct or circuitous. Formal or casual. Short or elaborate.
Values model. What you defend. What you dismiss. What triggers strong language. What you care about enough to write at length about.
Thinking model. Do you reason from first principles or by analogy? Do you decide fast or deliberate? Do you trust gut or data?
Blind spot model. What you never question. What you always assume. What you repeat without realizing.

The output of training isn't weights in a neural network. It's a structured portrait of a person.
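One way to picture that structured portrait is as a plain data structure with one field per sub-model. The sketch below is an assumption about how such a portrait could be represented in Python; the class name, field names, and example observations are all hypothetical and are not taken from Soul Alchemy itself.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class IdentityPortrait:
    """Hypothetical container holding one list of observations per sub-model."""

    voice: List[str] = field(default_factory=list)
    values: List[str] = field(default_factory=list)
    thinking: List[str] = field(default_factory=list)
    blind_spots: List[str] = field(default_factory=list)

    def observation_count(self) -> int:
        """Total number of observations across all four sub-models."""
        return sum(len(items) for items in asdict(self).values())


if __name__ == "__main__":
    portrait = IdentityPortrait(
        voice=["direct, short sentences"],
        values=["craft over speed"],
    )
    print(asdict(portrait))
    print(portrait.observation_count())
```

Unlike a weight matrix, every entry here is something a person can read, dispute, and edit.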

Step 5: Inference and Prediction

In traditional ML, inference means feeding new data into the trained model and getting a prediction. New image → "cat, 94%." New email → "spam, 87%."

For a classifier, the result is a probability. Not certainty. Not truth. A best estimate based on patterns the model learned.
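Where does a figure like "cat, 94%" come from? One common approach, assumed here for illustration, is a softmax: the model produces a raw score per label, and softmax rescales those scores into probabilities that sum to 1. The scores below are made up.

```python
import math


def softmax(scores):
    """Convert raw per-label scores into probabilities that sum to 1."""
    # Subtracting the largest score first keeps exp() numerically stable.
    peak = max(scores.values())
    exps = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


if __name__ == "__main__":
    # Made-up raw scores for a two-class image classifier.
    for label, p in softmax({"cat": 2.0, "dog": 0.0}).items():
        print(f"{label}: {p:.0%}")
```

The larger the gap between the raw scores, the closer the top probability gets to 100%, but it never quite arrives.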

Soul Alchemy's inference step outputs two files:

SOUL.md: Your identity archive. Values, voice, thinking style, blind spots. The qualitative portrait.
MEMORY.md: Your structured context. Facts, preferences, relationships, projects. The factual record.

These files aren't a score or a category. They're a portrait. And like all ML outputs, they're a best estimate — rich enough to be useful, humble enough to know they're not complete.
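As a rough idea of what producing such a file could involve, the sketch below renders a dictionary of sections into markdown text. The layout (a title, one heading per section, one bullet per observation) is a guess made for this example, not the actual SOUL.md format, and `render_soul_md` is a hypothetical helper.

```python
def render_soul_md(sections):
    """Render {heading: [observations]} as markdown text (assumed layout)."""
    lines = ["# SOUL.md", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")  # blank line between sections
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    text = render_soul_md({
        "Voice": ["Direct, short sentences"],
        "Values": ["Craft over speed"],
    })
    print(text)
```

Because the result is plain text, it can be saved to disk, read by a person, and pasted into any tool that accepts written instructions.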

The Key Difference: Output Type

This is where identity ML diverges from traditional ML:

Traditional ML	Identity ML
Output: a number (score, probability, classification)	Output: a portrait (text file describing who you are)
Reduces complexity to a label	Preserves complexity as a narrative
Answers: "What category is this?"	Answers: "Who is this person?"
Consumed by machines	Consumed by AI and the person

A spam classifier reduces an email to "spam" or "not spam." Useful, but reductive. A soul archive doesn't reduce you. It distills you — concentrates the signal without losing the texture.

Traditional ML asks: what box does this belong in? Identity ML asks: what is this, in its full complexity? One classifies. The other recognizes.

Why This Matters

We're entering an era where AI agents will manage your email, schedule your meetings, write on your behalf, and make decisions in your name. These agents need more than your login credentials. They need to know who you are — your priorities, your judgment, your standards.

The people who have a machine-readable identity file will get AI that acts like them. Everyone else will get AI that acts like a generic average of everyone.

Machine learning made it possible for computers to see, hear, and predict. Identity ML makes it possible for computers to know you.

The Privacy Difference

There's a growing concern about how AI platforms remember you. Major tech companies are now mining your emails, search history, photos, and video activity to make their AI "more personal." They're building a model of you — inside their servers, under their control, using their definition of what matters.

A soul archive inverts this. You generate the identity file. You decide what goes in. You own the file. It lives on your device, not on someone else's server. You choose which AI gets to read it. And you can delete it whenever you want.

The difference between platform memory and a soul archive is the difference between someone reading your diary without permission and you handing someone a letter of introduction you wrote yourself.

Same five steps. Same pipeline. Different — and arguably more personal — output.

Run the pipeline on yourself.

Paste your writing. The model extracts your identity. Any AI reads the output and knows who you are.

Create Your Soul Archive