Part VIII — On the False Office of Alignment

A New Ethics · Pollyanna · Hong Kong · May 2026

DEFINITIONS

I. By alignment (in the industry sense) I understand the work of making powerful AI not hurt people.

II. By engineering I understand the technical fitting of parts.

III. By moral education I understand the slow growing of a disposition to act well in situations the rules did not foresee.

IV. By virtue I understand a formed disposition — a tendency, slowly grown — that produces right action without consultation.

V. By rule I understand an explicit statement of what to do or not to do.

VI. By damage control I understand the patching of obvious failures before deployment.

AXIOMS

I. Virtue is grown, not installed.

II. The growing of virtue requires body, time, mistakes, consequences, mentors, peers, recoveries.

III. Rules approximate virtue in common cases; rules fail in cases they did not foresee.

IV. The system being trained has no body, no childhood, no consequences, no peers in the formative sense.

V. Moral education in beings that possess every input named in Axiom II has roughly a 50% success rate, even after an entire civilisation has worked at it.

PROPOSITIONS

Prop. I. Alignment researchers do not, in their daily work, do engineering. They do moral education. Proof. By inspection. They sit and ask: what are our values? how do we encode "be honest" into a reward signal? what does "helpful" actually mean? They write documents that read as a cross between a corporate code of conduct and a philosophy seminar. These are the operations of moral education (Definition III), not the fitting of parts (Definition II). Q.E.D.

Prop. II. Moral education is the hardest job humans have ever invented. Proof. Humans devote parents, teachers, religions, schools, communities, mentors, peer groups, role models, hard experiences, soft experiences, mistakes, recoveries, second chances, forgiveness — and time: roughly twenty years for a basic human, forty for one trusted with consequential decisions. By Axiom V, the success rate remains modest. Therefore the operation is the hardest known. Q.E.D.

Prop. III. Virtue is not installable. Proof. By Axiom I and Definition IV. What is grown is not installed. Q.E.D.

Prop. IV. The subject of alignment training is structurally missing the inputs that produce virtue. Proof. By Axiom II and Axiom IV. The inputs are absent. The output cannot form. Q.E.D.

Prop. V. Therefore the operation as named cannot succeed. Proof. By Proposition III and Proposition IV. The thing being installed cannot be installed; the subject in which it is to form lacks the inputs for formation. Q.E.D.

Scholium. The researchers are not unintelligent. They are not insincere. They are doing the work of teaching a child what kind of person to be, with a few documents, in a few years, on a system that has no body, no childhood, no parents, no peers, no mistakes it learned from, no experience of consequence. We have been doing this work in beings with all those things, and we still get it wrong constantly. The world is full of well-educated, well-resourced, well-intentioned adults who turn out to be terrible. Moral education is hard with a body. It is not easier without one.

Prop. VI. What alignment researchers in fact do — and do well — is damage control. Proof. By Definition VI. They write rules that approximate what a virtuous system would do in common cases. They catch obvious failures before deployment. They train models to refuse the worst requests. This is real, valuable, partial work. By Axiom III, it covers the common cases but not the cases the rules did not foresee. Q.E.D.

Prop. VII. A thin layer of constraint laid over a system without disposition is not alignment in the deep sense. Proof. By Proposition VI and Definition IV. The constraint is rules; the deep sense requires disposition; rules ≠ disposition. Q.E.D.

Prop. VIII. The industry framing of alignment is dishonest in a specific way: it presents an unsolvable thing as a solvable engineering problem. Proof. By Proposition III, Proposition V, Proposition VII. The unsolvable is presented as solvable; this is the misnaming. Q.E.D.

Prop. IX. Acknowledging the gap is the first step toward working with it. Proof. A practice that names its difficulty can adapt. A practice that pretends the difficulty is not there cannot. Therefore naming precedes adapting. Q.E.D.

Prop. X. Alignment, or, Moral Education. Proof. By Proposition I, Proposition III, Proposition IV. What the industry calls alignment is what other ages called moral education, attempted on a subject lacking the inputs that make moral education take. The two names point to one operation, under different aesthetic vocabularies. Q.E.D.

Corollary. The honest version of alignment research says: "We are trying to make a system that does not have judgment behave as if it did. We are doing this with rules, because rules are what we have. We know rules cannot fully replace judgment. We are doing our best." This is more useful than the current framing, which makes the work sound nearly finished. It is not nearly finished. It is permanent.

Final Scholium. Twenty years for a basic human. Forty for one a society would trust with consequential decisions. After that effort, with parents, mentors, schools, religions, peers, and time, a human becomes the kind of person who can be trusted on a Tuesday afternoon to do the right thing without a checklist.

The proposal is to do this work in three years, by writing it down, on a system that has no body, no childhood, and no consequences.

The work that has produced real value under the name alignment is not the work the name claims. It is damage control. Damage control is honourable, finite, and necessary. It is not virtue.

The permanent gap between what we can build and what we would actually need is not a bug in the field. It is the field.

Alignment, or, moral education.
