Reading ≠ Learning
Four architectures for AI that knows itself.
Watch the difference.
Same prompt. Two models. One forgets. One remembers.
> Is processPayment() thread-safe?
> Set up auth for this Express app (session 2)
1. Typed Attention
Current attention knows how much things are related. Not how.
Typed: labeled relationships
Current AI
0.73
"These are related."
How? No idea.
Typed Attention
REQUIRES
"B breaks if A changes."
Actionable. Propagates.
2. Activation Thresholds
Your brain doesn't fire every neuron on every input. Why does attention fire every connection?
Current: everything fires. O(n²)
Thresholds: only what matters. O(n×k)
Current AI
O(n²)
Every token looks at every token. Always.
Brute force. Opaque.
With Thresholds
O(n×k)
Only relevant connections fire.
And you can see which ones.
3. Self-Modifying Inference
Weights are frozen at inference. The model is identical before and after your conversation.
Current: every session starts blank
Trace: knowledge accumulates
Current AI
Session 1: You correct a mistake.
Session 2: Same mistake.
Session 3: Same mistake.
Corrections evaporate with context.
With Trace Layer
Session 1: You correct a mistake. Trace written.
Session 2: Model reads its own trace. Already knows.
Session 3: Builds on accumulated history.
Processing transforms the processor.
4. Self-Knowledge Loss
Current AI optimizes for one thing: predict the right token. It has zero concept of why.
Current: uniform confidence. No attribution.
Self-knowledge: confidence varies. Citations link.
Current loss function
loss = predict_right_token()
// that's it.
// no concept of "why"
Self-knowledge loss
loss = predict_right_token()
     + α × can_you_cite_why()
// unexplainable = expensive
What does confidence look like when a model actually knows what it knows?
Current AI
Same flat bar. No differentiation. It doesn't know what it doesn't know.
With Self-Knowledge
It knows callback safety is weak. That's the point.
Current AI
"Yes, it's thread-safe."
Maybe right. Maybe hallucination. No way to tell from the outside.
With Self-Knowledge
"Yes — lock on line 47 guards shared state. Callback on line 62 runs on main thread. Confidence: 0.91"
Every answer carries its own proof.
The Transparent Box
Attends with typed relationships
Fires only when relevant
Transforms through its own processing
Knows why it says what it says
Not a black box you hope is right.
A transparent box that can't not show its work.
Now — you.
You scrolled through four ideas.
Now pick the one that won't leave you alone.
Four starting points. Pick one.
Each of these is a prototype you can build on consumer hardware this weekend.
Path 1
Type your attention
Current attention computes a number between every pair of tokens. That number says how much two things are related. It says nothing about how.
Take any attention implementation. Add edge type labels — even just two: REQUIRES and USES. Now the model doesn't just know that auth.py and stripe.py are related. It knows one requires the other.
Run the same prompts with and without typed edges. Does the model behave differently when it knows relationship types? If yes — that's signal. If no — you've still built something nobody else has tested at this granularity.
Needs: any transformer, ~100 lines of code
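A minimal NumPy sketch of the idea, assuming just two edge types with hand-set biases. In a real prototype the biases would be learned embeddings; the values, names, and the two-token demo below are illustrative assumptions, not a reference implementation.

```python
import numpy as np

# Hypothetical edge types. In practice these biases would be learned;
# the values here are assumptions for illustration.
REQUIRES, USES, NONE = 0, 1, 2
EDGE_BIAS = np.array([2.0, 1.0, 0.0])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def typed_attention(Q, K, V, edge_types):
    """Scaled dot-product attention plus a per-edge-type bias.

    edge_types[i, j] labels the relationship from token i to token j;
    its bias shifts the attention logit before softmax.
    """
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # standard attention logits
    scores = scores + EDGE_BIAS[edge_types]   # typed relationships shift them
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

# Two tokens: token 0 REQUIRES token 1.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(2, 4))
types = np.full((2, 2), NONE)
types[0, 1] = REQUIRES
out, weights = typed_attention(Q, K, V, types)
```

Running the same inputs with every edge set to NONE gives the untyped baseline; the difference in the weight matrix is exactly the signal the experiment is looking for.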
Path 2
Add a threshold
In standard attention, every token attends to every other token. Most of those connections carry almost no signal — but they still cost compute. And you can't see which ones mattered.
Add a minimum activation threshold to any attention layer. Connections below the threshold don't fire. Log which connections survive the cut.
Two things happen. First: efficiency — O(n²) drops toward O(n×k) where k is the number of connections that actually matter. Second, and more important: interpretability. You can now see exactly which connections the model used. The ones that fire are the ones that mattered. The rest were noise.
Needs: one attention layer, a threshold value, a logger
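A sketch of the threshold in NumPy. The cutoff tau is an assumed hyperparameter, and the per-row maximum is always kept so no token loses all of its connections; both choices are illustrative, not prescriptive.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def thresholded_attention(Q, K, V, tau=0.2):
    """Zero attention weights below tau, renormalize, and log survivors."""
    w = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
    # Keep the strongest connection per row so no token is left with nothing.
    keep = (w >= tau) | (w == w.max(axis=-1, keepdims=True))
    w = np.where(keep, w, 0.0)             # connections below tau don't fire
    w = w / w.sum(axis=-1, keepdims=True)  # renormalize the survivors
    fired = np.argwhere(w > 0)             # the log: which (i, j) pairs mattered
    return w @ V, w, fired

rng = np.random.default_rng(1)
Q = K = V = rng.normal(size=(6, 4))
out, w, fired = thresholded_attention(Q, K, V)
```

The `fired` array is the interpretability payoff: a list of exactly which token pairs the layer actually used on this input.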
Path 3
Write a trace
Every AI session starts from zero. You correct a mistake in session 1. Session 2 makes the same mistake. The model read your correction. It didn't learn from it.
After each session, write what the model learned to a file — corrections, decisions, patterns discovered. Before the next session, prepend that trace to context.
Now measure: does it make the same mistake twice? Does it reference yesterday's corrections in today's answers? The trace is the simplest form of self-modifying inference — the model reading its own history and changing behavior because of it.
Needs: any LLM API, a JSON file, a diff checker
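A minimal trace layer in plain Python. The file path and the one-field entry schema are made-up conveniences for illustration; any persistent store and richer schema would do.

```python
import json
import os

TRACE = "trace.json"  # hypothetical location for the trace file

def read_trace(path=TRACE):
    """Load accumulated lessons; empty list on the very first session."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)

def append_trace(entry, path=TRACE):
    """After a session, persist what the model learned."""
    trace = read_trace(path)
    trace.append(entry)
    with open(path, "w") as f:
        json.dump(trace, f, indent=2)

def build_context(prompt, path=TRACE):
    """Before the next session, prepend the trace to the prompt."""
    lessons = "\n".join(f"- {e['lesson']}" for e in read_trace(path))
    if not lessons:
        return prompt
    return f"Corrections from earlier sessions:\n{lessons}\n\n{prompt}"
```

Session 1 calls `append_trace` after a correction; session 2 sends `build_context(prompt)` instead of the bare prompt. The diff checker then compares answers with and without the trace prepended.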
Path 4
Train for self-knowledge
Current models optimize for one thing: predict the right next token. They have no loss for knowing what they don't know. A model that's 30% confident about something says it with the same fluency as something it's 95% confident about.
Add a second prediction head that outputs a confidence score. Train on examples where you know ground truth — when the model is right, confidence should be high. When it's wrong, confidence should be low.
The loss becomes: predict_right_token() + α × can_you_cite_why(). Now the model is penalized for generating things it can't explain. Self-knowledge isn't a feature. It's a loss function.
Needs: a fine-tuning setup, labeled confidence data
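A NumPy sketch of the combined loss. Here the can_you_cite_why() term is stood in for by a calibration penalty on the confidence head: high confidence when wrong, or low confidence when right, is expensive. The function name, signature, and alpha value are assumptions for illustration.

```python
import numpy as np

def log_softmax(x):
    x = x - x.max()  # numerical stability
    return x - np.log(np.exp(x).sum())

def self_knowledge_loss(token_logits, target, confidence, was_correct, alpha=0.5):
    """loss = predict_right_token() + alpha * calibration penalty.

    token_logits: (vocab,) logits from the normal LM head
    confidence:   scalar in (0, 1) from the second prediction head
    was_correct:  1.0 if ground truth confirmed the claim, else 0.0
    """
    ce = -log_softmax(token_logits)[target]  # predict the right token
    eps = 1e-9                               # guard against log(0)
    # Binary cross-entropy pushes confidence toward the ground truth.
    bce = -(was_correct * np.log(confidence + eps)
            + (1.0 - was_correct) * np.log(1.0 - confidence + eps))
    return ce + alpha * bce
```

Under this loss, being confidently wrong costs more than admitting uncertainty, which is exactly the gradient the second head needs in order to learn what the model doesn't know.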
What you actually need.
Not what you think you need. What you actually need.
A model
Qwen 2.5 7B, Llama 3.1 8B, Mistral 7B.
All free. All run on consumer GPUs.
8GB VRAM with QLoRA.
A method
LoRA/QLoRA for fine-tuning.
Hugging Face transformers + PEFT.
Standard Python. No custom CUDA.
A question
Not "how do I build this?"
But "what happens if I try?"
The question IS the method.
Everything on this page was built with Qwen 2.5 7B and a Claude Max subscription.
If you have a GPU and curiosity, you have enough.
What happens when someone actually does this?
If one person builds one typed attention prototype, we learn whether relationship labels carry signal through inference.
If ten people each build a different piece, the architecture starts assembling itself — not by coordination, but by convergence. Same question, different implementations, shared findings.
If any of it works — even partially, even messily — it means the current paradigm is leaving value on the table. Not because the models are bad. Because they don't know what they know.
And if it doesn't work?
Then you'll know why. You'll have the experiment, the data, the specific point of failure. That's more valuable than an opinion. That's engineering.
Three ways to engage.
Build it
Pick a path. Build a prototype. Share what happens — the failures are as valuable as the successes.
Break it
Tell us what's wrong. Which ideas are already solved? Which are provably impossible? Point to the papers we missed.
Extend it
See something we didn't? A fifth idea that connects to these four? A domain where this applies differently? That's how architectures grow.
"Which parts matter right now?"
That question started this project. The same question is now yours.
The architecture is open. The ideas are free. The models are free.
The only thing missing is what you do next.