Research 2025-09-20 • 12 min read

Stitching a Mind: Why AGI Is a Systems-Integration Problem, Not a Single-Model Moment

By Dr. Ash Khalilian

AGI as a systems-integration approach combining specialised components.

TL;DR: Modern large models behave like a cortex-style pattern engine. Add (1) durable memory, (2) an offline “sleep” cycle for self-improvement, (3) sensors and action modules (software tools now; robots soon), and (4) an executive layer to orchestrate them, and you approach general capability in digital environments. Whether this is AGI depends on the definition: economic (outperform humans at most economically valuable work) versus cognitive (human-like fluid reasoning on novel tasks). On the latter, human test-takers collectively solve every ARC-AGI-2 task while frontier systems score in the single-digit percentages, so fluid reasoning remains an open frontier.

1) First principles: what counts as “AGI” and “ASI”?

AGI (OpenAI Charter): highly autonomous systems that outperform humans at most economically valuable work [1].
GPAI (EU policy): a governance label for broadly usable models, with a 2025 Code of Practice guiding providers. It is useful for compliance but is not a capability threshold [2].
ASI (Bostrom): an intellect much smarter than the best human brains in practically every field. It is a philosophical yardstick, not an engineering spec [3].

Implication: “Already AGI” is credible if you adopt the economic lens; under a cognitive lens (fluid abstraction, causal learning), not yet.

2) How neuroscientists carve up the brain (and why that matters)

Brains are systems of systems, not one module.

Large-scale cortical networks (Default Mode, Frontoparietal/Executive, Salience, Attention, Sensorimotor, Visual) are reproducible across people, as shown by the 7- and 17-network parcellations [4]. The salience network helps switch between self-referential processing (DMN) and task-focused executive control [5-6].
Key subsystems with clear roles:
- Basal ganglia for action selection and reinforcement-learning-like gating across motor and cognitive sequences [7-8].
- Cerebellum for motor coordination, with contributions to cognition and affect (cf. cerebellar cognitive affective syndrome, CCAS, also called Schmahmann syndrome) [9-11].
- Thalamus as a cortical hub that shapes information flow, attention, and state [12].

These maps give us a principled template for engineering analogues.

3) The thesis, formalised: “stitch the missing pieces”

Proposition: A large model is a cortex-like pattern engine. Add long-term memory, an offline “sleep” loop, embodied I/O (eyes/ears/tools/robots), and an executive that orchestrates them, and you have practical AGI for software environments; scaled across many specialities, you approach ASI-like breadth.

Brain to engineering mapping

Brain system (coarse)	Today’s engineering analogue
Cortical networks (perception/abstraction/language)	LLM/Multimodal LLM (text, code, images, audio)
Basal ganglia (action selection, RL)	Planner/policy choosing the next tool/skill chain
Hippocampus to cortex (episodic to semantic consolidation)	Long-term memory: vector stores + knowledge graphs + retrieval
Sleep (replay; synaptic homeostasis)	Nightly replay, evaluation, pruning; light fine-tuning/LoRA [13-15]
Salience + executive networks (switching)	Orchestrator that routes tasks by uncertainty/priority and escalates
Embodiment (sensors/actuators)	Software tools now; VLA-style robot policies increasingly capable [17-20]

Why the “sleep” loop matters: Non-REM dynamics (spindles/ripples) support systems-level consolidation, and the synaptic homeostasis hypothesis suggests a global renormalisation of synaptic strength during sleep. Both are clean metaphors for nightly self-optimisation [13-15].

4) Are we there yet?

Economic lens: Tool-using systems already execute research, coding, analysis, design, and customer workflows at professional levels in many domains, which is consistent with the economic definition of AGI [1] and with the broad usability that the GPAI label targets [2].
Cognitive lens: ARC-AGI-2 explicitly targets fluid abstraction on novel tasks. Human test-takers collectively solve 100% of tasks (pass@2); frontier “reasoning” systems score in the single-digit percentages; pure LLMs score ~0% [16]. That’s a real gap.

Verdict: Near-AGI in software under an economic view; not yet on human-level fluid reasoning.
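
The pass@2 figure quoted above can be made concrete with a few lines of Python. This is a minimal sketch, assuming pass@2 means a task counts as solved if either of the first two attempts is correct; the function name `pass_at_k` and the sample results are hypothetical and are not taken from the ARC Prize tooling.

```python
from typing import Sequence


def pass_at_k(attempts_per_task: Sequence[Sequence[bool]], k: int = 2) -> float:
    """Fraction of tasks solved within the first k attempts."""
    if not attempts_per_task:
        raise ValueError("need at least one task")
    # A task counts once if any of its first k attempts is correct.
    solved = sum(any(attempts[:k]) for attempts in attempts_per_task)
    return solved / len(attempts_per_task)


# Hypothetical results: one inner list per task, one bool per attempt, in order.
results = [
    [False, True],         # solved on the second attempt: counts at k=2
    [False, False],        # never solved
    [True],                # solved on the first attempt
    [False, False, True],  # solved only on attempt 3: does not count at k=2
]

score = pass_at_k(results, k=2)
print(f"pass@2 = {score:.2f}")  # 2 of 4 tasks -> 0.50
```

Note that this is the simple "within k attempts" reading, not the sampling-based pass@k estimator used in some code-generation benchmarks.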

5) A brain-inspired architecture you can ship today

Cortex-like core: A strong multimodal model for text/code/tables/images/audio.
Executive/salience layer: An uncertainty-aware scheduler that routes tasks among agents and tools; handles mode-switching (cf. salience, executive, DMN) [5-6].
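
One way to picture the executive/salience layer is as a small router that orders work by priority and picks a handler from an uncertainty estimate. The sketch below is illustrative only: the `Task` fields, the two thresholds, and the handler names are assumptions introduced here, and a real system would need a calibrated uncertainty signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    uncertainty: float  # hypothetical scale: 0.0 = confident, 1.0 = no idea
    priority: int       # larger = more urgent


# Hypothetical thresholds; in practice these would be tuned on evaluation data.
ESCALATE_ABOVE = 0.8
DELIBERATE_ABOVE = 0.4


def route(task: Task) -> str:
    """Pick a handler name from the task's uncertainty."""
    if task.uncertainty > ESCALATE_ABOVE:
        return "human_review"      # too uncertain: escalate
    if task.uncertainty > DELIBERATE_ABOVE:
        return "planner_agent"     # slower, multi-step deliberation
    return "fast_tool_call"        # confident: take the cheap path


def schedule(tasks: list[Task]) -> list[tuple[str, str]]:
    """Order tasks by priority (highest first), then route each one."""
    ordered = sorted(tasks, key=lambda t: t.priority, reverse=True)
    return [(t.name, route(t)) for t in ordered]


queue = [
    Task("summarise report", uncertainty=0.1, priority=1),
    Task("refund dispute", uncertainty=0.9, priority=3),
    Task("debug flaky test", uncertainty=0.5, priority=2),
]
plan = schedule(queue)
for name, handler in plan:
    print(f"{name} -> {handler}")
```
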
Action selection (basal-ganglia analogue): An RL/bandit policy that picks the next tool/skill chain, with explicit stop/escalate criteria tied to business KPIs [7-8].
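
As a rough illustration of the action-selection idea, here is an epsilon-greedy bandit over tool names with an explicit stop/escalate rule. It is a sketch under stated assumptions: epsilon-greedy is just one simple bandit strategy chosen for brevity, the tool names and reward function are hypothetical, and `success_at` and `max_steps` stand in for whatever KPI-linked criteria a real deployment would define.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of tool names."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = {a: 0 for a in self.arms}
        self.values = {a: 0.0 for a in self.arms}  # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)                 # explore
        return max(self.arms, key=lambda a: self.values[a])   # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the running mean.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run_episode(bandit, reward_fn, max_steps=5, success_at=1.0):
    """Try tools until one succeeds; escalate when the step budget runs out."""
    for step in range(1, max_steps + 1):
        arm = bandit.select()
        reward = reward_fn(arm)
        bandit.update(arm, reward)
        if reward >= success_at:
            return ("done", arm, step)      # stop criterion met
    return ("escalate", None, max_steps)    # budget exhausted: hand off


def reward_fn(tool):
    # Hypothetical environment: only the "sql" tool solves this task.
    return 1.0 if tool == "sql" else 0.0


bandit = EpsilonGreedyBandit(["search", "sql", "code"], epsilon=0.2, seed=0)
outcomes = [run_episode(bandit, reward_fn) for _ in range(20)]
print(outcomes[-1], bandit.values)
```
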
Long-term memory:
- Episodic (append-only logs of interactions/outcomes)
- Semantic (distilled knowledge graph + vector index with provenance)
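
The two memory tiers can be sketched as follows: an append-only episodic log, plus a semantic index whose entries point back to the log records they were distilled from. Everything here is a toy assumption: the class names are hypothetical, the embeddings are hand-written two-dimensional vectors rather than output from a real embedding model, and storage is in-memory rather than durable.

```python
import json
import math
import time
from dataclasses import dataclass


class EpisodicLog:
    """Append-only log; a real system would use durable storage."""

    def __init__(self):
        self._events = []

    def append(self, event: dict) -> int:
        record = {**event, "id": len(self._events), "ts": time.time()}
        # Store a serialised copy so later mutation of `event` cannot alter history.
        self._events.append(json.dumps(record))
        return record["id"]

    def read(self) -> list:
        return [json.loads(e) for e in self._events]


@dataclass
class Fact:
    text: str
    embedding: list[float]
    source_event_ids: list[int]  # provenance: which log records support this fact


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticIndex:
    """Brute-force vector index; fine for a sketch, too slow at scale."""

    def __init__(self):
        self.facts = []

    def add(self, fact: Fact) -> None:
        self.facts.append(fact)

    def search(self, query_embedding, k=1) -> list:
        ranked = sorted(
            self.facts,
            key=lambda f: cosine(query_embedding, f.embedding),
            reverse=True,
        )
        return ranked[:k]


log = EpisodicLog()
e0 = log.append({"action": "crm_lookup", "outcome": "customer prefers email"})
e1 = log.append({"action": "run_tests", "outcome": "suite takes 12 minutes"})

index = SemanticIndex()
index.add(Fact("Customer prefers email contact", [1.0, 0.0], [e0]))
index.add(Fact("Test suite is slow", [0.0, 1.0], [e1]))

hits = index.search([0.9, 0.1], k=1)
print(hits[0].text, "from events", hits[0].source_event_ids)
```
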
Sleep cycle (offline self-improvement): Replay from episodic logs, then targeted re-tests, dataset pruning, light LoRA refresh, and redeploy [13-15].
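
The sleep cycle described above can be sketched as a single offline pass over the episodic log. This is a simplified illustration, not a training pipeline: the episode schema and the `retest` callback are hypothetical, and the LoRA refresh and redeploy steps are deliberately left out, since the surviving examples would still need corrected targets before any fine-tuning.

```python
def sleep_cycle(episodes, retest, max_examples=100):
    """One offline pass: replay failures, re-test them, prune, emit candidates.

    episodes: list of dicts with "prompt" (str) and "success" (bool).
    retest:   callable taking a prompt and returning True if it now passes.
    """
    # 1) Replay: select the episodes that failed online.
    failures = [e for e in episodes if not e["success"]]

    # 2) Targeted re-test: keep only failures that still fail,
    #    which filters out transient errors.
    still_failing = [e for e in failures if not retest(e["prompt"])]

    # 3) Prune: drop duplicate prompts (case/whitespace-insensitive).
    seen, pruned = set(), []
    for e in still_failing:
        key = e["prompt"].strip().lower()
        if key not in seen:
            seen.add(key)
            pruned.append(e)

    return {
        "replayed": len(failures),
        "still_failing": len(still_failing),
        # Candidates for labelling and a later light fine-tune (not done here).
        "train_candidates": pruned[:max_examples],
    }


# Hypothetical day of logged episodes.
episodes = [
    {"prompt": "parse date", "success": False},
    {"prompt": "Parse date ", "success": False},
    {"prompt": "sum column", "success": True},
    {"prompt": "join tables", "success": False},
    {"prompt": "join tables", "success": False},
]


def retest(prompt):
    # Pretend the date-parsing failure was transient and now passes.
    return prompt.strip().lower() == "parse date"


report = sleep_cycle(episodes, retest)
print(report["replayed"], report["still_failing"], len(report["train_candidates"]))
```
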
Embodiment:
- Now: safe, auditable tool-use (search, spreadsheets, CRM, code, RPA).
- Next: VLA/generalist policies (RT-2, Open-X-Embodiment, Octo, OpenVLA) show transfer and broader dexterity across platforms [17-20].
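
For the "safe, auditable tool-use" item, a minimal sketch is an allow-list wrapper that records every call attempt before executing it. The class name `AuditedToolbox` and the example tool are hypothetical; a production version would also need authentication, argument validation, and tamper-resistant log storage.

```python
class AuditedToolbox:
    """Allow-listed tools with an audit trail of every call attempt."""

    def __init__(self):
        self._tools = {}
        self.audit_log = []

    def register(self, name, fn):
        self._tools[name] = fn

    def call(self, name, **kwargs):
        # Log the attempt first, so denied and failed calls are recorded too.
        entry = {"tool": name, "args": kwargs, "status": "denied", "result": None}
        self.audit_log.append(entry)
        if name not in self._tools:
            raise PermissionError(f"tool not allow-listed: {name}")
        try:
            entry["result"] = self._tools[name](**kwargs)
            entry["status"] = "ok"
        except Exception as exc:
            entry["status"] = f"error: {exc}"
            raise
        return entry["result"]


toolbox = AuditedToolbox()
toolbox.register("add", lambda a, b: a + b)  # stand-in for a real tool

total = toolbox.call("add", a=2, b=3)

try:
    toolbox.call("delete_database")
except PermissionError as err:
    denied_message = str(err)

for entry in toolbox.audit_log:
    print(entry["tool"], entry["status"])
```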

6) What this buys you (and what it doesn’t)

Strengths today

Breadth: With tools + memory, one stitched agent can complete many white-collar workflows end-to-end.
Multi-specialist behaviour: A single system can perform competently across multiple professions, which is rare for an individual human.

Gaps to close

Fluid abstraction on novel tasks: Still a differentiator for humans (ARC-AGI-2) [16].
Causal/world modelling: Active research; embodiment likely improves robustness [17-20].
Reliability and governance: Capabilities rise faster than controls; track GPAI-style obligations if operating at scale [2].

7) A pragmatic AGI scorecard (use both lenses)

Economic metrics

% of core tasks at human-parity or better
Profit per compute-hour versus fully loaded human cost for the same work
Time-to-competence on a new but typical workflow

Cognitive metrics

ARC-AGI-2 pass rate (zero-/few-shot) + sample efficiency [16]
Out-of-distribution tool use; transfer to new APIs without schema hints
Causal/abductive reasoning batteries from cognitive science

If you clear both, you’ve got a credible “AGI” under most reasonable definitions.
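
The two-lens scorecard can be expressed as a small data structure that reports which lens a system clears. This is a sketch only: the field names are hypothetical, it covers just a subset of the metrics listed above, and every threshold in the example is an illustrative assumption rather than a recommended cut-off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scorecard:
    tasks_at_parity: int            # core tasks at human parity or better
    tasks_total: int
    profit_per_compute_hour: float
    human_cost_per_hour: float      # fully loaded
    arc_agi2_pass_rate: float       # 0.0 to 1.0

    @property
    def parity_share(self) -> float:
        return self.tasks_at_parity / self.tasks_total

    @property
    def economic_ratio(self) -> float:
        return self.profit_per_compute_hour / self.human_cost_per_hour

    def clears_economic(self, min_parity_share: float, min_ratio: float) -> bool:
        return self.parity_share > min_parity_share and self.economic_ratio >= min_ratio

    def clears_cognitive(self, min_pass_rate: float) -> bool:
        return self.arc_agi2_pass_rate >= min_pass_rate

    def verdict(self, min_parity_share, min_ratio, min_pass_rate) -> str:
        econ = self.clears_economic(min_parity_share, min_ratio)
        cog = self.clears_cognitive(min_pass_rate)
        if econ and cog:
            return "both"
        if econ:
            return "economic only"
        if cog:
            return "cognitive only"
        return "neither"


# Hypothetical system and hypothetical thresholds.
card = Scorecard(
    tasks_at_parity=42,
    tasks_total=60,
    profit_per_compute_hour=12.0,
    human_cost_per_hour=8.0,
    arc_agi2_pass_rate=0.05,
)
result = card.verdict(min_parity_share=0.5, min_ratio=1.0, min_pass_rate=0.6)
print(result)  # economic only
```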

Conclusion: Treat AGI as an integration milestone

Stop asking if a single model is a person. Ask whether your stitched system (with memory, offline replay, and action) shows general, reliable competence across open-ended tasks. In software, we’re close. In fluid reasoning and physical autonomy, we’re not done.

For builders: prioritise orchestration, memory hygiene, offline replay, rigorous evaluation, and safety envelopes over model monotheism. Don’t leave “part of a brain on the table”.

References

Definitions and policy

[1] OpenAI Charter (2018–): “AGI… highly autonomous systems that outperform humans at most economically valuable work.”
[2] European Commission (2025). General-Purpose AI (GPAI) Code of Practice (policy page, 10 July 2025).
[3] Bostrom, N. (2014). Superintelligence: Paths, Dangers, Strategies. Oxford Univ. Press.

Brain systems and networks

[4] Yeo, B.T.T. et al. (2011). The organisation of the human cerebral cortex by intrinsic functional connectivity. J. Neurophysiol., 106, 1125-1165.
[5] Menon, V. (2023). Large-scale brain networks and the triple-network model. Neuron, 111, 1776-1799.
[6] Sridharan, D., Levitin, D.J., Menon, V. (2008). Right fronto-insular cortex in switching between executive and default-mode networks. PNAS, 105, 12569-12574.
[7] Maia, T.V., Frank, M.J. (2011). From RL models to psychiatric and neurological disorders. Nat. Neurosci., 14, 154-162.
[8] Jin, X., Costa, R.M. (2019). Basal ganglia in action sequence learning and performance. Neurosci. Biobehav. Rev., 94, 219-227.
[9] Buckner, R.L. (2013). The cerebellum and cognitive function. Neuron, 80, 807-815.
[10] Argyropoulos, G.P.D. et al. (2019). The Cerebellar Cognitive Affective/Schmahmann Syndrome. The Cerebellum, 19, 102-125.
[11] Stoodley, C.J. (2018). Functional topography of the human cerebellum. Prog. Neurobiol., 168, 1-58.
[12] Halassa, M.M., Sherman, S.M. (2019). Thalamocortical circuit motifs: a general framework. Neuron, 103, 762-770.

Sleep, consolidation and replay

[13] Klinzing, J.G., Niethard, N., Born, J. (2019). Mechanisms of systems memory consolidation during sleep. Nat. Neurosci., 22, 1598-1610.
[14] Foster, D.J. (2017). Replay in the medial temporal lobe. Nat. Neurosci., 20, 152-158.
[15] Wilson, M.A., McNaughton, B.L. (1994). Reactivation of hippocampal ensembles during sleep. Science, 265, 676-679.

Benchmarks

[16] ARC Prize Foundation (2025). ARC-AGI-2 (official benchmark page and summary).

Embodiment and VLA robotics

[17] Brohan, A. et al. (2023). RT-2: Vision-Language-Action models transfer web knowledge to robotic control. arXiv:2307.15818 / PMLR 2023.
[18] Open X-Embodiment Collaboration (2023-2024). Open-X-Embodiment datasets and RT-X models. arXiv:2310.08864 / OpenReview 2024.
[19] Octo Model Team (2024). Octo: An open-source generalist robot policy. RSS 2024 + project materials.
[20] Kim, M.J. et al. (2024). OpenVLA: An open-source Vision-Language-Action model. arXiv:2406.09246 (rev. 2024).