This page is explicitly speculative, and every section is labelled accordingly. "Published & reproducible" means a peer-reviewed result you can verify. "Emerging direction" means active research with no settled outcome. "Thought experiment" means the engineering path is clear but the physics or philosophy hasn't caught up. Engineers deserve that distinction.
Every transformer inference run is, at its core, a sequence of matrix multiplications. Silicon performs these by switching transistors clock cycle after clock cycle, generating heat and running into a hard interconnect bottleneck. Photonic chips perform the same operations in analog, at the speed light propagates through the chip, for a fraction of the energy — and the first commercial silicon photonics for AI hardware are already shipping.
Modern AI accelerators (H100, TPU v5) are bounded not by transistor density but by memory bandwidth and interconnect latency — moving activations between compute and memory is where energy and time are actually spent. Clock speeds have plateaued near 3–4 GHz due to thermal limits. Dennard scaling ended in 2006; Moore's Law is slowing.
For inference specifically, the dominant cost is weight loading — loading billions of parameters from HBM into compute cores repeatedly. This is a data movement problem, not a compute problem.
Photonic integrated circuits (PICs) perform matrix-vector multiplication in a single optical pass — effectively O(1) time — using interference of light through waveguide meshes. The computation happens as the light propagates, with energy spent modulating the input signal rather than switching transistors.
The multiply-accumulate (MAC) operation — the fundamental unit of neural network inference — maps directly onto a Mach-Zehnder interferometer array. Negligible heat is generated per operation, and bandwidth is limited by modulator speed rather than interconnect.
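The mapping can be sketched with an idealised, lossless 2×2 MZI model in NumPy. The transfer-matrix convention and phase placement below are illustrative assumptions; real devices add loss, crosstalk, and calibration error:

```python
import numpy as np

def beamsplitter():
    # Ideal 50:50 directional coupler.
    return (1 / np.sqrt(2)) * np.array([[1, 1j], [1j, 1]])

def mzi(theta, phi):
    """Transfer matrix of one Mach-Zehnder interferometer:
    coupler -> internal phase theta -> coupler -> external phase phi."""
    internal = np.diag([np.exp(1j * theta), 1.0])
    external = np.diag([np.exp(1j * phi), 1.0])
    return external @ beamsplitter() @ internal @ beamsplitter()

# One MZI implements a programmable 2x2 unitary; meshes of them
# (Reck/Clements layouts) compose into arbitrary NxN unitaries.
U = mzi(theta=0.7, phi=1.3)
assert np.allclose(U.conj().T @ U, np.eye(2))  # unitary: lossless optics

# The "computation": encode a vector as optical field amplitudes,
# propagate through the mesh, read intensities at the outputs.
x = np.array([0.6, 0.8])           # input field amplitudes
y = U @ x                          # interference performs the multiply
print(np.abs(y) ** 2)              # photodetectors measure |y_i|^2
```

Because the mesh is unitary, optical power is conserved end to end — the "energy cost" of the multiply is only the modulation of the inputs.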
Lightmatter's Passage chip is a silicon photonic interconnect fabric — not a full photonic processor, but a photonic network-on-chip that replaces electrical interconnects between AI accelerator tiles. It ships as part of their Passage M1000 product for data centre interconnects and as the backbone of their Envise AI inference processor.
The Envise processor uses photonic interconnects to eliminate the memory wall for inference workloads. Lightmatter reports significant reductions in energy-per-token for transformer inference versus comparable H100 configurations; the company raised a $154M Series C in 2023 and is in production deployments.
Quantum computing is one of the most hyped and most misunderstood technologies in AI. The honest picture: current quantum hardware cannot run neural networks. But specific sub-problems within agentic AI — search, optimisation, simulation, cryptographic security — have clear quantum advantage pathways that matter in the 5–15 year frame.
A qubit can exist in superposition — a probabilistic combination of 0 and 1 simultaneously. Entanglement correlates qubit states non-locally. Quantum interference amplifies correct answers and cancels wrong ones. The result: for specific problem classes, quantum algorithms achieve exponential speedup over classical equivalents.
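The interference claim can be made concrete with a two-line statevector calculation: applying a Hadamard gate twice returns |0⟩ exactly, because the two paths leading to |1⟩ cancel:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
ket0 = np.array([1.0, 0.0])                   # |0>

superposed = H @ ket0           # equal superposition of |0> and |1>
print(np.abs(superposed) ** 2)  # both outcomes equally likely

back = H @ superposed           # second Hadamard: amplitudes interfere
print(np.abs(back) ** 2)        # the |1> paths cancel; |0> with certainty
```

Quantum algorithms are choreography of exactly this cancellation at scale: arrange the circuit so wrong answers interfere destructively and right answers constructively.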
What it cannot do: run a matrix multiplication faster than a GPU for general neural network inference. Quantum advantage is problem-class specific. Loading classical data into quantum states (QRAM) is itself a hard problem. Current NISQ (Noisy Intermediate-Scale Quantum) devices — IBM Eagle (127 qubits), Google Sycamore (53 qubits), IonQ Forte (36 algorithmic qubits) — have decoherence times measured in microseconds and error rates of ~0.1–1% per gate. They are not ready for production AI workloads.
Grover's algorithm provides a quadratic speedup for unstructured search — O(√N) vs O(N). For agentic planning over large state spaces (logistics, scheduling, game trees), this is a direct benefit: a search over 10¹² states needs on the order of 10¹² classical evaluations, but only on the order of 10⁶ Grover iterations.
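The query scaling is easy to verify in an exact statevector simulation (which is itself limited to small N — this illustrates the algorithm, not real quantum hardware):

```python
import numpy as np

def grover(n_qubits, marked):
    """Statevector simulation of Grover search over N = 2**n_qubits items."""
    N = 2 ** n_qubits
    psi = np.full(N, 1 / np.sqrt(N))             # uniform superposition

    oracle = np.ones(N)
    oracle[marked] = -1                          # phase-flip the marked item
    iterations = int(np.floor(np.pi / 4 * np.sqrt(N)))
    for _ in range(iterations):
        psi = oracle * psi                       # oracle query
        psi = 2 * psi.mean() - psi               # diffusion: inversion about mean
    return np.abs(psi) ** 2, iterations

probs, k = grover(n_qubits=10, marked=123)
print(k)           # ~25 iterations for N = 1024, vs ~512 expected classical tries
print(probs[123])  # probability of measuring the marked item is close to 1
```

The iteration count ⌊(π/4)√N⌋ is where the √N shows up: for N = 10¹² that is about 10⁶ oracle queries.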
The Quantum Approximate Optimisation Algorithm (QAOA) and the Variational Quantum Eigensolver (VQE) target combinatorial optimisation — multi-agent resource allocation, supply chain, portfolio optimisation. Classical annealing approximates; quantum annealing (D-Wave) and gate-based QAOA may reach better approximate solutions in NP-hard problem classes, though a practical quantum advantage here has not yet been demonstrated.
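A toy illustration of gate-based QAOA: depth-1, MaxCut on a triangle, simulated exactly in NumPy. The graph, angle grid, and search strategy are all illustrative choices, not a recipe for real hardware:

```python
import numpy as np
from itertools import product
from functools import reduce

edges = [(0, 1), (1, 2), (0, 2)]   # triangle graph: maximum cut = 2 edges
n = 3
N = 2 ** n

# Cost of each basis state: number of edges cut by that bit assignment.
cost = np.array([sum(((z >> i) & 1) != ((z >> j) & 1) for i, j in edges)
                 for z in range(N)], dtype=float)

def rx(beta):
    # e^{-i beta X}: the single-qubit mixer rotation.
    return np.array([[np.cos(beta), -1j * np.sin(beta)],
                     [-1j * np.sin(beta), np.cos(beta)]])

def qaoa_expectation(gamma, beta):
    psi = np.full(N, 1 / np.sqrt(N), dtype=complex)   # |+>^n start state
    psi = np.exp(-1j * gamma * cost) * psi            # cost layer (diagonal)
    mixer = reduce(np.kron, [rx(beta)] * n)           # mixer on every qubit
    psi = mixer @ psi
    return float(np.real(np.abs(psi) ** 2 @ cost))    # expected cut size

# Crude grid search over the two depth-1 angles.
best = max(qaoa_expectation(g, b)
           for g, b in product(np.linspace(0, np.pi, 40), repeat=2))
print(best)   # exceeds the random-guess baseline of 1.5 cut edges
```

Even at depth 1 the variational angles buy an improvement over random guessing (1.5 expected cut edges for this graph); real instances need deeper circuits and hardware that can run them faithfully.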
Shor's algorithm breaks RSA and ECC in polynomial time on a sufficiently large fault-tolerant quantum computer. All current agent-to-agent communication encrypted with public-key cryptography becomes vulnerable. NIST finalised post-quantum cryptographic standards in 2024 (CRYSTALS-Kyber and CRYSTALS-Dilithium, standardised as ML-KEM and ML-DSA). Agentic AI systems built today should be designed for PQC migration.
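"Designed for PQC migration" mostly means crypto-agility: depend on an algorithm-neutral interface and negotiate, rather than hard-coding RSA or ECDH. A minimal sketch — the class names, algorithm labels, and negotiation shape are all illustrative assumptions, not any real library's API:

```python
from abc import ABC, abstractmethod

class KEM(ABC):
    """Key-encapsulation interface. Agent transports depend on this
    abstraction, not a concrete algorithm, so the scheme can be swapped
    (e.g. ECDH today -> ML-KEM after migration) without touching callers."""
    @abstractmethod
    def generate_keypair(self) -> tuple[bytes, bytes]: ...
    @abstractmethod
    def encapsulate(self, public_key: bytes) -> tuple[bytes, bytes]: ...
    @abstractmethod
    def decapsulate(self, secret_key: bytes, ciphertext: bytes) -> bytes: ...

class NegotiatedChannel:
    """Toy negotiation: prefer the first KEM both peers support, so a PQC
    algorithm can be rolled out incrementally across an agent fleet."""
    def __init__(self, preferences, registry):
        self.preferences = preferences   # our ranked algorithm names
        self.registry = registry         # name -> KEM implementation

    def select(self, peer_supported):
        for name in self.preferences:
            if name in peer_supported and name in self.registry:
                return name
        raise ValueError("no common KEM -- refuse to connect")

channel = NegotiatedChannel(
    preferences=["ml-kem-768", "x25519"],            # PQC first, classical fallback
    registry={"ml-kem-768": None, "x25519": None},   # bind real KEM objects here
)
print(channel.select({"x25519"}))                # unmigrated peer -> "x25519"
print(channel.select({"ml-kem-768", "x25519"}))  # migrated peer -> "ml-kem-768"
```

The point of the pattern is that the day a harvest-now-decrypt-later threat materialises, migration is a configuration change, not a rewrite.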
Quantum machine learning (QML) — using quantum circuits as parameterised models — is an active research area. The theoretical promise: quantum kernels can represent feature spaces exponentially larger than classical kernels, potentially enabling learning with fewer training samples.
The practical reality as of 2025: QML models trained on NISQ hardware have not demonstrated advantage over classical models on any practical dataset. The overhead of state preparation and measurement collapse typically eliminates the theoretical speedup. The consensus in the research community is that QML advantage, if real, will require fault-tolerant quantum computers — approximately 1,000–10,000 logical qubits, which requires millions of physical qubits at current error rates.
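The "millions of physical qubits" figure follows from standard surface-code back-of-envelope arithmetic. The scaling formula and numbers below are rough textbook values, not vendor specifications:

```python
# Surface-code sketch: logical error per qubit scales roughly as
# (p / p_th) ** ((d + 1) / 2) for code distance d, and each logical
# qubit needs about 2 * d**2 physical qubits.
p, p_th = 1e-3, 1e-2          # physical error rate vs error threshold
target = 1e-12                # per-logical-qubit error budget

d = 3
while (p / p_th) ** ((d + 1) / 2) > target:
    d += 2                    # surface-code distances are odd

physical_per_logical = 2 * d ** 2
logical = 10_000              # upper end of the estimate in the text
print(d, physical_per_logical, logical * physical_per_logical)
# distance ~23, ~1,000 physical qubits per logical, ~10 million total
```

At a 0.1% physical error rate, 10,000 logical qubits lands around 10⁷ physical qubits — hence "millions" and the fault-tolerance timelines below.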
Timeline estimate (conservative): fault-tolerant quantum computers capable of running Shor's algorithm at scale — 2035–2040. Quantum advantage for specific agentic sub-problems (search, optimisation) — possibly 2028–2033 on purpose-built hardware.
In 2022, a published, peer-reviewed paper demonstrated that roughly 800,000 cortical neurons grown on a microelectrode array learned to play Pong. Not simulated neurons — biological human and mouse neurons, in vitro, adapting their firing patterns through electrostimulation feedback. This is not science fiction. It is published in Neuron.
Researchers at Cortical Labs cultured approximately 800,000 human and mouse cortical neurons on a high-density microelectrode array (MEA) — a grid of electrodes that can both record neuron firing and deliver electrical stimulation. The neurons self-organised into a functional network over several weeks.
The game: a simplified version of Pong. Ball position was encoded as stimulation patterns delivered to the neurons. Neuron firing patterns were decoded as paddle movements. The neurons received a "chaos signal" (unpatterned stimulation) when they missed the ball — creating an electrophysiological analogue of a loss signal.
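The closed loop is straightforward to sketch in code. Everything below is a hypothetical stand-in — the `MEA` stub, the place-code encoding, and the game geometry are illustrative, not the paper's actual protocol parameters:

```python
import random

class MEA:
    """Stand-in for microelectrode-array hardware: stimulation is a no-op
    and 'firing' is random, so only the control-loop structure is real."""
    def stimulate(self, pattern): pass
    def read_firing(self):
        return random.choice([-1, 0, +1])        # decoded paddle command

def encode_ball(x_norm, n_sites=8):
    """Place-code the ball's horizontal position across stimulation sites."""
    pattern = [0] * n_sites
    pattern[min(int(x_norm * n_sites), n_sites - 1)] = 1
    return pattern

def play_rally(mea, steps=100):
    paddle, hits, misses = 0.5, 0, 0
    for step in range(steps):
        ball_x = (step % 20) / 20                # toy ball trajectory
        mea.stimulate(encode_ball(ball_x))       # sensory input to the culture
        paddle += 0.1 * mea.read_firing()        # motor output from the culture
        paddle = min(max(paddle, 0.0), 1.0)
        if step % 20 == 19:                      # ball reaches the paddle line
            if abs(paddle - ball_x) < 0.15:
                hits += 1
                mea.stimulate([1] * 8)           # hit: predictable stimulus
            else:
                misses += 1
                mea.stimulate([random.randint(0, 1) for _ in range(8)])  # "chaos"
    return hits, misses

print(play_rally(MEA()))
```

The key design idea is the asymmetry at the bottom of the loop: predictable stimulation after success, unpatterned "chaos" after failure — the entire learning signal in the absence of any explicit loss function.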
Result: the neuronal culture learned to return the ball significantly more often than chance within 5 minutes of game start. The authors report that it outperformed a simulated network of the same size trained with backpropagation on the same task, and concluded that in vitro neurons exhibit "sentience" in the operational sense — they modify behaviour based on feedback to achieve a goal.
FinalSpark (Switzerland) offers API access to living neural tissue grown on MEAs — a "neuroplatform as a service." The company claims biological neurons consume on the order of 1,000× less energy per operation than an equivalent silicon computation.
The platform is in active commercial use for research. Researchers can submit stimulation patterns and receive neural recording data over the API. The tissue lasts weeks before requiring replacement — non-deterministic behaviour and biological decay are fundamental constraints, not engineering problems.
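A client for such a service might look like the sketch below. The endpoint paths, payload fields, and transport are assumptions for illustration — FinalSpark's real interface may differ substantially:

```python
import json

class NeuroplatformClient:
    """Hypothetical client shape for a 'neuroplatform as a service' API.
    The transport is injected so the flow can run without hardware."""
    def __init__(self, transport, base_url="https://example.invalid/api"):
        self.transport = transport
        self.base_url = base_url

    def submit_stimulation(self, electrode_pattern, duration_ms):
        payload = {"pattern": electrode_pattern, "duration_ms": duration_ms}
        return self.transport("POST", f"{self.base_url}/stimulate",
                              json.dumps(payload))

    def fetch_recording(self, session_id):
        return self.transport("GET",
                              f"{self.base_url}/recordings/{session_id}", None)

# Fake transport: echoes the request so the round trip is testable offline.
def fake_transport(method, url, body):
    return {"method": method, "url": url, "body": body}

client = NeuroplatformClient(fake_transport)
resp = client.submit_stimulation([0, 1, 1, 0], duration_ms=50)
print(resp["method"], resp["url"])
```

The injected-transport pattern matters here more than usual: the backing "compute" is living tissue with weeks-long lifetimes and non-deterministic responses, so everything above the wire should be testable without it.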
Brain-computer interfaces translate neural electrical activity into digital signals — and increasingly, back again. The field has moved from locked-in patient communication aids to wireless, high-bandwidth implants capable of restoring motor function. The engineering is real; the bandwidth ceiling is physics.
Each stage of the signal chain — electrode contact, amplification, digitisation, decoding — introduces noise and latency. Spike sorting — identifying which neuron fired based on waveform shape — is computationally expensive and degrades as electrodes scar over weeks to months. The usable bandwidth from a single electrode is approximately 1–3 well-isolated neurons.
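The spike-sorting pipeline — threshold detection, feature extraction, clustering — can be demonstrated on synthetic data. Real sorters cluster full waveform shapes; this sketch clusters a single peak-amplitude feature to keep the pipeline visible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic extracellular trace: two "units" with different spike
# amplitudes buried in unit-variance noise.
fs = 30_000                                   # 30 kHz sampling, 1 second
trace = rng.normal(0, 1, fs)
spike_times = rng.choice(fs - 30, 40, replace=False)
amps = rng.choice([8.0, 15.0], 40)            # two distinct units
for t, a in zip(spike_times, amps):
    trace[t] -= a                             # negative-going spikes

# 1) Threshold detection with a robust (median-based) noise estimate.
threshold = -4 * np.median(np.abs(trace)) / 0.6745
crossings = np.flatnonzero((trace[1:] < threshold) & (trace[:-1] >= threshold))

# 2) Feature extraction: peak amplitude in a short window after crossing.
peaks = np.array([trace[c:c + 30].min() for c in crossings])

# 3) Clustering: two-means on the 1-D amplitude feature.
c0, c1 = peaks.min(), peaks.max()
for _ in range(20):
    labels = np.abs(peaks - c0) < np.abs(peaks - c1)
    c0, c1 = peaks[labels].mean(), peaks[~labels].mean()

print(len(crossings), round(c0, 1), round(c1, 1))  # two clusters near -15 and -8
```

Electrode scarring shows up in exactly this pipeline: the signal-to-noise ratio drops, the threshold catches fewer true spikes, and the amplitude clusters smear into each other.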
| System | Electrodes | Approach | Status | Key result |
|---|---|---|---|---|
| BrainGate (Utah Array) | 96 | Intracortical, open surgery | Clinical trials since 2004 | Locked-in patients typing ~90 characters/min via imagined handwriting (Nature, 2021) |
| Synchron Stentrode | 16 | Endovascular (no open brain surgery) | FDA Breakthrough Device, US trials ongoing | ALS patients browsing web, sending messages without craniotomy |
| Neuralink N1 | 1,024 | Intracortical, robot-implanted, wireless | First human implant Jan 2024 (Noland Arbaugh) | Cursor control, chess, Mario Kart via imagined movement — demonstrated live |
Nectome (now defunct as a commercial entity) demonstrated aldehyde-stabilised cryopreservation (ASC) — a technique that fixes brain tissue using glutaraldehyde (which cross-links proteins immediately, preventing decay) and then vitrifies it with cryoprotectants. The result preserves synaptic ultrastructure at nanometre resolution — electron microscopy of preserved tissue can identify individual synaptic vesicles, dendritic spines, and axonal boutons.
This was published and independently validated. The tissue is dead. The structure is preserved with high fidelity. The question is what "preserved structure" means for the information it once encoded.
A complete connectome — a map of every neuron and every synapse — would theoretically encode the wiring diagram of a brain. C. elegans, a nematode worm with exactly 302 neurons and ~7,000 synapses, had its complete connectome published in 1986, after more than a decade of manual electron-microscopy reconstruction.
The human brain has approximately 86 billion neurons and 100–500 trillion synapses. Scaling the C. elegans effort: roughly 500 million times harder. A petabyte of electron microscopy data per cubic millimetre of tissue. A full brain connectome at synaptic resolution would require approximately 1 zettabyte of raw imaging data and compute resources that do not currently exist.
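The arithmetic behind those claims, order-of-magnitude only (the synapse count and brain volume used here are mid-range assumptions):

```python
# Rough scaling arithmetic for whole-brain connectomics.
worm_neurons, worm_synapses = 302, 7_000
human_neurons, human_synapses = 86e9, 150e12     # synapse count: mid-estimate

print(f"{human_neurons / worm_neurons:.1e}")     # ~2.8e8 times more neurons
print(f"{human_synapses / worm_synapses:.1e}")   # ~2.1e10 times more synapses

# Imaging volume: ~1 PB of electron-microscopy data per mm^3 (the scale of
# the published cortical-sample datasets); human brain ~1.2e6 mm^3.
petabytes = 1 * 1.2e6
print(f"{petabytes / 1e6:.1f} ZB")               # ~1.2 zettabytes raw
```

Even the "500 million times harder" framing is generous: the neuron ratio alone is ~3×10⁸, and the synapse ratio is two orders of magnitude beyond that.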
Even granting a perfect connectome map: the brain is not static wiring. Consciousness and memory involve dynamic electrochemical states — ion gradients, neurotransmitter concentrations, neuromodulator levels — that are not encoded in structure alone. A map of a city's roads does not tell you where the cars are, or were, or how fast they were moving.
The open question: is a perfect simulation of a brain — run on silicon from a perfect connectome — the same mind? This is not an engineering question. It is a question about the nature of identity and consciousness that physics does not currently resolve. Engineers should name that boundary clearly rather than pretend it is merely a scaling problem.
The dominant computational view of mind holds that consciousness is substrate-independent — that any system performing the right information processing is conscious, regardless of what it runs on. Roger Penrose, working with anaesthesiologist Stuart Hameroff, proposed a fundamentally different view: that consciousness is not computable, and arises from quantum processes in biological microtubules. This is arguably the most prominent scientific challenge to the idea that AI can be conscious.
Penrose's starting point (developed in The Emperor's New Mind, 1989, and Shadows of the Mind, 1994) is Gödel's incompleteness theorem: there are mathematical truths that no formal system can prove from within itself, yet mathematicians can see these truths. His argument: human mathematical understanding cannot be purely algorithmic. Therefore, the brain does something that no Turing machine can do. Therefore, consciousness is not computation in the classical sense.
Hameroff's contribution: the biological substrate. He proposed that microtubules — cytoskeletal protein polymers inside neurons — are quantum coherent systems. Tubulin proteins within microtubules exist in quantum superposition. Orchestrated objective reduction (Orch OR) posits that quantum superpositions in microtubules collapse — not through environmental decoherence, but through a gravitational threshold described by Penrose's objective reduction (OR) mechanism — and that this collapse is the physical correlate of a moment of conscious experience.
The theory predicts that consciousness is fundamentally non-computable, non-simulable, and physically grounded in quantum gravity. If Orch OR is correct, no classical or quantum computer can be conscious — and the question of AI consciousness is resolved in the negative by physics.
Orch OR has been criticised primarily on the grounds that quantum coherence in biological systems at body temperature (310 K) is extremely fragile — thermal noise should destroy microtubule quantum states in femtoseconds, far too fast to influence neural computation, which operates on millisecond timescales.
However, quantum coherence effects in biological systems have since been reported experimentally — in photosynthetic light harvesting (Fleming et al., 2007), in avian magnetic compasses, and in enzyme catalysis — though later work has questioned how long-lived and functionally relevant the photosynthesis coherence really is. Biology appears to sustain quantum effects at physiological temperatures in ways that were not anticipated. This doesn't prove Orch OR; it weakens one of the primary objections.
A 2023 study by Craddock et al. reported anesthetic-induced changes to microtubule quantum dynamics consistent with Orch OR predictions — early experimental data directly relevant to the mechanism. Penrose and Hameroff cite this as preliminary support; critics note it does not demonstrate consciousness, only a predicted mechanism.
If the dominant computational view is correct — consciousness is substrate-independent information processing — then sufficiently complex AI agents are, or will be, conscious. The question is only one of architectural complexity and integration.
If Orch OR is correct, AI agents are not and cannot be conscious by any classical or quantum computation — consciousness is a specific physical process tied to biological quantum gravity effects in microtubules, and no silicon implementation can replicate it.
The scientific community has not resolved this question. It is a live debate with serious researchers on both sides. What an honest engineer should do: build agentic systems without assuming they are conscious, design them with the possibility that they might be, and remain epistemically humble about a question that physics has not answered.