SOMACH — THE FULL JOURNEY MINERVA UNIVERSITY
Capstone '25–'26
Carl Kho

Education Disguised as a Project

THE FULL JOURNEY

From designer to engineer in 12 months.
Spring 2025 (San Francisco) → Spring 2026 (Hyderabad)

5 Phases • 3 arXiv Papers • 4,033 CSVs • 24 Blog Posts • $40 Hardware
01 — THE SPARK

"I was building interfaces for everyone but myself."

In 2023 at Northeastern's ReGame XR Lab, I built eye-tracking games for non-speaking autistic children — Ice Cream Peers, Sugar Slay. I bridged intent→action for others.

Then the realization: "If I can do this for them, why not for my own ADHD?"

"The cobbler's children have no shoes."

2022: Made a YouTube sponsorship video. Got ₱300,000 from the city mayor. Got into Minerva. Completed UWashington's Neural Engineering course online.

🎮

ReGame XR Lab

Eye-tracking games for non-speaking autistic children

Northeastern University, 2023

02 — THE DREAM

Thoughts → Text

The original, naive vision.

The Dream

Silent, hands-free task offloading. Transcribe inner speech — the voices in your head. Inspired by Stanford/Meta Brain2Text research and Nieto et al.'s "Thinking Out Loud" Nature paper.

The Confidence

"It converts your thoughts into text. I was pretty confident." — Pitch deck, early 2025. The Ed Sheeran demo had investors nodding. But the science was about to humble me.

03 — THE FOUNDATION

San Francisco: Building the Muscles

Spring 2025 — Every project that taught me what I'd need.

Nieto et al. Inner Speech PR

Pull request to Inner Speech Dataset repo. Built TUI + GUI for EEG classification. ~50% accuracy (chance=25%). First hands-on BCI code.

Neurosity Crown SDK

React web app for recording EEG via $200 headset. Tried kinesis predictions. Result: couldn't tell up from down (50/50).

EEG → Playwright

Microsoft VLM pipeline: EEG → browser automation via StageHand + OmniParser + AgentQL. First attempt at BCI-to-web.

Look Ma No Hands 🏆

Voice→browser via Gemini Multimodal Live API. Won W&B Hackathon.

Un-Normie Inator 🏆

Gemini Live API crypto onboarding. Won Coinbase Hackathon.

Open Source PRs

Hyperbolic Labs PR#42, OpenAI Cookbook Whisper PR#1271. Building credibility in the ecosystem.

Cloud Builder Ambassador Stellar PH 100 Under 30 YCA Startup School Cerebras Fellowship
04 — THE PIVOT

THE WALL

Neurosity Crown kinesis: 50/50. Literally a coin flip for up vs. down.

Stanford HAI told me: "You need a $3,000 clinical-grade headset, not a $200 consumer one."

I was becoming the next Theranos — investors believed me but I had nothing to show.

THE WAY OUT

MUSCLES
NOT
BRAINS

"I stopped trying to read the brain. I started reading the muscles." Advisor Watson said: "Put your head down, try to make it work. If it fails, pivot to something downstream."

05 — PHASE 1: MOVEMENT

Walk. Jump. Punch.

September 2025 — Taipei, Taiwan

Android phone strapped to shin → UDP → pynput → Hollow Knight/Silksong. First Android app I ever built. Learned Kotlin, UDP sockets, and Android Studio from scratch.
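A minimal sketch of the receiving end of the phone → UDP → key-press path. The packet format (plain ASCII gesture names), the `KEYMAP` bindings, and the function names are assumptions for illustration, not the project's actual protocol. The `press` callback is where a real key injector would go, for example the `press` method of pynput's `keyboard.Controller`.

```python
import socket
from typing import Callable, Optional

# Hypothetical gesture-to-key bindings; the real mapping is not given in the deck.
KEYMAP = {"WALK": "d", "JUMP": "z", "PUNCH": "x"}


def handle_packet(payload: bytes, press: Callable[[str], None]) -> Optional[str]:
    """Decode one UDP datagram and fire the mapped key, if there is one."""
    gesture = payload.decode("ascii", errors="ignore").strip().upper()
    key = KEYMAP.get(gesture)
    if key is not None:
        press(key)
    return key


def serve(sock: socket.socket, press: Callable[[str], None], max_packets: int) -> int:
    """Read up to max_packets datagrams from a bound UDP socket.

    Returns how many of them mapped to a key press.
    """
    handled = 0
    for _ in range(max_packets):
        payload, _addr = sock.recvfrom(1024)
        if handle_packet(payload, press) is not None:
            handled += 1
    return handled
```

Keeping the key injector behind a callback means the decoding logic can be exercised without a game window or a display attached.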

Key discovery: "The Data Dictates the Design." Every human calibrates differently — Minimalists, Maximalists, Asymmetrics. Built personalized calibration system.

5 Loom Demos • 100 Hz Sampling • <25 ms Latency

IT WORKS — Phase 1 Final Demo

06 — PHASE 2: SMARTWATCH

(Re)Building at Midnight

October–November 2025 — Taipei

Pixel Watch → phone sensor fusion. Built 2nd Android app (hold-to-record + haptic alerts). 86.7% accuracy wasn't enough — rebuilt from scratch.

WhisperX Disaster: 244 detected "jumps," but only 7 usable training samples. Aligning spoken labels to motion timing was fundamentally broken.

The Button Grid Fix: Pivoted to press-to-label app. 780 labeled samples. Binary walking: 94%. Multi-class: 57%.
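The appeal of press-to-label is that every button press carries its own timestamp, so a training window can be cut directly around it. Below is a minimal sketch of that idea. The sampling rate, the window bounds (`pre_s`, `post_s`), and the function name are assumptions for illustration, not the app's actual parameters.

```python
from typing import List, Sequence, Tuple


def cut_windows(
    samples: Sequence[float],
    presses: Sequence[Tuple[float, str]],
    rate_hz: float = 100.0,
    pre_s: float = 0.2,
    post_s: float = 0.8,
) -> List[Tuple[str, List[float]]]:
    """Cut one fixed-length window around each (time_s, label) button press.

    `samples` is a single sensor channel that starts at t = 0.
    Presses whose window would run past either end of the recording are skipped.
    """
    pre = round(pre_s * rate_hz)
    post = round(post_s * rate_hz)
    windows = []
    for time_s, label in presses:
        center = round(time_s * rate_hz)
        start, stop = center - pre, center + post
        if start < 0 or stop > len(samples):
            continue  # incomplete window, drop it
        windows.append((label, list(samples[start:stop])))
    return windows
```

Because the label and the timestamp come from the same event, there is no separate speech-to-motion alignment step to get wrong.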

CNN-LSTM Architecture

1D Conv → BiLSTM → Dense. 94% binary accuracy. 10 Jupyter notebooks. SVM: 94.8% on held-out test set.

ACCURACY PROGRESSION

Voice Labels

57%

Multi-class

70%

Binary Walk

94%

SVM Test

94.8%

Silksong Body Controller Demo

07 — SIDE QUESTS

Learning Hardware from Zero

October–November 2025

🔧

NTU Makers Club

Joined NTU's lab in Taiwan. Arduino intro → ESP32 WLED → Smart Voice Assistant → OnShape CAD. Lab space access for electronics testing.

🧠

Teaching EEG Mind Control

Taught 30-min class to 6 Minerva students. Built 3 interactive browser games (neuron simulation, pyramidal neurons, EEG signal). The snap analogy. The mountain analogy.

📚

CS156 Comp. Neuro

Prof. Watson's class: connectionism, PDP models, DQN Breakout revelation, fruit fly preprocessing, transfer learning. 24 sessions. The theoretical backbone.

"I was a designer 6 months ago. Now I'm soldering sensors at midnight."

08 — PHASE 3: MUSCLES

The LED That Changed Everything

November–December 2025

First EMG LED Dimmer — "The stronger I clenched, the more the LED lit up." Pivotal hardware milestone.
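A minimal sketch of the "clench harder, LED gets brighter" mapping: rectify the signal around its resting baseline, smooth it into an envelope, and scale that to an 8-bit PWM duty value. The original ran as firmware on the ESP32; this Python version only illustrates the arithmetic. The baseline, `full_scale`, and smoothing window are assumed values, not the project's settings.

```python
from collections import deque
from typing import Callable


def make_dimmer(
    baseline: int = 2048,    # assumed resting level of a 12-bit ADC (mid-scale)
    full_scale: int = 1000,  # assumed envelope amplitude that maps to full brightness
    window: int = 25,        # assumed moving-average length, in samples
) -> Callable[[int], int]:
    """Return a function that maps one raw ADC sample to a PWM duty in 0..255."""
    recent = deque(maxlen=window)

    def step(adc_sample: int) -> int:
        recent.append(abs(adc_sample - baseline))    # full-wave rectification
        envelope = sum(recent) / len(recent)         # moving-average envelope
        return int(255 * min(envelope / full_scale, 1.0))  # clamp, then scale

    return step
```

Calling `step` once per sample keeps the smoothing state inside the closure, which mirrors how a firmware loop would update the LED on every read.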

Repurposed $3 AD8232 cardiac sensor for forearm EMG. Fought a 72-hour debugging gauntlet: corrupted Arduino IDE caches, ESP32 Core v3.0 breaking changes, sensor producing identical noise whether connected to body or floating in air.

The EMG bike turn signal — killed by a broken electrode cable in Taiwan. The tap test failure was definitive.

First arXiv paper written — 18 models benchmarked. Random Forest: 74.3% accuracy at 0.01ms latency. MaxCRNN: 83.2%, 99% CLENCH precision.

1,540 s Raw Data • 1,300 Windows • 18 Models Tested

THE DISCOVERY

AD8232

A $3 heart sensor that reads muscles. The AD8232's bandpass (0.5–40 Hz) matches EMG motor unit frequencies. No new hardware needed.

KEY LESSON: WATSON'S ADVICE

"Do not consider auxiliary problems until you have failed to build this pipeline with the essentials. If you build this pipeline and the classification accuracy is poor, then you are permitted to consider these problems, but until then you just build it."

09 — THE BREAKTHROUGH

"They Weren't Reading the Brain.
They Were Reading the Jaw."

December 9, 2025: Watson explained the Dualist Fallacy. MIT's AlterEgo wasn't reading "thoughts" — it was reading jaw and throat muscle signals. Brain signals get cleaner downstream at muscles, not upstream in cortex.

MIT ALTEREGO

$1,250+

  • 7 electrodes
  • 24-bit ADC (16M levels)
  • Custom fabrication

MY PLAN

$40

  • 2 electrodes
  • 12-bit ADC (4,096 levels)
  • Off-the-shelf cardiac sensors

The AD8232's bandpass (0.5–40 Hz) already covers most of the speech motor unit range (1.3–50 Hz).
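The level counts in the comparison follow directly from the bit depths. The sketch below reproduces them and adds the size of one ADC step. The 3.3 V input span is an assumption used only to make the step size concrete, and the function names are illustrative.

```python
def adc_levels(bits: int) -> int:
    """Number of distinct codes an ideal ADC of the given bit depth can output."""
    return 2 ** bits


def lsb_volts(bits: int, v_ref: float = 3.3) -> float:
    """Voltage represented by one ADC step, for an assumed input span of v_ref volts."""
    return v_ref / adc_levels(bits)
```

`adc_levels(12)` gives 4,096 and `adc_levels(24)` gives 16,777,216, so the 24-bit converter resolves 4,096 times more levels. Over the same assumed 3.3 V span, one 12-bit step is roughly 0.8 mV while one 24-bit step is roughly 0.2 µV.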

10 — PHASE 4: EXPLORATIONS

Beyond the Lab

October–December 2025

🚗 Voice-Controlled Uber

Playwright MCP + Omi AI necklace. 3 failed approaches (API, manual Copilot, non-stealth Playwright), then success. 312 lines of JS. 89% success rate. 47 real voice commands processed.

Phase 4 foundation: if biosensors → web automation is possible, anything is.

🔬 OpenEMG for Kaggle

Browser-based biosignal studio. Web Serial API at 1000 Hz. Custom Canvas renderer at 60fps. Push-to-Event labeling protocol. Non-destructive DSP. Gemini 3 Pro quality control.

The tool I wished existed 3 months earlier.

📝 DQN Breakout Revelation

Deep Q-Learning exploration for CS156. Feynman method applied to neural networks. Experience replay, target networks, distributed representations. Connectionism in action.

⏱️ AI Pomodoro Logger

1,394-line Python app. FFmpeg compression. AI summaries. Tracked every work session across the capstone — the meta-tool for documenting the documentation.

PHILIPPINES

Diskarte

INDIA

Jugaad

11 — GOING PUBLIC

January 2026 — Presented at T-Hub Hyderabad 4 times. "SOMACH: The $40 Silent Speech Interface." Created one-pager (LaTeX), slide deck, journey narrative. Met Manu, Stiven, AJ (Deloitte London → NGO advisor).

"Resourcefulness in the face of constraints."

MARKET SIGNAL — SAME MONTH

Apple acquires Q.ai for $2 billion — whispered speech + facial muscle detection. Jan 29, 2026.

Merge Labs (Sam Altman + OpenAI) raises $250M for nonverbal BCI. Jan 15, 2026.

While building at T-Hub, $2.25B was committed to the exact same idea. The market validated the research before the research was finished.

12 — HARDWARE HELL

8 Dated Build Logs

January–February 2026 — Hyderabad

Jan 18

Robuin board — mouthing then subvocalization tests

Jan 25

Robocraze AD8232 + 3-channel wiring + Y-splitter for shared reference

Jan 30

AlterEgo replication methodology — 4 documents: deep-dive, breakdown, feasibility, market

Feb 1

Soldered sensors + electrode mapping confirmed (3 serial monitor screenshots)

Feb 2

Clean-slate hardware rebuild

Feb 3

ChatGPT RL wiring confusion — "the wires were wrong"

Feb 5

Complete Instructables build guide — 6 HTML pages, full Python pipeline, Arduino firmware

Feb 11

3rd AD8232 broke — pivoted to 2-channel system

13 — THE SCIENCE

Two Studies, One Vocabulary

February 2026

STUDY A — CHIN + UNDER-CHIN

Mentalis + Mylohyoid

4-phase transfer learning: overt → mouthing → exaggerated closed → covert. 900 CSVs.

51.8% ± 2.8%

5-FOLD CV • 6 CLASSES • 3.1× CHANCE
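The "3.1× chance" figure is the cross-validated accuracy divided by the accuracy of uniform guessing over 6 classes. A small sketch of that arithmetic, with an illustrative function name:

```python
def times_chance(accuracy_pct: float, n_classes: int) -> float:
    """Ratio of an accuracy to uniform-guess accuracy for a balanced n-class task."""
    chance_pct = 100.0 / n_classes  # 6 classes -> about 16.7%
    return accuracy_pct / chance_pct
```

`times_chance(51.8, 6)` comes out at about 3.1. The ratio is only meaningful if the classes are roughly balanced, which is assumed here.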

STUDY B — CHIN + THROAT

Laryngeal + Curriculum Learning

5-phase intensity spectrum: Speech Intensity Curriculum. 1,500 CSVs.

48.9% ± 3.1%

5-FOLD CV • 64.1% GATED @ θ=0.60

Cross-study transfer: 25–31% (near chance) → Electrode placement creates fundamentally incompatible feature spaces. CN V (chin trigeminal) vs CN X (throat vagus) — different cranial nerves, different signals.

14 — THE DISCOVERY

"You Are NOT Doing Silent Speech.
You Are Doing Onset Classification."

Feature importance: 100% of discriminative weight in the first 20 timesteps. Onset Masking Experiment: zeroed out first 80ms → accuracy collapsed to 17.6% (pure chance).
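At the 250 Hz sampling rate used for these recordings, 80 ms is exactly 20 samples, which is why "first 20 timesteps" and "first 80 ms" describe the same region. Below is a minimal sketch of the masking step. The window layout (one list of samples per channel) and the function names are assumptions for illustration.

```python
from typing import List, Sequence


def onset_samples(mask_ms: float, rate_hz: float = 250.0) -> int:
    """Number of samples covered by the first mask_ms milliseconds."""
    return round(mask_ms * rate_hz / 1000.0)


def mask_onset(
    window: Sequence[Sequence[float]],
    mask_ms: float = 80.0,
    rate_hz: float = 250.0,
) -> List[List[float]]:
    """Return a copy of a multi-channel window with its onset region zeroed out."""
    n = onset_samples(mask_ms, rate_hz)
    return [[0.0] * min(n, len(channel)) + list(channel[n:]) for channel in window]
```

Evaluating a trained classifier on masked versus unmasked windows is what separates "the model uses the onset" from "the model uses the whole utterance".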

62% Raw Baseline • 17.6% Onset Masked • 80 ms: All Signal Lives Here

The $40 sensor can only detect the violent "POP" when the brain fires the motor command — the onset spike. It cannot see sustained articulation. 12-bit ADC (4,096 levels) vs MIT's 24-bit ADC (16 million levels).

15 — ENGINEERING AROUND LIMITS

From 49.7% to 93.5%

February 25–28, 2026

Raw 6-class single-session: 62.0%
Cross-session (1 cm electrode shift): 21.9%
Multi-session combined: 49.7%
4-class merge (drop LEFT): 57.0%
Confidence gating @ 60%: 77.9%
Ensemble + gate @ 60%: 93.5%
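The last two steps can be sketched as follows: the model abstains whenever its top class probability falls below the gate (60% here), and accuracy is then measured only on the predictions it keeps. How the ensemble combines its members is not stated in the deck, so plain probability averaging is assumed, and all function names are illustrative.

```python
from typing import List, Optional, Sequence, Tuple


def ensemble_average(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities across models (assumed combination rule)."""
    n_models = len(prob_sets)
    return [sum(column) / n_models for column in zip(*prob_sets)]


def gated_predict(probs: Sequence[float], threshold: float = 0.60) -> Optional[int]:
    """Return the top class index, or None if the model is not confident enough."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def gated_accuracy(
    all_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.60,
) -> Tuple[float, float]:
    """Return (accuracy on accepted samples, fraction of samples accepted)."""
    accepted = 0
    correct = 0
    for probs, label in zip(all_probs, labels):
        pred = gated_predict(probs, threshold)
        if pred is None:
            continue  # abstain: this sample counts against coverage, not accuracy
        accepted += 1
        correct += int(pred == label)
    coverage = accepted / len(labels) if labels else 0.0
    accuracy = correct / accepted if accepted else 0.0
    return accuracy, coverage
```

Returning coverage alongside accuracy matters because a gate raises accuracy by discarding uncertain samples, so a gated figure is best read together with the fraction of inputs it actually answers.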

"You don't need the model right 100% of the time — you need it to know when it doesn't know."

16 — THE INFRASTRUCTURE

4,033 CSVs. 10 Scripts. 3 Firmware Sketches.

💾

Data

4,033 CSVs — Study A (900), Study B (1,500), covert (1,534), pilot (99). 87 MB total. 250 Hz, 2 channels, 6 classes.

UP • DOWN • LEFT • RIGHT • SILENCE • NOISE

⚙️

Pipeline

10 Python scripts: validate → explore → normalize → train → evaluate → predict → phase-eval → cross-session → hyperparameter → curriculum.

+ rigorous_eval.py • generate_all_figures.py • live_demo.py
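A minimal sketch of what a "validate" stage could check before any training happens. The deck states only the sampling rate (250 Hz) and channel count (2); the file layout assumed here (a header row followed by one numeric row per sample) and the function name are hypothetical.

```python
import csv
import io
from typing import Tuple


def validate_recording(
    csv_text: str, n_channels: int = 2, rate_hz: float = 250.0
) -> Tuple[int, float]:
    """Check one recording and return (sample count, duration in seconds).

    Raises ValueError if the header is missing, a row has the wrong number
    of fields, or a field is not numeric.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None or len(header) != n_channels:
        raise ValueError(f"expected a header with {n_channels} columns")
    n_samples = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != n_channels:
            raise ValueError(f"line {line_no}: expected {n_channels} fields, got {len(row)}")
        for field in row:
            float(field)  # raises ValueError on non-numeric data
        n_samples += 1
    return n_samples, n_samples / rate_hz
```

Failing fast on malformed files keeps a single truncated recording from silently skewing results across thousands of CSVs.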

🔌

Firmware

3 Arduino sketches: basic test, 3-channel, 250 Hz high-speed capture. ESP32 DevKit V1 + 2× AD8232.

Dual license: MIT (code) + CC BY 4.0 (data)

17 — THE OUTPUT

24 Blog Posts. 8+ Looms. 6 Instructables Pages.

01 Gaming's Next Controller Isn't a Brain Implant
02 Every Human Is Perfectly Calibrated for Chaos
03 EMG Sensors: When Your Muscles Start Talking
04 The Technical Deep Dive
05 Phase 1 Technical Log
07 The Ghost in the Machine (PDP Reading)
08 DQN Breakout Revelation
09 EMG Bike Turn Signals
10 Phase 3 WIP: Building in Public
13 WhisperX Lied to Me
14 6 Buttons vs Voice Labelling
15 Phase 2 Comprehensive Technical Narrative
16 Voice-Controlled Uber (Playwright MCP)
17 Rebuilding Phase 2 at Midnight
18 Teaching EEG Mind Control
19 Hip(s) Don't Lie (72-Hour Debugging)
20 Publishing First arXiv Paper
21 Eureka: Subvocalization Pivot
22 Thank You Prof. Watson
23 OpenEMG: Kaggle/DeepMind
24 AI Pomodoro Tracker
2 Websites Deployed • Chrome Extension • Typographic Watermarking
18 — THE PAPERS

3 arXiv Papers

PAPER I — STUDY B

Curriculum Learning for Silent Speech Classification

20 pages, 5 figures, 18 refs. 48.9% ± 3.1% CV, 64.1% gated accuracy.

arxiv.org/abs/2601.06516 →

PAPER II — STUDY A

Electrode Placement Boundaries

15 pages, 4 figures, 20 refs. 51.8% ± 2.8% CV. Cross-study transfer fails (25–31%).

Negative result: a scientific contribution.

PAPER III — PHASE 3

Pareto-Optimal Model Selection for Low-Cost EMG

18 models benchmarked, 1,540 s of data. RF: 74.3% accuracy at 0.01 ms latency.

Edge ML: feature engineering > raw compute.

19 — TOOLS & INFRA

Tools Built Along the Way

📡

OpenEMG Bench

Browser oscilloscope, 1000 Hz

⏱️

AI Pomodoro Logger

1,394-line Python

🌐

Blog Folio v1 → v2

MIT Media Lab feedback

📱

Android IMU Recorder

Phase 1 — first app ever

Wear OS Labeler

Replaced WhisperX

🚗

Voice Uber

Playwright + Omi

🎮

3 EEG Games

Teaching session

📦

Instructables Guide

6 HTML pages

20 — THE PEOPLE

28+ Watson Meetings

Prof. Patrick Watson (Advisor)

Computational Neuroscience PhD. Paced the project: accelerometry → biosensors → subvocalization. "He didn't want you to succeed easily; he wanted you to do hard engineering."

Prof. Shekhar (Second Reader)

Committee meetings for milestone checks. Kept the academic rigor bar high.

Manu & T-Hub Hyderabad

Led to 4 presentations, hardware sourcing from SP Road electronics markets, lab access for soldering and testing.

AJ (Deloitte London → NGO)

NGO advisor. Oral defense timeline review. Helped shape the narrative for non-technical audiences.

21 — THE FUTURE

AHA Benchmark

Does AI cognitive offloading erode autonomy? An LLM-as-Judge tool for detecting over-scaffolding vs autonomy preservation in AI tutors.

If AI makes you dumber, the interface failed.

Biological HUD

Video-game F3 debug mode for the human body. Resource bars: Mana = Executive Function, HP = Physical Energy, Bandwidth = Sensory Load. Status ailments mapped to genes (MTHFR, AGT, ADRB2).

Diagnostic loop: "I feel X" → genetic/lab baselines → biological mechanism → protocol.

"Knowledge = Control. The bigger picture is gaining control of self."

22 — END

ORAL DEFENSE — MARCH 10, 2026 — UNANIMOUS PASS

"Congratulations Carl, this was an easy pass for us. Hardware projects are difficult, and yours worked. Your documentation throughout has been exceptional. This was one of my favorite projects in a long time to advise."

— Prof. Patrick Watson, CS/ML (Advisor)

"Getting hardware to work under less than ideal circumstances — low budget — is amazing. You could make a strong case for why you should be part of the MIT team. I would call yours an ideal capstone trajectory."

— Prof. Shekhar, Signal Processing (Second Reader)

"Carl did this specific thing, which is a very difficult engineering challenge, and here's how he got it to work."

— Prof. Patrick Watson

Carl Vincent Ladres Kho

Minerva University Class of 2026

kho@uni.minerva.edu

5 Phases • 3 arXiv Papers • 4,033 CSVs • 24 Blog Posts • 8+ Looms • $40 Hardware