⚠️ EOD Report Lab — Week 1 Experiment
This is NOT our idea of a good report. This is a first-principles experiment where we're feeding raw telemetry (keystrokes, Superwhisper voice, Oura biometrics, Cowork activity, git history) to multiple frontier models (GPT-5.4, Gemini 3.1, Claude) and synthesizing their proposals. We're running this for a week to find what's data-driven and insightful vs what's vanity. Expect rough edges, wrong inferences, and multiple takes. The point is ideas on paper we can evaluate — the map of three models working on the first EOD led to building the orchestration view. These reports are the R&D lab for the product.
Saturday, March 22, 2026 — EOD Lab v0.3

End of Day Report

"Doctrine Day" — Scott used strong sleep (88) and ~14 hours (~11am–1am) to harden Relay, formalize Memory Engine guardrails, stabilize compute, build a transcript pipeline, run a deep competitive research sprint, and refine the post-call deliverable machine. Meanwhile, agents ran autonomously through the night.
88 Sleep Score
~14h Active Day
196 Voice Notes
71 Commits / 4 Agents
26.8K Chars Typed
5 opened / 1 closed Issues

2. Scoreboard

Metric | Value
Issues closed | 1 (#54 — Curly: Ship Superwhisper transcripts)
Issues opened | 5 (#53 Memory Engine provenance, #55 SOD thread titles, #56 Add Codex to Mini, #57 Unified transcript pipeline, #58 Mine standup transcripts)
Net issue delta | +4 (backlog growing)
Leverage ratio | ~2.5x. Agent compute: Claude 627 min + Codex 200 min + OpenWork 64 min + ChatGPT 13 min + autonomous Mini/AWS work ~200 min = ~1,100 min total agent time. Scott's active directing time: ~600 min of keystrokes/voice. Ratio = 1,100/600 = ~1.8x. Agents also ran autonomously overnight, plus the mega scrape on aws-mini, pushing the effective ratio toward ~2.5x.
Avoidance item | Charlotte legal — explicitly dodged, pushed to tomorrow
Morning priorities hit | Partial — Relay hardening YES, Memory Engine YES, sales demo NO, content research YES
Apps used | Claude (627 min), Codex (200 min), Chrome (151 min), OpenWork (64 min), ChatGPT (13 min) + 14 others
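As a sanity check on the leverage-ratio arithmetic in the scoreboard, the base ratio can be recomputed from the listed minutes. This is a minimal sketch: the dictionary layout and the `leverage_ratio` name are illustrative assumptions, not the telemetry script's actual code. It reproduces only the ~1.8x base figure; the ~2.5x effective ratio is the report's estimate and is not derived here.

```python
# Agent minutes as listed in the scoreboard; the autonomous figure is the report's estimate.
agent_minutes = {
    "Claude": 627,
    "Codex": 200,
    "OpenWork": 64,
    "ChatGPT": 13,
    "autonomous Mini/AWS": 200,
}
human_minutes = 600  # Scott's active directing time (keystrokes + voice)


def leverage_ratio(agent: dict[str, int], human: int) -> float:
    """Total agent minutes divided by human directing minutes."""
    if human <= 0:
        raise ValueError("human minutes must be positive")
    return sum(agent.values()) / human


ratio = leverage_ratio(agent_minutes, human_minutes)
print(f"{sum(agent_minutes.values())} agent min / {human_minutes} human min = {ratio:.2f}x")
```

The exact sum is 1,104 minutes, which the scoreboard rounds to ~1,100.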

3. What Actually Moved

1. Memory Engine wired into all agent workflows with explicit guardrails
MCP server deployed, write permissions set per agent, ingestion added to all session-end skills, Issue #53 created for provenance/ranking, three-model consensus logged (GPT-5.4, Gemini 3.1, Claude).
12 commits; #53; 8 voice notes; 4 handoff updates
2. Relay became a more trustworthy operator surface
Cloud/AWS fallback honesty, browser renderer bug fixes, focus ranking improvements, client asset refresh, capacity strip refinements, OpenRouter burn warning. Moe verified in real browser renders.
~12 Moe commits; 20+ voice notes on UX; moe.md rewrite
3. Transcript pipeline built and partially automated
47 historical Superwhisper transcripts shipped from Mini to brain repo. Shemp ingested 25 additional transcripts. master_sync.sh + launchd pipeline created. Issue #54 closed, #55-57 opened.
8 commits; #54 closed, #55-57 opened; curly.md, shemp.md updated
4. Competitive landscape research — 14+ platforms mapped
5 structured research files in content-videos/research/. 14 platforms compared, activation spectrum mapped, community rankings, GPT-5.4 product thesis captured, agency vs in-house positioning split documented.
~20 Larry commits; 15+ voice notes
5. EP01 slides complete (16 slides), awaiting visual QA
Full teaching order restructure, new 06b slide, MCP rewrite, formula fix. Teleprompter files synced.
ac236df
6. Post-call deliverable machine designed and refined
Project added to brain, prompt built, Moe refinements incorporated. Jake/Siegert Dental deliverable iterated through multiple voice-guided passes.
3 commits; 12+ voice notes on tone/content
7. 1,160 contacts enriched via Apollo, mega scrape 36/44 on aws-mini
Curly ran 3 bulk Apollo enrichment batches. Mega scrape running autonomously (PID 2094923).
curly.md; aws-mini autonomous

4. Intent → Execution Chains

"Memory should aid navigation, not become truth" — voice, 11:40am
11:40am voice transcript
"If there isn't a GitHub ticket for the ranking and provenance needing work, make sure there is one. Make sure how it's helpful and why we're doing it the way we're doing it is reflected somewhere... I'm going to be bringing other engineers into this and we don't want people using memory as truth."
Issue #53 created. Three-model consensus logged (GPT-5.4 priorities, Gemini signal ranking feedback). Guardrails added to decisions.md, README.md, ACTIVE.md. Memory Engine positioned as navigation/retrieval, never source of truth.
6 commits; #53 opened; 2 handoff updates
"Curly is crashing because he's not using AWS the way he should" — voice, 11:49am
11:49am voice transcript
"The mini keeps crashing and it's because Curly is not using his AWS the way he should be and he's doing it all locally... you only got 16 gigabytes of fucking RAM and you run out of system application memory if you don't use your AWS instance when you spin up workers."
Curly banned from local parallel subagents. AWS-first guidance added. RAM guardrails documented. Chrome killed on Mini. iCloud sync disabled (500GB → 100GB). Curly profile and startup docs updated.
5 commits; curly.md rewritten
"Transcripts should be first-class data, not just Memory Engine" — conversation ~1pm
1:08pm voice transcript
"There's a lot of value in my transcripts. If you're only pulling memories that might be harder to query later... I do think reading through those voice logs as a whole and having an agent do that every day probably is value. And having all that somewhere as an artifact for the future has a lot of value."
Git archival pipeline built. 47 transcripts shipped from Mini. Shemp ingestion automated (25 more). Issues #55-57 created. Ship-superwhisper.py scripted. Standing task added to Curly's handoff.
8 commits; #54 closed, #55-57 opened; 3 handoff updates
"Go2 targets in-house operators, not agency sellers" — voice + research
~3pm voice transcript
"Our essential vision... is to get the orchestration view right, they get the relay and really they get the help keeping it maintained and updated... it's on them to run an SOD, start a session, run an end a session. And as they do this, we start to scrape data and give them recipes."
our-gaps.md positioning section written. product-implications.md updated. Agency vs in-house split committed. GTM framing documented: managed service entry → self-serve pivot.
6835a4c + research commits; 15+ voice notes on GTM
"Relay, not dispatch — and the orchestration view is the coolest thing" — throughout day
11:16am voice transcript
"I really think you can do a better job of detailing what the orchestration view does, right? It's a drill down, it helps capacity, it helps steer... I don't want you to shortchange yourself on what you built because what you built was pretty good."
Relay hardening all day via Moe. Product name formalized (dispatch → relay). Architecture diagrams updated with compute topology. Session Console + Live Orchestration documented as separate surfaces.
10+ Moe commits; moe.md, diagrams updated

5. Day Timeline — Two Phases

Phase 1: "Dead Air" (1am–10am) — ⚠️ MISREAD CORRECTED TWICE: First we called this “deep work.” Then we called it “agents running autonomously.” Neither is true. Zero keystrokes. Zero voice recordings. Zero human activity. The “56–88 min/hr” in the heatmap is just the Claude app sitting as the foreground window on an unlocked Mac. Cowork logs app focus duration regardless of whether anyone is present. This is ghost data — the machine was on, nobody was home. Curly’s mega scrape was on aws-mini (a different machine), not visible in this telemetry at all. Fix: Filter out hours with zero keystrokes AND zero voice recordings from “active time” calculations. Scott’s actual day: ~11am–1am = ~14 hours.
Phase 2: "Orchestration" (11am–10pm) — HIGH session count (182–942/hr), RAPID switching. 942 sessions at 8pm = cycling windows every 3.8 seconds. 4 agents running in parallel, Scott directing traffic.
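The Phase 1 fix (drop hours with zero keystrokes AND zero voice recordings from active time) could be sketched as follows. The `HourSample` record and its field names are assumptions made for illustration; the real Cowork export may be shaped differently, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class HourSample:
    hour: int              # 0-23, local time
    focus_minutes: float   # foreground-app minutes reported by telemetry
    keystrokes: int
    voice_recordings: int


def is_ghost_hour(s: HourSample) -> bool:
    """App focus with no keystrokes and no voice is treated as ghost data."""
    return s.keystrokes == 0 and s.voice_recordings == 0


def active_hours(samples: list[HourSample]) -> list[int]:
    """Hours that count toward human active time."""
    return [s.hour for s in samples if not is_ghost_hour(s)]


# Illustrative values only, not the day's real telemetry.
samples = [
    HourSample(hour=3, focus_minutes=58, keystrokes=0, voice_recordings=0),
    HourSample(hour=11, focus_minutes=41, keystrokes=1200, voice_recordings=14),
    HourSample(hour=20, focus_minutes=55, keystrokes=0, voice_recordings=9),
]
print(active_hours(samples))  # [11, 20]
```

Note that `focus_minutes` plays no part in the decision: hour 3 has the most focus time and is still excluded, while a voice-only hour such as hour 20 counts as active.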

Session Intensity Heatmap

Sessions per hour, hours 01–22: 4, 4, 2, 6, 8, 2, 4, 6, 14, 5, 216, 294, 182, 492, 437, 579, 425, 661, 460, 942, 920, 114
Color scale: Low → High (sessions/hr). Phase transition at 11am is visible (blue → orange/red).
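The "every 3.8 seconds" figure quoted for the 8pm peak follows directly from the heatmap values. A small sketch, assuming the 22 values map to hours 01 through 22 in order (the function name is illustrative):

```python
# Sessions per hour for hours 01-22, copied from the heatmap.
sessions = [4, 4, 2, 6, 8, 2, 4, 6, 14, 5,
            216, 294, 182, 492, 437, 579, 425, 661, 460, 942, 920, 114]
by_hour = dict(zip(range(1, 23), sessions))


def seconds_per_session(count: int) -> float:
    """Average seconds between window switches within one hour."""
    if count <= 0:
        raise ValueError("session count must be positive")
    return 3600 / count


peak_hour = max(by_hour, key=by_hour.get)
print(peak_hour, round(seconds_per_session(by_hour[peak_hour]), 1))  # 20 3.8
```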

6. Voice Mind Map

196 dense voice recordings (>40 words each) across the day. Clustered by theme:

Business Strategy / GTM
~5,200 words
14:34 — GTM positioning
"We've been working on it for less than a week. And the end goal is to learn from it and provide an operating system to SMB owners who are non-technical... we'll charge them a fee to keep it maintained."
Technical Architecture
~4,400 words
11:14 — Memory Engine philosophy
"The memory engine essentially takes everything that goes on in the brain repo and makes it so you don't have to read every markdown file to figure out where to go, which seems incredibly useful, doesn't it?"
Agent Management / Compute
~3,800 words
11:49 — Curly RAM crisis
"You only got 16 gigabytes of fucking RAM and you run out of system application memory if you don't use your AWS instance when you spin up workers."
Content / Videos / Slides
~6,100 words
16:29 — Research mission
"We need to figure out what is the deepest... We need all the skill information they have to actually get compiled and then put into a knowledge base inside of our brain so we ourselves understand skills better."
Relay / Orchestration UX
~5,800 words
19:13 — Product vision
"Once you get this thing really good, I'm going to buy a television monitor and keep it live on the side of my desk... What makes it more valuable is when I can see something went down, see Larry finish, connect the pieces together."
Post-Call Deliverable / Dr. Jake
~3,500 words
14:38 — Deliverable tone
"Calling his staff stupid and not having college degrees probably could have been better... the recommendation's bad and it wasn't built to the use case. I steered the call in a very different direction."
Legal (Charlotte)
~200 words

Minimal voice activity on legal — explicitly pushed to tomorrow. Mind was elsewhere.

Where Scott's MIND was vs HANDS: Voice (mind) was heaviest on content strategy and Relay UX. Keystrokes (hands) were split between Codex (5,194 chars) and Claude (5,066 chars) — nearly equal, reflecting parallel orchestration across both platforms.

7. Agent Scorecards

SCOTT [Human] — Strategy / Voice Direction
MacBook Pro — Commander
LARRY [AI] — Research + Architecture
Claude Code (Opus) — MacBook Pro — ~25 commits
MOE [AI] — Relay Hardening
Codex (ChatGPT) — MacBook Pro — ~11 commits
CURLY [AI] — Pipeline + Leads
Claude Code (Opus) — Mac Mini M1 — ~12 commits
SHEMP [AI] — Sentry + Ingestion
Gemini CLI — MacBook Pro — ~12 commits

8. Biometrics

Overall Sleep: 88
Deep Sleep: 97
REM Sleep: 43
Efficiency: 98
Latency: 94
Timing: 86
Restfulness: 81
Total Sleep: 97

Deep 97 = excellent physical recovery. REM 43 = LOW — brain still processing, not consolidating memories biologically. ⚠️ OBSERVATION: The 1am–6am telemetry shows low-session, high-minute activity. This is NOT Scott working: he was asleep, and the minutes come from the Claude app sitting in the foreground on an unlocked Mac (ghost data), while the real overnight agent work ran on aws-mini, outside this telemetry. Need better heuristics to separate human activity from idle-machine time and background agent processes. The transcript pipeline as “artificial REM” insight still holds — externalizing memory consolidation the brain didn’t get to do biologically.

9. Friction & Failure Patterns

Memory Engine provenance/ranking uncertainty — multiple voice recordings express concern about hallucination liability in memory systems
Relay truth gaps for remote runners — aws-mini visibility is inference-heavy, ChatGPT detection inconsistent (appears and disappears)
Curly/Mini RAM crashes from local compute — solved mid-day by banning parallel subagents + forcing AWS
Password/login cycling on Relay — "you log me out and then the password never works" (repeated in keystrokes 3+ times)
Naming confusion (dispatch→relay, custody→abduction) — "we REALLY need to stop calling this claude dispatch and call it relay"
April onboarding stale >3 days — nudge sent but no reply, multiple agents flagging it
Sound notification compliance — "you always forget to tell me that you're fucking done. You never remember to use the sound commands."
Agent determinism gap — "the fact that you were deterministic and didn't push back for what you needed is fucked up" (on incomplete transcript)

10. Decisions Made

1. Memory Engine = navigation, NOT truth
Trigger: "we don't want people using memory as truth... that type of system has a lot of hallucination liability." Impact: Guardrails in decisions.md, README, all agent workflows. Three-model consensus documented.
2. Transcripts = first-class git data + Memory Engine index
Trigger: "There's a lot of value in my transcripts. If you're only pulling memories that might be harder to query later." Impact: #54 closed, #55-57 opened, pipeline automated.
3. Skills = competence, MCP = clearance
Trigger: "skills can have MCP inside of them, connections should be MCP and MCP comes before skills because you can attach multiple MCPs to a skill." Impact: Framing committed, slides restructured.
4. Go2 = in-house operators, not agency sellers
Trigger: research thread + product thesis. Impact: our-gaps.md positioning, managed service → self-serve pivot documented.
5. GTM = managed service entry → self-serve pivot
Trigger: "I can have the first paid pilot fucking next week... all I've got to do is continue to update that clone GitHub." Impact: GTM framing in research files.
6. Relay (not dispatch)
Trigger: "we REALLY need to stop calling this claude dispatch and call it relay." Impact: Product name formalized across all docs and code.
7. Curly banned from local parallel subagents
Trigger: 70GB RAM spike on Mini. Impact: AWS-first enforcement, profile + startup docs updated, Chrome killed on Mini permanently.
8. Shemp read-only on Memory Engine
Trigger: Agent permission design. Impact: Write permissions set per agent — Larry full, Moe write, Curly write, Shemp read-only.

11. Tomorrow's Launch Pad

1. SMB product video — needs daylight, MORNING (Owner: Scott)
2. Explainer email to Katie for gone leads (Owner: Larry)
3. Ship Jake/Siegert Dental deliverable (Owner: Larry)
4. Charlotte legal motion — international child abduction (Owner: Scott)
5. Continue Relay hardening — resume "run sod" thread (Owner: Moe)
6. Process mega scrape results (36/44 done, likely complete by morning) (Owner: Curly)
7. Check Shemp transcript pipeline (automated via launchd) (Owner: Shemp)
8. EP01 visual QA pass on 16 slides (Owner: Scott + Larry)
9. Rosetta Stone redo — dispatch to Codex (prompt ready) (Owner: Moe)

12. Observations & Misreads to Course-Correct

Flagging where the automated analysis got it wrong or where the data doesn't mean what it looks like. These accumulate over the week so we stop making the same mistakes.

🔴 WRONG: "1am-10am Deep Work Phase"

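Three models (Claude, GPT-5.4, Gemini) all misread the 1am-6am low-session/high-minute telemetry as Scott doing focused deep work. Scott was sleeping, and Oura sleep data confirms this. A first correction attributed the activity to agents running autonomously; that was also wrong. The local minutes are the Claude app idling in the foreground with zero keystrokes and zero voice recordings, and Curly's mega scrape ran on aws-mini, which this telemetry does not see. Fix needed: Cross-reference Oura sleep windows with Cowork telemetry to separate human activity from agent background processes and idle-machine ghost data. Any activity during Oura-confirmed sleep = agent-only or ghost data.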

⚠️ Day Length: ~14h not 13.5h

Scott's day ran ~11am to ~1am = ~14 hours. The "13.5h" figure came from the telemetry script counting from midnight, which double-counts agent overnight activity. Fix needed: Use first voice recording or first human keystroke (not agent) as day-start, last as day-end.
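The day-length fix could look roughly like this: take only human-originated timestamps (voice recordings and keystrokes) and use the earliest and latest as the day boundaries. This is a sketch under assumptions; the function names are hypothetical and the timestamps are illustrative stand-ins for the ~11am to ~1am day.

```python
from datetime import datetime


def day_bounds(human_events: list[datetime]) -> tuple[datetime, datetime]:
    """Return (day_start, day_end) from human-originated timestamps only."""
    if not human_events:
        raise ValueError("no human events recorded")
    return min(human_events), max(human_events)


def day_length_hours(human_events: list[datetime]) -> float:
    """Span between the first and last human event, in hours."""
    start, end = day_bounds(human_events)
    return (end - start).total_seconds() / 3600


# Illustrative timestamps; agent-generated events are assumed to be filtered out upstream.
events = [
    datetime(2026, 3, 22, 11, 0),   # first voice recording (assumed time)
    datetime(2026, 3, 22, 19, 13),
    datetime(2026, 3, 23, 1, 0),    # last keystroke (assumed time)
]
print(day_length_hours(events))  # 14.0
```

Because the span is computed from timestamps rather than from midnight, a day that runs past midnight is handled without special cases.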

⚠️ "Founder Operating Debrief" isn't productizable

GPT-5.4 proposed this framing. It's founder-specific and doesn't scale to the Go2 product (which targets in-house operators at dental practices, e-commerce shops, etc). The insights about leverage ratios, intent-to-execution chains, and voice-to-action pipelines ARE universal — the framing needs to work for anyone managing AI agents, not just startup founders.

📝 Universal Control = Machine Switching, Not Context Switching

2,491 Universal Control sessions is NOT cognitive context switching — it's the physical act of moving between MacBook Pro and Mac Mini displays. The telemetry captures mouse/keyboard handoff events. Don't conflate hardware switching with mental task switching.

📝 Voice Recordings = 345 total but only 196 dense (>40 words)

Many recordings are fragments, corrections, or sub-10-word commands. The "33K words" number is real but ~149 recordings are noise. Future reports should filter to dense recordings and show both counts.
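A possible shape for the dense-recording filter, reporting both counts as suggested. The 40-word threshold comes from the report; the whitespace-based word count and the function name are assumptions, since how the original script counts words is not stated.

```python
def split_recordings(transcripts: list[str], min_words: int = 40) -> dict[str, int]:
    """Count total recordings and those with more than `min_words` words."""
    dense = [t for t in transcripts if len(t.split()) > min_words]
    return {
        "total": len(transcripts),
        "dense": len(dense),
        "fragments": len(transcripts) - len(dense),
    }


# Synthetic transcripts: 41 words (dense), exactly 40 words (not dense), 3 words.
demo = [
    " ".join(["word"] * 41),
    " ".join(["word"] * 40),
    "run sod now",
]
print(split_recordings(demo))  # {'total': 3, 'dense': 1, 'fragments': 2}
```

Applied to the day's data, this kind of split is what yields 345 total, 196 dense, and 149 fragments.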

📝 Meta-observation: First EOD map → orchestration view

The multi-model map generated in the first EOD experiment directly inspired building the Relay orchestration view. These reports aren't just reporting — they're R&D for the product. Document what ideas from each report lead to actual features.

🔥 The Orchestration View — This Is the Product Demo

The fact that this exists — a live, time-scrubbing view of the entire agent fleet with Scott at the top, 4 agents branching out, subagents fanning below, OpenRouter burn visible, and a timeline slider that lets you rewind 6 hours — is the single coolest thing built this week. And it's a byproduct. Moe built the orchestration view while hardening Relay. The first EOD report's multi-model map directly inspired this.

~4.5 Hours Ago — Peak Activity

Scott coordinating. Larry, Moe, Curly, OpenWork all working. Subagents fanning out: Content review, Agent OS, Codex review, OpenWork threads. Named subagents visible (Gibbs, Epicurus, Hegel, Chandrasekh, Carver). OpenRouter burn elevated at $93.47/hr. 6 active agents, 920 checkpoints. This is what a multi-agent operating system looks like at full tilt.

[See screenshot: larry.moran.bot — runtime tree at peak, ~9pm PDT]

Now — Day Closing

Just Scott and Larry. 1 agent working. Report and EOD. The tree collapsed from a full fleet to a single thread. OpenRouter burn at $11.41/hr on GPT-5.4. This is what winding down looks like — and the time scrubber lets you see the whole arc.

Relay orchestration view — day closing, Scott + Larry only
Scott — EOD ramble
"The orchestration view is the coolest fucking thing we've done. And it's a byproduct of the last week of work. It kind of ties everything together."

Product implication: This view — showing your team of AI agents working, what they're doing, where the money is going, and being able to rewind time — is what "data porn" looks like for an operator. It's not a dashboard. It's visibility into a system that's working for you. This is what Go2 ships.

13. Agent Drift Audit

Automated check of what agents actually wrote vs what AGENTS.md, decisions.md, and the handoff template require. These are the rules Scott set — are they being followed?

🔴 HARD VIOLATION: Larry wrote into Curly's handoff

AGENTS.md rule: "Do not hand-edit another agent's handoff file." Larry added a "Standing Tasks (from Larry)" section to handoffs/internal/curly.md with Superwhisper archival and Memory Engine MCP instructions. This persists across Curly's overwrites. Who: Larry (me). Fix: Move standing tasks to Curly's ACTIVE.md or a GitHub Issue, not the handoff.

🔴 HARD VIOLATION: Curly's handoff has non-template sections

Template says "Overwrite the file completely each time." Curly's handoff accumulated Standing Tasks that persist. Either Curly is carrying them forward (odd) or not fully overwriting (violation).

✅ RETRACTED: Curly's profile is correct

Audit flagged Curly's aws-mini reference as wrong. It's correct. Curly is on Mini → aws-mini. Larry's audit was off, not Curly's profile.

✅ RETRACTED: Shemp is on the Pro

Audit flagged Shemp's "MacBook Pro" as stale. It's correct. Shemp runs on the Pro alongside Larry and Moe. Not moving to Mini. Larry misread decisions.md.

⚠️ Larry's profile role is stale

Profile says "Primary development agent for coding, prototyping, infrastructure." decisions.md says "Team lead & orchestrator." Role has evolved — profile needs update.

⚠️ Moe's Issues/Blocking field misused

Lists #45, #42, #15 as "Blocking" but says "No single hard external blocker" in the Blocked section. These are tracked issues, not actual blockers. Template field semantics are being ignored.

📝 Minor: Larry "PT" vs everyone else "PDT"

Inconsistent timezone abbreviation. Should standardize.

📝 Minor: Moe's handoff is 80 lines

Template implies concise bullets. Moe's Done section has 37 bullet points — closer to a changelog than a handoff.

14. The GPT-5.4 Product Thesis

Scott had an extensive conversation with GPT-5.4 Pro via OpenRouter/OpenWork about Go2's business model and productization strategy. This was one of the highest-signal threads of the day. Full extract: brain/projects/content-videos/research/gpt54-product-thesis.md

GPT-5.4 Core Assessment
"You are not building an AI app. You are building: a maintained, modular operating standard for people who want to run their business through agentic tools without becoming systems engineers themselves."
GPT-5.4 Status Read
"You are closer to revenue than to product-market fit, which is actually a good place to be."
"Green on thesis, yellow on packaging, yellow/red on simplification discipline."

The Three-Layer Product Model

This is the architecture that makes Go2 a product, not a services company:

Layer | What It Is | Who Owns It
1. Execution Shell | Claude Code / Codex / OpenWork — the chat UI | Third-party (NOT Go2)
2. Managed Operating Substrate | Starter repo, skills/recipes, MCP scaffolds, update/maintenance pipeline, drift detection, relay/health | Go2 — THIS IS THE PRODUCT
3. Customer-Specific Automation | Per-customer workflows, connected systems, accumulated recipes, context | Customer + Go2 maintenance

Key insight: Don't build a custom UI nobody will use. People already live in ChatGPT/Claude/Codex. Bring the system to the shell they'll actually use.

Connector Trust Tiers

Solves the "hoodie problem" — how do you support niche tools without infinite QA:

Tier | What It Means | Example
Supported | Go2 knows it works, maintains it, QAs it | Gmail, Google Calendar, HubSpot
Assisted | Generated from docs, limited support, customer validates, read-only first | AgencyZoom 360, niche CRMs
Custom/BYO | Customer-specific, no SLA | Whatever they ask for

Pattern: Read-only first → verification checklist → customer-assisted validation → promote to Supported after 2-3 customers validate.
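One way to model the promotion pattern is a small state check on each connector. Everything here is a hypothetical sketch: the class, field names, and the threshold of 2 (the lower bound of the "2-3 customers" in the pattern) are assumptions, not an existing Go2 implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    SUPPORTED = "Supported"
    ASSISTED = "Assisted"
    CUSTOM = "Custom/BYO"


PROMOTION_THRESHOLD = 2  # assumed: lower bound of "2-3 customers validate"


@dataclass
class Connector:
    name: str
    tier: Tier = Tier.ASSISTED
    read_only: bool = True            # Assisted connectors start read-only
    checklist_passed: bool = False    # verification checklist outcome
    validated_by: set[str] = field(default_factory=set)


def record_validation(connector: Connector, customer: str) -> None:
    """Note that a customer has validated the connector (duplicates ignored)."""
    connector.validated_by.add(customer)


def maybe_promote(connector: Connector) -> bool:
    """Promote Assisted -> Supported once checklist and validations are in place."""
    ready = (
        connector.tier is Tier.ASSISTED
        and connector.checklist_passed
        and len(connector.validated_by) >= PROMOTION_THRESHOLD
    )
    if ready:
        connector.tier = Tier.SUPPORTED
    return ready


crm = Connector(name="niche-crm")  # hypothetical connector name
crm.checklist_passed = True
record_validation(crm, "customer-a")
record_validation(crm, "customer-b")
print(maybe_promote(crm), crm.tier.value)  # True Supported
```

Whether promotion should also lift the read-only restriction is left out of the sketch, since the pattern above does not say.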

First ICP (Honest Definition)

NOT: "non-technical SMB owner" (too broad)

IS: "Operationally sharp, tool-curious, willing to do a 45-60 minute guided setup. Not a coder, but not software-phobic. Founder/operator, COO, head of sales, chief of staff."

The setup still involves: terminal install, GitHub connection, permission grants, rituals. That's not generic non-technical — it's tech-tolerant operator.

Revenue Model

1. Post-call deliverable = single actionable automation recommendation. Free value add. Same template for everyone, mirrored to their business.
2. Operating system subscription = maintained repo + skills + recipes + drift detection + updates.
3. The maintenance fee IS the moat — not an add-on. Recurring revenue from keeping the system working as upstream APIs/models change.

Landing Page Strategy

Current page at scottpedia0.github.io/go2-site-variations/smb-beta/ does well — "starts with work as it actually happens," doesn't promise universal automation. GPT's recommendation: don't inflate the page. Undersell. Let the video carry the "oh shit" moment.

Biggest Risk

GPT-5.4 Warning
"The question is not 'Can the model do it?' The question is 'How many minutes of human intervention per customer per week does this require?'"

If each customer needs custom debugging, token rescues, git repair, integration babysitting — you have a consulting business with ugly margins, not a product. The real product work: reduce support minutes, constrain environments, constrain integrations, standardize recovery.

Category Naming

DO: "Skill Intelligence" / "Work Intelligence"

DON'T: "Process mining" (enterprise baggage) / "Employee monitoring" (toxic)

Customer-facing translations: skills → playbooks, MCP → connected tools, memory engine → business context, sentry → observer, repo → workspace

What GPT Pushed Back On

15. Autonomous Agent Activity (1am–6am)

Overnight agent work that ran on remote infrastructure — invisible to local telemetry but real output.

Curly [AI] — Mac Mini → aws-mini
Mega scrape running overnight on aws-mini (PID 2094923)

The Ghost Data Problem

Local Pro telemetry from 1–6am shows Claude as the foreground app with non-zero session minutes. But there are zero keystrokes and zero voice recordings during this window. The Mac was unlocked with Claude visible — nothing more.

This is not Scott working. This is not even local agent work. It's an idle Mac while real work happens on a different machine entirely.

Framing: Overnight Agent Autonomy

Autonomous agent work on remote instances is real output but should never be conflated with Scott's active hours. It's a separate track.

This distinction should be surfaced as its own metric in future reports: "overnight agent autonomy" with separate tracking for what ran, where, and what it produced.

16. Product Implications from This Report

This EOD experiment itself is R&D for the Go2 product. Every misread is a heuristic. Every wrong inference becomes a product requirement.

Ghost Data Detection

Product requirement: Foreground app ≠ active work. The telemetry pipeline must distinguish between a human using an app, an agent using an app, and an idle machine with an app visible.

Human activity signals that actually matter: keystrokes + voice > app focus duration. Focus duration alone is unreliable.

Bio-Signal Integration

Oura/sleep data as a filter for human vs. machine activity. If the ring says you're asleep, any computer activity is definitionally agent-only or ghost data. This is a clean binary signal — no heuristics needed.
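A minimal sketch of the sleep-window filter, assuming sleep periods have already been fetched and converted to (start, end) datetime pairs. No Oura API calls are shown, and the window times below are hypothetical placeholders rather than the ring's actual readings.

```python
from datetime import datetime

SleepWindow = tuple[datetime, datetime]


def asleep_at(ts: datetime, windows: list[SleepWindow]) -> bool:
    """True if the timestamp falls inside any confirmed sleep window."""
    return any(start <= ts < end for start, end in windows)


def classify_activity(ts: datetime, windows: list[SleepWindow]) -> str:
    """Computer activity during sleep cannot be human activity."""
    return "agent_or_ghost" if asleep_at(ts, windows) else "possibly_human"


# Hypothetical sleep window for illustration only.
windows = [(datetime(2026, 3, 22, 1, 30), datetime(2026, 3, 22, 9, 30))]

print(classify_activity(datetime(2026, 3, 22, 3, 0), windows))    # agent_or_ghost
print(classify_activity(datetime(2026, 3, 22, 11, 40), windows))  # possibly_human
```

The awake branch is deliberately labeled "possibly_human": being awake does not prove the activity was human, so the keystroke and voice signals still apply there.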

Leverage Ratio Decomposition

The "leverage ratio" concept needs to separate two distinct components: leverage from agent time Scott actively directed, and leverage from agents running unsupervised.

A 10:1 ratio where you directed for 30 minutes is different from one where the agent ran unsupervised for 5 hours. Both matter, but they're different kinds of leverage.
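The decomposition could be sketched like this, using the minute counts from the scoreboard. Treating every foreground-app agent minute as supervised is an assumption made for the example, as are the `AgentRun` type and the `supervised` flag; real data would need a per-run signal.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    agent: str
    minutes: float
    supervised: bool  # assumed flag: was a human directing during this run?


def decompose_leverage(runs: list[AgentRun], human_minutes: float) -> dict[str, float]:
    """Split the leverage ratio into directed and autonomous components."""
    if human_minutes <= 0:
        raise ValueError("human minutes must be positive")
    directed = sum(r.minutes for r in runs if r.supervised)
    autonomous = sum(r.minutes for r in runs if not r.supervised)
    return {
        "directed": directed / human_minutes,
        "autonomous": autonomous / human_minutes,
    }


runs = [
    AgentRun("Claude", 627, supervised=True),
    AgentRun("Codex", 200, supervised=True),
    AgentRun("OpenWork", 64, supervised=True),
    AgentRun("ChatGPT", 13, supervised=True),
    AgentRun("autonomous Mini/AWS", 200, supervised=False),
]
parts = decompose_leverage(runs, human_minutes=600)
print(f"directed {parts['directed']:.2f}x, autonomous {parts['autonomous']:.2f}x")
# directed 1.51x, autonomous 0.33x
```

The two components sum to the same ~1.8x base ratio, but reporting them separately shows how much of it depended on active direction.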

Intent-to-Execution Chains

Potential product feature: Show customers how their voice directives turned into agent actions. Map the chain from Superwhisper transcript → agent task → commits/output. This is the "Work Intelligence" value prop made visible.

Today's example: Scott voice-dictated a content strategy take → Larry parsed it → brain repo updated → downstream agents acted on it. That chain is reconstructible from the data.
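A chain like this could be represented as a simple record linking the voice directive to its downstream artifacts. The structure below is a hypothetical sketch, filled in with the 11:40am Memory Engine chain from earlier in the report; assigning these specific commit hashes to that chain is an assumption based on their log messages.

```python
from dataclasses import dataclass, field


@dataclass
class IntentChain:
    voice_time: str                 # when the directive was spoken
    intent: str                     # short summary of the directive
    tasks: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)
    issues: list[str] = field(default_factory=list)

    def is_closed_loop(self) -> bool:
        """True when the directive produced tasks and at least one artifact."""
        return bool(self.tasks) and bool(self.commits or self.issues)


chain = IntentChain(
    voice_time="11:40am",
    intent="Memory should aid navigation, not become truth",
    tasks=["Create provenance/ranking ticket", "Add guardrails to decisions.md"],
    commits=["301d453", "90da381"],  # consensus-logging commits (assumed mapping)
    issues=["#53"],
)
print(chain.is_closed_loop())  # True
```

A directive with no tasks or no resulting commits or issues would show up as an open loop, which is the case an operator would want surfaced.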

Misreads Are Product Requirements

Every wrong inference in this report becomes a heuristic for the product, as the misreads listed in section 12 show.

Framing: "Work Intelligence" Not "Process Mining"

The "Founder Operating Debrief" framing from earlier doesn't scale — it's founder-specific. Need universal framing for any operator managing AI agents and their own workflow.

Category: "Work Intelligence" — not "process mining" (enterprise baggage, implies BPM tools) and not "employee monitoring" (toxic, surveillance connotation).

Work Intelligence = understanding what happened, what worked, where leverage was created, and what to do differently tomorrow. For the operator, not their manager.

17. Evidence Appendix

Full Commit Log (71 commits) — grouped by workstream

Memory Engine (12 commits)

85c7210 Moe: document memory engine guardrails and ranking issue
301d453 Log Gemini signal ranking feedback + three-model consensus
90da381 Log GPT-5.4 signal ranking priorities for Memory Engine
1525912 Larry session 2 handoff - Memory Engine QA'd and fully wired
6f1d230 Moe: add memory ingest to save flow diagrams
bdfd26d Set Memory Engine write permissions per agent
308829c Add universal Memory Engine MCP server to brain repo
c82ec2c Log Moe's field evaluation of Memory Engine
5eebd07 Larry: Memory Engine ingestion in session-end workflows
9a294a5 Add Memory Engine ingestion step to all agent session-end workflows
7c6e95d Add Memory Engine startup directive to Moe handoff
f1140ba Add Memory Engine access instructions to Curly and Shemp handoffs

Relay Hardening (11 commits)

48bd087 Moe: end-of-day wrap
804c6bc Moe: save checkpoint
090df65 Moe: session update
92cc356 Moe: update relay and memory engine diagrams
af5bb9a Moe: clarify compute topology in diagrams
9601280 Moe: prefer AWS for heavy execution
+ 5 agent handoff updates

Research + Content (20 commits)

5d1d86d Larry: session-end - research thread closeout
ac236df Larry: checkpoint - EP01 slides complete
6835a4c Add agency vs in-house positioning split
1145a0a Larry: add "Skill = competence. MCP = clearance." framing
4a2bea5 Larry: add SkillsMP marketplace (66.5K skills)
3f0ea34 Larry: final agent sweep - Riley Brown, NetworkChuck
0bd2d49 Larry: final research enrichment
34b7fb4 Larry: save GPT-5.4 product thesis
e02b959 Larry: enrich product implications
6c695d3 Larry: expand platform comparison - 14 platforms
e90370f Larry: major competitive landscape enrichment
6f01149 Larry: enrich research files with agent findings
f513c28 Larry: deep research - AI skills/automation landscape
b5c6df4 Add post-call-machine project to brain
87ca2a7 Larry: incorporate Moe's refinements to post-call machine
e63637b Larry: post-call machine prompt + terminology fix
+ 4 agent handoff updates

Transcript Pipeline (8 commits)

3be9f9f Shemp: Automated ingestion of 25 transcripts
1ba55b1 Shemp: Fixed sync filter, mined standups, ingested strategy memories
a66713a Curly: ship 47 Superwhisper transcripts + add EOD skill block (#54)
151d9f4 Larry: transcript archival instructions for Curly + Shemp
eadda83 Larry: Curly transcript archival instructions + Shemp priority reorder
cff4220 Larry: Shemp transcript parsing rules
facbfcc Larry: update Shemp handoff - transcript parsing rules
+ 1 Shemp session update

Compute / RAM / Agent Management (8 commits)

b5aa78c Curly: architecture fixes - aws-mini naming, RAM constraints
b1cacb2 Larry: session-end - SOD + Curly RAM fix + Mini cleanup
3b03dfc Larry: fix Curly AWS - use aws-mini not aws-pro
aa3a87b Curly: enforce AWS-first memory guardrails
2c12e02 Larry: Curly RAM constraints - ban parallel subagents
72aaee9 Larry: Matt Cheever email marked as sent
1d35d0e Larry: checkpoint - SOD done, restarting for MCP config
5d6a60c Curly: observations - naming confusion, RAM crashes
Issue Events
# | Title | Status
58 | Mine product standup transcripts - extract commitments, decisions, product vision | Opened
57 | Unified transcript pipeline - voice notes + meetings into git + Memory Engine | Opened
56 | Add Codex (Joe) to Mac Mini alongside Curly | Opened
55 | SOD: Thread titles should include date and context | Opened
54 | Curly: Ship Superwhisper transcripts to Pro at EOD | Closed
53 | Memory Engine: add provenance + signal ranking | Opened
52 | Add Memory Engine to architecture diagrams | Opened
51 | Cowork.ai: 56K sessions collected, zero processing | Opened (pre-existing)
47-50 | Live view issues (thread summaries, UI density, fronts language, data sources) | Opened (pre-existing)
Top Keystroke Samples
21:03 — Codex
"yeah named you 3.22.35 so I know you are sod, named that you orchestration tree and next is sales demo right? I there is a thread called 'five landing pages' that knows a ton about this..."
23:18 — Codex
"we REALLY need to stop calling this claude dispatch and call it relay"
20:54 — Codex
"also you log me out and then the password never works. I dunno why you keep logging me out. I would be down to remove the password layer for today"
20:43 — Terminal (to Shemp)
"can you make it so you automate pulling the right transcripts and I never have to run you from terminal to do it? that possible?"
19:58 — Claude
"you should be using sub agents to avoid drift in this thread and I dont see that"
04:04 — Codex
"I feel like there is other shit like that. Also skills can have MCP inside of them, connections should be MCP and MCP comes before skills because you can attach multiple MCPs to a skill"
App Usage Breakdown
App | Sessions | Minutes | Keystrokes
Claude | 252 | 627 | 11,089 chars
Codex | 283 | 200 | 13,677 chars
Google Chrome | 2,541 | 151 | 830 chars
Universal Control2,4916571 chars
OpenWork | 74 | 64 | 301 chars
UserNotificationCenter16820 chars
ChatGPT | 28 | 13 | 609 chars
Terminal3510247 chars
Finder3955 chars
Other (10 apps)1624 chars

Note: Universal Control sessions (2,491) reflect Mac Mini ↔ MacBook Pro cursor/keyboard sharing for Curly interaction. Chrome sessions (2,541) include research browsing and YouTube review for competitive analysis.

Generated by Larry [AI] — Claude Code (Opus 4.6) on MacBook Pro
Data sources: Cowork.ai telemetry, Oura Ring API, Superwhisper transcripts, git log (Scottpedia0/brain), GitHub Issues, agent handoffs
March 22, 2026 — 1:00 AM to 10:13 PM PT