OptimizeHER — Knowledge Graph Architecture

The Problem: Why Hardcoded Logic Is Brittle
The Proto-Graph Template: What Each Column Means
The Supabase Schema: Translating Excel to Database
The Composite Wellness Index: Your North Star Metric
The Outcome Capture Layer: How the Graph Learns
The Refactored Edge Function: Before vs. After
The Moat: Why This Becomes Defensible at Scale

Part 1 · The Problem

Why Hardcoded Logic Is Brittle

Your current Movement Edge Function works. It correctly fetches Oura data, classifies workouts, computes metrics, runs a two-pass intervention selection algorithm, and feeds a precisely engineered prompt to GPT-5 mini. This is excellent engineering.

The problem is architectural, not functional. Everything that makes the app intelligent (the intervention thresholds, the evidence briefs, the activity classifications, the voice rules) lives inside a single TypeScript file. This creates four compounding risks as OptimizeHER scales:

🔒 IP Lives in Code

Your proprietary female-specific physiology rules are embedded in a .ts file. If a competitor sees your codebase, they see your entire clinical logic. If that logic lives in a structured database instead, it becomes a distinct, versioned, ownable IP asset.

⚡ Every Update = Redeploy

When a new meta-analysis changes a threshold, you edit TypeScript, test, and redeploy. With a database-backed graph, you update a single row. The Edge Function queries it on the next call. Zero code changes.

🧱 Pillars Are Silos

Movement doesn't know what Vitality is doing. Sleep doesn't know Movement is recommending HIIT. The hardcoded architecture makes cross-pillar intelligence architecturally impossible without deeply nested, brittle logic.

📊 No Learning Loop

Right now, the app makes recommendations but never asks: did they work? There is no mechanism to capture outcomes against the triggers that fired them. The app is permanently static.

The Core Insight: You are not just building a wellness app. You are building an AI Factory. The Knowledge Graph is the factory floor — the structured intelligence layer that separates your clinical logic from your application code, allows it to evolve without redeploys, and compounds in value with every user interaction.

Part 2 · The Proto-Graph Template

What Each Column Means

The Excel template you received is a Proto-Graph — a human-readable representation of your Knowledge Graph that can be directly seeded into Supabase as a CSV. Every row is one node in the graph. Every column is a property of that node. Here is exactly what each column does and why it exists.

Column	What It Stores	Why It Exists
`intervention_id`	Unique snake_case identifier (e.g., `movement_resistance`)	Primary key, stored as `id` in `kg_interventions`. Links this row to evidence briefs, outcomes, and triples tables. Every other table in the graph references this ID.
`pillar`	Which wellness domain owns this intervention	Enables cross-pillar queries. When the Vitality Edge Function fires, it can query interventions from the Movement pillar as force multipliers.
`priority`	Integer 1–7 (1 = highest clinical priority)	Replaces the hardcoded array order in your current INTERVENTIONS array. Change a priority in the database — the selection algorithm automatically re-orders.
`theme`	User-facing strategy name	Feeds directly into the monthly strategy card UI. No code change needed to update copy.
`primary_objective`	Single sentence describing the goal	Feeds GPT-5 mini's context string as the "what this strategy is about" anchor.
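`trigger_conditions_json`	JSONB object of metric thresholds	This is the rules engine. The Edge Function reads this column and compares each threshold to the user's metrics. PostgreSQL's `@>` containment operator can pre-filter rows on exact key/value matches, but comparison strings such as "< 3" have to be evaluated by the function itself (or by a SQL helper). Adding a new trigger condition = adding a key to this JSON. No TypeScript changes, provided the comparison routine is generic.
`willingness_keywords`	Comma-separated keywords from onboarding (stored as a `TEXT[]` array in `kg_interventions`)	Replaces the `checkWillingness()` function. If the user's onboarding answers contain any of these keywords, willingness = true.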
`evidence_grade`	Apex / High / Moderate / Emerging	Used by the rules engine as a confidence signal. High-evidence interventions are preferred when two interventions have equal priority scores. Also surfaces in the investor data room as proof of clinical rigor.
`evidence_summary`	The full evidence brief text	Replaces the hardcoded EVIDENCE_BRIEFS object. Fetched dynamically from Supabase and passed directly to GPT-5 mini. Update the science in the database — GPT immediately uses the new brief.
`women_specific_angle`	Female-specific physiological context	The most defensible column in the table. This is the IP. The synthesis of female-specific neuroendocrine and hormonal context that no generic wellness app has mapped. Fed to GPT as a secondary context layer to ensure recommendations feel built for women, not adapted for them.
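
As a rough illustration of the `willingness_keywords` rule in the table above, the check could reduce to a case-insensitive substring match. The helper name `isWilling` and the shape of the onboarding answers (a flat list of strings) are assumptions made for this sketch; it is not the existing `checkWillingness()` implementation.

```typescript
// Hypothetical helper: true if any onboarding answer mentions any keyword.
// Keywords are snake_case in the graph (e.g. 'weight_training'), so underscores
// are treated as spaces before matching free-text answers.
function isWilling(keywords: string[], onboardingAnswers: string[]): boolean {
  const normalize = (s: string): string => s.toLowerCase().replace(/_/g, " ").trim();
  const answers = onboardingAnswers.map(normalize);
  return keywords
    .map(normalize)
    .filter((k) => k.length > 0) // ignore empty keywords so they never match everything
    .some((k) => answers.some((a) => a.includes(k)));
}

const willing = isWilling(
  ["strength", "weight_training"],
  ["I would like to try weight training twice a week"],
);
```

Substring matching is deliberately loose here; a stricter variant could match on whole words if short keywords turn out to produce false positives.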

The key insight about trigger_conditions_json: In your current code, thresholds like resistanceSessionsPerWeek >= 3 are hardcoded inside an evaluate() function. In the graph, the trigger for the same rule lives as data: {"resistance_sessions_per_week": "< 3"}. Because the comparison is stored as a string, a plain SQL filter cannot evaluate it on its own; the Edge Function fetches the row and runs one generic comparison routine instead of a nested if/else block per intervention. To change a threshold (say, new research shows 2 sessions is sufficient) you change one database value. No redeploy.

Part 3 · The Supabase Schema

Translating Excel to Database

The Excel template seeds directly into a set of Supabase (PostgreSQL) tables. Here is the complete schema for the Knowledge Graph layer — three core tables that together replace all hardcoded logic in the current Edge Function.

Table 1: kg_interventions

This is the master interventions registry. It replaces the INTERVENTIONS array and the EVIDENCE_BRIEFS object in a single table.

CREATE TABLE kg_interventions (
  id                    TEXT PRIMARY KEY,        -- 'movement_resistance'
  pillar                TEXT NOT NULL,            -- 'movement', 'vitality', 'sleep', 'nutrition'
  priority              INTEGER NOT NULL,         -- 1 = highest clinical priority
  theme                 TEXT NOT NULL,            -- 'Build Your Strength Foundation'
  primary_objective     TEXT NOT NULL,
  threshold_optimal     JSONB,                    -- {"resistance_sessions_per_week": ">= 3"}
  threshold_acceptable  JSONB,                    -- {"resistance_sessions_per_week": ">= 1"}
  threshold_necessary   JSONB,                    -- {"resistance_sessions_per_week": "< 1"}
  willingness_keywords  TEXT[],                   -- ['strength', 'weight_training']
  evidence_summary      TEXT,                     -- full evidence brief → fed to GPT
  women_specific_angle  TEXT,                     -- female-specific physiology context
  evidence_grade        TEXT,                     -- 'apex', 'high', 'moderate', 'emerging'
  is_active             BOOLEAN DEFAULT true,
  created_at            TIMESTAMPTZ DEFAULT now(),
  updated_at            TIMESTAMPTZ DEFAULT now()
);

Table 2: kg_activity_classifications

This replaces the hardcoded ACTIVITY_KEYWORDS object. To add "barre" as a mobility activity, you insert one row. Zero code changes.

CREATE TABLE kg_activity_classifications (
  keyword               TEXT PRIMARY KEY,         -- 'crossfit', 'barre', 'swimming'
  category              TEXT NOT NULL,            -- 'resistance', 'cardio', 'mobility', 'hiit'
  pillar                TEXT DEFAULT 'movement',
  created_at            TIMESTAMPTZ DEFAULT now()
);

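-- Seed data (replaces ACTIVITY_KEYWORDS object).
-- Columns are named explicitly so created_at falls back to its default:
INSERT INTO kg_activity_classifications (keyword, category, pillar) VALUES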
  ('workout', 'resistance', 'movement'),
  ('strength', 'resistance', 'movement'),
  ('weight_training', 'resistance', 'movement'),
  ('running', 'cardio', 'movement'),
  ('cycling', 'cardio', 'movement'),
  ('swimming', 'cardio', 'movement'),
  ('yoga', 'mobility', 'movement'),
  ('pilates', 'mobility', 'movement'),
  ('hiit', 'hiit', 'movement'),
  ('walking', 'non_exercise', 'movement');

Table 3: kg_triples

This is the graph layer that makes OptimizeHER a Knowledge Graph and not just a lookup table. Every causal or relational claim in your clinical logic gets stored here as a Subject → Predicate → Object triple. This is your IP audit trail and your compounding moat.

CREATE TABLE kg_triples (
  id                    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  subject_entity        TEXT NOT NULL,    -- 'cycle_phase:luteal'
  predicate             TEXT NOT NULL,    -- 'increases_demand_for'
  object_entity         TEXT NOT NULL,    -- 'entity:magnesium'
  intervention_id       TEXT REFERENCES kg_interventions(id),
  pillar                TEXT,
  evidence_grade        TEXT,
  source_citation       TEXT,             -- 'Balban et al., Cell Reports Medicine 2023'
  confidence_score      NUMERIC DEFAULT 1.0,  -- updated by outcome data over time
  is_active             BOOLEAN DEFAULT true,
  created_at            TIMESTAMPTZ DEFAULT now()
);

-- Example triples for the movement pillar
-- (evidence_grade uses the lowercase values listed for kg_interventions):
INSERT INTO kg_triples 
  (subject_entity, predicate, object_entity, intervention_id, evidence_grade, source_citation)
VALUES
  ('cycle_phase:follicular', 'optimizes', 'protocol:resistance_training', 
   'movement_resistance', 'high', 'Sung et al. 2014'),
  ('cycle_phase:luteal', 'reduces_tolerance_for', 'protocol:hiit', 
   'movement_hiit', 'moderate', 'Markofski & Braun 2014'),
  ('metric:stress_high_minutes', 'is_reduced_by', 'protocol:movement_active_minutes', 
   'movement_active_minutes', 'high', 'Caplin et al. 2021');
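
To make the Subject → Predicate → Object idea concrete, here is a minimal sketch of how an Edge Function might look up triples once the rows are in memory. The `Triple` interface mirrors a subset of the `kg_triples` columns; the helper `objectsFor` and the sample rows are illustrative assumptions, not part of the existing codebase.

```typescript
// Mirrors a subset of the kg_triples columns.
interface Triple {
  subject_entity: string;
  predicate: string;
  object_entity: string;
  confidence_score: number;
  is_active: boolean;
}

// Hypothetical helper: every active object linked to a subject by a predicate,
// ordered by confidence_score, highest first.
function objectsFor(triples: Triple[], subject: string, predicate: string): string[] {
  return triples
    .filter((t) => t.is_active && t.subject_entity === subject && t.predicate === predicate)
    .sort((a, b) => b.confidence_score - a.confidence_score)
    .map((t) => t.object_entity);
}

const sampleTriples: Triple[] = [
  { subject_entity: "cycle_phase:luteal", predicate: "reduces_tolerance_for",
    object_entity: "protocol:hiit", confidence_score: 1.0, is_active: true },
  { subject_entity: "cycle_phase:follicular", predicate: "optimizes",
    object_entity: "protocol:resistance_training", confidence_score: 1.0, is_active: true },
];

// Which protocols should be down-weighted in the luteal phase?
const lutealCautions = objectsFor(sampleTriples, "cycle_phase:luteal", "reduces_tolerance_for");
// lutealCautions is ["protocol:hiit"]
```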

Part 4 · The Composite Wellness Index

Your North Star Metric

The Composite Wellness Index (CWI) is OptimizeHER's proprietary score. It is not Oura's Readiness score, Resilience score, or Sleep score — it is your synthesis of those signals, weighted specifically for professional women and designed to be device-agnostic from day one.

Why device-agnostic matters: Oura measures HRV, sleep staging, and skin temperature. Apple Watch measures heart rate, activity, and blood oxygen. Garmin measures stress and VO2max. The variables are different, but the underlying constructs are the same: recovery quality, stress load, movement sufficiency, and sleep quality. By building your CWI on universal constructs — not device-specific API fields — you ensure that adding Apple Watch or Garmin support in V2 requires a new data adapter, not a new scoring model.
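
One way the adapter boundary could look is sketched below: each device gets a small function that maps its raw payload onto the universal constructs, and the scoring model only ever sees the constructs. The type names, the `OuraDaily` field names (borrowed from the columns used in `calculate_cwi`), and the neutral fallback of 50 are assumptions for illustration only.

```typescript
// Universal constructs consumed by the scoring model, each on a 0-100 scale.
interface ConstructInputs {
  recoveryQuality: number;
  stressLoad: number;
  movementSufficiency: number;
}

// Each device contributes one adapter from its raw payload to the constructs.
type DeviceAdapter<Raw> = (raw: Raw) => ConstructInputs;

// Hypothetical Oura payload shape.
interface OuraDaily {
  resilience_score: number;
  sleep_score: number;
  stress_high_minutes: number;
  recovery_high_minutes: number;
  steps: number;
}

const clamp = (x: number): number => Math.min(100, Math.max(0, x));

const ouraAdapter: DeviceAdapter<OuraDaily> = (raw) => {
  const totalMinutes = raw.stress_high_minutes + raw.recovery_high_minutes;
  return {
    recoveryQuality: clamp(raw.resilience_score * 0.6 + raw.sleep_score * 0.4),
    // Assumption: neutral 50 when no stress or recovery minutes were recorded.
    stressLoad: totalMinutes === 0 ? 50 : clamp((raw.recovery_high_minutes / totalMinutes) * 100),
    movementSufficiency: clamp(raw.steps / 100), // 10,000 steps = 100
  };
};

const constructs = ouraAdapter({
  resilience_score: 80, sleep_score: 70,
  stress_high_minutes: 60, recovery_high_minutes: 60, steps: 8000,
});
```

Under this design, Apple Watch or Garmin support would mean writing another `DeviceAdapter` for that device's payload while the weights and the composite formula stay untouched.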

The Four Constructs

Recovery Quality (30%)

How well did the body repair overnight? Includes sleep score, HRV trend, resting HR. The highest-weighted construct because it is the most direct proxy for allostatic load — the central mechanism OptimizeHER is trying to optimize.

Oura source: Resilience score, sleep score

Device-agnostic proxy: "Recovery Quality Index"

Stress Load (25%)

How much physiological stress did the body carry today? Includes stress_high minutes, recovery_high minutes, stress/recovery ratio.

Oura source: daily_stress → stress_high, recovery_high

Device-agnostic proxy: "Stress Load Index"

Movement Sufficiency (25%)

Did the body move enough today? Steps, exercise minutes, intensity. Weighted equally with Stress Load because movement is the most reliable behavioral lever for stress reduction.

Oura source: steps, activity_score

Device-agnostic proxy: "Movement Index"

Protocol Adherence (20%)

Did she follow her active strategy today? Self-reported via evening check-in. The lowest-weighted construct because biometric outcomes matter more than compliance — but adherence predicts long-term trajectory.

Source: App check-in (pillar-agnostic)

Device-agnostic proxy: "Adherence Index"

The Calculation

-- Supabase function: calculate_cwi(user_id, date)
CREATE OR REPLACE FUNCTION calculate_cwi(p_user_id UUID, p_date DATE)
RETURNS NUMERIC AS $$
DECLARE
  v_recovery_quality  NUMERIC;
  v_stress_load       NUMERIC;
  v_movement          NUMERIC;
  v_adherence         NUMERIC;
  v_cwi               NUMERIC;
  v_oura              RECORD;
  v_checkin           RECORD;
BEGIN
  -- Fetch Oura metrics for this date
  SELECT * INTO v_oura
  FROM daily_oura_metrics
  WHERE user_id = p_user_id AND date = p_date;

  -- Fetch self-reported check-in
  SELECT * INTO v_checkin
  FROM daily_checkins
  WHERE user_id = p_user_id AND date = p_date;

  -- CONSTRUCT 1: Recovery Quality (0-100)
  -- Weighted blend of resilience_score and sleep_score, both treated as 0-100 inputs
  v_recovery_quality := (
    COALESCE(v_oura.resilience_score, 50) * 0.6 +
    COALESCE(v_oura.sleep_score, 50) * 0.4
  );

  -- CONSTRUCT 2: Stress Load (0-100, inverted: higher stress = lower score)
  -- score = recovery_high / (stress_high + recovery_high) * 100
  -- The NUMERIC cast avoids integer division if the minute columns are integers.
  -- If both minute counts are zero, fall back to a neutral 50 (an assumed default)
  -- so a NULL here cannot turn the whole CWI into NULL.
  v_stress_load := COALESCE(
    COALESCE(v_oura.recovery_high_minutes, 30)::NUMERIC /
    NULLIF(COALESCE(v_oura.stress_high_minutes, 60) +
           COALESCE(v_oura.recovery_high_minutes, 30), 0) * 100,
    50
  );

  -- CONSTRUCT 3: Movement Sufficiency (0-100)
  -- Steps normalized to 10,000 target (100 = optimal)
  v_movement := LEAST(
    COALESCE(v_oura.steps, 0) / 100.0,   -- 10,000 steps = 100
    100
  );

  -- CONSTRUCT 4: Protocol Adherence (0 or 100 — binary for now)
  v_adherence := CASE
    WHEN COALESCE(v_checkin.protocol_completed, false) THEN 100
    ELSE 0
  END;

  -- COMPOSITE WELLNESS INDEX (weighted sum)
  v_cwi := (
    v_recovery_quality * 0.30 +
    v_stress_load      * 0.25 +
    v_movement         * 0.25 +
    v_adherence        * 0.20
  );

  RETURN ROUND(v_cwi, 1);
END;
$$ LANGUAGE plpgsql;

Why a Supabase function, not TypeScript? The CWI calculation runs in the database, not the Edge Function. This means it can be called from any pillar, any Edge Function, any future analytics query — without duplicating logic. It is the single source of truth for the score.
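
For checking the arithmetic by hand, the weighted sum can be mirrored in a few lines of TypeScript. This is only a worked example (for instance as a unit-test fixture for the SQL function), not a second implementation to ship; `weightedCwi` and the sample day are hypothetical.

```typescript
// Weights from "The Four Constructs"; together they sum to 1.
const WEIGHTS = { recovery: 0.30, stress: 0.25, movement: 0.25, adherence: 0.20 };

interface ConstructScores {
  recovery: number;
  stress: number;
  movement: number;
  adherence: number;
}

// Mirrors the final weighted sum in calculate_cwi, rounded to one decimal place.
function weightedCwi(s: ConstructScores): number {
  const raw =
    s.recovery * WEIGHTS.recovery +
    s.stress * WEIGHTS.stress +
    s.movement * WEIGHTS.movement +
    s.adherence * WEIGHTS.adherence;
  return Math.round(raw * 10) / 10;
}

// Sample day: resilience 80 and sleep 70 give recovery 76; 30 recovery minutes out of
// 90 stress-plus-recovery minutes give stress 33.3; 8,000 steps give movement 80;
// a completed protocol gives adherence 100.
const exampleCwi = weightedCwi({ recovery: 76, stress: (100 * 30) / 90, movement: 80, adherence: 100 });
// 22.8 + 8.33 + 20 + 20 = 71.13, which rounds to 71.1
```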

Part 5 · Outcome Capture

How the Graph Learns

Every intervention the rules engine selects is a hypothesis: "Given this user's current state, this protocol should improve her CWI." Outcome capture is the mechanism that tests each hypothesis against reality.

The Outcome Table

CREATE TABLE kg_outcomes (
  id                    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id               UUID REFERENCES profiles(id),
  intervention_id       TEXT REFERENCES kg_interventions(id),
  trigger_date          DATE,
  -- Full behavioral state at trigger (the confounders)
  trigger_state         JSONB,  
  -- e.g., {"cycle_phase": "luteal", "stress_level": "high",
  --        "movement_mins": 0, "sleep_score": 71}
  
  -- CWI trajectory
  cwi_at_trigger        NUMERIC,   -- CWI on day intervention was selected
  cwi_day_3             NUMERIC,   -- CWI 3 days later
  cwi_day_7             NUMERIC,   -- CWI 7 days later
  cwi_day_14            NUMERIC,   -- CWI 14 days later
  
  -- Pillar-specific outcome metrics
  oura_resilience_before NUMERIC,
  oura_resilience_after  NUMERIC,  -- captured at day 7
  oura_stress_before     NUMERIC,
  oura_stress_after      NUMERIC,
  
  -- Self-reported signals
  user_completed        BOOLEAN DEFAULT false,
  user_rating           INTEGER,   -- 1-5 evening check-in
  
  created_at            TIMESTAMPTZ DEFAULT now()
);
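
The table above needs something to fill in `cwi_day_3`, `cwi_day_7`, and `cwi_day_14` after the fact. A minimal sketch of that bookkeeping, assuming a daily job that runs once per day, is shown below; `checkpointsDue` and `daysBetween` are hypothetical helpers, and a production job would likely also backfill checkpoints it missed.

```typescript
// Follow-up offsets matching the cwi_day_3 / cwi_day_7 / cwi_day_14 columns.
const CHECKPOINT_DAYS = [3, 7, 14] as const;
type CheckpointColumn = "cwi_day_3" | "cwi_day_7" | "cwi_day_14";

// Whole days between two ISO dates (YYYY-MM-DD), computed in UTC to avoid DST drift.
function daysBetween(fromIso: string, toIso: string): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const from = Date.parse(`${fromIso}T00:00:00Z`);
  const to = Date.parse(`${toIso}T00:00:00Z`);
  return Math.round((to - from) / msPerDay);
}

// Hypothetical helper: which CWI columns should be written for an outcome row today.
function checkpointsDue(triggerDate: string, today: string): CheckpointColumn[] {
  const elapsed = daysBetween(triggerDate, today);
  return CHECKPOINT_DAYS
    .filter((d) => d === elapsed)
    .map((d) => `cwi_day_${d}` as CheckpointColumn);
}

const dueToday = checkpointsDue("2026-03-01", "2026-03-08");
// dueToday is ["cwi_day_7"]
```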

The Learning Query

Once you have enough outcome rows, this query answers: "Which interventions, in which behavioral states, most reliably improve CWI?"

-- Pattern detection query — run weekly
SELECT
  intervention_id,
  trigger_state->>'cycle_phase'                    AS cycle_phase,
  CASE 
    WHEN (trigger_state->>'movement_mins')::INT > 30 
    THEN 'active' ELSE 'sedentary' 
  END                                              AS movement_day,
  AVG(cwi_day_7 - cwi_at_trigger)                  AS avg_cwi_delta,
  AVG(oura_resilience_after - oura_resilience_before) AS avg_resilience_delta,
  COUNT(*)                                          AS observations
FROM kg_outcomes
WHERE user_completed = true
  AND cwi_day_7 IS NOT NULL   -- skip outcomes that have not reached day 7 yet
GROUP BY 1, 2, 3
HAVING COUNT(*) > 50
ORDER BY avg_cwi_delta DESC;

This query produces a ranked table of intervention effectiveness by behavioral context. When a pattern emerges — say, movement_resistance in the Follicular phase with active movement produces 3x the CWI delta versus Luteal + sedentary — you update the kg_triples table with a new triple:

INSERT INTO kg_triples 
  (subject_entity, predicate, object_entity, intervention_id, 
   confidence_score, source_citation)
VALUES
  ('behavioral_state:follicular_active', 
   'amplifies_effect_of', 
   'protocol:movement_resistance',
   'movement_resistance',
   0.87,   -- empirical confidence score from your outcome data
   'OptimizeHER Outcome Data, n=312, 2026');

🐋 The Pod Commander Reality Check

This is the moment your graph stops being a hypothesis and starts being empirical truth. When you can tell an investor: "Our Knowledge Graph contains 500 clinician-built triples from published research, and 200 empirically validated triples discovered from our own outcome data across 1,000 users", that is a conversation no competitor can have. The published 500 triples are defensible. The proprietary 200 are irreplaceable.

Part 6 · The Refactored Edge Function

Before vs. After

Here is the exact structural change to your Movement Edge Function. The logic is identical — only the source of the data changes from hardcoded TypeScript to database queries.

Section 2: Activity Classification — Before vs. After

// BEFORE: Hardcoded in TypeScript
const ACTIVITY_KEYWORDS = {
  resistance: ['workout', 'strength', 'weight_training', ...],
  cardio: ['running', 'cycling', 'swimming', ...],
}

function classifyActivity(activityType: string): string {
  const lower = (activityType || '').toLowerCase()
  for (const [category, keywords] of Object.entries(ACTIVITY_KEYWORDS)) {
    if (keywords.some(k => lower.includes(k))) return category
  }
  return 'other'
}

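// AFTER: Query the graph
const { data: classifications, error: classificationError } = await supabase
  .from('kg_activity_classifications')
  .select('keyword, category')
// Fail loudly rather than silently classifying every activity as 'other'
if (classificationError) throw classificationError

// Build lookup entries from the DB result, longest keyword first, so the most
// specific keyword wins regardless of the order the rows come back in
const classificationEntries = (classifications || [])
  .map(c => [c.keyword, c.category] as [string, string])
  .sort((a, b) => b[0].length - a[0].length)

function classifyActivity(activityType: string): string {
  const lower = (activityType || '').toLowerCase()
  for (const [keyword, category] of classificationEntries) {
    if (lower.includes(keyword)) return category
  }
  return 'other'
}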

Section 3: Rules Engine — Before vs. After

// BEFORE: Hardcoded INTERVENTIONS array with evaluate() functions
const INTERVENTIONS = [
  {
    id: 'resistance_training',
    priority: 1,
    evaluate: (data) => {
      if (data.resistanceSessionsPerWeek >= 3) return 'optimal'
      if (data.resistanceSessionsPerWeek >= 1) return 'acceptable'
      return 'necessary'
    },
    ...
  }
]

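// AFTER: Fetch from graph, evaluate against thresholds
let query = supabase
  .from('kg_interventions')
  .select('*')
  .eq('pillar', 'movement')
  .eq('is_active', true)

// PostgREST in-lists take the form (a,b,c); values are double-quoted, not
// single-quoted. The filter is skipped entirely when nothing is excluded.
if (excluded_interventions.length > 0) {
  query = query.not('id', 'in', `(${excluded_interventions.map(i => `"${i}"`).join(',')})`)
}

const { data: interventions, error: interventionsError } = await query
  .order('priority', { ascending: true })
if (interventionsError) throw interventionsError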

function evaluateLevel(intervention, movementData): 'optimal' | 'acceptable' | 'necessary' {
  // Parse threshold_necessary JSONB and compare to movementData
  // Returns the current assessment level
  const necessary = intervention.threshold_necessary || {}
  const acceptable = intervention.threshold_acceptable || {}
  // Dynamic threshold comparison replaces hardcoded evaluate() functions
  // Full implementation provided in companion TypeScript snippet
  return 'necessary' // simplified for brevity
}
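
One way to fill in the simplified `evaluateLevel` stub above is sketched here. It assumes the threshold columns hold objects like `{"resistance_sessions_per_week": ">= 3"}` (as in the schema comments) and that metric keys in the data match those JSON keys; the helpers `meets` and `meetsAll` and the `InterventionRow` type are hypothetical names introduced for this sketch.

```typescript
type Level = "optimal" | "acceptable" | "necessary";
type Thresholds = Record<string, string>; // e.g. { resistance_sessions_per_week: ">= 3" }

interface InterventionRow {
  threshold_optimal: Thresholds | null;
  threshold_acceptable: Thresholds | null;
}

// Evaluate one condition string such as ">= 3" against a metric value.
function meets(value: number, condition: string): boolean {
  const match = /^\s*(>=|<=|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$/.exec(condition);
  if (!match) return false; // unparseable conditions never match
  const limit = Number(match[2]);
  switch (match[1]) {
    case ">=": return value >= limit;
    case "<=": return value <= limit;
    case ">": return value > limit;
    case "<": return value < limit;
    default: return value === limit;
  }
}

// True only when the threshold object is non-empty and every key is present and satisfied.
function meetsAll(thresholds: Thresholds | null, metrics: Record<string, number>): boolean {
  const entries = Object.entries(thresholds ?? {});
  return entries.length > 0 &&
    entries.every(([key, cond]) => key in metrics && meets(metrics[key], cond));
}

// Check the best level first and fall through to 'necessary', like the original evaluate().
function evaluateLevel(row: InterventionRow, metrics: Record<string, number>): Level {
  if (meetsAll(row.threshold_optimal, metrics)) return "optimal";
  if (meetsAll(row.threshold_acceptable, metrics)) return "acceptable";
  return "necessary";
}

const resistance: InterventionRow = {
  threshold_optimal: { resistance_sessions_per_week: ">= 3" },
  threshold_acceptable: { resistance_sessions_per_week: ">= 1" },
};
const level = evaluateLevel(resistance, { resistance_sessions_per_week: 2 });
// level is "acceptable"
```

Treating missing metrics and unparseable conditions as "not met" means bad data pushes an intervention toward 'necessary' rather than hiding it; whether that is the right default is a product decision.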

Section 4: Evidence Brief — Before vs. After

// BEFORE: Hardcoded EVIDENCE_BRIEFS object (500+ lines)
const EVIDENCE_BRIEFS = {
  resistance_training: `EVIDENCE BRIEF: Resistance Training...`,
  weekly_exercise_minutes: `EVIDENCE BRIEF: Weekly Exercise Minutes...`,
  // ... 5 more massive hardcoded strings
}

// AFTER: Evidence brief comes from the graph row already fetched
const evidenceBrief = selectedIntervention.evidence_summary
const womenAngle = selectedIntervention.women_specific_angle

const userMessage = `${evidenceBrief}

WOMEN-SPECIFIC ANGLE:
${womenAngle}

ABOUT THIS USER:
${dataContext}
...`

Net result: Your Edge Function goes from ~600 lines to approximately 200 lines. The logic is identical. The intelligence is now in the database, where it can be versioned, updated without redeploys, and queried across pillars.

Part 7 · The Moat

Why This Becomes Defensible at Scale

When you present OptimizeHER at Demo Day on April 22nd, this architecture gives you four defensible claims that no competitor in the room can match:

1. Proprietary IP, Not Just Code

The Knowledge Graph is a structured, versioned, ownable asset. It is not a TypeScript file — it is a database of clinician-validated, female-specific physiology rules that constitutes intellectual property independent of the application layer. This is the difference between "we built a wellness app" and "we own a female biometric intelligence system."

2. Device-Agnostic by Design

The Composite Wellness Index is built on universal constructs (Recovery Quality, Stress Load, Movement Sufficiency, Adherence) — not Oura API fields. Adding Apple Watch or Garmin in V2 requires a new data adapter pointed at the same scoring function. The graph doesn't change. The moat deepens with every device integration.

3. A Self-Improving Intelligence Layer

At 1,000 daily users, your outcome tables generate approximately 1.35 million data points over 90 days, which implies roughly 15 captured values per user per day. The pattern detection queries surface intervention efficacy by behavioral state: empirical patterns that published studies are unlikely to contain, because none has had this specific data. Your graph discovers new triples. Those triples are proprietary. The graph compounds.

4. Cross-Pillar Synchronicity

When Sleep, Nutrition, and Vitality Edge Functions all query the same Knowledge Graph, the app begins to act as a single coordinated intelligence rather than four separate recommendation engines. A Vitality intervention can check Movement data. A Nutrition recommendation can cross-reference Sleep consistency. This cross-pillar awareness is the feature that justifies the platform premium over point solutions.

The AI Factory Thesis (Iansiti & Lakhani)

The AI Factory argument is that the only durable competitive advantage in software is a data flywheel: the app generates data → the data improves the algorithm → the better algorithm attracts more users → more users generate more data. OptimizeHER's Knowledge Graph is the factory floor. The CWI is the quality control metric. The outcome capture layer is the feedback loop that closes the flywheel.

User Interaction → Outcome Capture → Pattern Detection → Graph Update → Better Recommendations → More Users

OptimizeHER · Mike · Vanderbilt Owen MBA 2026 · AI-Accelerated Entrepreneurship Practicum
Generated by IAM Mira · vanderbot.com
