Knowledge Architecture & Risk Analysis

Session 4 & 5 Deliverables · RealSkin Data Moat Strategy

1. Knowledge Graph Architecture

Core Entities (Nodes)

| Entity | Attributes (Properties) | Validation Source |
|---|---|---|
| User (SkinIQ) | ID, Skin_Type, Fitzpatrick_Scale, Sensitivity, Primary_Concern, Hormonal_Pattern, Climate_Geo | User Input + Apple/Google Health API |
| Product | ID, Brand, Name, Category, Price, Active_Ingredients, Formulation_Type | Brand APIs + INCI Database |
| Ingredient | INCI_Name, Comedogenic_Rating, Irritation_Level, Function, Contraindications | Clinical/Dermatology Journals |
| Review | ID, User_ID, Product_ID, Star_Rating, Text, Time_Used, Verified_Purchase | Platform Generated |
| Dermatologist | ID, Name, Board_Cert_Number, Specialties, Clinic_Location | Medical Board API (NPI Database) |

Semantic Triples (Relationships)

(User_A) --[HAS_PROFILE]--> (SkinIQ_Profile)
(SkinIQ_Profile) --[LIVES_IN]--> (Climate_Humid)
(User_A) --[WROTE]--> (Review_123)
(Review_123) --[RATES]--> (Product_BHA)
(Product_BHA) --[CONTAINS]--> (Ingredient_SalicylicAcid)
(Ingredient_SalicylicAcid) --[CONTRAINDICATES]--> (Ingredient_Retinol)
(Dermatologist_DrMehta) --[VERIFIED]--> (SkinIQ_Profile)
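The triples above can be sketched as a minimal in-memory store. This is an illustrative Python sketch, not the production design; the set-of-tuples storage and the `objects` helper are assumptions standing in for a real graph database (e.g., Neo4j).

```python
# Minimal triple-store sketch of the RealSkin knowledge graph.
# Entity and relation names come from the semantic triples above;
# the storage layer (a set of tuples) is a hypothetical stand-in
# for a real graph database.

triples = {
    ("User_A", "HAS_PROFILE", "SkinIQ_Profile"),
    ("SkinIQ_Profile", "LIVES_IN", "Climate_Humid"),
    ("User_A", "WROTE", "Review_123"),
    ("Review_123", "RATES", "Product_BHA"),
    ("Product_BHA", "CONTAINS", "Ingredient_SalicylicAcid"),
    ("Ingredient_SalicylicAcid", "CONTRAINDICATES", "Ingredient_Retinol"),
    ("Dermatologist_DrMehta", "VERIFIED", "SkinIQ_Profile"),
}

def objects(subject: str, relation: str) -> set[str]:
    """Follow one edge type out of a node."""
    return {o for s, r, o in triples if s == subject and r == relation}

# Traversal example: which ingredients are in the products User_A reviewed?
reviewed = {p for r in objects("User_A", "WROTE") for p in objects(r, "RATES")}
ingredients = {i for p in reviewed for i in objects(p, "CONTAINS")}
print(ingredients)  # {'Ingredient_SalicylicAcid'}
```

The same two-hop traversal (User → Review → Product → Ingredient) is what lets match and safety logic reason over a user's history rather than over free text.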

2. "What We Don't Know" — AI & Domain Risk Analysis

The Risk Register (Failure Tree)

| Failure Mode | Severity | Trigger / "What We Don't Know" | Mitigation / Kill Criteria |
|---|---|---|---|
| Cold Start Graph Poisoning | HIGH | Without initial users, the match algorithm has no data. Do we simulate reviews? If we use AI to scrape/generate initial reviews, we violate our "Authenticity" pillar. | Mitigation: Launch in private beta to 500 Owen/Vanderbilt students. Manually seed the database with real humans. Kill Criteria: If we generate fake reviews, the brand dies. |
| Medical Hallucination (Routine Generator) | HIGH | The LLM Routine Generator hallucinates and tells a user to layer 20% AHA with 0.1% Tretinoin, causing severe chemical burns. | Mitigation: A hard-coded rule engine intercepts LLM output. The Generative Graph cannot output a routine without checking the [CONTRAINDICATES] edges in the Knowledge Graph. |
| Marketplace Contamination | MED | SecondSkin users sell expired or contaminated products, causing infections. | Mitigation: Only allow resale of products in "pump" packaging or sealed containers; ban jar packaging from the marketplace. Implement batch-code verification. |
| Health Data Privacy Liability | HIGH | Apple Health data (menstrual cycles, sleep) is highly sensitive post-Roe v. Wade. | Mitigation: Zero-knowledge architecture. Cycle tracking stays on-device; the platform receives only anonymous hash flags (e.g., Phase_3), never dates or medical records. |
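The rule-engine intercept for the medical-hallucination case can be sketched as a pairwise check against the [CONTRAINDICATES] edges. This is a hypothetical sketch: the `CONTRAINDICATES` set, function names, and the second example pair are illustrative assumptions, not the shipped rule set.

```python
# Sketch of the hard-coded rule engine that intercepts LLM routine output.
# In production, these pairs would be read from the dermatologist-verified
# [CONTRAINDICATES] edges of the Knowledge Graph, not hard-coded here.

CONTRAINDICATES = {
    frozenset({"Ingredient_SalicylicAcid", "Ingredient_Retinol"}),
    # Assumed additional pair for illustration (AHA + retinoid layering):
    frozenset({"Ingredient_GlycolicAcid", "Ingredient_Retinol"}),
}

def validate_routine(ingredients: list[str]) -> list[str]:
    """Return every contraindicated pair found in an LLM-generated routine."""
    conflicts = []
    for i, a in enumerate(ingredients):
        for b in ingredients[i + 1:]:
            if frozenset({a, b}) in CONTRAINDICATES:
                conflicts.append(f"{a} + {b}")
    return conflicts

# An unsafe routine is blocked before it ever reaches the user.
llm_routine = ["Ingredient_SalicylicAcid", "Ingredient_Niacinamide", "Ingredient_Retinol"]
conflicts = validate_routine(llm_routine)
if conflicts:
    print("BLOCKED:", conflicts)
```

The key design point is that this check sits outside the LLM: a deterministic gate the generative layer cannot talk its way around.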

AI Reliability Audit

Where LLMs Fail in Skincare: LLMs are trained on internet consensus. The internet believes "coconut oil is great for skin" because of SEO spam, despite it being highly comedogenic. A standard RAG pipeline over the open web inherits that noise, so RealSkin would give bad advice.

The Fix (Grounded AI): RealSkin's Generative Graph does not query the open web. It uses a strict RAG pipeline grounded only in our internal Ingredient Database (verified by derms) and our internal Review Graph. If a product isn't in our graph, the AI says "I don't know," rather than hallucinating.
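The refusal behavior can be sketched as a grounded lookup step. Everything here is illustrative: the database contents, ratings, and function names are assumptions standing in for the derm-verified Ingredient Database.

```python
# Sketch of the grounded retrieval step: answer only from the internal,
# dermatologist-verified ingredient database; refuse when the query
# falls outside the graph. Records and ratings are illustrative.

INGREDIENT_DB = {
    "Coconut Oil": {"comedogenic_rating": 4,
                    "note": "Highly comedogenic; poor fit for acne-prone skin."},
    "Salicylic Acid": {"comedogenic_rating": 0,
                       "note": "BHA exfoliant; well studied at OTC strengths."},
}

def grounded_answer(ingredient: str) -> str:
    record = INGREDIENT_DB.get(ingredient)
    if record is None:
        # Not in our verified graph: refuse rather than hallucinate.
        return "I don't know — this ingredient isn't in our verified database yet."
    return (f"{ingredient}: comedogenic rating "
            f"{record['comedogenic_rating']}/5. {record['note']}")

print(grounded_answer("Coconut Oil"))   # grounded answer from the internal DB
print(grounded_answer("Snail Mucin"))   # falls back to "I don't know"
```

Note the asymmetry with open-web RAG: here the retrieval corpus is closed, so a missing record produces a refusal instead of a plausible-sounding guess.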