RealSkin — Knowledge Architecture & Risk Analysis

Section 1 — Entity Ontology

Five core entity classes form the nodes of the RealSkin Knowledge Graph. Every piece of data must belong to one of these classes.

Biological · The User

UserProfile

UserID
SkinType
FitzpatrickScale
PrimaryConcern
SensitivityLevel
AgeRange
HormonalPhase
ClimateZone

Chemical · The Product

ProductProfile

ProductID
Brand
Category
ActiveIngredient
Concentration
InactiveIngredient
pHLevel
Texture

Clinical · The Science

ClinicalNode

DermatologistID
ClinicalStudy
Contraindication
VerifiedRoutine
EfficacyRating
DermSpecialization
ConflictRule

Environmental · The Context

EnvironmentNode

ClimateZone
HumidityLevel
UVIndex
HardWaterIndex (planned for V2)
Season
PollutionIndex
ZipCode

Experiential · The Community

ReviewNode

ReviewID
MatchScore
UsageDuration
EfficacyOutcome
AdverseReaction
PurchaseVerified
DisclosureFlag
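The rule that every piece of data must belong to one of the five classes can be checked mechanically. The sketch below is one possible encoding, not the production schema: the attribute names are copied from the lists above, while the dictionary layout and the helper names `owning_classes` and `validate` are assumptions made for illustration.

```python
# Attribute names come from the ontology lists above.
# The dict layout and helper names are hypothetical.
ONTOLOGY = {
    "UserProfile": {"UserID", "SkinType", "FitzpatrickScale", "PrimaryConcern",
                    "SensitivityLevel", "AgeRange", "HormonalPhase", "ClimateZone"},
    "ProductProfile": {"ProductID", "Brand", "Category", "ActiveIngredient",
                       "Concentration", "InactiveIngredient", "pHLevel", "Texture"},
    "ClinicalNode": {"DermatologistID", "ClinicalStudy", "Contraindication",
                     "VerifiedRoutine", "EfficacyRating", "DermSpecialization",
                     "ConflictRule"},
    "EnvironmentNode": {"ClimateZone", "HumidityLevel", "UVIndex", "HardWaterIndex",
                        "Season", "PollutionIndex", "ZipCode"},
    "ReviewNode": {"ReviewID", "MatchScore", "UsageDuration", "EfficacyOutcome",
                   "AdverseReaction", "PurchaseVerified", "DisclosureFlag"},
}


def owning_classes(attribute: str) -> list[str]:
    """Return every entity class that defines the given attribute."""
    return [name for name, attrs in ONTOLOGY.items() if attribute in attrs]


def validate(record: dict, entity_class: str) -> list[str]:
    """Return the attributes in `record` that the class does not define."""
    return sorted(set(record) - ONTOLOGY[entity_class])
```

Note that ClimateZone appears in both UserProfile and EnvironmentNode, so `owning_classes("ClimateZone")` returns two classes; a real schema would probably model it as an edge between the two nodes.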

Section 2 — Predicates & Logic Rules

The edges of the graph. These are the verbs that connect entities and power the Match Algorithm. Every rule here is a constraint on the AI's reasoning.

User ↔ Product Rules

Subject	Predicate	Object	Rule / Note
User	has_sensitivity_to	Fragrance	IF Product contains_ingredient Fragrance → MatchScore penalty −40%
User	has_concern	Hyperpigmentation	Boost products where treats_concern = Hyperpigmentation by +20%
User	lives_in	High_Humidity	Penalise heavy cream textures; boost gel and water-based textures
User	is_in_phase	Luteal_Phase	Increase sebum_risk flag; recommend BHA over AHA during this window
Product	conflicts_with	Product	Retinol conflicts_with AHA/BHA same-night use → routine safety warning
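A minimal sketch of how the quantified rules in this table could be applied. It assumes the percentages are relative multipliers on a base MatchScore between 0 and 1 and that the result is capped at 1.0; the table does not say either. The humidity and Luteal_Phase rules are left out because the table gives them no magnitudes. `User`, `Product`, `adjust_match_score`, and `routine_warnings` are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class User:
    sensitivities: set = field(default_factory=set)
    concerns: set = field(default_factory=set)


@dataclass
class Product:
    ingredients: set = field(default_factory=set)
    treats: set = field(default_factory=set)


# Same-night conflicts from the table: Retinol with AHA or BHA.
CONFLICTS = {frozenset({"Retinol", "AHA"}), frozenset({"Retinol", "BHA"})}


def adjust_match_score(base: float, user: User, product: Product) -> float:
    """Apply the fragrance penalty and the hyperpigmentation boost."""
    score = base
    if "Fragrance" in user.sensitivities and "Fragrance" in product.ingredients:
        score *= 0.60  # penalty of 40% (assumed to be relative)
    if "Hyperpigmentation" in user.concerns and "Hyperpigmentation" in product.treats:
        score *= 1.20  # boost of 20% (assumed to be relative)
    return min(score, 1.0)  # cap is an assumption


def routine_warnings(actives_tonight: set) -> list:
    """Return each pair of actives that conflicts in a same-night routine."""
    pairs = combinations(sorted(actives_tonight), 2)
    return [pair for pair in pairs if frozenset(pair) in CONFLICTS]
```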

User ↔ User Rules (The Match Engine)

Subject	Predicate	Object	Rule / Note
User A	has_SkinIQ_similarity	User B	Weighted: SkinType 25%, Concern 20%, Fitzpatrick 20%, Sensitivity 15%, Climate 10%, Texture 10%
User B	reports_success_with	Product X	IF similarity(A,B) > 85% THEN Product X → recommended_to User A
Review	has_usage_duration	Under_14_Days	Programmatically filtered from efficacy score for actives (Retinol, AHA, BHA, Vitamin C)
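The weighted similarity and the 85% threshold can be sketched as follows. The weights are the ones in the table; scoring each attribute as an exact match (full weight or zero) is an assumption, since the table does not define per-attribute similarity. Weights are kept as integer percentages to avoid floating-point drift at the threshold. `similarity_percent` and `should_recommend` are hypothetical names.

```python
# Weights from the table, as integer percentages (they sum to 100).
WEIGHTS = {
    "skin_type": 25,
    "concern": 20,
    "fitzpatrick": 20,
    "sensitivity": 15,
    "climate": 10,
    "texture": 10,
}
THRESHOLD = 85  # recommend only when similarity is strictly above 85%


def similarity_percent(a: dict, b: dict) -> int:
    """Sum the weights of the attributes on which both profiles agree."""
    return sum(w for key, w in WEIGHTS.items() if a.get(key) == b.get(key))


def should_recommend(a: dict, b: dict) -> bool:
    """True if a product that worked for B should be recommended to A."""
    return similarity_percent(a, b) > THRESHOLD
```

Under this exact-match assumption, two profiles that differ only in Sensitivity score exactly 85% and do not pass, because the rule uses a strict greater-than.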

Section 3 — Core Knowledge Triples

A sample from the 100-triple domain Knowledge Graph. Each triple is a machine-readable fact: Subject → Predicate → Object. This is how the AI knows what it knows.

001 Ameesha → has_SkinType → Combination-Oily
002 Ameesha → lives_in → Humid_Climate
003 Ameesha → has_concern → Hyperpigmentation
004 Ameesha → has_concern → Hormonal_Acne
005 Ameesha → has_sensitivity_to → Fragrance
006 Ameesha → uses_product → Paula_Choice_2%_BHA
007 Ameesha → reports_outcome → Reduced_Breakouts
008 Ameesha → logs_cycle_phase → Luteal_Phase
009 Luteal_Phase → correlates_with → Increased_Sebum
010 Increased_Sebum → exacerbates → Clogged_Pores
011 Paula_Choice_2%_BHA → contains_active → Salicylic_Acid
012 Salicylic_Acid → treats_concern → Clogged_Pores
013 Salicylic_Acid → requires_companion → SPF_50
014 Salicylic_Acid → conflicts_with → Retinol_Same_Night
015 Retinol → requires_sun_avoidance → True
016 Retinol → requires_usage_duration_for_efficacy → 28_Days_Minimum
017 Niacinamide → treats_concern → Hyperpigmentation
018 Niacinamide → treats_concern → Enlarged_Pores
019 Niacinamide → compatible_with → Retinol
020 Niacinamide → suitable_for_SkinType → All_Skin_Types
021 Vitamin_C → treats_concern → Hyperpigmentation
022 Vitamin_C → degrades_in → Light_Exposure
023 Vitamin_C → requires_companion → SPF_50
024 SPF_50 → critical_for_concern → Hyperpigmentation
025 SPF_50 → required_if_using → Any_Active_Ingredient
026 High_Humidity → exacerbates → Oiliness
027 High_Humidity → makes_unsuitable → Heavy_Cream_Texture
028 High_Humidity → makes_suitable → Gel_Moisturizer
029 Fitzpatrick_IV → higher_risk_of → Post-Inflammatory_Hyperpigmentation
030 Fitzpatrick_IV → requires_higher_priority → SPF_Daily_Use
031 Dr_Riya_Mehta → specialises_in → South_Asian_Skin
032 Dr_Riya_Mehta → verifies_efficacy_of → Salicylic_Acid_for_Luteal_Acne
033 Community_Review_Priya → has_SkinIQ_match_to → Ameesha: 92%
034 Community_Review_Priya → reports_success_with → Paula_Choice_2%_BHA
035 Community_Review_Priya → usage_duration → 6_Weeks
036 Review_Under_14_Days → excluded_from → Efficacy_Score_For_Actives
037 CeraVe_Hydrating_Cleanser → suitable_for → Sensitive_Skin
038 CeraVe_Hydrating_Cleanser → contains_active → Ceramides
039 Ceramides → repairs → Skin_Barrier
040 Compromised_Barrier → contraindicated_with → High_Concentration_Retinol
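To show how triples like these can answer a question, the sketch below stores a handful of them as Python tuples and follows two hops: from a user to the products they use, to the actives those products contain, to the concerns those actives treat. The in-memory list and the helper names `objects` and `concerns_treated_for` are assumptions; a production graph would more likely sit in a graph database.

```python
# A subset of the triples above, as (subject, predicate, object) tuples.
TRIPLES = [
    ("Ameesha", "uses_product", "Paula_Choice_2%_BHA"),
    ("Ameesha", "has_concern", "Hyperpigmentation"),
    ("Ameesha", "logs_cycle_phase", "Luteal_Phase"),
    ("Luteal_Phase", "correlates_with", "Increased_Sebum"),
    ("Increased_Sebum", "exacerbates", "Clogged_Pores"),
    ("Paula_Choice_2%_BHA", "contains_active", "Salicylic_Acid"),
    ("Salicylic_Acid", "treats_concern", "Clogged_Pores"),
    ("Niacinamide", "treats_concern", "Hyperpigmentation"),
]


def objects(subject: str, predicate: str) -> list:
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def concerns_treated_for(user: str) -> set:
    """Concerns addressed by the actives in the products a user already uses."""
    treated = set()
    for product in objects(user, "uses_product"):
        for active in objects(product, "contains_active"):
            treated.update(objects(active, "treats_concern"))
    return treated
```

With this subset, the query for Ameesha returns Clogged_Pores but not Hyperpigmentation, which is the kind of gap the match engine could fill by suggesting a product containing Niacinamide.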

Section 4 — What We Don't Know (Domain Risk Analysis)

A Beaver doesn't just build the dam — they calculate where it will break first. These are the structural blind spots in our Knowledge Graph that must be actively mitigated.

Risk 1 — The Self-Reporting Illusion

High Risk

Assumption being challenged: Users can accurately self-identify their own skin type and concerns.

What We Don't Know

Users are notoriously poor at self-diagnosing their skin. A user may report "dry skin" when they actually have a skin barrier compromised by over-exfoliation. A user may report "oily skin" when they have dehydrated skin that overproduces oil to compensate. If the SkinIQ profile is built on inaccurate self-reported inputs, every downstream match inherits the error.

✅ Mitigation: Replace diagnostic labels with behavioral questions. Instead of "Do you have oily skin?" ask "How does your skin feel 2 hours after washing with no products applied?" This captures observed behavior rather than self-diagnosis, which should make the inputs more accurate.

Risk 2 — The Formulation Opacity Gap

High Risk

Assumption being challenged: Ingredient lists are sufficient to predict product efficacy for a given SkinIQ profile.

What We Don't Know

Two serums both labeled "10% Niacinamide" can behave entirely differently depending on their base formulation, delivery system, and inactive ingredient interactions. A Knowledge Graph built purely on active ingredient data will produce false matches. Brands also frequently reformulate products without updating their marketing, so the product a reviewer used 18 months ago may not be the formulation available today.

✅ Mitigation: Weight the Experiential Graph (community efficacy data) more heavily than the Chemical Graph (ingredient lists) in the match algorithm. Real-world outcomes from matched-skin users are more reliable than theoretical predictions based on ingredients alone. Tag each review with the product batch or year to track formulation drift.
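One way to express this mitigation is a blended score in which the experiential weight must exceed one half, plus a filter that keeps only reviews tagged with the current formulation. The 0.7 default weight, the `formulation_year` field, and both function names are placeholders chosen for this sketch; the actual weighting has not been specified.

```python
def blended_score(experiential: float, chemical: float,
                  w_experiential: float = 0.7) -> float:
    """Blend two scores in [0, 1], always favoring community outcomes.

    The 0.7 default is a placeholder, not a tuned value.
    """
    if not 0.5 < w_experiential <= 1.0:
        raise ValueError("experiential weight must exceed the chemical weight")
    return w_experiential * experiential + (1.0 - w_experiential) * chemical


def reviews_for_formulation(reviews: list, current_year: int) -> list:
    """Keep only reviews tagged with the formulation year sold today.

    `formulation_year` is a hypothetical tag standing in for batch/year.
    """
    return [r for r in reviews if r.get("formulation_year") == current_year]
```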

Risk 3 — The Lag-Time Variable

Medium Risk

Assumption being challenged: Reviews will be written after sufficient product usage to reflect true efficacy.

What We Don't Know

Active ingredients like Retinol, Vitamin C, and BHA typically require 4–8 weeks of consistent use before meaningful results appear. However, platforms consistently see peak review volume in the first 3–7 days of use, before an active ingredient has had time to work. If early reviews dominate our match scoring, we will systematically mislead users about product efficacy.

✅ Mitigation: All reviews must carry a <UsageDuration> tag. Reviews of active ingredients written after fewer than 14 days of use are programmatically excluded from the efficacy score. Reviews written after fewer than 3 days of use carry a mandatory disclaimer: "Too early to assess actives." Reward long-term reviews with GlowPoints bonuses for reviews written after 8 or more weeks of use.
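The three duration thresholds can be sketched as a single function. The thresholds (14 days, 3 days, 8 weeks) and the list of actives come from this document; applying the 3-day disclaimer to every review regardless of ingredient is one reading of the rule, and `ReviewHandling` and `handle_review` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

ACTIVES = {"Retinol", "AHA", "BHA", "Vitamin C"}
MIN_DAYS_FOR_ACTIVES = 14   # below this, reviews of actives are not scored
DISCLAIMER_BELOW_DAYS = 3   # below this, the disclaimer is mandatory
BONUS_FROM_DAYS = 56        # 8 weeks or more earns a GlowPoints bonus


@dataclass
class ReviewHandling:
    counts_toward_efficacy: bool
    disclaimer: Optional[str]
    glowpoints_bonus: bool


def handle_review(usage_days: int, ingredient: str) -> ReviewHandling:
    """Decide how a review is treated based on its UsageDuration tag."""
    too_early = ingredient in ACTIVES and usage_days < MIN_DAYS_FOR_ACTIVES
    disclaimer = ("Too early to assess actives."
                  if usage_days < DISCLAIMER_BELOW_DAYS else None)
    return ReviewHandling(
        counts_toward_efficacy=not too_early,
        disclaimer=disclaimer,
        glowpoints_bonus=usage_days >= BONUS_FROM_DAYS,
    )
```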

Risk 4 — The Epigenetic & Water Blindspot

Medium Risk

Assumption being challenged: Climate zone (humidity + UV) captures sufficient environmental context.

What We Don't Know

Tap water hardness varies widely across zip codes and can affect skin barrier function and cleanser performance. Hard water (high calcium/magnesium) leaves mineral deposits on skin that can disrupt pH and exacerbate eczema and acne. We currently track humidity and UV but not water hardness, so two users in the same "Humid_Climate" zone may experience products completely differently depending on their municipal water supply.

✅ Mitigation: V2 will cross-reference user zip codes with publicly available municipal water hardness databases (EPA Water Quality Reports) to add a <HardWaterIndex> node to the Environmental Graph. This enrichment requires no user effort and should improve match precision.
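A minimal sketch of the zip-code enrichment, assuming hardness arrives as milligrams per liter of calcium carbonate keyed by zip code. The band boundaries follow the general USGS hardness classification; the lookup table format and the function names are assumptions, and the zip codes in the usage note are placeholders, not real measurements.

```python
from typing import Optional

# USGS general hardness classes, in mg/L as calcium carbonate:
# 0-60 soft, 61-120 moderately hard, 121-180 hard, above 180 very hard.
BANDS = [(60, "soft"), (120, "moderately_hard"), (180, "hard")]


def hardness_band(mg_per_l: float) -> str:
    """Map a hardness reading to a coarse band."""
    for upper_bound, label in BANDS:
        if mg_per_l <= upper_bound:
            return label
    return "very_hard"


def hard_water_index(zip_code: str, hardness_by_zip: dict) -> Optional[str]:
    """Return the band for a zip code, or None when no data is available."""
    reading = hardness_by_zip.get(zip_code)
    return None if reading is None else hardness_band(reading)
```

For example, with a placeholder table `{"00001": 45.0, "00002": 210.0}`, the first zip code maps to `soft`, the second to `very_hard`, and an unknown zip code returns `None` so the match algorithm can fall back to climate data alone.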

Risk 5 — The Cold Start Problem

Medium Risk

Assumption being challenged: The match algorithm will be useful from Day 1 with a small user base.

What We Don't Know

The Skin Match Algorithm only works when there are enough users with similar SkinIQ profiles to generate statistically meaningful matches. For rare skin combinations (e.g., Fitzpatrick VI, dry skin, eczema, cold climate), early users may find no relevant matches, which produces the exact frustration we are trying to solve. This risks churning our most underserved users first.

✅ Mitigation: Pre-seed the Knowledge Graph with curated derm-verified content for underrepresented skin types before launch. For any profile where match density is under threshold, fall back to dermatologist-verified clinical recommendations rather than community reviews. Users see "Derm Recommended — building your community match" as a transparent placeholder.
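The fallback can be sketched as a simple density check. The threshold value of 25 matched reviewers is a placeholder, since no threshold has been set, and `RecommendationSource` and `pick_source` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RecommendationSource:
    source: str             # "community" or "derm_verified"
    show_placeholder: bool  # True when the transparent placeholder is shown


def pick_source(matched_reviewers: int, min_density: int = 25) -> RecommendationSource:
    """Use community matches only when enough similar users exist.

    `min_density` is a placeholder value, not a validated threshold.
    """
    if matched_reviewers >= min_density:
        return RecommendationSource("community", show_placeholder=False)
    return RecommendationSource("derm_verified", show_placeholder=True)
```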

Risk 6 — The Adverse Event Liability Gap

Managed Risk

Assumption being challenged: Community reviews can recommend products without clinical validation and without triggering medical liability.

What We Don't Know

RealSkin occupies a regulatory grey zone. We are not a medical device, but we are surfacing health-correlated insights (cycle, sleep, stress patterns). If a user follows a skin-matched recommendation, has an adverse reaction, and claims RealSkin's algorithm was responsible, we face reputational and potential legal risk. This is especially acute for Tretinoin (Rx) and prescription-grade actives appearing in routines.

✅ Mitigation: All insights carry a mandatory "Not medical advice" disclosure. Prescription items are flagged with "Rx Required" and are never recommended without a verified derm consultation on the platform. Community reviews cannot recommend Rx products. Legal counsel will review all clinical claim language before launch.
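These gating rules can be sketched as a single check applied before any recommendation is shown. The label strings come from the mitigation above; the function name, the `source` values, and the returned dictionary shape are assumptions made for illustration.

```python
DISCLOSURE = "Not medical advice"
RX_LABEL = "Rx Required"


def gate_recommendation(is_rx: bool, source: str,
                        derm_consult_verified: bool) -> dict:
    """Decide whether a recommendation may be shown, and with which labels.

    `source` is "community" or "derm" (hypothetical values).
    """
    labels = [DISCLOSURE]  # every insight carries the disclosure
    if is_rx:
        labels.append(RX_LABEL)

    if is_rx and source == "community":
        allowed = False  # community reviews cannot recommend Rx products
    elif is_rx and not derm_consult_verified:
        allowed = False  # Rx needs a verified derm consultation
    else:
        allowed = True
    return {"allowed": allowed, "labels": labels}
```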