Agentic Architecture Blueprint
Target: Replicating a tool-calling, autonomous AI architecture for your ETA Platform and LEDGER.
1. The Cognitive Engine (Local Routing)
You need an LLM to act as the "router" — reading user input and deciding whether to chat, query a database, or execute a tool. For LEDGER, this must be local.
- Tech Stack: Llama.cpp or MLX (for Apple Silicon).
- Model: Llama-3-8B-Instruct or Mistral-Nemo. Fast enough for real-time game dialogue, capable of JSON function calling.
- Mechanic: System prompts strictly define what the model *can* do, paired with "Structured Outputs" (forcing the model to reply in valid JSON).
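A minimal sketch of the routing mechanic, assuming the model has been prompted to reply only in JSON (the action names and the `parse_route` helper are illustrative, not a llama.cpp/MLX API):

```python
import json

# Hypothetical router contract: the system prompt instructs the model to
# reply ONLY with JSON naming one of these actions.
ALLOWED_ACTIONS = {"chat", "query_db", "call_tool"}

SYSTEM_PROMPT = (
    "You are a router. Reply ONLY with JSON: "
    '{"action": "chat" | "query_db" | "call_tool", "args": {...}}'
)

def parse_route(raw_reply: str) -> dict:
    """Validate the model's structured output; fall back to chat on failure."""
    try:
        route = json.loads(raw_reply)
        if route.get("action") in ALLOWED_ACTIONS:
            return route
    except json.JSONDecodeError:
        pass
    # Malformed output: degrade gracefully to plain chat.
    return {"action": "chat", "args": {"text": raw_reply}}

# Simulated model replies. In a real local setup you would additionally
# constrain decoding (e.g. a JSON grammar) so the model *cannot* emit
# anything but valid JSON.
print(parse_route('{"action": "query_db", "args": {"table": "transactions"}}'))
print(parse_route("Sure, happy to help!"))  # falls back to chat
```

The fallback branch matters in practice: small local models occasionally break format, and a router that crashes on bad JSON is worse than one that defaults to chat.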
2. The Knowledge Graph (RAG + Graph DB)
The AI needs context. For your ETA firm acquisitions, flat tables fail. You need a graph.
- Tech Stack: Neo4j (Graph DB) + Pinecone/Weaviate (Vector DB for text).
- Mechanic: When a user asks a question, the system runs a similarity search over the RAG database (GAAP rules, tax codes) AND traverses the Graph DB (client-to-entity-to-transaction relationships).
- Injection: The retrieved context is silently injected into the LLM's prompt before it generates an answer.
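The two retrieval legs can be sketched in miniature with in-memory stand-ins (toy hand-made embeddings and an adjacency list replace the real Pinecone/Weaviate index and Neo4j graph; all names here are illustrative):

```python
import math

# Toy stand-ins for the real stores: a vector index (Pinecone/Weaviate role)
# and an adjacency-list graph (Neo4j role).
VECTOR_INDEX = {
    "GAAP rule 606: revenue recognition": [0.9, 0.1, 0.0],
    "IRS Section 179 deduction limits":   [0.1, 0.9, 0.0],
}
GRAPH = {
    "ClientA": ["HoldCo1"],
    "HoldCo1": ["Txn-2024-017"],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, start_node):
    # 1. Similarity search over the text chunks (GAAP rules, tax codes).
    best_doc = max(VECTOR_INDEX, key=lambda d: cosine(VECTOR_INDEX[d], query_vec))
    # 2. Graph traversal from the entity mentioned in the question
    #    (client -> entity -> transaction).
    path, node = [start_node], start_node
    while GRAPH.get(node):
        node = GRAPH[node][0]
        path.append(node)
    return best_doc, path

doc, path = retrieve([0.85, 0.2, 0.0], "ClientA")
# Both results are injected into the prompt before generation.
context = f"Relevant rule: {doc}\nEntity chain: {' -> '.join(path)}"
print(context)
```

In production the query vector comes from an embedding model and the traversal is a Cypher query, but the shape is the same: two independent lookups merged into one context string.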
3. Tool Calling (The Action Layer)
This is how an AI actually "does" things instead of just talking.
- Definition: You provide the LLM with a JSON schema of the tools it can use (e.g., update_reputation_score(character, amount) or generate_financial_model(inputs)).
- Execution: The LLM pauses its text generation, outputs a JSON tool call, your Python/C# backend intercepts it, runs the actual code, and feeds the result back to the LLM.
- LEDGER Application: The LLM realizes the player insulted a partner. It calls update_reputation(faction="Vanguard", change=-10). Unity intercepts this and updates the UI state.
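A sketch of the interception step on the backend side, assuming the LLM emits its tool call as a JSON message (the registry, dispatch table, and game-state dict are hypothetical; in LEDGER the equivalent handler would live in C#/Unity):

```python
import json

# Illustrative tool registry, mirroring the JSON schema handed to the LLM.
TOOLS = {
    "update_reputation": {
        "description": "Adjust a faction's reputation score.",
        "parameters": {"faction": "string", "change": "integer"},
    },
}

GAME_STATE = {"reputation": {"Vanguard": 50}}

def update_reputation(faction: str, change: int) -> dict:
    GAME_STATE["reputation"][faction] += change
    return {"faction": faction, "new_score": GAME_STATE["reputation"][faction]}

DISPATCH = {"update_reputation": update_reputation}

def handle_model_output(raw: str) -> dict:
    """Intercept a tool call emitted by the LLM, run the real code, and
    return the result that gets fed back into the model's context."""
    msg = json.loads(raw)
    fn = DISPATCH[msg["tool"]]  # unknown tools would raise here; reject them
    return fn(**msg["arguments"])

# The LLM decided the player insulted a partner:
result = handle_model_output(
    '{"tool": "update_reputation", '
    '"arguments": {"faction": "Vanguard", "change": -10}}'
)
print(result)  # {'faction': 'Vanguard', 'new_score': 40}
```

The key design point: the model never mutates state directly. It only emits a request, and your code decides whether and how to execute it.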
4. The Orchestrator
You need a framework to manage the loop between the LLM, the Tools, and the Memory.
- Tech Stack: You can use LangChain or LlamaIndex, but as an operator you're better off writing a custom Python/C# state machine. Frameworks get bloated.
- The Loop: User Input -> Retrieve Context -> Prompt LLM -> LLM decides to use a Tool -> Tool runs -> Result fed back to LLM -> Final response given to User.
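The loop above reduces to a short custom state machine. Here is a minimal sketch with stub functions standing in for the real retriever, local model, and tool layer (all names are hypothetical; the stubbed model "asks" for one tool call, then answers):

```python
def retrieve_context(user_input: str) -> str:
    # Stub for the RAG + graph retrieval described in section 2.
    return f"[context for: {user_input}]"

def call_llm(prompt: str) -> dict:
    # Stub: a real call would hit llama.cpp / MLX with structured outputs.
    # Pretend the model requests a tool on the first turn, answers on the next.
    if "TOOL_RESULT" in prompt:
        return {"type": "answer", "text": "Done: reputation updated."}
    return {"type": "tool_call", "name": "update_reputation",
            "args": {"faction": "Vanguard", "change": -10}}

def run_tool(name: str, args: dict) -> dict:
    # Stub for the dispatch layer described in section 3.
    return {"ok": True}

def orchestrate(user_input: str, max_steps: int = 5) -> str:
    prompt = f"{retrieve_context(user_input)}\nUSER: {user_input}"
    for _ in range(max_steps):  # cap steps so a looping model can't spin forever
        reply = call_llm(prompt)
        if reply["type"] == "answer":
            return reply["text"]
        result = run_tool(reply["name"], reply["args"])
        prompt += f"\nTOOL_RESULT: {result}"  # feed the result back in
    return "Stopped: step limit reached."

print(orchestrate("Lower Vanguard's reputation"))
```

The whole "framework" is one loop with a step cap, which is the argument for writing it yourself: every transition is visible and debuggable, with no framework abstraction in the way.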