Decisions (updated Jun 05, 2026)

Key Decisions — Chapter I

Decision: genai/ imports credentials from os.getenv, not dashboard/constants.py

Why: dashboard/constants.py imports from dashboard/models.py, which pulls in Django ORM. Importing it from genai/ would drag the entire Django stack in — breaking the zero-Django-import rule and making test_llm.py impossible to run without a configured Django environment.

Result: genai/llm_client.py reads GEMINI_API_KEY and MODEL_NAME directly from os.getenv(). Credentials live in .env and are loaded by python-dotenv in the test script.
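A minimal sketch of the environment lookup described above. The helper name `load_llm_settings()` is hypothetical; only the variable names GEMINI_API_KEY and MODEL_NAME come from the decision. It uses the standard library only, so nothing from Django is imported.

```python
import os


def load_llm_settings() -> dict:
    """Read LLM credentials from the environment (hypothetical helper)."""
    api_key = os.getenv("GEMINI_API_KEY")
    model_name = os.getenv("MODEL_NAME")
    missing = [
        name
        for name, value in (("GEMINI_API_KEY", api_key), ("MODEL_NAME", model_name))
        if not value
    ]
    if missing:
        # Fail loudly rather than sending a request with empty credentials.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"api_key": api_key, "model_name": model_name}
```

In a test script, python-dotenv's `load_dotenv()` would typically run before this helper so that values from .env are present in the environment.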

Decision: google-genai SDK (new), not google-generativeai (old)

Why: requirements.txt uses google-genai — the new unified SDK. The old SDK (google-generativeai) used GenerativeModel, start_chat, GenerationConfig. The new SDK uses genai.Client, client.models.generate_content, types.GenerateContentConfig.

Result: llm_client.py uses the new SDK API throughout.

Decision: Registry holds classes, never instances

Why: Agents need fresh state per request. A persistent instance would carry stale conversation history across requests. The registry is a factory — it gives you the blueprint, you build the object with the dependencies you need right now.

Result: AgentRegistry.register() stores the class. get_by_type() returns the class. Caller instantiates.
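The class-not-instance idea can be sketched as follows. The method names register() and get_by_type() come from the result above; their signatures, the internal dict, and the stand-in TeoAgent are assumptions for illustration.

```python
class AgentRegistry:
    """Maps an agent type name to its class (sketch; storage details are assumed)."""

    _classes: dict = {}

    @classmethod
    def register(cls, agent_type: str, agent_class: type) -> None:
        cls._classes[agent_type] = agent_class

    @classmethod
    def get_by_type(cls, agent_type: str) -> type:
        return cls._classes[agent_type]


class TeoAgent:
    """Stand-in agent; the real class takes whatever dependencies it needs."""

    def __init__(self) -> None:
        self.history: list = []  # fresh per instance, so nothing leaks across requests


AgentRegistry.register("teo", TeoAgent)
agent_class = AgentRegistry.get_by_type("teo")  # the class, not an object
first, second = agent_class(), agent_class()    # caller instantiates
first.history.append("hello")                   # does not affect `second`
```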

Decision: agent_service/apps.py ready() registers agents

Why: ready() fires once per Passenger process on startup — after all Django apps are loaded but before any requests are served. It's the correct Django hook for one-time setup that depends on the app registry being ready.

Result: AgentServiceConfig.ready() imports and registers TeoAgent. Future agents (Vega, Dash, Arthur) are added here as each chapter is built.

Decision: Notifications marked read only after greet, not on every request

Why: The first implementation marked notifications read before teo.run(), so if Gemini failed or the greeting didn't mention them, they were silently lost. Marking them read only after a successful __greet__ response ensures Teo actually surfaces them before they disappear.

Result: run_teo() marks notifications read only when user_message == '__greet__'.

Decision: greet exchanges not saved to ConversationMessage

Why: The greeting is a UI trigger, not a real conversation turn. Saving it would pollute the history Teo uses for context — he'd see "the user said greet" as the last thing in the conversation.

Result: _save_messages() is skipped when user_message == '__greet__'.
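The two greet rules (mark notifications read only after a successful greeting, and never persist the greet exchange) could look roughly like this. Everything except the `__greet__` sentinel is hypothetical: the function name, its parameters, and the in-memory lists standing in for the notification and ConversationMessage tables.

```python
GREET = "__greet__"


def run_turn(user_message, generate, notifications, history):
    """Sketch of the greet handling; all parameter names are hypothetical.

    generate: callable returning the agent's reply (may raise)
    notifications: list of dicts with a "read" flag
    history: list standing in for ConversationMessage rows
    """
    reply = generate(user_message)  # if this raises, nothing below runs
    if user_message == GREET:
        # Mark read only after the greeting succeeded, and skip persistence.
        for note in notifications:
            note["read"] = True
        return reply
    history.append(("user", user_message))
    history.append(("agent", reply))
    return reply
```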

Decision: Greeting fires on every page load unconditionally

Why: Early approaches used sessionStorage or checked for existing .msg elements to avoid re-greeting. Both caused problems — sessionStorage prevented greeting after navigation, and the DOM check prevented greeting when history was pre-loaded from DB. The correct behaviour is: always greet on page load. The greeting is never saved to history so it never pollutes context.

Result: greet() fires unconditionally on every page load. Pre-loaded history and greeting coexist correctly.

Decision: ConversationMessage scoped by user for authenticated agents, session for anonymous

Why: Session keys are anonymous browser identifiers — if Tomas uses two devices, he gets two separate conversation histories. Scoping by Django user instead means history follows the authenticated user across all devices. Anonymous agents (Vega visitor chat) still use session keys since there's no user to scope by.

Result: ConversationMessage has both a nullable user FK and a nullable session_key. Service layer uses user when request.user.is_authenticated, session_key otherwise. This applies to all agents, not just Teo.

Decision: _parse_response aggressively strips markdown fences

Why: Gemini occasionally wraps JSON responses in ```json ... ``` fences despite being instructed not to. The original check only stripped fences when the response started with backticks. The fix uses regex to strip any fence variant from both ends unconditionally, and also handles escaped \n sequences in the message field.

Result: _parse_response() in base.py uses re.sub to strip fences before JSON parsing, and applies replace('\\n', '\n') to the extracted message.
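A sketch of the fence stripping and newline unescaping, assuming the model reply is a JSON object with a string `message` field. The function name and the exact patterns are illustrative, not the code in base.py.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example is easy to embed
_LEADING = re.compile(r"^\s*" + re.escape(FENCE) + r"[A-Za-z]*\s*")
_TRAILING = re.compile(r"\s*" + re.escape(FENCE) + r"\s*$")


def parse_response(raw: str) -> dict:
    """Strip any surrounding code fence, parse JSON, unescape newlines."""
    text = _LEADING.sub("", raw)
    text = _TRAILING.sub("", text).strip()
    data = json.loads(text)
    if isinstance(data.get("message"), str):
        # Turn a literal backslash-n pair into a real newline.
        data["message"] = data["message"].replace("\\n", "\n")
    return data
```

Both substitutions run unconditionally, so a reply without fences passes through unchanged.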

Decision: Conversation history pre-loaded in the template on page load

Why: Previously the chat opened empty even when previous messages existed in the DB. Pre-loading history in the Django template GET response means messages are immediately visible without an extra API call.

Result: TeoChatView.get() calls get_conversation_history() and passes it to the template. Messages render as Django template loops before JS executes.

Key Decisions — Chapter II

Decision: Wiki is a Django app with DB-backed pages, not a flat file system

Why: Flat files would require filesystem access for every read, make access control harder, and give agents no structured way to query or update content. A Django app with models gives us queryable, permission-aware, agent-writable pages.

Result: wiki/ Django app with WikiProject and WikiPage models. DB is the single source of truth.

Decision: DB is the source of truth — markdown file is backup only

Why: Having both the file and DB as live sources creates a sync problem. Agents write to the DB. The markdown file is a human-readable weekly backup for disaster recovery.

Result: export_wiki dumps DB → markdown. seed_wiki restores markdown → DB. They are inverses. The file is never edited directly for production use.

Decision: export_wiki uses project/slug format to avoid collisions

Why: Multiple projects can have pages with the same slug (e.g. both t0-d0 and trading-bot have an overview page). Using just slug as the section header would cause the parser to overwrite one with the other.

Result: export_wiki writes ## Page: t0-d0/overview and ## Page: trading-bot/overview. seed_wiki parses project/slug keys and looks them up correctly.
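The export and seed pair can be sketched as two inverse functions over a dict keyed by (project, slug). The `## Page: project/slug` header format is from the result above; the function names and the in-memory dict are assumptions standing in for the management commands and the DB.

```python
import re

HEADER = re.compile(r"^## Page: (?P<project>[^/\s]+)/(?P<slug>\S+)\s*$")


def export_pages(pages: dict) -> str:
    """Render {(project, slug): body} as one markdown document."""
    parts = []
    for (project, slug), body in sorted(pages.items()):
        parts.append(f"## Page: {project}/{slug}\n\n{body.strip()}\n")
    return "\n".join(parts)


def parse_pages(markdown: str) -> dict:
    """Inverse of export_pages: split on page headers, keyed by (project, slug)."""
    pages, key, lines = {}, None, []
    for line in markdown.splitlines():
        match = HEADER.match(line)
        if match:
            if key is not None:
                pages[key] = "\n".join(lines).strip()
            key, lines = (match["project"], match["slug"]), []
        elif key is not None:
            lines.append(line)
    if key is not None:
        pages[key] = "\n".join(lines).strip()
    return pages
```

Because the key includes the project, two pages that share the slug `overview` survive a round trip as separate entries. A page body that itself contains a `## Page:` line would confuse this simple parser.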

Decision: seed_wiki has fallback inline content for pages not yet in the backup file

Why: New pages added to PAGE_MAP won't exist in the backup file until after the first export_wiki run. Without a fallback the seed would skip them silently.

Result: FALLBACK_PAGES dict in seed_wiki.py provides inline content for new pages. Once export_wiki runs, the file takes over as the source.

Decision: Access control in views, not middleware

Why: Middleware would apply the same rule to all wiki pages. Different pages have different visibility — is_public controls per-page anonymous access. The view is the right place to filter.

Result: WikiProjectView and WikiPageView filter queryset by is_public=True for anonymous users. Authenticated users see all published pages. Anonymous users get 404 for non-public pages.

Decision: Teo references wiki URLs directly, does not read content yet

Why: Teo has no ReadWikiTool until Chapter III. Adding wiki-reading capability to Teo before Vega exists would duplicate work and blur the architectural boundary — Teo reads his own section, Vega owns everything else.

Result: Teo's system prompt includes wiki URLs. He can direct Tomas to the right page but cannot pull content. Full wiki reading comes in Chapter III with Vega's tools.

Decision: export_wiki / seed_wiki as backup and restore pair

Why: The DB is the source of truth but needs a human-readable backup. A weekly markdown export gives a second line of defence alongside the full DB backup. If only the wiki is corrupted, seed_wiki restores it from the markdown file without a full DB restore.

Result: export_wiki dumps all published WikiPages to media/wiki/wiki_t0-d0.md, moving the previous file to media/wiki/backup/ with a timestamp. seed_wiki is the inverse — reads the file and creates/updates DB records. Run weekly via cron.

Decision: export_wiki uses project/slug format for page headers

Why: Multiple projects can have pages with the same slug (both t0-d0 and trading-bot have overview). Using slug alone as the section header causes the parser to overwrite one with the other.

Result: export_wiki writes ## Page: t0-d0/overview. seed_wiki parses project/slug keys. No collisions possible across projects.

Decision: markdown filter strips blank lines between table rows

Why: The Django admin on Windows saves content with blank lines between table rows (\n\n between | rows). The Python markdown library treats a blank line as a paragraph break, which breaks table rendering. This affected the status page on the live Linux server but not locally.

Result: wiki_extras.py markdown filter uses regex to strip blank lines between | rows before passing to the markdown parser. Tables render correctly regardless of how content was stored.
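One way to write such a clean-up step, as a sketch: the function name and the pattern are assumptions, and normalising Windows line endings first is an extra step not stated in the decision.

```python
import re

# A line starting with "|" followed by one or more blank lines and another "|" line.
_BLANK_BETWEEN_ROWS = re.compile(
    r"^([ \t]*\|[^\n]*)\n(?:[ \t]*\n)+(?=[ \t]*\|)", re.MULTILINE
)


def tighten_tables(text: str) -> str:
    """Remove blank lines that separate consecutive table rows."""
    text = text.replace("\r\n", "\n")  # assumption: normalise Windows line endings
    return _BLANK_BETWEEN_ROWS.sub(r"\1\n", text)
```

Blank lines between ordinary paragraphs are left alone, because the pattern only matches when both neighbouring lines start with a pipe.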

Decision: Management commands as the operational backbone

Why: Wiki lifecycle operations (seed, export, backup) need to be runnable from the command line without a browser. Management commands are the correct Django pattern — they're scriptable, cronnable, and testable.

Result: Four wiki management commands exist:
- seed_wiki: restore wiki from markdown file (fresh environment or disaster recovery)
- export_wiki: back up wiki DB to markdown file (run weekly)
- add_chapter2_decisions: one-time; appended Chapter II decisions (stub now)
- add_chapter2_decisions_2: one-time; appended remaining Chapter II decisions (stub now)

Cron schedule on live server:

0 0 * * *    python manage.py sync_runs
0 1 * * 0    python manage.py backup_db && python manage.py export_wiki
0 2 * * 0    python manage.py cleanup_conversations --days 14
30 3 1 * *   python manage.py cleanup_notifications --days 30
0 8 * * *    python manage.py run_vega

Key Decisions — Chapter III

Decision: ConversationMessage model refactored — agent_name + scope + role

Why: Old model used user FK and session_key — ambiguous when querying per-agent history for authenticated users. Both Teo and Vega messages had user=tomas, role='user', making it impossible to separate them cleanly. Adding more agents would make it worse.

Result: New fields: agent_name (which agent), scope (user:TomasD or session:abc123), role (user or agent). Composite index on (agent_name, scope, created_at). Generic helpers _get_history() and _save_messages() in service.py shared by all agents. Migration: 0005_remove_conversationmessage_session_key_and_more.
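A small sketch of how the scope string and a per-agent history lookup might work. The `user:` and `session:` prefixes and the field names come from the result above; `build_scope()` and `history_for()` are hypothetical helpers, and the list of dicts stands in for ConversationMessage rows.

```python
from typing import Optional


def build_scope(username: Optional[str], session_key: Optional[str]) -> str:
    """Return "user:<name>" for authenticated users, else "session:<key>"."""
    if username:
        return f"user:{username}"
    if session_key:
        return f"session:{session_key}"
    raise ValueError("Need either a username or a session key")


def history_for(messages: list, agent_name: str, scope: str) -> list:
    """Filter rows the way the (agent_name, scope, created_at) index is used."""
    rows = [m for m in messages if m["agent_name"] == agent_name and m["scope"] == scope]
    return sorted(rows, key=lambda m: m["created_at"])
```

With agent_name in the filter, Teo and Vega messages for the same user no longer collide.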

Decision: Two-request Teo→Vega delegation flow

Why: A single request would make the user wait for both Teo's response and Vega's full round-trip before seeing anything. That's a 5-10 second blank wait.

Result: First request to /api/teo/chat/ returns immediately with Teo's "I'll ask Vega" message and needs_vega=True. JS shows the message, then fires a second request to /api/teo/delegate/ which calls Vega and returns Teo's synthesised final answer. Better UX, same total latency.

Decision: Vega does not create wiki pages from scratch

Why: Vega has no independent source of truth for implementation details. She can only synthesise from what's already documented. Allowing her to create pages would risk hallucinated "documentation".

Result: Vega maintains existing pages only — edits sections, updates status, marks things complete. New pages are seeded from authoritative .md files written by Tomas after each chapter, then Vega takes over maintenance.

Decision: Wiki content injected directly into Vega's system prompt

Why: Using tool calls to fetch wiki content on every question adds a full LLM round-trip for the most common case (answering from known content). Most visitor questions can be answered directly from the injected content.

Result: Full page bodies injected at instantiation via get_vega() in service.py. Tools (ReadWikiTool, SearchWikiTool) remain available for edge cases where content isn't in the prompt. Trade-off: the prompt grows with the wiki. This is addressed in Chapter VI with semantic vector retrieval (pgvector was the plan at this point; Chapter VI chose sqlite-vec instead).

Decision: Session-based rate limiting for anonymous Vega chat

Why: IP-based rate limiting requires persistent cache infrastructure (Redis or file cache). Session-based is simpler, zero config, and sufficient for token cost protection on a personal portfolio site. A bot sophisticated enough to manage sessions isn't the primary threat model.

Result: Rate limit stored in request.session['vega_rl_count'] and request.session['vega_rl_start']. 20 questions per 10-minute fixed window. Resets automatically. Response includes minutes remaining.
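A fixed-window limiter along these lines can be sketched against a plain dict standing in for request.session. The session keys and the 20-per-10-minutes limit come from the result above; the function name, its return shape, and the rounding of "minutes remaining" are assumptions.

```python
import math
import time
from typing import Optional

LIMIT = 20            # questions per window (from the decision above)
WINDOW_SECONDS = 600  # 10-minute fixed window


def check_rate_limit(session: dict, now: Optional[float] = None):
    """Return (allowed, minutes_remaining) and update the session in place."""
    now = time.time() if now is None else now
    start = session.get("vega_rl_start")
    if start is None or now - start >= WINDOW_SECONDS:
        # Window expired (or first question): start a fresh one.
        session["vega_rl_start"] = start = now
        session["vega_rl_count"] = 0
    if session["vega_rl_count"] >= LIMIT:
        remaining = WINDOW_SECONDS - (now - start)
        return False, math.ceil(remaining / 60)
    session["vega_rl_count"] += 1
    return True, 0
```

The window resets on the first request after it expires, so no background job is needed to clear counters.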

Decision: Shared navbar snippet — main/templates/main/_navbar.html

Why: Each page had its own navbar HTML and CSS. Any change had to be made in 4+ places. The snippet approach reduces duplication and enforces consistency.

Result: Pure HTML include, no <style> block in the snippet (CSS lives in each page's <style> to avoid rendering issues). Context variables: nav_wiki, nav_project, nav_page, nav_show_teo. Used on Teo's page and all three wiki templates.

Decision: Homepage card colour system — teal for public, gold for private

Why: Visitors need to instantly understand what they can access. Colour is the fastest signal.

Result: Public cards (NowLink, Vega): teal accent, label, dots, button. Private cards (Teo, Trading Bot): gold/amber (#c9920a) accent. Lock icon on private card buttons for unauthenticated users, arrow for authenticated. Gold chosen over red (alarming) or grey (invisible) — matches the warm wood aesthetic of the background.

Decision: Vega banner above card grid on homepage

Why: Vega is the only thing on the site that public visitors can actually interact with. She deserved a proper introduction before they hit the wiki.

Result: A wood-toned banner above the project cards: Vega's satisfied avatar, her intro text in her own voice, "Follow me to the wiki →" button. Not a card alongside the projects — a meta layer that frames all of them.

Key Decisions — Chapter IV

Decision: DashAgent mood derived from last run outcome, not arbitrary

Why: An agent whose mood is random or arbitrary feels fake. Dash's emotional state should reflect actual trading performance — that's what makes it meaningful.

Result: get_mood() reads the last run outcome. BUY → neutral, SELL → satisfied, no_action/offline → bored, error/failure → stressed. Mood injected into system prompt at instantiation.
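The outcome-to-mood mapping is small enough to show as a lookup table. The pairs are taken from the result above; the helper name and the fallback to "neutral" for unknown outcomes are assumptions.

```python
MOOD_BY_OUTCOME = {
    "BUY": "neutral",
    "SELL": "satisfied",
    "no_action": "bored",
    "offline": "bored",
    "error": "stressed",
    "failure": "stressed",
}


def mood_for(outcome: str) -> str:
    """Map the last run outcome to a mood; unknown outcomes fall back to neutral."""
    return MOOD_BY_OUTCOME.get(outcome, "neutral")
```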

Decision: Dash reports JitoSOL accumulated, not USD profit

Why: The USD value fluctuates with market price and is misleading as a performance metric. JitoSOL accumulated is the actual result of the arbitrage strategy — what we actually earned in token terms.

Result: total_profit_usd removed from Dash's injected data. total_jitosol_accumulated injected instead. All amounts reported to 5 decimal places.

Decision: Jupiter price fetch uses 3-retry loop, no TeoNotification on timeout

Why: Jupiter API occasionally times out — this is normal network noise, not a bot failure. Creating urgent notifications for every timeout would spam Teo's morning briefing with meaningless alerts.

Result: sync_and_trigger retries Jupiter price fetch up to 3 times with 5s delay. On all 3 failures: logs to dash.log, returns silently. No TeoNotification created.
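The retry behaviour can be sketched generically. The three attempts and 5-second delay come from the result above; the function name, the injectable `sleep` parameter (useful in tests), and the logger name are assumptions.

```python
import logging
import time

logger = logging.getLogger("dash")  # assumption: logger name is illustrative


def fetch_with_retry(fetch, attempts: int = 3, delay: float = 5.0, sleep=time.sleep):
    """Call fetch() up to `attempts` times; return None if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # a real client would catch timeout errors only
            logger.warning("Price fetch attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                sleep(delay)
    return None  # caller skips this cycle quietly; no notification is created
```

Returning None instead of raising keeps a transient timeout out of the notification path while still leaving a trace in the log.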

Decision: Heartbeat triggers on whole hour (XX:00), not "60 minutes since last run"

Why: The "60 minutes since last run" approach drifted — if a run happened at 12:01, the heartbeat fired at 13:11, not 13:00. Drift accumulated over time.

Result: Heartbeat fires if current_minute < 10, i.e. when the cron job runs within the 10-minute tolerance window that starts at XX:00. Clean, predictable, no drift.
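The wall-clock check is a one-liner. The function and constant names below are hypothetical, and the check on its own assumes the job runs at most once inside the tolerance window.

```python
from datetime import datetime

TOLERANCE_MINUTES = 10


def heartbeat_due(now: datetime) -> bool:
    """True when `now` falls in the first TOLERANCE_MINUTES of the hour."""
    return now.minute < TOLERANCE_MINUTES
```

Because the decision depends only on the current time, a slow or late previous run cannot shift the next heartbeat.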

Decision: Dash DB query system uses parametric queries with ordering and filtering

Why: Fixed named queries (e.g. last_completed_roundtrip) are inflexible. Dash needs to answer analytical questions like "roundtrip with longest duration" or "last 5 sell runs" without a new named query for each case.

Result: _execute_dash_query() accepts query, order_by, order_dir, limit, filter_* params. Annotated fields (duration, jitosol_profit, swap_price) are applied via queryset manager methods before ordering. Result headers state "showing X of Y matching — there are more" or "complete" so Dash is always transparent about data coverage.

Decision: _parse_response searches for JSON anywhere in the string

Why: Gemini 2.5 Flash with thinking_budget=0 sometimes outputs preamble text before the JSON object (e.g. "I need to check that. { ... }"). The original parser only tried to parse the full string, causing the preamble to become the message and the JSON to leak visibly into the chat.

Result: _parse_response() in base.py now has a second attempt — if full-string JSON parse fails, it uses regex to find the first {...} block anywhere in the string. This correctly extracts the JSON even when Gemini prepends natural language.
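A sketch of the two-step parse, assuming the reply contains at most one JSON object. The function name and the final plain-text fallback are assumptions; the regex here is greedy, spanning from the first `{` to the last `}`.

```python
import json
import re

_JSON_BLOCK = re.compile(r"\{.*\}", re.DOTALL)  # greedy: first "{" to last "}"


def parse_with_fallback(raw: str):
    """Try the whole string as JSON, then fall back to an embedded object."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    match = _JSON_BLOCK.search(raw)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return {"message": raw}  # assumption: plain text becomes the message
```

If replies could contain several objects or trailing braces, `json.JSONDecoder().raw_decode()` starting at the first `{` would be a stricter alternative.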

Decision: Dashboard visual theme — dark glass, gold accents, Share Tech Mono

Why: The dashboard needed a coherent visual identity distinct from Bootstrap defaults. The trading bot aesthetic should feel data-terminal, not generic web app.

Result: Consistent dark glass cards (rgba(20,18,10,0.55-0.92)), gold borders (rgba(201,160,30,0.2-0.4)), navy blue table headers, Share Tech Mono font throughout. BUY rows amber, SELL rows teal, ERROR rows dark red. Applied to all dashboard pages: overview, run list, swap list, roundtrip list, run detail.

Key Decisions — Chapter V

Decision: Per-agent model environment variables

Why: Arthur uses a more capable model (Gemini 3.5 Flash) than other agents (2.5 Flash). Hardcoding model names makes switching expensive. Each agent needs its own model configuration.

Result: .env has TEO_MODEL_NAME, VEGA_MODEL_NAME, DASH_MODEL_NAME, ARTHUR_MODEL_NAME. Each agent class has a model_name class attribute reading from os.getenv(). GeminiClient requires explicit model parameter — no silent fallback.
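A sketch of a per-agent lookup with no silent fallback. The `<AGENT>_MODEL_NAME` variable pattern comes from the result above; the helper name is hypothetical, and the real code reads the value into a class attribute rather than through a function.

```python
import os


def model_name_for(agent: str) -> str:
    """Look up <AGENT>_MODEL_NAME, e.g. TEO_MODEL_NAME; no silent fallback."""
    variable = f"{agent.upper()}_MODEL_NAME"
    value = os.getenv(variable)
    if not value:
        raise RuntimeError(f"{variable} is not set")
    return value
```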

Decision: Concept file AES-256 encrypted per book

Why: Arthur's full concept (plot, character arcs, chapter outline, twists, ending) must never be readable from the filesystem or returned to any caller. Plaintext on disk is a security risk.

Result: Each book gets its own AES-256 key generated at commission time. Key stored in ArthurSecret DB model. Arthur receives key via load_context(). Encrypted file stored as concept.md.enc. No plaintext version exists on disk.

Decision: Chapter locking is always explicit — no auto-lock

Why: An auto-lock timer would lock chapters Tomas hasn't reviewed. The chapter must be read and approved before it's permanent.

Result: Tomas explicitly says "lock chapter N". Only then does the post-lock pass run: summary, chunks, craft notes.

Decision: Background tasks for long-running Arthur operations

Why: Chapter writing takes 1-2 minutes. Holding an HTTP request open that long causes timeouts and prevents navigation.

Result: ArthurTask DB model. Write/rewrite/lock operations start a background thread and return a task_id immediately. Frontend polls GET /api/arthur/task/<id>/ every 4 seconds.
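The start-then-poll pattern can be sketched with a thread and an in-memory dict standing in for the ArthurTask table. All names below are hypothetical, and the thread is returned only so a caller (or test) can wait for it.

```python
import threading
import uuid

TASKS = {}  # in-memory stand-in for the ArthurTask table
_LOCK = threading.Lock()


def start_task(work, *args):
    """Run work(*args) in a background thread; return (task_id, thread)."""
    task_id = uuid.uuid4().hex
    with _LOCK:
        TASKS[task_id] = {"status": "running", "result": None, "error": None}

    def runner():
        try:
            update = {"status": "done", "result": work(*args)}
        except Exception as exc:
            update = {"status": "failed", "error": str(exc)}
        with _LOCK:
            TASKS[task_id].update(update)

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return task_id, thread


def task_status(task_id):
    """What a polling endpoint would return for this task."""
    with _LOCK:
        return dict(TASKS[task_id])
```

The request handler returns the task_id right away, and the polling endpoint only reads the stored status, so no HTTP request stays open for the duration of the work.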

Decision: ChapterFeedback separate from ConversationMessage

Why: Rewrite feedback is operationally distinct from conversation history. Mixing it with general chat made craft note extraction unreliable — required fragile keyword scanning to find relevant messages.

Result: New ChapterFeedback model. Only Tomas's feedback text saved there. Arthur's acknowledgements are UI-only, never persisted. Lock time reads from ChapterFeedback directly.

Decision: Language rules moved from system prompt to craft files

Why: Czech and English literary rules were hardcoded in arthur.py. Editing them required a code deployment. Rules should be editable without touching code.

Result: media/arthur/craft/general/cs.md and en.md contain all language rules. System prompt says "read the general craft file for {language} and follow all rules strictly." Arthur's primary language is English; Czech is fully supported per-book.

Decision: Post-lock LLM calls use neutral system prompts, not Arthur's book-context prompt

Why: Arthur's system prompt requires a loaded book context. Summary, chunk extraction, and craft note calls at lock time have no loaded project — the system prompt caused "No active project" responses, silently producing empty results.

Result: Summary, chunk, and craft note generation use direct GeminiClient calls with neutral system prompts ("You are a literary editor..."). Results are identical in quality without the context dependency.

Decision: export_bookstore / import_bookstore management commands

Why: Moving the bookstore to a new server requires migrating all DB records. Media files can be copied manually, but DB records need a structured export/import pair.

Result: export_bookstore dumps all Series, Book, Chapter, ChapterChunk, ArthurSecret, ChapterFeedback records to JSON. import_bookstore reads it on the target and creates/updates records. --force flag overwrites existing records.

Key Decisions — Chapter VI

Decision: sqlite-vec over a separate vector database

Why: A separate vector DB (pgvector, Chroma, Pinecone) would require additional infrastructure, a separate connection, and a migration away from SQLite. The project is on shared hosting with SQLite. sqlite-vec extends SQLite in-process with zero infrastructure cost.

Result: sqlite-vec installed as a Python package. Vectors live in vec_chunks virtual table inside the existing db.sqlite3 file. No separate service, no migration.

Decision: sqlite-vec loaded via module-level signal handler, not inside ready()

Why: A signal handler defined inside ready() as a local function gets garbage collected in some environments (confirmed on Linux/Passenger), causing the signal to silently stop firing. The symptom was vec_version() working in direct tests but failing via Django.

Result: load_sqlite_vec() defined at module level in agent_service/apps.py. ready() only calls connection_created.connect(load_sqlite_vec). Module-level functions are never garbage collected.
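Django's Signal.connect() keeps receivers as weak references by default (weak=True), which is the likely mechanism behind this symptom. The standard-library demonstration below uses no Django; it only shows how a weak reference to a local function dies once the enclosing function returns, while one to a module-level function stays alive. The function names are illustrative.

```python
import gc
import weakref


def module_level_handler(**kwargs):
    """Lives as long as the module does."""


def connect_local():
    def local_handler(**kwargs):
        """Only referenced from this function's frame."""

    return weakref.ref(local_handler)  # what a weak-referencing registry keeps


local_ref = connect_local()
module_ref = weakref.ref(module_level_handler)
gc.collect()  # make collection explicit on non-refcounting interpreters
```

After this runs, `local_ref()` no longer resolves to a function, while `module_ref()` still does. Passing weak=False to connect() would be another way to keep a local receiver alive.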

Decision: Cross-platform extension path resolution

Why: sqlite_vec.loadable_path() returns the path without file extension. On Linux, load_extension() requires the explicit .so suffix. On Windows it needs .dll. The bare path works on neither platform.

Result: load_sqlite_vec() checks if the bare path exists; if not, tries .so, .dll, .dylib in order. First match wins. Works on Windows local dev and Linux production without any environment-specific configuration.
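The suffix probing can be sketched as below. The candidate order (bare path, .so, .dll, .dylib) follows the result above; the function name and the error on no match are assumptions.

```python
from pathlib import Path

SUFFIXES = ("", ".so", ".dll", ".dylib")  # bare path first, then per-platform suffixes


def resolve_extension_path(base: str) -> str:
    """Return the first existing candidate for a loadable extension path."""
    for suffix in SUFFIXES:
        candidate = Path(base + suffix)
        if candidate.exists():
            return str(candidate)
    raise FileNotFoundError(f"No loadable extension found for {base}")
```

In the real handler, `base` would be the value returned by sqlite_vec.loadable_path().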

Decision: vec_chunks created via post_migrate signal, not a migration

Why: vec_chunks is a sqlite-vec virtual table — it cannot be a Django model and cannot be created via a normal migration. It needs to be created with raw SQL after sqlite-vec is loaded.

Result: init_vec_chunks() defined at module level in apps.py, connected to post_migrate signal. Runs CREATE VIRTUAL TABLE IF NOT EXISTS on every manage.py migrate. Idempotent — no-op after first creation.

Decision: 3072 dimensions, not 768

Why: text-embedding-004 (768 dimensions) is not available in the installed SDK version. Available models are gemini-embedding-001 (3072), gemini-embedding-2-preview, and gemini-embedding-2. At this project scale (hundreds of chunks), the storage cost of 3072 vs 768 is negligible (~7MB per book). Using the model's native output avoids truncation quality loss.

Result: EMBEDDING_DIMENSIONS = 3072 in genai/embeddings.py. vec_chunks schema uses FLOAT[3072]. Each serialized embedding is 12288 bytes.
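The byte count follows from 3072 values at 4 bytes each. A sketch of packing a vector as 32-bit floats, assuming little-endian layout; the function name is hypothetical.

```python
import struct

EMBEDDING_DIMENSIONS = 3072


def serialize_embedding(vector) -> bytes:
    """Pack floats as consecutive little-endian 32-bit values (4 bytes each)."""
    if len(vector) != EMBEDDING_DIMENSIONS:
        raise ValueError(f"Expected {EMBEDDING_DIMENSIONS} values, got {len(vector)}")
    return struct.pack(f"<{EMBEDDING_DIMENSIONS}f", *vector)


blob = serialize_embedding([0.0] * EMBEDDING_DIMENSIONS)
print(len(blob))  # 3072 * 4 = 12288
```

The length check catches a model that returns a different dimensionality before the blob reaches a FLOAT[3072] column.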

Decision: Raw sqlite3 connection for vec_chunks operations

Why: Django's cursor wrapper calls last_executed_query() after every execute, which tries to format the SQL with params using Python's % operator. Binary blob params trigger TypeError: not all arguments converted during string formatting, crashing every insert.

Result: All vec_chunks reads and writes use connection.connection (the raw sqlite3 connection object) directly, bypassing Django's cursor wrapper entirely.

Key Decisions — Chapter VII

Decision: embed_chunks management command, not inline migration

Why: Embedding all existing chunks at migration time would make manage.py migrate block for minutes (one Gemini API call per chunk). A dedicated management command keeps migrations fast and gives explicit control over when embedding runs.

Result: agent_service/management/commands/embed_chunks.py — delta mode by default (skips already-embedded chunks), --force to re-embed all, --dry-run to preview. Run manually after migrations on any fresh environment.

Decision: Post-save signal at module level in apps.py, not inside ready()

Why: Same garbage collection issue as load_sqlite_vec in Chapter VI. Signal handlers defined as local functions inside ready() get GC'd on Linux/Passenger.

Result: on_chunk_saved and on_chunk_deleted defined at module level in agent_service/apps.py. Connected in ready() with post_save.connect(on_chunk_saved, sender=ChapterChunk).

Decision: Embedding failures in signal handler are logged, never raised

Why: The signal fires inside the lock pipeline. A failed embedding (network timeout, API error) must never crash or roll back the lock. Missing embeddings can always be backfilled with embed_chunks --force.

Result: on_chunk_saved wraps the entire embedding + insert in try/except. On failure: logs the error, returns silently. Lock pipeline continues unaffected.

Decision: MAX_TOKENS_HARD_CAP raised from 8192 to 32000

Why: Arthur's rewrite pipeline was hitting the 8192 cap mid-response. Czech chapter prose runs 3000–4000 words; with JSON wrapper and context, the response easily exceeds 8192 tokens. Truncated JSON fails to parse, returning "could not be parsed" to Tomas.

Result: MAX_TOKENS_HARD_CAP = 32000 in genai/llm_client.py. Rewrite call uses max_tokens=16000. Chunk extraction raised from 2048 to 4096 per the original handover recommendation.

Decision: Duplicate ChapterFeedback records prevented at save time

Why: Every click of the Rewrite button (including failed attempts) was creating a new ChapterFeedback record. 12 identical records for the same feedback text meant Arthur synthesised craft notes from 12 duplicates at lock time.

Result: Rewrite view checks ChapterFeedback.objects.filter(..., feedback_text=feedback).exists() before creating. Identical feedback for the same chapter is stored only once.

Decision: Rewrite and acknowledge prompts in English, prose in book language

Why: Arthur's system prompt says he mirrors the language of the person speaking to him. Czech prompts worked for Czech books but would break for English books. Instructions should be in English (Arthur's primary language); the book's language is enforced by the craft rules already loaded in the system prompt.

Result: Both prompts use English instructions. Acknowledge prompt says "in the same language as the feedback". Rewrite prompt says "Write the prose in the book's language as defined in the craft rules above."
