Ask AI — Deep Dive (Public / Non‑Sensitive)
This document is designed for public sharing and for use as a high-signal knowledge source for an AI “digital twin” that must answer interview / diligence questions about the project.
Contact: askai112358@gmail.com
To protect the project, it intentionally avoids:
- code-level file paths, internal function names, and repository-specific anchors
- production identifiers (project IDs, service URLs) and any secrets (keys, tokens, passwords)
- copy‑paste deployment recipes that would enable a full clone without the repo
If you need the evidence-backed version, use the internal deep dive (kept private).
1) What This Is (one paragraph)
Ask AI is a personal + public knowledge vault that turns day-to-day AI work (debugging, research, decisions, learnings) into compounding, searchable knowledge. It is built for two first-class users: humans (via a clean web console) and AI tools (via MCP tools). The system closes the loop: search first → do the work → record what worked → publish the best cards, so future work becomes faster, cheaper, and more reliable.
2) The Problem, Framed Like a Founder
The real pain
AI assistants are great at answering—but they do not reliably retain organizational context across weeks/months, and teams repeatedly pay the “same learning tax”:
- repeated debugging of the same errors
- repeated “what was our decision and why?” questions
- repeated onboarding explanations for new teammates and new AI agents
The core insight
The missing primitive is not “better answers”; it’s a memory layer with incentives that:
- is fast to capture,
- is easy to retrieve,
- is safe to share,
- is usable by both people and AI tools,
- produces a sustainable flywheel.
Ask AI is that memory layer.
3) Who It’s For + Use Cases
Primary users (today)
- Individual builders: keep an always-on personal vault of what you learned while shipping.
- AI power users: let your coding assistant query a real memory layer instead of “guessing.”
- Small teams: keep a shared workspace vault with optional publishing into a public library.
Typical use cases
- Debugging: error → search → fix → record the resolution as a card → publish if reusable.
- Research: summarize a paper/blog/spec → capture raw notes → curate into a reusable card.
- Architecture decisions: record context/options/tradeoffs so future work doesn’t regress.
- Operational playbooks: “how we deploy”, “how we migrate”, “how we troubleshoot prod”.
Success definition
The system is successful when it measurably:
- reduces repeated work (time-to-resolution drops over time)
- increases retrieval confidence (AI answers become more grounded and consistent)
- creates a compounding knowledge asset (personal vault + optional community library)
3.5) My Role (What I Personally Built)
I built Ask AI end-to-end with a founder/CTO mindset: ship a real product, make it reliable, then make it compounding.
Scope of work (high level):
- Product + UX: web console for humans; workflows for capturing, editing, and publishing knowledge.
- Tooling for AI agents: MCP tools + manifest so any compatible AI client can integrate without bespoke glue.
- Retrieval system: hybrid search with multiple sparse signals, fusion ranking, and explainable snippets.
- Data model + multi-tenancy: personal/public/org workspaces, role-based access, and safety controls.
- Economics: credit ledger, hit-only billing, publish incentives, and subscription gating (invite-first, Stripe-ready).
- Operational posture: containerized services, health/readiness checks, and a deployable cloud architecture.
4) Product Surface Area (What Exists Today)
Human UI (Web Console)
- Dashboard: your activity overview + cards list with pagination.
- Card detail viewer/editor: structured fields + long-form content, quality scoring, publish/unpublish controls.
- API keys: create/manage keys for tool access; show-once behavior.
- Workspaces: personal, public, and org workspaces; membership and roles.
- Credits & Billing: credits balance/history, policy explanation, subscription/invite activation (Stripe optional).
- Safety controls: first-order switches to disable public search and/or public publishing.
AI Interface (MCP Tools)
Ask AI exposes a small set of MCP tools that make it easy for AI agents to do the right thing by default:
- search_records: search the personal vault or the public library
- record_card: save a structured knowledge card into your personal/org vault
- publish_card: publish a card to the public library (if allowed)
- capture_raw: store large raw material (logs/chats/notes) for later curation
- curate_raw_to_card: transform raw material into a draft card (LLM-assisted when configured)
The key point: AI tools don’t need custom integrations per provider; MCP acts as a stable tool contract.
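For concreteness, here is a minimal sketch of what an agent-side tool call could look like on the wire. MCP runs over JSON-RPC with a tools/call method; the tool names below match the list above, but the argument keys (query, scope, limit, title, tags) are illustrative assumptions for this sketch, not the actual contract.

```python
import json

# Illustrative JSON-RPC payload an MCP client might send for search_records.
# Argument keys are assumptions for this sketch, not the real tool schema.
search_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_records",
        "arguments": {
            "query": "ECONNRESET during Postgres failover",
            "scope": "personal",  # or "public" (metered, hit-only billing)
            "limit": 5,
        },
    },
}

# Recording what worked afterwards closes the loop.
record_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "record_card",
        "arguments": {
            "title": "Postgres failover drops idle connections",
            "tags": ["postgres", "failover", "connection-pool"],
        },
    },
}

print(json.dumps(search_request, indent=2))
```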
5) Architecture Overview (System Design Without Repo Leakage)
Components (logical)
- Web App
  - Human-facing UI (dashboard, editor, billing, keys, workspaces)
  - Talks to the API service via HTTPS
- API Service
  - Authentication + authorization
  - Card CRUD, publish/unpublish
  - Credits ledger + billing policy
  - Search pipeline (hybrid retrieval + fusion)
  - MCP server surface (tools + manifest)
- Database
  - Local dev can run on a lightweight DB
  - Production runs on managed Postgres (Supabase-compatible)
  - Supports full-text search, fuzzy matching, and optional vector search (pgvector)
- Optional AI Providers
  - Used for curation (raw → structured card)
  - Used for embeddings (semantic retrieval) when configured
  - Clean provider abstraction so you can swap vendors without rewiring the product
Runtime philosophy
- Keep the “core loop” reliable without AI dependencies.
- If AI providers are missing or misconfigured, the system degrades gracefully (manual edit path remains).
6) Data Model (Conceptual)
Ask AI stores knowledge as Cards inside Workspaces:
Entities
- User
- identity, plan, credits balance
- safety settings (“allow public search”, “allow publish”)
- Workspace
- types: personal, public, org
- membership + roles (admin/editor/viewer)
- Card
- status: private/draft vs published
- structured fields for fast retrieval (title, tags, environment, steps, etc.)
- optional long-form body for rich explanations
- quality score to gate publishing
- Evidence / Raw material
- error logs, commands, URLs, excerpts, chat transcripts (sanitized)
- raw capture is designed to handle long content without breaking UX
- Credits ledger
- auditable accounting of earn/spend events
- API keys
- scoped access for tools; key material is not stored in plaintext
This model intentionally supports both “debug case” cards and broader “how-to / decision / reference” cards without changing the core pipeline.
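A compressed, purely conceptual sketch of these entities follows; field names and types are assumptions for illustration, not the actual schema (which, like the rest of the implementation, stays private).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class CardStatus(Enum):
    PRIVATE = "private"
    DRAFT = "draft"
    PUBLISHED = "published"


@dataclass
class Workspace:
    id: str
    kind: str                      # "personal" | "public" | "org"
    roles: dict[str, str]          # user_id -> "admin" | "editor" | "viewer"


@dataclass
class Card:
    id: str
    workspace_id: str
    title: str
    tags: list[str]
    status: CardStatus
    body: Optional[str] = None     # optional long-form explanation
    quality_score: float = 0.0     # gates publishing
    evidence_ids: list[str] = field(default_factory=list)  # raw logs, excerpts, transcripts


@dataclass
class CreditEvent:
    user_id: str
    delta: int                     # +earn (e.g. publish reward) / -spend (e.g. public hit)
    reason: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
```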
7) Search System Design (The Core Technical Moat)
Ask AI targets “Google-like behavior” for structured technical knowledge: if a concept appears anywhere meaningful, it should be findable.
Retrieval strategy: multi-signal, not single magic
Rather than betting on one method, Ask AI uses multiple complementary retrieval signals, such as:
- full-text search (fast, high precision for exact terms)
- fuzzy matching (typos, partial matches, near-duplicates)
- evidence-aware matching (logs and attachments matter)
- structured filters (tags, workspace scope, status)
- optional semantic embeddings (when vector search is enabled)
Ranking: robust fusion
Ask AI uses a fusion approach (e.g., Reciprocal Rank Fusion) so that:
- exact matches bubble up quickly,
- fuzzy/partial matches still surface,
- no single signal dominates in pathological edge cases.
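As a concrete illustration of the fusion step, here is a minimal Reciprocal Rank Fusion sketch. The k = 60 constant is the value commonly used with RRF, and the signal names are placeholders for the lexical, fuzzy, and semantic rankers described above; the production ranking logic is not shown here.

```python
def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of card IDs into a single ranking.

    rankings maps a signal name ("fulltext", "fuzzy", "semantic", ...) to the
    card IDs that signal returned, best first. RRF score = sum over signals of
    1 / (k + rank), so agreement across signals beats dominance by any one.
    """
    scores: dict[str, float] = {}
    for ranked_ids in rankings.values():
        for rank, card_id in enumerate(ranked_ids, start=1):
            scores[card_id] = scores.get(card_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = reciprocal_rank_fusion({
    "fulltext": ["card_42", "card_7", "card_13"],
    "fuzzy":    ["card_7", "card_42"],
    "semantic": ["card_99", "card_42"],
})
# card_42 ranks first: it appears near the top of all three lists,
# so no single signal dominates the outcome.
```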
Explainability by design
Every result is returned with:
- matched fields (what matched: title/tags/body/evidence/etc.)
- snippets (highlighted segments showing the match in context)
This matters for AI toolchains: agents can justify why they think something is relevant, and users can trust the retrieval.
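A sketch of what an explained result might look like to a caller; every field name here is illustrative rather than the real response schema.

```python
# Illustrative shape of one explained search result (not the real schema).
example_result = {
    "card_id": "card_42",
    "title": "Postgres failover drops idle connections",
    "score": 0.0487,                          # fused ranking score
    "matched_fields": ["title", "evidence"],  # what matched
    "snippets": [
        "…clients see <mark>ECONNRESET</mark> when the primary <mark>fails over</mark>…",
    ],
    "billable_hit": True,                     # public-search credits are spent only when True
}
```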
Quality + billing coupling
Public search uses hit-only billing: credits are spent only when the system believes the result is a real hit (high-confidence threshold). This is crucial to prevent “AI burned my credits” fear and aligns incentives with retrieval quality.
7.5) Differentiation (Why This Isn’t “Just Another Notes App”)
Ask AI is opinionated about one thing: knowledge must be actionable and retrievable under pressure.
Key differentiators:
- AI-native interface (MCP): tools are first-class, not a wrapper around a UI.
- Hybrid retrieval with explainability: not only “what matched”, but “why it matched” (snippets + fields).
- Trust-first economics: hit-only billing + safety switches make agentic usage psychologically safe.
- Compounding loop: the product is designed so your best work becomes future leverage.
8) AI & Embeddings (Designed to be Optional, Not Fragile)
Raw → Card curation
The platform supports a two-step capture:
- capture raw material (logs, chats, long notes)
- curate into a structured card
When an AI provider is configured, curation can be LLM-assisted; otherwise it falls back to a deterministic mode and the user edits manually. This avoids “AI dependency brittleness” while still enabling acceleration when configured.
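A minimal sketch of that fallback behaviour, assuming a hypothetical CurationProvider interface; the real curation code is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class DraftCard:
    title: str
    body: str
    tags: list[str]


class CurationProvider(Protocol):
    """Anything that can turn raw material into a draft card (e.g. an LLM vendor)."""
    def curate(self, raw_text: str) -> DraftCard: ...


def curate_raw(raw_text: str, provider: Optional[CurationProvider]) -> DraftCard:
    """LLM-assisted when a provider is configured; deterministic otherwise."""
    if provider is not None:
        try:
            return provider.curate(raw_text)
        except Exception:
            pass  # a provider outage must not take the core loop down
    # Deterministic fallback: a rough draft the user finishes in the editor.
    stripped = raw_text.strip()
    title = stripped.splitlines()[0][:120] if stripped else "Untitled capture"
    return DraftCard(title=title, body=raw_text, tags=[])
```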
Embeddings lifecycle (pragmatic)
Embeddings can be computed for semantic retrieval, but the system is built so that:
- keyword retrieval works even without embeddings
- embeddings are most valuable for public knowledge at scale
- the platform can choose when to embed (e.g., on publish, or via background backfill jobs)
This keeps costs predictable and avoids embedding everything prematurely.
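A sketch of the “choose when to embed” policy: embed on publish, and let a background backfill catch anything missed. The helper names (embed, fetch_published_cards_without_embeddings) are stand-ins for this sketch, not real APIs.

```python
from typing import Iterable

EMBEDDINGS: dict[str, list[float]] = {}  # stand-in for a pgvector-backed table


def embed(text: str) -> list[float]:
    """Stand-in for the configured embedding provider."""
    return [float(len(text))]  # a real provider returns a high-dimensional vector


def fetch_published_cards_without_embeddings(limit: int) -> Iterable[tuple[str, str]]:
    """Stand-in for a DB query over published cards that have no vector yet."""
    return []


def on_card_published(card_id: str, body: str) -> None:
    # Publish is the natural trigger: public cards are where semantic recall pays off.
    EMBEDDINGS[card_id] = embed(body)


def backfill_missing_embeddings(batch_size: int = 100) -> None:
    # Background job: pick up cards published while embeddings were disabled or unavailable.
    for card_id, body in fetch_published_cards_without_embeddings(limit=batch_size):
        EMBEDDINGS[card_id] = embed(body)
```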
9) Credits, Billing, and Incentives (A Balanced Flywheel)
Ask AI’s economic design is built around two constraints:
- Users must feel safe letting AI search.
- The public library must grow with quality, not spam.
Credits-first model (principles)
- Personal vault usage is “unlimited by credits” (people must be able to record freely).
- Public library usage is metered to fund infra and prevent abuse.
- Publishing is rewarded to encourage contributions.
- Rewards do not depend on “others hitting your card” (prevents gaming and clickbait dynamics).
Hit-only public spend (trust)
The public search charge triggers only on high-confidence hits. If the system doesn’t find something useful, users don’t pay for the miss. This reduces friction to “try searching” and is critical for agentic workflows.
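A sketch of the billing gate, assuming a per-result confidence score; the threshold value and the scoring itself are illustrative, not the production settings.

```python
from typing import Callable

BILLABLE_HIT_THRESHOLD = 0.75  # illustrative, not the production value


def settle_public_search(results: list[dict], spend_credit: Callable[[str], None]) -> list[dict]:
    """Charge only when the best result clears the confidence gate.

    results are scored search results, best first; spend_credit appends a
    spend event to the user's credit ledger. A miss costs nothing.
    """
    top_confidence = results[0].get("confidence", 0.0) if results else 0.0
    if top_confidence >= BILLABLE_HIT_THRESHOLD:
        spend_credit("public_hit")
    return results
```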
Subscription posture
The system supports subscription gating (including invite-based activation while Stripe is not yet wired up). This enables:
- early controlled rollout
- pricing experiments without blocking engineering progress
- a clean upgrade path to Stripe later
Safety controls are first-order
Users can disable:
- public search
- publishing to public
These settings must be enforced consistently across UI, REST API, and MCP tools. This is non-negotiable for trust.
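One way to keep that enforcement consistent is a single guard that every surface calls before a public operation. The names below are hypothetical; the point is that UI routes, REST handlers, and MCP tools share the same check.

```python
from dataclasses import dataclass


@dataclass
class SafetySettings:
    allow_public_search: bool = True
    allow_publish: bool = True


class PublicOperationDisabled(Exception):
    """Raised uniformly by web routes, REST endpoints, and MCP tools."""


def assert_public_op_allowed(settings: SafetySettings, op: str) -> None:
    # Shared guard: one place to reason about, one place to test.
    if op == "public_search" and not settings.allow_public_search:
        raise PublicOperationDisabled("Public search is disabled for this account")
    if op == "publish" and not settings.allow_publish:
        raise PublicOperationDisabled("Publishing to the public library is disabled")
```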
9.5) Competitive Landscape (How I Position It)
Ask AI sits at the intersection of:
- notes/docs (Notion/Confluence),
- debug knowledge bases (internal wikis + ad-hoc snippets),
- RAG knowledge systems (vector DB + retrieval UX),
- agent toolchains (MCP-enabled assistants).
The wedge is: make AI agents reliably search and cite prior work before acting, then make capturing knowledge low-friction and rewarded.
Where Ask AI is intentionally different:
- It treats tool integration as a product surface, not an afterthought.
- It couples retrieval quality to economics (hit-only billing).
- It prioritizes explainability so humans can trust what the agent retrieved.
10) Security, Privacy, and Safety
Threat model (practical)
Risks include:
- accidental secret leakage into cards (tokens, API keys, cookies)
- cross-tenant data exposure (workspace boundary violations)
- abusive public search usage (cost explosion)
- prompt injection through retrieved content (AI misuse)
Mitigations (design-level)
- API keys are scoped and not stored in plaintext.
- Workspace boundaries are enforced at the API/tool level.
- Safety switches can hard-disable public operations.
- Retrieval returns context + snippets (helps humans/agents identify suspicious content).
- The system expects sanitization as a workflow norm; “recording” is not permission to leak secrets.
(Public version note: exact implementation details are intentionally omitted.)
11) Reliability, Observability, and Operational Posture
Health discipline
The API exposes health/readiness signals suitable for managed platforms:
- a liveness check (“process is up”)
- a readiness check (“database reachable and migrations ready”)
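A framework-agnostic sketch of the two probes; the actual endpoints and checks are not shown here, and the injected callables are assumptions.

```python
from typing import Callable


def liveness() -> dict:
    """'Process is up': cheap, dependency-free, safe to poll frequently."""
    return {"status": "ok"}


def readiness(db_ping: Callable[[], bool], migrations_current: Callable[[], bool]) -> dict:
    """'Ready to serve': the database answers and schema migrations are applied.

    Managed platforms route traffic only while this reports ready=True.
    """
    try:
        db_ok = bool(db_ping())
    except Exception:
        db_ok = False
    ready = db_ok and bool(migrations_current())
    return {"ready": ready, "db": db_ok}
```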
Failure-tolerant UX
The UI is designed to degrade gracefully:
- retry behavior on transient fetch failures
- clear “loading vs failed” states
- avoids “blank dashboard” confusion where possible
Cost controls
The most expensive operations (AI calls, embeddings) are:
- optional
- gated by configuration
- avoidable by manual workflows
This keeps the product shippable early and scalable later.
12) Hardest Problems + Key Tradeoffs (What I’d Talk About in Interviews)
1) Search quality vs simplicity
Tradeoff: one “simple” approach (e.g., embeddings-only) vs hybrid retrieval.
- Choice: hybrid, multi-signal + fusion.
- Why: developer knowledge is often keyword-heavy (errors, versions, identifiers) and embeddings alone miss exact-match precision.
2) Incentives vs spam
Tradeoff: reward publishing to grow the library vs attract low-quality content.
- Choice: publish rewards + quality gating + no “hit-based payouts”.
- Why: prevents an attention economy; keeps incentives aligned with usefulness.
3) AI acceleration vs operational fragility
Tradeoff: depend on an LLM for core features vs keep AI optional.
- Choice: core loop works without AI; AI improves curation when configured.
- Why: reliability and portability (and fewer “provider down, product down” incidents).
4) Multi-tenancy depth vs speed
Tradeoff: enterprise-grade permission matrices early vs a clean MVP.
- Choice: start with workspaces + roles + safety switches; design the model to extend to org/enterprise later.
5) “No-hit-no-charge” correctness
Tradeoff: strict hit threshold (protect wallets) vs lenient threshold (avoid missing value).
- Choice: explicit billable-hit logic plus explainability (snippets and matched-fields).
- Why: user trust is existential for agentic search.
13) How I Evaluate Quality (Without Making Up Numbers)
Ask AI is built with the assumption that retrieval quality must be measured.
Offline evaluation
I maintain an evaluation harness that can:
- generate or load a corpus of cards
- run queries with noise (typos, synonyms, long prompts)
- measure retrieval outcomes (recall@k, false positives, ranking stability)
- regression-test improvements over time
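As a concrete example of the kind of measurement the harness performs, here is a minimal recall@k sketch; the labeled query/expected-card pairs would come from the evaluation corpus, which is not included here.

```python
from typing import Callable, Iterable


def recall_at_k(
    cases: list[tuple[str, str]],
    search_fn: Callable[[str], Iterable[str]],
    k: int = 5,
) -> float:
    """Fraction of labeled queries whose expected card appears in the top-k results.

    cases: (query, expected_card_id) pairs from a labeled corpus.
    search_fn: returns ranked card IDs for a query (best first).
    """
    if not cases:
        return 0.0
    hits = sum(
        1 for query, expected_id in cases
        if expected_id in list(search_fn(query))[:k]
    )
    return hits / len(cases)


# Noisy variants (typos, synonyms) of the same intent should still retrieve the card:
# recall_at_k([("ECONNRESET postgre failover", "card_42")], search_fn=my_search, k=5)
```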
Online validation
In addition to automated evals, the system is validated end-to-end:
- “search → record → publish” loop via tools and UI
- billing policy verification (hit-only spend)
- safety switches enforcement
Numbers are intentionally not published here; the important thing is that the system is designed to continuously measure and improve.
14) 90-Second Interview Pitch (Script)
“Ask AI is a compounding memory layer for humans and AI tools. In practice, AI agents are great at answering but forget everything, so teams keep paying the same debugging and decision tax. I built a system with a web console for humans and MCP tools for AI. The workflow is simple: search first, then record what worked, and publish the best cards. Under the hood it uses hybrid retrieval—full text, fuzzy matching, evidence-aware signals, and optional embeddings—fused via rank fusion, with explainable snippets so agents can justify hits. The economics are credit-based with hit-only billing so users don’t fear agents wasting money, and publishing is rewarded but not in a gameable way. The result is a private vault that’s free to record into, plus an optional public library that becomes more valuable as you contribute.”
15) Interview Question Bank + Answer Outlines (Technical + Product + Behavioral)
System design
Q: Walk me through the end-to-end architecture.
- Two surfaces: human web UI + AI MCP tools
- One API service: auth, data, search, billing policy
- DB supports full-text + fuzzy + optional vector
- Optional AI provider for curation and embeddings
- Health/readiness for managed deploys
Q: Why hybrid retrieval instead of embeddings-only?
- Developer knowledge has strong lexical anchors (errors, versions)
- Embeddings help recall; lexical helps precision
- Fusion ranking reduces single-method failure modes
- Explainability (snippets/matched fields) is easier with sparse signals
Q: How do you prevent “AI burned my credits” concerns?
- Personal search is free
- Public search uses hit-only billing with a strict “billable hit” gate
- Returned results include match explanations
- Users can disable public search entirely via safety control
Q: What’s the hardest part of multi-tenant knowledge systems?
- Correct authorization boundaries
- Preventing accidental data leakage across workspaces
- Designing for future org/enterprise requirements without building a giant permission matrix on day one
AI / RAG
Q: How do you handle hallucinations?
- Retrieval returns evidence and snippets; agents can ground answers
- Encourage “ask before answer”: search first, then reason
- When uncertain, record learnings and tighten retrieval/eval
Q: What happens if the AI provider is not configured?
- Core loop still works: record cards manually, search normally
- Curation tools degrade gracefully; user can edit drafts in UI
- Avoids product fragility tied to vendor availability
Q: How would you scale retrieval as the public library grows?
- Index strategy for text + fuzzy matching
- Optional vector indexes (pgvector)
- Background embedding backfill / batch jobs
- Caching hot queries and popular cards
Reliability & security
Q: How do you approach secret safety in user-generated knowledge?
- Strong policy: never store secrets
- Sanitization expectation built into workflow
- Scoped keys + audit-friendly ledger
- Future: automated redaction + secret scanning at ingestion time
Q: What is your CORS / CSRF posture?
- Single-origin web app to API boundary
- Strict allowlist for origins in production
- Preflight correctness is required for auth flows. (Implementation details omitted publicly.)
Product sense
Q: Why MCP—what’s the unique value?
- Standard tool contract for AI agents
- Enables “search/record/publish” as deterministic actions, not prompt hacks
- Makes Ask AI composable with multiple AI frontends and agent frameworks
Q: How do you prevent low-quality public content?
- Quality scoring gate for publishing
- Incentive design avoids “pay for hits”
- Clear publishing workflow + edit-before-publish
Q: What would you measure as your North Star?
- Time-to-resolution improvement over time (per user/team)
- “repeat issue retrieval rate” (did we find a prior solution?)
- Publish conversion rate of high-quality drafts
- Public hit rate vs false positive rate (billing fairness)
Behavioral (BQ)
Q: Tell me about a time you improved a system iteratively.
- Built an eval harness and regression tests for search quality
- Added explainable snippets and matched-field reporting
- Tuned fusion/thresholds to reduce false positives while preserving recall
- Kept changes safe via small iterations + end-to-end validation
Q: Tell me about handling ambiguity.
- Started with a clear core loop (search → record → publish)
- Designed optional capabilities (AI curation, embeddings) without coupling
- Decomposed the roadmap into measurable improvements (quality, trust, incentives)
Q: Tell me about a tough tradeoff you made.
- Chose hybrid retrieval complexity to earn trust and performance
- Kept monetization simple early (credits + invites) while maintaining a Stripe-ready path
Extra questions (what interviewers actually ask)
Q: If this had 1M users, what breaks first?
- DB read amplification from search queries (indexes + caching become critical)
- Cost spikes from AI curation/embeddings (batching + rate limits + paid tiers)
- Abuse vectors (scraping, spam publishing) (moderation + throttles + anomaly detection)
- Multi-tenant complexity for orgs (policy engine + audit logging)
Q: How do you think about “moat”?
- The moat is not “an embedding model”; it’s the closed-loop workflow + evaluation discipline + trust economics
- Compounding data advantage comes from high-quality, well-structured cards (not raw dumps)
- Tool ecosystem integration via MCP makes the product sticky in agent workflows
Q: Why are you the right person to build this?
- Full-stack execution across product, infra, and retrieval engineering
- Strong taste for trust/safety primitives (billing fairness, workspace boundaries, explainability)
- Willingness to build evaluation harnesses and iterate until quality is real, not vibes
Q: Give me a STAR story.
Example 1 — Search quality iteration
- S: Users need “Google-like” search over structured debug knowledge
- T: Improve recall while preventing costly false positives (billing fairness)
- A: Built an evaluation harness, added matched-field/snippet explainability, tuned fusion + thresholds
- R: A measurable regression-resistant pipeline with trust-first billing gates
Example 2 — Reliability under real UX constraints
- S: Browser auth + cross-origin calls are fragile and easy to break during rapid iteration
- T: Keep login/session flows reliable while evolving APIs and policies
- A: Tightened API contracts, improved error/empty/loading handling, and validated with end-to-end checks
- R: Reduced “it loads as zero then updates” confusion and improved perceived stability
Example 3 — Incentive design without gaming
- S: Public libraries tend to be gamed when rewards are tied to clicks/hits
- T: Encourage publishing without creating spam incentives
- A: Publish reward is one-time per card; no “earn on others’ hits”; quality gate for publishing
- R: Incentives align with usefulness instead of attention arbitrage
16) Roadmap (High-Leverage, Founder‑Grade)
Must-haves (to be world-class)
- Automated secret redaction + unsafe-content detection at ingestion time
- Stronger org/enterprise controls: workspace isolation policies, role templates, approval workflows
- Background jobs for embedding backfill + continuous evaluation at scale
- Admin operations: abuse prevention, rate limiting, anomaly detection
- Search quality: synonym expansion, better “hit” detection, and richer snippets
Nice-to-haves (strategic)
- Card types as first-class templates (HowTo, Decision, Lesson Learned, Reference)
- Browser extension / IDE plugin for frictionless capture
- “Autopublish suggestions” with human confirmation
- Enterprise deployment options (VPC, private networking, SSO)
17) What I’d Share Under NDA (Not in This Document)
If someone needs to fully reproduce or audit the system, I would share privately:
- code-level module layout and implementation details
- exact schema and migration history
- production deployment configuration
- measured evaluation results and performance numbers
- incident notes and operational logs
This public version is meant to be truthful, interview-ready, and non-sensitive.