ChengAI — Deep Dive (Project Narrative for RAG)
ChengAI is my AI‑powered portfolio and “digital twin”: a single website where a recruiter can either browse my work like a normal portfolio, or interview an AI version of me that answers questions using grounded evidence from my resume, projects, articles, skills, and experiences.
This write‑up is intentionally long and human‑readable. It’s meant to do two things at once:
- Be a real project deep‑dive that a human can read.
- Be indexable knowledge so the ChengAI chatbot can answer nuanced interview questions (system design, tradeoffs, debugging stories, product thinking) without making things up.
Live demo: https://chengai-tianle.ai-builders.space/
Why I Built It
I wanted a portfolio that does more than “look good” or list bullet points.
The hiring workflow is already becoming AI‑mediated: ATS filters, automated matching, AI summaries, and (increasingly) LLM‑powered recruiting assistants. My bet is that the next iteration is agent‑to‑agent: a company’s AI will ask a candidate’s AI targeted questions and evaluate responses automatically, long before a human spends time on a call.
So instead of fighting that trend, I built a system that embraces it:
- A clean portfolio for humans
- A conversational interface for AI‑mediated screening
- A JD matching tool that generates a persuasive but honest match report
- A CMS and ingestion pipeline so I can keep the knowledge base updated without rewriting code
The goal is simple: if you interview me, the bot should be “close enough” to my real answers—accurate, specific, and grounded—while still sounding human.
Product Experience (What Users Can Do)
1) Browse normally
The site works as a standard personal website:
- Projects
- Experience
- Skills
- Articles
- Stories (behavioral/impact examples)
- Resume download
2) Chat with my AI twin (RAG)
Recruiters (or engineers) can ask:
- “What are your strongest projects?”
- “How did you design your RAG pipeline?”
- “Tell me about a time you worked through ambiguity.”
- “How good is your Python / C# / TypeScript?”
The bot is designed to be evidence‑backed and to avoid hallucinating. When it can’t find evidence, it should either:
- Ask a clarifying question, or
- Explain what’s missing and what it can answer based on available sources.
3) Match a job description (JD Match)
The JD match feature is meant to be practical for real recruiting:
- Paste a JD
- Get a match score + strengths + gaps
- Get evidence‑backed citations from my knowledge base
- Get follow‑up questions and suggested interview angles
The point isn’t to “game keywords”—it’s to produce a credible, persuasive narrative that maps requirements to concrete experience.
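For illustration, the match report returned by the endpoint could be typed roughly as below. This is a sketch of the shape described above; the field names are assumptions, not the exact production schema:

```ts
// Hypothetical shape of a JD match report. Field names are
// illustrative, not the exact production schema.
interface Citation {
  sourceType: "project" | "experience" | "article" | "resume" | "story" | "skill";
  sourceId: string;
  excerpt: string; // the evidence text pulled from the knowledge base
}

interface MatchPoint {
  requirement: string;  // the JD requirement being mapped
  evidence: Citation[]; // chunks that support (or fail to support) it
}

interface JdMatchReport {
  score: number;               // overall match, e.g. 0-100
  strengths: MatchPoint[];     // requirements with strong evidence
  gaps: MatchPoint[];          // requirements with weak or no evidence
  followUpQuestions: string[]; // suggested interview angles
}
```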
4) Admin CMS for fast iteration
I can update content without redeploying:
- CRUD for projects, experience, skills, articles, stories
- Upload a resume
- Rebuild embeddings
- Manage the knowledge base
This matters because the system only stays good if updating it is easy.
System Architecture (High Level)
ChengAI is a Next.js app (App Router) with server routes that orchestrate retrieval + generation, backed by Supabase Postgres.
Core components:
- Frontend (Next.js + React + Tailwind): portfolio pages + chat UI + admin dashboard
- Backend (Next.js Route Handlers): chat endpoint, JD match endpoint, admin CRUD endpoints
- Database (Supabase Postgres):
  - Content tables: projects / experiences / skills / articles / stories
  - RAG table: `chunks` (content chunks + embeddings + metadata)
  - Analytics table: `events` (server‑side tracking)
- Retrieval:
  - Vector search: `pgvector` via an RPC (`match_chunks`)
  - Full‑text search: Postgres `tsvector` (`fts_content`)
  - Fusion: RRF (reciprocal rank fusion) to combine vector + FTS results
- Generation:
  - Chat completions via AI Builders Space (OpenAI‑compatible API)
  - Embeddings via an OpenAI embedding model (AI Builders Space as fallback)
- Streaming:
  - SSE streaming to make chat responses feel instant
This is intentionally a “simple but strong” architecture: minimal moving parts, but still production‑usable.
Data Model & Knowledge Representation
ChengAI treats “knowledge” as a mix of structured and unstructured data:
- Structured (good for browsing and filtering): projects, skills, experiences
- Unstructured (good for nuanced answers): article bodies, resume text, long narratives, story write‑ups
For RAG, everything becomes chunks:
- `chunks.source_type` tracks where the chunk came from (project, experience, article, resume, story, skill)
- `chunks.source_id` links back to the original record
- `chunks.metadata` stores helpful attributes like title, slug, and chunk index
This lets the chatbot cite sources and lets me evolve the format without rewriting the retrieval system.
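As a concrete sketch, a `chunks` row can be pictured like this in TypeScript. The `source_type`, `source_id`, and `metadata` fields are described above; the remaining names are assumptions:

```ts
// Rough shape of a row in the chunks table. Fields beyond source_type,
// source_id, and metadata are assumed for illustration.
interface ChunkRow {
  id: string;
  source_type: "project" | "experience" | "article" | "resume" | "story" | "skill";
  source_id: string;   // links back to the original record
  content: string;     // the chunk text itself
  embedding: number[]; // stored in a pgvector column
  metadata: {
    title?: string;
    slug?: string;
    chunk_index?: number;
    total_chunks?: number;
  };
}
```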
Ingestion & Indexing Pipeline
The system needs to keep the knowledge base fresh.
ChengAI maintains embeddings in sync with the published content:
- When I publish/update an item (project/article/experience/etc.), the server re‑indexes it:
  - Chunk text into ~1000‑character pieces
  - Generate embeddings (batched for efficiency)
  - Replace existing chunks for that source
There’s also an admin “Rebuild embeddings” action that reindexes everything in one pass—useful after changing chunking rules or prompts.
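A minimal sketch of that per‑item re‑index step, assuming a Supabase client; `chunkText` and `embedBatch` are hypothetical helpers standing in for the real chunking and embedding code:

```ts
import { createClient } from "@supabase/supabase-js";

// Hypothetical helpers: chunkText is sketched in the next section,
// embedBatch wraps one batched call to the embeddings API.
declare function chunkText(text: string): string[];
declare function embedBatch(texts: string[]): Promise<number[][]>;

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!
);

// Re-index one published item: chunk, embed in a single batch,
// then replace that source's existing chunks.
async function reindexSource(sourceType: string, sourceId: string, text: string) {
  const pieces = chunkText(text);              // ~1000-character, paragraph-aware
  const embeddings = await embedBatch(pieces); // batched for efficiency

  // Drop stale chunks for this source before inserting fresh ones.
  await supabase
    .from("chunks")
    .delete()
    .match({ source_type: sourceType, source_id: sourceId });

  await supabase.from("chunks").insert(
    pieces.map((content, i) => ({
      source_type: sourceType,
      source_id: sourceId,
      content,
      embedding: embeddings[i],
      metadata: { chunk_index: i, total_chunks: pieces.length },
    }))
  );
}
```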
Key design points:
- Chunking is semantic‑ish (splitting by paragraphs/lines) rather than naive fixed‑length slicing.
- Each chunk is big enough to preserve meaning, but small enough to retrieve precisely.
- Chunk metadata includes `chunk_index` and `total_chunks` so the bot can cite and the UI can display sources cleanly.
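The paragraph‑aware splitter can be sketched like this (not the exact production code; a real version would also fall back to line‑level splits for oversized paragraphs, per the design points above):

```ts
// Semantic-ish chunking: pack whole paragraphs into ~1000-character
// chunks instead of slicing at arbitrary offsets. A sketch of the idea;
// paragraphs longer than maxLen would be split further by lines in practice.
function chunkText(text: string, maxLen = 1000): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);

  const chunks: string[] = [];
  let current = "";

  for (const para of paragraphs) {
    // Start a new chunk when adding this paragraph would exceed the budget.
    if (current && current.length + para.length + 2 > maxLen) {
      chunks.push(current);
      current = para;
    } else {
      current = current ? `${current}\n\n${para}` : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```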
Retrieval Strategy (Hybrid + RRF)
I don’t rely on a single retrieval method.
ChengAI performs two searches in parallel:
- Vector similarity search (semantic recall) using `pgvector`
- Full‑text search (exact keyword precision) using `tsvector` with websearch‑style query parsing
Then it merges results with RRF (reciprocal rank fusion):
- If vector search finds semantically similar chunks, they rank well.
- If the query includes exact terms (e.g., “Semantic Kernel”, “EKS”, “FAISS”), FTS reinforces those chunks.
- The fused ranking is usually more robust than either method alone.
This hybrid approach is especially helpful for interview questions, which often mix:
- Conceptual prompts (“hardest tradeoff”, “system design”)
- Exact entities (“Kafka”, “Terraform”, “LangGraph”)
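The fusion step itself is small. Each result contributes 1 / (k + rank) for every list it appears in, with k = 60 as the conventional damping constant; a sketch over lists of chunk IDs:

```ts
// Reciprocal rank fusion over two ranked lists of chunk IDs.
// Appearing near the top of either list (or in both) boosts the score.
function rrfFuse(vectorIds: string[], ftsIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [vectorIds, ftsIds]) {
    list.forEach((id, rank) => {
      // rank is 0-based, so rank + 1 is the 1-based position.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}
```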
Generation (Grounded, But Not Robotic)
The hard part of a “digital twin” is balancing two constraints:
- Accuracy: don’t invent facts
- Human‑ness: don’t sound like a rigid citation machine
ChengAI uses a prompt that emphasizes:
- Answer the user’s intent first
- Use retrieved sources as grounding and examples
- If evidence is missing, say so, but still be helpful (suggest what to ask / where to look / what to clarify)
The goal is to feel like a strong candidate in a real conversation—not a search engine output.
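In spirit, the prompt assembly looks something like the sketch below. The wording is illustrative only, not the production prompt:

```ts
// Illustrative prompt assembly; the actual production prompt differs.
function buildSystemPrompt(
  chunks: { content: string; metadata?: { title?: string } }[]
): string {
  const evidence = chunks
    .map((c, i) => `[${i + 1}] (${c.metadata?.title ?? "untitled"}) ${c.content}`)
    .join("\n\n");

  return [
    "You are the candidate's digital twin. Answer the user's intent first,",
    "in a natural conversational voice.",
    "Ground your claims in the evidence below and cite it when you rely on it.",
    "If the evidence doesn't cover the question, say so and suggest what to",
    "ask or clarify instead of inventing details.",
    "",
    "EVIDENCE:",
    evidence,
  ].join("\n");
}
```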
Streaming UX (SSE)
For a chatbot used in hiring, latency and perceived latency matter.
ChengAI streams assistant output using Server‑Sent Events:
- Tokens appear as they are generated
- The UI updates progressively
I added backend sanitization so the streamed output stays readable (e.g., it avoids dumping raw “SOURCE 1” artifacts mid‑sentence). Evidence is presented as a separate, clean section rather than interrupting the main narrative.
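A minimal Next.js Route Handler for the streaming path could look like the sketch below, assuming the standard OpenAI SDK pointed at an OpenAI‑compatible base URL; the model name and env variable names are placeholders:

```ts
// app/api/chat/route.ts -- minimal SSE relay for a streamed completion.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: process.env.LLM_BASE_URL, // OpenAI-compatible endpoint
  apiKey: process.env.LLM_API_KEY,
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    messages,
    stream: true,
  });

  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      // Forward each token delta as its own SSE event.
      for await (const chunk of completion) {
        const delta = chunk.choices[0]?.delta?.content ?? "";
        if (delta) {
          controller.enqueue(encoder.encode(`data: ${JSON.stringify(delta)}\n\n`));
        }
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}
```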
Admin, Security, and Operational Safety
Because the app has admin write access to the database, I treat it like a real system:
- Admin routes require a session cookie
- CSRF validation is enforced for mutations
- Only published items are public and indexed for RAG
This keeps the production site safe while still letting me iterate quickly.
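As a sketch, every admin mutation handler runs a guard along these lines. The cookie/header names, the `verifySession` helper, and the double‑submit CSRF pattern shown here are illustrative stand‑ins for the real mechanism:

```ts
import { cookies } from "next/headers";

// Hypothetical stand-in for the real session check.
declare function verifySession(token: string): Promise<boolean>;

// Returns an error Response if the request is not an authorized admin
// mutation, or null if the handler may proceed.
async function requireAdmin(req: Request): Promise<Response | null> {
  const cookieStore = await cookies();

  const session = cookieStore.get("admin_session")?.value;
  if (!session || !(await verifySession(session))) {
    return new Response("Unauthorized", { status: 401 });
  }

  // Double-submit check: the mutation must echo the CSRF cookie in a header.
  const csrfHeader = req.headers.get("x-csrf-token");
  if (!csrfHeader || csrfHeader !== cookieStore.get("csrf_token")?.value) {
    return new Response("Invalid CSRF token", { status: 403 });
  }

  return null; // authorized
}
```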
What I’d Highlight in an Interview
If you’re evaluating ChengAI as a system design project, here are the “high signal” points:
- I treated the portfolio as a product, not just a website.
- I designed a knowledge representation that supports both browsing and RAG.
- I implemented hybrid retrieval + fusion, not just naive vector search.
- I built a content ops loop (CMS + rebuild embeddings) so the system can keep improving.
- I made concrete UX tradeoffs: streaming output, readable citations, mobile usability, etc.
What’s Next (Realistic Roadmap)
ChengAI is already useful, but the “digital twin” can keep getting better. The next upgrades I’d prioritize:
- Better long‑context memory for multi‑turn interview sessions (persistent session state)
- More robust intent routing between “chat” and “JD match” so the system always responds at the right abstraction level
- Stronger evaluation harness (more automatic regression tests for common interview prompts)
- A smoother “continue chatting about this JD” workflow after a JD match report (so recruiters can drill deeper without re‑pasting context)
The core idea stays the same: make the system more helpful and more human, without compromising grounded truthfulness.