CURSOR PLAN MODE PROMPT — REALTIME PHONE SYSTEM (Twilio ↔ FastAPI ↔ OpenAI Realtime ↔ Supabase)

Goal:
Create a production-ready real-time AI phone system that connects Twilio Media Streams ↔ FastAPI WebSocket ↔ OpenAI Realtime API ↔ Supabase, with audio streaming, RAG-enhanced instructions, transcript collection, and full call logging.

==============================
HIGH-LEVEL REQUIREMENTS
==============================

Implement a backend service with:

1. API Endpoints
- POST /api/v1/incoming-call-realtime — Twilio webhook returning TwiML that instructs Twilio to connect the call to a WebSocket stream.
- WS /api/v1/media-stream — WebSocket bridge:
  - Accepts Twilio Media Streams WebSocket connection
  - Connects to OpenAI Realtime API
  - Converts audio formats (μ-law ↔ PCM16)
  - Streams audio bidirectionally
  - Collects transcripts
  - Runs the RAG pipeline and sends the resulting instructions to OpenAI before audio streaming begins
  - Saves call + transcript to Supabase
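
Reference sketch (illustrative, not a required implementation): the TwiML returned by the webhook could be built roughly as below. The function name build_stream_twiml and the example URL and phone numbers are assumptions made for this sketch. It uses only the standard library so the XML shape is easy to see; in the real service the string would be returned from the FastAPI route with an XML content type.

```python
import xml.etree.ElementTree as ET


def build_stream_twiml(ws_url: str, from_number: str, to_number: str) -> str:
    """Return TwiML that connects the call to a bidirectional Media Stream."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    stream = ET.SubElement(connect, "Stream", url=ws_url)
    # <Parameter> values are delivered in the stream's "start" event
    # under customParameters, which is how From/To reach the bridge.
    ET.SubElement(stream, "Parameter", name="From", value=from_number)
    ET.SubElement(stream, "Parameter", name="To", value=to_number)
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


# Hypothetical values, for illustration only.
twiml = build_stream_twiml(
    "wss://example.ngrok.app/api/v1/media-stream",
    "+15550001111",
    "+15550002222",
)
```

The From and To values would normally come from the form fields Twilio posts to the webhook.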

==============================
AUDIO PROCESSING
==============================

Implement audio conversion utilities:

Twilio → OpenAI:
- μ-law 8kHz → PCM16 8kHz → PCM16 24kHz
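- base64-decode each incoming Twilio payload; base64-encode the audio sent to OpenAI (and the reverse on the return path)
- use the audioop-lts package, imported as audioop (the standard-library audioop module was removed in Python 3.13)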

OpenAI → Twilio:
- PCM16 24kHz → PCM16 8kHz → μ-law 8kHz
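
Reference sketch (illustrative): one possible shape for the conversion utilities, assuming mono audio and one converter instance per call. The class name AudioConverter and its method names are assumptions for this sketch. The ratecv state is kept between chunks so that resampling stays continuous across frame boundaries.

```python
import audioop  # on Python 3.13+ this module is provided by audioop-lts
import base64
from typing import Any, Optional

TWILIO_RATE = 8000   # Hz, mu-law, mono
OPENAI_RATE = 24000  # Hz, PCM16, mono
SAMPLE_WIDTH = 2     # bytes per PCM16 sample


class AudioConverter:
    """Converts base64 audio payloads in both directions for one call."""

    def __init__(self) -> None:
        # ratecv returns a state object; passing it back in on the next
        # chunk avoids discontinuities at chunk boundaries.
        self._up_state: Optional[Any] = None
        self._down_state: Optional[Any] = None

    def twilio_to_openai(self, payload_b64: str) -> str:
        """mu-law 8kHz (base64) -> PCM16 24kHz (base64)."""
        ulaw = base64.b64decode(payload_b64)
        pcm_8k = audioop.ulaw2lin(ulaw, SAMPLE_WIDTH)
        pcm_24k, self._up_state = audioop.ratecv(
            pcm_8k, SAMPLE_WIDTH, 1, TWILIO_RATE, OPENAI_RATE, self._up_state
        )
        return base64.b64encode(pcm_24k).decode("ascii")

    def openai_to_twilio(self, delta_b64: str) -> str:
        """PCM16 24kHz (base64) -> mu-law 8kHz (base64)."""
        pcm_24k = base64.b64decode(delta_b64)
        pcm_8k, self._down_state = audioop.ratecv(
            pcm_24k, SAMPLE_WIDTH, 1, OPENAI_RATE, TWILIO_RATE, self._down_state
        )
        ulaw = audioop.lin2ulaw(pcm_8k, SAMPLE_WIDTH)
        return base64.b64encode(ulaw).decode("ascii")
```

A 20 ms Twilio frame is 160 mu-law bytes; after upsampling it becomes roughly 960 bytes of PCM16, with the exact count depending on the resampler state.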

==============================
REALTIME EVENT HANDLING
==============================

Twilio Events:
- connected
- start (contains streamSid, callSid, and customParameters; From and To appear there only if the TwiML passes them as <Parameter> elements)
- media (audio chunks)
- stop
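
Reference sketch (illustrative): a minimal dispatcher for the four Twilio events above. CallState and handle_twilio_message are hypothetical names. The sketch only records media payloads in a list; the real bridge would convert and forward each one to OpenAI instead.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CallState:
    stream_sid: Optional[str] = None
    call_sid: Optional[str] = None
    from_number: Optional[str] = None
    to_number: Optional[str] = None
    stopped: bool = False
    # Stand-in for forwarding: base64 mu-law payloads in arrival order.
    audio_payloads: list[str] = field(default_factory=list)


def handle_twilio_message(raw: str, state: CallState) -> None:
    """Update call state from one JSON text frame sent by Twilio."""
    msg = json.loads(raw)
    event = msg.get("event")
    if event == "connected":
        return  # handshake only, nothing to record
    if event == "start":
        start = msg["start"]
        state.stream_sid = start["streamSid"]
        state.call_sid = start.get("callSid")
        # customParameters is present only if the TwiML set <Parameter> values.
        params = start.get("customParameters") or {}
        state.from_number = params.get("From")
        state.to_number = params.get("To")
    elif event == "media":
        state.audio_payloads.append(msg["media"]["payload"])
    elif event == "stop":
        state.stopped = True
```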

OpenAI Events:
- response.audio.delta
- response.audio_transcript.delta
- response.audio_transcript.done
- conversation.item.input_audio_transcription.completed
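
Reference sketch (illustrative): one way to route the OpenAI events listed above. The function name route_openai_event, the convert callback, and the (speaker, text) tuple format are assumptions for this sketch. The field names used (delta for audio chunks, transcript for completed text) follow the event names as listed here and should be checked against the Realtime API version actually in use.

```python
import json
from typing import Any, Callable, Optional


def route_openai_event(
    event: dict[str, Any],
    stream_sid: str,
    transcript: list[tuple[str, str]],
    convert: Callable[[str], str],
) -> Optional[str]:
    """Return a JSON frame to send to Twilio, or None if nothing to send.

    convert maps base64 PCM16 24kHz audio to base64 mu-law 8kHz audio.
    """
    etype = event.get("type")
    if etype == "response.audio.delta":
        return json.dumps({
            "event": "media",
            "streamSid": stream_sid,
            "media": {"payload": convert(event["delta"])},
        })
    if etype == "response.audio_transcript.done":
        transcript.append(("assistant", event["transcript"]))
    elif etype == "conversation.item.input_audio_transcription.completed":
        transcript.append(("user", event["transcript"]))
    # response.audio_transcript.delta carries partial text; this sketch
    # waits for the matching .done event instead of accumulating deltas.
    return None
```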

==============================
RAG IMPLEMENTATION
==============================

Supabase tables required:
- calls
- call_transcripts
- user_settings
- agent_prompts
- knowledge_base (with pgvector embedding column)

Supabase RPC required:
match_knowledge_chunks(query_embedding vector, match_user_id uuid, match_count int)

RAG steps:
1. Resolve user_id from Twilio "To" number using user_settings.
2. Fetch custom agent prompt from agent_prompts.
3. Generate embedding using text-embedding-3-small.
4. Query knowledge base using RPC.
5. Apply similarity threshold and chunk limit (env-configurable).
6. Assemble RAG-enhanced instructions.
7. Send session.update to OpenAI before streaming begins.
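
Reference sketch (illustrative) for steps 5 to 7: filtering the RPC results, assembling instructions, and building the session.update message. The helper names and the row keys content and similarity are assumptions about what match_knowledge_chunks returns. The session fields beyond instructions (audio formats, transcription model) are also assumptions and should be verified against the current Realtime API reference.

```python
import json
from typing import Any


def select_chunks(
    rows: list[dict[str, Any]], threshold: float, limit: int
) -> list[str]:
    """Keep rows at or above the threshold, best match first, capped at limit."""
    kept = [r for r in rows if r["similarity"] >= threshold]
    kept.sort(key=lambda r: r["similarity"], reverse=True)
    return [r["content"] for r in kept[:limit]]


def build_instructions(agent_prompt: str, chunks: list[str]) -> str:
    """Append numbered knowledge base excerpts to the agent prompt."""
    if not chunks:
        return agent_prompt
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return f"{agent_prompt}\n\nRelevant knowledge base excerpts:\n{context}"


def build_session_update(instructions: str) -> str:
    """Serialize the session.update event sent before audio streaming."""
    return json.dumps({
        "type": "session.update",
        "session": {
            "instructions": instructions,
            # Assumed settings matching the PCM16 pipeline in this spec.
            "input_audio_format": "pcm16",
            "output_audio_format": "pcm16",
            "input_audio_transcription": {"model": "whisper-1"},
        },
    })
```

The threshold and limit arguments would come from RAG_SIMILARITY_THRESHOLD and RAG_CHUNK_LIMIT.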

==============================
TRANSCRIPT COLLECTION
==============================

Implement TranscriptCollector that:
- Stores user speech
- Stores AI speech
- Builds final transcript on call end
- Generates an AI summary of the final transcript (using an OpenAI chat completion)
- Saves calls row
- Saves call_transcripts row
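
Reference sketch (illustrative): the in-memory part of TranscriptCollector. The method names, the TranscriptEntry type, and the "User:" / "AI:" line format are assumptions for this sketch. Summary generation and the Supabase writes are left out because they depend on the client setup.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TranscriptEntry:
    speaker: str  # "User" or "AI"
    text: str
    at: datetime


@dataclass
class TranscriptCollector:
    entries: list[TranscriptEntry] = field(default_factory=list)

    def _add(self, speaker: str, text: str) -> None:
        text = text.strip()
        if text:  # skip empty transcription results
            now = datetime.now(timezone.utc)
            self.entries.append(TranscriptEntry(speaker, text, now))

    def add_user(self, text: str) -> None:
        self._add("User", text)

    def add_ai(self, text: str) -> None:
        self._add("AI", text)

    def build(self) -> str:
        """Return the final transcript, one line per utterance."""
        return "\n".join(f"{e.speaker}: {e.text}" for e in self.entries)
```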

==============================
ENVIRONMENT VARIABLES
==============================

Create .env support with:

TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=
OPENAI_API_KEY=
LEMONFOX_API_KEY=
SUPABASE_URL=
SUPABASE_SERVICE_ROLE_KEY=
RAG_CHUNK_LIMIT=5
RAG_SIMILARITY_THRESHOLD=0.7
LOG_LEVEL=INFO
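
Reference sketch (illustrative): a possible loader for core/config.py. The Settings class and load_settings function are hypothetical names. Treating LEMONFOX_API_KEY as optional is an assumption, since this spec does not say where it is used. In the real project, python-dotenv's load_dotenv() would be called first so that .env values are present in os.environ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "OPENAI_API_KEY",
    "SUPABASE_URL",
    "SUPABASE_SERVICE_ROLE_KEY",
)


@dataclass(frozen=True)
class Settings:
    twilio_account_sid: str
    twilio_auth_token: str
    openai_api_key: str
    supabase_url: str
    supabase_service_role_key: str
    lemonfox_api_key: Optional[str]  # assumed optional in this sketch
    rag_chunk_limit: int
    rag_similarity_threshold: float
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from env (defaults to os.environ), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        twilio_account_sid=env["TWILIO_ACCOUNT_SID"],
        twilio_auth_token=env["TWILIO_AUTH_TOKEN"],
        openai_api_key=env["OPENAI_API_KEY"],
        supabase_url=env["SUPABASE_URL"],
        supabase_service_role_key=env["SUPABASE_SERVICE_ROLE_KEY"],
        lemonfox_api_key=env.get("LEMONFOX_API_KEY") or None,
        rag_chunk_limit=int(env.get("RAG_CHUNK_LIMIT", "5")),
        rag_similarity_threshold=float(env.get("RAG_SIMILARITY_THRESHOLD", "0.7")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```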

==============================
PROJECT STRUCTURE (CREATE EXACTLY THIS)
==============================

app/
  main.py
  api/
    v1/
      realtime_stream.py
      calls.py
  services/
    knowledge_base_service.py
  repositories/
    knowledge_base_repo.py
    conversations_repo.py
  core/
    config.py
    logging_config.py

==============================
IMPLEMENTATION INSTRUCTIONS FOR CURSOR
==============================

Build the entire project end-to-end, with:

Frameworks:
- FastAPI
- websockets (client)
- audioop-lts
- openai (latest official)
- supabase-py
- python-dotenv
- uvicorn

Behavior:
- Fully implement both endpoints
- Fully implement WebSocket bridge
- Fully implement audio conversion
- Fully implement RAG pipeline
- Fully implement transcript saving
- Fully implement logging

Code Quality:
- Type hinted
- Modular
- Async everywhere
- Robust error handling
- Clear logging
- No unused code
- Production-ready structure

==============================
WHEN BUILDING CODE
==============================

Cursor should:
- Generate missing files
- Update imports automatically
- Handle all async operations
- Create helper classes for clarity
- Follow the functional requirements exactly
- Build the system as described in the long specification

==============================
FINAL ASSETS TO DELIVER
==============================

Cursor should output:
- Complete FastAPI backend
- Complete audio processing utilities
- Complete WebSocket bridge
- Complete RAG pipeline
- Complete transcript collector
- Complete Supabase integration
- Complete Twilio webhook handler
- Ready-to-run system using uvicorn
- Fully working local dev workflow with ngrok

==============================
START NOW
==============================

Begin by generating the full project structure and boilerplate code, then proceed to implement each file in detail.
Ensure the final output is a fully working system matching the design above.

If you need more context, request clarification; otherwise, proceed with generation.