# RAG Implementation
RAG (Retrieval-Augmented Generation) implementation workflow covering embedding selection, vector database setup, chunking strategies, and retrieval optimization.
## Overview
Specialized workflow for implementing RAG (Retrieval-Augmented Generation) systems, including embedding model selection, vector database setup, chunking strategies, retrieval optimization, and evaluation.
## When to Use This Workflow
Use this workflow when:
- Building RAG-powered applications
- Implementing semantic search
- Creating knowledge-grounded AI
- Setting up document Q&A systems
- Optimizing retrieval quality
## Workflow Phases
### Phase 1: Requirements Analysis
#### Skills to Invoke
- `ai-product` - AI product design
- `rag-engineer` - RAG engineering
#### Actions
1. Define use case
2. Identify data sources
3. Set accuracy requirements
4. Determine latency targets
5. Plan evaluation metrics
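
One way to make the outcome of this phase concrete is to record the targets as data that later phases can check against. The sketch below is illustrative only: the `RagRequirements` class, its field names, and the example numbers are assumptions, not part of this workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagRequirements:
    """Targets agreed on before implementation starts (hypothetical structure)."""

    use_case: str
    data_sources: tuple[str, ...]
    min_recall_at_k: float       # accuracy requirement for retrieval
    k: int                       # cutoff used for recall@k
    max_p95_latency_ms: float    # end-to-end latency target
    metrics: tuple[str, ...] = ("recall@k", "mrr", "answer_faithfulness")

    def unmet(self, recall_at_k: float, p95_latency_ms: float) -> list[str]:
        """Return a readable list of the targets that the measurements miss."""
        problems = []
        if recall_at_k < self.min_recall_at_k:
            problems.append(
                f"recall@{self.k} is {recall_at_k:.2f}, "
                f"target is {self.min_recall_at_k:.2f}"
            )
        if p95_latency_ms > self.max_p95_latency_ms:
            problems.append(
                f"p95 latency is {p95_latency_ms:.0f} ms, "
                f"target is {self.max_p95_latency_ms:.0f} ms"
            )
        return problems


# Example values are placeholders; pick numbers that match the real use case.
requirements = RagRequirements(
    use_case="internal documentation Q&A",
    data_sources=("wiki", "runbooks"),
    min_recall_at_k=0.85,
    k=5,
    max_p95_latency_ms=2000,
)
print(requirements.unmet(recall_at_k=0.80, p95_latency_ms=1500))
```

Keeping the targets in one place means the evaluation phase can call `unmet(...)` with measured values instead of relying on informal judgment.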
#### Copy-Paste Prompts
### Phase 2: Embedding Selection
#### Skills to Invoke
- `embedding-strategies` - Embedding selection
- `rag-engineer` - RAG patterns
#### Actions
1. Evaluate embedding models
2. Test domain relevance
3. Measure embedding quality
4. Consider cost/latency
5. Select model
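
The evaluation steps above can be sketched as a small harness that scores each candidate on a handful of labeled queries from the target domain. The two "embedders" here (`word_embed`, `trigram_embed`) are toy stand-ins so the example runs without dependencies; in practice each would wrap a real embedding model. All function names and the sample data are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable

# Toy sparse "embeddings": token -> count. Real models return dense vectors.
SparseVector = Counter


def word_embed(text: str) -> SparseVector:
    """Bag-of-words stand-in for an embedding model."""
    return Counter(text.lower().split())


def trigram_embed(text: str) -> SparseVector:
    """Character-trigram stand-in for a second candidate model."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def cosine(a: SparseVector, b: SparseVector) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hit_rate_at_k(
    embed: Callable[[str], SparseVector],
    docs: list[str],
    labeled_queries: list[tuple[str, int]],
    k: int,
) -> float:
    """Fraction of queries whose labeled document appears in the top k."""
    doc_vecs = [embed(d) for d in docs]
    hits = 0
    for query, relevant_index in labeled_queries:
        q = embed(query)
        ranked = sorted(
            range(len(docs)), key=lambda i: cosine(q, doc_vecs[i]), reverse=True
        )
        hits += relevant_index in ranked[:k]
    return hits / len(labeled_queries)


# Placeholder domain sample: (query, index of the document that answers it).
docs = ["resetting your password", "billing and invoices", "shipping times and tracking"]
labeled = [("password reset", 0), ("billing question", 1), ("tracking a package", 2)]

for name, embed in {"word": word_embed, "trigram": trigram_embed}.items():
    print(name, hit_rate_at_k(embed, docs, labeled, k=1))
```

The same loop is a natural place to record per-model cost and latency next to the quality score, so the final selection weighs all three.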
#### Copy-Paste Prompts
### Phase 3: Vector Database Setup
#### Skills to Invoke
- `vector-database-engineer` - Vector DB
- `similarity-search-patterns` - Similarity search
#### Actions
1. Choose vector database
2. Design schema
3. Configure indexes
4. Set up connection
5. Test queries
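
Before committing to a specific product, it can help to pin down the record schema and query contract with an in-memory stand-in. The sketch below is not any real vector database's API: `Record`, `InMemoryVectorStore`, and the `where` filter are hypothetical names, and the search is exact brute-force cosine similarity rather than the approximate indexes production systems typically use.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Record:
    """Schema sketch: an id, the embedding, the source text, and metadata."""

    id: str
    vector: list[float]
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Brute-force stand-in used to test schema and queries, not for production."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._records: dict[str, Record] = {}

    def upsert(self, record: Record) -> None:
        # Reject vectors whose size does not match the configured dimension.
        if len(record.vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(record.vector)}")
        self._records[record.id] = record

    def query(
        self,
        vector: list[float],
        top_k: int = 3,
        where: Optional[dict[str, Any]] = None,
    ) -> list[tuple[float, Record]]:
        """Return up to top_k (score, record) pairs, best match first."""
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        filters = where or {}
        candidates = [
            r for r in self._records.values()
            if all(r.metadata.get(key) == value for key, value in filters.items())
        ]
        scored = [(_cosine(vector, r.vector), r) for r in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore(dim=2)
store.upsert(Record("a", [1.0, 0.0], "alpha", {"lang": "en"}))
store.upsert(Record("b", [0.0, 1.0], "beta", {"lang": "de"}))
store.upsert(Record("c", [0.9, 0.1], "gamma", {"lang": "de"}))
print([r.id for _, r in store.query([1.0, 0.0], top_k=2)])
```

Test queries written against this stand-in (nearest neighbours, metadata filters, dimension checks) can then be rerun against whichever database is chosen.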
#### Copy-Paste Prompts
### Phase 4: Chunking Strategy
#### Skills to Invoke
- `rag-engineer` - Chunking strategies
- `rag-implementation` - RAG implementation
#### Actions
1. Choose chunk size
2. Implement chunking
3. Add overlap handling
4. Create metadata
5. Test retrieval quality
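
A minimal sketch of fixed-size chunking with overlap and per-chunk metadata follows. It splits on whitespace and counts words, which is an assumption made for simplicity; token-based or structure-aware splitting may suit real documents better. The function name, parameter defaults, and metadata keys are illustrative.

```python
def chunk_text(
    text: str,
    chunk_size: int = 200,
    overlap: int = 40,
    source: str = "unknown",
) -> list[dict]:
    """Split text into overlapping word windows, attaching simple metadata."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + chunk_size]
        chunks.append(
            {
                "text": " ".join(window),
                "source": source,
                "chunk_index": len(chunks),
                "start_word": start,
            }
        )
        # Stop once the window has reached the end of the document,
        # so no trailing chunk consists only of already-covered words.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=1, source="sample.txt"):
    print(chunk["chunk_index"], chunk["text"])
```

The overlap repeats the tail of one chunk at the head of the next, which reduces the chance that a sentence relevant to a query is cut in half at a boundary. Chunk size and overlap are worth tuning against retrieval quality rather than fixing up front.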
#### Copy-Paste Prompts
### Phase 5: Retrieval Implementation
#### Skills to Invoke
- `similarity-search-patterns` - Similarity search
- `hybrid-search-implementation` - Hybrid search
#### Actions
1. Implement vector search
2. Add keyword search
3. Configure hybrid search
4. Set up reranking
5. Optimize latency
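
This workflow does not prescribe how vector and keyword results should be combined. One common option is Reciprocal Rank Fusion (RRF), sketched below: each document scores `1 / (k + rank)` in every ranking it appears in, and the scores are summed. The function name and the sample rankings are assumptions; `k = 60` is a commonly used default, not a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several best-first rankings of document ids into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Placeholder results from the two retrievers, best match first.
vector_results = ["a", "b", "c"]
keyword_results = ["b", "d", "a"]

fused = reciprocal_rank_fusion([vector_results, keyword_results])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

RRF uses only ranks, so it avoids having to normalize cosine similarities against keyword scores such as BM25. The fused list is also a reasonable candidate set to hand to a reranker.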
#### Copy-Paste Prompts
### Phase 6: LLM Integration
#### Skills to Invoke
- `llm-application-dev-ai-assistant` - LLM integration
- `llm-application-dev-prompt-optimize` - Prompt optimization
#### Actions
1. Select LLM provider
2. Design prompt template
3. Implement context injection
4. Add citation handling
5. Test generation quality
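
The prompt template, context injection, and citation steps can be sketched without committing to a provider. The template wording, the `[n]` citation format, the character budget, and both function names below are assumptions for illustration; the actual LLM call is deliberately left out.

```python
import re

PROMPT_TEMPLATE = """Answer the question using only the numbered sources below.
Cite sources inline as [n]. If the sources do not contain the answer, say so.

Sources:
{sources}

Question: {question}
Answer:"""


def build_prompt(question: str, chunks: list[dict], max_chars: int = 4000) -> str:
    """Inject retrieved chunks as numbered sources, within a character budget."""
    lines = []
    used = 0
    for number, chunk in enumerate(chunks, start=1):
        entry = f"[{number}] ({chunk['source']}) {chunk['text']}"
        if used + len(entry) > max_chars:
            break  # drop lower-ranked chunks once the budget is exhausted
        lines.append(entry)
        used += len(entry)
    return PROMPT_TEMPLATE.format(sources="\n".join(lines), question=question)


def extract_citations(answer: str, num_sources: int) -> list[int]:
    """Return the distinct, valid source numbers cited in the answer."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if 1 <= n <= num_sources)


chunks = [
    {"source": "faq.md", "text": "Reset your password in Settings."},
    {"source": "guide.md", "text": "Invoices are emailed monthly."},
]
prompt = build_prompt("How do I reset my password?", chunks)
print(prompt)
print(extract_citations("Use Settings [1]. See also [2] and [9].", num_sources=2))
```

Filtering out citation numbers that do not match an injected source (such as `[9]` above) gives a cheap check for fabricated references when testing generation quality.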
#### Copy-Paste Prompts
### Phase 7: Caching
#### Skills to Invoke
- `prompt-caching` - Prompt caching
- `rag-engineer` - RAG optimization
#### Actions
1. Implement response caching
2. Set up embedding cache
3. Configure TTL
4. Add cache invalidation
5. Monitor hit rates
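
The caching steps above can be sketched as a small in-process cache with a TTL, explicit invalidation, and hit-rate counters. This is a single-process illustration under stated assumptions: the `TTLCache` class and its methods are hypothetical, and a shared deployment would more likely sit on an external cache. The same structure works for responses (keyed by query) and embeddings (keyed by chunk text).

```python
import hashlib
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory cache with per-entry expiry and hit/miss counters."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: dict[str, tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(text: str) -> str:
        """Normalize case and whitespace so trivially different inputs share a key."""
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is not None and self._clock() < entry[0]:
            self.hits += 1
            return entry[1]
        self._store.pop(key, None)  # drop the entry if it has expired
        self.misses += 1
        return None

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self.ttl, value)

    def invalidate(self, key: str) -> None:
        """Remove an entry, for example after its source document changes."""
        self._store.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = TTLCache(ttl_seconds=300)
key = TTLCache.key_for("What is RAG?")
if cache.get(key) is None:
    cache.set(key, "cached answer placeholder")
print(cache.get(key), cache.hit_rate())
```

Because `get` returns `None` on a miss, this sketch cannot distinguish a miss from a cached `None`; a sentinel value would be needed if that matters.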
#### Copy-Paste Prompts
### Phase 8: Evaluation
#### Skills to Invoke
- `llm-evaluation` - LLM evaluation
- `evaluation` - AI evaluation
#### Actions
1. Define evaluation metrics
2. Create test dataset
3. Measure retrieval accuracy
4. Evaluate generation quality
5. Iterate on improvements
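
For the retrieval side of evaluation, two widely used metrics are recall@k and mean reciprocal rank (MRR). The sketch below computes both over a small labeled test set. The function names, the test-set shape (query plus a set of relevant document ids), and the fake retriever are assumptions; generation quality needs separate checks and is not covered here.

```python
from typing import Callable


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Share of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate_retrieval(
    test_set: list[tuple[str, set[str]]],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> dict[str, float]:
    """Average recall@k and MRR over (query, relevant ids) pairs."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    recalls, ranks = [], []
    for query, relevant in test_set:
        retrieved = retrieve(query)
        recalls.append(recall_at_k(retrieved, relevant, k))
        ranks.append(reciprocal_rank(retrieved, relevant))
    n = len(test_set)
    return {f"recall@{k}": sum(recalls) / n, "mrr": sum(ranks) / n}


# Stand-in retriever with canned results; replace with the real pipeline.
canned = {"q1": ["d2", "d1", "d3"], "q2": ["d9", "d8"]}
test_set = [("q1", {"d1"}), ("q2", {"d7"})]

report = evaluate_retrieval(test_set, canned.__getitem__, k=2)
print(report)  # {'recall@2': 0.5, 'mrr': 0.25}
```

Rerunning the same test set after each change to chunking, embeddings, or fusion makes it possible to tell whether an iteration actually improved retrieval.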
#### Copy-Paste Prompts
## RAG Architecture
## Quality Gates
- [ ] Embedding model selected
- [ ] Vector DB configured
- [ ] Chunking implemented
- [ ] Retrieval working
- [ ] LLM integrated
- [ ] Evaluation passing
## Related Workflow Bundles
- `ai-ml` - AI/ML development
- `ai-agent-development` - AI agents
- `database` - Vector databases