January 20, 2025

NeuralRAG

NeuralRAG is a production-grade, self-correcting RAG pipeline that goes beyond basic retrieval and generation. It validates its own outputs, checks them for hallucinations, and automatically reformulates the query when an answer fails its quality checks.

Self-Correcting Pipeline

Built with LangGraph, the system runs a multi-step correction loop:

Retrieval: Hybrid BM25 + vector search across ChromaDB
Relevance Grading: LLM-based assessment of how relevant each retrieved document is to the query
Generation: Context-aware answer generation with LLaMA3
Hallucination Check: Verifies that generated answers are grounded in the source documents
Quality Assessment: Checks if the answer actually addresses the original question
Auto-Correction: Reformulates the query and re-retrieves when the answer fails the hallucination or quality check
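
The control flow of the steps above can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: `PipelineState`, `run_pipeline`, and the callables passed into it are hypothetical names, and the real graph may route between steps differently (for example, it may not discard low-relevance documents before generation). The retry limit of 3 is taken from the feature list.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_RETRIES = 3  # the feature list mentions up to 3 automatic retry cycles


@dataclass
class PipelineState:
    """Hypothetical state object carried through the loop."""
    question: str  # original user question, never modified
    query: str     # current (possibly reformulated) retrieval query
    documents: List[str] = field(default_factory=list)
    answer: str = ""
    retries: int = 0


def run_pipeline(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    answers_question: Callable[[str, str], bool],
    reformulate: Callable[[str], str],
) -> PipelineState:
    state = PipelineState(question=question, query=question)
    while True:
        # Retrieval, then relevance grading (assumption: irrelevant docs are dropped)
        candidates = retrieve(state.query)
        state.documents = [d for d in candidates if grade(state.question, d)]
        # Generation from the surviving context
        state.answer = generate(state.question, state.documents)
        # Hallucination check and quality assessment
        ok = is_grounded(state.answer, state.documents) and answers_question(
            state.question, state.answer
        )
        if ok or state.retries >= MAX_RETRIES:
            return state
        # Auto-correction: rewrite the query and go around again
        state.query = reformulate(state.query)
        state.retries += 1
```

Passing the steps in as callables keeps the loop testable without a model or vector store. In the actual system each step would presumably be a LangGraph node, with the retry decision expressed as a conditional edge.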

Multi-Modal Document Processing

PDF, DOCX, TXT, Markdown ingestion with intelligent chunking
Recursive text splitting with configurable chunk size and overlap
Metadata preservation for source attribution
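
To make "recursive text splitting with configurable chunk size and overlap" concrete, here is a small standard-library sketch. It is an assumption about the general technique, not the project's splitter: `chunk_text`, `_split`, and the separator list are hypothetical, separators are replaced by single spaces when pieces are rejoined, and the overlap is a plain character tail of the previous chunk, so it can start mid-word.

```python
from typing import List

SEPARATORS = ["\n\n", "\n", " ", ""]  # coarse to fine; "" means hard cut


def _split(text: str, limit: int, separators: List[str]) -> List[str]:
    """Recursively break text into pieces no longer than limit."""
    if len(text) <= limit:
        return [text] if text.strip() else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut at fixed width
        return [text[i:i + limit] for i in range(0, len(text), limit)]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(_split(part, limit, rest))
    return pieces


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    if not 0 <= overlap < chunk_size - 1:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size - 1")
    # Leave room for the overlap tail plus one joining space
    pieces = _split(text, chunk_size - overlap - 1, SEPARATORS)
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        chunks.append(current)
        # Seed the next chunk with the tail of the previous one
        tail = current[-overlap:] if overlap else ""
        current = f"{tail} {piece}" if tail else piece
    if current:
        chunks.append(current)
    return chunks
```

For source attribution, each chunk would typically be stored alongside metadata such as the originating file name and position; that part is omitted here.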

Key Features

Hybrid Search combining BM25 keyword matching with semantic vector similarity
Self-Correction Loop with up to 3 automatic retry cycles and query reformulation
Hallucination Detection ensuring answers stay grounded in retrieved context
Real-time Streaming via WebSocket-based responses
Modern Dashboard built with Next.js for document upload and chat
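
The source does not say how the BM25 and vector rankings are merged in the hybrid search. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as NeuralRAG's actual method. The function name, the document IDs, and the constant `k = 60` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for deterministic output
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]  # hypothetical semantic ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

RRF uses only rank positions, so it avoids having to normalize BM25 scores and cosine similarities onto a shared scale. A weighted sum of normalized scores is an equally plausible alternative.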

Technologies

Python: Backend pipeline
LLaMA3 (Ollama): Local LLM inference
ChromaDB: Vector database
LangGraph: Orchestration framework
FastAPI: REST API
Next.js: Frontend dashboard
Docker: Containerized deployment

Links

View on GitHub
