Retrieval-Augmented Generation (RAG) System for Legislative Text Analysis

Sep 30, 2025·
Christopher Teixeira
Christopher Teixeira
· 1 min read

This research effort involved the end-to-end development of a Retrieval-Augmented Generation (RAG) system using a custom embedding model to enable highly accurate and context-aware analysis of legislative documents. The primary goal was to create a robust, full-cycle AI solution that transforms raw legislative text into validated, consistent, and easily visualizable outputs. My role encompassed the entire development lifecycle, from data engineering and system design to final result validation and front-end visualization.

RAG Implementation & Data Engineering: Orchestrated the collection and ingestion of documents from multiple heterogeneous sources to construct the foundational knowledge base for the RAG system using a custom embedding model.

Custom Ranking & Search Enhancement: Designed and implemented a custom ranking heuristic to significantly enhance the relevance and accuracy of search results, utilizing frameworks like Chroma and LangChain within a Python environment.

Full-Cycle AI System Design: Developed the complete analytical workflow for legislative text, which included engineering specialized system and user prompts to guide the process from initial data input/query to final validation.

Validation & User Interface: Implemented a system of chained prompts to automatically validate the AI analysis against source documents, and visualized the final, validated outputs using HTML and JavaScript to ensure data integrity and user consistency.

Christopher Teixeira
Authors
Principal Data Scientist
My interests include using my skills for the public good and playing with baseball data.