Generative AI Engineering

AI Agent & RAG Infrastructure

This project represents my current work as an AI Engineer at IT&M S.R.L. It covers the end-to-end design and implementation of an advanced AI agent system that solves complex business problems by combining Large Language Models (LLMs) with verified internal data.

The Challenge

In enterprise environments, standard LLMs often hallucinate or lack specific domain knowledge. The goal was to engineer a system that could provide context-aware, reliable answers grounded in a strictly verified knowledge base. We needed a solution that was not only accurate but also scalable and capable of interacting with existing legacy platforms.

RAG Architecture

To achieve high reliability, I built the system on Retrieval-Augmented Generation (RAG). Unlike a standard chatbot, it first retrieves relevant information from our private vector database and only then generates a response, using the retrieved passages as context. Because every answer is grounded in concrete company data, errors are reduced and user trust increases.
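The retrieve-then-generate flow can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `DOCUMENTS` list stands in for the private vector database, `embed` is a toy bag-of-words function rather than a real embedding model, and `build_prompt` only assembles the grounded prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the private knowledge base (illustrative data only).
DOCUMENTS = [
    "Refunds are processed within 14 days of the return request.",
    "Support is available Monday to Friday, 9:00 to 18:00 CET.",
    "Enterprise plans include a dedicated account manager.",
]


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How long do refunds take?"))
```

In a production setup the linear scan in `retrieve` would be replaced by a nearest-neighbour query against the vector database, but the order of operations (retrieve first, then generate from the retrieved context) stays the same.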

Orchestration & Scalability

A key part of my role was architecting the infrastructure to manage this complexity. I built a scalable system capable of orchestrating asynchronous API calls to multiple Large Language Models simultaneously. The system is optimized to:

Handle concurrent user requests without bottlenecks.
Aggregate complex query results from different sources.
Deliver a single, cohesive final response to the user.
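The three goals above can be sketched with `asyncio`. This is a simplified illustration, not the production code: `call_model` is a hypothetical stand-in that simulates a provider call with a short sleep, the model names are placeholders, and aggregation is reduced to joining the successful answers.

```python
import asyncio


async def call_model(name: str, prompt: str) -> str:
    """Hypothetical stand-in for an async HTTP call to an LLM provider."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"[{name}] answer to: {prompt}"


async def answer(prompt: str, models: list[str], limit: asyncio.Semaphore) -> str:
    """Query all models concurrently and merge their answers into one response."""

    async def guarded(name: str) -> str:
        # The semaphore caps the number of in-flight calls across all users.
        async with limit:
            return await call_model(name, prompt)

    results = await asyncio.gather(
        *(guarded(m) for m in models), return_exceptions=True
    )
    # Drop failed calls so one unavailable model does not fail the whole request.
    successful = [r for r in results if not isinstance(r, Exception)]
    return "\n".join(successful)


async def main() -> list[str]:
    limit = asyncio.Semaphore(4)
    prompts = ["What is our refund policy?", "Who handles enterprise accounts?"]
    models = ["model-a", "model-b"]  # placeholder model names
    # Each prompt represents one concurrent user request.
    return await asyncio.gather(*(answer(p, models, limit) for p in prompts))


if __name__ == "__main__":
    for response in asyncio.run(main()):
        print(response)
```

`asyncio.gather` returns results in the order the calls were submitted, so the merged response is deterministic even though the calls complete in any order.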

Integration via GraphQL

Modern AI needs to talk to older systems. I leveraged GraphQL for high-performance integration. This allowed efficient querying and data manipulation between the new AI agent and the company's existing legacy platforms, ensuring that the AI could take action based on its insights.
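As a rough sketch of what such an integration looks like from the agent's side, the snippet below builds GraphQL requests with the Python standard library. The endpoint URL, the `order` query, and the `updateOrderStatus` mutation are invented for illustration and do not reflect the actual schema; only the request shape (a JSON body with `query` and `variables`, and a response with `data` and `errors`) follows the usual GraphQL-over-HTTP convention.

```python
import json
import urllib.request

# Hypothetical endpoint; the real legacy platform URL is not part of this write-up.
GRAPHQL_URL = "https://legacy.example.internal/graphql"

# Hypothetical read operation: fetch only the fields the agent needs.
ORDER_QUERY = """
query Order($id: ID!) {
  order(id: $id) { id status }
}
"""

# Hypothetical write operation: lets the agent act on its conclusions.
UPDATE_MUTATION = """
mutation SetStatus($id: ID!, $status: String!) {
  updateOrderStatus(id: $id, status: $status) { id status }
}
"""


def build_request(document: str, variables: dict) -> urllib.request.Request:
    """Wrap a GraphQL document and its variables in a JSON POST request."""
    body = json.dumps({"query": document, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute(document: str, variables: dict) -> dict:
    """Send the request and return the `data` field, raising on GraphQL errors."""
    with urllib.request.urlopen(build_request(document, variables), timeout=10) as resp:
        payload = json.load(resp)
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    return payload["data"]
```

A call such as `execute(UPDATE_MUTATION, {"id": "42", "status": "SHIPPED"})` would then perform the write, assuming the server exposes a mutation with that name.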