AI Engineering

When Does Adding Fancy RAG Features Work?

an article about overengineering a RAG system, adding fancy things like query optimization, detailed chunking with neighbors and keys, together with expanding the context. The argument against this type of work is that for a...

HNSW at Scale: Why Your RAG System Gets Worse because the Vector Database Grows

a contemporary vector database—Neo4j, Milvus, Weaviate, Qdrant, Pinecone—there may be a really high likelihood that Hierarchical Navigable Small World (HNSW) is already powering your retrieval layer. It is kind of likely you probably did...

Production-Grade Observability for AI Agents: A Minimal-Code, Configuration-First Approach

grow more complex, traditional logging and monitoring fall short. What teams really want is observability: the power to trace agent decisions, evaluate response quality mechanically, and detect drift over time—without writing and maintaining...

GraphRAG in Practice: The way to Construct Cost-Efficient, High-Recall Retrieval Systems

article, , I outlined the core principles of GraphRAG design and introduced an augmented retrieval-and-generation pipeline that mixes graph search with vector search. I also discussed why constructing a wonderfully complete graph—one which...

How We Are Testing Our Agents in Dev

Why testing agents is so hard AI agent is performing as expected just isn't easy. Even small tweaks to components like your prompt versions, agent orchestration, and models can have large and unexpected impacts.  Among...

Notes on LLM Evaluation

, one could argue that the majority of the work resembles traditional software development greater than ML or Data Science, considering we regularly use off-the-shelf foundation models as a substitute of coaching them ourselves....

The Only Guide You Must Superb-Tune Llama 3 or Any Other Open Source Model

Superb-tuning large language models (LLMs) like Llama 3 involves adapting a pre-trained model to specific tasks using a domain-specific dataset. This process leverages the model's pre-existing knowledge, making it efficient and cost-effective in comparison...

Recent posts

Popular categories

ASK ANA