sgnt.ai

I'm Peter Sergeant. My current area of interest is working with the outputs of LLMs and using them to drive agents. Most of my work at the moment is building NPCs for online games. There is also an about page.

Latest Updates

Forget the complexity: AI all boils down to drawing the right lines

August 2, 2025

Over and over again, despite the best efforts of humans, the most effective AI systems come down to one simple idea: finding the right-shaped line that fits some data points.


RAG chunking isn’t one problem, it’s three

July 7, 2025

Existing articles often focus on a chunk as a singular concept: you split the article into paragraphs, say, and use these to feed the LLM, generate embeddings, and quote back to the user. But that's three problems, not one.


Understanding Modern AI is Understanding Embeddings: A Guide for Non-Programmers (with lots of dogs!)

May 8, 2025

Embeddings are a core AI concept that underpins a great deal of what we think of today as AI. This article is going to give you an accurate and intuitive understanding of what an “embedding” is in less time than it takes to eat a (very large) bagel.


When Users Won’t Wait: Engineering Killable LLM Responses

April 22, 2025

In our application, the chatbot can’t hide behind a loading spinner; users keep talking and expect it to pivot instantly. This constraint forced us to develop some lightweight techniques you can graft onto your own LLM app that serves impatient users.


In-memory free-text search is a super-power for LLMs

April 19, 2025

While working on LLM-driven NPCs, I observed significant improvements in several areas by adding a simple component: in-memory free-text search.


Get the hell out of the LLM as soon as possible

April 1, 2025

Don’t let an LLM make decisions or implement business logic: they suck at that.


Four bad definitions of "Agentic AI"

March 30, 2025

If your team promises to deliver (or buy!) 'Agentic AI', then everyone needs to have a shared understanding of what that means; you don't want to be the one left trying to explain the mismatch to stakeholders six months later. There's no current (2025-03-30) widely accepted definition, so if you're using the term, be clear on what you mean, and if someone else is using the term, it's worth figuring out which one they mean.


Street-fighting RAG: Chain-of-thought prompting

January 7, 2025

or, reducing hallucination and making in-generation adjustments to LLM responses


Series

Street-fighting RAG
