Ai-Engineering on Tyler Wells

https://www.tylerwells.dev/tags/ai-engineering/
Recent content in Ai-Engineering on Tyler Wells
Last updated: Sun, 03 May 2026 00:00:00 +0000

Cut OpenClaw API Costs by 95%
https://www.tylerwells.dev/posts/openclaw-cost-cutting/
Sun, 03 May 2026 00:00:00 +0000
OpenClaw bills can hit $600/month if you're not careful. Here's how to get that down to $20 without breaking core functionality via model routing, local inference, prompt caching, and smarter scheduling.

What I Learned Building a Multi-Agent Document Analysis System
https://www.tylerwells.dev/posts/multi-agent-doc-analysis-post-4-lessons-learned/
Fri, 24 Apr 2026 00:00:00 +0000
[Multi-Agent Series 4/4] What worked, what broke, how chunking caused false conclusions, and what I'd do differently in v2.

Coordinating Multiple LLM Agents: Cross-Domain Synthesis
https://www.tylerwells.dev/posts/multi-agent-doc-analysis-post-3-coordinator/
Thu, 23 Apr 2026 00:00:00 +0000
[Multi-Agent Series 3/4] How a coordinator turns specialist findings into cross-domain insights and decision-ready recommendations.

Building Specialist LLM Agents: Technical, Risk, Cost, and Timeline Analysis
https://www.tylerwells.dev/posts/multi-agent-doc-analysis-post-2-specialists/
Wed, 22 Apr 2026 00:00:00 +0000
[Multi-Agent Series 2/4] How specialist prompts, shared inheritance, and structured JSON outputs turn one LLM into multiple domain-specific analyzers.

Why Multi-Agent Systems Beat Single Agents for Complex Documents
https://www.tylerwells.dev/posts/multi-agent-doc-analysis-post-1-why-multi-agent/
Tue, 21 Apr 2026 00:00:00 +0000
[Multi-Agent Series 1/4] Why one large prompt breaks down on RFPs and contracts, and how specialist agents plus coordinator synthesis produce better analysis.

Building Ozark Ridge: Lessons Learned and What I'd Do Differently
https://www.tylerwells.dev/posts/ozark-ridge-post-4-lessons-learned/
Thu, 16 Apr 2026 00:00:00 +0000
[Retail AI Series 4/4] What worked, what didn't, what I'd do differently in v2, and why this project matters for e-commerce AI.

Building the AI Product Assistant: Context Injection, Multi-Turn Chat, and Cross-Product Retrieval
https://www.tylerwells.dev/posts/ozark-ridge-post-3-ai-assistant/
Wed, 15 Apr 2026 00:00:00 +0000
[Retail AI Series 3/4] How to build a Rufus-style AI assistant that answers product questions, suggests complementary gear, and builds camping loadouts, with conversation history, context injection, and dynamic retrieval.

Keyword Search vs Semantic Search: Why Natural Language Queries Need Vector Embeddings
https://www.tylerwells.dev/posts/ozark-ridge-post-2-keyword-vs-semantic/
Tue, 14 Apr 2026 00:00:00 +0000
[Retail AI Series 2/4] Side-by-side comparison of keyword and semantic search, why keyword search fails on natural language queries, and what the retrieval scores actually tell you.

Building AI Search for a Retail Website: The Stack and Why
https://www.tylerwells.dev/posts/ozark-ridge-post-1-stack/
Sun, 12 Apr 2026 00:00:00 +0000
[Retail AI Series 1/4] Building a mock outdoor retail site with AI-powered product search and a Rufus-style assistant. This post covers the architecture, stack decisions, the indexing pipeline, and why RAG matters for e-commerce.

AI-Powered QA Testing with playwright-cli and GitHub Copilot
https://www.tylerwells.dev/posts/playwright-cli-qa-post/
Thu, 09 Apr 2026 00:00:00 +0000
How to use playwright-cli with GitHub Copilot as an autonomous QA agent: without Playwright MCP, without writing test code, and without needing Copilot Vision or an embedded browser.

What I Learned Building a LangGraph Agent From Scratch
https://www.tylerwells.dev/posts/langgraph-agent-post/
Mon, 30 Mar 2026 00:00:00 +0000
Building a LangGraph job research agent showed me where agent loops actually help, why typed state matters, and how conditional edges change the design.

Your MCP Server Is Only as Good as Its Docstrings
https://www.tylerwells.dev/posts/building-a-cfb-mcp-server/
Sun, 15 Mar 2026 00:00:00 +0000
Building a college football data MCP server taught me that the most important design decision in an agentic system isn't the architecture; it's the docstrings.

How a Simple Power Automate Workflow Automated 250+ Hours of Work Per Month
https://www.tylerwells.dev/posts/retail-power-automate-workflow/
Fri, 20 Feb 2026 00:00:00 +0000
Eliminate costs tied to repetitive corporate tasks with simple workflows like this: a Microsoft Form + a Power Automate flow + an AI prompt.

Making the Case for AI-First Engineering
https://www.tylerwells.dev/posts/ai-first-engineering-advocacy/
Sun, 08 Feb 2026 00:00:00 +0000
How I made the case for an AI-first workflow and what we learned from a pilot that increased team velocity by 300%.

Scoring RAG Answer Quality with an LLM Judge
https://www.tylerwells.dev/posts/rag-answer-judging/
Mon, 26 Jan 2026 00:00:00 +0000
[RAG Series 3/3] Source URL retrieval tells you whether the right content was retrieved. It doesn't tell you whether the answer was any good. Adding an LLM judge to the eval harness reveals two failure modes that retrieval scoring alone can't see.

How to Design RAG Eval Test Cases
https://www.tylerwells.dev/posts/design-rag-eval-test-cases/
Sat, 24 Jan 2026 00:00:00 +0000
[RAG Series 2/3] How to write test cases that catch real retrieval problems, why source URL retrieval is a useful proxy metric, and when it isn't enough.

RAG Retrieval: Chunking, Embeddings, Reranking, and an Eval
https://www.tylerwells.dev/posts/rag-retrieval-quality/
Thu, 22 Jan 2026 00:00:00 +0000
[RAG Series 1/3] Covers chunking strategy, embedding model consistency, reranking, and building an eval harness, including what happened when Voyage AI's free-tier rate limits forced a more resilient architecture.