Almost Timely News: 🗞️ Making AI More Efficient (2026-04-05) :: View in Browser

The Big Plug
👉 I’ve got a new course! GEO 101 for Marketers.
👉 Just updated! The Unofficial LinkedIn Algorithm Guide, March 2026, now with new information straight from LinkedIn!

Content Authenticity Statement
100% of this week’s newsletter content was originated by me, the human. You’ll see me working with Claude Code in the video version. Learn why this kind of disclosure is a good idea and might be required for anyone doing business in any capacity with the EU in the near future.

Watch This Newsletter On YouTube 📺
Click here for the video 📺 version of this newsletter on YouTube »
Click here for an MP3 audio 🎧 only version »

What’s On My Mind: Making AI More Efficient
In this week’s issue, I want to go deeper and more technical than the Trust Insights livestream we did this week on managing AI usage limits. In each week’s livestream, we focus on the practical, business-oriented “so what?” (hence the name of the show), but there’s a bunch of stuff that gets left on the mental cutting room floor because there either isn’t time, or it’s a rathole that would detract from the main point. This week’s newsletter is one of those ratholes.

The TLDR of the livestream is that with proper planning, governance, utilities, and model swapping, you can use AI efficiently and effectively. Go watch or read the livestream here. Now, let’s go down that rathole.

Part 1: AI Reinvents the Wheel
One of the biggest hidden efficiency costs in AI, especially agentic AI, is thinking. We have these great models now, reasoning models that do rough drafts behind the scenes and have conversations with themselves to arrive at far better conclusions. If you remember the early days of generative AI, when we’d prompt a model to “think out loud step by step” or “show your work”, what we were doing back then was creating reasoning manually. Now, for almost every model and tool on the market, that’s automatic.
It happens behind the scenes, with that little “Thinking…” label that pops up in various interfaces. That’s fine for casual use, but when you’re using agentic systems like Claude Cowork, Claude Code, etc. - anything that has usage limits - every scrap of thinking eats away at your usage quota.

Even more problematic: when AI starts doing deep problem solving, like writing code, it understands from our instructions what we want it to do, and then it has to figure out how to do it. This results in it reinventing the wheel many, many times over. For example, in coding, if you want it to process a regular expression (regex), you might have that in the instructions, and then your favorite agentic system will write regex code from scratch. Except… there’s absolutely no need to do that. Thousands of different regex libraries and packages already exist, and instead of burning tens of thousands of tokens writing it from scratch, it could just say “Oh, I’ll use this pre-existing solution that solves the problem perfectly” in like, 10 tokens.

And this isn’t just coding. Every time you write a web page, draft a marketing campaign, or do corporate strategy - anything where there’s a body of proven, existing knowledge - AI has a tendency to recreate it from scratch. This is an enormous waste.

Here’s why this matters: one way or another, you pay for every token you consume. If you’re on a fixed-fee plan like Claude Max or Gemini Ultra, you have fixed limits on how many tokens you can consume or how many requests you can make in a given period. Claude, for example, meters usage by a five-hour window and a weekly window. If you use more than your allotted amount, you either have to pay more or you can’t use the service until the next interval. If you’re using your tool via its API, you pay for every single token that goes through.
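To make that metering concrete, here’s a quick back-of-the-envelope sketch in Python. All of the prices and token counts are illustrative assumptions (a 50-cent-per-million rate, a heavy agentic session), not quotes from any provider’s actual price list:

```python
# Back-of-the-envelope token cost math. Every number here is an
# illustrative assumption, not real provider pricing.

def api_cost(tokens: int, usd_per_million: float) -> float:
    """Cost of a run billed per token via an API."""
    return tokens / 1_000_000 * usd_per_million

# A cheap model at an assumed $0.50 per million tokens sounds negligible
# for a single prompt...
print(api_cost(10_000, 0.50))       # roughly half a cent

# ...but a heavy agentic session can chew through hundreds of millions
# of tokens in an hour.
print(api_cost(300_000_000, 0.50))  # suddenly real money

# The "reinventing the wheel" tax: an agent hand-writing a solution
# (assume tens of thousands of tokens) vs. pointing at an existing
# library (assume ~10 tokens).
reinvent = api_cost(50_000, 0.50)
reuse = api_cost(10, 0.50)
print(f"saved per task: ${reinvent - reuse:.4f}")
```

The point isn’t the exact figures; it’s that per-token pricing multiplies against agentic volume, so every avoided token of “thinking” is money or quota back in your pocket.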
And while some models charge very small amounts of money, like 50 cents per million tokens, when you’re doing things like agentic AI, you can consume several hundred million tokens in an hour. The less AI we use, the less we pay, one way or the other.

One of the things I said to Katie on the livestream is that our goal, when using agentic AI, is paradoxically to use AI as little as possible. The more we can bring existing, pre-baked stuff into AI, the better results we’ll get, because we won’t be asking AI to reinvent the wheel constantly, and we won’t be paying to repeat work. This is something I’ve been teaching for years: the concept of knowledge blocks, pre-made chunks of data that you give to AI or make available so that it doesn’t have to repeat work. Last year the hype bros decided to brand this as “context engineering”, but it’s fundamentally all about managing the knowledge we give AI.

Today, context engineering is about not only the information we give AI, but also the utilities we give it. Command line interfaces, or CLIs, are text-based apps that you install locally on the machine your agentic system runs on. They look like they’re straight out of 1983, but those text-based apps are ideal for AI tools to use because they don’t have to click a mouse on anything; they can just type. And popular command line tools have been around for decades.

Some of the ones I use these days? Google Workspace has a command line tool that allows a tool like Claude Code to take control of any part of your Google Workspace: Gmail, your Google Calendar. If it’s in Google, it can control it, which means I can have it pull my agenda and plan things out. It’s fantastic.

Another one, the WordPress command line tool, allows a tool like Claude Code to programmatically manage your WordPress blog. It can write posts, rearrange things, turn plugins on and off, and even validate new plugins.
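As a sketch of what “programmatically manage your WordPress blog” looks like, here are a few wp-cli commands an agent could run instead of writing its own code from scratch. The post title and the plugin slug (“akismet”) are placeholders; these assume wp-cli is installed on the machine and pointed at your WordPress site:

```shell
# List recent posts without ever opening a browser.
wp post list --post_type=post --posts_per_page=5

# Draft a new post. Title and content here are placeholders.
wp post create --post_title='Hello from the agent' --post_status=draft

# Turn a plugin off and back on. "akismet" is just an example slug.
wp plugin deactivate akismet
wp plugin activate akismet

# Validate installed plugins against WordPress.org checksums.
wp plugin verify-checksums --all
```

Each of these is a one-line, text-only action an agentic tool can type directly, which is exactly why CLIs pair so well with agents: no screen scraping, no clicking, almost no tokens.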
A third one that I really love: the unofficial NotebookLM command line tool, which allows a tool - again, like Claude Code or Claude Cowork - to access NotebookLM from the command line and create new notebooks, upload sources, and create those audio podcasts.

There’s no limit to the number of command line tools out there that you can give to a system like Claude Code or OpenWork or OpenCode or Qwencode. Instead of having to reinvent the wheel constantly and burn that token budget, they can just pick up the tool, use it, and get great results by not using AI.

So how else do we use AI as little as possible while still getting fantastic results? Deep research, first principles, templates, and pointers.

Part 2: Deep Research
Humans have this concept called cognitive load;