Hacker News

by robotswantdataon 6/30/2025, 8:53 PMwith 247 comments

by JohnMakinon 6/30/2025, 9:29 PM

> Building powerful and reliable AI Agents is becoming less about finding a magic prompt or model updates.

Ok, I can buy this

> It is about the engineering of context and providing the right information and tools, in the right format, at the right time.

when the "right" format and "right" time are essentially, and maybe even necessarily, undefined, then aren't you still reaching for a "magic" solution?

If the definition of "right" information is "information which results in a sufficiently accurate answer from a language model" then I fail to see how you are doing anything fundamentally differently than prompt engineering. Since these are non-deterministic machines, I fail to see any reliable heuristic that is fundamentally indistinguishable than "trying and seeing" with prompts.

by rTX5CMRXIfFGon 7/1/2025, 4:17 AM

So then for code generation purposes, how is “context engineering” different now from writing technical specs? Providing the LLMs the “right amount of information” means writing specs that cover all states and edge cases. Providing the information “at the right time” means writing composable tech specs that can be interlinked with each other so that you can prompt the LLM with just the specs for the task at hand.

by simonwon 6/30/2025, 9:25 PM

I wrote a bit about this the other day: https://simonwillison.net/2025/Jun/27/context-engineering/

Drew Breunig has been doing some fantastic writing on this subject - coincidentally at the same time as the "context engineering" buzzword appeared but actually unrelated to that meme.

How Long Contexts Fail - https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-ho... - talks about the various ways in which longer contexts can start causing problems (also known as "context rot")

How to Fix Your Context - https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.... - gives names to a bunch of techniques for working around these problems including Tool Loadout, Context Quarantine, Context Pruning, Context Summarization, and Context Offloading.

by bgwalteron 6/30/2025, 11:16 PM

These discussions increasingly remind me of gamers discussing various strategies in WoW or similar. Purportedly working strategies found by trial and error and discussed in a language that is only intelligible to the in-group (because no one else is interested).

We are entering a new era of gamification of programming, where the power users force their imaginary strategies on innocent people by selling them to the equally clueless and gaming-addicted management.

by benreesmanon 6/30/2025, 11:32 PM

The new skill is programming, same as the old skill. To the extent these things are comprehensible, you understand them by writing programs: programs that train them, programs that run inferenve, programs that analyze their behavior. You get the most out of LLMs by knowing how they work in detail.

I had one view of what these things were and how they work, and a bunch of outcomes attached to that. And then I spent a bunch of time training language models in various ways and doing other related upstream and downstream work, and I had a different set of beliefs and outcomes attached to it. The second set of outcomes is much preferable.

I know people really want there to be some different answer, but it remains the case that mastering a programming tool involves implemtenting such, to one degree or another. I've only done medium sophistication ML programming, and my understand is therefore kinda medium, but like compilers, even doing a medium one is the difference between getting good results from a high complexity one and guessing.

Go train an LLM! How do you think Karpathy figured it out? The answer is on his blog!

by baxtron 6/30/2025, 9:19 PM

>Conclusion

Building powerful and reliable AI Agents is becoming less about finding a magic prompt or model updates. It is about the engineering of context and providing the right information and tools, in the right format, at the right time. It’s a cross-functional challenge that involves understanding your business use case, defining your outputs, and structuring all the necessary information so that an LLM can “accomplish the task."

That’s actually also true for humans: the more context (aka right info at the right time) you provide the better for solving tasks.

by munificenton 7/1/2025, 1:54 AM

All of these blog posts to me read like nerds speedrunning "how to be a tech lead for a non-disastrous internship".

Yes, if you have an over-eager but inexperienced entity that wants nothing more to please you by writing as much code as possible, as the entity's lead, you have to architect a good space where they have all the information they need but can't get easily distracted by nonessential stuff.

by zaptheimpaleron 7/1/2025, 12:11 AM

I feel like this is incredibly obvious to anyone who's ever used an LLM or has any concept of how they work. It was equally obvious before this that the "skill" of prompt-engineering was a bunch of hacks that would quickly cease to matter. Basically they have the raw intelligence, you now have to give them the ability to get input and the ability to take actions as output and there's a lot of plumbing to make that happen.

by ozimon 6/30/2025, 9:55 PM

Finding a magic prompt was never “prompt engineering” it was always “context engineering” - lots of “AI wannabe gurus” sold it as such but they never knew any better.

RAG wasn’t invented this year.

Proper tooling that wraps esoteric knowledge like using embeddings, vector dba or graph dba becomes more mainstream. Big players improve their tooling so more stuff is available.

by crystal_revengeon 6/30/2025, 9:25 PM

Definitely mirrors my experience. One heuristic I've often used when providing context to model is "is this enough information for a human to solve this task?". Building some text2SQL products in the past it was very interesting to see how often when the model failed, a real data analyst would reply something like "oh yea, that's an older table we don't use any more, the correct table is...". This means the model was likely making a mistake that a real human analyst would have without the proper context.

One thing that is missing from this list is: evaluations!

I'm shocked how often I still see large AI projects being run without any regard to evals. Evals are more important for AI projects than test suites are for traditional engineering ones. You don't even need a big eval set, just one that covers your problem surface reasonably well. However without it you're basically just "guessing" rather than iterating on your problem, and you're not even guessing in a way where each guess is an improvement on the last.

edit: To clarify, I ask myself this question. It's frequently the case that we expect LLMs to solve problems without the necessary information for a human to solve them.

by mountainriveron 6/30/2025, 11:38 PM

You can give most of the modern LLMs pretty darn good context and they will still fail. Our company has been deep down this path for over 2 years. The context crowd seems oddly in denial about this

by dinvladon 6/30/2025, 10:32 PM

I feel like ppl just keep inventing concepts for the same old things, which come down to dancing with the drums around the fire and screaming shamanic incantations :-)

by CharlieDigitalon 6/30/2025, 10:14 PM

I was at a startup that started using OpenAI APIs pretty early (almost 2 years ago now?).

"Back in the day", we had to be very sparing with context to get great results so we really focused on how to build great context. Indexing and retrieval were pretty much our core focus.

Now, even with the larger windows, I find this still to be true.

The moat for most companies is actually their data, data indexing, and data retrieval[0]. Companies that 1) have the data and 2) know how to use that data are going to win.

My analogy is this:

    > The LLM is just an oven; a fantastical oven.  But for it to produce a good product still depends on picking good ingredients, in the right ratio, and preparing them with care.  You hit the bake button, then you still need to finish it off with presentation and decoration.

[0] https://chrlschn.dev/blog/2024/11/on-bakers-ovens-and-ai-sta...

by jumploopson 6/30/2025, 10:16 PM

To anyone who has worked with LLMs extensively, this is obvious.

Single prompts can only get you so far (surprisingly far actually, but then they fall over quickly).

This is actually the reason I built my own chat client (~2 years ago), because I wanted to “fork” and “prune” the context easily; using the hosted interfaces was too opaque.

In the age of (working) tool-use, this starts to resemble agents calling sub-agents, partially to better abstract, but mostly to avoid context pollution.

by 8organicbitson 6/30/2025, 9:51 PM

One thought experiment I was musing on recently was the minimal context required to define a task (to an LLM, human, or otherwise). In software, there's a whole discipline of human centered design that aims to uncover the nuance of a task. I've worked with some great designers, and they are incredibly valuable to software development. They develop journey maps, user stories, collect requirements, and produce a wealth of design docs. I don't think you can successfully build large projects without that context.

I've seen lots of AI demos that prompt "build me a TODO app", pretend that is sufficient context, and then claim that the output matches their needs. Without proper context, you can't tell if the output is correct.

by jcon321on 6/30/2025, 10:18 PM

I thought this entire premise was obvious? Does it really take an article and a venn diagram to say you should only provide the relevant content to your LLM when asking a question?

by zacharyvoaseon 6/30/2025, 10:26 PM

I love how we have such a poor model of how LLMs work (or more aptly don't work) that we are developing an entire alchemical practice around them. Definitely seems healthy for the industry and the species.

by slavapestovon 6/30/2025, 10:12 PM

I feel like if the first link in your post is a tweet from a tech CEO the rest is unlikely to be insightful.

by aaronlinoopson 7/1/2025, 4:02 AM

As models become more powerful, the ability to communicate effectively with them becomes increasingly important, which is why maintaining context is crucial for better utilizing the model's capabilities.

by liampulleson 6/30/2025, 10:18 PM

The only engineering going on here is Job Engineering™

by rednafion 6/30/2025, 10:05 PM

I really don’t get this rush to invent neologisms to describe every single behavioral artifact of LLMs. Maybe it’s just a yearning to be known as the father of Deez Unseen Mind-blowing Behaviors (DUMB).

LLM farts — Stochastic Wind Release.

The latest one is yet another attempt to make prompting sound like some kind of profound skill, when it’s really not that different from just knowing how to use search effectively.

Also, “context” is such an overloaded term at this point that you might as well just call it “doing stuff” — and you’d objectively be more descriptive.

by b0a04glon 7/1/2025, 3:13 AM

https://blog.langchain.com/the-rise-of-context-engineering/?...

I feel op' s blog is more of duplicate of the above langchain's blog happened a week ago.

by semiinfinitelyon 6/30/2025, 9:57 PM

context engineering is just a phrase that karpathy uttered for the first time 6 days ago and now everyone is treating it like its a new field of science and engineering

by jshortyon 6/30/2025, 9:41 PM

I have felt somewhat frustrated with what I perceive as a broad tendency to malign "prompt engineering" as an antiquated approach for whatever new the industry technique is with regards to building a request body for a model API. Whether that's RAG years ago, nuance in a model request's schema beyond simple text (tool calls, structured outputs, etc), or concepts of agentic knowledge and memory more recently.

While models were less powerful a couple of years ago, there was nothing stopping you at that time from taking a highly dynamic approach to what you asked of them as a "prompt engineer"; you were just more vulnerable to indeterminism in the contract with the models at each step.

Context windows have grown larger; you can fit more in now, push out the need for fine-tuning, and get more ambitious with what you dump in to help guide the LLM. But I'm not immediately sure what skill requirements fundamentally change here. You just have more resources at your disposal, and can care less about counting tokens.

by eddythompson80on 6/30/2025, 9:35 PM

Which is funny because everyone is already looking at AI as: I have 30 TB of shit that is basically "my company". Can I dump that into your AI and have another, magical, all-konwning, co-worker?

by mgdevon 6/30/2025, 10:17 PM

If we zoom out far enough, and start to put more and more under the execution umbrella of AI, what we're actually describing here is... product development.

You are constructing the set of context, policies, directed attention toward some intentional end, same as it ever was. The difference is you need fewer meat bags to do it, even as your projects get larger and larger.

To me this is wholly encouraging.

Some projects will remain outside what models are capable of, and your role as a human will be to stitch many smaller projects together into the whole. As models grow more capable, that stitching will still happen - just as larger levels.

But as long as humans have imagination, there will always be a role for the human in the process: as the orchestrator of will, and ultimate fitness function for his own creations.

by _pdp_on 6/30/2025, 9:49 PM

It is wrong. The new/old skill is reverse engineering.

If the majority of the code is generated by AI, you'll still need people with technical expertise to make sense of it.

by bGl2YW5jon 6/30/2025, 9:25 PM

Saw this the other day and it made me think that too much effort and credence is being given to this idea of crafting the perfect environment for LLMs to thrive in. Which to me, is contrary to how powerful AI systems should function. We shouldn’t need to hold its hand so much.

Obviously we’ve got to tame the version of LLMs we’ve got now, and this kind of thinking is a step in the right direction. What I take issue with is the way this thinking is couched as a revolutionary silver bullet.

by lawlessoneon 6/30/2025, 9:55 PM

I look forward to 5 million LinkedIn posts repeating this

by hintymadon 6/30/2025, 11:47 PM

> The New Skill in AI Is Not Prompting, It's Context Engineering

Sounds like good managers and leaders now have an edge. Per Patty McCord of Netflix fame used to say: All that a manager does is setting the context.

by emporason 6/30/2025, 10:39 PM

Prompting sits on the back seat, while context is the driving factor. 100% agree with this.

For programming I don't use any prompts. I give a problem solved already, as a context or example, and I ask it to implement something similar. One sentence or two, and that's it.

Other kind of tasks, like writing, I use prompts, but even then, context and examples are still the driving factor.

In my opinion, we are in an interesting point in history, in which now individuals will need their own personal database. Like companies the last 50 years, which had their own database records of customers, products, prices and so on, now an individual will operate using personal contextual information, saved over a long period of time in wikis or Sqlite rows.

by colgandevon 6/30/2025, 10:06 PM

I've been finding a ton of success lately with speech to text as the user prompt, and then using https://continue.dev in VSCode, or Aider, to supply context from files from my projects and having those tools run the inference.

I'm trying to figure out how to build a "Context Management System" (as compared to a Content Management System) for all of my prompts. I completely agree with the premise of this article, if you aren't managing your context, you are losing all of the context you create every time you create a new conversation. I want to collect all of the reusable blocks from every conversation I have, as well as from my research and reading around the internet. Something like a mashup of Obsidian with some custom Python scripts.

The ideal inner loop I'm envisioning is to create a "Project" document that uses Jinja templating to allow transclusion of a bunch of other context objects like code files, documentation, articles, and then also my own other prompt fragments, and then to compose them in a master document that I can "compile" into a "superprompt" that has the precise context that I want for every prompt.

Since with the chat interfaces they are always already just sending the entire previous conversation message history anyway, I don't even really want to use a chat style interface as much as just "one shotting" the next step in development.

It's almost a turn based game: I'll fiddle with the code and the prompts, and then run "end turn" and now it is the llm's turn. On the llm's turn, it compiles the prompt and runs inference and outputs the changes. With Aider it can actually apply those changes itself. I'll then review the code using diffs and make changes and then that's a full turn of the game of AI-assisted code.

I love that I can just brain dump into speech to text, and llms don't really care that much about grammar and syntax. I can curate fragments of documentation and specifications for features, and then just kind of rant and rave about what I want for a while, and then paste that into the chat and with my current LLM of choice being Claude, it seems to work really quite well.

My Django work feels like it's been supercharged with just this workflow, and my context management engine isn't even really that polished.

If you aren't getting high quality output from llms, definitely consider how you are supplying context.

by almosthereon 7/1/2025, 2:40 AM

Which is prompt engineering, since you just ask the LLM for a good context for the next prompt.

by labradoron 6/30/2025, 9:55 PM

I’m curious how this applies to systems like ChatGPT, which now have two kinds of memory: user-configurable memory (a list of facts or preferences) and an opaque chat history memory. If context is the core unit of interaction, it seems important to give users more control or at least visibility into both.

I know context engineering is critical for agents, but I wonder if it's also useful for shaping personality and improving overall relatability? I'm curious if anyone else has thought about that.

by grafmaxon 6/30/2025, 9:53 PM

There is no need to develop this ‘skill’. This can all be automated as a preprocessing step before the main request runs. Then you can have agents with infinite context, etc.

by joe5150on 7/1/2025, 12:02 AM

Surely Jim is also using an agent. Jim can't be worth having a quick sync with if he's not using his own agent! So then why are these two agents emailing each other back and forth using bizarre, terse office jargon?

by saejoxon 6/30/2025, 9:43 PM

Claude 3.5 was released 1 year ago. Current LLMs are not much better at coding than it. Sure they are more shiny and well polished, but not much better at all. I think it is time to curb our enthusiasm.

I almost always rewrite AI written functions in my code a few weeks later. Doesn't matter they have more context or better context, they still fail to write code easily understandable by humans.

by geeewhyon 6/30/2025, 10:30 PM

ive beeen experimenting with this for a while, (im sure in a way, most of us did). Would be good to numerate some examples. When it comes to coding, here's a few:

- compile scripts that can grep / compile list of your relevant files as files of interest

- make temp symlinks in relevant repos to each other for documentation generation, pass each documentation collected from respective repos to to enable cross-repo ops to be performed atomically

- build scripts to copy schemas, db ddls, dtos, example records, api specs, contracts (still works better than MCP in most cases)

I found these steps not only help better output but also reduces cost greatly avoiding some "reasoning" hops. I'm sure practice can extend beyond coding.

by patrickhogan1on 6/30/2025, 10:07 PM

OpenAI’s o3 searches the web behind a curtain: you get a few source links and a fuzzy reasoning trace, but never the full chunk of text it actually pulled in. Without that raw context, it’s impossible to audit what really shaped the answer.

by walterfreedomon 6/30/2025, 11:53 PM

I am mostly focusing in this issue during the development of my agent engine (mostly for game npcs). Its really important to manage the context and not bloat the llm with irrelevant stuff for both quality and inference speed. I wrote about it here if anyone is interested: https://walterfreedom.com/post.html?id=ai-context-management

by pwarneron 6/30/2025, 9:29 PM

It's an integration adventure. This is why much AI is failing in the enterprise. MS Copilot is moderately interesting for data in MS Office, but forget about it accessing 90% of your data that's in other systems.

by asciiion 6/30/2025, 11:59 PM

Here I was thinking that part of Prompt Engineering is understanding context and awareness for other yada yada.

by hnthrow90348765on 6/30/2025, 9:58 PM

Cool, but wait another year or two and context engineering will be obsolete as well. It still feels like tinkering with the machine, which is what AI is (supposed to be) moving us away from.

by dborehamon 7/1/2025, 2:28 AM

The dudes who ran the Oracle of Delphi must have had this problem too.

by bag_boyon 6/30/2025, 11:01 PM

Anecdotally, I’ve found that chatting with Claude about a subject for a bit — coming to an understanding together, then tasking it — produces much better results than starting with an immediate ask.

I’ll usually spend a few minutes going back and forth before making a request.

For some reason, it just feels like this doesn't work as well with ChatGPT or Gemini. It might be my overuse of o3? The latency can wreck the vibe of a conversation.

by stillpointlabon 6/30/2025, 11:11 PM

I've been using the term context engineering for a few months now, I am very happy to see this gain traction.

This new stillpointlab hacker news account is based on the company name I chose to pursue my Context as a Service idea. My belief is that context is going to be the key differentiator in the future. The shortest description I can give to explain Context as a Service (CaaS) is "ETL for AI".

by alganeton 6/30/2025, 9:58 PM

If I need to do all this work (gather data, organize it, prepare it, etc), there are other AI solutions I might decide to use instead of an LLM.

by whimsicalismon 6/30/2025, 9:35 PM

i think context engineering as described is somewhat a subset of ‘environment engineering.’ the gold-standard is when an outcome reached with tools can be verified as correct and hillclimbed with RL. most of the engineering effort is from building the environment and verifier while the nuts and bolts of grpo/ppo training and open-weight tool-using models are commodities.

by adhamsalamaon 6/30/2025, 9:50 PM

There is no engineering involved in using AI. It's insulting to call begging an LLM "engineering".

by ModernMechon 6/30/2025, 9:32 PM

"Wow, AI will replace programming languages by allowing us to code in natural language!"

"Actually, you need to engineer the prompt to be very precise about what you want to AI to do."

"Actually, you also need to add in a bunch of "context" so it can disambiguate your intent."

"Actually English isn't a good way to express intent and requirements, so we have introduced protocols to structure your prompt, and various keywords to bring attention to specific phrases."

"Actually, these meta languages could use some more features and syntax so that we can better express intent and requirements without ambiguity."

"Actually... wait we just reinvented the idea of a programming language."

by bradheon 6/30/2025, 10:02 PM

Back in my day we just called this "knowing what to google" but alright, guys.

by retinaroson 6/30/2025, 10:15 PM

it is still sending a string of chars and hoping the model outputs something relevant. let’s not do like finance and permanently obfuscate really simple stuff to make us bigger than we are.

prompt engineering/context engineering : stringbuilder

Retrieval augmented generation: search+ adding strings to main string

test time compute: running multiple generation and choosing the best

agents: for loop and some ifs

by drmathon 6/30/2025, 10:59 PM

Isn't "context" just another word for "prompt?" Techniques have become more complex, but they're still just techniques for assembling the token sequences we feed to the transformer.

by davidclarkon 6/30/2025, 9:46 PM

Good example of why I have been totally ignoring people who beat the drum of needing to develop the skills of interacting with models. “Learn to prompt” is already dead? Of course, the true believers will just call this an evolution of prompting or some such goalpost moving.

Personally, my goalpost still hasn’t moved: I’ll invest in using AI when we are past this grand debate about its usefulness. The utility of a calculator is self-evident. The utility of an LLM requires 30k words of explanation and nuanced caveats. I just can’t even be bothered to read the sales pitch anymore.

by ameliuson 6/30/2025, 10:26 PM

Yes, and it is a soft skill.

by jongjongon 6/30/2025, 10:06 PM

Recently I started work on a new project and I 'vibe coded' a test case for a complex OAuth token expiry bug entirely with AI (with Cursor), complete with mocks and stubs... And it was on someone else's project. I had no prior familiarity with the code.

That's when I understood that vibe coding is real and context is the biggest hurdle.

That said, most of the context could not be pulled from the codebase directly but came from me after asking the AI to check/confirm certain things that I suspected could be the problem.

I think vibe coding can be very powerful in the hands of a senior developer because if you're the kind of person who can clearly explain their intuitions with words, it's exactly the missing piece that the AI needs to solve the problem... And you still need to do code review aspect which is also something which senior devs are generally good at. Sometimes it makes mistakes/incorrect assumptions.

I'm feeling positive about LLMs. I was always complaining about other people's ugly code before... I HATE over-modularized, poorly abstracted code where I have to jump across 5+ different files to figure out what a function is doing; with AI, I can just ask it to read all the relevant code across all the files and tell me WTF the spaghetti is doing... Then it generates new code which 'follows' existing 'conventions' (same level of mess). The AI basically automates the most horrible aspect of the work; making sense of the complexity and churning out more complexity that works. I love it.

That said, in the long run, to build sustainable projects, I think it will require following good coding conventions and minimal 'low code' coding... Because the codebase could explode in complexity if not used carefully. Code quality can only drop as the project grows. Poor abstractions tend to stick around and have negative flow-on effects which impact just about everything.

by m3kw9on 6/30/2025, 10:01 PM

Well, it’s still a prompt

by neilvon 6/30/2025, 11:09 PM

> Then you can generate a response.

> > Hey Jim! Tomorrow’s packed on my end, back-to-back all day. Thursday AM free if that works for you? Sent an invite, lmk if it works.

Feel free to send generated AI responses like this if you are a sociopath.

by la64710on 6/30/2025, 9:59 PM

Of course the best prompts automatically included providing the best (not necessarily most) context to extract the right output.

by rvzon 6/30/2025, 10:11 PM

This is just another "rebranding" of the failed "prompt engineering" trend to promote another borderline pseudo-scientific trend to attact more VC money to fund a new pyramid scheme.

Assuming that this will be using the totally flawed MCP protocol, I can only see more cases of data exfiltration attacks on these AI systems just like before [0] [1].

Prompt injection + Data exfiltration is the new social engineering in AI Agents.

[0] https://embracethered.com/blog/posts/2025/security-advisory-...

[1] https://www.bleepingcomputer.com/news/security/zero-click-ai...

by intellectronicaon 6/30/2025, 9:37 PM

See also: https://ai.intellectronica.net/context-engineering for an overview.

by LASRon 7/1/2025, 12:21 AM

Honestly, GPT-4o is all we ever needed to build a complete human-like reasoning system.

I am leading a small team working on a couple of “hard” problems to put the limits of LLMs to the test.

One is an options trader. Not algo / HFT, but simply doing due diligence, monitoring the news and making safe long-term bets.

Another is an online research and purchasing experience for residential real-estate.

Both these tasks, we’ve realized, you don’t even need a reasoning model. In fact, reasoning models are harder to get consistent results from.

What you need is a knowledge base infrastructure and pub-sub for updates. Amortize the learned knowledge across users and you have collaborative self-learning system that exhibits intelligence beyond any one particular user and is agnostic to the level of prompting skills they have.

Stay tuned for a limited alpha in this space. And DM if you’re interested.

The new skill in AI is not prompting, it's context engineering