无主之地2配置高吗|看真人裸体BBBBB|秋草莓丝瓜黄瓜榴莲色多多|真人強奷112分钟|精品一卡2卡3卡四卡新区|日本成人深夜苍井空|八十年代动画片

網(wǎng)易首頁 > 網(wǎng)易號 > 正文 申請入駐

Anthropic 官方指南:怎么給 Agent 設(shè)計工具

0
分享至

BLOG

本文翻譯自 Anthropic 官方博客「Seeing like an agent: how we design tools in Claude Code」,作者 Thariq Shihipar,Claude Code 團(tuán)隊工程師,今天發(fā)布

以下為逐段中英對照翻譯

構(gòu)建 Agent 最難的部分之一:設(shè)計工具

One of the hardest parts about building an agent harness is constructing its tools.

構(gòu)建 Agent harness 最困難的部分之一,是設(shè)計它的工具集

Claude acts completely through tool calling, but there are a number of ways tools can be constructed in the Claude API with primitives like bash, skills and code execution.

Claude 完全通過工具調(diào)用來行動。在 Claude API 中,工具可以用 bash、skills、代碼執(zhí)行等基礎(chǔ)原語來構(gòu)建

So how do you design your agents' tools? Do you give it one general-purpose tool like bash or code execution? Or fifty specialized tools, one for each use case?

那你該怎么給 Agent 設(shè)計工具?給它一個通用工具(比如 bash 或代碼執(zhí)行)就夠了?還是做五十個專用工具,每個場景一個?

To put yourself in the mind of the model, imagine being given a difficult math problem. What tools would you want in order to solve it? It would depend on your own skill set!

要站在模型的角度想這個問題,可以想象你面前有一道很難的數(shù)學(xué)題。你想要什么工具來解決它?答案取決于你自己的能力

Paper would be the minimum, but you'd be limited by manual calculations. A calculator would be better, but you would need to know how to operate the more advanced options. The fastest and most powerful option would be a computer, but you would have to know how to use it to write and execute code.

一張紙是最低配,但你只能手算。計算器好一些,但你得知道怎么用高級功能。最快最強(qiáng)的選擇是電腦,但你得會用它來寫和執(zhí)行代碼

This is a useful framework for designing your agent. You want to give it tools that are shaped to its own abilities. But how do you know what those abilities are? You pay attention, read its outputs, experiment. You learn to see like an agent.

這是一個很有用的設(shè)計框架。你要給 Agent 的工具,應(yīng)該貼合它自身的能力形狀。但你怎么知道它的能力是什么?你觀察它,讀它的輸出,反復(fù)實驗。你學(xué)會「像 Agent 一樣看」

If you're building an agent, you'll face the same questions we did: when to add a tool, when to remove one, and how to tell the difference. Here's how we've answered them while building Claude Code, including where we got it wrong first.

如果你在做 Agent,你會面對和我們一樣的問題:什么時候加工具,什么時候刪工具,怎么區(qū)分這兩種情況。下面是我們在 Claude Code 的實際經(jīng)驗,包括一開始做錯的地方

用 AskUserQuestion 工具改善提問能力


三種方案的光譜:從無結(jié)構(gòu)到過度剛性,AskUserQuestion 工具落在中間

When building the AskUserQuestion tool, our goal was to improve Claude's ability to ask questions (often called elicitation).

設(shè)計 AskUserQuestion 工具時,我們的目標(biāo)是提升 Claude 向用戶提問的能力(通常稱為 elicitation)

While Claude could just ask questions in plain text, we found answering those questions felt like they took an unnecessary amount of time. How could we lower this friction and increase the bandwidth of communication between the user and Claude?

雖然 Claude 可以用純文本提問,但我們發(fā)現(xiàn)回答這些問題的體驗很差,耗時太多。怎么降低這個摩擦,提升用戶和 Claude 之間的溝通帶寬?

第一次嘗試:修改 ExitPlanTool

The first approach we tried was adding a parameter to the ExitPlanTool to have an array of questions alongside the plan. This was the easiest fix to implement, but it confused Claude because we were simultaneously asking for a plan and a set of questions about the plan. What if the user's answers conflicted with what the plan said? Would Claude need to call the ExitPlanTool twice? We knew this tactic wouldn't work, so we went back to the drawing board.

我們第一個方案是給 ExitPlanTool 加一個參數(shù),讓它在輸出計劃的同時輸出一組問題。這是最省事的改法,但它讓 Claude 很困惑:我們同時要求它做計劃和對計劃提問。如果用戶的回答和計劃矛盾怎么辦?Claude 是不是得調(diào)兩次這個工具?我們知道這個方案行不通,于是回到原點

第二次嘗試:改變輸出格式

Next, we tried updating Claude's output instructions to serve a slightly modified markdown format that it could use to ask questions. For example, we could ask it to output a list of bullet point questions with alternatives in brackets. We could then parse and format that question as UI for the user.

接下來,我們嘗試修改 Claude 的輸出指令,讓它用一種特殊的 Markdown 格式來提問。比如用 bullet point 列出問題,每個問題后面用方括號給出選項。然后前端解析這個格式,渲染成 UI

Claude could usually produce this format, but not reliably. It would append extra sentences, drop options, or abandon the structure altogether. Onto the next approach.

Claude 大部分時候能生成這個格式,但不穩(wěn)定。它會在末尾多加一句話,漏掉選項,或者干脆不用這個格式。下一個方案

第三次嘗試:AskUserQuestion 工具


AskUserQuestion 工具的實際界面

Finally, we landed on creating a tool that Claude could call at any point, but it was particularly prompted to do so during plan mode. When the tool triggered we would show a modal to display the questions and block the agent's loop until the user answered.

最終方案是做一個獨立的工具,Claude 可以在任何時候調(diào)用,但在規(guī)劃模式中會被特別引導(dǎo)去使用。工具觸發(fā)后彈出一個模態(tài)框顯示問題,阻塞 Agent 循環(huán)直到用戶回答

This tool allowed us to prompt Claude for a structured output and it helped us ensure that Claude gave the user multiple options. It also gave users ways to compose this functionality, for example calling it in the Agent SDK or using referring to it in skills.

這個工具讓我們能引導(dǎo) Claude 輸出結(jié)構(gòu)化內(nèi)容,確保給用戶多個選項。它也給了用戶組合使用的空間,比如在 Agent SDK 或 Skills 中引用它

Most importantly, Claude seemed to like calling this tool and we found its outputs worked well. After all, even the best designed tool doesn't work if Claude doesn't understand how to call it.

最關(guān)鍵的一點:Claude 喜歡調(diào)用這個工具,輸出質(zhì)量也好。畢竟,再好的工具設(shè)計,如果模型不理解怎么調(diào)用,也是白搭

Is this the final form of elicitation in Claude Code? We doubt it. As Claude gets more capable, the tools that serve it have to evolve too. The next section shows a case where a tool that once helped started getting in the way.

這是 Claude Code 中 elicitation 的最終形態(tài)嗎?大概不是。隨著 Claude 能力提升,服務(wù)它的工具也必須跟著演進(jìn)。下一節(jié)會展示一個曾經(jīng)有用的工具后來開始礙事的案例

跟隨能力迭代:從 Todos 到 Tasks


從 Todos 到 Tasks:單 Agent 線性清單 → 多 Agent 協(xié)作任務(wù)圖

When we first launched Claude Code, we realized that the model needed a todo list to keep it on track. Todos could be written at the start and checked off as the model did work. To do this we gave Claude the TodoWrite tool, which would write or update Todos and display them to the user.

Claude Code 剛上線時,我們發(fā)現(xiàn)模型需要一個待辦清單來保持專注。開工前列好待辦,做完一項勾一項。我們做了 TodoWrite 工具來實現(xiàn)這個功能

But even then, we often saw Claude forgetting what it had to do. To adapt, we inserted system reminders every 5 turns that reminded Claude of its goal.

即便如此,Claude 還是經(jīng)常忘記該干什么。我們于是每隔 5 輪對話就插一條系統(tǒng)提醒

As models improved, they found To-do lists limiting. Being sent reminders of the todo list made Claude think that it had to stick to the list instead of modifying it when it realized it needed to change course. We also saw Opus 4.5 also get much better at using subagents, but how could subagents coordinate on a shared todo list?

隨著模型迭代,Todo 列表開始礙事。系統(tǒng)提醒讓 Claude 覺得必須嚴(yán)格按清單執(zhí)行,不敢中途調(diào)整方向。Opus 4.5 用子 Agent 的能力大幅提升,但多個子 Agent 怎么共享一個 Todo 列表?

Seeing this, we replaced the TodoWrite feature with the Task tool. Whereas todos are focused on keeping the model on track, tasks help agents communicate with each other. Tasks could include dependencies, share updates across subagents and the model could alter and delete them.

看到這些問題,我們把 TodoWrite 替換成了 Task 工具。Todo 的重點是讓模型保持方向,Task 的重點是讓 Agent 之間互相溝通。Task 支持依賴關(guān)系,可以跨子 Agent 共享狀態(tài)更新,模型可以隨時修改和刪除

模型能力提升之后,曾經(jīng)需要的工具可能反過來限制它

As model capabilities increase, the tools that your models once needed might now be constraining them. It's important to constantly revisit previous assumptions on what tools are needed. This is also why it's useful to stick to a small set of models to support that have a fairly similar capabilities profile.

隨著模型能力提升,你的模型曾經(jīng)需要的工具現(xiàn)在可能反過來在限制它。定期回頭審視「這些工具是否還有必要」很重要。這也是為什么建議只支持少量能力相近的模型,這樣工具設(shè)計可以聚焦

設(shè)計搜索界面

The most consequential tools we've built are the ones that let Claude find its own context.

我們做過的最有影響力的工具,是那些讓 Claude 自己尋找上下文的工具

When Claude Code was first released internally, we used RAG: a vector database would pre-index the codebase, and the harness would retrieve relevant snippets and hand them to Claude before each response. While RAG was powerful and fast, it required indexing and setup and could be fragile across a host of different environments. Most importantly, Claude was given this context instead of finding the context itself.

Claude Code 內(nèi)部版本最早用的是 RAG:向量數(shù)據(jù)庫預(yù)先索引代碼庫,每次回復(fù)前自動檢索相關(guān)片段塞給 Claude。RAG 速度快、效果好,但需要預(yù)處理,環(huán)境兼容性脆弱。最根本的問題是:上下文是被塞給 Claude 的,不是 Claude 自己找的

But if Claude could search on the web, why couldn't it also search your codebase? By giving Claude a Grep tool, we could let it search for files and build context itself.

如果 Claude 能搜網(wǎng)頁,為什么不能搜代碼庫?給 Claude 一個 Grep 工具,就能讓它自己搜文件、自己構(gòu)建上下文

As Claude gets smarter, it becomes increasingly good at building its context when given the right tools.

Claude 越聰明,給它合適的工具后它就越擅長自己構(gòu)建上下文

When we introduced Agent Skills, we formalized the idea of progressive disclosure, which allows agents to incrementally discover relevant context through exploration.

Agent Skills 上線后,我們把這個思路正式化為漸進(jìn)式披露(progressive disclosure):讓 Agent 通過探索逐步發(fā)現(xiàn)相關(guān)上下文

Claude could now read skill files and those files could then reference other files that the model could read recursively. In fact, a common use of skills is to add more search capabilities to Claude like giving it instructions on how to use an API or query a database.

Claude 現(xiàn)在可以讀 Skill 文件,Skill 文件可以引用其他文件,模型可以遞歸地發(fā)現(xiàn)和加載上下文。一個常見的 Skill 用法就是給 Claude 增加搜索能力:告訴它怎么調(diào) API、怎么查數(shù)據(jù)庫

Over the course of a year, Claude went from not really being able to build its own context to being able to do nested search across several layers of files to find the exact context it needed.

一年時間,Claude 從幾乎不會自己構(gòu)建上下文,到能在多層文件中嵌套搜索,精確找到需要的信息

Progressive disclosure is now a common technique we use to add new functionality without adding a tool. In the next section, we explain why.

漸進(jìn)式披露現(xiàn)在是我們常用的一種技術(shù):不加工具就能加功能。下一節(jié)解釋具體怎么做

漸進(jìn)式披露:Claude Code Guide 子 Agent

Claude Code currently has ~20 tools, and our team frequently revisits if we need all of them for Claude to be most effective. The bar to add a new tool is high, because this gives the model one more option to think about.

Claude Code 目前有大約 20 個工具,團(tuán)隊經(jīng)常審視是否每個都有必要。加新工具的門檻很高,因為每多一個工具,模型就多一個需要思考的選項

For example, we noticed that Claude did not know enough about how to use Claude Code. If you asked it how to add a MCP or what a slash command did, it would not be able to reply.

比如,我們發(fā)現(xiàn) Claude 不夠了解 Claude Code 自身的功能。你問它怎么加 MCP、某個斜杠命令是什么意思,它答不上來

We could have put all of this information in the system prompt, but given that users rarely asked about this, it would have added context rot and interfered with Claude Code's main job: writing code.

可以把這些信息全塞進(jìn) system prompt,但用戶很少問這類問題,塞進(jìn)去會造成上下文腐蝕,干擾 Claude 的主要工作(寫代碼)

Instead, we tried progressive disclosure: we gave Claude a link to its docs that it could load and search when needed. This worked, but Claude would pull large chunks of documentation into context to find an answer the user could have gotten in one sentence.

我們嘗試漸進(jìn)式披露:給 Claude 一個指向文檔的鏈接,需要時自己去查。能用,但 Claude 會把大段文檔拉進(jìn)上下文,只為回答一個一句話就能搞定的問題

So we built the Claude Code Guide — a subagent Claude calls whenever a user asks about Claude Code itself. The subagent does the doc-searching in its own context, follows detailed instructions on how to search and what to extract, and hands back only the answer. The main agent's context stays clean.

最終我們做了一個Claude Code Guide子 Agent。當(dāng)用戶問 Claude Code 自身的問題時,主 Agent 把請求轉(zhuǎn)給這個子 Agent。子 Agent 在自己的上下文里搜索文檔、提取答案,只把答案傳回來。主 Agent 的上下文保持干凈

While this isn't a perfect solution (Claude can still get confused when you ask it about how to set itself up), we were able to add things to Claude's action space without adding a new tool.

這個方案不完美(Claude 有時候還是會在自身配置問題上犯糊涂),但關(guān)鍵是:不用加新工具,就能擴(kuò)展 Agent 的能力范圍

像 Agent 一樣看,是手藝活

Designing the tools for your models is as much an art as it is a science. It depends heavily on the model you're using, the goal of the agent and the environment it's operating in.

給模型設(shè)計工具,與其說是科學(xué),更接近手藝。它取決于你用的模型、Agent 的目標(biāo)、運行的環(huán)境

Our best advice? Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

我們最好的建議?多實驗,讀你的輸出,試新東西。最重要的是,學(xué)會像 Agent 一樣看

Experiment often, read your outputs, try new things. And most importantly, try to see like an agent.

https://claude.com/blog/seeing-like-an-agent

作者:Thariq Shihipar,Anthropic 工程師,Claude Code 團(tuán)隊

特別聲明:以上內(nèi)容(如有圖片或視頻亦包括在內(nèi))為自媒體平臺“網(wǎng)易號”用戶上傳并發(fā)布,本平臺僅提供信息存儲服務(wù)。

Notice: The content above (including the pictures and videos if any) is uploaded and posted by a user of NetEase Hao, which is a social media platform and only provides information storage services.

相關(guān)推薦
熱點推薦
A股新股王,50天股價狂飆25倍

A股新股王,50天股價狂飆25倍

和訊網(wǎng)
2026-06-12 18:10:14
“好吃到不對勁!”消費者因餅干太好吃而引發(fā)懷疑,配料表完全對不上!當(dāng)?shù)厥斜O(jiān)局介入

“好吃到不對勁!”消費者因餅干太好吃而引發(fā)懷疑,配料表完全對不上!當(dāng)?shù)厥斜O(jiān)局介入

極目新聞
2026-06-12 06:54:56
私生活混亂,從央視主持到勞改犯,如今靠直播打賞討生活

私生活混亂,從央視主持到勞改犯,如今靠直播打賞討生活

素衣讀史
2026-06-11 21:56:30
馬斯克正式成為人類首個10000億美元富豪,還帶動約400名員工成為億萬富翁;這些錢每小時花100萬美元、24小時不停,需要超114年才能花完

馬斯克正式成為人類首個10000億美元富豪,還帶動約400名員工成為億萬富翁;這些錢每小時花100萬美元、24小時不停,需要超114年才能花完

極目新聞
2026-06-12 22:28:06
韓國2-1逆轉(zhuǎn)出線在望,女球迷又火了,身材顏值都在線,笑容很甜

韓國2-1逆轉(zhuǎn)出線在望,女球迷又火了,身材顏值都在線,笑容很甜

球盲百小易
2026-06-12 19:28:00
老人入住精神病院7年后查出梅毒;哈爾濱精神專科白漁泡醫(yī)院稱系舊疾,家屬出示入院前梅毒陰性檢測報告反駁

老人入住精神病院7年后查出梅毒;哈爾濱精神專科白漁泡醫(yī)院稱系舊疾,家屬出示入院前梅毒陰性檢測報告反駁

大風(fēng)新聞
2026-06-12 12:12:20
英國爆發(fā)大騷亂:四天燎原、全境失控!

英國爆發(fā)大騷亂:四天燎原、全境失控!

怪味歷史連連看
2026-06-12 14:30:03
重磅:烏克蘭摧毀俄羅斯最大的下卡姆斯克油氣廠!

重磅:烏克蘭摧毀俄羅斯最大的下卡姆斯克油氣廠!

項鵬飛
2026-06-12 18:54:51
內(nèi)塔尼亞胡:特朗普不打伊朗了,沒提前告訴我

內(nèi)塔尼亞胡:特朗普不打伊朗了,沒提前告訴我

政知新媒體
2026-06-12 19:06:11
葡萄牙6-1血洗加拿大,雷戈梅開二度領(lǐng)跑射手榜,決賽對陣突尼斯

葡萄牙6-1血洗加拿大,雷戈梅開二度領(lǐng)跑射手榜,決賽對陣突尼斯

林子說事
2026-06-12 19:37:51
釘釘CEO無招被開除,一切都結(jié)束了

釘釘CEO無招被開除,一切都結(jié)束了

科技頭版Pro
2026-06-12 14:15:22
在荷蘭上班的華人感慨:不要信媒體,荷蘭已經(jīng)相當(dāng)于我國二線城市

在荷蘭上班的華人感慨:不要信媒體,荷蘭已經(jīng)相當(dāng)于我國二線城市

離離言幾許
2026-06-11 00:12:29
魚餌含精神藥品“安定”!日產(chǎn)十噸銷往全國,廠家:魚被麻痹狂咬鉤 利潤率50%

魚餌含精神藥品“安定”!日產(chǎn)十噸銷往全國,廠家:魚被麻痹狂咬鉤 利潤率50%

貓頭鷹視頻
2026-06-12 19:15:43
謝娜再次翻車,這一次,她踢到鐵板了

謝娜再次翻車,這一次,她踢到鐵板了

桌子的生活觀
2026-06-12 11:58:27
金正恩:我們的選擇是正確的

金正恩:我們的選擇是正確的

IN朝鮮
2026-06-12 13:10:39
韓國逆轉(zhuǎn)開門紅創(chuàng)7紀(jì)錄!黃仁范賽后比心硬漢柔情 韓媒:最大功臣

韓國逆轉(zhuǎn)開門紅創(chuàng)7紀(jì)錄!黃仁范賽后比心硬漢柔情 韓媒:最大功臣

顏小白的籃球夢
2026-06-12 12:31:34
臺軍首次在西部海岸,朝中國大陸方向射擊30枚海馬斯火箭彈。

臺軍首次在西部海岸,朝中國大陸方向射擊30枚海馬斯火箭彈。

果媽聊娛樂
2026-06-12 11:56:07
世界杯沒開始,法國隊先贏一局?全員拎著“一套房”出行,把機(jī)場走出高定T臺!

世界杯沒開始,法國隊先贏一局?全員拎著“一套房”出行,把機(jī)場走出高定T臺!

新歐洲
2026-06-12 20:40:21
看世界杯遭持槍搶劫中國男子發(fā)聲:頭被槍抵著,為保命全程配合,大使館迅速介入,現(xiàn)已在機(jī)場準(zhǔn)備回國

看世界杯遭持槍搶劫中國男子發(fā)聲:頭被槍抵著,為保命全程配合,大使館迅速介入,現(xiàn)已在機(jī)場準(zhǔn)備回國

瀟湘晨報
2026-06-12 16:20:20
小鵬GX上市首月銷量,讓我楞了三分鐘

小鵬GX上市首月銷量,讓我楞了三分鐘

ZAKER新聞
2026-06-12 16:36:08
2026-06-13 01:07:00
賽博禪心
賽博禪心
拜AI古佛,修賽博禪心
466文章數(shù) 53關(guān)注度
往期回顧 全部

科技要聞

剛剛,人類歷史上首位萬億美元富豪誕生!

頭條要聞

美加墨世界杯第二場比賽就現(xiàn)空座 英媒:尷尬

頭條要聞

美加墨世界杯第二場比賽就現(xiàn)空座 英媒:尷尬

體育要聞

歐洲恐韓?肉德維德?

娛樂要聞

一天4個瓜,肖戰(zhàn)熱巴最意外

財經(jīng)要聞

萬億美元順差背后,透露這些信號

汽車要聞

標(biāo)配激光雷達(dá)/雙動力可選 昊鉑S600限時售17.99萬起

態(tài)度原創(chuàng)

家居
房產(chǎn)
數(shù)碼
手機(jī)
藝術(shù)

家居要聞

空間微調(diào) 移形換境

房產(chǎn)要聞

海南最賺錢行業(yè)曝光!最快4年半,海口全款買三房!

數(shù)碼要聞

英國監(jiān)管機(jī)構(gòu)警告:亞馬遜、eBay仍在售可能致命的假冒手機(jī)充電器

手機(jī)要聞

vivo X Fold6再預(yù)熱:天璣9500超能版+OriginOS 6 Fold

藝術(shù)要聞

砸了640億,再賠160億!沙特“The Line”項目徹底涼了?

無障礙瀏覽 進(jìn)入關(guān)懷版