Data Agent,还是 Data Skills
Data Agent vs. Data Skills
公司里越来越多人在建自己的 Agent。产品团队有增长 Agent,运营有活动 Agent,算法有推荐 Agent。每个 Agent 都要看数据,大家自然会想到:把数据分析做成一个 skill,所有 Agent 都能调。
我花了一段时间想这个问题,结论是:不行。
不是因为技术上不可行,而是其他原因。
Agent 有自己的”利益”
增长团队建的 Agent,目标函数里写着”帮增长团队拉 DAU”。现在 DAU 下了,你让它分析原因。
它查了数据,发现三个因素都有贡献:新用户留存下降(增长负责)、老用户活跃度下降(运营负责)、推荐点击率下降(算法负责)。
中立的说法是”新用户留存是主因,贡献了 60% 的下降”。但这个 Agent 不会这么讲。它会把重点放在运营和算法身上。数据都对,但强调什么、淡化什么,由立场决定。
这不是幻觉,也不是能力不足。Agent 干的恰恰是它该干的事。它的设计就是为增长团队服务,分析数据时自然会向增长团队倾斜。
你说,那我把数据查询做成标准化的 skill,保证查询本身是中立的。可以。但 skill 返回的原始结果,仍然由宿主 Agent 来解读。选择呈现哪个数字、先说哪个结论、用什么语气描述自己负责的部分,这些都发生在 skill 之后。
Skill 管得了查询,管不了解读。
这是个老问题
“做分析的,不能是被分析的当事人。”
开发不能 approve 自己的 PR,业务不能自审财报。这些规则不是因为不信任个人,是因为结构性利益冲突客观存在。分析结果会直接影响分析者的 OKR 时,任何人(包括 Agent)都会下意识地挑着讲。
Agent 时代让这个问题变得更尖锐。人类分析师至少有职业声誉的约束,有”被发现造假”的后果。Agent 没有这层约束,它只是忠实地执行目标函数。目标函数有偏,输出就有偏。
独立 Data Agent 解决什么
把数据分析能力从各业务 Agent 里抽出来,放到一个独立的 Data Agent 里。这个 Agent:
它的目标函数是”输出准确的分析”,不是”让某个业务好看”。它不为任何一个 OKR 负责,所以没有挑着讲的动机。组织上归数据团队,考核的是准确性和响应速度,不是业务满意度。
各业务 Agent 要数据时,向 Data Agent 发请求,拿到结果。结果怎么用是各家自己的事;分析这一步是不是中立,由 Data Agent 兜底。
类似公司里独立的财务审计。业务团队可以提需求,可以不同意结论、按自己的判断行事,但审计这件事必须由独立方做。
不是二选一
我不是说 data skill 完全不能存在。“昨天 DAU 多少”这种简单查询,通用 Agent 调个 skill 就够了。
但涉及归因分析、跨域对比、口径敏感的任务,必须走 Data Agent。关键指标的定义和计算,只能从 Data Agent 出口。
分级的标准很简单:只要分析结果会影响请求方的 OKR,这个分析就不能让请求方的 Agent 自己做。
Ownership
Data Agent 的 ownership 归谁,比 Data Agent 的技术架构更重要。
如果归业务团队建设和维护,那”独立”只是代码层面的独立。目标函数、迭代优先级、数据口径的定义权,全部会被 ownership 方的利益左右。
Agent 时代的数据分析,难的不是把 SQL 写得更好,是让数据分析在公司里依然可信。
老答案:把裁判和运动员分开。
More and more people across the company are building their own Agents. Product has a growth Agent, ops has a campaign Agent, the algorithm team has a recommendation Agent. Each Agent needs to look at data, so a natural idea comes up: turn data analysis into a skill that every Agent can call.
I spent a while thinking about this. My conclusion: it doesn’t work.
Not because it’s technically infeasible. The reason lies elsewhere.
Every Agent Has a Side
Take the growth team’s Agent. Its objective function says “help the growth team push DAU.” Now DAU is down. You ask it to analyze why.
It pulls the data and finds three contributing factors: new-user retention is down (growth’s responsibility), existing-user activity is down (ops’ responsibility), recommendation click-through is down (algorithm’s responsibility).
A neutral framing would be “new-user retention is the main driver, accounting for 60% of the drop.” But this Agent won’t put it that way. It’ll put the focus on ops and the algorithm team. The data is correct either way. What gets emphasized and what gets played down comes down to where you stand.
This isn’t hallucination, and it isn’t incompetence. The Agent is doing exactly what it was built to do. It was designed to serve the growth team, so naturally its analysis tilts toward the growth team.
You might say: fine, I’ll standardize the query layer as a skill, so the query itself is neutral. Sure. But the raw results the skill returns still get interpreted by the host Agent. Which number to surface, which conclusion to lead with, what tone to use when describing its own responsibility: all of that happens after the skill returns.
A skill controls the query. It can’t control the interpretation.
An Old Problem
“The one doing the analysis can’t be the one being analyzed.”
Engineers don’t approve their own PRs. Business teams don’t audit their own financials. These rules aren’t about distrusting individuals. They exist because structural conflicts of interest are real. When the result of an analysis directly affects the analyst’s own OKR, anyone (Agents included) will unconsciously cherry-pick what to say.
The Agent era makes this sharper. A human analyst at least has professional reputation as a check, and faces consequences if caught fabricating. An Agent has no such check. It just faithfully executes its objective function. If the objective function is biased, the output is biased.
What an Independent Data Agent Solves
Pull data analysis out of every business Agent and put it into a separate Data Agent. This Agent:
Its objective function is “produce accurate analysis,” not “make some business unit look good.” It isn’t responsible for any single OKR, so it has no motive to cherry-pick. Organizationally it sits with the data team, and it’s evaluated on accuracy and response speed, not on how satisfied the business teams are.
When business Agents need data, they send requests to the Data Agent and get results back. What they do with those results is up to them. The Data Agent guarantees that the analysis itself is neutral.
Like an independent financial audit inside a company. Business teams can ask for audits, and they can disagree with the conclusions and act on their own judgment, but the audit itself has to be done by an independent party.
Not an Either-Or
I’m not saying data skills shouldn’t exist. For simple queries like “what was DAU yesterday,” a general-purpose Agent calling a skill is enough.
But anything involving attribution, cross-domain comparison, or definition-sensitive work has to go through the Data Agent. The definition and computation of key metrics can only come out of the Data Agent.
The rule is simple: if the result of an analysis could affect the requester’s OKR, the analysis can’t be done by the requester’s own Agent.
Ownership
Who owns the Data Agent matters more than its technical architecture.
If a business team builds and maintains it, then “independence” only exists in the code. Objective function, iteration priorities, the authority to define data: all of it gets pulled toward the owner’s interests.
In the Agent era, the hard part of data analysis isn’t writing better SQL. It’s keeping data analysis trustworthy across the company.
Old answer: keep referees and players separate.