The advisor tool now supports a max_tokens parameter to cap the advisor model's
The advisor tool now supports a max_tokens parameter to cap the advisor model's output per call, reducing latency and output token cost for workloads that don't need full-length advisor responses. Set tools[].max_tokens on the advisor tool definition; see Capping advisor output.
출처: Claude Platform 릴리스 노트 (원문 보기)