Task Breadth — Claude Code Maturity Score Dimension | AI Native Builder

Weight: 10% of overall score · How the overall score is calculated

Definition

Task Breadth measures how wide a range of task types you delegate to Claude. An engineer who uses Claude exclusively for one type of work — say, debugging — is not a low-maturity user, but they are leaving significant value on the table. Full-stack AI collaboration requires trusting Claude across multiple domains, including ones outside your primary expertise.

This dimension rewards coverage, not volume.

How it's measured

Each session is classified into one of nine task types:

debugging
feature_implementation
refactoring
code_understanding
design_planning
data_science
front_end
papercut_fix
other

The other category is excluded from the breadth count. The score is based on the number of distinct task types present across all your sessions (excluding other), scaled linearly so that all eight counted types yield the maximum score of 10:

score = min(10, (distinct_types / 8) × 10)

Distinct types seen	Score
8	10
6–7	7.5–8.75
4–5	5–6.25
2–3	2.5–3.75
1	1.25
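As a minimal sketch, the formula and table above can be reproduced in a few lines of Python. The function and variable names are hypothetical, not part of the product. Treating unrecognised labels the same way as other is also an assumption.

```python
# Hypothetical helper; only the task-type labels and the formula come from the text above.
COUNTED_TYPES = {
    "debugging", "feature_implementation", "refactoring", "code_understanding",
    "design_planning", "data_science", "front_end", "papercut_fix",
}

def breadth_score(session_types):
    """Return the Task Breadth score for an iterable of per-session task-type labels."""
    # Keep only counted types: "other" (and, by assumption, unknown labels) is dropped.
    distinct = {t for t in session_types if t in COUNTED_TYPES}
    return min(10.0, len(distinct) / 8 * 10)

sessions = ["debugging", "debugging", "refactoring", "other", "front_end"]
print(breadth_score(sessions))  # 3 distinct counted types -> 3.75
```

Repeated sessions of the same type do not change the result, which is what "coverage, not volume" means in practice.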

What high vs low looks like

High (score 7–10)

Using Claude for debugging one day, front-end implementation the next, data analysis the next
Reaching outside your primary domain — a backend engineer using Claude to handle React work; a researcher using it for data science pipelines
Sessions span at least 6 distinct task types (the minimum needed to score above 7)

Low (score 1–3)

All sessions are the same type — only debugging, or only code understanding
Never delegating tasks outside your area of primary expertise
Using Claude as a narrow productivity tool rather than a domain expander

Behavioural patterns in real sessions

Anthropic's work study found a consistent pattern across teams: engineers who used Claude most broadly reported the largest qualitative changes in how they worked.

The study noted that engineers were becoming "more full-stack" — a front-end engineer taking on data science tasks, a researcher handling deployment scripts — specifically because Claude reduced the cost of working in unfamiliar domains.

Team-specific data illustrates how task type distribution varies by role:

Team	Highlighted task type	Share
Pre-training	Feature building	54.6%
Security	Code understanding	48.9%
Alignment & Safety	Front-end development	7.5%
Non-technical staff	Debugging	51.5%

High breadth does not mean uniform distribution; it means deliberate extension beyond your default. The non-technical staff cohort's 51.5% debugging share reflects a legitimate use pattern. A low breadth score would result only if no other task types were attempted alongside it.
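The difference between distribution and breadth can be illustrated with a short sketch. The session log below is invented purely for illustration, and the helper name is hypothetical. It shows a user whose sessions are half debugging but who still lands in the high band.

```python
from collections import Counter

def type_shares(session_types):
    """Return each task type's share of all sessions as a fraction of 1."""
    counts = Counter(session_types)
    total = sum(counts.values())
    return {task_type: n / total for task_type, n in counts.items()}

# Hypothetical log of 20 sessions, heavily skewed towards debugging.
sessions = (
    ["debugging"] * 10
    + ["code_understanding"] * 4
    + ["refactoring"] * 3
    + ["front_end", "data_science", "design_planning"]
)

shares = type_shares(sessions)
distinct = len(set(sessions) - {"other"})  # "other" never counts towards breadth
score = min(10.0, distinct / 8 * 10)

print(shares["debugging"])  # 0.5
print(distinct, score)      # 6 7.5
```

A 50% debugging share and a breadth score of 7.5 coexist here because the score depends only on which types appear, not on how often.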

The most commonly used task types across the full cohort were debugging (55% daily use), code understanding (42%), and new features (37%). Engineers who added data science, front-end, and design planning to this base showed the highest breadth scores.

How it affects your overall score

Task Breadth carries 10% of your total score.

A one-point improvement in this dimension adds 0.10 points to your overall score.
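The arithmetic can be sketched as follows, assuming the overall score is a plain weighted sum of dimension scores. Only the 10% weight and the breadth formula come from this page. The function names are hypothetical.

```python
BREADTH_WEIGHT = 0.10  # Task Breadth's share of the overall score, as stated above

def breadth_from_count(distinct_types):
    """Breadth score for a given number of distinct counted task types."""
    return min(10.0, distinct_types / 8 * 10)

def overall_gain(current_types, added_types):
    """Change in overall score from adding new task types (assumes a weighted sum)."""
    before = breadth_from_count(current_types)
    after = breadth_from_count(current_types + added_types)
    return BREADTH_WEIGHT * (after - before)

print(round(overall_gain(3, 1), 3))  # one new type: 1.25 breadth points -> 0.125 overall
print(round(overall_gain(8, 1), 3))  # already at the cap -> 0.0
```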

This is the most straightforward dimension to improve deliberately: trying Claude on one new task type in your next session adds 1.25 points to this dimension (0.125 to your overall score) until the cap is reached. It is also the fastest to plateau: once you reach 6–7 distinct types, only one or two more remain before the score caps at 10.

It interacts with Complexity Progression (broader task types expose you to higher-complexity work) and New Work Generation (unfamiliar domain tasks are more likely to generate net-new work).
