Engineering organizations have spent the last several years investing in developer productivity frameworks. DX Core 4, DORA, SPACE, DevEx. The names change; the premise does not. Measure how developers work, improve the conditions around them, and performance will follow.
It sounds reasonable. It’s also wrong.
Not wrong in the sense that these frameworks are useless. They aren’t. They’re wrong in the sense that they’re being asked to answer a question they were never designed to answer: who is performing, who is not, and what should we do about it?
The Category Error
Productivity describes how work gets done. Speed, efficiency, flow state, satisfaction. Performance describes whether the right work was delivered. These aren’t the same thing, and treating them as interchangeable is a category error that costs organizations real clarity.
A team can score well on every productivity metric available and still deliver the wrong thing, late, to the wrong standard. A developer can report high satisfaction and deep flow while producing zero business value. Productivity frameworks can’t catch this, because they aren’t looking for it.
Organizations care about productivity only insofar as it drives performance. Give them a tool that measures performance directly and they’ll choose it every time. The reason they bought productivity frameworks isn’t that productivity was the goal. It’s that nothing better was on the table.
The Fatal Flaw
DX Core 4, the most sophisticated of the current frameworks, explicitly refuses to measure at the individual level. The justifications are real: gaming risk, attribution complexity in collaborative work, privacy regulations in some jurisdictions. These aren’t imaginary concerns.
But the result is a framework that is structurally misaligned with how organizations operate. Compensation, bonuses, promotions, performance reviews, PIPs. Every one of these traces back to a named individual. A measurement system that deliberately abstracts away the individual can’t connect to any of the mechanisms that organizations use to manage and reward their people.
This creates a specific, predictable failure mode. When a framework can’t identify individuals, it’s forced to treat every problem as systemic. A team is slow? Must be a tooling issue, a process issue, an environment issue. Maybe it is. Or maybe one person isn’t delivering, and the framework has no way to surface that. The team carries the weight. The dashboards stay green. The problem persists.
You can’t fix a system while ignoring its components. Individuals are nodes in the system. Pretending they aren’t doesn’t make the system view more accurate. It makes it incomplete.
Complexity Without Actionability
DX Core 4 is elaborate, well-researched, and visually compelling. It organizes developer experience into dimensions, produces polished dashboards, and generates reports that look like insight. But looking like insight and being insight are different things.
These frameworks can point in the general direction of a problem. “Developer satisfaction is down in this area.” “Deployment frequency has dropped.” Fine. But they can’t tell you who’s responsible, what commitment was missed, or what specific action will fix it. They diagnose symptoms at the team level. They don’t pinpoint causes at the individual level. Polished dashboards aren’t a substitute for accountability.
What Performance Measurement Requires
Performance measurement requires what productivity frameworks deliberately avoid: named ownership, auditable agreements, and objective records of what was committed and whether it was delivered.
This is the gap that Collaborate by Contract fills. CBC structures work around explicit agreements. Every agreement has named stakeholders, defined deliverables, measurable outcomes, and clear accountability. Performance evaluation stops being a subjective debate and becomes a verifiable record.
The contractor analogy makes this concrete. If you hire someone to renovate your kitchen, you don’t measure their performance by how many nails they hammered, how satisfied they felt during the project, or how smooth their workflow appeared. You measure it by whether they delivered what was agreed, at the agreed quality, within the agreed terms. The same standard should apply inside organizations.
CBC-aligned metrics like agreement completion rate, cycle time from commitment to delivery, and rework rate are individual-attributable, auditable, and outcome-focused. They answer the question that matters: did this person deliver what they committed to?
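To make those three metrics less abstract, here is a minimal sketch of how they might be computed from a plain list of agreement records. Everything in it is an assumption for illustration, not part of any published CBC specification: the `Agreement` fields, the `summarize` function, the sample names, and the exact definitions (completion rate as the share of due agreements delivered on or before their due date, cycle time as days from commitment to delivery, rework rate as the share of delivered agreements that had to be redone).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Agreement:
    # Hypothetical record shape; field names are illustrative assumptions.
    owner: str                            # named individual accountable for delivery
    deliverable: str
    committed_on: date
    due_on: date
    delivered_on: Optional[date] = None   # None means not delivered yet
    reworked: bool = False                # True if the delivery had to be redone


def summarize(agreements, owner, as_of):
    """Compute the three metrics for one owner over agreements due by `as_of`."""
    due = [a for a in agreements if a.owner == owner and a.due_on <= as_of]
    delivered = [
        a for a in due
        if a.delivered_on is not None and a.delivered_on <= as_of
    ]
    on_time = [a for a in delivered if a.delivered_on <= a.due_on]
    cycle_days = [(a.delivered_on - a.committed_on).days for a in delivered]
    return {
        # Assumed definition: delivered on or before the due date / all due.
        "completion_rate": len(on_time) / len(due) if due else None,
        # Assumed definition: mean days from commitment to delivery.
        "avg_cycle_days": sum(cycle_days) / len(cycle_days) if cycle_days else None,
        # Assumed definition: reworked deliveries / all deliveries.
        "rework_rate": (
            sum(a.reworked for a in delivered) / len(delivered) if delivered else None
        ),
    }


# Made-up sample data, for illustration only.
agreements = [
    Agreement("alice", "Billing API v2", date(2024, 1, 1), date(2024, 1, 15),
              delivered_on=date(2024, 1, 12)),
    Agreement("alice", "Invoice export", date(2024, 1, 10), date(2024, 1, 31),
              delivered_on=date(2024, 2, 5), reworked=True),
    Agreement("bob", "Login redesign", date(2024, 1, 5), date(2024, 1, 20)),
]

alice = summarize(agreements, "alice", as_of=date(2024, 2, 29))
bob = summarize(agreements, "bob", as_of=date(2024, 2, 29))
```

Each number traces back to a named owner and a dated record, which is what makes it auditable. A real system would also have to decide how to handle renegotiated or shared agreements, which this sketch ignores.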
Putting Productivity Frameworks in Their Place
None of this means DX Core 4, DORA, or SPACE should be discarded. They’re useful diagnostic tools for engineering systems. They can identify friction in tooling, flag environmental bottlenecks, and surface patterns that affect team-level execution. They help teams speed up how work gets done, and that matters.
But they don’t define performance. They support it. Confuse the two and you end up with an organization that can tell you its developers are satisfied but can’t tell you whether the right work shipped, who delivered it, or who didn’t.
Productivity frameworks are the wrong tool for the most important job. Not a bad tool. The wrong tool. Use them for what they do well. But for the question that keeps every engineering leader up at night, whether their people are performing, you need a system that’s willing to name names.