When You Push for 3x
I was at a lunch table with my boss and most of my team, telling the story of how we'd doubled our velocity. I was proud of the number. I told it like it was a win. My team was sitting right there, listening to me describe what they'd done to make that number. They knew. I didn't yet.
That's the part I don't tell in the short version. Not just that velocity got gamed, but that I was the one carrying the number into rooms and setting it on the table like a trophy. Every time I did that, I taught my team something about what I was rewarding.
Death by a thousand paper cuts
Velocity doesn't collapse in one visible moment. There's no incident report. No postmortem. It erodes the way a codebase quietly deteriorates when nobody's watching the right signals.
Tickets start getting larger. Not more complex ... larger. A thing that should be a 2 becomes a 5. A 5 becomes an 8. The complexity hasn't changed. The estimate has. Engineers aren't doing this because they're lazy or dishonest. They're doing it because they are not dumb. They see patterns faster than most leaders give them credit for, and the pattern here was unmistakable.
They despise it too. Nobody wants to spend their career gaming a metric. But the incentive structure demanded it, and I built the incentive structure.
When I asked why velocity dipped in a way that had only one acceptable answer, the team learned exactly what I was rewarding. I wasn't asking a genuine question. I was expressing an expectation.
The team skipped tests. They took shortcuts. Debt compounded sprint by sprint, quietly, in the parts of the codebase that the velocity chart couldn't see. Leadership was happy. I was happy. My team was miserable, doing work they didn't believe in, and watching me celebrate it.
The metric was broken before I started pushing
Velocity was never designed to do what I was asking it to do.
Story points replaced hour estimates because humans are terrible at estimating hours. A task that feels like a three-hour job on a Tuesday morning after strong coffee can turn into a twelve-hour slog on a Friday when the system you're touching has dependencies nobody documented. Hours failed as a productivity measure under that variability.
So we moved to story points. Relative complexity. A 3 means "this is about as complex as the last thing we called a 3." It's a tool for team-level planning within a sprint, not a performance indicator for executive reporting.
The moment velocity gets shared upstairs, that distinction disappears. Leadership doesn't see a planning tool. They see a number that went up or didn't. And teams, watching a number get carried into executive meetings, stop thinking about relative complexity and start thinking about the number. Gaming is not a character flaw. It's a rational response to a broken measurement system.
Every time leadership tries to bring velocity into a planning conversation with me now, I push back. Velocity is the wrong number for that room. A 3-point ticket can take two hours on one day and thirty-two hours the next. Complexity and time are two different things. The number sounds precise. It isn't.
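A toy simulation makes the point concrete. The numbers below are invented for illustration (the two-to-thirty-two-hour range comes from the example above; everything else is assumed): two sprints can show the identical velocity while the actual effort behind them differs enormously.

```python
import random

random.seed(42)

# Toy model: every ticket is "a 3", but real hours vary wildly.
# Range borrowed from the example above: two hours on a good day,
# thirty-two on a bad one.
def hours_for_three_pointer():
    return random.uniform(2, 32)

# Two sprints with the same plan: five 3-point tickets each.
sprint_a = [hours_for_three_pointer() for _ in range(5)]
sprint_b = [hours_for_three_pointer() for _ in range(5)]

velocity_a = velocity_b = 5 * 3   # the chart shows the same number
effort_a, effort_b = sum(sprint_a), sum(sprint_b)

print(f"velocity: {velocity_a} vs {velocity_b}")
print(f"actual hours: {effort_a:.0f} vs {effort_b:.0f}")
```

Run it a few times with different seeds and the velocity line never moves while the hours swing by a factor of two or more. That gap is exactly what disappears when the number goes upstairs.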
When a planning tool becomes a performance proxy, you haven't solved the measurement problem. You've just moved it.
The same pattern is running on AI right now
GitHub's research on AI-assisted development is the new velocity chart. Their 2022 study showed developers using Copilot completed a controlled coding task 55% faster than those without it. The number is real. Leaders are picking it up, carrying it into planning meetings, and setting it on the table like a trophy.
What the study measured was isolated task completion speed. Developers were given a well-defined problem in a contained environment and watched. What it could not measure was what happens after the code ships ... whether the output integrates cleanly into a real system, whether it creates dependencies the next engineer has to untangle, or whether the shortcuts AI enables are compounding into a maintenance problem that won't surface on any dashboard for another eighteen months.
The pattern is identical. A number that captures something real but incomplete gets elevated into a mandate. The teams receiving that mandate are not dumb. They see what's being rewarded. They optimize accordingly.
When you push for 3x productivity without defining what quality looks like alongside it, you get 3x of whatever path of least resistance produces. That might be 3x the output. It might be 3x the debt. The velocity chart won't distinguish between them. Neither will the AI coding metrics.
The quality thresholds have to come first, before the productivity push. Not as an afterthought, not as a hedge, but as the definition of what success looks like. What does "done" mean? What does "good" mean? What signals tell you that speed is compounding into something durable versus something that will have to be rebuilt in two years?
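One way to make those answers unglamorous but enforceable is to write the definition of "done" down as a checkable gate. This is a minimal sketch, not a prescription; the field names and thresholds are hypothetical placeholders a team would replace with its own.

```python
# Hypothetical quality gate: "done" defined before the productivity push.
# Every field name and threshold here is illustrative, not a standard.
def is_done(change: dict) -> bool:
    gates = [
        change["tests_added"],             # new behavior has tests
        change["review_approved"],         # a human actually read it
        change["coverage_pct"] >= 80,      # coverage held its floor
        change["open_regressions"] == 0,   # nothing it broke is still open
    ]
    return all(gates)

# Fast output that isn't "done": shipped without tests, coverage slipping.
shipped_fast = {"tests_added": False, "review_approved": True,
                "coverage_pct": 71, "open_regressions": 0}
print(is_done(shipped_fast))  # False
```

The value isn't the code; it's that the thresholds exist before anyone asks for 3x, so "faster" and "done" can't quietly drift apart.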
Those questions don't have glamorous answers. They don't generate the kind of headline GitHub's research does. But they're the difference between 3x productivity and 3x of something nobody wanted.
The dashboard won't tell you. The engineers know.
What I changed
I stopped using velocity as an external number. It lives inside the team, where it belongs, as a planning tool. It doesn't go upstairs.
What goes upstairs instead is outcome evidence. What shipped. What it does. What it unblocked. The kind of answer that requires someone to understand what the work was, not just whether the number went up.
That change made some conversations harder. Leadership wants a number they can track across quarters, and "here's what we built and why it matters" is not as clean as a chart with an arrow pointing right. But a chart with an arrow pointing right was how I ended up at a lunch table bragging about a number my team had learned to inflate while I called it a win.
The leaders pushing hardest for AI productivity gains right now are running the same pattern. The number is real. The study is real. What's missing is the conversation that happens before the mandate ... what does quality look like here, what are we measuring when we measure "done." Skipping it is how you end up with 3x of something you didn't want, and engineers who could see it coming the whole time.
The dashboard won't show you that part. Ask your engineers.
One email a week from The Builder's Leader. The frameworks, the blind spots, and the conversations most leaders avoid. Subscribe for free.