Your Org Chart Is Not Your Operating Model
Why healthy software teams keep accountability close to the product, centralize leverage, and design for knowledge resilience
The research behind this piece makes a point that should not be controversial, but somehow still is: assigning work in engineering is not a staffing exercise. It is an operating-model decision.
Every time a manager hands the ugliest migration to the same senior engineer, keeps support safely away from product teams, or leaves incident response as a vague “engineering problem,” they are making a bet. The bet is partly about speed. It is also about who grows, who gets overloaded, how customers get heard, and how many parts of the business become quietly dependent on the same human.
I think most software org-design arguments start in the wrong place. Teams argue about centralized versus decentralized support. They argue about whether “you build it, you run it” is enlightened or cruel. They argue about whether PMs or EMs “really” own delivery. Then production wobbles, an enterprise customer escalates, and the truth appears in plain view: nobody can quite say who owns the outcome, who is allowed to decide trade-offs, or who is supposed to help without taking the work away.
The useful question is not centralized or decentralized. It is this: what should stay close to the product, what should be standardized for leverage, and how much support does a person need for this specific task, in this specific context, right now?
That is a much less ideological question. It is also much more useful.
* * *
Staffing for Speed Usually Breaks the System
The research frames engineering assignment as a balance among three goals: immediate efficiency for the business, growth and challenge for the individual, and long-term durability for the team through broader knowledge distribution. That is exactly the right frame.
Most teams still optimize for the first goal because it feels measurable. There is a deadline. There is a waiting customer. There is one engineer who has seen this weird subsystem before. So the same person gets the ticket, the migration, the escalation, the architecture call, and the after-hours Slack ping. On paper that looks efficient. In practice it is a loan with ugly interest.
You get the feature faster this month. You also get a narrower bench, a slower onboarding path for everyone else, and a team that keeps confusing expertise with exclusivity. You teach the organization that the safest path is always “give it to the person who already knows.” Then you act surprised when nothing scales except that person’s calendar.
Preference is not performance. A team’s preference for the known expert is often just a preference for lower short-term anxiety. Performance is whether the organization can keep shipping, supporting, and learning when that expert is on vacation, on a different priority, or out the door.
This is why the durability point matters so much. Low bus factor is not an abstract cleanliness complaint from architecture enthusiasts. It is an operating risk. If only one or two people can explain a service, debug it under pressure, and change it safely, the service is fragile no matter how many dashboards you bought.
One reason this keeps getting missed is that software teams love the language of ownership and hate the cost of building it. Real ownership requires stable enough priorities for people to learn their domains, supportive enough leadership for people to stretch into new ones, and good enough developer experience that the team is not burning all its time on friction. That is not glamorous. It is just what the research keeps saying.
This does not mean romanticizing rotation. That is another management hobby that goes bad quickly. Stretch work is good. Blind rotation is not. Research on job rotation in software organizations has found real upside—broader knowledge, more variety, stronger collaboration—but also real costs, including extra coordination and slower ramp-up when it is done badly.
Managers need to stop treating “let’s rotate people more” like a moral virtue. The point is resilience and growth, not churn.
So yes, give newer engineers stretch work. Yes, move knowledge across boundaries. But budget the activation energy. Pair first. Shadow on-call. Use scoped migrations. Run design reviews. Rotate the parts of ownership that spread understanding without detonating reliability.
This sounds obvious, but teams miss it constantly. They think they are making a staffing choice. They are actually choosing what kind of system they will have six months later.
* * *
Roles Matter Less Than Decision Rights
The research is strongest when it treats work allocation as a managerial lens. It is thinner when you ask the more dangerous question: who actually owns what when the pressure arrives?
This is where titles become a trap. The wrong debate is “Which title owns this?” The useful question is “Who is accountable for the outcome, who is doing the work, who is supporting it, and who decides trade-offs when things get expensive?”
The boundaries are not mysterious.

- Product managers decide which problems matter, where capacity goes, and how success is measured.
- Engineering managers build the team system: staffing, coaching, feedback, delivery environment, and capacity over time.
- Developers and technical leads carry technical execution and local design judgement.
- Support resolves customer issues, manages response expectations, and escalates well.
- Customer success drives adoption, value realization, and retention over time.
- SRE and platform teams reduce operational toil and cognitive load.
- QA helps shape test strategy and risk-based validation, but quality is still a team outcome.
The important part is not memorizing the boxes. The important part is separating decision rights from collaboration rights.
A PM should not need to line-manage engineers to own prioritization. An EM does not need to be the sole architecture authority to own team health and delivery capacity. A support team should not become the default product manager just because it sees a lot of pain. Customer success should not be used as a renamed support desk. SRE should not become a generic queue for every operational mess product teams would rather not think about.
One person may wear multiple hats in a small company. That is fine. But the hats still need names.
If your incident process begins with “who knows this area?” and your escalation process ends with “can someone pull in Alex?”, the org chart is decorative. What you actually have is an informal routing network disguised as a company.
That is why good operating models obsess over explicit ownership, escalation paths, service boundaries, and discoverability. Not because bureaucracy is fun. Because ambiguity is expensive at exactly the moments when nobody has time for ambiguity.
It is also why multidisciplinary teams keep showing up in serious operating models. When product, engineering, design, delivery, and operational concerns have to coordinate through endless handoffs, the organization learns slowly and blames quickly. Pulling the right capabilities close to the value stream is not fashionable theory. It is a way to reduce queueing, argument, and institutional amnesia.
* * *
The Best Support Model Is Usually a Layered One
I think the centralize-versus-decentralize argument survives mainly because it is emotionally satisfying. It lets everyone keep their favorite ideology.
Builders like autonomy. Operators like consistency. Finance likes shared services. Founders like speed. Everybody can find a slogan.
Real organizations do not run on slogans.
Small teams often make a largely decentralized model work. “You build it, you run it” can be perfectly rational when context is dense, coordination cost is low, and the same people who ship the feature can still answer the customer question without starting a three-team ritual. In that stage, all-hands customer exposure is useful. Support can be lightweight. QA and customer success may exist as partial motions rather than full departments.
Scale changes the math.
Once you have several product teams, more enterprise customers, tighter contracts, more compliance needs, or a larger production surface area, pure autonomy starts producing duplicated tools, fuzzy escalation, and very expensive reinvention. That is when hybrid models show up, and for good reason.
What actually matters is service layering.
Centralize the capabilities that benefit from consistency and leverage: observability defaults, CI/CD primitives, service catalogs, golden paths, testing frameworks, access patterns, incident playbooks, escalation mechanics, and reusable platform abstractions.
Decentralize the responsibilities that need product intimacy and fast feedback: feature trade-offs, roadmap choices, local design decisions, day-to-day operations, and most ownership of live services.
Embed or closely align specialist expertise where the cost of failure is high: SRE for reliability-critical systems, security for regulated workflows, QA for complex validation, support engineering where customer escalations are technical and time-sensitive.
This is also where one of the most common category errors shows up. Support and customer success are not the same thing just because both talk to customers. Reactive troubleshooting and proactive value realization are different motions, different metrics, and usually different staffing profiles. The moment post-sales complexity appears, collapsing them into one vague customer team is usually just a polite way to hide confusion.
That split matters operationally. When a customer says, “Billing is broken,” support should be focused on case resolution, severity, reproducibility, and clear escalation. Customer success should be focused on business impact, adoption risk, and what needs to happen so the customer still gets value from the relationship next quarter. Those are adjacent jobs. They are not identical jobs.
The hidden benefit of separating those motions is cleaner signal. Support sees repeated technical failure. Customer success sees adoption friction and value gaps. Product sees the pattern and decides whether the work belongs in the roadmap. When those signals are blurred together, the organization starts mistaking urgency for importance. You become excellent at handling escalations and mediocre at removing the causes.
Platform teams have their own failure mode. Everyone says they want a platform team until the platform team becomes an internal procurement department. The point of a platform is to reduce cognitive load and duplicated effort for delivery teams. It should behave like a product with internal users, adoption goals, feedback loops, and a healthy fear of becoming a ticket sink.
The pattern I keep coming back to is simple: keep accountability close to the product, centralize leverage, and be explicit about the interfaces. That is not as catchy as a bumper sticker. It is much closer to how durable organizations actually work.
* * *
A Better Default for a Messy Scale-Up
Take a fairly normal scale-up. Eight product squads. A growing enterprise customer base. Two legacy services nobody wants to touch. One billing system that only a principal engineer truly understands. Support tickets that bounce between account managers and engineers. Customer success people dragged into technical debugging because they own the relationship. Incidents that are resolved through a heroic chain of DMs instead of an obvious path.
Nobody thinks this is the design. It is just the sediment of six quarters of urgency.
Here is the redesign I would make.
First, assign explicit service ownership to product teams and publish it somewhere boring and easy to find. Not a wiki graveyard. A living service catalog or internal portal that answers three questions fast: who owns this, how do I escalate it, and where is the runbook.
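A catalog entry does not need to be fancy to be useful. Here is a minimal Python sketch of that idea, with invented service names, channels, and URLs, structured to answer exactly those three questions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceEntry:
    """One row in a minimal service catalog."""
    owner_team: str    # who owns this
    escalation: str    # how do I escalate it
    runbook_url: str   # where is the runbook

# Entries are invented for illustration.
CATALOG = {
    "billing-api": ServiceEntry(
        owner_team="payments-squad",
        escalation="#payments-oncall, then the EM on rota",
        runbook_url="https://internal.example/runbooks/billing-api",
    ),
}

def lookup(service: str) -> ServiceEntry:
    """Answer the three questions fast; fail loudly rather than guess."""
    entry = CATALOG.get(service)
    if entry is None:
        raise KeyError(f"no recorded owner for {service!r}; that gap is the finding")
    return entry
```

The useful property is the loud failure: an unowned service surfaces as an explicit gap to fix, not as a round of "who knows this area?"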
Second, separate reactive and proactive post-sales work. Support handles ticket intake, reproduction, response expectations, and known resolution paths. Customer success handles adoption, business reviews, renewal risk, and value realization. They stay tightly linked, but they stop pretending to be the same function.
Third, add specialist overlays where risk justifies them. If incidents are frequent or blast radius is high, put SRE or production engineering close enough to influence design, not just mop up afterwards. If release friction is high or defects escape late, add QA capability that helps teams improve validation instead of merely catching bugs at the end.
Fourth, invest in platform only where duplication is clearly systemic. If every squad is separately solving environment setup, observability wiring, deployment conventions, or service metadata, that is not autonomy. That is repeated tax. Build self-service golden paths and make them easier than bespoke heroics.
Fifth, map knowledge concentration directly. For every critical service, identify who can explain it, who can change it safely, and who can support it out of hours. If the same names appear everywhere, you do not have senior talent. You have a resilience gap.
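That mapping fits in a spreadsheet, but it is mechanical enough to script. A small sketch, with invented names and an assumed overload threshold:

```python
from collections import Counter

# Invented coverage map: for each critical service, the people who can
# explain it, change it safely, AND support it out of hours.
COVERAGE = {
    "billing": {"alex"},
    "payments": {"alex", "sam"},
    "search": {"sam", "priya", "chen"},
    "auth": {"alex"},
}

def resilience_report(coverage, overload_at=3):
    """Return services with a bus factor of one, plus people who appear
    in the coverage set of `overload_at` or more critical services."""
    fragile = sorted(svc for svc, people in coverage.items() if len(people) <= 1)
    load = Counter(p for people in coverage.values() for p in people)
    hotspots = sorted(p for p, n in load.items() if n >= overload_at)
    return fragile, hotspots
```

Run against the invented data above, it flags billing and auth as one-person services, and one name as infrastructure.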
Sixth, calibrate support by task maturity, not title. A senior engineer new to distributed billing may need close pairing. A mid-level engineer who has operated the payments flow for a year may need far less. This sounds mundane. It is one of the most missed management moves in software.
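The calibration can be sketched as a decision table. The levels and thresholds below are illustrative assumptions, and real calibration is a conversation, not a function, but the shape is the point: support depends on task-specific familiarity and blast radius, never on title alone.

```python
def support_plan(task_familiarity: int, blast_radius: str) -> str:
    """Pick a support mode from task-specific experience (0..3) and risk.
    Thresholds are illustrative defaults, not doctrine."""
    if blast_radius == "high" and task_familiarity <= 1:
        return "pair with an experienced owner"
    if task_familiarity == 0:
        return "shadow first, then take a scoped slice"
    if task_familiarity == 1:
        return "work solo with a scheduled design review"
    return "work solo with async code review"
```

Note that seniority never appears as an input: the senior engineer new to distributed billing lands in the same pairing branch as anyone else would.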
Seventh, review the model with a balanced scorecard. Not just delivery speed. Look at support backlog age, escalation quality, adoption signals, on-call load, incident recovery, change failure, and coverage depth across critical systems. If you only measure output, you will recreate the same fragility with nicer labels.
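As a sketch of what that scorecard review can look like, with invented metric names and limits chosen only to show the shape:

```python
# Illustrative scorecard: metric names and values are assumptions,
# not a standard. Delivery speed is one column among several.
SCORECARD = {
    "delivery": {"lead_time_days": 4.0},
    "support": {"oldest_open_ticket_days": 21, "escalation_reopen_pct": 12},
    "operations": {"change_failure_pct": 9, "mean_recovery_minutes": 45},
    "resilience": {"services_with_bus_factor_1": 2},
}

def red_flags(scorecard, limits):
    """Return every (area, metric, value) that breaches its agreed limit."""
    return [
        (area, metric, value)
        for area, metrics in scorecard.items()
        for metric, value in metrics.items()
        if value > limits.get(metric, float("inf"))
    ]
```

The design choice worth copying is that limits are explicit and agreed in advance, so the review argues about trade-offs rather than about whether a number is "really" bad.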
The key point is that this redesign is slower in month one and better by quarter two. That trade-off is hard for leaders who want immediate neatness. But the alternative is worse. You pay the coordination cost anyway. You just pay it through unplanned Slack archaeology, heroic escalation, and customers discovering your org chart before you do.
The problem is usually not that people are lazy or territorial. It is that the interfaces were never designed.
* * *
Your Operating Model Shows Up Under Stress
There is no universally correct support model for software organizations. Startups can survive with dense local ownership and lightweight specialist help. Scale-ups usually need product teams with end-to-end accountability plus support, success, platform, and reliability overlays. Larger or regulated environments usually land in a federated model with local ownership, central standards, and embedded specialists where risk is highest.
That is not inconsistency. That is fit.
What actually scales is not autonomy by itself or centralization by itself. It is clarity.
And clarity is not a meeting. It is a discoverable set of defaults. People should know where to look, who to call, what level of response applies, and who can make the call when speed and safety conflict. If that information lives only in the heads of veterans, you do not have clarity. You have oral tradition.
Clarity about who decides what.
Clarity about which problems belong close to the product.
Clarity about what gets standardized for leverage.
Clarity about how customers, support, and product learn from one another.
Clarity about where critical knowledge lives, and whether the organization can survive without its favorite heroes for a week.
I think managers underestimate how often org-design failure is disguised as a people problem. “We need stronger engineers.” “We need better collaboration.” “We need more ownership.” Sometimes, sure. But often the team is behaving rationally inside a blurry system.
Your org chart is not your operating model. Your operating model is the set of decisions, escalations, handoffs, and defaults that show up when something important breaks or a customer asks a hard question.
If the only thing holding that system together is that three veterans answer Slack fast and remember which doc is secretly current, you do not have a mature software organization.
You have folklore with payroll.
And that is fine for a while.
It is not a strategy.
The slightly annoying question is this: if your best engineer vanished for two weeks, would delivery dip because they are exceptional, or because your operating model quietly turned them into infrastructure?
* * *
Notes and References
1. Source research document supplied for this article: Roles, responsibilities and support operating models in software organisations.
2. DORA. Accelerate State of DevOps Report 2024.
3. Team Topologies. Key Concepts.
4. Google. Site Reliability Engineering book, “Introduction.”
5. Amazon Web Services. AWS Well-Architected Framework, “Relationships and ownership” and “Mechanisms exist to manage responsibilities and ownership.”
6. GitLab Handbook. “Product Manager.”
7. GitLab Handbook. “Engineering Manager.”
8. GitLab Handbook. “Support Engineer Responsibilities.”
9. GOV.UK Service Manual. “Set up a service team at each phase.”
10. HubSpot. “Your Customer Success Team.”
11. 37signals / Signal v. Noise. “Everyone on Support.”
12. Jabrayilzade, Evtikhiev, Tüzün, and Kovalenko. “Bus Factor in Practice.” ICSE 2022.
13. Santos, Silva, and Magalhães. “Benefits and Limitations of Job Rotation in Software Organizations: A Systematic Literature Review.” EASE 2016.
14. TSIA. “What Is Customer Success? Definition, Importance, and Benefits.”

