The implications of the different ways of organizing development and infrastructure staffs
The DevOps movement emerged as a cultural shift to break down silos in large organizations by better integrating development and operations teams through collaboration. From an organizational perspective, however, this collaboration can take different forms: developers and infrastructure specialists can belong to different departments or work together in a single team. With advances in PaaS offerings, it is even possible to envision developers themselves taking on operations responsibilities.
Our research at IME-USP (University of São Paulo) investigates how software-producing companies organize their development and infrastructure teams. We do so by interviewing software professionals to understand how things actually happen in practice. With this research, we hope to provide a theory that helps organizations design their structures toward continuous delivery and handle the consequences of a given structural choice.
Based on a careful analysis of the interviews, we developed a theory describing the organizational structures that industry uses to coordinate the work of developers and infrastructure engineers in pursuit of continuous delivery. We describe these structures (segregated departments, collaborating departments, single department, and API-mediated departments) in detail in our digest of organizational structures. In this post, we summarize each structure with a figure and its caption.
To better understand these structures, we sought to uncover why different organizations adopt different structures. Moreover, since each structure has advantages and drawbacks, we wanted to know which strategies companies adopt to overcome the drawbacks. Thus, for each organizational structure, we investigated its conditions, causes, avoidance reasons, consequences, and contingencies, as defined below:
- Conditions: environmental conditions necessary to implement a structure (i.e., prerequisites).
- Causes: reasons/motivations/opportunities that led the organization to adopt a particular structure and not another.
- Avoidance reasons: reasons/motivations that led the organization not to adopt a particular structure.
- Consequences: outcomes that happen or are expected to happen after an organization adopts a structure, including unexpected issues.
- Contingencies: strategies to overcome a structure’s drawbacks.
Below we list the conditions, causes, avoidance reasons, consequences, and contingencies associated with each structure. We have just submitted these results for peer review. The complete submitted article is available here. Each listed implication is preceded by a code (e.g., SC01) for reference in discussions. In our research, we call these implications “strong codes”, hence the “SC” prefix.
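For readers who want to work with the codes programmatically, the scheme above (a structure, an implication kind, and an SC code) could be modeled roughly as follows. This is only an illustrative sketch, not part of our research tooling: the names `Kind`, `StrongCode`, `by_code`, and `implications` are hypothetical, and the catalog holds just a handful of entries copied from the lists in this post.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The five implication kinds defined in this post."""
    CONDITION = "condition"
    CAUSE = "cause"
    AVOIDANCE_REASON = "avoidance reason"
    CONSEQUENCE = "consequence"
    CONTINGENCY = "contingency"


@dataclass(frozen=True)
class StrongCode:
    """One implication ("strong code") attached to a structure (hypothetical model)."""
    code: str        # e.g. "SC01"
    structure: str   # one of the four organizational structures
    kind: Kind
    text: str


# A small excerpt of the lists in this post, not the full catalog.
CODES = [
    StrongCode("SC01", "segregated departments", Kind.CONSEQUENCE,
               "Devs lack autonomy and depend on ops"),
    StrongCode("SC05", "collaborating departments", Kind.CONDITION,
               "Top management support"),
    StrongCode("SC14", "collaborating departments", Kind.CONTINGENCY,
               "Giving more autonomy to devs (in staging or even production)"),
    StrongCode("SC19", "single department", Kind.AVOIDANCE_REASON,
               "Not suitable for applying corporate governance standards"),
    StrongCode("SC27", "API-mediated departments", Kind.CAUSE,
               "Delivery bottleneck in infra management"),
]


def by_code(code: str) -> StrongCode:
    """Look up a single implication by its SC code; raises KeyError if absent."""
    for entry in CODES:
        if entry.code == code:
            return entry
    raise KeyError(code)


def implications(structure: str, kind: Kind) -> list[StrongCode]:
    """Return all implications of a given kind for a given structure."""
    return [e for e in CODES if e.structure == structure and e.kind == kind]
```

For example, `by_code("SC01").text` returns the description of SC01, and `implications("collaborating departments", Kind.CONTINGENCY)` returns the contingencies recorded for that structure in this excerpt.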
Segregated dev & infra departments
Consequences
- SC01 - Devs lack autonomy and depend on ops
- SC02 - Low delivery performance (queues and delays)
- SC03 - Friction and blaming games between devs and infra
Collaborating dev & infra departments
Conditions
- SC04 - Enough infra people to align with dev teams
- SC05 - Top management support
Causes
- SC06 - In a company that is not large or that has few products, it is easier to be collaborative
- SC07 - Trying to avoid the delivery bottleneck
- SC08 - Bottom-up initiative with later top-management support
Consequences
- SC09 - Growing interaction between areas (e.g., knowledge sharing)
- SC10 - Precarious collaboration (ops overloaded)
- SC11 - Discomfort/frustration/friction/inefficiency with blurred responsibilities (people don’t know what to do or what to expect from others)
- SC12 - Waiting (hand-offs), infra still a bottleneck
- SC13 - Automation supports collaboration
Contingencies
- SC14 - Giving more autonomy to devs (in staging or even production)
Single dev/infra department
Conditions
- SC15 - Enough ops for each dev team
Causes
- SC16 - Startup scenario (small, young, weak infra scalability requirements, business focus, use of cloud services to limit costs)
- SC17 - Cloud services decrease the need for infra & ops staff
- SC18 - Delivery velocity, agility, critical project
Avoidance reasons
- SC19 - Not suitable for applying corporate governance standards
- SC20 - More costs: duplication of infra work among teams, high salaries for infra professionals, underused infra professionals
Consequences
- SC21 - No [infra] defaults across teams: freedom, but possibly leading to duplication of efforts and high maintenance costs
Contingencies
- SC22 - Improve infra skills in-house, including through tech talks
API-mediated dev & infra departments
Conditions
- SC23 - Medium- to large-sized company
- SC24 - Top-down initiatives/sponsorship
- SC25 - Upfront investment
- SC26 - Requires coding skills from infra people
Causes
- SC27 - Delivery bottleneck in infra management
- SC28 - Compatible with existing rigid structures (low impact on organogram) / Only a few people needed to form a platform team
- SC29 - Fosters continuous delivery
- SC30 - A hero or visionary (hero culture)
- SC31 - Emerged as best solution; other initiatives not so fruitful
- SC32 - Multiple products / multiple dev teams / multiple clients (requires high delivery performance)
Consequences
- SC33 - Interaction (between devs and the platform team) to: support devs, make things work, and demand new capabilities from the platform
- SC34 - The platform provides common mechanisms (e.g., scaling, billing, observability, monitoring)
- SC35 - Promotes continuous delivery, agility, and faster changes
- SC36 - Devs responsible for infra architecture / concerns (e.g., NFR)
- SC37 - Platform team provides consulting and documentation to devs
- SC38 - Adding devs does not require adding [proportionally] more infra people
- SC39 - Eliminated previous bottleneck
- SC40 - Small platform team (excellence center)
- SC41 - High costs when using public clouds
- SC42 - Devs' skills are too focused on corporate needs, lacking base infra knowledge (bad for devs themselves, not for the company)
- SC43 - The cost of managing the platform (even using open-source software) is high
- SC44 - Risk: the platform is magic to devs; they neglect quality because they trust the platform too much, blame the platform for any problem, and do not know what to do, even for simple problems or when the problem is in the application itself
- SC45 - Devs possibly unable to understand the infra or to contribute to the platform
Contingencies
- SC46 - Decide how much devs must be exposed to the infra internals (some places more, some places less)