Crafting Platforms' Book
Chapter 02

Internal Developer Platform

Your scientists were so preoccupied with whether they could, they didn't stop to think if they should.

— Ian Malcolm, Jurassic Park
Story

Two weeks after Marta’s presentation on Platform Engineering, Diego calls a meeting with her and Javi.

“We agree we need an internal platform,” Diego begins. “But how do we actually build it?”

“First, let’s align on what we’re building,” Marta says. “An internal developer platform isn’t just infrastructure. It glues together all the tools developers need, such as CI/CD, observability, security, and templates. It also provides golden paths that lower cognitive load and enable self-service.”

“Golden paths?” Javi asks.

“Pre-built, opinionated ways of doing things. The platform says: ‘here’s how you deploy a database.’ Teams can deviate if needed, but the default path is secure, scalable, and well-documented. The easy way is the right way.”

“And that’s different from just giving everyone AWS access?” Diego asks.

“Very different. A platform isn’t shared infrastructure. It’s a product. We ask: what problems do developers have? What do they need? How do we measure success?”

“How do we structure objectives?” Javi asks.

“We use OKRs, like the rest of the company. Our objectives cascade from company goals. If the company wants faster time-to-market, our objective might be ‘reduce deployment lead time by 75%.’ Leadership sees the direct connection.”

“What capabilities should the platform offer?” Diego asks.

“Most companies organize them into layers like infrastructure, CI/CD, observability, security, developer tooling. But we don’t build everything on day one. We start with an MVP that solves the biggest problems, then iterate based on real feedback.”

“And support?” Javi asks.

“Hybrid model,” Marta explains. “Self-service where it makes sense, service desk for direct support, clear documentation. Make the right path the easy path. We have on-call for platform incidents, but product teams operate their own services and should have their own on-call.”

Diego nods. “I’ll discuss with leadership. Meanwhile, start defining the roadmap.”

“We’ll start by talking to developers,” Marta says. “Understanding what hurts. Then build our product vision from there.”

“An internal developer platform,” Javi concludes. “Built like a product, where our customers are our engineers… sounds like a plan.”

An Internal Developer Platform (IDP) is a layer of tools, services, and workflows that a platform team builds and maintains to enable development teams to self-serve their infrastructure and operational needs. More precisely, an IDP glues together the technologies and tools an organization uses into golden paths that lower cognitive load on developers and enable developer self-service without abstracting away the context developers need to do their jobs effectively.

Think of it as the “product” that your platform team offers to your developers. Just as your company builds products for external customers, the platform team builds a product for internal customers: the engineers who build and operate your software.

An IDP typically includes:

  • Infrastructure provisioning: Self-service access to compute, storage, databases, and networking
  • Application deployment: CI/CD pipelines, deployment automation, release management
  • Observability: Metrics, logs, traces, alerting, and dashboards
  • Security and compliance: Secrets management, vulnerability scanning, policy enforcement
  • Developer tooling: Templates, documentation, development environments, service catalogs

The key differentiator of an IDP is integration. It’s not just a collection of tools. It’s a curated, cohesive experience where the pieces work together, accessible through multiple interfaces (CLI, API, UI) to meet developers where they are. A developer shouldn’t need to understand five different systems to deploy a service. The platform abstracts that complexity behind a unified interface.

Successful IDP initiatives start small. Begin with a Minimum Viable Platform that addresses your developers’ most pressing pain points, then iterate based on adoption and feedback. Trying to build everything at once is a common mistake.

Note

IDP vs. PaaS

How does an IDP differ from a Platform as a Service (PaaS) like Heroku or Render? The distinction is customization and control. A commercial PaaS is a one-size-fits-all solution. You adapt to it. An IDP is tailored to your organization’s specific needs, technologies, and constraints. You build it, you own it, you evolve it based on feedback from your developers. An IDP can also incorporate PaaS offerings as part of its capabilities.

Note

IDP: multiple meanings

The term IDP also refers to Internal Developer Portal, which is a specific capability within an Internal Developer Platform. It is a user interface that provides developers with a centralized place to discover, provision, and manage platform services. We have a chapter on developer experience later in the book that covers this in detail.

Platform as Product

When defining what a platform should (or shouldn’t) offer, understanding what developers explicitly ask for is only part of the job. The platform must also cover needs developers may not know they have but that are critical to the organization’s long-term success: security, scalability, reliability, compliance, cost control. Part of our job, therefore, is to promote and teach best practices for the benefit of the business.

Treating the platform as a product also means basing decisions on user research and continuous feedback loops rather than assumptions. It requires cross-functional collaboration. Security, engineering, infrastructure, operations, and executives should all have input into the platform’s direction. The platform team doesn’t operate in isolation. It serves the entire engineering organization.

The Mindset Shift

The first thing the platform team must do is adopt the product mindset: the customer comes first. They develop something that others will use. This “something” (a.k.a. The Platform) is developed based on two types of needs:

  • Explicit needs: What development teams directly demand, such as infrastructure self-service, autonomy, CI/CD pipelines, reusable templates, clear documentation, ephemeral development environments…

  • Implicit needs: Requirements imposed by engineering best practices or business needs that may not be evident to all teams, such as integrated observability, security, horizontal scalability, reliability and disaster recovery, regulations, cost optimization…

The balance between the two is what defines a successful platform. Ignoring explicit needs results in a platform that nobody adopts. Ignoring implicit needs results in a platform that generates medium- and long-term problems.

The first obstacle many organizations encounter when trying to build an internal platform is that they apply the same mindset they’ve used for years to manage infrastructure. The platform team doesn’t hand out servers. It enables a path (the golden path) for development teams to provision what they need themselves. This may sound simplistic, even risky. It may sound like shirking responsibilities. But nothing could be further from the truth. The nuance is in “enabling a path.” That path is pre-established, configured according to best practices, and secure. It has guardrails that prevent developers from making critical errors, while giving them the autonomy to move fast without depending on another team. This is the central piece of the mindset shift. The easy path has to be the right path.
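As a minimal sketch of what “a path with guardrails” could look like in code, consider a self-service database request. Everything here is hypothetical and chosen only for illustration: the `DatabaseRequest` fields, the allowed sizes, and the specific rules are assumptions, not a prescribed design. The defaults represent the golden path, and the guardrail check rejects only the choices that would be critical errors.

```python
from dataclasses import dataclass

# Hypothetical catalogue of sizes the platform is willing to provision.
ALLOWED_SIZES = {"small", "medium", "large"}


@dataclass(frozen=True)
class DatabaseRequest:
    """A self-service request; the defaults are the golden path."""
    service: str
    size: str = "small"
    encrypted: bool = True
    backups_enabled: bool = True
    publicly_accessible: bool = False


def guardrail_violations(req: DatabaseRequest) -> list[str]:
    """Return the guardrails a request breaks (empty list means it may proceed)."""
    violations = []
    if req.size not in ALLOWED_SIZES:
        violations.append(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if not req.encrypted:
        violations.append("encryption at rest cannot be disabled")
    if req.publicly_accessible:
        violations.append("databases cannot be exposed publicly")
    return violations


# Taking the default path requires no decisions beyond naming the service.
default_request = DatabaseRequest(service="checkout")
# Teams can deviate where it is safe to do so (a bigger instance)...
bigger_request = DatabaseRequest(service="checkout", size="large")
# ...but the guardrails stop a critical mistake.
risky_request = DatabaseRequest(service="checkout", publicly_accessible=True)

print(guardrail_violations(default_request))  # []
print(guardrail_violations(bigger_request))   # []
print(guardrail_violations(risky_request))    # ['databases cannot be exposed publicly']
```

The design choice worth noticing is that the safe options are defaults rather than required fields: a developer who specifies nothing still ends up with an encrypted, backed-up, private database.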

At the same time, the platform team is a cross-functional team that offers common infrastructure and shared services. In other words, it runs the platform as a service, and that service must have corresponding 24x7 support.

Another key focus is the relationship with developers. In a traditional model, the infrastructure team is a provider that responds to tickets and support requests. In a platform model, the platform team is a strategic partner that understands developers’ needs. That’s why it’s important to have constant bidirectional communication, with satisfaction surveys, needs discovery sessions, etc. The platform’s success depends on the success of the developers who use it.

There’s a subtle but important difference between “user” and “customer”:

  • A user consumes what you give them. They can complain, but have few alternatives.
  • A customer chooses to use your product because it provides value. If it doesn’t, they seek alternatives.

In our context, developers are technically captive users (they can’t easily switch to another platform), but they should be treated as customers who choose to use your product. This means we must strive to offer an excellent user experience, listen to their needs, and adjust the platform accordingly.

Importantly, the platform should be voluntarily adopted. The platform team must earn developer trust by demonstrating value, not by mandating usage. When the platform genuinely makes developers’ lives easier, adoption follows naturally. This way, the platform becomes a strategic asset for the organization.

The Engineering Momentum Problem

Platform engineering looks great on paper, but making it work in practice is a different story. In an ideal world, you start with a blank slate, design the perfect platform, and roll it out to your developers. In reality, most organizations already have established systems, processes, delivery pressures, and cultural habits that make this transition challenging. Resistance to change is natural, especially when it involves altering workflows that have “worked well enough” for years.

I call this the Engineering Momentum Problem, and understanding it is essential, because even a well-designed platform will fail if you underestimate developer inertia.

Note

Understanding Momentum in Physics

In physics, momentum (“quantity of motion”) is the product of mass and velocity:

$$p = m \cdot v$$

A fully loaded cargo ship traveling at cruising speed has enormous momentum, because of its huge mass. You can’t simply turn the wheel and expect the vessel to pivot sharply. It will fight you. It wants to keep going straight. To change its direction, you need to apply a gentle, continuous force over a long arc —sometimes with the help of a tugboat— to slowly bend its trajectory. There is no such thing as a sudden 90-degree turn in a ship that size.
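To put illustrative numbers on the comparison (the masses and speeds below are round figures chosen for the example, not measurements), compare a loaded cargo ship of about 100,000 tonnes moving at 10 m/s with a one-tonne speedboat moving at 20 m/s:

$$p_{\text{ship}} = 10^{8}\,\text{kg} \cdot 10\,\text{m/s} = 10^{9}\,\text{kg}\cdot\text{m/s}$$

$$p_{\text{boat}} = 10^{3}\,\text{kg} \cdot 20\,\text{m/s} = 2 \times 10^{4}\,\text{kg}\cdot\text{m/s}$$

Even though the speedboat is twice as fast, the ship carries 50,000 times more momentum, which is why a push that turns the boat barely nudges the ship.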

If you replace the physical terms with organizational ones, the analogy becomes clear:

  • Object: development teams
  • Mass: people, habits, inherited processes, legacy tools, technology they’ve accumulated, interdependencies, historical decisions (that no one fully remembers)
  • Velocity: roadmaps, deadlines, commitments already made, business pressure, operational load

The bigger the mass and the higher the velocity, the greater the momentum, and the harder it is to change direction.

Why Momentum Matters

In the context of platform engineering, momentum directly affects the success of the platform’s adoption. Introducing a new platform —no matter how elegant— disrupts existing workflows. It adds short-term friction to already overloaded teams and threatens their delivery commitments.

This is where many platforms fail: not because they are badly designed, but because the people building them underestimated momentum. Teams with high momentum will naturally push back against new tools, new processes, new abstractions, new ways of deploying or operating services, and new responsibilities.

Applying Force to Change Direction

Just as with the moving vessel, you cannot (and should not) force an abrupt turn. Platform engineering works exactly the same way:

  • You don’t mandate a platform all at once.
  • You don’t force teams to abandon familiar tools overnight.
  • You don’t try to shift the entire company’s workflow in a single quarter.

Instead, you apply continuous gentle force:

  • guidance
  • empathy
  • clear documentation
  • examples
  • golden paths
  • human support
  • collaboration
  • small wins

Little by little, momentum bends. The change stops feeling like an imposition and starts feeling like an obvious improvement, because you’re no longer seen as “the team pushing a new thing” but as the team that helps reduce the burden.

When that happens, adoption stops being a fight, and you can start introducing new technologies into the stack, new processes, and a new culture for building software. Yes, that awesome Kubernetes and Argo CD setup you want for fully automated deployments will come, but months later in the process.

The key insight: adoption needs gentle, continuous force rather than mandates. This is where the product mindset pays off: if developers don’t want to use your platform, it doesn’t matter how technically excellent it is.

Fundamentals

Like any product, a platform must be defined based on:

  • Vision: A clear statement of what you want to achieve with the platform, aligned with the organization’s strategic objectives.
  • Strategy: A medium- and long-term plan that details how the platform will be built, evolved, and maintained.
  • Objectives: Measurable goals that define what success looks like, typically expressed as OKRs or similar frameworks.
  • Roadmap: A tactical plan that prioritizes functionalities and improvements to implement based on what developers and the business need at each moment.
  • Roles and responsibilities: Clear definition of who does what within the platform team.

You don’t build a house from the roof down. You start with the foundation. The manager tasked with building a platform must start by defining all of the above, because at some point they’ll have to present it to the rest of the organization to get support, commitment, and even funding. You need to be very clear about the platform’s reason for being, the problems it will solve, and especially how it will be managed and evolved over time.

As you can see, there’s nothing technical here. First you have to define the what and the why before getting into the how, which takes up practically the rest of the book.

Vision

The vision is easy to define with everything we’ve seen so far. There will be nuances or flourishes depending on each organization’s context. But in essence, a platform’s vision is something like:

“Provide developers with a secure, robust, and scalable platform that enables them to develop, deploy, and operate applications autonomously and efficiently.”

Or if we want to focus on the organization, something like this also works:

“Provide the organization with an internal platform for developers that facilitates fast and secure software delivery.”

Personally, I prefer to focus on developers, because the organization’s benefit is implicit in theirs. If we improve what they do, we improve the organization as a whole.

The vision should be inspiring, clear, and concise. It should be easy to remember.

Strategy

The strategy will depend on each organization’s context, needs, and technological maturity. However, in my experience, certain elements are common to all:

  • Segmentation: define a priori the different segmentation models (by region, by tier, by team…). Each has its own challenges and will complicate implementation. Defining them a posteriori can mean partially or totally redoing certain parts of the platform, such as the network or data management. We have an entire chapter dedicated to this due to its importance. Segmentation has technical consequences, but it’s a decision imposed by business needs (for example, regulatory requirements for data sovereignty).
  • Capabilities: define the minimum, desirable, and future capabilities that the platform should offer. This will help us define the roadmap and prioritize work. Later we’ll see what some of these capabilities are.
  • Team: define the team needed to build and maintain the platform, including both technical and management roles.
  • Relationship with developers: define how to interact with development teams, how to collect their feedback, and how to educate and evangelize about the platform.

Objectives

Before defining a roadmap, you need to define what success looks like. Objectives provide the “why” behind each initiative on the roadmap and establish how you’ll measure progress. Without clear objectives, a roadmap becomes a wish list of features rather than a strategic plan tied to business outcomes.

OKRs (Objectives and Key Results) have become a popular framework for this purpose. An Objective is a qualitative, inspiring goal that describes what you want to achieve. Key Results are quantitative metrics that measure whether you’ve achieved the objective. The combination ensures alignment between aspirational goals and concrete, measurable outcomes.

OKRs are my personal choice because of their simplicity and effectiveness, and because of my experience using them. However, you can use any other framework that suits your organization. What matters is having clear, measurable objectives. When defining platform objectives, consider both developer-centric goals and business outcomes.

Platform objectives should align with broader organizational goals. If the company’s objective is to reduce time-to-market for new features, the platform’s objectives around deployment frequency and developer productivity directly support that goal. This alignment helps justify platform investments and ensures the team is focused on what matters most to the business.

When presenting objectives to stakeholders, make the connection explicit. Show how each platform objective contributes to the company’s success. This builds support and understanding across the organization.
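The relationship between an Objective, its Key Results, and the company goal it supports can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the `supports` field, the linear progress formula, and all the numbers are hypothetical, and real OKR tooling may score progress differently.

```python
from dataclasses import dataclass, field


@dataclass
class KeyResult:
    description: str
    baseline: float  # value at the start of the period
    target: float    # value we want to reach
    current: float   # latest measured value

    def progress(self) -> float:
        """Fraction of the baseline-to-target distance covered, clamped to [0, 1]."""
        span = self.target - self.baseline
        if span == 0:
            return 1.0
        return max(0.0, min(1.0, (self.current - self.baseline) / span))


@dataclass
class Objective:
    statement: str
    supports: str  # the company goal this objective cascades from
    key_results: list[KeyResult] = field(default_factory=list)

    def progress(self) -> float:
        """Unweighted average of key result progress (an assumed scoring rule)."""
        if not self.key_results:
            return 0.0
        return sum(kr.progress() for kr in self.key_results) / len(self.key_results)


# Illustrative numbers only: lead time should drop from 8 h to 2 h (a 75% cut).
objective = Objective(
    statement="Make shipping changes fast and routine",
    supports="Faster time-to-market",
    key_results=[
        KeyResult("Deployment lead time (hours)", baseline=8, target=2, current=5),
        KeyResult("Deployments per team per week", baseline=2, target=10, current=4),
    ],
)

for kr in objective.key_results:
    print(f"{kr.description}: {kr.progress():.0%}")
print(f"Objective progress: {objective.progress():.1%}")  # 37.5%
```

Because progress is measured from baseline to target, the same formula works for metrics that should go down (lead time) and metrics that should go up (deployment frequency).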

Roadmap

The roadmap is a tactical plan that details what will be built and when. It should be based on the platform strategy, the objectives defined above, and the needs of developers. The roadmap should be flexible and adaptable, as priorities will change over time based on feedback, new requirements, and progress toward objectives.

Every major initiative on the roadmap should trace back to one or more objectives. This connection serves multiple purposes:

  • Prioritization: When faced with competing initiatives, you can evaluate which one contributes more to your key results
  • Communication: Stakeholders understand why you’re building what you’re building
  • Focus: The team can avoid scope creep by asking “does this help us achieve our objectives?”

When defining the roadmap, prioritize functionalities that provide the most value, both to developers and to your key results. Various prioritization techniques exist (MoSCoW, RICE, weighted scoring), but the specific method matters less than having a consistent approach that ties back to your objectives.
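As one example of a consistent approach, RICE scores each initiative as reach × impact × confidence ÷ effort and ranks the backlog by that score. The sketch below assumes hypothetical initiatives and made-up estimates. The units (developers reached per quarter, person-months of effort) are also assumptions that you would adapt to your own context.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    name: str
    reach: float       # developers affected per quarter (assumed unit)
    impact: float      # relative impact per developer, e.g. 0.5 low to 2 high
    confidence: float  # 0.0 to 1.0, how much we trust the estimates
    effort: float      # person-months (assumed unit)

    def rice(self) -> float:
        """RICE score: (reach * impact * confidence) / effort."""
        return self.reach * self.impact * self.confidence / self.effort


def prioritize(initiatives: list[Initiative]) -> list[Initiative]:
    """Return initiatives ordered from highest to lowest RICE score."""
    return sorted(initiatives, key=lambda item: item.rice(), reverse=True)


# Hypothetical backlog with illustrative estimates.
backlog = [
    Initiative("Developer portal", reach=200, impact=1.0, confidence=0.5, effort=8),
    Initiative("Self-service database provisioning", reach=120, impact=2.0, confidence=0.8, effort=4),
    Initiative("Golden-path CI template", reach=200, impact=1.0, confidence=1.0, effort=2),
]

for item in prioritize(backlog):
    print(f"{item.rice():6.1f}  {item.name}")
```

With these estimates the CI template ranks first: it reaches many developers at low effort, even though its per-developer impact is lower than the database initiative's.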

A common structure for platform roadmaps includes:

  • Now (current sprint): Committed work with clear deliverables
  • Next (following sprint): Planned work, subject to minor adjustments
  • Later (future sprints): Strategic direction, flexible based on learnings

This “Now/Next/Later” format avoids the trap of committing to specific dates too far in advance while still providing visibility into the platform’s direction. You can also use quarters or months if that fits better with your organization’s planning cycles.

At the end of each sprint (or your chosen planning cycle), assess:

  1. Delivery: Did we complete what we planned?
  2. Impact: Did completed initiatives move our key results as expected?
  3. Learning: What did we learn that should inform future planning?

If you’re consistently delivering initiatives but not moving key results, something is wrong. Either the initiatives aren’t actually addressing the objectives, or the key results themselves need revisiting.

In my experience, a Kanban-style approach works well for platform teams: maintain a backlog prioritized by impact on objectives, pull work based on capacity, and hold regular planning sessions (monthly or per sprint) to review progress against key results and adjust priorities. This balances structure with the flexibility that platform work often requires.

Importantly, revisit your roadmap regularly based on feedback. IDP initiatives evolve quickly as you learn what developers actually need versus what you assumed they needed. A roadmap that doesn’t adapt to real-world feedback becomes stale and disconnected from developer reality.

Roles and Responsibilities

The platform is built by a platform team. This team’s mission is to reduce the cognitive load on stream-aligned teams by providing them with internal self-service capabilities. It acts as an enabling layer that allows development teams to be more autonomous and productive.

Note
A stream-aligned team is a team aligned to a flow of work from a segment of the business domain. They are responsible for delivering value directly to customers. Their focus is on end-to-end delivery of features, products, or services.

The goal of this section is not to provide an exhaustive guide on how to structure a platform team. There are specialized books like Team Topologies [Skelton, 2019] that delve into these organizational strategies. This section provides context on who usually forms part of these teams to give coherence to the rest of this book’s content.

Below are the most common roles. In small teams, one person often wears multiple hats. Only in large organizations will you find highly specialized roles.

Core roles:

  • Platform Engineers (various levels): Build and maintain platform capabilities, from implementation to architecture to technical vision.
  • Engineering Manager: Team management, hiring, professional development, strategic alignment.
  • Product Manager: In larger teams, a dedicated PM drives roadmap and stakeholder communication. In smaller teams, this mindset is distributed among senior engineers and management.

Supporting roles (as the platform matures):

  • Developer Advocate: Evangelization, training, and bridge between platform and developers—critical for adoption.
  • Technical Writer: Documentation quality directly impacts self-service success.

Operational responsibilities:

  • On-call rotation for critical platform incidents.
  • Technical support through service desk or dedicated channels.
  • Incident management with blameless postmortems.

Platform SLAs and SLOs

The platform should define and publish its own service levels:

  • SLOs (Service Level Objectives): Internal targets for availability, latency, and reliability. For example: “99% of CI/CD pipelines start running within one minute of being triggered.”
  • SLAs (Service Level Agreements): Commitments to your internal customers. These create accountability and set expectations.

Tracking these metrics publicly (dashboards accessible to all developers) builds trust and demonstrates the platform team’s commitment to reliability.
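Checking an SLO like the pipeline example comes down to simple arithmetic: count the runs that met the threshold and compare the ratio with the target. The sketch below reads that example as “a pipeline starts within 60 seconds of being triggered,” which is an interpretation. The function names and the sample data are hypothetical.

```python
def slo_compliance(start_delays_s: list[float], threshold_s: float = 60.0) -> float:
    """Fraction of pipeline runs that started within threshold_s seconds."""
    if not start_delays_s:
        raise ValueError("need at least one measurement")
    within = sum(1 for delay in start_delays_s if delay <= threshold_s)
    return within / len(start_delays_s)


def meets_slo(start_delays_s: list[float], target: float = 0.99,
              threshold_s: float = 60.0) -> bool:
    """True when the measured compliance reaches the SLO target."""
    return slo_compliance(start_delays_s, threshold_s) >= target


# Hypothetical week of data: 200 runs, 3 of which waited longer than a minute.
delays = [12.0] * 197 + [95.0] * 3

print(f"Compliance: {slo_compliance(delays):.1%}")  # Compliance: 98.5%
print(f"SLO met: {meets_slo(delays)}")              # SLO met: False
```

Three slow runs out of 200 are enough to miss a 99% target, which is the kind of result a public dashboard should make visible rather than hide.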

Capabilities

An IDP’s value comes from the capabilities it provides to developers. While the specific capabilities will vary by organization, most platforms organize them into logical groupings, often called planes or layers. This organization helps teams understand what the platform offers and makes it easier to evolve incrementally.

A common way to structure platform capabilities:

  • Infrastructure & Resources: The foundational layer providing compute, storage, networking, and managed services. This includes cloud provider resources (VMs, managed databases, storage buckets), container orchestration (Kubernetes clusters), networking (VPCs, DNS, load balancers, CDNs), and messaging systems (Kafka, queues).

  • Integration & Delivery: Everything needed to build, test, and deploy software. CI/CD pipelines, artifact registries (container images, packages), deployment automation, release management, and environment provisioning. This is often where teams see the most immediate value from a platform.

  • Observability: The ability to understand what’s happening in production. Metrics collection and visualization, centralized logging, distributed tracing, alerting, dashboards, and SLO management. Good observability reduces mean time to detection and resolution.

  • Security & Compliance: Capabilities that make secure practices the default. Secrets management, identity and access management, vulnerability scanning, policy enforcement, compliance automation, and audit logging. Security baked into the platform is far more effective than security bolted on afterward.

  • Developer Experience: Tools and workflows that make developers productive. Service catalogs, project templates, documentation portals, development environments, internal APIs, and self-service interfaces. This layer ties everything else together into a cohesive experience.

Note
Different organizations use different naming for these groupings. You might see “Runtime Plane” instead of “Infrastructure,” or “Delivery Plane” instead of “Integration & Delivery.” The specific names matter less than the underlying philosophy: group capabilities logically and provide abstractions that teams can easily consume.

You don’t need to build all of this on day one. In fact, trying to do so is a common mistake. Start with the minimum capabilities that address your developers’ most pressing pain points—typically CI/CD and basic infrastructure provisioning—then expand based on adoption and feedback.

The subsequent chapters of this book will explore each capability area in depth, with practical guidance on implementation.

Developer Experience

Developer experience (DevEx) is not a feature. It’s a lens through which every platform decision should be evaluated. A platform with excellent capabilities but poor DevEx will struggle with adoption. A platform with modest capabilities but excellent DevEx will thrive.

At its core, DevEx means:

  • Self-service over tickets: Developers can provision what they need without waiting for approvals or manual intervention.
  • Golden paths: The easiest way to do something is also the correct way. Good defaults, sensible guardrails, minimal configuration.
  • Clear documentation: Not just reference docs, but guides, examples, and templates that help developers succeed quickly.
  • Responsive support: When self-service isn’t enough, help is accessible and timely.

Feedback Loops

A platform without feedback is a platform built on assumptions. Establish multiple channels to understand how developers experience your platform:

  • Surveys: Periodic (quarterly) satisfaction surveys covering specific aspects: documentation, support, reliability, ease of use. Keep them short.
  • Usage analytics: Track adoption metrics, feature usage, and where developers struggle or abandon workflows.
  • Direct conversations: Regular office hours, embedded time with teams, or informal conversations. Numbers tell you what. Conversations tell you why.
  • Support patterns: Analyze support tickets for recurring themes. Repeated questions indicate documentation gaps or UX problems.

The goal isn’t just collecting feedback. It’s closing the loop. When developers see their feedback result in improvements, they trust the platform team and engage more.
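As a small illustration of mining support patterns, the sketch below counts how often each theme appears in a set of tickets and surfaces the ones that recur. It assumes tickets have already been tagged with a `theme` (by hand or by your ticketing tool). The field name, the themes, and the threshold of three are all hypothetical.

```python
from collections import Counter


def recurring_themes(tickets: list[dict], min_count: int = 3) -> list[tuple[str, int]]:
    """Return (theme, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter(ticket["theme"] for ticket in tickets)
    return [(theme, n) for theme, n in counts.most_common() if n >= min_count]


# Hypothetical tickets from one month of platform support.
tickets = (
    [{"id": i, "theme": "secrets-rotation"} for i in range(4)]
    + [{"id": 10 + i, "theme": "pipeline-timeout"} for i in range(3)]
    + [{"id": 20, "theme": "dns"}]
)

for theme, count in recurring_themes(tickets):
    print(f"{theme}: {count} tickets")
```

A theme that keeps showing up here is a candidate for a documentation fix or a UX change, and shipping that fix is what closes the loop.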

Support Model

Platform support typically operates in tiers:

  • Self-service (Tier 0): Documentation, FAQs, examples, and templates. This should handle the majority of questions. Invest heavily here.
  • Community support (Tier 1): Slack channels or office hours where developers help each other and platform team members participate.
  • Direct support (Tier 2): Service desk or dedicated channel for issues that can’t be resolved through self-service. Track response times and resolution rates.
  • Escalation (Tier 3): Complex issues requiring platform engineer involvement. Should be rare if Tiers 0-2 are working well.

The goal is to push as much as possible to the lower tiers, not to avoid work, but because self-service is faster for developers.

We’ll explore specific DevEx practices throughout the book, but keep this principle central: if developers don’t want to use your platform, it doesn’t matter how technically excellent it is.

Skills for This Chapter

AI Skill
define-platform-vision — An AI skill that guides you through defining your platform’s vision, strategy, and OKRs. It asks about your organizational context, developer pain points, and business goals, then produces a platform charter you can present to stakeholders.

Summary

This chapter sets the foundation for everything that follows. The Internal Developer Platform is the product we are building throughout the book, and treating it as such is crucial for success. A platform is not just a technical challenge. It’s also a cultural and organizational one. The mindset shift is key: the platform team doesn’t provide servers, but enables paths for development teams to provision what they need themselves. The easy path has to be the right path.

We also explored the Engineering Momentum Problem, which is the challenge of overcoming developer inertia. Even a well-designed platform will fail if you underestimate this aspect. Adoption needs gentle, continuous force rather than mandates.

Finally, we covered the fundamentals of building a platform as a product: vision, strategy, objectives, roadmap, roles and responsibilities, and key capabilities. With this foundation in place, we can now dive into the specific layers and practices that make up a successful Internal Developer Platform.