Platform engineering is the discipline of building and maintaining a self-service platform for developers. The platform provides a set of cloud-native tools and services to help developers deliver applications quickly and efficiently. The goal of platform engineering is to improve developer experience (DX) by standardizing and automating most of the tasks in the software delivery lifecycle (SDLC). Instead of context switching between tasks such as provisioning infrastructure, managing security, and learning new tools, developers can focus on coding and delivering the business logic using automated platforms.
Platform engineering has an inward-looking perspective as it focuses on optimizing developers in the organization for better productivity. Organizations benefit greatly from developers working at the optimum level because it leads to faster release cycles. The platform makes it happen by providing everything developers need to get their code into production so they do not have to wait for other IT teams for infrastructure and tooling. The self-service platform that makes developers' day-to-day activities more effortless and autonomous is called an internal developer platform (IDP).
An IDP is a platform comprising self-service cloud-native tools and technologies that developers can use to build, test, deploy, monitor, or do almost anything else related to application development and delivery with as little overhead as possible. Platform engineers or platform teams build it after consulting the developers and understanding their unique challenges and workflows.
After discussing and implementing Kubernetes CI/CD pipelines and GitOps solutions for many large hi-tech enterprises, we realized a typical IDP would consist of the below 5 pillars:
The platform team designs the IDP so that it is easy for developers to use, with a minimal learning curve. An IDP can help reduce developers' cognitive load and improve DX by automating repetitive tasks, reducing maintenance overhead, and eliminating the need for endless scripting. Because it is self-service, the IDP lets development teams independently manage resources, infrastructure needs, deployments, and rollbacks. This increases developer autonomy and accountability, reduces dependencies, and streamlines the development cycle.
Platform engineering can help organizations reap several internal (developers) and external (end users) benefits:
Implementing a successful platform team in an organization and leveraging the above benefits requires following some common principles. Treating the platform as a product is one of them.
One of the core principles of platform engineering is productizing the platform. The platform team needs to employ a product management mindset to design and maintain a platform that is not only user-friendly but also meets the expectations and needs of its customers (app developers). It starts with collecting data points around the problems developers have and identifying which areas to address. The goal could be to improve deployment frequency, reduce the change failure rate, improve reliability and security, improve DX, and so on.
It is important to note that building a platform is all about building a core product that solves common challenges most teams have. It is not about solving the problems of a single team but providing the product across multiple teams to solve the same set of problems. For example, if multiple teams require the same piece of infrastructure, it makes sense for the platform team to work on that shared piece and distribute it. This idea of reusing the platform and repeatability is crucial as it allows for standardization, consistency, and scalability in application delivery.
As in product management, the platform team owns the product, chooses certain metrics, and continues taking customer feedback to improve the user experience. The platform's product roadmap evolves with respect to feedback, and it accommodates changing needs and desires of the customers.
The primary role of a platform engineer is to design and maintain a self-service platform (IDP) and provide platform services for developers. It starts with engaging with the developers and understanding their pain points:
Interview developers and different IT teams to understand their engineering landscape and challenges and to know what they are optimizing for. They may be trying to build an effective CI/CD pipeline or implement better access control, among many other challenges around software delivery.
Identify common challenges most teams share and prioritize solving them over problems individual teams face. For example, if most teams find it hard to store and retrieve secrets securely, it is ideal to prioritize that problem and solve it for everyone.
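One way to make this prioritization concrete is to tally how many teams report each challenge. The snippet below is a minimal sketch, assuming interview notes have already been reduced to a list of challenge labels per team; the team names, labels, and the `rank_shared_challenges` helper are all hypothetical.

```python
from collections import Counter


def rank_shared_challenges(interviews, min_teams=2):
    """Return (challenge, team_count) pairs reported by at least
    `min_teams` teams, most widely shared first."""
    counts = Counter()
    for challenges in interviews.values():
        # Use a set so a challenge counts once per team, even if repeated.
        counts.update(set(challenges))
    return [(name, n) for name, n in counts.most_common() if n >= min_teams]


# Hypothetical interview results: team -> challenges they reported.
interviews = {
    "payments": ["secrets management", "slow CI builds"],
    "search": ["secrets management", "environment setup"],
    "mobile": ["secrets management", "slow CI builds"],
}

ranked = rank_shared_challenges(interviews)
for name, team_count in ranked:
    print(f"{name}: reported by {team_count} teams")
```

Challenges reported by only one team fall below the threshold and stay with that team, which matches the idea of solving shared problems first.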
Design the IDP with the tools required to solve those problems for users, along with documentation that enables developers to self-serve resources and infrastructure. In the above case, adopting a secret management tool would address the challenges around securely managing secrets. Part of designing the platform also includes writing scripts to automate routine development tasks, such as spinning up new environments and provisioning infrastructure, to reduce errors and friction points in the development flow.
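As an illustration of what such automation might look like, the sketch below turns a short developer request into a full environment specification using templates maintained by the platform team. It is not tied to any real provisioning tool: the template catalog, field names, and the `render_environment` function are assumptions made for this example, and a real platform would hand the resulting spec to its IaC or orchestration tooling.

```python
from dataclasses import dataclass

# Hypothetical catalog of environment sizes curated by the platform team.
TEMPLATES = {
    "small": {"cpu": 1, "memory_gb": 2, "replicas": 1},
    "standard": {"cpu": 2, "memory_gb": 4, "replicas": 2},
}


@dataclass(frozen=True)
class EnvironmentRequest:
    """The only inputs a developer has to provide."""
    team: str
    app: str
    size: str = "small"


def render_environment(request):
    """Expand a request into a complete, standardized environment spec."""
    if request.size not in TEMPLATES:
        raise ValueError(
            f"unknown size {request.size!r}; choose from {sorted(TEMPLATES)}"
        )
    spec = dict(TEMPLATES[request.size])  # copy so the template stays untouched
    spec["name"] = f"{request.team}-{request.app}-env"
    spec["labels"] = {"team": request.team, "app": request.app}
    return spec


spec = render_environment(
    EnvironmentRequest(team="search", app="indexer", size="standard")
)
print(spec)
```

Keeping the developer-facing input to three fields while the platform fills in the rest is one way to get standardization without asking developers to learn the underlying tooling.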
Choose specific metrics around the goals to measure the platform's effectiveness. For example, if the goal is to improve DX, the metrics include engagement scores, team feedback, etc. Similarly, the metrics will change if the goal is to reduce the change failure rate or to increase deployment frequency.
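For the delivery-focused goals, two of the metrics named above can be computed from a simple deployment log. The following is a minimal sketch, assuming each deployment is recorded with a date and a flag for whether it caused a failure in production; the record layout and sample data are hypothetical.

```python
from datetime import date


def deployments_per_week(deployments, period_days):
    """Average number of deployments per week over the observed period."""
    return len(deployments) / (period_days / 7)


def change_failure_rate(deployments):
    """Fraction of deployments that led to a failure in production."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["failed"])
    return failed / len(deployments)


# Hypothetical deployment log covering a 28-day period.
deployments = [
    {"day": date(2024, 1, 3), "failed": False},
    {"day": date(2024, 1, 10), "failed": True},
    {"day": date(2024, 1, 17), "failed": False},
    {"day": date(2024, 1, 24), "failed": False},
]

frequency = deployments_per_week(deployments, period_days=28)
failure_rate = change_failure_rate(deployments)
print(f"{frequency:.1f} deployments/week, {failure_rate:.0%} change failure rate")
```

Tracking these numbers before and after a platform change gives the team a baseline to compare against, rather than relying on impressions alone.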
Continue listening to the customers and watch the metrics. Gather user feedback to add new tools to the platform and optimize for a better user experience. This also includes staying up-to-date with emerging tools and technologies in the DevOps and cloud infrastructure space and adopting them if necessary.
It is easy to confuse the roles of a DevOps engineer or SRE with that of a platform engineer since they all manage the underlying infrastructure and support software development teams. Although there are certain overlapping responsibilities between all these roles, each differs from the others with its unique focus.
DevOps is a philosophy that brought a cultural shift to the SDLC to improve software delivery speed and quality. DevOps facilitated collaboration and communication between development and ops teams and accelerated automation to streamline deployments. Platform engineering, a practice rather than a philosophy, can be considered the next iteration of DevOps as it shares some core principles of DevOps: collaboration (with Ops), continuous improvement, and automation.
The daily tasks of a platform team and a DevOps team differ in some respects. DevOps engineers use tools and automation to streamline getting the code to production, managing it, and observing it with logging and monitoring tools. They mostly work on building an effective CI/CD pipeline. Platform engineers take all the tools used by DevOps and integrate them into a shared platform, which different IT teams can use at an enterprise level. This eliminates the need for teams to configure and manage infrastructure and tooling on their own and saves significant time, effort, and resources. Platform engineers also create the documentation and optimize the platform so developers can self-serve the tools and infrastructure in their workflow.
Platform teams are typically needed only in mature companies with many different IT teams using complex tools and infrastructure. In such an engineering landscape, a dedicated platform team to manage the complexity naturally becomes necessary. The platform team builds and manages the infrastructure, helping DevOps speed up continuous delivery. In startups, however, it is common for the DevOps team to perform platform engineering tasks (configuring Terraform, for example).
Site reliability engineers (SREs) focus on ensuring the application is reliable, secure, and always available. They work with developers and Ops teams to create systems or infrastructure that support delivering highly reliable applications. SREs also perform capacity planning and infrastructure scaling and manage and respond to incidents so that the platform meets required service level objectives (SLOs). On the other hand, platform engineering manages complex infrastructure and builds an efficient platform for developers to optimize SDLC. While both work on platforms and their roles sound similar, their goals differ.
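To make the SLO idea concrete, an availability target can be translated into the amount of downtime it permits over a time window, often called an error budget. The sketch below assumes a simple time-based availability SLO over a 30-day window; the function names and sample numbers are illustrative, not taken from any particular tool.

```python
def error_budget_minutes(slo_target, window_days=30):
    """Minutes of downtime permitted by an availability SLO over the window."""
    total_minutes = window_days * 24 * 60
    return (1 - slo_target) * total_minutes


def budget_remaining_minutes(slo_target, downtime_minutes, window_days=30):
    """Remaining budget; a negative value means the SLO has been missed."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


budget = error_budget_minutes(0.999)  # 99.9% availability over 30 days
remaining = budget_remaining_minutes(0.999, downtime_minutes=30)
print(f"budget: {budget:.1f} min, remaining: {remaining:.1f} min")
```

A 99.9% target over 30 days leaves roughly 43 minutes of allowable downtime, which shows how quickly a single incident can consume the budget.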
The major difference between platform engineering and SRE lies in whom they serve. SREs face end users and ensure the application is reliable and available for them. Platform engineers face internal developers and focus on improving their developer experience. The daily tasks of both teams differ according to these goals. Platform engineering provides the underlying infrastructure for rapid application delivery, while SREs work on infrastructure to deliver highly reliable and available applications. SREs spend more time on troubleshooting and incident response, and platform engineers focus on complex infrastructure and enabling developer self-service.
To achieve their respective goals, SREs and platform teams use different tools in their workflows. SREs mostly use monitoring and observability tools like Prometheus or Grafana to detect anomalies in real time and to set automated alerts. Platform teams work with tools spanning various stages of the software delivery process, such as container orchestration tools, CI/CD pipeline tools, and IaC tools. All in all, SREs and platform teams both work on building reliable and scalable infrastructure, with different goals but some overlap in the tools they use.
A platform team will not be an immediate requirement in a startup with a few engineers. Once the organization grows to multiple IT teams and starts dealing with complex tooling and infrastructure, it is ideal to have platform engineers to manage the complexity.
Engineering leaders such as the VP or Head of Engineering usually create the role of a platform engineer when developers spend more time configuring tools and infrastructure than delivering the business logic. They find that most IT teams are solving the same problems, like spinning up a new environment, which slows down delivery. So the Head of Engineering defines the scope of platform engineering, identifies the areas of responsibility, and creates the role of a platform engineer or platform team.
The platform engineer starts by cataloging the infrastructure and tools already used in the organization. They then interview developers to understand their challenges and build the internal developer platform with tools and services that solve problems at an enterprise level. They build the platform so that it is flexible and accommodates different architectures and deployment styles. Platform engineers also create documentation and conduct training sessions to help developers self-serve the platform. It is ideal for platform engineers to have a developer background so they know what it is like to be a developer and understand the challenges better.
Once the platform is ready, platform engineers onboard application developers. This requires internal marketing: letting teams know about the platform and what it can solve. The best way to onboard users is to pull them to the platform rather than throw the platform at them. This can be done by starting with a small team and helping them overcome a challenge. For example, help a small team optimize its CI/CD pipeline and provide the best experience possible in the process. Word of mouth from early adopters will have a positive ripple effect throughout the organization, which will help onboard more users to the platform.
Platform engineering does not stop at onboarding the users. It is a continuous process where the platform accommodates emerging tools and technologies and the changing needs and requirements of the users.
The open-source Devtron platform is built to equip platform engineers with a standardized toolchain, which helps developers accelerate software delivery. The Devtron platform helps developers by automating CI/CD, security, and observability across the end-to-end SDLC. Below are some use cases of the Devtron platform:
Devtron makes platform engineering easier by providing all these features inside a user-friendly dashboard. Devtron also has great community support for integrating emerging tools and technologies into the platform. Feel free to log in with your GitHub credentials and get a feel for the platform here: https://preview.devtron.ai/dashboard/login/.