How Did India’s Largest Interior Design Company Enable Developer Self-Service With a 3X Gain in Productivity?

Who is the customer?

Livspace is a one-stop shop for all things interiors, including interior design and renovation services. They are spread across Southeast Asia and the Middle East, with operations in India, Singapore, Malaysia, and Saudi Arabia. They rely on technology to serve their customers. Their customer journey is fully digital, from onboarding customers on their platform through to design and delivery. Their business depends on reliable backend operations.

Backstory?

The customer did not want to compromise on technology. They kept pace with technological advancements in the cloud native space. As early adopters of Kubernetes, they relied on Helm to perform their deployments. They also used Jenkins as a CI tool to build and package the apps into Docker images. The result was a custom CI/CD solution that depended on multiple in-house scripts.

As they scaled to multiple geographies with multiple clusters, the lifecycle management of Helm packages, workflows, and scripts became a burden. DevOps spent a lot of time supporting developers in their day-to-day activities.

They knew the complexities and issues they would face as the company grew. They wanted to standardize deployments and infrastructure management while enforcing rules, but they needed the right set of tools and controls to do so.

What challenges did they face?

They were already on Day 1 of operations and had grown from a few microservices to hundreds of them with the help of in-house internal developer platforms (IDPs). At this point, they had moved from operating a single cluster to managing clusters across regions.

As they scaled, the focus was on enabling developers, and that work consumed most of the DevOps team's time. The tightly coupled CI/CD pipelines running on a legacy Kubernetes version became a grim reminder that they needed an overhaul. With the legacy Kubernetes version nearing end of life, they looked for a modern platform to help them shift from their existing system to a more cloud-native ecosystem.

Empowering Developers for self-serve

Developers in the existing environment lacked visibility and control over their releases (sometimes called the downstream value chain). With a blind downstream process, developers required the assistance of DevOps engineers to troubleshoot their issues. At a scale of 100+ microservices, developers had to wait long hours to get details on their deployed resources. This waiting consumed developers' valuable time and reduced their productivity.

While the waiting cost developers time, the effort DevOps put into supporting them did not add value to the DevOps team's core function. DevOps desperately needed a platform to relieve them of the everyday operations of pipeline lifecycle management.

Value Realization on Day 2 Operations

They were stuck with legacy Kubernetes and needed help to upgrade their ecosystem to scale and operate multiple clusters. Deployments became a bottleneck, and triage was chaotic. The bottleneck arose primarily because developers lacked control and visibility.

Granular Access Control, Across Clusters

In Kubernetes, access control is configured through the API server's access control mechanisms, such as role-based access control (RBAC). The issue is that DevOps must configure it from scratch for every new cluster. Each new cluster adds a lot of repeat work, because DevOps has to configure the existing users all over again. Managing access control for the team became a problem.
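To make the repeat work concrete, here is a minimal sketch of how one list of access grants could be expanded into Kubernetes RBAC `RoleBinding` manifests for every cluster. It does not describe how Devtron or Livspace implemented access control. The user, namespace, and cluster names are hypothetical, and the sketch assumes the built-in `view` and `edit` ClusterRoles are sufficient.

```python
from typing import Dict, List, Tuple

RBAC_API_GROUP = "rbac.authorization.k8s.io"

# (user, cluster role, namespace); all names here are hypothetical examples.
Grant = Tuple[str, str, str]


def role_binding(user: str, role: str, namespace: str) -> dict:
    """Build one RoleBinding manifest granting `role` to `user` in `namespace`."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{user}-{role}", "namespace": namespace},
        "subjects": [
            {"kind": "User", "name": user, "apiGroup": RBAC_API_GROUP}
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": role,
            "apiGroup": RBAC_API_GROUP,
        },
    }


def bindings_for_clusters(
    grants: List[Grant], clusters: List[str]
) -> Dict[str, List[dict]]:
    """Repeat the same set of grants for every cluster."""
    return {
        cluster: [role_binding(user, role, ns) for user, role, ns in grants]
        for cluster in clusters
    }


grants = [("asha", "view", "payments"), ("ravi", "edit", "catalog")]
manifests = bindings_for_clusters(grants, ["prod-in", "prod-sg"])
```

Each new cluster multiplies the number of manifests to apply and keep in sync, which is the overhead a central access layer is meant to remove.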

Transfer and Documentation of Tribal knowledge

A higher degree of Kubernetes maturity in an organization makes retaining top talent a priority. Organizations need to pass the tribal knowledge of senior engineers on to junior engineers so that best practices and expertise aren't lost when someone leaves the team or organization. Because the existing DevOps platforms relied heavily on the know-how of DevOps engineers and developers to operate, it was essential to document and transfer that knowledge. But writing everything down and training every developer and engineer was not feasible.

Why was Devtron right for them?

While they were looking at rebuilding their CI/CD ecosystem, they came across Devtron. Open source was the way forward for them, as open-source products provided far better capabilities than their closed-source counterparts and helped keep the cost of operations down.

The customer decided to leverage AWS EKS to address a variety of challenges. EKS provided multi-geography availability, multi-cluster support, and cluster management services, among other benefits.

While experimenting with Devtron, they noticed the frequent product releases. The fact that Devtron uses its own platform for its product releases gave them confidence that it would be future-proof. Devtron also integrates well with AWS EKS and supports it. They immediately hit it off with Devtron, and the platform's features quickly addressed their pain points. After a three-month trial, they were convinced that Devtron had the workforce, capabilities, and expertise to provide timely support.

The platform provided the developers with data and metrics for anomaly detection and troubleshooting in the downstream value chain. This capability encouraged the developers to participate more in triage activities and ensured they fixed minor issues before those issues reached DevOps personnel. This significantly improved the developer experience, which is what they were looking for. Devtron’s Kubernetes dashboard is far more advanced than the standard Kubernetes dashboard, specializing in multi-user and multi-cloud visibility.

Developer infra knowledge gap: bridged by Devtron

Developers did not have to interact with Kubernetes directly as the platform formed an abstraction over it and delivered everything on an easy UI. The UX simplified their day-to-day operations and elevated their maturity on the platform. With minimal experience in Kubernetes, a developer could perform end-to-end operations on Kubernetes.

DORA reports to leadership

Devtron had a trump card: the observability dashboard for leaders. A single DORA metric panel provided leaders with the four key DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) by collecting hundreds of data points. This ensured that they could compare themselves with the standards set by DORA. They did not have to spend extra time capturing or projecting them. These metrics are available for each service, so each service owner can quickly figure out the areas where they excel or lack agility.
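As an illustration of what such a panel aggregates, the sketch below computes the four DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) from a list of deployment records. It is not Devtron's implementation. The `Deployment` record, its fields, and the choice of a median for lead time and a mean for restore time are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def _hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deployments: List[Deployment], window_days: int) -> dict:
    """Summarise the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")

    lead_times = [_hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        _hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        # None when no failed deployment has been restored yet.
        "mean_time_to_restore_hours": mean(restore_times) if restore_times else None,
    }


base = datetime(2024, 1, 1)
history = [
    Deployment(base, base + timedelta(hours=2)),
    Deployment(base, base + timedelta(hours=4)),
    Deployment(base, base + timedelta(hours=6), failed=True,
               restored_at=base + timedelta(hours=6, minutes=30)),
    Deployment(base, base + timedelta(hours=8)),
]
report = dora_metrics(history, window_days=2)
```

Running the same function per service gives each service owner their own view of the four numbers.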

The Migration Project

Stage 1: Open Source Trial

They had an experienced DevOps team and could integrate Devtron into one of their infrastructures. The ease of use made the developers and DevOps engineers of that particular team happy. We also shipped them quick feature updates for their use cases. With our priority support, they were convinced they could let go of the GitLab/Jenkins setup and adopt Devtron as their sole DevOps platform.

Stage 2: Migrate CRDs and K8s apps

It takes less than 3 minutes to create functional end-to-end pipelines for deployment on AWS EKS. Livspace onboarded their applications onto the Devtron platform in less than a week. The migration included all necessary configurations and security policy enforcement. For the custom CRDs and third-party apps, the Devtron development team quickly built all necessary custom plugins, making the transition to the new system (which included AWS EKS) seamless for the whole team.

“We migrated multiple staging environments (yes, the entire environment of 150+ services) in less than a week.”

Stage 3: Zero tolerance security

Devtron workflows are pre-configured to perform container scanning using Clair. With a simple configuration, vulnerability scanning was made mandatory for each pipeline across the organization. The compliance team ensures that all vulnerabilities are addressed before images are pushed into production. The pipeline is configured to notify authorized personnel on vulnerability detection, and deployments are halted for that microservice until approved.
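The gating behaviour described above can be sketched as a small policy function. This is an illustrative assumption, not Devtron's or Clair's actual API: the finding format, the severity ordering, and the `block_at` threshold are all hypothetical.

```python
from typing import Dict, List

# Assumed severity scale, lowest to highest; real scanners may differ.
SEVERITY_ORDER = ["negligible", "low", "medium", "high", "critical"]


def evaluate_scan(
    findings: List[Dict[str, str]],
    block_at: str = "high",
    approved: bool = False,
) -> dict:
    """Decide whether a deployment may proceed after an image scan.

    Findings at or above `block_at` halt the deployment and trigger a
    notification, unless an authorized person has approved the release.
    """
    threshold = SEVERITY_ORDER.index(block_at)
    blocking = [
        f for f in findings
        if SEVERITY_ORDER.index(f["severity"]) >= threshold
    ]
    return {
        "deploy": approved or not blocking,
        "notify": bool(blocking),
        "blocking_ids": [f["id"] for f in blocking],
    }


scan = [
    {"id": "CVE-A", "severity": "low"},
    {"id": "CVE-B", "severity": "critical"},
]
halted = evaluate_scan(scan)
released = evaluate_scan(scan, approved=True)
```

Keeping the decision in one function means the same rule applies to every pipeline, which is the point of making scanning mandatory across the organization.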

Stage 4: Limit access via VPN and token

They did not want their dashboard exposed on a public IP. We enabled VPN access for all their users looking to log in to the dashboard. To take security a notch higher and guard against social engineering, users are authorized with a one-time token that expires after 24 hours of inactivity.
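An inactivity-based expiry like the one described can be sketched as a sliding window: every successful use resets the clock, and a token that goes unused for 24 hours is rejected. The `SessionToken` class below is a hypothetical illustration and not Devtron's token implementation.

```python
from datetime import datetime, timedelta

INACTIVITY_LIMIT = timedelta(hours=24)


class SessionToken:
    """Hypothetical token that expires after a period of inactivity."""

    def __init__(self, issued_at: datetime) -> None:
        self.last_used = issued_at

    def is_valid(self, now: datetime) -> bool:
        """True while less than 24 hours have passed since the last use."""
        return now - self.last_used < INACTIVITY_LIMIT

    def touch(self, now: datetime) -> bool:
        """Record a use; returns False (and changes nothing) if expired."""
        if not self.is_valid(now):
            return False
        self.last_used = now
        return True


issued = datetime(2024, 1, 1, 9, 0)
token = SessionToken(issued)
used_next_morning = token.touch(issued + timedelta(hours=23))
still_valid = token.is_valid(issued + timedelta(hours=46))
expired = not token.is_valid(issued + timedelta(hours=48))
```

Here the token survives past the 24-hour mark because it was used at hour 23, but it is rejected once a full 24 hours pass without activity.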

“All these factors ushered a new era in our Tech ecosystem. We started migrating all of our apps onto Devtron and were able to soon onboard 150+ microservices in a matter of a few weeks. Earlier, we couldn’t have thought of achieving it in months (maybe even quarters).”

What benefits did they realize?

They are a marquee customer who migrated from GitLab and Jenkins. The Devtron platform meets their demands of empowering developers and bringing uniformity into their processes and infrastructure. They went from 100 microservices to over 200 microservices with over 1000 pipelines.

Devtron’s unified interface, the “Devtron Dashboard,” created the biggest impact on the operations. It provides an intuitive dashboard that encapsulates everything an application owner needs to deploy and debug their apps smoothly. It became an instant hit when launched for all their internal application teams.

“Software releases to production which used to take months or sometimes quarters earlier, have now come down to nearly a week, and even less than a day for some services.”

Increased developer productivity

With Devtron, application teams bypassed the extensive Kubernetes learning curve, as the low-code platform enabled developers to handle end-to-end operations on AWS EKS efficiently. Granular access gave developers visibility into the downstream value chain, empowering them to perform triage without the help of DevOps. The unified dashboard interface enabled them to perform real-time application troubleshooting in a single place. Developers now deploy on demand, and the lead time for changes is less than 12 hours.

Reduced failure rates

Enforcing infrastructure policies and other container security measures kept failures in production to a minimum. The zero-tolerance policies were crucial in reducing release error rates.

The developer-friendly UI provides a custom Kubernetes dashboard with augmented Helm app management to help engineers and developers find and troubleshoot errors quickly. The developers have adopted Devtron as their exclusive IDP for daily operations. They are considered an elite performer, with at least one deployment a day and a failure rate of only 7%.

Faster Recovery

A higher velocity of deployments raises the probability of failures. Failures are bound to happen, so it is essential that they do not increase with the scale of deployments, and Devtron ensured that they did not. They can recover from failures in less than 30 minutes. The average across the organization is 27 minutes.

Multi-cluster management

The primary challenge for an organization running a Day 1 operation on AWS EKS is multi-cluster management. EKS is namespace-centric, which makes operations on one cluster easy, but working across clusters is a hassle for both developers and DevOps.

Any addition or deletion has to be repeated for each EKS cluster. Scripts can do the job, but they are neither flexible nor efficient. Devtron's multi-cluster support made future cluster upgrades simple. Spawning new namespaces and environments is easier than ever.

GitOps at scale

ArgoCD is an optional plugin that allows DevOps to enable GitOps for their workflows. This flexibility allows teams across clusters to run independent operations while adhering to security best practices. The high degree of automation in the workflows reduces the workload of DevOps.

Read more from the customer themselves: https://blog.livspace.io/how-livspace-revolutionised-its-ci-cd-saga-3120724e271b