Prodvana Architecture Part 1: Overview
Naphat Sanguansin
Platform Engineering is a force multiplier for velocity, reliability, and efficiency. Platform teams reduce friction between the application and infrastructure and sustainably accelerate the business.
In 2019, Netflix published a blog on Managed Delivery. At the same time, I was the tech lead of a project to decompose the Dropbox monolith into a platform called Atlas, a system that shared the same properties as Managed Delivery. These initiatives had the same goal: understanding the intent rather than the steps.
What follows is a series of posts that deep dive into the technical architecture of Prodvana’s intent-based deployment system. Each post is structured around a major component.
Part 1 (this post) - Overview and Motivations
Part 2 - the Prodvana Compiler
Part 3 - the Prodvana Convergence Engine
Part 4 - the Prodvana Runtime Interface and Conclusion
The Benefits
Teams experience a 50% increase in deployment frequency.
Teams automatically catch and resolve at least 20% of deployment failures.
Teams can easily stand up new regions and tenants, with the average service needing 5x less configuration.
Teams onboard their first service to Prodvana in under 20 minutes.
Teams use a single workflow to handle varied deployment processes, encompassing both cloud-native and legacy stacks.
Defining the requirements of an intent-based system
Akin to the Managed Delivery approach at Netflix, Prodvana defines “intent” as understanding the unique needs of the application, delivery, and infrastructure.
Requirement 1: Intent Definitions
The system must provide a way to define intent that covers cases across Infrastructure, Delivery, and Applications.
For Infrastructure
Identifying regions where deployments need to occur.
Determining the appropriate machine types for different regions.
Defining the runtime environment (Kubernetes, ECS, VMs, etc.).
Defining which components are available for Delivery and Applications to utilize.
For Delivery
Establishing the order of deployments across various environments.
Setting environment-specific rules, such as maintenance windows and blocking deployments when alerts are firing.
Defining the overall "health" of the system.
For Applications
How to start the service (flags, binary name).
How many replicas to run.
How the application reports its healthiness.
Application-specific configuration, such as backend storage, credentials, and more.
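Taken together, the three layers of intent above could be expressed as structured configuration. The sketch below is illustrative only; all field and type names are assumptions, not Prodvana's actual configuration schema.

```python
from dataclasses import dataclass, field

# Hypothetical intent definitions for the three layers above.
# Every field name here is illustrative, not Prodvana's real schema.

@dataclass
class InfrastructureIntent:
    regions: list[str]                   # where deployments need to occur
    machine_type: str                    # appropriate machine type per region
    runtime: str                         # "kubernetes", "ecs", "vms", ...

@dataclass
class DeliveryIntent:
    rollout_order: list[str]             # e.g. ["staging", "prod"]
    maintenance_windows: list[str]       # environment-specific rules
    block_on_firing_alerts: bool = True  # ban deploys during incidents

@dataclass
class ApplicationIntent:
    binary: str                          # how to start the service
    flags: list[str] = field(default_factory=list)
    replicas: int = 1                    # how many replicas to run
    health_check_path: str = "/healthz"  # how healthiness is reported

intent = ApplicationIntent(binary="api-server", replicas=3)
```

Note that none of these fields describe steps; each states a property the system should satisfy, which is the distinction between intent and imperative pipelines.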
Requirement 2: Adaptive to real-time changes
Operating on intent carries an implied requirement: the system must adapt to real-time changes. For example, an unrelated change may cause an outage that makes deploying unsafe, and a delivery system that operates on intent must adjust accordingly.
Requirement 3: Brownfield, not Greenfield
Based on our experience in infrastructure, we aimed to maximize adoption and avoid migration taxes by taking a brownfield-first approach, which drives technical requirements on interfaces and integration points.
For example, the system must be adaptable to various types of infrastructure, including Kubernetes, virtual machines, and proprietary solutions. The system should integrate existing configurations, such as Kubernetes configurations or ECS task specs, minimizing migration costs without requiring a complete system redesign or massive migration.
Key Terminology
These definitions are used for the remainder of this architecture overview.
Application: A collection of related Services with the same release requirements that roll up to a single product for a team, including internally facing products.
Release Channel: Logical slices and release stages of an Application. In Netflix's Managed Delivery blog post, this is called an environment.
Service: The unit of deployment, the entity being released to all Release Channels of an Application.
Service Instance: An instantiation of a Service in a Release Channel.
Runtime: The backends where Services run, such as Kubernetes clusters, Google Cloud Run regions, and Amazon ECS clusters.
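The relationships between these terms can be sketched as a simple data model. The class and field names below are assumptions for illustration, not Prodvana's internal types.

```python
from dataclasses import dataclass

# Illustrative data model for the terminology above (names are assumptions).

@dataclass(frozen=True)
class Runtime:
    name: str  # a backend where Services run, e.g. a Kubernetes cluster

@dataclass(frozen=True)
class ReleaseChannel:
    name: str  # a logical slice / release stage, e.g. "staging"
    runtime: Runtime

@dataclass(frozen=True)
class Service:
    name: str  # the unit of deployment

@dataclass(frozen=True)
class ServiceInstance:
    service: Service
    release_channel: ReleaseChannel  # one instantiation per channel

@dataclass
class Application:
    name: str
    release_channels: list[ReleaseChannel]
    services: list[Service]

    def instances(self) -> list[ServiceInstance]:
        # A Service is released to all Release Channels of its Application,
        # so instances form the cross product of the two lists.
        return [ServiceInstance(s, rc)
                for s in self.services for rc in self.release_channels]
```

For example, an Application with two Services and two Release Channels yields four Service Instances.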
Dynamic Delivery: How We Implemented Intent-Based Delivery
Dynamic Delivery consists of three primary components, each designed with its own requirements and interfaces, resulting in a meaningful separation of concerns.
Component 1: The Prodvana Compiler
The Prodvana Compiler is responsible for compiling user configurations into a desired state.
Inputs:
Delivery and Infrastructure requirements for the Release Channel
Service-specific configurations
Outputs:
Desired state with one compiled Service configuration per Release Channel
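In essence, the compiler expands a single Service configuration against per-channel requirements, producing one compiled configuration per Release Channel. The sketch below shows the shape of that expansion under a simple merge assumption (channel values override service defaults); the function name and merge rule are illustrative, not Prodvana's actual compiler.

```python
def compile_service(service_config: dict, release_channels: dict) -> dict:
    """Hypothetical compiler pass: merge a Service's configuration with each
    Release Channel's requirements, yielding one compiled configuration per
    channel. Channel-specific values override service-wide defaults."""
    return {
        channel: {**service_config, **overrides}
        for channel, overrides in release_channels.items()
    }

# One Service config, expanded across two Release Channels.
desired_state = compile_service(
    {"image": "api:v2", "replicas": 1},
    {"staging": {"replicas": 1}, "prod": {"replicas": 5}},
)
```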
Component 2: The Prodvana Convergence Engine
The Prodvana Convergence Engine takes the desired states from the Prodvana Compiler and the Services' current-state information from the Prodvana Runtime Interface, and continuously decides what work is needed to reach the desired state.
Inputs:
Desired state from the Prodvana Compiler
Current state from the Prodvana Runtime Interface
Outputs:
Convergence status for Service owners
Apply requests for the Prodvana Runtime Interface
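One iteration of such a loop can be sketched as: fetch the current state, compare it to the desired state, and issue an apply request only when they differ. The function and status names below are assumptions for illustration, not Prodvana's internal API.

```python
def converge_once(desired: dict, fetch, apply) -> str:
    """Hypothetical single iteration of a convergence loop: compare the
    current state from the Runtime against the desired state and apply
    only if they differ."""
    current = fetch()
    if current == desired:
        return "converged"   # status reported to Service owners
    apply(desired)           # apply request sent to the Runtime Interface
    return "applying"

# Toy in-memory "runtime" standing in for a real backend.
state: dict = {"value": None}
status = converge_once(
    {"value": "api:v2"},
    fetch=lambda: dict(state),
    apply=lambda d: state.update(d),
)
```

Running the loop repeatedly is what makes the system adaptive: if reality drifts (or an apply fails), the next iteration observes the new current state and reacts.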
Component 3: The Prodvana Runtime Interface
The Prodvana Runtime Interface is the gateway to user environments. The Prodvana Runtime Interface ensures that Prodvana is backend-agnostic and helps minimize migration costs for various Runtime types. It satisfies two interfaces:
fetch - returns the current state of a Service in the Runtime.
apply - runs the command(s) to bring the Service to a desired state.
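A minimal sketch of that two-method interface, with a toy in-memory backend, might look like the following. The names and signatures are assumptions used to illustrate the shape of the contract, not Prodvana's actual interface.

```python
from typing import Protocol

class RuntimeInterface(Protocol):
    """Hypothetical shape of the fetch/apply contract described above."""
    def fetch(self, service: str) -> dict: ...
    def apply(self, service: str, desired: dict) -> None: ...

class InMemoryRuntime:
    """Toy backend for illustration; a real implementation would talk to
    Kubernetes, ECS, VMs, or a proprietary system behind the same contract."""
    def __init__(self) -> None:
        self._state: dict[str, dict] = {}

    def fetch(self, service: str) -> dict:
        # Current state of the Service in this Runtime.
        return self._state.get(service, {})

    def apply(self, service: str, desired: dict) -> None:
        # Bring the Service to the desired state.
        self._state[service] = desired
```

Because the Convergence Engine only ever talks to this interface, adding a new Runtime type means implementing these two methods, which is what keeps Prodvana backend-agnostic.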
Next Step: Prodvana Compiler Deep Dive
See how the Prodvana Compiler is built to implement intent-based deployments and minimize changes needed to onboard onto Prodvana.