Reliability as Code pioneer Cortex announced that it has secured $2.5 million in seed funding led by Sequoia Capital. The new funds will accelerate development of the Cortex platform, which enables engineering leaders and site reliability engineers (SREs) to move beyond manual processes to gain visibility and control of rapidly expanding microservices.
Bogomil Balkansky, a partner at Sequoia, joins the Cortex board of directors. Additional investors in the round include Y Combinator; Scott Belsky, CPO at Adobe; Gokul Rajaram, board member of Pinterest and Coinbase; Sam Lambert, CPO of PlanetScale; Manik Gupta, former CPO of Uber; and Mathilde Collin, CEO and founder of Front.
“Microservices are becoming ubiquitous because they help engineering teams move faster and with lower risk,” said Balkansky. “But microservices proliferation has side effects: engineering and SRE leaders are challenged to track what services exist, how they depend on one another and what their quality is. The Cortex platform puts all that information at their fingertips, and makes it easy to launch team-wide initiatives to improve service quality.”
“The rapid rate with which we have moved from monolithic application architectures to microservices, and the rapid service ‘sprawl’ that we have seen as a result have left DevOps and SRE teams in precarious positions,” said Torsten Volk, managing research director, Enterprise Management Associates. “The fact is that no organization knows everything that’s going on in their environments. This leads to development inefficiencies, internal conflict, wasted time and high costs. Platforms that give these teams the ability to optimize their microservices mix with Reliability as Code provide strong value in the market today.”
From legacy manual processes to Reliability as Code
Engineering and SRE leaders today depend on tribal knowledge and spreadsheets to track and optimize microservices, leading to surprise outages, security vulnerabilities and lost time and money. The Cortex Reliability as Code platform is designed to automatically score services against best practices, migrations and SLOs. With the shift from legacy manual processes to Reliability as Code, Cortex provides a single pane of glass for visualizing service ownership, documentation and performance history, giving engineering and SRE teams the visibility and control they need, even as teams shift, people move, platforms change and microservices continue to grow.
“Cortex built bridges among our SRE and our engineering team by providing a single source of truth for Production Readiness Review (PRR) for our microservices,” said Riadh Amari, senior SRE manager at Namely. “Now engineering leaders and SREs aren’t just constantly reminding people to adhere to best practices. They provide critical data on service performance and improve reliability consistently and with minimal friction. Cortex has completely changed their role and elevated their strategic importance to the entire organization.”
“Cortex is a great tool for tracking and managing migrations and best practices. It makes answering complex questions around the number of services and service ownership very simple and a matter of just a few clicks,” said Tanmay Sardesai, software engineer at Clever. “Cortex has also eliminated many spreadsheets that we used to track migrations. It is slowly becoming the one-stop shop for tracking service performance and metadata in our org.”
The Cortex platform features an automatic onboarding workflow that scans all potential microservices sources, discovers the microservices, maps metadata to them and infers critical information such as ownership and on-call rotations. The platform then generates a dashboard illustrating relevant metrics for each of more than 30 third-party tools, including Datadog, SonarQube, Snyk and PagerDuty.
These integrations let engineers quickly access on-call rotations, latency dashboards, open vulnerabilities and more in one searchable dashboard. Each integration includes individual rules that enable users to drill down into specific metrics to grade the quality of their services via customizable “scorecards.” This enables development of production readiness checklists, security audits and evaluations of operation and development maturity.
New capabilities of the platform include:
- Microservices Quality Scorecards keep teams accountable for following SRE, security and infrastructure best practices by ‘gamifying’ service quality. Scorecards stack-rank services based on detailed performance metrics, which motivates service owners either to improve performance or to decommission unused or unmanaged services.
- Cortex Query Language (CQL), a domain-specific language, lets users write granular rules about the health of deploys, SLOs, on-call rotations, security vulnerabilities, package versions and more, such as, “if the service is a production service, then there must be an on-call rotation and greater than 85% test coverage.” This enables engineering and SRE leaders to track and enforce service quality across the entire engineering organization. Engineers can set reliability standards across teams and types of services through direct integrations with a variety of tools.
- Initiatives help drive progress towards meeting service metrics over a specified time period, letting users set goals and deadlines in a scorecard, making service quality a moving target for the team. Cortex messages service owners over Slack or email with their action items, which engineers can also see when they log in to the Cortex dashboard. Initiatives align the team towards a common goal within a scorecard and help accelerate progress as a result.
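To make the capabilities above concrete, here is a minimal Python sketch of how a rule such as the quoted production-readiness example could be evaluated, how services could be stack-ranked by score, and how failing rules could become action items. It is an illustration only: it is not CQL syntax and not the Cortex API, and every name in it (Service, production_readiness, has_on_call, score, stack_rank, action_items) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Service:
    """Hypothetical service record; field names are assumptions for this sketch."""
    name: str
    is_production: bool
    has_on_call_rotation: bool
    test_coverage: float  # percent, 0 to 100


Rule = Callable[[Service], bool]


def production_readiness(svc: Service) -> bool:
    """If the service is a production service, it must have an on-call
    rotation and greater than 85% test coverage. Non-production services
    pass this rule trivially."""
    if not svc.is_production:
        return True
    return svc.has_on_call_rotation and svc.test_coverage > 85.0


def has_on_call(svc: Service) -> bool:
    """Every service, production or not, should have an on-call rotation."""
    return svc.has_on_call_rotation


def score(svc: Service, rules: Dict[str, Rule]) -> float:
    """Fraction of rules the service passes, between 0.0 and 1.0."""
    passed = sum(1 for rule in rules.values() if rule(svc))
    return passed / len(rules)


def stack_rank(services: List[Service], rules: Dict[str, Rule]) -> List[Tuple[str, float]]:
    """Order services from highest to lowest score, ties broken by name."""
    scored = [(svc.name, score(svc, rules)) for svc in services]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def action_items(svc: Service, rules: Dict[str, Rule]) -> List[str]:
    """Names of the rules a service currently fails, in rule order."""
    return [name for name, rule in rules.items() if not rule(svc)]


RULES: Dict[str, Rule] = {
    "production-readiness": production_readiness,
    "on-call-rotation": has_on_call,
}

SERVICES = [
    Service("checkout", is_production=True, has_on_call_rotation=True, test_coverage=91.0),
    Service("billing", is_production=True, has_on_call_rotation=False, test_coverage=95.0),
    Service("sandbox", is_production=False, has_on_call_rotation=False, test_coverage=10.0),
]

if __name__ == "__main__":
    for name, value in stack_rank(SERVICES, RULES):
        print(f"{name}: {value:.0%}")
```

In this sketch, the list returned by action_items is the kind of per-owner to-do list that an initiative could send out with a deadline attached; how Cortex actually computes scores and composes those messages is not described in this announcement.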
“Cortex is working to address the pains and pressures that we and our colleagues at other leading tech companies felt when we were in the DevOps and SRE trenches and that are leading to burnout across these teams all over the world,” said Anish Dhar, CEO and co-founder, Cortex. “The rapid sprawl of microservices, combined with the archaic processes currently being applied to their management, are creating an unsustainable future for SREs and for the cloud. We are very pleased to have brought this much-needed Reliability as Code platform to market and are looking forward to seeing what our users will be able to achieve by leveraging the platform.”