Site Reliability Engineer – Canonical (Remote)

Remote
United States
Posted 3 weeks ago

Canonical, the publisher of Ubuntu, is a pioneer in globally distributed work. As a Site Reliability Engineer, you won’t just be managing cloud tools; you will be perfecting enterprise infrastructure using a model-driven approach. You will manage hundreds of private clouds and Kubernetes clusters across both physical hardware (bare metal) and public clouds. At Canonical, automation is treated as a software engineering problem, requiring deep Python fluency and a scientific mindset to manage open-source operations at massive scale.

  • Location: Globally Remote (Americas/Pittsburgh focus)
  • Experience: Strong background in Linux, Python, and Networking.
  • Core Tech: Ubuntu, OpenStack, Kubernetes, Kubeflow, Kafka, OpenSearch.
  • Travel: Ability to travel internationally twice a year for team sprints.

Full-Stack Open Source Operations

You will work across the entire technology stack, from bare-metal networking and the Linux kernel up to orchestration layers. Canonical’s approach focuses on “model-driven” operations, where complex software like OpenStack and Kubernetes is deployed and managed as reusable code models. This reduces the manual “toil” of traditional sysadmin work and allows for the management of massive distributed estates.

Kubernetes & Application Ecosystem

You will be responsible for the lifecycle of Kubernetes clusters and the open-source applications running on them, such as Kubeflow for AI, Kafka for streaming, and various databases. Your role involves monitoring these applications with an observability-first mindset, identifying incidents before they impact global customers, and ensuring that the entire open-source portfolio meets rigorous enterprise standards.

Automation as Software Engineering

To succeed at Canonical, you must be a software engineer first. You will use Python to build the automation that drives infrastructure. This includes creating recovery pipelines, automating security standards, and implementing metrics-driven scaling. You will move beyond “scripting” to build robust, maintainable software that handles the deployment and maintenance of mission-critical services for global brand-name customers.

Summary: You are at the heart of the open-source world. By applying high-level Python engineering to the challenges of bare-metal and cloud infrastructure, you ensure that Ubuntu-based environments remain the gold standard for enterprise innovation, AI, and IoT.

Job Features

Job Category: DevOps
