Site Reliability / GitOps Engineer

Remote
Posted 1 month ago

An opportunity is available for a Site Reliability / GitOps Engineer to join the Information Systems (IS) team at Canonical, the leading provider of Ubuntu and open-source software to global enterprise and technology markets. This role is a unique opportunity for an “automation-first” technologist to manage and evolve the core IT production services used by over 60 million Ubuntu users worldwide.

This is a full-time, remote position, available globally in any timezone.


Role Summary and Automation Leadership Mandate

This SRE & GitOps Engineer will drive operations automation to the next level across Canonical’s private and public clouds. The role combines deep hands-on expertise with infrastructure as code (IaC) and software development practices to ensure the reliability and scalability of Canonical’s services and products.

As a Site Reliability / GitOps Engineer, you will:

  • IaC & Automation: Apply your IaC experience to develop the infrastructure-as-code practice within IS by continually increasing automation and improving IaC processes. Automate software operations for reusability and consistency across private and public clouds.
  • Resilience & Development: Develop new features and improve the resilience and scalability of the existing cloud and container portfolio. You’ll be given uninterrupted development time to focus on large-scale projects and automation of manual tasks.
  • Operational Responsibility: Maintain operational responsibility for all of Canonical’s core services, networks, and infrastructure. Carry final responsibility for time-critical escalations.
  • Observability & Troubleshooting: Develop skills in troubleshooting, capacity planning, and performance investigation. Set up, maintain, and use observability tools such as Prometheus, Grafana, and Elasticsearch.
  • Collaboration & Improvement: Collaborate with development teams to design service architecture, documentation, playbooks, and operational procedures. You will also improve Canonical products and open-source technologies by providing critical feedback (filing bugs and sometimes submitting pull requests).
  • GitOps Practice: Utilize version control, peer review, and CI/CD to roll out changes to both applications and infrastructure, defining operations entirely in code.

Required Experience and Technical Qualifications

The ideal candidate is a Linux and automation expert with a strong modern engineering background, capable of operating distributed systems and solving complex, full-stack problems.

  • IaC & GitOps Expertise: Deep experience defining operations in code, using version control, peer review, and CI/CD to roll out changes.
  • Engineering Background: Strong modern engineering background (peer-review, unit testing, SCM, CI/CD, Agile).
  • Programming: Python software development experience, particularly with large projects.
  • Linux & Networking: Practical knowledge of Linux networking, routing, and firewalls. Hands-on experience administering enterprise Linux servers.
  • Systems Knowledge: Affinity with various forms of Linux storage (from Ceph to databases). Proficiency with cloud computing concepts and technologies.
  • Education: Bachelor’s degree or greater, preferably in computer science or a related engineering field.
  • Attributes: Motivated and able to troubleshoot from kernel to web. Passionate about and familiar with open source, especially Ubuntu or Debian.

Job Features

Job Category: Information Technology, Product Management, Software Engineering
