Site Reliability Engineer (Remote, US)

Remote
United States
Posted 1 month ago

Red Hat is seeking a Site Reliability Engineer (SRE) to join the team responsible for developing, scaling, and operating its OpenShift managed cloud services. The role focuses on running Red Hat’s enterprise Kubernetes distribution at scale and demands expertise in coding, operations, and large-scale distributed system design. The position is fully remote within the US; the listing specifically names Colorado (CO), Washington (WA), and California (CA).

The salary range for this role is $94,550.00 to $191,840.00 annually, with the final offer based on qualifications, experience, and location.


Key Responsibilities and Contributions

As an SRE, you will be a core contributor to the service’s reliability and scalability, working within a small, agile, global team that practices continuous improvement and blameless postmortems.

  • Code and Development: Contribute code to increase the scalability and reliability of the service. You will also contribute software tests and participate in peer reviews to ensure code quality.
  • Automation and Efficiency: Focus on eliminating work through automation and making the monitoring system more sustainable.
  • Operational Excellence: Participate in a regular on-call schedule, including occasional weekend and holiday shifts (which are paid), and practice sustainable incident response and blameless postmortems.
  • Support and Mentoring: Resolve customer issues escalated from the Global Support team and help develop peers’ capabilities through knowledge sharing, mentoring, and collaboration.
  • Agile Work: Work within a small agile team to plan, develop, and improve SRE software, and to continuously improve the team's own practices.

Required Experience and Technical Skills

The ideal candidate will have a strong blend of software engineering and cloud operations expertise.

  • Education & Experience: BS in Computer Science or a related technical field, or equivalent experience.
  • Software Engineering: 3+ years of software engineering experience with at least one programming language such as Python, Golang, Java, C, or C++. Golang is preferred.
  • Cloud Operations: 3+ years of experience managing Linux-based systems in a public cloud (AWS, GCP, or Azure).
  • Monitoring: 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is preferred.
  • Cloud & Containers:
    • 1+ year experience delivering hosted cloud services.
    • 1+ year experience with Kubernetes.
    • 1+ year experience with containers on Linux.
  • Technical Fundamentals: Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP.
  • Soft Skills: Excellent communication skills in a global team environment and a demonstrated ability to quickly and accurately troubleshoot systems issues.

Benefits and Company Culture

Red Hat offers a comprehensive benefits package applicable to full-time, permanent US associates, including medical, dental, and vision coverage, a 401(k) with employer match, paid time off and holidays, and paid parental leave.

Red Hat emphasizes an inclusive culture built on open source principles, encouraging associates from diverse backgrounds to share ideas and challenge the status quo. The company is an equal opportunity and affirmative action employer.

Job Features

Job Category: Cloud Engineer
