Site Reliability Engineer (SRE) – Database (USA & Canada)

Remote

US/Canada

Posted 5 months ago

We are seeking a Site Reliability Engineer (SRE) with dedicated expertise in database management and optimization to join the DevOps team. This mid-level role calls for hands-on experience managing and optimizing SQL and NoSQL database infrastructure at scale, keeping systems resilient, performant, and secure through code.

This is a Full-Time, Remote position available in the USA & Canada. The compensation range is $120,000 – $150,000 USD.

Role Summary and Database Reliability Mandate

This SRE role sits at the critical intersection of database administration, software engineering, and operations. You will be responsible for applying SRE principles to mission-critical data systems (e.g., Postgres, MongoDB), driving continuous improvement in availability and performance through automation, advanced monitoring, and rigorous incident response.

Key Responsibilities:

Database Reliability & Performance: Maintain and optimize SQL and NoSQL database systems, focusing on improving availability, latency, and scalability.
Automation & IaC: Design and implement automation for provisioning, configuration, and maintenance using Python, Bash, and Infrastructure-as-Code (IaC) tools like Terraform or Ansible.
Observability & Monitoring: Own the setup and refinement of monitoring systems (e.g., Prometheus, Grafana, Datadog) to ensure deep visibility into database health and anomalies.
Incident Management: Lead or contribute to on-call rotations, triage production issues, and perform thorough Root Cause Analysis (RCA) to drive long-term reliability.
Performance Tuning: Analyze slow queries, indexing strategies, and schema design to improve database efficiency and throughput.
Resilience & Security: Implement and validate robust backup, recovery, and disaster readiness strategies. Enforce database security policies, access controls, and compliance best practices.
Collaboration: Partner with software engineers, data engineers, and DevOps teams to align database architecture with application needs and business continuity goals.

Required Experience and Technical Expertise

The ideal candidate is an experienced SRE or infrastructure engineer who has worked hands-on with production database systems in a cloud-native environment.

Experience (Required):
- 5+ years in SRE, DevOps, or infrastructure engineering roles, with direct experience supporting production database systems.
Technical Expertise:
- Solid experience with relational and NoSQL databases (e.g., Postgres, SQL Server, MongoDB), including comfort with query optimization, replication, and failover.
- Proficiency in Python, Bash, or similar scripting languages.
- Experience with CI/CD pipelines and Infrastructure-as-Code tools.
- Hands-on experience with cloud platforms (AWS, Azure, GCP), containers (Docker), and container orchestration (Kubernetes).
Mindset: Demonstrated ability to troubleshoot complex systems independently and drive resolution. Strong written and verbal communication skills to influence and align stakeholders.
Education: Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.

Job Features

Job Category

Data, DevOps, Software Engineering
