
Site Reliability Engineer
- Southbank, VIC
- Permanent
- Full-time
- Design, implement, and maintain developer-friendly tools to improve productivity, code quality, and deployment efficiency for Kubernetes-based workloads.
- Identify bottlenecks in integration and deployment pipelines and implement enhancements to support faster, more reliable deployments to on-premise and cloud Kubernetes clusters.
- Collaborate with development teams to enable self-service tooling for managing deployments, logs, and infrastructure resources in Kubernetes environments.
- Continuously improve build, test, and deployment automation for Kubernetes infrastructure across on-premise and cloud environments (AWS/GCP).
- Provide better visibility into Kubernetes environments through improved observability tools, dashboards, and metrics.
- Manage and improve Kubernetes orchestration across on-premise infrastructure and AWS/GCP clusters to ensure reliability, scalability, and consistency.
- Enhance observability by implementing robust monitoring, logging, and alerting solutions tailored to Kubernetes workloads, using tools such as Grafana and Loki or cloud-native services such as CloudWatch (AWS) and Stackdriver (GCP).
- Collaborate with Engineering Leadership to implement reliability engineering practices such as load testing, chaos testing, and recovery mechanisms for Kubernetes services.
- Bachelor's or Master's degree in Computer Engineering (or equivalent experience)
- 2+ years in Software or Systems Engineering
- Experience automating for scale using tools like Ansible, Terraform, Helm, and ArgoCD
- Software development in at least one language such as Go or Python
- Experience in building and maintaining container platforms, such as Kubernetes
- Expert in observability platforms such as Grafana and Prometheus
- Experienced in using and tuning cloud-native technologies
- Solid understanding of basic Linux and cloud networking (e.g., routing, firewalls, DNS, VPCs, subnets, load balancers).
- Salary continuance insurance
- NEP Days - additional 5 days of leave per year (conditions apply)
- NEP Travel benefits & discounts including Qantas Club Membership
- Discounts through Employment Hero Work app
- Employee Assistance Programme
- NEP's Live Production solutions range from AV services and live audience enhancements to traditional outside broadcast and cutting-edge centralized and cloud production.
- NEP's Virtual Production solutions start at the creative stage and end with exceptional execution across ICVFX, augmented reality, LED stages and more.
- NEP's Media Processing solutions provide the tools and products our clients need to ingest, edit, store, search, manage and distribute their digital assets to rights holders across multiple platforms.