Staff Site Reliability Engineer (Staff SRE)

Job ID: 10135906 | Location: Vancouver, Canada | Walt Disney Animation Studios | Date posted: 2025/10/31

Job Description:

Walt Disney Animation Studios’ world-class filmmakers, artists, and technical collaborators create the magic of animation. Bring your unique talents, passion and ideas to our team and prepare to play in a creative, artist-friendly environment.

We are seeking a Staff SRE with expertise in Linux systems administration and experience with software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. Jenkins), Git source management, cloud hosting (AWS, GCP, and Azure), container computing (e.g. Docker, OCI), and web technologies. The ideal candidate will enjoy the diversity and challenges of working at various levels of the foundational deployment stack, from defining configuration management to developing CI/CD infrastructure and processes.

This role resides within the Platform and Infrastructure team at Walt Disney Animation Studios (WDAS), where we build the tools and manage the infrastructure that artists use daily to create our celebrated animated content. The SRE team within Platform Engineering is focused on optimizing service deployments and improving the availability, latency, performance, efficiency, and observability of systems at WDAS. All projects share the pursuit of simple and performant solutions to complex problems, using Agile and DevOps methodologies as part of high-energy, proficient teams.

Critical to success in this role is an aptitude for working collaboratively with a technical team. You will help to develop and drive requirements and strategies while also supporting services and core services infrastructure.

Our studio thrives on a wide variety of technical backgrounds and experiences, so we encourage applicants to apply even if their experience is not specified below. Bring your unique talents, passion and ideas to our team, and be a part of Disney’s creative legacy!

Responsibilities

As Staff SRE, you will translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, and capacity management, as well as documentation. You will serve as a subject matter expert in multiple areas and be regarded by your fellow team members as a 'go to' individual: someone who has a clear understanding of SRE principles and best practices and can explain them thoroughly to a given audience. To be successful in this role you will continuously uphold and improve all the relevant reliability aspects of our services, with an increased focus on SLIs and SLOs, while raising the reliability of a variety of large-scale user-facing and internal services. As Staff SRE, you will maintain a strong understanding of stakeholder workflows and requirements and translate the targeted solutions into an end-to-end architectural design.

You will work with engineering, creative and production teams in an extremely collaborative and high-energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. You are passionate about constantly learning and applying technology to solve complex problems, and you are a highly motivated, optimistic, proactive, creative thought leader and project manager.

Additional Responsibilities Include:

  • Support a wide range of on-premises and cloud deployments using infrastructure-as-code, self-healing, and security automation patterns, and help others adopt the infrastructure-as-code paradigm

  • Deploy and manage a wide array of on-premises and cloud deployments

  • Develop useful telemetry, alerts, and response procedures to reduce Mean Time To Repair (MTTR).

  • Collaborate and provide technical excellence within and across teams.

  • Consult on best practices and develop tools to enable smooth adoption of good service reliability practices and methods.

  • Identify areas of improvement in reliability, efficiency, and operations.

  • Build tools to help your SRE team quickly pinpoint, isolate and resolve issues related to infrastructure, platform services and applications.

  • Continuously refine monitoring processes, configurations, and thresholds.

  • Practice and promote sustainable incident response and blameless postmortems

  • Develop runbooks and tools to streamline processes and shorten problem resolution time.

  • Write code that improves scalability, performance, maintainability, and security.

  • Add, tune and maintain alert configurations and documentation as needed.

  • Develop and improve CI/CD processes to improve release cadence and success.

  • Use Chaos Engineering principles and methodologies to test what you build under real-world conditions.

  • Mentor SREs, Sysadmins, and Systems Engineers in technical and non-technical SRE responsibilities.

Required Education

  • BS in Computer Science, Computer Engineering, Electrical Engineering or related field

Key Qualifications:

  • 7+ years of experience in SRE, DevOps, technical operations, systems engineering, software engineering or related discipline

  • Proficient, collaborative, and experienced in building reliable, scalable, enterprise systems

  • Excellent communication skills, both verbal and written

  • Passionate and curious about ways to leverage technology while continually learning

  • Skilled in the use of containers and container orchestration systems in enterprise production environments (e.g. Docker, Kubernetes, Rancher, AWS ECS and EKS)

  • Experience with configuration management and infrastructure as code (e.g. Terraform, Helm, CloudFormation, Ansible, and Puppet)

  • Comfortable in one or more of the following languages: Python, Java, Scala, Go, Rust, Ruby, or similar

  • Skilled in Cloud/PaaS/SaaS Environments (e.g. AWS, Azure, Google Cloud Compute)

  • Hands-on experience using source control (Git, GitHub) and feature branching strategies

  • Experience with continuous integration tools (e.g. Jenkins, GitLab CI/CD, AWS CodeBuild, CodeDeploy, Spinnaker)

  • Knowledge of best practices and IT operations in an always-up, always-available service

  • Expertise in scalable testing, automation, continuous integration frameworks and best practices

  • Experience in SDLC, distributed systems, networking, hardware, logistics and operations or capacity planning

  • UNIX/Linux administration, troubleshooting, performance tuning, and security

  • Experience with DevOps methodologies and/or SRE

  • Experience with monitoring and observability tooling such as Datadog, Prometheus, and Grafana

  • Experience with automating infrastructure, deployment and testing using tools like CloudFormation, Ansible or Terraform

  • Experience with Service Level Objectives and Error Budgets

  • Understanding of the principles and methodologies behind Chaos Engineering

Bonus Qualifications:

  • Expertise in web server administration

The Walt Disney Company is an Equal Opportunity Employer.


The hiring range for this position in British Columbia, Canada is CAD $124,200 to $166,700 per year. The base pay actually offered will take into account internal equity and may also vary depending on the candidate’s geographic region, job-related knowledge, skills, and experience, among other factors. A full range of medical, financial, and/or other variable pay or benefits may be offered, depending on the level and position offered.


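About Walt Disney Animation Studios:

Walt Disney Animation Studios, a filmmaker-driven animation studio, has created many films loved around the world by combining masterful artistry and storytelling with groundbreaking technology. From its first fully animated feature film, Snow White and the Seven Dwarfs, released in 1937, through Moana 2, its new release for fall 2024, Disney Animation continues to build a record of innovation and creativity. Other timeless, beloved films include Pinocchio, Sleeping Beauty, The Jungle Book, The Little Mermaid, The Lion King, Frozen, Big Hero 6, Zootopia, and Encanto.

About The Walt Disney Company:

The Walt Disney Company, together with its subsidiaries and affiliates, is a diversified international company that leads the world of family entertainment and media through three business segments: Disney Entertainment, ESPN, and Disney Experiences. Starting as a small animation studio in the 1920s, Disney has become a preeminent name in today's entertainment industry. Disney will continue to create first-class stories and experiences that every member of the family, from children to adults, can enjoy. Disney's stories, characters, and experiences reach consumers and guests in every corner of the world. In more than 40 countries, our employees and cast members work together to create entertainment experiences that are welcomed both globally and locally.

This position is with Walt Disney Animation Studios, which is part of a business we call Walt Disney Pictures.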
