Senior Director, Site Reliability Engineering
Who We Are
Founded in 2005, 2K Games is a global video game company, publishing titles developed by some of the most influential game development studios in the world. The studios responsible for developing 2K’s portfolio of world-class games across multiple platforms include Visual Concepts, Firaxis, Hangar 13, CatDaddy, Cloud Chamber, and HB Studios. Our portfolio of titles is expanding due to our global strategic plan, building and acquiring exciting studios whose content continues to inspire all of us! 2K publishes titles in today’s most popular gaming genres, including sports, shooters, action, role-playing, strategy, casual, and family entertainment.
Our team of engineers, marketers, artists, writers, data scientists, producers, thinkers, and doers are the professional publishing stewards of our growing library of critically acclaimed franchises such as NBA 2K, Battleborn, BioShock, Borderlands, The Darkness, Mafia, Sid Meier’s Civilization, WWE 2K, and XCOM.
At 2K, we pride ourselves on creating an inclusive work environment, which means encouraging our teams to Come as You Are and do your best work! We are dedicated to diversity and inclusion, and want our community of candidates to reflect this commitment. We encourage all qualified applicants to explore our global positions.
2K is headquartered in Novato, California and is a wholly owned label of Take-Two Interactive Software, Inc. (NASDAQ: TTWO).
What We Need
We are looking for a Senior Director of Site Reliability Engineering (SRE) who will report to the VP, Technical Infrastructure and Security. This role leads the SRE teams that support our production and development environments. Strong leadership, technical, and communication skills are required as you work with internal game teams and customers to adopt and implement modern, scalable architecture patterns.
What You Will Do
- Develop, manage and drive the SRE teams' strategic direction to enable engineering teams to ship and release reliably, securely and efficiently.
- Lead and scale SRE teams to focus on root cause analysis, pattern identification and continuous improvement in order to optimize application performance, resilience and reliability.
- Develop and implement SRE best practices and techniques including detecting and responding to issues, and restoring applications/services across business domains.
- Build a metrics-driven approach, including SLIs, SLOs, and SLAs, to ensure the stability and security of enterprise cloud services.
- Establish and track OKRs to measure overall progress for the SRE program.
- Architect and operate highly resilient systems in a multi-datacenter global environment serving game and consumer services.
- Partner with cross-functional teams to support and improve our overall security posture and our patch management, disaster recovery and business continuity efforts.
- Establish working processes with software engineering teams to support our innovation efforts.
- Define and implement standards that will affect systems, services and multiple software environments.
Who We Think Will Be a Great Fit
- 10+ years of experience in the SRE or system engineering fields
- 5 years of experience coaching managers and senior technical talent
- 3+ years of experience establishing and maturing an SRE practice.
- Experience hiring and managing teams with scope over highly available systems in a cloud environment.
- Experience managing stakeholder relationships, communicating with and addressing the company’s executive leadership team.
- Proven leadership skills with a focus on adapting and changing as the organization or environment requires, and the ability to create an environment of ownership that continuously challenges the team to evolve with the SRE function.
- Deeply knowledgeable about modern infrastructure management tools and processes.
- Relentless focus on availability, security, and performance
- Strong software engineering fundamentals, with experience leading system design, architecture and tooling
- Fluent in at least one modern programming language and a good understanding of code management principles
- Experience architecting and maintaining large-scale distributed infrastructure that spans terrestrial and cloud datacenters
- Experience architecting microservices using virtualization or containerization
- Experience developing software for highly scalable, distributed systems
- A background of building CI/CD pipelines in Jenkins, TeamCity, or Spinnaker
- Experience in gaming or similar industries that combine large-scale, internet-facing systems with a software development and entertainment services culture
- Familiarity with common source code repositories and infrastructure as code methodologies
As an equal opportunity employer, we are committed to ensuring that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform their essential job functions, and to receive other benefits and privileges of employment. Please contact us if you need reasonable accommodation.
Please note that 2K Games and its studios never use instant messaging apps or personal email accounts to contact prospective employees or conduct interviews. When emailing, we only use 2K.com accounts.