Compute and Storage Allocation Lead

Aug 24, 2024
Austin, United States
Not specified
Intermediate
Full time
Office work


WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_




THE ROLE:

The GPU Technologies and Engineering (G&E) Operations Team is looking for a dynamic, technical, experienced candidate to join our team. As a key contributor to the success of AMD, you will have ownership of the following:

  • Compute and Storage Management: Serve as the BU lead for compute and storage resources; handle ticket generation, user requests, and IT coordination
  • Issue Analysis and Debugging: Lead efforts to debug and resolve compute and storage issues; coordinate with IT for deeper troubleshooting
  • Resource Allocation and Efficiency: Generate resource estimates and conduct efficiency reviews; use scripting and automation to streamline reporting

THE PERSON: 

The ideal candidate will be proactive, detail-oriented, and capable of managing complex compute and storage environments for the SOC and IP hardware and software business units (BU). They will possess strong problem-solving skills, excellent communication abilities, and a proven track record of effectively coordinating with IT and engineers to ensure optimal resource allocation and issue resolution. Understanding of silicon design and related IT needs is required.  Familiarity with scripting and report automation is essential for this role, enabling efficient management and reporting of compute and storage resources.

KEY RESPONSIBILITIES: 

Manage GPU Compute and Storage Allocations: 

  • Manage daily grid arbitration with team points of contact, managers, and projects
  • Oversee all GPU compute allocations and handle user requests for additional compute resources (cores)
  • Serve as the graphics hardware/software focal point for all compute-related activities
  • Generate and manage all tickets related to GPU BU compute
  • Interface with IT as needed to facilitate operations, address issues and ensure adequate storage resources are available and functioning
  • Handle requests from team points of contact for additional storage resources (disks)

Manage SDE Compute and Storage Allocations:

  • Oversee all compute and storage allocations for AMD’s Secure Design Environment (SDE)
  • File all SDE compute and storage tickets, including upgrades and additional disk space
  • Interface with users for troubleshooting and support
  • Coordinate with IT as needed to ensure seamless operations and resource availability

Analyze and Debug Compute- and Disk-Related Issues:

  • Analyze and debug all compute-related issues within the GPU BU
  • Coordinate with IT to address and resolve any compute-related problems
  • Analyze and debug all disk-related issues within the GPU BU
  • Serve as the BU lead and focal point for all storage-related activities
  • Generate and manage all tickets related to GPU BU storage
  • Coordinate with IT to address and resolve any storage-related problems

Resource Management and Efficiency:

  • Generate compute and storage estimates and coordinate with IT to ensure resource availability
  • Proactively drive efficiency/wastage reduction efforts in collaboration with engineering teams and IT
  • Review all GPU projects for efficiency and intervene when necessary to optimize resource use and performance

PREFERRED EXPERIENCE: 

  • Extensive program management experience; PMP certification desired
  • Experience interfacing with IT and managing engineering resource requests
  • Strong understanding of modern development flows for SOC and hardware IP
  • Proven experience in managing compute and storage resources, preferably within a GPU environment
  • Strong analytical and problem-solving skills, excellent communication, and a collaborative, team-oriented approach
  • Ability to generate accurate resource estimates and manage project efficiencies
  • Proficiency in GPU management tools and platforms
  • Knowledge of compute and storage resource allocation and management
  • Effective ticket management and issue resolution capabilities
  • Experience writing scripts for automated reports
  • Skilled at making project trade-offs and arbitrating between projects

ACADEMIC CREDENTIALS: 

  • Bachelor’s Degree in related discipline 
  • Program Management certification a plus 

ALTERNATE LOCATIONS:

  • Markham, Ontario 

#LI-TB1

#LI-Hybrid

 




At AMD, your base pay is one part of your total rewards package.  Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position. You may be eligible for incentives based upon your role such as either an annual bonus or sales incentive. Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD’s Employee Stock Purchase Plan. You’ll also be eligible for competitive benefits described in more detail here.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
