Fellow/PMTS Data Center GPU Optimization Engineer, AI-ML/HPC

Sep 13, 2024
Santa Clara, CA
... Not specified
... Intermediate
Full time
... Office work


WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_




THE TEAM:

AMD's Data Center GPU organization is transforming the industry with our AI-based graphics processors. Our primary objective is to design exceptional products that drive the evolution of computing experiences, serving as the cornerstone for enterprise data centers, Artificial Intelligence (AI), HPC, and embedded systems. If this resonates with you, come and join our Data Center GPU organization, where we are building amazing AI-powered products with amazing people.

 

THE ROLE:

Would you like to be part of a world-class team enabling applications for AI/ML and high-performance computing? AMD is searching for talented and highly motivated computational scientists and engineers to join our team of developers preparing applications for AI/ML and supercomputing platforms across industry, academia, cloud service providers, and national laboratories.

This position is for a senior DC GPU Application Optimization Engineer in high-performance computing, with a focus on optimizing machine learning applications. You will be part of a team porting and tuning a wide variety of scientific applications for AMD CPU and GPU platforms.

THE PERSON:

A computational scientist, physicist, or engineer with experience in multiple scientific computing domains and experience using machine learning techniques in an AI/HPC setting. Must be self-motivated and able to work well within a team environment.

 

 

KEY RESPONSIBILITIES:

  • Port and optimize a variety of machine-learning-based applications for AMD CPU and GPU systems
  • Provide domain-specific knowledge to other groups at AMD
  • Engage with AMD product groups to drive resolution of application and customer issues
  • Develop and present training materials to internal audiences, at customer venues, and at industry conferences

 

PREFERRED EXPERIENCE:

  • Deep understanding of distributed systems and ability to dive deep into individual components such as compute, network and storage.
  • Expert-level hands-on experience in networking, storage, and cluster design, modelling, and analytics.
  • Solid grounding in current AI/ML frameworks and deep understanding of the ecosystem.
  • Extensive experience with and mastery of Python and one systems language, preferably C++.
  • Working experience with distributed pre-training, fine-tuning and inference.
  • Familiarity with orchestrators/resource managers such as Slurm and Kubernetes (k8s).
  • Broad experience creating, adapting, and running workloads with widely used HPC applications.
  • Strong performance analysis skills for both CPU and GPU
  • Experience working with large customers and excellent communication skills with audiences ranging from engineers to mid-level management to C-level executives.
  • Relevant industry experience 

 

Great to have:

o NeurIPS, ICML, or equivalent publications

o Thought leadership, patents, and other publications

o Experience working at the k8s scheduler level

o In-depth HPC knowledge

o Ability to work well in geographically dispersed teams

 

ACADEMIC CREDENTIALS:

  • Master's or PhD in Computer Science, Computational Physics, Engineering or related subjects, or equivalent experience

 

LOCATION:

Santa Clara, CA

 

#LI-BW1

 

#LI-hybrid




At AMD, your base pay is one part of your total rewards package.  Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position. You may be eligible for incentives based upon your role such as either an annual bonus or sales incentive. Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD’s Employee Stock Purchase Plan. You’ll also be eligible for competitive benefits described in more detail here.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.

