Artificial Intelligence Performance Engineer

Mar 27, 2024
Santa Clara, CA
... Not specified
... Intermediate
Full time
... Office work


WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_




Artificial Intelligence Performance Engineer

THE TEAM:

 

In the ever-evolving landscape of artificial intelligence, we are a powerhouse – a cutting-edge 'AI Software Solutions team' poised to redefine the benchmarks of excellence. Specializing in a multifaceted approach, we stand at the forefront of AI optimization, meticulously fine-tuning algorithms to unlock unprecedented efficiency. Our expertise extends beyond the virtual realm, encompassing 3P enablement, where we seamlessly navigate the intricacies of Proof of Concept, Request for Proposal, and optimization strategies. 
 
Delving into the core of hardware capabilities, we embark on a journey of performance optimization, ensuring that every machine operates at its apex. With a focus on MLPerf inference, we transcend conventional boundaries, setting a new standard for real-time decision-making. As trailblazers in AI training improvement, we harness innovation to elevate the capabilities of machine learning models. Join us on this transformative journey as we navigate the frontiers of artificial intelligence, where every challenge is an opportunity, and every solution propels us into the future.

THE ROLE:

As part of our team, you'll be responsible for ensuring that AMD Instinct GPU-accelerated systems are operating at peak performance before being deployed to solve the world's most challenging problems. We're looking for a highly motivated candidate with expertise in GPU performance and familiarity with performance monitoring and tuning tools. The ideal candidate should also possess data science and communication skills to effectively convey their findings to engineering and business teams.

We value curiosity and innovation, and we're committed to providing a challenging and supportive environment where you can learn and grow. As you collaborate with your peers, you'll have the opportunity to make a real impact and contribute to our organization's success. We're looking for someone who's passionate about improving their skills and constantly seeking new ways to drive performance and efficiency. If you're passionate about performance engineering and want to make a real impact with customers deploying the latest AI and ML breakthroughs, we encourage you to apply.

KEY RESPONSIBILITIES: 

  • Define a performance suite and best practices for measuring GPU-accelerated workloads to assess the scalability and efficiency of AI models and algorithms
  • Benchmark and analyze AI workloads in single-node and large multi-node configurations, comparing against previous generations and our competitors
  • Perform comprehensive performance analysis and report findings for the entire platform, including GPU, CPU, interconnects, network, and software stack
  • Identify performance bottlenecks that impact data center GPU-accelerated workloads, then tune and collaborate with other software teams to improve performance
  • Stay up to date with emerging technologies and trends in the AI field and explore ways to improve the performance of GPU-accelerated workloads at scale

PREFERRED EXPERIENCE: 

  • Solid knowledge of Artificial Intelligence (AI) and Machine Learning (ML) concepts and techniques, including deep learning, reinforcement learning, natural language processing, generative AI, and computer vision, as well as practical experience applying these concepts to solve real-world problems through research or work experience
  • Experience with benchmarking methodologies, performance analysis, workload profiling, and performance monitoring and debugging tools
  • Advanced Linux OS, container (e.g. Docker) and GitHub skills
  • Programming skills in a variety of relevant languages such as Python or C/C++
  • Expertise with deep learning frameworks like PyTorch and TensorFlow
  • Knowledge of and interest in computer and GPU architecture
  • In-depth knowledge of GPU acceleration with either AMD or NVIDIA GPU compute products
  • Inquiring mind, excellent problem-solving skills, and automation mindset

ACADEMIC CREDENTIALS: 

  • B.S., M.S., or PhD in Computer Science, Engineering, or a similar field

LOCATION: 

Santa Clara, CA; Bellevue, WA; Austin, TX; Orlando, FL

#LI-RL1




At AMD, your base pay is one part of your total rewards package.  Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position. You may be eligible for incentives based upon your role such as either an annual bonus or sales incentive. Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD’s Employee Stock Purchase Plan. You’ll also be eligible for competitive benefits described in more detail here.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
