Lead Systems Design Engineer - Data Center GPU

Mar 20, 2024
Austin, United States
... Not specified
... Intermediate
Full time
... Office work


WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world’s most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_




THE ROLE: 

We are looking for a dynamic, energetic Lead / Principal Systems Design Engineer to join our growing team. As a key contributor to the success of AMD’s product, you will be part of a leading team to drive and improve AMD’s abilities to deliver the highest quality, industry leading technologies to market. The Systems Design Engineering team fosters and encourages continuous technical innovation to showcase successes as well as facilitate continuous career development.

 

The Datacenter Graphics and Accelerated Computing (DCGPU) organization is looking for an experienced system-level debug engineer. The individual will be part of a team that has to bring up and fully validate the platform being used, including electrical, power, networking, and SOC. The individual will be required to lead and document the plan for validating the system itself, as well as document any unique steps needed to enable it. The individual will need to drive any issues encountered to root-cause closure and communicate with the different functional and IP layers for resolution.

 

THE PERSON: 

As a leader in Systems Design Engineering, you will drive balanced, scalable, and automated solutions. In this high-visibility position, your software systems engineering expertise will be essential to product development, definition, and root cause resolution.

Experience debugging complex HW/FW issues is a must. You should understand the flow of a GPU through the different layers of a system and be able to validate the items connecting to the GPU SOC (PCIe, VRs, RMs, retimers, HBM, internal networking). Communication is essential in working with the different owners of the functional code stack, as is the ability to drive issues via phone calls, chat messages, and e-mails. Hands-on experience with hardware in a datacenter environment is required.

 

KEY RESPONSIBILITIES: 

  • Driving technical innovation to improve AMD’s capabilities across validation, including tool and script development, technical and procedural methodology enhancement, and various internal and cross-functional technical initiatives 
  • Debugging issues found during the process, bring-up, validation, and production phases of SOC programs and driving them to root cause
  • Working with multiple teams, and tracking test execution to make sure all features are validated and optimized on time 
  • Working closely with supporting technical teams 
  • Engaging in other software/hardware modeling frameworks 
  • Leading collaborative approach with multiple teams 
  • Serving as a debug / triage engineer with an understanding of industry tools for root-causing complex issues
  • Understanding of GPU/System level HW and SW flow
  • Ability to probe parts of a board; check electrical and power currents and validate a system
  • Communicate / Document flows and methods of bring-up, boot-up, system initialization and debug

 

KEY QUALIFICATIONS

  • Minimum 8 years of experience in System or SOC level debug and triage
  • Proven ability to drive resolution of critical problems within a lab or datacenter
  • Relationships with external customers/partners and the ability to help resolve problems in their datacenters
  • Ability to work with external customers/partners on manufacturing issues/failures
  • Ability to work with external customers/partners to define requirements for manufacturing validation

 PREFERRED EXPERIENCE

  • Significant experience in SoC and/or System debug of complex issues
  • Develop / Document debug capabilities on a given SOC and System
  • Go-to-person for debugging of issues for the Production level Platform validation
  • Collaborate with internal teams on root causing issues, finding optimum resolutions
  • Hands-on experience using industry debug tools and scopes, as well as examining board-level power

ACADEMIC CREDENTIALS: 

  • Bachelor's or Master's degree in electrical or computer engineering

 

LOCATION:

Austin, TX

 

#LI-SL2




At AMD, your base pay is one part of your total rewards package.  Your base pay will depend on where your skills, qualifications, experience, and location fit into the hiring range for the position. You may be eligible for incentives based upon your role such as either an annual bonus or sales incentive. Many AMD employees have the opportunity to own shares of AMD stock, as well as a discount when purchasing AMD stock if voluntarily participating in AMD’s Employee Stock Purchase Plan. You’ll also be eligible for competitive benefits described in more detail here.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.
