Program Manager, AI Model Evaluation, International Seller Growth
Amazon · Shanghai, China · Editorial, Writing, & Content Management
About this role
Amazon is hiring a mid-level Program Manager in the operations function, based in Shanghai, China. The posting calls out experience with SQL, LLMs, and testing.
- Role: Program Manager
- Function: operations
- Level: mid
- Track: Individual contributor
- Employment: Full-time
- Location: Shanghai, China
- Department: Editorial, Writing, & Content Management
- Posted: Mar 19, 2026
Job description
from Amazon careers
Join the Seller AI team, where you'll lead benchmarking and evaluation of AI models that enhance the seller experience across Amazon's global marketplace. You'll manage a team dedicated to validating, testing, and improving Artificial Intelligence (AI) and Large Language Models (LLMs) that power innovative seller tools. This role combines strategic leadership with hands-on technical oversight, requiring exceptional communication skills, team management, and stakeholder engagement.
In this position, you'll drive the development and implementation of comprehensive benchmarking methodologies to evaluate AI model performance across accuracy, robustness, bias, and reliability metrics. Your expertise will be crucial in translating technical findings into actionable insights that improve Seller Assistant's performance and contribute to the growth of Amazon's seller community worldwide.
Key job responsibilities
1. Plan and execute benchmarking exercises for AI models, defining test plans, metrics, and acceptance criteria across accuracy, robustness, bias, and reliability dimensions.
2. Lead a team responsible for validating data based on specific annotation guidelines, ensuring accuracy and quality while escalating potential regulatory risks.
3. Prepare comprehensive audit and benchmarking reports, including error ratings, root-cause analysis, and recommendations for senior stakeholders.
4. Drive process efficiencies and explore automation opportunities to enhance the productivity of data generation initiatives.
5. Mentor team…