Senior Data Engineer
Google · Mountain View, CA | Chicago, IL | Irvine, CA | New York City, NY
Our team, within Go-to-Market (GTM), serves as the intelligence partner for product teams, transforming massive volumes of unstructured conversational data into quantified, trusted insights that bridge the gap between customer feedback and product decisions. This is a high-visibility initiative critical for accelerating the Ads product adoption flywheel and shaping GTM strategy for priority products.
As a Senior Data Engineer, you will own and architect the foundational infrastructure that transforms unstructured customer feedback into quantified strategic assets. You will help us move toward scalable, automated pipelines that integrate sales transcripts with critical Business Intelligence (BI). You will pioneer our transition to more flexible workflows, developing the core infrastructure and platforms that multiply our data science team's capacity, agility, and impact through end-to-end delivery of production-ready solutions.
The US base salary range for this full-time position is $156,000-$226,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.
Responsibilities:
- Design and maintain pipelines to ingest, clean, and process massive volumes of unstructured data, including business transcripts and support cases, into reliable analytical datasets.
- Architect and deploy advanced platforms and tooling that empower the team to leverage autonomous AI agents and Large Language Models (LLMs) for intelligent routing and automated insights.
- Develop internal libraries and self-serve frameworks that streamline Natural Language Processing (NLP) and causal analysis, significantly reducing operational friction and enhancing team productivity.
- Manage and optimize large-scale embedding workflows using TensorFlow and Tensor Processing Units (TPUs), ensuring efficient processing that bypasses standard API constraints for high-volume data.
- Implement automated monitoring, alerting, and rigorous data quality checks to guarantee the security, reliability, and governance of high-stakes analytical assets.
Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 5 years of experience coding in Python and SQL.
- 5 years of experience working with machine learning operations (MLOps) and large language model operations (LLMOps) principles and data infrastructure, including deploying text processing and embedding pipelines.
- 5 years of experience designing and deploying data pipelines, including managing data schemas and processing unstructured text data for machine learning (ML) workflows.
Preferred qualifications:
- Experience with data schemas.
- Experience with Google Colaboratory (Colab), TensorFlow, Tensor Processing Units (TPUs), and agentic tools and platforms for processing unstructured text data.
- Experience with LLM orchestration and agentic infrastructure.
- Proficiency in SQL and Python.
- Understanding of MLOps/LLMOps principles to ensure the scalable and reliable deployment of text processing and embedding pipelines.