About this role

Perplexity is hiring an intern-level Data Engineer based in London, United Kingdom (hybrid). The posting calls out experience with CUDA, Databricks, PyTorch, and distributed systems. Listed education preference: a master's degree or equivalent.

Role: Data Engineer
Function: data engineering
Level: intern
Track: Individual contributor
Employment: Internship
Location: London, United Kingdom
Work mode: Hybrid
Education: Master's degree
Visa: Not sponsored
Department: AI
Posted: Feb 12, 2026

AI Summary

Work with the AI Inference team to optimize model serving latency and throughput across GPU clusters. Bring up new models, implement inference optimizations and quantization schemes, and optimize the entire serving stack from GPU kernels to endpoints. Requires strong systems programming fundamentals, GPU programming experience (CUDA/Triton), and ML framework knowledge.

Job description

from Perplexity careers

Perplexity is excited to announce the Internship Program for exceptional Master’s or PhD students studying Computer Science or Engineering in the UK, enrolled in the 2025-2026 academic year. This is an intensive program in which you will work directly with our AI Inference team. This program offers a unique opportunity to gain valuable experience in a rapidly growing AI startup. Outstanding performers might be offered a full-time position at the end of the program.

Our AI Inference team is responsible for running the models behind the Perplexity products. The team maintains the inference engine and deployments behind models ranging from single-node embeddings to distributed sparse Mixture-of-Experts models, maintaining large GPU clusters. With a keen focus on latency and throughput, the Inference team is responsible for the entire serving stack, from GPU kernels to networking and monitoring infrastructure.

Responsibilities

Work with the inference team to improve serving latency and throughput
Bring up support for new models and state-of-the-art inference optimizations or quantization schemes
Optimize inference across the entire stack, from GPU kernels to serving endpoints

Qualifications

Strong engineering track record with proven knowledge of fundamentals and programming languages (multi-threaded programming, networking, compilation, systems programming, etc.)
Pursuing a Master's or PhD in Computer Science with a focus on performance-related subjects (HPC, Compilers, Distributed Systems)
This is an excerpt. Read the full job description on Perplexity careers →

UK Internship Program
