Senior Software Engineer – Agentic Runtime Safety, Stability & Observability · Posted Jan 16, 2026
AI Summary
Design and own the runtime safety layer for an AI agentic orchestration platform. Architect guardrails, rollback mechanisms, observability pipelines, and fault isolation to ensure autonomous systems remain aligned with intent, auditable, and recoverable in high-assurance engineering environments.
Overview


Keysight is at the forefront of technology innovation, delivering breakthroughs and trusted insights in electronic design, simulation, prototyping, test, manufacturing, and optimization. Our ~15,000 employees create world-class solutions in communications, 5G, automotive, energy, quantum, aerospace, defense, and semiconductor markets for customers in over 100 countries.

Our award-winning culture embraces a bold vision of where technology can take us and a passion for tackling challenging problems with industry-first solutions. We believe that when people feel a sense of belonging, they can be more creative and innovative, and thrive at all points in their careers.

About the Team

Keysight’s Applied AI Autonomy Initiative is building a next-generation agentic orchestration framework that enables AI agents to reason, adapt, and coordinate across complex engineering workflows. The platform combines LLM-based reasoning, reinforcement-inspired feedback loops, and simulation-driven validation to automate and optimize engineering decisions at scale.

This role sits at the core of the initiative, defining how autonomy can be deployed safely, transparently, and predictably in high-assurance engineering environments.

About the Role

As a Senior Engineer – Agentic Runtime Safety, Stability & Observability, you will design and own the runtime safety and reliability layer of Keysight’s agentic orchestration platform.

Your mission is to ensure that AI-driven orchestration remains aligned with human intent, observable, auditable, and recoverable. You will architect guardrails, rollback mechanisms, and observability pipelines that allow autonomous systems to act decisively without sacrificing trust, control, or predictability.

 
This role bridges AI systems, runtime engineering, and safety-critical design, working closely with AI architects, ML engineers, and simulation teams.

Responsibilities


Runtime Safety & Execution Control

  • Design runtime guardrails ensuring agent actions remain aligned with intent, policies, and system constraints.

  • Implement intent validation, semantic checks, and execution contracts before orchestration runs.

  • Define safety boundaries, escalation paths, and rollback conditions within agent workflows.

Fault Isolation, Rollback & Recovery

  • Architect deterministic rollback, checkpointing, and recovery mechanisms for multi-agent systems.

  • Design fault-isolation boundaries to prevent local failures from cascading system-wide.

  • Build sandboxed execution environments for validating AI-generated orchestration logic.

Observability & Diagnostics

  • Implement end-to-end observability capturing agent decisions, execution traces, and system health.

  • Develop anomaly detection and confidence-based safety gating for runtime behavior.

  • Build introspection APIs and dashboards exposing rationale, safety metrics, and performance signals.

Adaptive Governance

  • Establish feedback loops that adjust orchestration behavior based on performance and safety signals.

  • Contribute to continuous safety validation and runtime certification pipelines.

  • Collaborate across teams to embed transparency and traceability into every orchestration cycle.

Qualifications


Required Qualifications

  • PhD or 5+ years of experience in systems engineering, runtime reliability, or safety-critical software.

  • Strong proficiency in Python and C/C++.

  • Proven experience designing fault-tolerant, observable, and recoverable systems.

  • Hands-on experience with agentic orchestration frameworks (e.g., LangGraph, LangChain, or similar).

  • Solid understanding of execution control, intent alignment, and policy enforcement in automated systems.

  • Experience building telemetry, monitoring, or diagnostics pipelines in complex runtimes.

Desired Qualifications

  • Background in safety-critical or regulated domains (e.g. aerospace, industrial systems, EDA, HPC).

  • Experience with semantic validation, policy modeling, or goal disambiguation.

  • Familiarity with rollback strategies, dynamic gating, or safety scoring in distributed systems.

  • Experience with Python/C++ interoperability (e.g. PyBind11, gRPC, ZeroMQ).

  • Exposure to simulation-driven systems or hybrid AI–physics environments.

Keysight is an Equal Opportunity Employer.
