Machine Learning AI Agent Platform

Posted 2 months ago

Location: Tokyo, New York, United States
Working model: Hybrid
Industry: WealthTech
Employment Type: Permanent

Storm2

⚡ Staff ML Engineer, AI Agents – Wealth Management Infrastructure
🌍 San Francisco Bay Area (Hybrid)
💲 $250,000 – $300,000 Base + 20% bonus + Equity

The Company

Storm2’s client is a Series A-stage company building the AI infrastructure layer for institutional wealth management. They’re not selling a chatbot or a demo. Their systems run live in regulated environments, embedded into how some of the world’s largest financial institutions serve clients day to day.

The Role

A lot of “AI engineer” roles right now are about wrapping APIs and writing prompts. This one is not.

You’d be working at the level where agent systems are actually built: designing evaluation frameworks that determine whether a model is safe and reliable enough to operate in a live financial environment, building the orchestration and tooling that make agents work at scale, and translating the latest LLM capabilities into production systems that hold up under enterprise constraints. The evals piece is central, not an afterthought. If an agent is advising on client suitability or supporting portfolio decisions, you need rigorous ways to know whether it’s working and when it’s failing.

The context is a demanding one. Enterprise deployment means security requirements, latency constraints, multi-tenant architecture, and partners who need stable APIs rather than moving targets. You’ll be building for that reality from day one, not as a later phase.

What makes this role interesting is the combination: enough research exposure to stay close to what LLMs are becoming capable of, paired with the engineering discipline to make those capabilities reliable in a high-stakes domain. If you’ve mostly lived on one side of that line, this will push you.

What you’ll be working on:

Designing and building the AI Agent Platform: tool use, planning, memory, orchestration
Building evaluation and benchmarking frameworks to assess agent quality, reliability, and safety in production
LLM orchestration, prompt management, and workflow execution infrastructure
APIs and platform abstractions for enterprise and external partners
Self-hosted and multi-tenant deployment infrastructure with real enterprise constraints
Bridging new LLM capabilities into stable, production-grade financial workflows
Improving observability, failure handling, and reliability across agent systems

What you’ll bring:

7+ years building production ML or backend systems for ML-powered products
Hands-on experience with LLMs, agent frameworks, or applied ML systems in production
Experience building evaluation or benchmarking systems for LLMs or ML
Strong Python skills and fluency with modern ML tooling
Systems thinking at the level of latency, failure modes, and reliability tradeoffs
Based in or willing to relocate to the Bay Area

Strong plus:

Experience with self-hosted models or enterprise AI deployments
Background in distributed systems or data infrastructure
Prior exposure to financial or other high-stakes regulated domains

📧 Click ‘Easy Apply’ or email thomas.hill@storm2.com

⚡ Storm2 is a specialist FinTech recruitment firm with clients across Europe, APAC, and North America. Visit storm2.com or follow us on LinkedIn for the latest roles and intel.
