Member of Technical Staff - AI Research Engineer

GenPeach AI

Software Engineering, IT, Data Science
Posted on Mar 3, 2026

About GenPeach AI

GenPeach AI is a product-driven research lab building vertical multimodal foundation models for hyper-realistic human generation in image and video – designed for emotionally resonant, human-centered AI experiences. Our goal is to create tools that supercharge human creativity rather than replace it.

We train models from scratch: proprietary datasets at massive scale, novel architectures and training recipes, large GPU clusters, and tight product integration so research ships to users quickly.

We are a deeply technical team of around 10 people. We’re advised by Directors from Google DeepMind and backed by leading AI-focused funds and angels from OpenAI, Meta AI, Microsoft AI, Project Prometheus, and Fal. Collectively, our team, advisors, and angels have contributed to models including Meta’s Imagine and MovieGen, foundation-model work behind OpenAI’s Sora, and Google’s Veo and Gemini.

About the Team

You’ll join the research team working across image/video generation and multimodal understanding. You’ll work closely with other Research Engineers and Scientists, as well as the founders, and help turn research into scalable training runs, strong evaluations, and production-ready systems.

About the Role

We’re hiring an AI Research Engineer to help build and scale GenPeach’s foundation models end-to-end – from implementing new model ideas and training recipes, to owning the parts of the training stack that determine quality and speed, to pushing models through production constraints.

This is a hands-on, high-ownership role. You’ll write research-grade code that becomes production-critical.

In this role, you will

  • Implement and iterate on image/video generative model ideas (architecture, losses, conditioning, sampling, distillation, post-training)

  • Own training performance end-to-end (distributed training, throughput, memory, stability, debugging scaling failure modes)

  • Build the experimentation loop (evals, ablations, reproducibility tooling, reporting, decision hygiene)

  • Build and improve VLMs for image/video captioning (data recipes, training strategies, model variants, evaluation)

  • Run high-iteration research: read papers when useful, implement ideas, validate empirically

  • Create captioning pipelines that improve generation training and product quality

  • Partner with inference/product to ship under real constraints (latency, cost, reliability, rollout safety)

  • Build demos and prototypes to showcase capabilities and accelerate iteration

You might thrive in this role if you

  • Love the craft of experimentation: fast iteration, clear ablations, strong evals, and honest conclusions

  • Enjoy debugging messy real-world training runs (not just clean demos)

  • Can move between research and engineering: write clean code, ship utilities, and improve team velocity

  • Take ownership beyond your job description when needed (startup reality)

  • Communicate clearly and collaborate well in a small, senior team

Minimum Qualifications

  • Strong Python and PyTorch skills (4+ years of experience)

  • Experience implementing and training deep learning models (generative models, VLMs, LLMs, vision/video, or adjacent)

  • Solid understanding of training dynamics, optimization, and practical debugging

  • Ability to drive projects end-to-end with minimal supervision

Preferred Qualifications

  • Hands-on experience with diffusion/flow-based image or video generation, or large-scale generative modeling in adjacent domains

  • Experience with distributed training at scale (multi-node) and performance tuning (throughput/memory)

  • Experience building evaluation frameworks (offline metrics + human eval + regression tracking)

  • Strong intuition for data quality and dataset/labeling tradeoffs for training and captioning

  • Publications are a plus, but shipped impact and strong technical evidence matter more


What makes this role unique

  • Build frontier image/video models and the VLM captioning systems that power them

  • Join a lean, senior team that holds a high engineering + research bar

  • Direct product impact: your training runs become real user-facing capabilities

  • Benchmark against the best in the world and compete on model quality through what we ship

How we work

  • You own outcomes end-to-end and are trusted with real responsibility

  • Direct, low-ego communication and fast feedback loops

  • Bias toward impact: measure → iterate → ship

  • Research discipline: clear ablations, reproducibility, and crisp decision-making


Logistics

  • Location: Zurich (Switzerland) or Warsaw (Poland) — onsite or hybrid. If you’re elsewhere, we’re open to remote (team/timezone fit considered).

  • Compensation: competitive salary + meaningful equity (level-dependent)

  • Interview process: quick screen → two technical rounds (practical + systems) → team fit/values

What we offer

  • Visa sponsorship (where applicable); we’ll make a strong effort to relocate you to Switzerland or Poland if desired

  • Remote-friendly: work fully remote, hybrid, or on-site from our hubs

  • Regular offsites and in-person events to collaborate and connect

  • Flexible PTO