Member of Technical Staff - AI Infrastructure Engineer
GenPeach AI
About GenPeach AI
GenPeach AI is a product-driven research lab aiming to redefine how people create and interact through multimodal, emotionally resonant AI.
We are building vertical foundation models specializing in generating hyper-realistic humans in image and video. Our work spans large-scale proprietary datasets, novel model architectures, efficient training on large GPU clusters, and integration into end-user products.
We train and deploy our own large-scale models and ship them into real products. Our team operates at the intersection of research-grade AI and production-grade systems engineering.
About the Role
We are looking for a Member of Technical Staff (MTS) to own and evolve the AI infrastructure layer that powers both research and production systems.
This is a core infrastructure role with high ownership and direct impact on model training, inference performance, and developer velocity.
In this role, you will
- Own the AI execution and infrastructure layer used by research and product teams
- Design and build high-performance Python systems for:
  - scalable model inference
  - training orchestration
  - large-scale data processing
- Partner closely with research and backend engineers to productionize models and expose them via APIs
- Design and operate distributed pipelines and task queues for batch and streaming workloads
- Optimize GPU inference for latency, throughput, and cost efficiency
- Own the MLOps lifecycle: model deployment, versioning, monitoring, and alerting
- Build and maintain CI/CD pipelines for services and ML workflows
- Debug and resolve performance bottlenecks across Python, GPUs, networking, and storage
- Contribute to infrastructure design decisions and long-term architecture
Minimum Qualifications
- 5+ years of professional software engineering experience (Python)
- Strong proficiency in Python, including async programming, multiprocessing/concurrency, and performance profiling and optimization
- Experience building and operating high-performance, large-scale distributed systems
- Practical understanding of production ML inference, including:
  - latency constraints
  - reliability and fault tolerance
  - cost and scaling trade-offs
- Hands-on experience with Docker, Kubernetes, and Infrastructure-as-Code (Terraform or similar)
- Experience operating production systems in Linux environments
Preferred Qualifications
- Experience with distributed execution frameworks (e.g., Ray)
- Experience working with GPU workloads (training or inference)
- Familiarity with model serving architectures and inference optimization
- Experience handling large-scale image or video datasets (100s of TBs to PBs)
- Strong fundamentals in data structures and algorithms
- Exposure to observability stacks (metrics, logs, tracing) for ML systems
- Experience supporting or collaborating closely with ML research teams
What Makes This Role Unique
- Direct ownership over infrastructure supporting foundation models
- Real impact on model quality, latency, and cost
- Tight collaboration between research and production, with no silos
- Small, senior team with high trust and low bureaucracy
- Opportunity to shape systems from first principles
Our Culture
- High ownership and accountability
- Strong technical standards
- Direct, low-ego communication
- Bias toward shipping, measuring, and iterating fast
Logistics
- Location: Zurich or Warsaw (onsite or hybrid). If you’re elsewhere, we’re open to remote (team/timezone fit considered).
- Compensation: competitive salary + meaningful equity (depending on role and level)
- Interview process: quick screen → technical (practical + systems) → team fit/values