N-iX is a global software development services company that helps businesses create next-generation software products. Founded in 2002, we unite 2,400+ tech-savvy professionals across 40+ countries, working on impactful projects for industry leaders and Fortune 500 companies. Our expertise spans cloud, data, AI/ML, embedded software, IoT, and more, driving digital transformation across finance, manufacturing, telecom, healthcare, and other industries. Join N-iX and become part of a team where your ideas make a real impact.
We are looking for an **Inference Platform Engineer (LLM & Kubernetes)** to join our team.
Our client is a leading European AI company developing large language models and generative platforms for enterprise and government clients.
Their products combine high-performance technologies, transparency, accessibility, and data security, fully aligned with European regulatory and ethical standards.
As an Inference Platform Engineer (LLM & Kubernetes), you will take ownership of inference API integration, operations, and platform reliability across production AI systems.
This role is designed to be covered by **1–2 FTE split across several senior specialists**, ensuring continuity of inference services and full coverage during planned and unplanned absences as we take over end-to-end LLM inference responsibility.
Responsibilities:
- Take ownership of inference API integration, orchestration, and long-term platform reliability
- Lead operations for LLM inference services as they transition under internal ownership
- Ensure inference API availability, latency, and performance in production environments
- Design and maintain multi-turn conversation handling, chat templates, and prompt orchestration
- Proactively monitor logs, and troubleshoot and resolve inference platform issues and errors
- Manage Kubernetes deployments, Helm charts, and ArgoCD workflows for inference services
- Ensure platform security, CVE monitoring, and compliance with internal and regulatory standards
- Collaborate closely with backend, platform, and infrastructure teams
- Maintain clear operational documentation to support shared ownership across multiple FTEs
Requirements:
- 5+ years of Python programming experience
- Strong Kubernetes (k8s) experience, including deployment, scaling, and monitoring
- Experience handling large-scale logs, monitoring, and observability in production
- Basic knowledge of LLM fundamentals and the surrounding industry (e.g., what types of models exist, how an LLM generates output)
- User-side experience developing against an inference API (e.g., OpenAI, Anthropic, OpenRouter, etc.) and an understanding of how such APIs are structured (experience providing or deploying a similar API yourself is a strong plus)
- Ability to independently own and operate inference services in a shared-responsibility model (1–2 FTE split across multiple specialists)
- Strong communication skills and experience working with cross-functional engineering teams
- Solid Linux fundamentals
Nice to have:
- Hands-on experience with Helm charts, ArgoCD, and CI/CD for AI services
- Interest in working partly with Rust
- Senior-level experience with production LLM inference or AI platform operations
- Experience building or operating multi-turn conversational AI systems
- Familiarity with real-time API orchestration or streaming inference workloads
- Background in MLOps, AI platform engineering, or SRE
- Experience with cloud-based inference deployments and scaling
- Knowledge of security, CVE scanning, and operational best practices
Technology Stack:
- Inference: OpenAI, Anthropic, or other LLM inference APIs
- Focus Areas: API integration, multi-turn conversation orchestration, tool calling, platform reliability
- Infrastructure: Kubernetes, Helm, ArgoCD, cloud or hybrid environments
- Monitoring: Logs, metrics, observability tools for inference systems
- Workflow: Git, CI/CD pipelines, documentation, operational runbooks, incident handling
- Standards: Reliability, latency, performance, security, maintainability
We offer:*
- Flexible working format - remote, office-based or flexible
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team buildings
- Other location-specific benefits
*Not applicable for freelancers