Working Student – GenAI / LLM Evaluation – Agentic AI / NLP (f/m/d) at Cinemo GmbH in Karlsruhe


Cinemo GmbH

Karlsruhe, Germany · Working Student · Posted 1 month+

Location: Karlsruhe, Germany
Category: Software Development
Job type: Working Student
Seniority: Intern
Language: en

Job details

Position Description

As a Working Student in the GenAI / LLM team at Cinemo, you will support the evaluation and validation of agentic AI systems and GenAI algorithms for NLP that power next-generation in-car experiences. You will help build datasets, extend evaluation tooling, and contribute to end-to-end testing workflows to ensure our non-deterministic AI components are measurable, reliable, and ready for real-world automotive environments across cloud-based services and in-vehicle platforms such as Android Automotive OS (AAOS) and Linux.

In this role, you will:

Support evaluation of agentic AI systems and LLM-based NLP features, including qualitative and quantitative analysis.
Create, curate, and maintain datasets for benchmarking, regression testing, and scenario coverage.
Extend and improve internal evaluation frameworks (metrics, dashboards, automated test runs).
Contribute to end-to-end testing of GenAI features within the in-car experience, including integration and validation workflows.
Document findings, track model/system changes, and communicate results clearly to the team.
Collaborate with engineers and researchers to translate evaluation insights into actionable improvements.

What you will need to succeed:

Ongoing Bachelor’s or Master’s studies in Computer Science, AI/ML, Data Science, Computational Linguistics, or a related field.
Hands-on programming skills in Python and a solid understanding of basic ML/NLP concepts.
Interest in GenAI / LLMs, agentic systems, and evaluation of non-deterministic AI behavior.
Experience with data handling and dataset creation (labeling, preprocessing, quality checks).
Familiarity with software testing concepts (e.g., unit/e2e testing, CI) is a plus.
Good written and spoken English communication skills.
The successful candidate will be based in Karlsruhe, Germany.

