
Recruiter
Vera Bekker
Roles:
Machine Learning
Must-have skills:
Python
Nice-to-have skills:
Embedded
Considering candidates from:
Eastern Europe
Work arrangement: Remote
Industry: Software Development
Language: English, Russian
Level: Senior
Required experience: 5+ years
Size: 2 - 10 employees
Company
Solving AI inference economics through intelligent orchestration, real-time telemetry & automatic runtime optimization.
Description
The company is looking for an engineer to support model optimization and inference for large language models, working mainly with Python and NVIDIA GPUs (CUDA).
Tasks:
- Work with NVIDIA GPUs (CUDA) to run and optimize ML workloads
- Apply quantization techniques to LLMs using existing libraries (e.g., GPTQ)
- Integrate and run off-the-shelf tools for model optimization and inference
- Optimize performance of models on modern GPU architectures (e.g., Hopper, Blackwell)
- Collaborate with the team to validate approaches and results
- Quickly prototype and validate technical solutions
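The quantization task above refers to library-based methods such as GPTQ. As a rough illustration of the underlying idea only (plain round-to-nearest weight quantization, not GPTQ's error-compensating algorithm), a minimal NumPy sketch:

```python
import numpy as np

def quantize_rtn(weights: np.ndarray, bits: int = 4):
    """Symmetric per-row round-to-nearest quantization.

    Maps each row of `weights` to signed integers in
    [-(2**(bits-1) - 1), 2**(bits-1) - 1] with one float scale per row.
    """
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for 4-bit
    scale = np.abs(weights).max(axis=1, keepdims=True) / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    # Recover an approximation of the original weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 16)).astype(np.float32)
q, scale = quantize_rtn(w, bits=4)
w_hat = dequantize(q, scale)
err = np.abs(w - w_hat).max()
print(f"max abs reconstruction error: {err:.4f}")
```

GPTQ improves on this baseline by updating the remaining unquantized weights to compensate for each rounding error, which is why it preserves LLM accuracy far better at the same bit width.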
Must-have:
- 5+ years of experience in software engineering / ML / GPU-related roles
- Strong hands-on experience with NVIDIA GPUs and CUDA
- Solid Python skills
- Experience working with ML frameworks and running models in production or near-production environments
- Ability to work independently
- Educational background in applied mathematics
Nice-to-have:
- Experience with LLM optimization and inference pipelines
- Familiarity with modern GPU architectures (Hopper, Blackwell)
- Experience with quantization techniques (e.g., GPTQ or similar)
- Good command of English
- Embedded systems or low-level optimization
Benefits:
- Remote, flexible engagement
- Opportunity to expand into a larger role if collaboration is successful
- Work on modern AI / LLM optimization problems
Interview process:
- Intro call with Toughbyte
- First interview with the architect
- Follow-up interview with the company executives (if needed)
