
Recruiter
Alexandra Frolkina
Roles:
Machine Learning
Must-have skills:
Python
Nice-to-have skills:
Embedded
Considering candidates from:
Eastern Europe
Work arrangement: Remote
Industry: Software Development
Language: English, Russian
Level: Senior
Required experience: 5+ years
Size: 2 - 10 employees
Company
Solving AI inference economics through intelligent orchestration, real-time telemetry & automatic runtime optimization.
Description
The company is looking for an engineer to support model optimization and inference for large language models, working mainly with Python and NVIDIA GPUs (CUDA).
Tasks:
- Work with NVIDIA GPUs (CUDA) to run and optimize ML workloads
- Apply quantization techniques to LLMs using existing libraries (e.g., GPTQ)
- Integrate and run off-the-shelf tools for model optimization and inference
- Optimize performance of models on modern GPU architectures (e.g., Hopper, Blackwell)
- Collaborate with the team to validate approaches and results
- Quickly prototype and validate technical solutions
Must-have:
- 5+ years of experience in software engineering / ML / GPU-related roles
- Strong hands-on experience with NVIDIA GPUs and CUDA
- Solid Python skills
- Experience working with ML frameworks and running models in production or near-production environments
- Ability to work independently
- Basic educational background in applied mathematics
Nice-to-have:
- Experience with LLM optimization and inference pipelines
- Familiarity with modern GPU architectures (Hopper, Blackwell)
- Experience with quantization techniques (e.g., GPTQ or similar)
- English skills
- Experience with embedded systems or low-level optimization
Benefits:
- Remote, flexible engagement
- Opportunity to expand into a larger role if collaboration is successful
- Work on modern AI / LLM optimization problems
Interview process:
- Intro call with Toughbyte
- First interview with the architect
- Follow-up interview with the company executives (if needed)
