Site Reliability Engineer

Oslo, Norway
Roles:
Backend
Must-have skills:
Go
Considering candidates from:
Europe
Work arrangement: Onsite
Industry: Software Development
Language: English
Level: Middle
Required experience: 2+ years
Relocation: Paid
Visa support: Provided
Size: 201 - 500 employees

Pexip is a high-load video conferencing platform. Earlier this year, it reached a million minutes of video conferencing passing through the platform every hour. That's video and audio being decoded, mixed and encoded in real time on more than 4,000 virtual machines around the world.
The company is currently looking for a talented Site Reliability Engineer to join its engineering team, build great software, increase the manageability of the platform and automate anything and everything.

Tasks: 
  • Build great software, including code to increase the manageability of the platform and automate anything and everything
  • Establish and maintain SLIs and SLOs, and the tooling to support them
  • Participate in a 24x7 on-call duty within the team for critical services and escalation workflows
  • Monitor, troubleshoot and resolve production-grade issues for the SaaS platform and applications
  • Participate in post-mortems, Root Cause Analysis (RCA) and the lessons-learned process
  • Maintain a knowledge base of known issues and solutions
  • Work with cutting-edge cloud technology to build and maintain CI/CD pipelines for build, deployment and code coverage; install, configure, update and troubleshoot cloud microservices
  • Collaborate with Engineering teams, influencing and contributing to product design, establishing requirements for manageability and operations, and ensuring they are implemented
Must-have skills:
  • 2-4 years of combined experience with both software development and system administration/operations. 
  • Confident hands-on skills with Golang.
  • Experience managing applications running on private, public or hybrid cloud platforms. 
  • Deep understanding of the Linux operating system.  
  • Willingness to participate in a 24x7 on-call duty within the team for critical services and escalation workflows.
Nice-to-have skills:
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems. 
  • Experience with Docker, Kubernetes, Terraform, Ansible or equivalent technologies. 
  • Understanding of standard networking protocols and components. 
  • Ability to debug, optimize code, and automate routine tasks. 
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of drive. 
  • Higher degree in Computer Science or a related field.
Benefits:
  • Stock options with a 4-year plan
  • Free food (the company's own chef cooks lunch)
  • Health insurance
Interview process (may vary): 
  1. Intro call with Toughbyte
  2. 30-minute intro interview with Pexip
  3. 1-hour tech interview
  4. Test assignment (typically 10-20 hours, though some candidates have spent as little as 3-4 hours and some more; it depends on how much effort the candidate is willing and able to put in)
  5. On-site interview (remote during COVID): two afternoons, each with some coding exercises, chats with the team and dinner with the whole team
Remote work is possible for exceptionally strong candidates from countries close to the Oslo time zone. During the recruitment process, however, the company prioritizes candidates who are willing to relocate.