Remote Data Engineer

Remote
Recruiter
Anastasiya Ivanenko
Roles:
Data
Must have skills:
Python
Go
Nice to have skills:
NLP
Elasticsearch
Considering candidates from:
Worldwide
Work arrangement:
Remote only
Industry:
Developer tools
Language:
English
Level:
Senior
Company size:
2-10 employees
Archipelo is building an intelligent code discovery platform that gives developers the best tools to discover code in any form, and to benefit through insights, recognition, and greater productivity. They are transforming code search to improve the practice of modern programming, using a graph-based approach that draws on data from the entire open-source ecosystem. Their mission is to build the world's best code discovery engine. Archipelo is well funded by top Silicon Valley investors, including the first investors in Google, Twitter, Zoom, LinkedIn, and Uber. Their team has backgrounds from NASA, LinkedIn, Facebook, Amazon, AWS, and Cisco, and from MIT, Harvard, Stanford, and Berkeley.
They are now seeking a Senior Data Engineer to lead technology development on the frontier of code discovery and developer productivity.

Responsibilities

  • Implement data engineering best practices
  • Design and develop data systems for machine learning
  • Write real-time pipelines that execute complex operations on incoming data
  • Experiment in ways that accelerate prototyping and maximize resource utilization
  • Keep pipelines fast: focus on single-node performance and leverage horizontal scaling
  • Manage the data pipeline, including scheduling, dataflow programming, SQL, and data labelling
  • Orchestrate the operation of clusters of commodity machines
  • Review code, mentor other engineers and support the data team
  • Attract, recruit and retain top engineering and scientific talent

Must-have skills:

  • Expertise in microservices and cloud computing, across multiple cloud platforms
  • Proficiency with distributed systems, coordinating large numbers of independent commodity machines into complete, functional systems that handle diverse workloads
  • 8+ years of professional software engineering experience
  • Expertise with ETL
  • Expertise with Go
  • Expertise with Python
  • Proven expertise and leadership as a world-class senior data engineer

Nice-to-have skills:

  • Bachelor’s or Master’s degree in computer science/engineering, mathematics, physics, or another related technical field, or equivalent practical experience
  • Experience with Natural Language Processing and Understanding (NLP & NLU)
  • Knowledge of math, probability, statistics and algorithms
  • Experience with Golang
  • Familiarity with machine learning frameworks (like Keras or PyTorch)
  • Ability to use CPU and I/O profilers to find where to optimize the pipeline
  • Advanced working knowledge of information retrieval and search technologies; experience setting up and using open-source search systems to query and understand data
  • Experience with most of the following technologies:
    • Elasticsearch, Solr, or equivalent experience
    • Machine learning infrastructure
    • Deep learning 
    • Any task scheduler
    • Any graph database

Apply now
