
get a remote job
you can do anywhere

The largest collection of Remote Jobs for Digital Nomads online. Get a remote job you can do anywhere at remote companies like Buffer, Zapier, and Automattic that embrace the future. There are 29,650+ jobs that allow you to work anywhere and live everywhere.


Scrapinghub

Senior Backend Engineer for Cloud Services


Tags: cloud, senior, engineer, backend


10mo


About the job:

We are looking for two Senior Backend Engineers to develop and grow our crawling and extraction services. Our automated service is used directly by our customers via an API, as well as by us for internal projects. Our extraction capabilities include automated product and article extraction from single pages or whole domains, using machine learning and custom-built components, and we plan to expand them to jobs and news. The service is still in the early stages of development and is serving its first customers.

As a professional services company, we are often required to build a custom crawling and extraction pipeline for a specific customer. That requires crawl and extraction planning with respect to the customer's needs, including crawling time estimation and hardware allocation. The volume is often very high, and solutions have to be properly designed to provide the required performance, reliability, and maintainability.

Our platform has several components communicating via Apache Kafka and using HBase as permanent storage. Most components are written in Python, while several crucial components are written in Scala and use Kafka Streams. The current priorities are improving the reliability and scalability of the system, integrating with other Scrapinghub services, and implementing auto-scaling and other features. This is going to be a challenging journey for every good Backend Engineer!
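As an illustration only (not Scrapinghub's actual code), here is a minimal sketch of what one such pipeline component could look like with the kafka-python client: a worker that consumes raw pages from one Kafka topic, runs an extraction step, and publishes results to another. The topic names, broker address, and extract_article() helper are all hypothetical.

import json

from kafka import KafkaConsumer, KafkaProducer  # assumes the kafka-python package

def extract_article(page):
    # Hypothetical extraction step; the real service uses machine
    # learning and custom-built components, as described above.
    return {"url": page.get("url"), "title": page.get("title", "")}

consumer = KafkaConsumer(
    "raw-pages",                          # hypothetical input topic
    bootstrap_servers="localhost:9092",   # hypothetical broker
    group_id="extraction-workers",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda d: json.dumps(d).encode("utf-8"),
)

for message in consumer:
    article = extract_article(message.value)
    producer.send("extracted-articles", article)  # hypothetical output topic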
Job Responsibilities:

* Design and implementation of a large-scale web crawling and extraction service.

* Solution architecture for large-scale crawling and data extraction: design, hardware and development effort estimation, writing proposal drafts, and explaining and motivating the solution to customers.

* Implementation and troubleshooting of Apache Kafka applications: workers, hardware estimation, performance tuning, and debugging.

* Interaction with data science engineers and customers.

* Careful coding for critical production environments, along with good communication and learning skills.

Requirements:

* Experience building at least one large-scale data processing system or high-load service, including an understanding of the CPU and memory costs a particular piece of code incurs.

* Good knowledge of Python.

* Experience with a distributed messaging system (RabbitMQ, Kafka, ZeroMQ, etc.).

* Docker container basics.

* Linux knowledge.

* Good communication skills in English.

* An understanding of the possible ways to solve a problem, and the ability to choose wisely between a quick hotfix, a long-term solution, and a design change.

Bonus points for:

* Kafka Streams and microservices based on Apache Kafka: understanding Kafka's message delivery semantics and how to achieve them in practice (see the sketch after this list).

* HBase: the data model, selecting access patterns, and maintenance processes.

* Understanding of how the web works: research on link structure and the major components of link graphs.

* A background in algorithms and data structures.

* Experience with web data processing tasks: web crawling, finding similar items, mining data streams, link analysis, etc.

* Experience with microservices.

* Experience with the JVM.

* Open source activity.
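On the delivery-semantics point above: one common way to get at-least-once processing with the kafka-python client is to disable auto-commit and commit offsets only after a message has been fully handled, so a crash mid-processing leads to redelivery rather than data loss. A minimal sketch, with a hypothetical topic and handler:

import json

from kafka import KafkaConsumer  # assumes the kafka-python package

def handle(record):
    # Hypothetical side effect, e.g. writing the record to HBase.
    print(record)

consumer = KafkaConsumer(
    "extracted-articles",                 # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="hbase-writers",
    enable_auto_commit=False,             # we commit manually, below
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    handle(message.value)
    consumer.commit()  # commit the offset only after handle() succeeds

The trade-off is possible duplicates after a restart, so the handler should be idempotent.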

See more jobs at Scrapinghub

Apply for this Job

👉 Please mention that you found the job on Remote OK as a thank-you to us; this helps us get more companies to post here!

When applying for jobs, you should NEVER have to pay to apply. That is a scam! Always verify you're actually talking to the company in the job post and not an imposter. Scams in remote work are rampant; be careful! When clicking the apply button above, you will leave Remote OK and go to that company's job application page outside this site. Remote OK accepts no liability or responsibility for any reliance upon information on external sites or on this one.