Software Engineer II: Data Pipeline

Location(s)

US - CA - San Francisco, US - Remote

Team(s)

Product & Engineering


Rapid7's tCell product is a next-generation Cloud WAF (RASP) that provides runtime security for web applications by gathering application security data via in-app instrumentation and enforcing adaptive policies to stop a broad spectrum of attack types, including zero-days. The solution is driven by large-scale collection of application behavior data, which is analyzed for threats and anomalies. Security teams can then respond quickly to incidents and continuously improve their security posture by applying protective policies that prevent application compromise.

Job Overview:

Rapid7's tCell RASP/NG-WAF backend consists of several distributed data services that power the application protection solution: a streaming data ingest pipeline, low-latency queries across large volumes of time-series data, notification/alert handling, and analytics services that include machine learning. In this role, you'll own one or more key services from design through delivery. If you enjoy building challenging data services in an elegant, resilient, highly scalable, and performant way, this job is for you.
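
To give a concrete flavor of the work, here is a minimal sketch of a stream-processing stage in Scala with Kafka Streams, two of the technologies named under Requirements. It windows and counts per-application agent events, roughly the shape of an ingest-pipeline aggregation feeding downstream analytics. The topic names, key/value types, and windowing policy are illustrative assumptions, not tCell's actual pipeline.

// A minimal Kafka Streams topology sketch (Scala DSL). All names below
// (topics, application id, types) are hypothetical, for illustration only.
import java.time.Duration
import java.util.Properties

import org.apache.kafka.streams.kstream.TimeWindows
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

object AgentEventPipeline extends App {
  val props = new Properties()
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "agent-event-pipeline")
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

  val builder = new StreamsBuilder()

  // Count events per application over one-minute tumbling windows,
  // the kind of aggregate that downstream anomaly detection might consume.
  builder
    .stream[String, String]("agent-events")   // key: app id, value: raw event JSON
    .groupByKey
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
    .count()
    .toStream
    .map((windowedAppId, count) => (windowedAppId.key, count.toString))
    .to("app-event-counts")                   // read by alerting/analytics services

  val streams = new KafkaStreams(builder.build(), props)
  streams.start()
  sys.addShutdownHook(streams.close())
}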

Responsibilities: 

  • Design and implement data pipeline processing functionality, either by extending existing systems or by implementing new ones as needed. This includes data collection and outbound data transfer.

  • Analyze and improve the efficiency, scalability, and reliability of our backend systems.

  • Work as a member of a customer-focused team that is responsive to customers' immediate and long-term business needs.

  • Write automated tests that comprehensively exercise functionality for correctness and performance, including clarifying requirements, scope, and limits.

  • Collaborate with other team members to bring innovative solutions to our customers' most challenging problems.

  • Ensure services run resiliently in a cloud environment built on infrastructure as code and CI/CD.

Requirements:

  • A passion for innovative and clean solutions to challenging problems at scale

  • 5+ years of industry experience with a track record of success in distributed systems

  • Excellent understanding of the design and tooling of distributed data streaming and processing systems

  • Experience and expertise in functional programming (e.g., Scala, Clojure, or Java with lambdas)

  • Prior experience with Spark Streaming, Kafka Streams, Flink, or similar streaming technologies. Experience with data/workflow engines is a plus.

  • Prior experience with Druid, Elastic, Cassandra, or similar data stores

  • Comfort leveraging cloud infrastructure such as AWS (including Lambda), Docker, Kubernetes, and Terraform.

  • Demonstrated ability to develop software involving caching, queuing, concurrency, and network programming

  • Ability to quickly pick up new languages and frameworks and adapt them effectively to the needs of the product

  • A knack for taking high-level requirements and exercising excellent technical judgment to deliver high-quality functionality under tight schedules

  • Bachelor's degree (or higher) in Computer Science, or equivalent experience (Master's a plus).

#LI-REMOTE

#LI-LS1