Site Reliability Engineer
About The Position
Fabric makes profitable on-demand e-commerce a reality. Its flexible micro-fulfillment solution was specifically designed to enable fast fulfillment from small spaces. By leveraging robotic automation, Fabric allows retailers to reduce costs and cut fulfillment times. Unlike any other micro-fulfillment solution, Fabric’s software-led robotics and modular approach give retailers the flexibility to build the fulfillment center that fits their requirements, allowing them to fulfill online orders at maximum speed while ensuring profitability. Retailers can choose a platform model to run and operate independently on their real estate or a service model in which fulfillment is offered as a service, with minimal capex investment. Founded in 2015, Fabric has raised $138 million to date and is backed by Aleph, Corner Ventures, Canada Pension Plan Investment Board (CPPIB), Innovation Endeavors, La Maison, Playground Ventures, and Temasek. With offices in New York City and Tel Aviv, Fabric is rapidly expanding its U.S. operations with more than 170 team members globally and 15 sites under development/contract, including two live micro-fulfillment centers.
We are looking for a Site Reliability Engineer
We are looking for a technology leader with multidisciplinary technological skills across engineering, infrastructure, and methodologies to join our Software Engineering team in Tel Aviv.
Fabric Software Engineering is a team of brilliant and talented engineers, algorithms developers, data scientists, and researchers, all dedicated to building a robust, highly scalable robotic fulfillment network.
We're responsible for everything from orchestrating the robots' movement to handling orders and managing stock across multiple fulfillment centers. We solve challenging problems through collaboration across different roles and paradigms using cutting-edge technology. We have a diverse tech stack, with a backend organized in a microservices architecture that helps keep it flexible. We're investing in a DevOps culture, which means end-to-end ownership from supporting the feature design to monitoring the production deployment.
As our business grows, the team is developing fast as well, opening new opportunities for both professional and personal growth.
As a Site Reliability Engineer (SRE), you will design and implement solutions for the company’s robot production infrastructure, CI/CD pipelines, and observability for a business-critical robot production environment. You will write code and leverage managed solutions, open-source tools, and industry best practices to ensure that our robot production environment is highly available to serve our sites.
Responsibilities
- Design and implement solutions for robot CI/CD, including simulating new code to test potential improvements or issues.
- Provide observability solutions for our production deployment (including monitoring and logging) to other engineers, professional services, and operations teams at the sites.
- Ensure a highly reliable system across on-prem deployments.
- Support engineering teams performing development on IoT hardware components.
Requirements
- At least 3 years of engineering experience in Linux environments.
- Experience with Docker and with running containers in production environments.
- Experience with modern (containerized) CI/CD solutions.
- Experience with Python is an advantage.
- Experience with IoT is an advantage.
- Experience with Yocto is an advantage.
- An excellent collaborator with great communication skills, who can take part in a diverse cross-functional team.
The perfect candidate is someone who:
- Is thrilled when they solve complex infrastructure and architecture problems.
- Enjoys working with other team members and colleagues to solve infrastructure problems, build deployment pipelines, and stream data to the cloud.
- Is driven by the desire to make a big impact.
- Enjoys an enabling role, supporting the team’s velocity and practices.