Staff Software Engineer, Compute Infrastructure
San Francisco, United States
Airbnb is a mission-driven company dedicated to helping create a world where anyone can belong anywhere. It takes a unified team committed to our core values to achieve this goal. Airbnb's various functions embody the company's innovative spirit, and our fast-moving team is committed to leading as a 21st-century company.
The Compute Infrastructure team builds and maintains the Kubernetes-based infrastructure that powers the Airbnb site. As a software engineer on this team, you’ll work on one of the largest end-user Kubernetes deployments in the community, which runs a variety of production-facing workloads and integrates with almost all other infrastructure systems at Airbnb. With users around the world, reliability, scalability, efficiency and high availability are our team’s core concerns.
Airbnb is a member of the Cloud Native Computing Foundation’s end-user community and regularly meets with peer companies to discuss cloud native engineering challenges at scale.
What you'll do:
- Lead development of OneTouch, an internal PaaS that provides an easy integration point for Airbnb’s product teams to deploy and manage their services.
- Manage and deploy Kubernetes infrastructure on top of the public cloud.
- Deliver secure, efficient, mature and highly available frameworks and platforms that abstract away infrastructure complexity.
- Optimize existing systems/services to improve performance and efficiency.
- Systematically improve availability by applying industry and distributed systems best practices.
- Work with other talented and friendly infrastructure engineers to build the foundation for Airbnb’s technical growth over the next decade.
What you'll bring:
- 8+ years of relevant software development industry experience in a fast-paced, high-growth tech environment.
- Experience leading and shipping large-scope technical projects in collaboration with multiple experienced engineers.
- Expertise with a public cloud provider (AWS, GCP, Azure) and its infrastructure-as-a-service offerings (e.g. EC2).
- Experience running large-scale container orchestration systems (e.g. Kubernetes, Borg, Tupperware, Mesos) or high-performance computing.
- Familiarity with Linux system administration and container technology.
- You are passionate about building and using great developer tools.
- You are a full-cycle developer: strong ownership and experience building and operating high-scale, distributed systems across the full software life cycle.
- You have excellent communication skills and the ability to work well within a team and across engineering teams.
- You are a strong problem solver and have solid production debugging skills.
- You are passionate about efficiency, availability and system quality.
What we’ve done:
- Introduced a Kubernetes-backed compute platform to the company and led the migration of hundreds of new and existing services to Kubernetes, helping to deprecate Airbnb’s monolith.
- Built out the first phase of our next-generation service mesh using Envoy and Istio to connect all of Airbnb’s services.
- Rolled out auto-scaling across all of Airbnb’s Kubernetes clusters to provide elasticity and improved efficiency.
- Built tooling to allow users to deploy microservices to production with a simple set of commands.
- Shipped a distributed counting service, one of our highest-throughput services, that powers other applications at Airbnb.
- Delivered a framework for cross-region data backup and recovery.