Strategic Reliability Engineer - Capital One
  • York, Pennsylvania, United States of America
  • via Bebee.com
Job Description

At Capital One, we're seeking a seasoned Strategic Reliability Engineer to drive the design, development, and implementation of technical solutions that improve system reliability. As a key member of our team, you'll work closely with Agile teams to identify and prioritize opportunities for process improvements.

Key responsibilities include:
  • Collaborating with Agile teams to design, develop, and implement technical solutions that enhance system reliability.
  • Communicating Service Level Objective concepts to product partners and driving agreement on objectives.
  • Identifying strategic opportunities to improve reliability and influencing the team's direction.
  • Developing and implementing processes or solutions that improve reliability across multiple platforms.
  • Identifying gaps in automation and developing strategic plans to reduce toil for platform teams.
  • Working with experts to arrive at optimal design and deployment configurations.
  • Establishing standards that improve deployment and system reliability for integration pipelines and recommending approaches for chaos testing.
  • Identifying and developing proactive, automated approaches for system reliability and alerting, as well as identifying key performance indicators for a system.
  • Translating business requirements into implementations such as scaling, failover, timeouts, and health checks, and working with development teams to test and improve system performance and reliability.

Required Qualifications:
  • Bachelor's Degree.
  • At least 4 years of professional software engineering experience.
  • At least 1 year of experience with cloud computing (AWS, Microsoft Azure, Google Cloud).

Preferred Qualifications:
  • Master's Degree.
  • 7+ years of experience in at least one of the following: Java, Scala, Python, Go, or Node.js.
  • 2+ years of experience with AWS, Google Cloud Platform, Azure, or another cloud service.
  • 4+ years of experience in open source frameworks.
  • 1+ years of people management experience.
  • 2+ years of experience in Agile practices.
  • 2+ years of experience with blameless incident reviews and post-incident responses.
  • 2+ years of experience with secure coding practices.
  • 2+ years of experience in creating release documentation.
  • 2+ years of experience in logging technologies (log4j configuration, Splunk).
  • 2+ years of experience in resilient system architecture patterns (Microservices Architecture, Layered Architecture, Event-Driven Architecture).

Capital One is committed to diversity and inclusion in the workplace.
