Senior Reliability Engineer (Remote)

About the Role

As Senior Reliability Engineer, you will ensure the resilience and availability of Kohl’s systems and applications. You will collaborate closely with development teams, contribute to architectural designs, conduct risk assessments, design for failure, and implement robust monitoring and failover mechanisms.

What You’ll Do

• Drive error budget and Service Level Objective (SLO) adoption across products
• Drive incident response efforts, perform root cause analysis, and implement preventative measures to enhance system reliability
• Establish consistent practices that elevate Kohl’s operational excellence through automation and process improvements
• Follow the software lifecycle and drive reliability, observability, and efficiency across product teams within an assigned domain
• Identify repeated toil and find opportunities for automation and risk reduction
• Participate in an on-call rotation to respond to production incidents, and conduct blameless retros and root-cause analyses (RCAs) to drive a culture of continuous improvement
• Proactively identify failures before they cause outages, using techniques such as chaos engineering, edge-case and failure-mode analysis, and design reviews
• Advise on capacity planning and provide continuous assessments of system behavior and consumption
• Work with product managers to identify and prioritize work on reliability best practices (e.g., leveraging SLIs, SLOs, and error budgets)
• Mentor and assist engineers on the team
• Additional tasks may be assigned

What Skills You Have

Required

• Bachelor's degree or equivalent in MIS, Computer Science, or a related field
• 4+ years of experience in software development
• Strong programming skills in one or more languages (Java, Python, Go, or Node.js)
• In-depth knowledge of systems architecture, operating system internals, and network fundamentals
• In-depth knowledge of application design patterns, event-driven architecture, database schemas, and testing strategies
• Experience with multi-region application troubleshooting and performance tuning
• Working experience with one cloud platform (GCP, AWS, or Azure)
• Working experience with monitoring techniques and tools (e.g., CloudWatch, Grafana, Prometheus, OpenTelemetry, tracing)

Preferred

• Experience supporting Big Data platforms
• Data analytics experience with tools like Tableau and Qlik
• In-depth knowledge of containerization and container orchestration (e.g., Docker, Kubernetes, Rancher)
• Experience with one or more configuration management systems (e.g., Chef, Ansible, Puppet)
• Passion for and experience with AI and ML methodologies (MLOps)
• Experience writing infrastructure as code (e.g., Terraform, OpenTofu)