Engineer - Reliability Management
Brand: Best Buy

Best Buy IT has a key position available within our Reliability Engineering team, ensuring the health and stability of our critical applications and infrastructure services.

The Engineer-Reliability Management role will be responsible for designing, developing, maintaining, and tuning availability and performance monitoring solutions.
Leveraging deep technical experience, this role will select, gather, and aggregate key metrics used to support technology optimization while driving toward a quantitative, metric-driven, business-relevant global IT organization.

This position will serve on a team of performance-focused technology professionals, cooperatively monitoring and managing the metrics and KPIs that enable world-class IT solutions to be delivered. The ideal candidate will possess demonstrated experience in analyzing data to diagnose complex issues and will communicate effectively with stakeholders who have various levels of technical expertise. Creativity in design and implementation decisions is critical to the success of this team. The ideal candidate will be challenged to think in new ways and solve engineering challenges with the focus of ensuring Best Buy applications are designed to be resilient and reliable.

Key position responsibilities include:
  • Technology thought leadership for a global Reliability Engineering service offering across a highly complex, highly dynamic technology landscape.
  • Partnering with product and engineering teams to implement end-to-end availability and performance telemetry capture, automated problem identification, and proactive notifications.
  • Providing DevOps support across Reliability Engineering dynamic service offerings.
  • Developing, publishing, and implementing Reliability Engineering standards.

Minimum Requirements:
  • 6+ years of IT experience working in a related discipline such as application monitoring, performance engineering, application support, or system administration
  • 1+ years of experience with enterprise-class monitoring solutions such as Dynatrace
  • 1+ years of experience with log analytics solutions such as Splunk or Elastic Stack
  • 1+ years of experience with open-source tools such as Prometheus, InfluxDB, Graphite, or Grafana

Preferred Qualifications:
  • Bachelor’s degree in Computer Information Systems, Engineering, Management Information Systems, or Computer Science
  • Experience with enterprise-scale APM (Application Performance Monitoring) solutions
  • Experience with enterprise-scale DEM (Digital Experience Monitoring) solutions
  • Experience with scripting languages (e.g. Python, Perl, PowerShell)
  • System Administration experience with Windows or UNIX / Linux operating systems
  • Experience with cloud-based monitoring solutions such as Amazon CloudWatch or Google Stackdriver
  • Experience using configuration management and orchestration tools like Puppet and Chef
  • Experience with public and hybrid cloud technologies (AWS, GCP and Azure)
  • Experience with open-source technologies and big data platforms
  • Able to communicate effectively with stakeholders across functional areas with various levels of technical experience
  • Conceptual understanding of Agile and DevOps culture
  • Experience implementing CI/CD and DevOps philosophies and working with infrastructure-as-code

Auto Req. ID: 769519BR
Employment Category: Digital & Information Technology
Job Level: Manager without Direct Reports
Location Number: 950500-105-Res Eng Reliability & Qual
Address: 7601 Penn Avenue South