Site Reliability Engineer in St. Louis, MO at HUNTER Technical Resources

Date Posted: 9/23/2019

Job Snapshot

  • Employee Type:
    Full-Time
  • Location:
    St. Louis, MO
  • Job Type:
  • Experience:
    At least 7 years
  • Date Posted:
    9/23/2019
  • Job ID:
    4678207

Job Description


Job Summary:

Looking for a highly motivated Site Reliability Engineer capable of building and running large-scale, massively distributed, fault-tolerant systems. The individual will work with teams across the organization to ensure the reliability of core services and to monitor capacity and performance.
  • Conduct blameless postmortems and proactively identify potential outages, feeding both into iterative improvement.
  • Experience designing and deploying large-scale, multi-data-center web applications.
  • Work closely with dev and ops teams to build highly available, cost-effective systems.
  • Create new tools and scripts designed for auto-remediation of incidents. 
  • Design and implement Big Data technologies, including Hadoop, MongoDB, Kafka, RabbitMQ, ZooKeeper, Spark, ELK, etc.
  • Establish end-to-end monitoring and alerting on all critical aspects of all systems to ensure SLAs are met and to provide proactive notification of possible issues.
  • Design platforms to meet extremely high uptime targets.
  • Works well independently and requires little or no supervision. 
  • Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot issues.
  • Fully understand the application and its microservice interactions.
  • Design and implement containers/applications in scalable HA/DR multi-tier cloud environments, including new system design, documentation, implementation, and deployment.
  • Participate in a 24x7 on-call rotation.

Job Requirements: 
  • 7+ years of experience in the following areas: 
    • Experience providing 24x7 L4 technical support for production.
    • Strong experience in production support and operations. 
    • Design and implementation of network and presentation tier technologies, including F5, Apache, Nginx, etc.
    • Experience in performance testing, tuning, and monitoring, maximizing system uptime and availability, and ensuring functional and performance SLAs are met.
    • Experience monitoring application and infrastructure performance and availability.
    • Automation experience with build/deployment, software configuration, continuous integration, continuous delivery, and release engineering tasks in JavaEE/C++ environments.
    • Experience automating manual processes using Python, Ruby, Unix shell (bash, ksh), Perl, Ant, etc.
    • Installing, configuring, administering, and tuning JavaEE application servers/containers such as Tomcat, WebSphere, etc.
    • Installing, maintaining, and administering software on Unix, Linux, and Windows servers.
    • Experience with Web service technologies, including REST, SOAP, JSON, and XML.
    • Experience with Cloud Platforms and virtualization Technologies. 
    • Deploying and automating infrastructure/applications in cloud environments using Chef, RPM, etc.
    • Working closely with Development, QA, Product Management, and Production Ops teams to ensure product releases ship on time and with quality.
    • Hands-on experience configuring and administering SCM (Git, SVN), build (CMake, Makefiles, Maven), CI (Jenkins), and CD automation tools.
    • Experience with database (RDBMS, NoSQL) technologies is a plus.
    • Experience with Performance Testing is a plus. 
    • Configuring and maintaining SDLC Environments. 
    • Experience in Agile Methodologies and processes. 
    • Strong automation and problem-solving skills, and the ability to follow through to completion.
    • Demonstrated leadership skills through a variety of activities, including leading or mentoring technical staff.
    • Strong verbal/written communication skills. 
    • Participate in a 24x7 on-call rotation.