Position Overview:
The Operations Engineer is responsible for providing technical expertise, education, and support to ensure the highest level of reliability and availability for critical applications.
This role focuses on systems engineering to deploy high-quality solutions and address operations problems. You take an organic approach to our systems, stay focused when production services are down, and work as part of a team when needed to bring them back up.
Responsible for maintaining and improving the quality, availability, and security of system operations.
Partners with the development team to efficiently improve the quality, availability, and security of technical platforms.
Works individually and with teams to drive reliability goals and objectives across platforms.
Assists in the development and deployment of solutions to increase service stability through automation and process re-engineering.
Analyzes service performance to improve quality and customer experience.
Evaluates tools and solutions to ensure that consistent processes and repetitive tasks are performed with greater accuracy and fewer defects.
Evaluates and advises on recovery tooling to adhere to enterprise standards.
Introduces new and impactful technologies to the production support tool chain that minimize friction in production releases and support and help diagnose and recover from production incidents more quickly.
Provides regular reports in a timely manner, including a weekly activity report.
Other activities assigned by management as needed.
Bachelor’s Degree or equivalent experience.
5+ years of production-level AWS experience required.
Strong scripting background, with a preference for cloud-native networking.
Expert in installing and administering a mainstream Linux operating system (Amazon Linux or another Red Hat derivative preferred).
Ability to work well individually and with a team in a fast-paced environment, with quick learning and excellent problem-solving skills.
7 years of experience in a production-level operational role such as Linux admin, systems admin, DevOps engineer, or cloud engineer.
Experience with open-source Puppet, Ansible, or another configuration management suite.
Ability to anticipate, identify, and resolve customer communication problems.
Excellent oral and written communication in English and the ability to communicate effectively with people of diverse backgrounds and across levels of the organization.
Excellent attention to detail and ability to work independently with minimal supervision.
IoT industry experience is desired, but not required.
Ability to be on-call.
Preferred Knowledge:
Experience with Nagios / Check_MK monitoring systems; MongoDB / Elasticsearch; Graylog; StatsD / Graphite / Grafana; Git version control.