Hammersmith and Fulham, London
£450 - £550 per day
3 months ago
ASAP within 2-4 weeks
Web Systems Reliability Engineer (Web Platform Support / Linux Administration / DevOps). This position is for an experienced Systems Reliability Engineer to help create, build, and operate hosting platforms, automation frameworks, and web applications that deliver amazing digital experiences. Primary responsibilities include designing, building, maintaining, and supporting Java-based web application systems for a large-scale enterprise production environment that hosts a variety of online properties for this global media company based in west London.
The Senior Systems Reliability Engineer is expected to have expert-level systems administration skills on Linux and extensive experience supporting Java-based web application platforms and development environments (knowledge of Java and Java components is required). The role will also help provide source control management, cloud hosting, and container computing, grow the DevOps and automation culture, and migrate applications to the public/private cloud.
Develop and maintain scalable, resilient on-premises and cloud-based infrastructure using an Infrastructure as Code approach
Work with delivery teams to build and maintain continuous deployment practices
Contribute to platform architecture and design and drive continuous improvement
Work within and support a DevOps culture
- Recognized as a subject matter expert on Linux OS including OS performance monitoring, setup, configuration, tuning, and troubleshooting
- Strong experience with Java-based web server and application server technology, including setup, configuration, performance monitoring, tuning, resiliency, clustering, and debugging (e.g. JConsole). Although there will be no actual coding involved, you will need to be able to understand Java and Java components.
- Experience in public and private cloud hosting services (AWS, Google Cloud, Azure, OpenStack, CloudStack) as well as familiarity with container computing (e.g. Docker, Mesos, Kubernetes).
- Software Development Continuous Integration (CI) Pipeline knowledge (Jenkins)
- Experience with Source Control Management systems (Git)
- Configuration management experience (e.g. Chef, Puppet, Ansible)
- Proficient in web or webserver technologies: PHP, Node.js, Tomcat, IIS, Apache (IHS), MySQL, Oracle, MSSQL, etc., including being able to perform basic setup, configuration, and troubleshooting.
- Expert on HTTP(S), TCP/IP, SNMP and DNS.
- Understand internet technologies and network protocols, including HTTP, basic load balancing configurations, security zones, VIPs, etc.
- Able to author tools and scripts that others can use to automate repeatable production tasks, in standard languages such as Bash, Python, Go, csh, batch, or VBScript.
- Other development skills may also be useful (Python, PHP, Ruby, Java, Go, Swift, or C++), along with the ability to build unit test suites for all software being developed.
- Demonstrates exceptional troubleshooting methodology, including the ability to author new methodologies and teach them to the SE team.
- Demonstrates the ability to independently triage moderately complex incidents.
- Independently resolves moderately to highly complex system and application incidents.
- Able to identify and propose system and application fixes for performance bottlenecks.
This position will also bring expertise in systems, operational excellence, application stability, security, performance, capacity management, and documentation. This position works closely with various application teams to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. The role requires someone who is creative, proactive, constructive, and highly motivated. The Senior Systems Reliability Engineer must be prepared to work in an extremely collaborative and high-energy environment.