Hammersmith and Fulham, London
£650 - £800 per day + IR35 uplift towards NI/Hol
4 months ago
1 year +
Our Global Enterprise Media Client is looking for a highly experienced Splunk subject matter expert / Sr. Splunk Systems Reliability Engineer with extensive experience designing, building and deploying Splunk for large enterprise environments (tens of terabytes per day). This role is based on site in West London.
This role will be the EMEA member of a small, specialist, globally distributed team responsible for designing, building and deploying a next-generation centralised Splunk platform for the whole enterprise. The platform will use the latest tools to extract infrastructure data from a vast variety of systems across the globe and transform it into insight for business decision making and automation. This is not any old data from any old business: you will be working for a world leader with exciting businesses and digital consumer-facing products including streaming media platforms, ecommerce, theme parks, TV and film entertainment.
-You will need to be an SME in the Splunk platform, including solution design, build, deployment and management, with approximately 5-7 years' experience, as this is a senior role.
-You will need to identify the platforms and extract data from a huge variety of sources, transform the data into insights, develop the dashboards, and understand and interpret the data.
-Expertise with Splunk and modules such as Enterprise Security
-Expertise with large-scale Splunk environments (multiple tens of TBs/day of ingest)
-This SRE is expected to have experience running a large-scale Splunk environment, in the tens of terabytes per day of ingest, with experience using the Enterprise Security module. The Sr SRE is also expected to have expert-level systems administration skills on Linux and Windows platforms, and must have experience with software development (e.g. Python, Go, Java), automation (Chef, Terraform, CloudFormation), cloud hosting (AWS, GCP and Azure), and the DevOps team culture.
The Splunk Systems Reliability Engineering (SRE) team helps elevate SRE practices within the enterprise, keeping Splunk available and performing within our SLAs. Additionally, this SRE will promote and onboard new technologies that work in conjunction with Splunk, solving complex automation and scaling issues with the company's next-generation Splunk environment.
-Bachelor of Science degree (ideally Masters) in computer science or a related field, or equivalent experience in technical operations and software engineering
Other skills you will need or use include:
-Expertise in multiple scripting languages and advanced skills in programming languages (e.g. Go, Python, Ruby, Dart, Node, Java or similar), with the ability to build test coverage for all software being developed.
-Systems administration skills on Linux and Windows platforms
-Networking skills and protocols (e.g. HTTP, TLS, SSH, DNS)
-Experience with Source Control Management systems (e.g. Git)
-Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure)
-Proficient with data technologies (e.g. NoSQL, MySQL, MongoDB, Redis, Elastic) including being able to perform basic setup, configuration, and troubleshooting.
-Able to implement existing base standards for new systems and/or applications for all of the following:
-Site/Systems monitoring and instrumentation
-Application monitoring and instrumentation
-System monitoring and instrumentation
-Resilience, performance & Telemetry data
-Able to diagnose simple to complex system and process problems.
-Able to perform and provide in depth analysis on load test runs against a moderately complex system.
-Demonstrate exceptional troubleshooting methodology, including the ability to author and instruct new methodologies to the SRE team.
-Independently resolve moderately to highly complex system and application incidents.
-Able to identify and propose system and application fixes for performance bottlenecks.
-Able to evaluate new application requirements for capacity and run-time best practices.
-Able to evaluate new system and/or infrastructure solutions for technical feasibility against known requirements and standards.
-Effective at dealing with change: able to transition in role or handle a significant modification or new technology with minimal ramp-up time and very little guidance.
Systems Reliability Engineers use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software with close business segment alignment to deliver platforms through efficient, effective and resilient architectures. SREs are talented engineers who are focused on improving quality through a data-driven approach: instrumentation, automation, and functional/unit testing.
This role will be inside IR35, but the rate will include an additional uplift towards Company NI/Holiday.