Incident Commander Customer Technology (Chaos Testing) - Amsterdam area
As Incident Commander within the Customer Technology Platform, you are responsible for the rapid resolution of major incidents, meaning the mitigation of impact for customers and/or the business. In addition, you develop dashboards and/or automation to make the processes easier and faster. Thanks to your input, everyone in the department is aware of the up-to-date processes, and you ensure timely communication when something is broken. You also ensure that these processes are supported by tools and technology that you can develop yourself.
As an Incident Commander you make your presence felt by:
- Continuously improving the Incident Command processes and spreading this knowledge by providing training, with the aim of making each incident a learning moment so that it does not recur or can be solved even faster.
- Being able to independently manage major incidents and, together with involved resolution teams, ensure minimal impact for customers and business.
- Running disaster recovery drills and implementing chaos testing.
- Initiating continuous improvements to solve incidents structurally or to enable us to identify incidents before any impact occurs.
Our client works with the latest technologies and innovative solutions such as Kubernetes, CI/CD, GitHub Actions, etc. The entire platform runs in Azure and is a state-of-the-art, scalable platform. There is an open development culture with room for your ideas: you get ownership, and we expect you to take initiative. You are truly a spider in the web and deal with different departments and levels within the organization. Chaos Engineering is a subject we want to develop further. The ideal candidate has knowledge of and/or experience in Chaos Engineering and wants to share their enthusiasm with the rest of the department.
We work daily on our client's website, the iOS/Android app, and more. Within the Enabling cluster, we work on various matters to ensure that product teams can focus primarily on delivering customer value. Our focus is mainly on automation. The team makes sure that if something breaks, we quickly fix it together. We hate it when customers or our own business are impacted by a disruption. There is a no-blame culture: we roll up our sleeves and fix the problem. Afterwards, we look back together to see where we can improve.
To maximize your impact for customers, it is important that you:
- Have a technical higher professional education or master's degree, preferably in Information Technology.
- Have experience with Grafana, Chaos Monkey, Gremlin, and LitmusChaos, and knowledge of Azure and Chaos Engineering principles.
- Have about 4-6 years of relevant work experience in which you have proven your added value in roles such as Incident Manager, Incident Commander, or DevOps Engineer within a complex technology environment.
- Have deep knowledge of Incident management (ITIL).
- Preferably have knowledge of OpsGenie and ServiceNow.
- Are used to working with time-critical deadlines and are able to evaluate and improve processes.
- Have great communication skills in English and/or Dutch.