Site Reliability Engineer

Your role within our Kingdom
Our job is to build effective, stable, and reliable large-scale infrastructure tools and services for our games and product teams, so that they can focus on creating great games. We strive to empower developer teams to be autonomous and flexible, and we continuously work toward a self-service model for our tech by collaborating closely with development teams throughout the full product life cycle.

We engineer and provide the shared infrastructure serving all of our games, as well as the developer environments and supporting tech like observability, log management, and event transport. This includes everything from working in the data centers and writing orchestration and automation for our production stack to troubleshooting distributed systems and resolving production incidents.

We are currently at the beginning of multiple projects redefining our infrastructure. Among other things, we have major efforts underway to modernize our platform as well as all supporting software and orchestration.

The Application team is responsible for designing, building, and maintaining the production applications' infrastructure. These applications serve billions of data objects representing game states, messaging, and much more, and they constantly handle hundreds of thousands of requests per second with low-millisecond response times. As part of the team, you will write software to support and automate our infrastructure as well as manage and plan our environment, working in close collaboration with the rest of the Infrastructure Engineering organization and backend developer teams.

You will, among other things:
- Develop and maintain utilities and libraries supporting our distributed infrastructure, working with technologies like MySQL, Cassandra, Kafka, OpenTSDB, and so on
- Build automation and improve the resilience of the systems serving our games
- Evaluate hardware and software, run benchmarks, and perform capacity planning for existing and future deployments
- Do performance analysis, optimization, and workload characterization to minimize the resource utilization and cost of our backend
- Work closely with other teams on incident resolution and on proactively strengthening King's site reliability
- Create and maintain our deployment pipelines
- Provide subject matter expertise for our technologies and systems to stakeholders
- Take part in troubleshooting, incident management, and on-call duty
Skills to create thrills
- Comfortable working in a Linux computing environment
- Strong development skills in Python, and some knowledge of Java, Perl, SQL, or similar
- Experience automating and orchestrating distributed systems as well as creating internal tools such as backup management or metrics collection
- Interest or experience in:
  - Database technologies like MySQL, Cassandra, HDFS/Hadoop, etc.
  - Monitoring systems like OpenTSDB, InfluxDB, Graphite, etc.
  - Log management systems like Graylog, the ELK stack, etc.
  - Orchestration frameworks like Ansible, Salt, etc.
- Familiarity with Linux performance tools
- Good communication skills