
DevOps Engineer with Incident Management Experience
Location: US-PA-Philadelphia
Jobcode: 3554056

Role and Responsibilities:

- Facilitate end-to-end (E2E) coordination of critical incidents occurring in the client environment (business applications and infrastructure).
- Perform application-level and server-level monitoring 24/7 using multiple tools.
- Build a proactive monitoring setup for all environments, with self-healing for repeat incidents, in Nagios, Op5, Splunk, and Grafana.
- Initiate bridge calls and coordinate with the resolver groups until the issue is resolved.
- Provide senior-level troubleshooting and coordinate technical restoration actions and plans for Major Incidents, P1 and P2 incidents in a multi-supplier ecosystem.
- Ensure outages are driven end to end by maintaining authority during technical bridges, for faster and successful resolution.
- Perform Root Cause Analysis (RCA) for domain-related incidents; create and maintain recovery playbooks and Standard Operating Procedures (SOPs) for commonly occurring customer patterns and issues.
- Liaise with High Priority Incident Manager counterparts and stay current with industry trends to develop and promote best practice.
- Use GSH, Puppet, and Ansible for testing, triaging, and fixing bugs in lower environments, with frequent feedback from production infrastructure and applications.
- Proactively monitor SLA performance and report on it accurately.
- Ensure MTTD, MTTT, and MTTR targets are met, and develop ways to improve product quality using Splunk and Grafana.
- Work on user stories in sprints and interact with stakeholders to ensure business requirements are met.
- Participate in tabletop and simulated war games: monitor the affected DC, fail over to the redundant site, perform health checks, note any issues observed during the war game, coordinate with other teams, and communicate with stakeholders.
- Adhere to all best practices with respect to ITSM/ITIL standards.

Qualifications and Education Requirements:

Must Have: BE/B.Tech, MCA, BSc (IT), BCA

Preferred Skills:

- 7+ years of experience managing server-level incidents and running incident management programs, preferably in large-scale environments.
- 5+ years of experience with DevOps tools such as Ansible, Kubernetes, Puppet, Chef, Jenkins, Docker, SVN, and Git to integrate automation and manage various applications.
- Expert in Linux system administration; must be RHEL certified.
- Experience creating and modifying Python, Ruby, and shell scripts for various application-level tasks.
- Experience designing and implementing servers on the OpenStack platform through Terraform.
- Experience working with JIRA and ServiceNow to plan, track, support, and close requests, tickets, and incidents.
- Working experience installing and configuring monitoring tools such as Splunk, Kibana, Grafana, OP5, and Prometheus for different environments.
- Experience working in a sprint-based environment.

Additional Notes:

- Outstanding written and verbal communication and presentation skills.
- Excellent listening skills and a high degree of empathy.
- Good analytical and problem-solving skills to troubleshoot system problems and analyze complex architectural environments.
- Expert in ITSM and ITIL; certification is an added advantage.

Infosight Consulting Inc

