Job Description
Who We Are
Fulcrum Digital is a leading IT services and business platform company. We partner with global companies from diverse industries, including banking and financial services, insurance, higher education, food services, retail, manufacturing, and eCommerce. With expertise in digital transformation, machine learning, and emerging technologies, we offer a consulting-led, integrated suite of enterprise-grade software products, services, and solutions.
Roles & Responsibilities:
- Business Operations is responsible for ensuring the platform's production readiness. This is accomplished by closely partnering with developers to design, build, implement, and support technology services.
- A Business Operations engineer will ensure operational criteria like system availability, capacity, performance, monitoring, self-healing, and deployment automation are implemented throughout the delivery process. Business Operations plays a key role in leading the DevOps transformation at Fulcrum's client through our tooling and by being an advocate for change and standards throughout the development, quality, release, and product organizations.
- We accomplish this transformation by supporting daily operations with a hyper-focus on triage and then root cause, and by understanding the business impact of our products. The goal of every team member is to shift left: to be more proactive and upfront in the development process, and to proactively manage production and change activities to maximize customer experience and increase the overall value of supported applications.
- SRE/DevOps teams also focus on risk management by tying all our activities together with an overarching responsibility for compliance and risk mitigation across all our environments. Business Operations also focuses on streamlining and standardizing traditional application-specific support activities and centralizing points of interaction for both internal and external partners by communicating effectively with all key stakeholders.
- Ultimately, the role of SRE is to align product and customer-focused priorities with operational needs. We regularly review our run state, not only from an internal perspective but also by understanding and providing the feedback loop to our development partners on how we can improve the customer experience of our applications.
Skills Required:
Writing day-to-day Unix scripts for application maintenance.
RDBMS knowledge and writing SQL queries.
Knowledge of the ITIL framework: Incident Management, Change Management, and Problem Management.
Excellent written and verbal communication skills and the ability to collaborate with functional and business users.
Systematic problem-solving approach coupled with strong communication skills and a sense of ownership and drive.
Should be able to work on an application support project in rotational shifts (including weekend work and on-call).
Ability to troubleshoot, debug, and analyze application server logs and automate routine tasks.
Experience in dealing with different stakeholders in difficult situations and making decisions with a sense of urgency.
Experience in working with development, operations, and product teams to prioritize defects/issues.
Good to have experience in industry-standard CI/CD tools like Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef.
Good to have experience in writing Splunk queries and creating Splunk dashboards and reports.
Good to have programming experience, e.g. Core Java.
Good to have knowledge of the Payments/Finance domain.
Requirements
Looking for a Site Reliability Engineer who can help us solve problems, build our CI/CD pipeline, and lead in DevOps automation and best practices.
Are you a born problem solver who loves to figure out how something works?
Are you a CI/CD geek who loves all things automation?
Do you have a low tolerance for manual work and look to automate everything you can?
Business Operations is leading the DevOps transformation through our tooling and by being an advocate for change and standards throughout the development, quality, release, and product organizations. We need team members with an appetite for change and for pushing the boundaries of what can be done with automation. Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
Location: St. Louis, MO (Hybrid)
Employment Type: Contract
MQ/EB:
- Understanding of event-driven architectures and distributed systems: how clusters are formed, quorum management, and failure handling.
- 3 to 5 years of hands-on experience with MQ, NATS broker, or similar messaging solutions.
- An understanding of Kafka clustering would be good to have.
- Knowledge of client-server communication aspects: sockets, TLS protocol, etc.
- Understanding of the concept of regions and AZs.

Production Support:
- Provide L2 support for production systems such as application, database, middleware, infrastructure, and network components.
- Manage production incidents end-to-end within defined SLAs, with a focus on resolution rather than on who caused the incident.
- Interact with various stakeholders such as release managers, program leads, service managers, and development and test leads.
- Review operational readiness requirements such as monitoring and alerting, log rotation, and resilience of the components, and report the gaps.
- Provide pre-implementation support with activities such as release notes review and implementation dry runs.
- Protect production components by running health checks and monitoring latency and memory utilization.
- Automate day-to-day activities and propose changes that improve reliability.
- Participate in CAB and provide feedback on change requests.
- Support the DevOps team in testing the promoted pipelines and suggest automation of configuration items.
- Practice incident management best practices and perform RCA.
- Participate in disaster recovery tests and operational acceptance tests.
- Analyze the technology stack that makes up the product and optimize the recovery time objective.
- Work with team members spread across time zones.
- Share knowledge, document improvements, and mentor junior resources.

CI/CD and Deployment Automation:
- It is good to have skills in using Jenkins to orchestrate builds and link to Sonar, Maven, etc. to build out the CI/CD pipeline.
- Support deployments of code into multiple lower environments.
- Support current processes as needed, with an emphasis on automating everything as soon as possible.
- It is good to have the skills to design, implement, and enhance our deployment automation based on Chef.
- We need proven experience designing and implementing an overall release and deployment process.
- It is good to have the skills to design and implement a Git-based code management strategy that will support multiple environment deployments in parallel.
- Experience with automation for branch management, code promotions, and version management.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Deployments to MTF/Prod, maintenance items (including stop/start, disaster recovery-related activities, etc.), and CRs for changes in MTF/Prod.
- Good knowledge of Nginx.

Tools:
- Log monitoring tool: Splunk
- Application monitoring tool: Dynatrace
- Ticketing and incident/problem management tool: Remedy
- DevOps basics: CI/CD basics; overview of Git, Bitbucket, SonarQube, Ansible/Chef

Skills:
- Linux and shell scripting
- ITIL / ITSM
- PL/SQL
- Troubleshooting
- Jenkins (CI/CD)
- Groovy scripting / YAML
- Ansible/Chef
- Nginx
- Java / JEE
- Event-driven architectures
- MQ, NATS broker, or similar messaging solutions
- Kafka
- Client-server communication aspects: sockets, TLS protocol
- Understanding of the concept of regions and AZs
Job Tags
Full time, Contract work, Shift work, Weekend work