Site Reliability Engineer (SRE)
SabancıDx: the Future is Today! 🚀
Discover SabancıDx!
As SabancıDx, Turkey's leading technology company, we provide our clients with the innovative solutions they need in a competitive environment. We address all their cloud technology needs holistically and prepare them for future technological demands with our managed technology services and products developed for digital transformation.
Offering a single point of service for cloud solutions and managed services, we empower businesses with innovative technology solutions that enhance productivity and success. Through our AI-Powered Hybrid Cloud Solution Center, we combine global capabilities with local needs to deliver flexible and reliable solutions for our clients in Turkey and around the world.
Because the Future Is Today! 🌍 Learn more about our strong solutions that shape tomorrow. 👉🏻 https://www.sabancidx.com/en/about-us
About the Role
We are a leading provider of cloud and infrastructure managed services, empowering enterprises with reliable, secure, and scalable IT operations. We are seeking a Site Reliability Engineer (SRE) to join our team for a long-term engagement with a major player in the finance industry. This role is ideal for someone who blends deep technical skills with a proactive mindset to drive system resilience and operational excellence across the full stack.
We are looking for a Site Reliability Engineer (SRE) for our team 🎯
Responsibilities
Observability & Monitoring
- Develop and manage end-to-end observability frameworks across infrastructure, applications, and business services.
- Use AI/ML tools and analytics to perform proactive anomaly detection and trend forecasting.
- Build real-time dashboards and reliability indicators that track system health through the lens of business impact.
- Ensure monitoring coverage throughout service hours (5x9) and manage escalation protocols effectively.
Alarm & Incident Governance
- Own the complete alert lifecycle: define, tune, respond, and evolve.
- Design adaptive, dynamic alert thresholds using contextual, time-series data.
- Minimize alert fatigue by implementing intelligent alert correlation and reducing false positives.
- Lead post-incident reviews (PIRs) and continuously improve incident response patterns and alarm quality.
Critical Incident Response
- Act as the Incident Commander during high-severity (P1) events, coordinating rapid resolution with multiple teams.
- Perform root cause investigations leveraging architectural blueprints, change history, and system telemetry.
- Implement automated rollbacks and hotfixes where applicable to reduce mean time to recovery (MTTR).
- Facilitate war rooms and provide real-time updates to technical and business stakeholders.
Deployment Assurance & Resilience Engineering
- Contribute to the release lifecycle by assessing deployment risks, validating architecture readiness, and ensuring observability is embedded pre-release.
- Track post-deployment health, validate against SLOs, and optimize performance and resource efficiency.
- Define and enforce resilience standards, recovery strategies, and compliance checks across environments.
Preferred Skills And Experience
- 5+ years of experience in Site Reliability Engineering, DevOps, or infrastructure roles.
- Strong expertise in infrastructure platforms with a focus on scalable architectures.
- Proven experience with monitoring and observability ecosystems.
- Skilled in incident and problem management frameworks (e.g., ITIL, SRE best practices).
- Familiar with CI/CD pipelines, container orchestration (Kubernetes), and distributed systems.
- Strong interpersonal skills and experience working in cross-functional teams.
- Background in regulated environments (insurance, finance, healthcare) is a plus.
Here at SabancıDx, our journey is to find the best version of ourselves and to create the best future together. 🌱
What we offer you:
- A flexible hybrid and remote working model that lets you design your own experience according to the role description.
- The chance to work in the relaxing, green, and fresh SabancıDx Digital Campus.
- The opportunity to garden in our green area.
- Free lunch with a choice of menus at our Digi-Delight Cafeteria.
- An agile and innovative working environment where you GROW with learning and development opportunities.
- The opportunity to be part of an agile team that works on sustainability projects.
- Feeling valued, especially through our reward and recognition app, Thanxie!
- Health insurance & benefits including technical devices.
Working with a Young, Curious, Brave, Growth-Oriented, Loving Team!
Please find detailed information about the processing of your personal data in the Employee Candidate Privacy Notice. We kindly ask you to make sure that your applications do not include sensitive personal data (race, ethnicity, political opinion, philosophical belief, appearance and dress, membership in associations, foundations, or trade unions, health, sexual life, criminal convictions and security measures, biometric and genetic data).