Be the force behind digital reliability.
As a Site Reliability Engineer, you’ll combine software engineering, automation, and incident management best practices to keep Barclays’ critical systems fast, available, and ready to scale. This is a hands-on technical role where you’ll tackle complex problems, prevent outages, and make sure millions of customers enjoy a seamless banking experience.
🎯 What You’ll Do:
🖥 Keep systems online – ensure reliability, performance, and scalability through proactive monitoring, preventive maintenance, and capacity planning
🔍 Fix issues before they escalate – investigate outages, identify root causes, and implement lasting preventive solutions
🤖 Automate to innovate – develop scripts and tools (Python, Bash, etc.) to streamline operations, reduce manual work, and boost resilience
⚡ Optimise performance – monitor resource usage, spot bottlenecks, and fine-tune systems for maximum efficiency
🤝 Collaborate cross-functionally – work closely with developers, platform teams, and other engineers to embed reliability into every stage of the software lifecycle
📈 Stay ahead of trends – track the latest in SRE practices, cloud infrastructure, and monitoring tools, sharing knowledge with your peers
What you should already have:
Under Czech labour law, you must hold a valid work permit.
🛠 Tech Stack & Skills Must-Haves:
Linux/Unix/Windows administration 🖥
Scripting & automation: Python, Bash 🐍
Databases: Oracle, SQL, MongoDB 🗄
Strong analytical & troubleshooting abilities 🕵️
Nice-to-Have Experience:
Monitoring & observability tools: Prometheus, Grafana, Splunk, ELK 📊
Middleware: MQ, Solace, or similar messaging tech 📡
CI/CD pipelines and Kubernetes 🚢