Our client, a leading insurance group, is seeking a highly skilled and experienced Cloud, Infrastructure, and Site Reliability Engineering (SRE) Architect to join its dynamic regional team. The firm is undertaking a major digital transformation and product initiative and is committed to leveraging cutting-edge technologies to ensure the reliability, scalability, and security of its infrastructure. In this role, you will play a crucial part in architecting and managing the cloud-based infrastructure and in implementing SRE best practices to maximize system reliability.
Responsibilities:
Design and architect highly scalable, secure, and cost-effective cloud-based infrastructure solutions that meet the needs of the business and its supporting functions.
Implement and maintain the cloud-based platform, ensuring its efficient integration with various technologies and services.
Collaborate closely with cross-functional teams to understand business requirements and translate them into technical specifications and solutions.
Develop and enforce best practices for implementing DevOps and SRE methodologies, ensuring high reliability and availability of the firm's systems.
Monitor and analyze system performance, identify areas for improvement, and implement proactive measures to enhance the overall reliability and performance of the infrastructure.
Maintain accurate, regularly updated documentation of architecture, design decisions, procedures, and configurations.
Collaborate with security teams to ensure the implementation of robust security measures and privacy controls in compliance with industry regulations and standards.
Mentor and guide other team members by sharing knowledge and best practices, fostering a culture of continuous growth and learning.
Requirements:
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Extensive experience in designing and implementing cloud-based infrastructure solutions, preferably in large-scale and complex environments.
In-depth knowledge of leading cloud platforms such as AWS, Azure, or Google Cloud, and proficiency in deploying and managing various cloud services.
Strong background in implementing DevOps and SRE best practices, including infrastructure as code, automated provisioning, monitoring, and alerting.
Proficiency in scripting and infrastructure automation using tools such as Terraform, Ansible, or similar.
Excellent understanding of networking principles, distributed systems, containerization, and microservices architecture.
Familiarity with agile methodologies and experience working in an agile environment.
Strong problem-solving skills with the ability to analyze complex systems, troubleshoot issues, and implement effective solutions.
Excellent communication and interpersonal skills with the ability to collaborate effectively with cross-functional teams, stakeholders, and vendors.
Relevant certifications such as AWS Certified Solutions Architect are highly desirable.