Site Reliability Engineer

Permanent contract | Bucuresti, Romania | Innovation / Project / Organization


We are looking for a Site Reliability Engineer for the Business Intelligence domain who will bridge BI and Operations by putting in place scalable frameworks for monitoring, automation, and operations, to be used by BI, IT, and Infrastructure. We need a mature professional who can help us establish SRE processes and best practices.

Role Description:

•    Monitor and take steps to improve the overall application stack performance and stability.
•    Improve the monitoring framework by adding new integrations and monitoring services.
•    Maintain and monitor the deployment and orchestration of the servers, Docker containers, databases, and general backend infrastructure for non-prod environments.
•    Detect capacity issues and work closely with the BI & Infrastructure architecture teams to mitigate them.

Issue resolution
•    Troubleshoot complex, cross-platform issues across the entire application stack (OS, networking, database, application).
•    Work closely with the Operational team, in a consulting role, to solve incidents in the production environment.
•    Act as the single point of contact for technical issues encountered in the non-production environments.
•    Work closely with the Service Desk team to build a troubleshooting knowledge base for end users.

•    Apply automation and software to any tasks or parts of the system that would benefit from it.
•    Document your system knowledge as you acquire it, create runbooks, and ensure critical system information is readily available to those who need it.
•    Keep up to date with security developments, and proactively identify, diagnose, and solve complex security issues for all environments, in cooperation with our IT security team.
•    Maintain backup and redundancy strategies.
•    Conduct system analysis and configuration management, and develop improvements to the solution’s performance.
•    Qualify and deploy application changes and upgrades on the non-production environments. Advise on the trade-offs between new features, stability, and operability.


Key Skills:
•    Strong technical background, with 5+ years’ experience in an IT or technology support role
•    Proactive behavior that demonstrates initiative and a positive work ethic
•    Solid interpersonal skills; comfortable interacting with stakeholders at both infrastructure and application levels
•    Experience with virtualization and containerization (VMware, Podman/Kubernetes preferably)
•    Experience with monitoring systems (CheckMK, Centreon/Nagios preferably)
•    Experience with scripting (Bash, Python preferably)
•    Database knowledge (Oracle preferably)
•    Experience with monitoring and data analytics tools (Prometheus, Grafana, ELK)
•    Experience with JVM performance tuning is a plus
•    Experience with CI/CD tooling (Jenkins, GitLab) is a plus
•    Experience administering other applications (Tableau, SAS, Tomcat) is a plus


We are an equal opportunities employer and we are proud to make diversity a strength for our company. Societe Generale is committed to recognizing and promoting all talents, regardless of their beliefs, age, disability, parental status, ethnic origin, nationality, gender identity, sexual orientation, membership of a political, religious, trade union or minority organisation, or any other characteristic that could be subject to discrimination.

Reference: 2200080I
Entity: BRD
Starting date: immediate
Publication date: 2022/08/04