Job Specification
Job Title: Site Reliability Engineer (SRE) / Infrastructure Operations (Mid-Level)
Role Overview
Responsible for managing day-to-day infrastructure operations, including monitoring, alerting, and driving stability improvements across the environment.
Key Responsibilities
- Monitor overall infrastructure health and system performance
- Track key performance metrics such as CPU, memory, and disk utilization
- Tune alerts to improve signal-to-noise ratio and reduce alert fatigue
- Support disaster recovery (DR) rehearsals and readiness activities
- Maintain and update runbooks, documentation, and operational reports
Required Experience
- 4–6 years of experience in Site Reliability Engineering (SRE) or infrastructure operations
- Hands-on experience with VMware environments
- Experience with monitoring tools such as PRTG, Datadog, or similar platforms
- Strong incident management experience, including response and resolution processes
Core Skills & Competencies
- Solid understanding of infrastructure performance metrics (CPU, memory, disk, etc.)
- Experience with alert tuning and optimization
- Ability to proactively detect and troubleshoot performance issues
- Strong incident management and operational response capabilities
Screening Signals
Look for candidates who:
- Understand CPU Ready thresholds and their impact on performance
- Have hands-on experience tuning alerts to reduce noise
- Can proactively identify and resolve performance bottlenecks
- Demonstrate strong incident management experience in production environments
- Start Date: 12.06.2026
- Contact person: Bernd Kraft
- Company: Insight Global Brazil, Ludwig-Erhard-Strasse 14
