EXPLORE PREMIER OPPORTUNITIES
As a skilled professional seeking career growth, you deserve access to the best job opportunities available. Join Outdefine's Trusted community today and apply to premier job openings with leading enterprises globally. Set your own rate, keep all your pay, and enjoy the benefits of a fee-free experience.



Site Reliability Engineer
Outdefine Partner
Overview
About Outdefine
Outdefine is a web3 talent community that connects top talent with leading-edge companies and enterprises globally. Companies choose to hire Outdefine Trusted Members because their skills and readiness have been proven. When you accept a job as a Trusted Member, you keep all of your pay. By contrast, traditional hiring networks and agencies charge membership fees and take up to 50% of your earnings as their markup. Trusted Members also get access to premier jobs, networking, and a global community powered by tokens. You can earn Outdefine tokens by working, contributing to the community, and referring friends.
More than 100 jobs are currently listed on Outdefine, with more added regularly. Join over 5,000 professionals from 25 countries who are building and developing their careers with Outdefine.
To apply for this position, first complete your profile on www.outdefine.com. To make sure your application gets the most attention, we suggest that you start the assessment process now to become a Trusted Member. To receive direct support from career experts, join Discord.
Skills
Requirements
• Experience with Linux, Unix, and Windows operating systems required.
• Experience with observability/monitoring tools such as Prometheus/Grafana, ELK, Splunk, Dynatrace, Datadog, Azure Log Analytics, and AWS CloudWatch required.
• Experience with enterprise-level CI/CD tools such as Ansible, Terraform, YAML, Groovy, CloudBees, and OpenShift required.
• Experience working in public cloud platforms such as AWS and/or Azure required.
• Experience with containerization of applications required: hands-on experience with and understanding of containerization concepts and tools (Helm, Kubernetes, Docker); hands-on experience building and deploying containerized applications; understanding of optimized image-building strategies, scalable images, and availability strategies; strong CLI and command-line troubleshooting skills.
• Experience with programming languages (Java, Go, Python) and an understanding of OOP required.
• Experience building and operating highly scaled applications required.
• Experience with Postgres, MongoDB, MySQL, Oracle Database Management System (DBMS), PL/SQL, and SQL required.
• Experience with varying code repositories, auto deployments, and branching with tools such as GitHub/GitLab, Bitbucket, and Subversion required.
• Experience with IT service management tools such as ServiceNow, Atlassian, and BMC required.
Good-to-have skills:
• A bachelor's degree is preferred; if not, relevant experience in a related field is acceptable.
Soft skills (required because of client interaction):
• Intermediate - Seeks to acquire knowledge in area of specialty.
• Intermediate - Ability to identify basic problems and procedural irregularities, collect data, establish facts, and draw valid conclusions.
• Intermediate - Ability to work independently.
• Intermediate - Demonstrated analytical skills.
• Intermediate - Demonstrated project management skills.
• Intermediate - Demonstrates a high level of accuracy, even under pressure.
Duties
• Helps lead projects focused on managing and maintaining optimum platform infrastructure performance, reliability, and security using SRE practices, observability tools, manual and automated procedures, documentation, people and processes, and continuous delivery (CI/CD) tools, processes, and designs. Develops complex services to automate monitoring activities and provide critical information to facilitate response and resolution of performance and availability issues and incidents. Understands and advocates for standardized and scalable software tools to ensure that systems operate without interruption at optimum performance, and leads project teams throughout the deployment process. Troubleshoots and analyzes service disruptions to determine the root cause of issues and develop solutions for improved reliability.
• Troubleshoots and resolves more complex problems with systems and services and initiates regular deployment of new versions of the systems and their subcomponents.
• Leads more complex projects focused on building and maintaining observability/monitoring for the application, monitoring key performance indicators, maintaining alerting, and continuously improving visibility.
• Helps make decisions around periodic system validation and testing, service monitoring, and standing up new services/tools.
• Uses knowledge and experience to identify strategies that increase system reliability and performance through on-call rotation and process optimization.
• Identifies and implements necessary manual and automated procedures for improved collaborative response in real time.
• Leads lower-level engineers in stress, security, and performance testing.
• Resolves issues that come up through support escalation.
• Keeps documentation and runbooks up to date to effectively deal with new incidents that might arise.
• Leads post-incident reviews and documents findings for future informed decision-making.
• Reviews proposals to optimize the Software Development Life Cycle (SDLC) to boost service reliability and decides which proposals should move forward.
• Communicates complex topics with development teams to investigate and document issues, and leads the internal team in developing solutions to mitigate them.
• Performs other duties as assigned.
• Complies with all policies and standards.
The Hiring Process
In order to apply for this position, first complete your profile on www.app.outdefine.com.
We want to learn more about you, so we encourage you to provide us with a brief summary of yourself and your past experience as part of the process. As soon as this is completed, you'll take a technical assessment based on your skill set, and if you pass, you'll earn 500 Outdefine tokens. We will review your application, and if you are qualified, we will invite you to a 1:1 video interview.
Already a Trusted Member of Outdefine? Then go ahead and apply directly for the job of your dreams.
Equal Employment Opportunity
We are an equal-opportunity employer and do not discriminate against any employee or applicant for employment on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected status. We are committed to creating a diverse and inclusive environment for all employees and applicants for employment. All qualified individuals are encouraged to apply and will be considered for employment without regard to any legally protected status.
Become a Trusted Member, apply to jobs, and earn token rewards


Create and customize your member profile.


Earn 500 Outdefine tokens for becoming a Trusted Member and completing your assessment.


Once you are a Trusted Member, you can start applying to jobs.