Job Description:
Tesla's Vehicle Engineering team is seeking a site reliability engineer (SRE) to focus on improving our in-house manufacturing and supply chain computer vision platform. This role will also contribute to many of our other purpose-built applications that automate and improve the process of designing, building, and delivering Tesla products globally.

Job Responsibilities:
- Maintain existing infrastructure deployed on both AWS and private Kubernetes clusters
- Recommend the latest ops best practices, with a focus on automation
- Write technical documentation and runbooks
- Collaborate with stakeholders across regions to understand deployment requirements
- Derive functional (application) and non-functional (availability, performance, security, maintainability, and cost) requirements for global deployments and scaling needs
- Deliver on deployment requirements with proactive monitoring tools and dashboards

Job Requirements:
- Ability to choose the right storage system based on the application's needs, e.g., relational DBs (AWS RDS for Postgres, CockroachDB), cache (Redis, Elasticsearch), S3 (public and private clouds), time-series DBs (InfluxDB, Prometheus)
- Understanding of when to use and how to set up message brokers, e.g., Kafka and AWS Kinesis
- BS in Computer Science or a related field, or equivalent industry experience
- Fluency with orchestration in AWS or a private cloud, especially with Kubernetes, using tools like Terraform, Ansible, ArgoCD, and Docker
- Relevant experience with Go, Python, JavaScript, or other programming languages
- Demonstrated ability to respond to production outages and implement long-term fixes
- Familiarity with machine-to-machine communication concepts, e.g., TCP/IP, WebSockets, and RPC

Job Details:
Company: Tesla
Vacancy Type: Full Time
Job Location: San Diego, CA, US
Application Deadline: N/A