Infrastructure - DevOps & Site Reliability Engineer

BLACKBIRD.AI

Software Engineering, Other Engineering
New York, NY, USA
Posted on Friday, October 21, 2022

This is a fully remote opportunity at Blackbird.AI. You will not be required to relocate.

Database and Microservices Infrastructure DevOps Engineer and Site Reliability Engineer

The Company:

What has been the effect of disinformation on the world?

Blackbird.AI creates leading-edge AI software that delivers critical real-time insights, giving our clients a deep understanding of ongoing disruptive narratives, their motives, and overall digital noise. We are united by our dedication to our mission. We believe that we have a responsibility to society and that our service is vitally needed by organizations and individuals to create an empowered, critical-thinking society.

If this mission resonates with you, we'd love to hear from you.

The Opportunity:

Get ready to join a small but growing team of highly talented engineers and leaders building exciting AI-driven services and technologies. As an Infrastructure Architect for Blackbird.AI, you will own the infrastructure architecture for a real-time, streaming, cloud-hosted analytics platform and help the company establish a solid foundation for deploying microservices, databases, and frameworks, along with performance monitoring tools and continuous integration and deployment pipelines.

Responsibilities:

  • Own and maintain self-hosted and AWS-hosted Linux servers.
  • Proactively troubleshoot and communicate about server infrastructure deployments.
  • Develop and maintain Kubernetes clusters that host several databases and ETLs:
    • Engineer fault tolerance, backup and retention policies.
    • Develop deployment and rollout scripts.
    • Monitor and scale the different databases.
  • Develop and maintain the serving of web applications through web servers and ingresses.
  • Maintain and scale several deployments: Kafka, Elasticsearch, PostgreSQL, Redis, and others.
  • Audit and manage security-related issues: TLS, firewalls, etc.
  • Automate cloud-agnostic deployments to AWS or on-premises environments.
  • Work with the data engineering team and the full stack development team to ensure best practices in terms of stack choice and deployments.

Must Have:

  • BS degree in Computer Science or equivalent
  • Demonstrated product success with cloud and SaaS deployments in a horizontally scalable, distributed fashion.
  • Expert-level knowledge of Linux systems.
  • Expert-level command of the Kubernetes ecosystem and Docker containers.
  • Proficiency in Helm charts, Python, and/or Go.
  • Solid knowledge of web servers and security-related topics.
  • Good experience in building and maintaining Kafka clusters and Redis clusters.
  • Experience in deploying and maintaining Prometheus and Grafana, and in setting up monitoring for the infrastructure.
  • Good knowledge of Elasticsearch and Metricbeat for log monitoring.
  • Solid experience in addressing infrastructure security issues.
  • 2+ years’ hands-on experience developing with Python and Bash.
  • Familiarity with infrastructure-as-code tools such as Terraform or similar.
  • Experience in managing secrets stores such as Vault or similar.
  • Expertise in build automation and continuous integration and deployment (CI/CD) tools.
  • Experience working with cloud-based services (similar to AWS S3, CloudFront, Route53, ElastiCache, etc.).
  • Experience working with distributed teams.

Helpful to Have:

  • Experience with MLOps frameworks such as Kubeflow, Seldon, or similar.
  • Experience in building Kubernetes operators in Go.
  • Technical background or experience with AI/ML deployments.
  • Experience with multi-tenant deployments in AWS or similar.
  • Familiarity with mainstream ETL tools such as Airflow or similar.
  • Experience dealing with massive datasets on the order of terabytes.

Benefits:

  • Health Care Plan (Medical, Dental & Vision)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Work From Home
  • Stock Option Plan
  • Exciting career development prospects, with opportunities to grow into leadership roles

Please note: due to the high volume of applicants, only shortlisted candidates will be notified. Thank you for taking the time to apply for the role at Blackbird.AI.