
Sr Systems Engineer Linux – AI Infrastructure | DC Tech Consulting



Job description

Position: Senior Linux Administrator – AI/ML Infrastructure


Location: Remote

Experience: 5+ Years

Type: Full-time


Role Overview


We are seeking a highly skilled Senior Linux Administrator to join our team, focusing on the implementation and management of on-premises Linux servers optimized for AI/ML workloads.


The ideal candidate will have deep expertise in Linux system administration, Kubernetes cluster management, and a strong understanding of data center infrastructure components including servers, networking, storage, and virtualization technologies.


This role requires hands-on experience in automating infrastructure, optimizing performance, and ensuring reliability for high-performance computing (HPC) and AI/ML pipelines.


Key Responsibilities


Deploy, configure, and manage on-premises Linux servers supporting AI/ML workloads.


Set up, manage, and troubleshoot Kubernetes clusters for containerized workloads.


Optimize system and network performance for compute-intensive applications.


Automate provisioning and configuration using Ansible, Terraform, and scripting (Bash/Python).


Administer and monitor data center components such as servers, storage arrays, switches, and power systems.


Ensure system security, patch management, and compliance across environments.


Collaborate with DevOps, Data Science, and AI engineering teams to enable seamless integration with ML pipelines.


Plan and implement scalability strategies, maintaining uptime and redundancy.


Maintain comprehensive documentation of configurations, policies, and network diagrams.


Required Skills & Qualifications


7+ years of experience in Linux system administration (RHEL, Ubuntu, CentOS).


Proven hands-on experience with Kubernetes cluster management (setup, scaling, troubleshooting).


CKA (Certified Kubernetes Administrator) certification is mandatory.


Strong knowledge of data center components – servers, racks, networking switches, storage systems, and virtualization layers.


Experience with Ansible, Terraform, CI/CD pipelines, and infrastructure automation.


Proficiency in scripting languages (Bash, Python).


Understanding of performance tuning, system optimization, and fault diagnosis.


Excellent problem-solving, communication, and collaboration skills.


Preferred / Good to Have


Exposure to NVIDIA GPU management, CUDA environments, and AI/ML compute nodes.


Familiarity with HPC environments and distributed computing frameworks.


Experience managing monitoring systems (Prometheus, Grafana) and backup solutions.


Knowledge of DevOps practices, containerization, and hybrid cloud environments.




