
Vinodkumar Balagani

AWS & DevOps Engineer
Glasgow

Summary

Dynamic AWS and DevOps Engineer with over 7 years of experience, including 1 year in system administration, 4 years specializing in AWS DevOps, and 1 year in Linux administration. Expertise in designing, automating, and optimizing CI/CD pipelines, cloud infrastructure, and containerized deployments. Proven ability to leverage AWS cloud services, Kubernetes (EKS), Infrastructure as Code (Terraform, Ansible), and monitoring solutions (Prometheus, Grafana, CloudWatch) to enhance operational efficiency. Committed to reducing deployment times, improving system reliability, and delivering zero-downtime releases.

Overview

7 years of professional experience

Work History

System Administrator (Part-Time)

EVERYDAY JD LTD
2024.10 - Current
  • Reduced downtime by proactively identifying and resolving potential issues through thorough system monitoring.
  • Established effective communication channels between IT support staff and end-users, shortening issue resolution times.
  • Installed critical security and functionality patches to protect against intrusion and maintain system reliability.
  • Simplified troubleshooting processes by creating detailed documentation for system configurations, procedures, and best practices.

DevOps Engineer

Tech Mahindra
2021.07 - 2024.08
  • Designed and implemented CI/CD pipelines using Jenkins for Java, Node.js, and React.js applications, reducing deployment cycle time by 40%.
  • Migrated microservices to AWS EKS, configured readiness & liveness probes, and enabled HPA, improving scalability and uptime to 99.9%.
  • Introduced GitOps workflow with ArgoCD for Kubernetes deployments, improving deployment reliability and rollback efficiency.
  • Implemented Blue-Green and Canary deployments on EKS using ALB ingress and Argo Rollouts, enabling zero-downtime releases and reducing rollback risks.
  • Deployed Helm charts for consistent packaging and version control, streamlining Kubernetes deployments.
  • Automated infrastructure provisioning with Terraform (remote state S3 + DynamoDB locking) and managed configuration with Ansible playbooks, cutting manual efforts by 60%.
  • Standardized Terraform by creating modular templates, ensuring consistency across environments.
  • Secured CI/CD pipelines by integrating Trivy (container scanning), Checkov (Terraform IaC security), and SonarQube with quality/security gates, improving code reliability by 30%.
  • Enhanced Kubernetes security with RBAC, Pod Security Policies, and AWS IAM roles for service accounts, ensuring least-privilege access control.
  • Configured and optimized AWS services (EC2, VPC, ELB, IAM, S3, Auto Scaling groups, CloudWatch) for high availability and security.
  • Designed and tested a multi-region Disaster Recovery (DR) strategy using Route53 failover routing, Aurora Global Database, and S3 replication, reducing RTO to
  • Optimized cloud spend by leveraging EKS Spot instances, EC2 Auto Scaling policies, and S3 lifecycle rules, achieving ~25% monthly cost savings.
  • Introduced monitoring with Prometheus and Grafana, integrated with CloudWatch & SNS alerts, reducing incident detection time.
  • Implemented centralized logging solution with the EFK stack (Elasticsearch, Fluentd, Kibana), improving production issue troubleshooting.
  • Automated incident response workflows using AWS Lambda runbooks triggered from CloudWatch alarms, reducing MTTR by 40%.
  • Collaborated with development and QA teams using Jira and Confluence, ensuring timely releases and clear process documentation.

Linux System Administrator

HCL Tech
2019.06 - 2020.07
  • Installed, configured, and maintained RedHat Linux servers (30+ systems) for production and development environments.
  • Installed and upgraded Linux OS using Kickstart, and managed packages with RPM and YUM.
  • Managed user accounts, groups, permissions, file systems, NFS, DNS, DHCP, and LVM ensuring smooth operations.
  • Extended LVM volumes and file systems online to address production storage requirements without downtime.
  • Configured and maintained NFS, Samba, FTP, DNS, and DHCP servers, supporting heterogeneous environments.
  • Monitored servers using Nagios, Netcool, vmstat, iostat, netstat, and set up alerts to proactively identify issues.
  • Automated daily tasks with shell scripts and cron jobs, improving efficiency and reducing manual workload.
  • Implemented security hardening, configured firewalld/iptables, disabled unused services, enforced password policies, and managed sudo access.
  • Installed and configured Apache Web Server and Tomcat Server to support application deployments.
  • Handled backup and restore operations (Tivoli Storage Manager, MySQL DB backups), ensuring data integrity and business continuity.
  • Troubleshot critical production incidents (CPU spikes, memory leaks, disk space issues, process failures) to restore services quickly.
  • Performed kernel upgrades, firmware updates, and OS patching to keep systems secure and up to date.
  • Configured and monitored processes, swap space, and system logs for performance optimization.
  • Supported data center migration activities: cloning VMs, updating IPs, validating services, and minimizing downtime.
  • Provided 24x7 on-call support and worked within ITIL processes (Incident, Change, Problem Management) using ServiceNow.
  • Documented Standard Operating Procedures (SOPs) and provided knowledge transfer for smooth team operations.

Education

MSc - Business and Management

University of Strathclyde
Glasgow, United Kingdom
2025-09

BCA - Computer Applications

Glocal University
Saharanpur, India
2019-06

Skills

Cloud Platforms: AWS (EC2, S3, VPC, ELB, Auto Scaling, IAM, CloudWatch, Route53, SNS)

CI/CD & SCM: Jenkins, Git, GitHub, Bitbucket, Nexus, JFrog, SonarQube

Containerization: Docker, Kubernetes (EKS), Helm

Infrastructure as Code: Terraform, Ansible

Monitoring & Logging: Prometheus, Grafana, CloudWatch, SNS

Accomplishments

  • Designed and implemented end-to-end CI/CD pipelines with Jenkins, integrating Maven, NPM, Docker, and Nexus for seamless deployments.
  • Automated infrastructure provisioning using Terraform and configuration management with Ansible, reducing manual efforts by 60%.
  • Migrated applications to AWS EKS (Kubernetes) and configured readiness/liveness probes and HPA, ensuring high availability and scalability.
  • Implemented Git branching strategies (feature, release, hotfix) and enforced PR rules for better code quality and collaboration.
  • Integrated SonarQube into CI/CD pipelines, improving code quality and enabling enforcement of quality gates.
  • Set up monitoring and alerting solutions using Prometheus, Grafana, and AWS CloudWatch, reducing production issue detection time by 40%.
  • Managed AWS services (EC2, S3, ELB, Auto Scaling, IAM, VPC, Route53) for cloud infrastructure with 99.9% uptime SLA.
  • Worked with Helm charts to simplify Kubernetes deployments and version management.
  • Wrote Shell and Python scripts to automate deployments, log management, and routine system tasks.
  • Collaborated with developers, QA, and business teams using Agile (Jira, Confluence) to deliver projects on time.
  • Wrote shell scripts to automate a range of routine operational activities.
  • Created and managed Ansible playbooks; hands-on experience with Ansible concepts such as setup, inventory, playbooks, roles, and tasks.

Timeline

System Administrator (Part-Time)

EVERYDAY JD LTD
2024.10 - Current

DevOps Engineer

Tech Mahindra
2021.07 - 2024.08

Linux System Administrator

HCL Tech
2019.06 - 2020.07

MSc - Business and Management

University of Strathclyde

BCA - Computer Applications

Glocal University

Additional Information

  • Availability: Immediate
  • Soft Skills: Strong collaboration, troubleshooting, and communication skills; experience with Agile methodology