Professional Summary

Platform and DevOps Engineer with a Master's degree in Computer Science and hands-on experience building production-inspired cloud platforms. Strong background in incident management, monitoring, reliability, and infrastructure automation using Terraform, Kubernetes, and CI/CD pipelines. Experienced across Azure and AWS environments with a reliability-first mindset shaped by real-world production support and on-call troubleshooting. Microsoft Certified Azure Administrator (AZ-104).

Technical Skills

DevOps & Automation

  • Terraform: used in IDP, Multi-Cloud K8s
  • Infrastructure as Code (IaC): used in IDP, Multi-Cloud K8s
  • Docker: used in IDP, Multi-Cloud K8s
  • GitHub Actions: used in IDP (CI/CD Pipeline)
  • Jenkins: used in previous enterprise projects
  • CI/CD Pipelines: used in IDP, Multi-Cloud K8s

Cloud & Infrastructure

  • Microsoft Azure (VMs, VNets, NSGs): used in Ohio DRC, IDP, Multi-Cloud K8s
  • Azure Load Balancers: used in IDP (AKS Infrastructure)
  • Azure Monitor & Entra ID: used in Ohio DRC, Infosys
  • Amazon Web Services (AWS): used in Multi-Cloud K8s (EKS)

Operations & Reliability

  • ServiceNow (ITSM): used in Ohio DRC (99.8% SLA)
  • Incident Management: used in Ohio DRC, Infosys
  • Monitoring & Alerting: used in Observability Stack, Ohio DRC
  • Root Cause Analysis: used in Ohio DRC, Infosys

Scripting & OS

  • Python: used in IDP (Orchestration), Automation
  • Bash: used in CI/CD, Infrastructure Scripts
  • PowerShell: used in Ohio DRC (Windows Server)
  • Linux: used in IDP, Multi-Cloud K8s, Observability
  • Windows Server: used in Ohio DRC (Enterprise IT)

Professional Experience

Information Technologist I – Infrastructure & Operations
Ohio Department of Rehabilitation and Correction
Aug 2024 - Present
  • Support and stabilize production infrastructure services across 50+ enterprise endpoints, acting as an escalation point for availability, reliability, and performance-related incidents
    Responsible for critical infrastructure supporting 50+ endpoints across correctional facilities. Escalation path for P1/P2 incidents affecting system availability. Average response time: <15 minutes for critical alerts.
  • Manage incidents and service requests through ServiceNow, performing root cause analysis and maintaining 99.8% uptime SLAs across supported systems
    Handle ~40 tickets/week ranging from P3 service requests to P1 outages. RCA documentation for all P1/P2 incidents. Maintained 99.8% uptime across monitored systems, exceeding 99.5% SLA target.
  • Collaborate with network, identity, and application teams within a hybrid Active Directory and Microsoft Entra ID environment, reducing incident resolution time by 15%
    Improved cross-team collaboration through structured handoff procedures and shared runbooks. Reduced average incident resolution time from 2.3 hours to 1.95 hours (15% improvement) by streamlining identity-related troubleshooting workflows.
  • Apply DevOps practices to improve operational efficiency through infrastructure automation, CI/CD workflows, and monitoring configurations using Terraform, GitHub Actions, and cloud-native tooling
    Built Terraform modules for Azure resource provisioning, reducing manual setup time by 60%. Implemented GitHub Actions workflows for automated config validation. Created custom Azure Monitor alerts that reduced false-positive alert noise by 40%.
  • Maintain operational documentation and contribute to continuous service improvement initiatives to strengthen system reliability and support readiness
    Authored 20+ runbooks and troubleshooting guides for common incident scenarios. Maintained knowledge base articles in Confluence, reducing onboarding time for new team members. Led bi-weekly service improvement meetings, implementing 8 process improvements in the incident response workflow.

AI Operations & QA Intern
WelSpot
Mar 2024 - Aug 2024
  • Performed performance testing and operational validation of Large Language Models (LLMs) to improve reliability and response consistency
    Conducted load testing on GPT-4 and Claude models with varying prompt sizes. Validated response consistency across 1000+ test cases. Identified performance degradation patterns at >4K token context windows, informing prompt engineering best practices.
  • Validated Retrieval-Augmented Generation (RAG) workflows integrating vector databases and Google BigQuery to support scalable AI operations
    Tested RAG pipeline accuracy with a ChromaDB vector store and BigQuery data sources. Validated that retrieval improved response accuracy from 72% to 89% compared with zero-shot prompting. Documented performance tradeoffs between embedding models (OpenAI vs open-source).

System Engineer – Application Support & QA
Infosys
Nov 2020 - Apr 2022
  • Conducted system observability and log analysis using Azure Monitor and Nmon to maintain system health during peak workloads
    Monitored Linux server performance during month-end batch processing peaks (CPU, memory, disk I/O). Used Nmon for real-time metrics and Azure Monitor Logs for historical analysis. Identified a memory leak pattern causing weekly restarts and recommended JVM tuning that reduced restart frequency to monthly.
  • Authored Root Cause Analysis (RCA) reports using SQL and Excel to support performance remediation and operational stability initiatives
    Wrote 15+ RCA documents for production incidents. Used SQL queries to analyze transaction logs and identify performance bottlenecks. Built Excel dashboards to visualize incident trends, leading to database index optimizations that reduced query times by 35%.

Project Deep Dives

Internal Developer Platform (IDP)
Kubernetes • Terraform • GitHub Actions • Python

Built a self-service platform enabling developers to deploy applications without infrastructure knowledge.

Deployment Time: 2h → 15min
Success Rate: 99.95%
Automation: 100%

Multi-Cloud Kubernetes Platform
EKS • AKS • Helm • ArgoCD • Istio

Deployed and managed production Kubernetes clusters across AWS and Azure with service mesh.

Microservices: 50+
Uptime SLA: 99.9%
Deployment Model: GitOps

Observability Stack Implementation
Prometheus • Grafana • Loki • Jaeger

Designed and deployed comprehensive monitoring and logging infrastructure.

MTTD: 70% faster
Dashboards: 100+
Tracing: Full Stack