Cloud DevOps Engineer II

A Cloud DevOps Engineer II is a mid-level practitioner who builds and maintains the CI/CD pipelines, container infrastructure, and cloud automation that development teams rely on to ship software reliably. They work across cloud providers and internal tooling with enough autonomy to own substantial platform components end-to-end.

Role at a glance

Typical education: Bachelor's degree in CS or related field, or equivalent bootcamp/self-taught experience
Typical experience: 3-6 years
Key certifications: CKA, CKAD, AWS Solutions Architect, HashiCorp Terraform Associate
Top employer types: Mid-market companies, large enterprises, regulated industries, tech-focused organizations
Growth outlook: Consistent demand driven by cloud adoption and the rise of platform engineering
AI impact (through 2030): Strong tailwind. Demand is expanding as AI/ML infrastructure requires specialized DevOps skills for managing GPU clusters, model training, and large-scale inference endpoints.

Duties and responsibilities

  • Design and maintain CI/CD pipelines using GitHub Actions, GitLab CI, or Jenkins to automate build, test, and deployment workflows
  • Manage Kubernetes clusters on EKS, GKE, or AKS: cluster upgrades, node pool configuration, resource quotas, and namespace governance
  • Write and maintain Terraform or Pulumi modules that provision cloud resources reproducibly across development, staging, and production environments
  • Implement and tune observability stacks: configure Prometheus alerting rules, Grafana dashboards, and log aggregation pipelines
  • Respond to infrastructure incidents: triage alerts, diagnose root cause, and participate in postmortem analysis and follow-up remediation
  • Enforce cloud security controls: least-privilege IAM roles, security group audits, secret rotation, and container image scanning
  • Collaborate with software development teams to diagnose deployment issues, optimize container resource requests, and improve release reliability
  • Evaluate new platform tools and cloud managed services; propose and prototype improvements to reduce toil and improve developer experience
  • Keep runbooks and infrastructure documentation current with each significant change
  • Support on-call rotation covering production cloud infrastructure, responding to pages and escalating as needed

Overview

Cloud DevOps Engineers at the II level are the people who keep the platform plumbing working. They aren't just maintaining what exists — they're improving it: finding the fragile handoff in the deployment pipeline, fixing the Terraform module that's been causing drift, building the Grafana dashboard that finally makes the service's latency visible to developers.

The day-to-day mixes reactive and proactive work. In a typical week, a Cloud DevOps Engineer II might spend Monday triaging an alert cluster that fired over the weekend and tracing it to a misconfigured health check. Tuesday is pipeline work: migrating a legacy Jenkins job to GitHub Actions and cutting the build time from 22 minutes to 9 by restructuring the Docker layer cache. Wednesday involves a design conversation with a backend team about how to deploy their new service: what resource limits to set, whether it needs a persistent volume, how its health checks should be configured.

The Kubernetes layer is usually central. Managing a production cluster means more than keeping it running — it means capacity planning for node pools, managing namespace resource quotas to prevent runaway workloads from starving others, and planning cluster version upgrades before EKS/GKE drops support for the current version.
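
As a rough illustration of the quota side of that capacity planning, the sketch below checks whether the CPU and memory promised to namespaces still fit inside the cluster. It is a minimal sketch under stated assumptions, not a production tool: the namespace names, node sizes, and the NamespaceQuota and quota_headroom helpers are all hypothetical, and a real check would read these values from the cluster rather than hard-coding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceQuota:
    """Requests a namespace may claim (hypothetical model, not a Kubernetes API type)."""
    name: str
    cpu_millicores: int
    memory_mib: int


def quota_headroom(quotas, node_count, cpu_per_node_millicores, memory_per_node_mib):
    """Return (cpu, memory) left over if every namespace used its full quota.

    A negative number means the quotas are overcommitted: if all namespaces
    claimed their full allowance, some workloads could not be scheduled.
    """
    total_cpu = node_count * cpu_per_node_millicores
    total_mem = node_count * memory_per_node_mib
    used_cpu = sum(q.cpu_millicores for q in quotas)
    used_mem = sum(q.memory_mib for q in quotas)
    return total_cpu - used_cpu, total_mem - used_mem


# Illustrative numbers only (assumptions, not taken from any real cluster).
quotas = [
    NamespaceQuota("payments", cpu_millicores=8000, memory_mib=16384),
    NamespaceQuota("search", cpu_millicores=12000, memory_mib=24576),
    NamespaceQuota("batch", cpu_millicores=6000, memory_mib=8192),
]

cpu_left, mem_left = quota_headroom(
    quotas,
    node_count=4,
    cpu_per_node_millicores=7500,
    memory_per_node_mib=14336,
)
print(f"CPU headroom: {cpu_left} millicores, memory headroom: {mem_left} MiB")
```

Running the same check with one fewer node shows negative CPU headroom, which is the kind of signal that would prompt resizing a node pool or tightening a quota before a workload gets starved.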

Cloud security has become a non-optional part of the role. Engineers who don't understand IAM, don't think about container image vulnerabilities, and don't know how to work with security tools like Wiz or Prisma Cloud are increasingly out of step with organizational expectations — and with audit requirements at regulated companies.

The II-level engineer is expected to work with significant autonomy on well-defined problems and to ask the right questions when facing problems that aren't well-defined yet.

Qualifications

Education:

  • A bachelor's degree in computer science, information systems, or a related field is common but not required
  • Bootcamp or self-taught backgrounds are common in DevOps, particularly among engineers who transitioned from development or sysadmin roles
  • A strong GitHub portfolio of infrastructure projects or open source contributions is treated as equivalent to academic credentials at many companies

Experience benchmarks:

  • 3–6 years of DevOps, platform engineering, or site reliability engineering experience
  • Production Kubernetes operations at non-trivial scale (20+ services, multi-namespace, autoscaling configured)
  • Hands-on CI/CD pipeline authoring, not just usage — designing the pipeline architecture, not clicking through a wizard

Required technical skills:

  • Infrastructure as code: Terraform at production scale with state management, modules, and workspace strategy
  • Containers: Docker image optimization, multi-stage builds, container security scanning
  • Kubernetes: Deployments, StatefulSets, HPA, NetworkPolicy, Ingress controllers, Helm
  • Cloud platform fluency: AWS (EC2, EKS, RDS, S3, IAM, VPC), Azure, or GCP equivalents
  • Scripting: Python and Bash for automation tasks; Go is a plus for operator development
  • Observability: Prometheus, Grafana, Loki or ELK stack, distributed tracing with Jaeger or Tempo

Security skills:

  • Secrets management: HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault integration patterns
  • Image scanning: Trivy, Snyk, or Grype in pipeline
  • Network security groups and Kubernetes NetworkPolicy enforcement

Certifications valued:

  • CKA or CKAD
  • AWS Solutions Architect Associate or Professional
  • HashiCorp Terraform Associate

Career outlook

Cloud DevOps engineering is one of the more consistently in-demand technical specializations across industries. Every company running software in the cloud needs people who can build and maintain the platform that keeps that software deployed and observable.

The II-level band is where engineers either build specialized expertise — becoming the Kubernetes expert, the observability specialist, the FinOps engineer — or develop the broader architecture and leadership skills that lead toward senior and staff levels. Both paths are viable; the choice depends more on individual preference than market demand.

Platform engineering as a discipline has matured and given DevOps work a sharper product-like framing. Rather than just supporting application teams reactively, platform engineering teams build self-service infrastructure products that reduce toil for developers. Cloud DevOps Engineer II roles are increasingly framed this way, which adds product thinking to the technical toolkit the role demands.

The rise of platform-as-a-service abstractions (Heroku-style internal developer platforms, Railway, Render) has reduced the need for DevOps expertise at the smallest companies. The mid-market and enterprise segments still need substantial cloud infrastructure depth and are the dominant employers for this role.

AI and ML infrastructure is creating significant additional demand. Running model training jobs, serving endpoints at scale, and managing GPU cluster economics requires DevOps skills applied to a new set of tools. Engineers who understand PyTorch serving, Kubeflow, or SageMaker in addition to standard DevOps tooling are positioned for the fastest-growing segment of the market.

The career path from II runs through Senior DevOps Engineer, Staff/Principal Platform Engineer, and Engineering Manager (for those who want to manage). Total compensation at Staff level at a public tech company can reach $250K–$400K with equity.

Sample cover letter

Dear Hiring Manager,

I'm applying for the Cloud DevOps Engineer II role at [Company]. I've been a DevOps engineer at [Current Company] for three years, supporting a platform that runs roughly 200 microservices across three EKS clusters serving about 4 million active users.

Most of my recent work has centered on improving deployment reliability and reducing the time developers spend waiting for pipelines. When I joined, our average deployment from merge to production was 45 minutes. We're now at 12 minutes for most services. The main drivers were restructuring our Docker builds to use BuildKit cache mounts, parallelizing the test suite, and eliminating a blocking manual approval gate that was almost always rubber-stamped anyway.

On the infrastructure side, I own our Terraform modules for VPC, EKS, and RDS provisioning. I rewrote the EKS module last year when we migrated from version 1.24 to 1.29 — the managed node group configuration had accumulated significant drift from the original design, and I used the upgrade as an opportunity to refactor it into a pattern the team can maintain. I also introduced Atlantis for Terraform plan/apply automation, which stopped the practice of engineers running applies locally from developer machines.

I'm looking to move to an organization where the infrastructure scale is larger and the team has dedicated platform engineers rather than developers who wear DevOps hats part-time. The way [Company] describes its platform team structure is what I'm looking for.

Thank you for considering my application.

[Your Name]

Frequently asked questions

What distinguishes a DevOps Engineer II from a DevOps Engineer I?
A DevOps Engineer I typically works on well-scoped tasks within established systems and needs guidance on novel problems. A II is expected to take ownership of complete platform components, propose solutions independently, mentor juniors, and handle ambiguous situations without hand-holding. In practice, companies treat Level II as the point where the engineer is fully productive and building organizational knowledge.

How much programming does a Cloud DevOps Engineer II need to do?
More than the job title implies. Pipeline automation, custom Kubernetes operators, internal tooling, and infrastructure modules all require writing real code; Python and Go are the most common languages in the space. Engineers who treat DevOps as purely operational (click-ops, YAML configuration) without coding fluency hit a ceiling at mid-level.

Is on-call a standard part of this role?
Yes, almost universally for roles supporting production infrastructure. The burden varies widely: some teams rotate across 8–10 engineers with infrequent pages, others have smaller rotations in noisier environments. Asking about mean pages per on-call week and postmortem cadence during interviews gives a realistic signal about what you're signing up for.

How is AI tooling changing the DevOps engineer role?
AI coding assistants (GitHub Copilot, Cursor) have meaningfully accelerated Terraform and pipeline authoring, especially for boilerplate. More significantly, AI-driven incident analysis tools are beginning to surface root cause candidates automatically from log streams, which shifts the DevOps engineer's role toward validation and decision-making rather than raw log parsing. Neither trend eliminates the job; they shift where expertise matters.

What certifications are most relevant for this role?
Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) signal hands-on cluster operations skill. AWS Solutions Architect Associate or Professional is broadly valued. HashiCorp Terraform Associate is a credible signal for IaC work specifically. No single certification substitutes for practical experience, but they help in resume screening.