Master DevOps from
absolute scratch
A structured, beginner-friendly journey through cloud computing, automation, containers, and monitoring, explained with real-world analogies and hands-on examples.
What is DevOps?
DevOps is the combination of two words: Development (writing software) and Operations (running and managing that software). In simple terms, DevOps is a way of working where the people who build software and the people who run it work together smoothly, automatically, and continuously.
Before DevOps, developers would write code and "throw it over the wall" to operations teams who would then struggle to deploy it. Bugs were found late, deployments were scary, and things broke in production all the time.
Think of a pizza restaurant. The chef (developer) makes the pizza, and the delivery team (operations) delivers it. In old-school IT, the chef makes the pizza and just hands it to delivery: delivery doesn't know the right address, it's not packed properly, and it arrives cold. DevOps is like giving the chef and delivery team a shared system, clear communication, and automated processes, so every pizza is delivered hot, on time, every time.
Your Learning Path
Each module below builds on the previous one. Don't skip ahead: every concept is connected. Click any card to start learning!
The DevOps Infinity Loop
DevOps is often shown as an infinity loop (∞), because the process never stops. You plan, build, test, deploy, monitor, and then use what you learn to plan again. This is called the DevOps Lifecycle.
How Does Software Actually Get to Users?
Before diving into tools, you need to understand the big picture. How does an idea in a developer's head end up as a feature on your phone? That journey is the Software Delivery Lifecycle, and DevOps is about making that journey fast, safe, and automatic.
Without DevOps: a feature might take weeks or months to reach users, with manual steps, broken deployments, and late-night emergencies. With DevOps: the same feature can go live in minutes, automatically, with tests and checks built in.
Think of a car factory. In the old days, each car was hand-built by one team: slow, inconsistent, error-prone. Modern factories use an automated assembly line: each station does one job perfectly, parts move automatically, quality is checked at every step, and hundreds of cars roll out daily. DevOps is that assembly line, but for software.
The Journey: Code to Production
Old Way vs DevOps Way
| Topic | ❌ Old Way (Manual) | ✅ DevOps Way (Automated) |
|---|---|---|
| Deployments | Manual, risky, every few months | Automated, safe, multiple times per day |
| Testing | QA team tests manually at end | Tests run automatically on every code push |
| Servers | Physical machines, takes weeks to set up | Cloud VMs created in seconds with code |
| Failures | Discovered by angry users | Caught by monitoring before users notice |
| Team silos | Dev and Ops barely talk | Shared responsibility, shared tools |
DevOps is NOT just tools. It's a culture of collaboration + automation. The tools (Docker, Jenkins, Kubernetes, etc.) are the how. The culture (fast feedback, shared ownership, continuous improvement) is the why.
Why Do DevOps Engineers Need to Know Networking?
Every app you deploy runs on servers that communicate over a network. When your app is slow or unreachable, it's often a network issue. Understanding how data travels between machines is essential for debugging, securing, and scaling applications.
Think of the internet like a postal system. Every house (computer) has an address (IP address). Letters have a "to" and "from" address. The postal system figures out the best route to deliver it (this is called routing). The type of delivery service (standard vs express) is like different protocols (HTTP vs HTTPS). And the post office that sorts the mail is like a router.
Key Networking Concepts
- IP Address: every machine on a network has a unique address, e.g. 192.168.1.10 (private) or 54.210.1.45 (public).
- DNS: translates domain names like google.com into IP addresses. It's the internet's phone book.

How a Web Request Works – Step by Step
Useful Networking Commands
Common ports to memorize: 22 = SSH (terminal access to servers) · 80 = HTTP · 443 = HTTPS · 3306 = MySQL · 5432 = PostgreSQL · 6379 = Redis · 8080 = common dev server port · 9090 = Prometheus · 3000 = Grafana
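As a tiny example of putting these concepts into a script, here's a hedged bash helper that classifies an IPv4 address as private (the RFC 1918 ranges) or public. The function name `is_private_ip` is my own invention, not a standard tool:

```bash
#!/usr/bin/env bash
# is_private_ip: print "private" for RFC 1918 addresses
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), "public" otherwise.
is_private_ip() {
  local ip=$1
  local IFS=.
  read -r a b c d <<< "$ip"   # split the dotted quad into four octets
  if [ "$a" -eq 10 ]; then echo private
  elif [ "$a" -eq 172 ] && [ "$b" -ge 16 ] && [ "$b" -le 31 ]; then echo private
  elif [ "$a" -eq 192 ] && [ "$b" -eq 168 ]; then echo private
  else echo public
  fi
}

is_private_ip 192.168.1.10   # private
is_private_ip 54.210.1.45    # public
```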
What is YAML and Why Does It Matter?
YAML (YAML Ain't Markup Language) is a human-readable format for writing configuration files. If JSON is for machines, YAML is for humans: it's clean, readable, and used everywhere in DevOps.
You'll write YAML for Docker Compose, Kubernetes manifests, GitHub Actions pipelines, Ansible playbooks, and more. Getting comfortable with YAML will save you hours of debugging.
YAML is like filling out a structured form. Instead of checkboxes and dropdown menus, you write key-value pairs with indentation showing which items belong together. The indentation (spaces, never tabs!) is everything in YAML.
YAML Basics – Syntax
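A small example of the syntax rules described above (all values are illustrative):

```yaml
# Key-value pairs: a space after the colon is required
name: myapp
version: 1.2.0
debug: false          # booleans: true / false
port: 8080            # numbers are written unquoted

# Nested mapping: indentation (2 spaces, never tabs) shows ownership
database:
  host: db.example.com
  port: 5432

# Lists: each item starts with "- "
features:
  - login
  - payments
  - notifications
```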
YAML in Action – GitHub Actions Pipeline
Here's a real GitHub Actions CI/CD pipeline written in YAML. Don't worry about understanding every detail; just notice the structure:
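A minimal sketch of such a workflow file (the job name and the Node.js setup are illustrative choices, not requirements):

```yaml
# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]       # run on every push to main

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4        # fetch the repository
      - uses: actions/setup-node@v4      # install Node.js
        with:
          node-version: 20
      - run: npm ci                      # install dependencies
      - run: npm test                    # run the test suite
```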
Common YAML Mistakes to Avoid
| ❌ Wrong | ✅ Right | Why |
|---|---|---|
| `name:myapp` | `name: myapp` | Space after colon is required |
| Tab indentation | 2-space indentation | YAML doesn't allow tabs |
| `port: "8080"` when a number is needed | `port: 8080` | Numbers shouldn't be quoted |
| Inconsistent indentation | Always consistent 2 spaces | Inconsistency causes parse errors |
Create a file called app-config.yaml describing a fictional app: give it a name, version, a list of 3 features, and a nested database section with host, port, and name. Validate it at yamllint.com.
The Problem: Traditional Servers
Imagine you want to launch a website. In the old world (on-premise infrastructure), here's what you had to do:
1. Buy expensive physical servers (₹5–50 lakhs)
2. Set up a data center room with cooling and power backup
3. Hire people to manage it 24/7
4. Wait weeks or months before going live
5. If traffic spikes, your site crashes, because you can't add more servers instantly
Traditional servers are like building your own hotel every time you travel. You spend months constructing it, buying furniture, hiring staff, just for a 3-day trip. Then you have to maintain it forever even when no one's using it.
Cloud computing is like Airbnb or OYO: you book only when you need, pay only for what you use, scale up for a holiday weekend and scale down after. No ownership headaches. No maintenance worries.
What is Cloud Computing?
Cloud computing means renting computing resources (servers, storage, databases, networking) over the internet from a provider instead of owning physical hardware. You pay only for what you use, like an electricity bill.
Old vs New: Side by Side
| Factor | Traditional (On-Premise) | Cloud Computing |
|---|---|---|
| Setup Time | Weeks to months | Minutes |
| Cost | High upfront capital | Pay as you use |
| Scaling | Buy new hardware | Click of a button |
| Maintenance | Your team's problem | Provider handles it |
| Disaster Recovery | Expensive backup systems | Built-in redundancy |
Types of Cloud Services
Cloud providers offer different "layers" of service. Think of it as how much of the stack they manage for you:
Why do we need DevOps in the Cloud era?
Cloud gave us power: unlimited servers, global reach, instant scaling. But now teams were deploying dozens of times a day. How do you manage all that? You need automation, consistency, and speed. That's exactly what DevOps tools (Jenkins, Docker, Kubernetes, Terraform) provide.
Cloud is the land. DevOps is how you build on it.
- Traditional servers are expensive, slow, and hard to scale
- Cloud computing lets you rent servers over the internet, pay-as-you-go
- Three service types: IaaS, PaaS, SaaS
- Cloud makes DevOps possible by giving us on-demand, programmable infrastructure
What is AWS?
Amazon Web Services (AWS) is the world's most popular cloud platform, used by Netflix, Airbnb, NASA, and millions of companies. AWS has 200+ services, but as a DevOps beginner, you only need to master the core 6.
Core Service #1 – EC2 (Virtual Servers)
EC2 (Elastic Compute Cloud) is like renting a computer in Amazon's data center. You choose the size (1 CPU or 64 CPUs), the operating system (Linux or Windows), and you pay by the hour. It's just a computer you control via the internet.
Core Service #2 – S3 (Object Storage)
S3 (Simple Storage Service) is like Google Drive but for developers. Store images, videos, backups, website files โ anything. Files are stored in "buckets" and accessed via URLs. You can even host a full website from S3!
Core Service #3 – IAM (Access Control)
IAM (Identity and Access Management) is like an office building with different access cards. The security guard can enter the lobby. The developer can enter the server room. The manager can access everything. IAM lets you control WHO can do WHAT in your AWS account.
Core Service #4 – VPC (Your Private Network)
VPC (Virtual Private Cloud) is like a gated community for your servers. You define the roads (subnets), the gates (security groups), and who can come in or go out. Your database lives in the private area (no internet access), while your web server lives in a public area (accessible to users).
Other Key Services
Goal: Host your personal portfolio site on S3 with HTTPS via CloudFront.
1. Create an S3 bucket with a unique name
2. Upload your HTML/CSS files
3. Enable "Static Website Hosting" in bucket settings
4. Set a bucket policy that allows public read
5. Your site is live! Share the URL!
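The public-read bucket policy in step 4 might look like this (the bucket name is a placeholder you'd replace with your own):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::your-unique-bucket-name/*"
    }
  ]
}
```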
- EC2 = Virtual servers you rent in the cloud
- S3 = Unlimited file/object storage with web hosting capability
- IAM = Control who can access what in your account
- VPC = Your private, isolated network in the cloud
- RDS, Lambda, CloudWatch = Managed database, serverless compute, monitoring
What is an Operating System?
Before diving into Linux, let's understand what an OS actually does. An Operating System is the software that manages all hardware and software resources on your computer. Without it, your applications have no way to talk to the CPU, memory, or disk.
Linux Architecture
What is Linux & Why Does It Matter?
Linux is a free, open-source operating system created by Linus Torvalds in 1991. About 96% of the world's top web servers run Linux, and it underpins all major cloud providers (AWS, GCP, Azure). If you're doing DevOps, you're doing Linux.
Red Hat (RHEL) – enterprise standard
CentOS / Fedora – RHEL-based, free
Kali Linux – security & penetration testing
Amazon Linux – AWS-optimised
More secure, far fewer viruses target it
CLI-first vs GUI-first
No reboot needed for most updates
Enterprise-designed: multi-user, better multitasking
Using a GUI (graphical interface) is like playing a game normally. Using the Linux terminal is like knowing the cheat codes: you can do in 1 second what takes 5 minutes with a mouse.
Essential Commands – Navigation
Essential Commands – Files
Permissions (chmod)
Linux controls who can read, write, or execute a file. Think of it as setting access rules.
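A minimal sketch of both chmod notations, assuming GNU coreutils (`stat -c` is Linux-specific; macOS uses `stat -f`). The file name is illustrative:

```bash
#!/usr/bin/env bash
# Demonstrate chmod with octal and symbolic notation.
set -e
tmp=$(mktemp -d)
touch "$tmp/deploy.sh"

chmod 755 "$tmp/deploy.sh"         # rwxr-xr-x: owner full, group/others read+execute
stat -c '%a %n' "$tmp/deploy.sh"   # shows 755

chmod u-x "$tmp/deploy.sh"         # symbolic form: remove execute from the owner
stat -c '%a %n' "$tmp/deploy.sh"   # shows 655

rm -rf "$tmp"
```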
Process & System Commands
User Management
Linux allows multiple users on one machine. Every user has a type, a UID, and specific permissions.
| User Type | UID | Description |
|---|---|---|
| Root User | 0 | Superuser with full control over everything. Access via sudo; direct root login is often disabled by default. |
| System Users | 1–999 | Created by the OS for background services (daemons). No password, no login. E.g. www-data for nginx. |
| Normal Users | 1000+ | Created by admins for real people. Limited permissions: they can only affect their own files. |
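You can see these UID ranges on your own machine by reading /etc/passwd, where field 3 of each colon-separated line is the UID. A small sketch (the 1000 cutoff is the Debian/Ubuntu convention; some distros differ):

```bash
#!/usr/bin/env bash
# Group the users in /etc/passwd by the UID ranges from the table above.
awk -F: '
  $3 == 0                   { print "root:   " $1 }
  $3 >= 1 && $3 < 1000      { print "system: " $1 }
  $3 >= 1000 && $3 < 65534  { print "normal: " $1 }
' /etc/passwd
```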
File Test Operators – Check Before You Act
In shell scripts, you often need to check if a file or directory exists before doing something with it. These are called file test operators:
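A minimal sketch showing the common operators in action (the file name is illustrative):

```bash
#!/usr/bin/env bash
# Common file test operators: each test succeeds or fails silently,
# so we echo a message when the test passes.
tmp=$(mktemp -d)
echo "hello" > "$tmp/app.log"

[ -e "$tmp/app.log" ] && echo "exists"            # -e: path exists
[ -f "$tmp/app.log" ] && echo "regular file"      # -f: regular file
[ -d "$tmp" ]         && echo "directory"         # -d: directory
[ -r "$tmp/app.log" ] && echo "readable"          # -r: readable
[ -w "$tmp/app.log" ] && echo "writable"          # -w: writable
[ -s "$tmp/app.log" ] && echo "not empty"         # -s: size > 0
[ -x "$tmp/app.log" ] || echo "not executable"    # -x: executable

rm -rf "$tmp"
```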
- Linux OS manages processes, memory, files, network, and security
- Architecture: Applications → Shell → Kernel → Hardware
- About 96% of top web servers run Linux; Ubuntu, RHEL, and Amazon Linux are the most common distros
- Navigation: pwd, ls, cd, mkdir, touch
- Files: cat, cp, mv, rm, grep
- Permissions: chmod (755 = rwxr-xr-x), chown, sudo
- Users: root (UID 0), system (1-999), normal (1000+)
- File tests: -e -f -d -r -w -x -s in shell scripts
The Problem Without Version Control
Imagine you and your friend are both editing the same Word document. You each make changes and email it back and forth. After 5 rounds, nobody knows which version is the latest. Someone accidentally overwrites the other's changes. The file is now a mess called final_final_v3_ACTUAL_FINAL.docx.
In software, this is catastrophic. A team of 10 developers could be working on the same codebase. Without coordination, everything breaks. This is why we use a Version Control System (VCS).
Types of Version Control Systems
| Type | How it works | Tools | Problem |
|---|---|---|---|
| Local VCS (LVCS) | Saves versions only on your local machine | RCS, SCCS | No backup, no collaboration |
| Centralized VCS (CVCS) | One central server stores all code; everyone connects to it | SVN, Perforce | No internet = can't save; server failure = all lost |
| Distributed VCS (DVCS) | Every developer has a full copy of the entire repo | Git, Mercurial | None: best of both worlds |
Git is a DVCS: it combines Local VCS (work offline, full local history) with Central VCS (push to shared remote, full collaboration). This is why Git is the industry standard.
Git is like Google Docs for code: everyone can work on it, every change is tracked, and you can go back to any previous version instantly. (Git isn't officially an acronym, though "Global Information Tracker" is a popular backronym.) It was created in 2005 by Linus Torvalds (the same person who created Linux!) to manage the Linux kernel source code.
Git vs GitHub – They're Different!
| Git | GitHub | |
|---|---|---|
| What it is | Version control tool (software on your computer) | Website that hosts Git repositories in the cloud |
| Where it runs | On your local machine | In the cloud (github.com) |
| Analogy | The camera taking photos | The photo album stored online |
| Can work offline? | Yes | No |
| Alternatives | – | GitLab, Bitbucket, Azure DevOps |
Git's Three Working Areas
The Core Git Workflow
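The workflow can be sketched end to end in a throwaway directory. This assumes git is installed; the remote URL in the comments is a placeholder, and the user name/email flags just avoid relying on global config:

```bash
#!/usr/bin/env bash
# The core workflow: init → add → commit (→ remote add → push).
set -e
repo=$(mktemp -d) && cd "$repo"

git init -q                              # 1. create a local repository
echo "# My App" > README.md              # 2. edit files in the working directory
git add README.md                        # 3. stage the change
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Initial commit"        # 4. snapshot into the local repository

git log --oneline                        # shows the new commit
# git remote add origin https://github.com/<you>/<repo>.git   # 5. link GitHub
# git push -u origin main                                     # 6. publish
```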
Branching – Types & Commands
A branch is a separate line of development: a safe sandbox to work in without touching the main codebase.
Main / Master Branch
The default branch created when you init a repo. This is your production code: always stable and working. Never push broken code directly here.
Feature Branch
Created to develop a new feature. Isolated from main, so you can experiment freely. Once done, raise a Pull Request to merge back into main.
Release Branch
Created when preparing a release. Only bug fixes go here, no new features. When ready, it's merged into main and tagged with a version number.
Hotfix Branch
Created for critical production bugs that need immediate fixing. Branched directly from main, fixed fast, merged back to main and also into any in-progress release branch.
- VCS types: Local (LVCS) → Centralized (CVCS) → Distributed (DVCS, i.e. Git)
- Git was created by Linus Torvalds in 2005 for the Linux kernel
- Three areas: Working Directory → Staging Area → Local Repository → GitHub
- Core workflow: git init → add → commit → remote add → push
- Branch types: main, feature, release, hotfix; each has a purpose
- git fetch downloads but doesn't merge; git pull = fetch + merge
What is a Shell?
A shell is a user interface that provides access to operating system services. It acts as a translator between the user and the kernel: you type a command, the shell interprets it, and the kernel executes it.
GUI Shell
Graphical interface: icons, windows, menus. Examples: Windows Explorer, GNOME, KDE. Easy to learn but slow and not scriptable.
CLI Shell
Text-based: you type commands. Examples: Windows CMD, PowerShell, the Linux terminal. Faster, automatable, used in all DevOps work.
Types of CLI shells in Linux: sh (the original Bourne Shell), bash (Bourne Again Shell, the default on most Linux distros), zsh (Z Shell, the macOS default, with advanced features), fish (user-friendly, auto-suggestions), ksh (Korn Shell, common in enterprises), csh (C Shell, with C-like syntax).
What is Shell Scripting?
A shell script is an executable file containing multiple shell commands that run sequentially, like a recipe. Instead of typing 20 commands every morning, you write them once in a script and run them with a single command.
Imagine training a robot to clean your house. Instead of telling it what to do step by step every day, you write instructions once and hand it the list. Shell scripting is exactly that: you write instructions for your computer to follow automatically, every time.
Variables, Input & Operators
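A minimal sketch of variables, command substitution, and arithmetic (the variable names and values are illustrative):

```bash
#!/usr/bin/env bash
# Variables: no spaces around "=", read values with $VAR (or ${VAR}).
NAME="DevOps"
GREETING="Hello, $NAME"
echo "$GREETING"

TODAY=$(date +%F)             # command substitution: capture a command's output
echo "Today is $TODAY"

A=7; B=3
echo "Sum: $((A + B))"        # arithmetic expansion: prints Sum: 10
echo "Quotient: $((A / B))"   # integer division: prints Quotient: 2

# read -p "Enter your name: " USER_NAME   # uncomment for interactive input
```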
Conditions & Loops
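A sketch of if/elif/else plus the three loop forms (the CPU threshold scenario is illustrative):

```bash
#!/usr/bin/env bash
# if / elif / else: branch on a condition.
CPU=85
if [ "$CPU" -gt 90 ]; then
  echo "CRITICAL"
elif [ "$CPU" -gt 70 ]; then
  echo "WARNING"             # this branch fires for CPU=85
else
  echo "OK"
fi

# for: iterate over a list of items.
for env in dev staging prod; do
  echo "Deploying to $env"
done

# while: repeat while the condition is true.
i=1
while [ "$i" -le 3 ]; do
  echo "while: $i"
  i=$((i + 1))
done

# until: repeat until the condition becomes true.
n=1
until [ "$n" -gt 3 ]; do
  echo "until: $n"
  n=$((n + 1))
done
```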
Functions & I/O Redirection
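A sketch of functions, positional arguments, and the redirection operators from the summary below (function and file names are illustrative):

```bash
#!/usr/bin/env bash
# Functions group reusable commands; $1 and $2 are the arguments.
deploy() {
  echo "Deploying $1 to $2"
}
deploy myapp prod                     # prints: Deploying myapp to prod

log=$(mktemp)
echo "first line"  >  "$log"          # >  overwrites the file
echo "second line" >> "$log"          # >> appends to it
ls /no/such/path 2>> "$log" || true   # 2> redirects stderr (errors only)

grep -c "line" "$log"                 # count lines containing "line"
cat "$log" | tr 'a-z' 'A-Z'           # | pipes one command's output to the next
rm -f "$log"
```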
- A shell translates your commands to the kernel; bash is the most common
- Shell scripts automate repetitive Linux commands; start with `#!/bin/bash`
- Variables: `NAME=value`, access with `$NAME`, read input with `read`
- Conditions: if/elif/else; Loops: for, while, until
- Functions: group reusable commands, pass arguments with $1, $2...
- Redirection: `>` overwrites, `>>` appends, `2>` errors, `&>` both
- Pipe `|` chains commands: the backbone of all DevOps automation
What Problem Does Maven Solve?
When you write Java code, your computer can't run it directly. You need to: compile it (translate to machine code), download external libraries, run tests, and package everything into a single deployable file. Doing this manually for every developer on a team is chaotic.
Building a house requires ordering materials, scheduling workers, following blueprints. Maven is the construction manager for your Java project. You describe what you want to build (in a file called pom.xml), and Maven handles all the steps: fetching materials (dependencies), building (compilation), testing, and packaging.
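A minimal pom.xml sketch (the group/artifact IDs and the JUnit dependency are illustrative choices):

```xml
<!-- pom.xml: the project's blueprint -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.example</groupId>      <!-- your organisation -->
  <artifactId>my-app</artifactId>     <!-- the project name -->
  <version>1.0.0</version>
  <packaging>jar</packaging>          <!-- build output: a JAR file -->

  <dependencies>
    <!-- Maven downloads this from Maven Central automatically -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```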
The Maven Build Lifecycle
Maven Lifecycle Commands
- Maven automates Java project building: compile → test → package
- pom.xml is Maven's configuration file (your project blueprint)
- Dependencies are automatically downloaded from Maven Central
- Output is a JAR/WAR file, ready to deploy on any server
What is CI/CD?
CI (Continuous Integration) means: every time a developer pushes code, it's automatically built and tested. No more "it works on my machine!" problems.
CD (Continuous Delivery/Deployment) means: if all tests pass, the code is automatically deployed to production. No manual deployment steps.
Imagine a car factory. Old way: workers build the whole car, then quality checks it at the end, finding defects when it's too late. CI/CD way: at every step of the assembly line, the car part is automatically checked. Problems are caught immediately. Jenkins is the factory manager running the assembly line, triggering each step automatically.
Jenkins Pipeline – How It Works
Your First Jenkinsfile
A Jenkinsfile is a script that defines what Jenkins should do. It lives in your project's root directory alongside your code.
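A sketch of a declarative Jenkinsfile (the shell commands inside each stage are assumptions about a Maven-based project):

```groovy
// Jenkinsfile: pipeline-as-code, versioned alongside your app
pipeline {
    agent any
    stages {
        stage('Build') {
            steps { sh 'mvn clean package' }   // compile + package with Maven
        }
        stage('Test') {
            steps { sh 'mvn test' }            // a failing test fails the pipeline
        }
        stage('Deploy') {
            steps { sh './deploy.sh' }         // e.g. push an image, apply manifests
        }
    }
}
```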
Why Docker After Jenkins?
Jenkins builds and deploys perfectly on the build server. But on the production server: "It works in Jenkins but crashes in prod!" Why? Different Java version, different OS, missing libraries. Jenkins automates deployment, but it doesn't guarantee the environment is the same everywhere. That's where Docker comes in.
Set up Jenkins → Connect to GitHub → Create a pipeline that builds with Maven, runs tests, builds a Docker image, and deploys to Kubernetes on every push to the main branch.
- CI = Automatically build & test on every code push
- CD = Automatically deploy if tests pass
- A Jenkinsfile defines your pipeline as code, stored in Git
- Jenkins catches bugs early and eliminates manual deployments
The "Works on My Machine" Problem
Developer A writes an app on Windows with Python 3.9. Developer B pulls the code on Mac with Python 3.11. The production server runs Linux with Python 3.7. Everyone is confused when things break at each step. This is one of the most frustrating problems in software engineering.
Before shipping containers, loading cargo onto ships was chaotic. Different sizes, different methods, things breaking. Then someone invented the standard shipping container: one size, works on any ship, any truck, any train, any port in the world.
Docker containers are the shipping containers of software. You pack your app, its dependencies, its runtime (everything) into a single container. It runs identically on your laptop, your colleague's laptop, Jenkins, and production. No surprises.
VM vs Container – What's the Difference?
Core Docker Concepts
Dockerize a Node.js App
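A hedged Dockerfile sketch for a typical Node.js app (the port and the `server.js` entry file are assumptions about your project):

```dockerfile
# Small base image with Node.js preinstalled
FROM node:20-alpine

# All following commands run inside /app
WORKDIR /app

# Copy dependency manifests first so the npm install layer is cached
COPY package*.json ./
RUN npm ci --omit=dev

# Copy the rest of the source code
COPY . .

# The port the app listens on, and the start command
EXPOSE 3000
CMD ["node", "server.js"]
```

Build and run it with `docker build -t my-app .` and `docker run -p 3000:3000 my-app`.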
Docker Compose – Multi-Container Apps
Real apps have multiple pieces: a web server, a database, maybe a cache. Docker Compose lets you define and run them all together.
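A hedged docker-compose.yml sketch for a web + database + cache setup (image tags, ports, and the password are illustrative):

```yaml
# docker-compose.yml: start everything with `docker compose up`
services:
  web:
    build: .                       # build from the Dockerfile in this directory
    ports:
      - "3000:3000"                # host:container
    depends_on:
      - db
      - cache
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # never hard-code real secrets
  cache:
    image: redis:7
```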
Why Kubernetes After Docker?
Docker is great for running containers. But what happens when your app needs 100 containers running across 10 servers? Who decides which server runs which container? What if a container crashes โ who restarts it? What if traffic spikes โ who adds more containers? Docker alone can't answer these questions. You need an orchestrator. That's Kubernetes.
Write a simple Flask web app → Create a Dockerfile → Build the image → Run it locally → Push to Docker Hub → Pull it on your EC2 and run it there. Your app is now portable!
- Docker packages apps into portable containers that run anywhere
- Dockerfile = recipe, Image = packaged product, Container = running instance
- Containers are lighter and faster than virtual machines
- Docker Compose manages multi-container applications
- Docker Hub is the public registry for sharing images
What is Kubernetes?
Kubernetes (K8s) is an open-source system that automatically manages containerized applications at scale. It handles starting, stopping, distributing, scaling, and healing containers, so you don't have to do it manually.
You have 50 waiters (containers) serving 500 customers across 5 floors (servers). The restaurant manager (Kubernetes): decides which floor each waiter goes to, replaces waiters who go sick automatically, adds more waiters when the restaurant gets busy, and ensures every customer gets served. You just tell the manager "I want 50 waiters" and the rest is handled automatically.
Kubernetes Architecture
Key Kubernetes Objects
Deploy an App on Kubernetes
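A hedged manifest sketch with a Deployment (3 replicas) and a Service to expose it (names, labels, image tag, and ports are placeholders):

```yaml
# deployment.yaml: apply with `kubectl apply -f deployment.yaml`
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                        # desired state: always 3 Pods
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: yourname/my-app:1.0   # an image pushed to a registry
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  type: NodePort                     # reachable from outside (handy on Minikube)
  selector:
    app: my-app                      # routes traffic to Pods with this label
  ports:
    - port: 80
      targetPort: 3000
```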
Self-Healing – Kubernetes' Superpower
You told Kubernetes: "I want 3 replicas." If one Pod crashes at 3am, Kubernetes automatically starts a new one within seconds, without anyone waking up. This is called self-healing and it's what makes Kubernetes so powerful for production systems.
Use Minikube (local K8s) → Deploy your Dockerized Node.js app → Expose it via a Service → Scale it to 5 replicas → Simulate a pod crash and watch K8s heal itself.
- Kubernetes orchestrates containers across multiple servers
- Pods = containers, Deployments = desired state, Services = networking
- Self-healing: K8s restarts crashed containers automatically
- Auto-scaling: handles traffic spikes by adding/removing pods
- Rolling updates: deploy new versions with zero downtime
The Problem: ClickOps
Imagine you manually click through the AWS console to create 5 EC2 instances, 3 S3 buckets, 2 VPCs, security groups, and IAM roles. Three months later, you need to recreate this entire setup for a new client. Or worse โ something breaks and you have no idea what the original settings were.
Manual infrastructure = inconsistent, undocumented, irreproducible. This is called "ClickOps", and it's a DevOps anti-pattern.
When a builder constructs a house, they work from a blueprint. The blueprint describes exactly every room, door, pipe, and wire. You can build the same house in Mumbai or Delhi from the same blueprint. Terraform is the blueprint for your cloud infrastructure โ write it once, deploy anywhere, recreate identically anytime.
How Terraform Works
Create an EC2 Instance with Terraform
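A minimal Terraform sketch (the region is an example and the AMI ID is a placeholder you'd replace with a current one for your region):

```hcl
# main.tf: declares what should exist; Terraform makes it so
provider "aws" {
  region = "us-east-1"
}

resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"   # placeholder: pick a current Amazon Linux AMI
  instance_type = "t2.micro"                # free-tier eligible

  tags = {
    Name = "terraform-demo"
  }
}
```

Run `terraform init` once, `terraform plan` to preview, `terraform apply` to create, and `terraform destroy` to tear it all down.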
Key Benefits of Terraform
Write Terraform code to create: 1 VPC, 2 subnets (public + private), 1 EC2 web server in public subnet, 1 RDS database in private subnet, security groups with proper rules. Then destroy it all with one command.
- Terraform manages cloud infrastructure as code: no manual clicking
- Write .tf files → terraform plan (preview) → terraform apply (create)
- Infrastructure is version-controlled, documented, and reproducible
- Works with AWS, Azure, GCP, and 1000+ providers
- terraform destroy removes everything cleanly
Why Monitoring? The Final Piece
You've built your app, containerized it, deployed it on Kubernetes, automated your pipeline with Jenkins, and provisioned infrastructure with Terraform. But your work isn't done: you need to watch what's happening in production.
Without monitoring: servers could be running at 99% CPU right now and you wouldn't know until users complain. Memory could be slowly leaking. A database could be about to run out of space. Monitoring catches these before they become disasters.
A pilot doesn't fly blind. The cockpit has hundreds of gauges showing altitude, speed, fuel, engine temperature, wind direction. If anything goes wrong, an alarm sounds immediately. Prometheus is the system collecting all those gauge readings. Grafana is the cockpit dashboard displaying them beautifully.
Prometheus – Metrics Collector
Prometheus is an open-source monitoring system. It "scrapes" (collects) metrics from your applications and servers every few seconds and stores them in a time-series database.
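A minimal, hedged prometheus.yml showing how scrape targets are declared (the ports are the conventional defaults for Prometheus and Node Exporter):

```yaml
# prometheus.yml: scrape itself and a Node Exporter every 15 seconds
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]   # Prometheus's own metrics
  - job_name: node
    static_configs:
      - targets: ["localhost:9100"]   # Node Exporter: CPU, memory, disk
```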
PromQL – Querying Your Metrics
Prometheus has its own query language called PromQL. It looks complex but starts simply:
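A few standard starter queries (the metric names assume Node Exporter and an instrumented web app are being scraped):

```promql
# Current value of a gauge metric
node_memory_MemAvailable_bytes

# Per-second HTTP request rate over the last 5 minutes
rate(http_requests_total[5m])

# CPU usage percentage: 100 minus the idle percentage, per instance
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```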
Grafana – Beautiful Dashboards
Grafana connects to Prometheus and turns raw metrics into stunning, real-time visual dashboards. You can build dashboards with graphs, gauges, heatmaps, and tables, and share them with your whole team.
Deploy the Full Stack with Docker Compose
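A sketch of the stack as a Compose file (it assumes a prometheus.yml sits next to it; the images are the official ones and the ports are the defaults):

```yaml
# docker-compose.yml: Prometheus + Grafana + Node Exporter
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
  node-exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"
```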
Deploy Prometheus + Grafana + Node Exporter on your EC2 server → Import the "Node Exporter Full" dashboard from Grafana.com → Set up an alert that fires on Slack when CPU usage exceeds 70% for 5 minutes.
You've Completed the DevOps Journey!
- Cloud Computing + AWS – on-demand infrastructure, EC2, S3, IAM, VPC
- Linux – architecture, command line, permissions, user management
- Git & GitHub – DVCS, branching strategy, collaboration
- Shell Scripting – automation, functions, I/O redirection
- Maven – build, test, and package Java applications
- Jenkins – CI/CD pipelines, automated testing and deployment
- Docker – containerization, Dockerfiles, Docker Compose
- Kubernetes – orchestration, scaling, self-healing, rolling updates
- Terraform – Infrastructure as Code, reproducible environments
- Prometheus & Grafana – metrics, dashboards, alerting
Continue to Module 5 – AWS Deep Dive to master VPC, S3 storage classes, AWS CLI, Lambda, RDS & DynamoDB →
What is a VPC?
A Virtual Private Cloud (VPC) in AWS is a logically isolated virtual network in the cloud where you can run your own resources securely. It gives you complete control over your networking environment โ similar to having your own private data centre, but hosted on AWS infrastructure.
VPC ensures isolation from other users. You control how resources communicate internally and externally.
Think of a VPC like a gated housing society. The entire society is your VPC. Inside, there are different blocks (subnets) โ some blocks face the main road and are accessible to visitors (public subnet), while other blocks are deep inside the society and only residents can access them (private subnet). The main gate is your Internet Gateway, and the security guards are your Security Groups and NACLs.
VPC Architecture
Key VPC Components Explained
Route Table: directs traffic within the VPC; a public subnet's route table sends internet-bound traffic 0.0.0.0/0 → IGW.

Security Group vs NACL – Side by Side
| Feature | Security Group | NACL |
|---|---|---|
| Applied at | Instance level | Subnet level |
| State | Stateful: return traffic is automatically allowed | Stateless: you must define both directions |
| Rules | ALLOW only | ALLOW and DENY |
| Rule evaluation | All rules evaluated together | Rules evaluated by number (lowest first) |
| Default | Deny all inbound, allow all outbound | Allow all inbound and outbound |
| Use case | Control access to individual EC2s | Block IPs at subnet boundary |
IP Addressing in AWS
IPv4 Address Classes
Class A: 0.0.0.0 – 126.x.x.x (large networks; 127.x.x.x is reserved for loopback)
Class B: 128.0.0.0 – 191.x.x.x (medium)
Class C: 192.0.0.0 – 223.x.x.x (small)
Class D: 224–239.x.x.x (multicast)
Class E: 240–255.x.x.x (experimental)
CIDR Notation
CIDR (Classless Inter-Domain Routing) defines a network using an IP plus a prefix length:
- 10.0.0.0/16 = 65,536 IPs
- 10.0.1.0/24 = 256 IPs
- 10.0.1.0/28 = 16 IPs
Formula: 2^(32 - prefix) = total IPs
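The formula can be checked with a tiny bash helper (a sketch; `cidr_size` is my own name, not a standard command):

```bash
#!/usr/bin/env bash
# cidr_size: total addresses in a CIDR block = 2^(32 - prefix)
cidr_size() {
  local prefix=${1#*/}          # strip "10.0.0.0/" to leave the prefix length
  echo $(( 2 ** (32 - prefix) ))
}

cidr_size 10.0.0.0/16   # 65536
cidr_size 10.0.1.0/24   # 256
cidr_size 10.0.1.0/28   # 16
```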
Private IP Ranges
These ranges are for private/internal use only and are not routable on the public internet:
- 10.0.0.0/8
- 172.16.0.0/12
- 192.168.0.0/16
AWS uses these for your VPC.
Elastic IP (Static Public IP)
By default, an EC2 instance's public IP changes every stop/start. An Elastic IP is a static public IP you reserve: it stays the same even after stop/start. You're charged when it's not attached to a running instance.
Hybrid Cloud: VPN vs Direct Connect
| Feature | AWS VPN | AWS Direct Connect |
|---|---|---|
| Connection | Over public internet (encrypted) | Private dedicated line |
| Speed | Variable (depends on internet) | Consistent, high speed |
| Latency | Higher | Lower |
| Cost | Lower | Higher (physical line) |
| Setup time | Minutes | Weeks |
| Best for | Small/dev workloads, quick setup | Production, high throughput, compliance |
Create a VPC with CIDR 10.0.0.0/16 → Add a public subnet 10.0.1.0/24 and a private subnet 10.0.2.0/24 → Attach an Internet Gateway → Create a route table for the public subnet → Launch an EC2 in the public subnet → Launch an RDS in the private subnet → Verify the EC2 can reach the internet but the RDS cannot be accessed directly from outside.
S3 Recap + Key Features
Amazon S3 (Simple Storage Service) is object storage: think of it as an infinite hard drive in the cloud. It stores buckets (containers) and objects (files). Bucket names must be globally unique, and a single object can be up to 5 TB.
S3 Storage Classes – Visual Guide
S3 Storage Classes – Quick Reference
| Class | Access | Retrieval | Cost | Best For |
|---|---|---|---|---|
| Standard | Frequent | Instant, no fee | Highest | Websites, active data |
| Standard-IA | Infrequent | Instant, fee applies | Lower storage | Backups, DR |
| One Zone-IA | Infrequent | Instant, fee applies | ~20% cheaper than Standard-IA | Re-creatable data |
| Intelligent-Tiering | Unknown | Instant | Auto-optimized | Data lakes, ML |
| Glacier Instant | Rare | Milliseconds | Low | Medical archives |
| Glacier Flexible | Rare | 1 min – 12 hrs | Lower | Long-term backups |
| Deep Archive | Very rare | 12–48 hrs | Lowest | Legal/financial records kept 7+ yrs |
Use S3 Lifecycle Policies to automatically move data: keep new files in Standard for the first 30 days → move to Standard-IA for the next 60 days → archive to Glacier Flexible after 90 days. This can cut storage costs by 60–90% for older data!
What is the AWS CLI?
The AWS Command Line Interface (CLI) lets you interact with every AWS service by typing commands instead of clicking through the web console. It helps you automate AWS tasks, manage resources, and run operations in scripts โ essential for any DevOps workflow.
The AWS Console is like using a touchscreen menu at a restaurant: intuitive but slow. The AWS CLI is like calling the kitchen directly: faster, scriptable, and you can automate the same order 1000 times with a loop.
Setup & Configuration
EC2 Commands
S3 Commands
VPC Commands
Without touching the AWS Console: create a VPC + subnet + IGW → launch an EC2 inside it → upload a file to S3 → SSH into the EC2 → download the file from S3 → clean up everything. This is exactly the kind of automation you'd put in a bash script or CI/CD pipeline.
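The S3 part of that task can be sketched as a script. To keep it safe to read and rehearse, this version is a dry run by default: it only prints each AWS CLI call, and executes them only when you set `RUN=1` against your own account. The bucket name and region are placeholders:

```bash
#!/usr/bin/env bash
# Dry-run wrapper around a small S3 round-trip with the AWS CLI.
set -euo pipefail

BUCKET="my-devops-demo-bucket-12345"   # placeholder: bucket names must be globally unique
REGION="us-east-1"

run() {
  # Execute the command when RUN=1; otherwise just print it for review.
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

run aws s3 mb "s3://$BUCKET" --region "$REGION"   # make a bucket
run aws s3 cp index.html "s3://$BUCKET/"          # upload a file
run aws s3 ls "s3://$BUCKET"                      # list the bucket's objects
run aws s3 rb "s3://$BUCKET" --force              # delete the bucket and contents
```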
AWS Lambda – Serverless Computing
AWS Lambda lets you run code without provisioning or managing servers. You upload your function, define when it triggers, and AWS handles everything else. You pay only for the compute time you use โ down to the millisecond.
Running a traditional server is like buying a generator and running it 24/7. Lambda is like using the electrical grid – you only pay for the watts you actually use. When no one's using your function, it costs nothing.
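As a sketch, deploying and invoking a function from the CLI could look like this – the function name, runtime, handler, and role ARN are all assumptions, and it needs real credentials plus a packaged function.zip to run:

```shell
# Sketch: create and invoke a Lambda function via the AWS CLI.
# Every name and the role ARN below are placeholders.
deploy_and_invoke() {
  aws lambda create-function \
    --function-name hello-fn \
    --runtime python3.12 \
    --handler handler.lambda_handler \
    --zip-file fileb://function.zip \
    --role arn:aws:iam::123456789012:role/lambda-basic-exec

  # Trigger it once and write the response to a file
  aws lambda invoke --function-name hello-fn response.json
}

# Uncomment once you have credentials and a packaged function.zip:
# deploy_and_invoke
```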
Amazon SNS โ Simple Notification Service
SNS is a messaging service under the Application Integration category. It's used to send notifications to many subscribers at once. The key components are topics (a channel) and subscriptions (who receives from that channel).
📢 Topics
A topic is a communication channel. Publishers send messages to the topic. Think of it like a broadcast announcement system – one message sent to the topic reaches all subscribers.
📬 Subscriptions
Subscribers choose how to receive notifications: via email, SMS, HTTP endpoint, Lambda, SQS, or mobile push. One topic can have millions of subscribers.
CloudWatch โ AWS Monitoring
Amazon CloudWatch monitors AWS resources and applications: it collects metrics (CPU, disk, network), aggregates logs, and fires alarms that can notify you (for example via SNS) or trigger auto scaling. It falls under the Management & Governance category.
RDS vs DynamoDB โ Which Database to Use?
| Feature | 🗄️ Amazon RDS | ⚡ Amazon DynamoDB |
|---|---|---|
| Type | Relational (SQL) – tables with rows and columns | NoSQL – key-value and document-based |
| Engines | MySQL, PostgreSQL, Oracle, MariaDB, SQL Server, Aurora | DynamoDB only (proprietary) |
| Schema | Fixed schema – define columns upfront | Schema-less – each item can have different attributes |
| Scaling | Vertical (bigger instance) + read replicas | Automatic, horizontal, to millions of req/sec |
| Latency | Low (milliseconds) | Ultra-low (single-digit milliseconds) |
| Best for | Complex queries, relationships, financial data | Real-time apps, gaming, IoT, mobile backends |
| Multi-region | Multi-AZ for HA | Global Tables – multi-region replication built-in |
If your data has complex relationships and you need SQL queries (JOINs, transactions) → use RDS. If you need massive scale, flexible schema, and single-digit ms latency with simple access patterns → use DynamoDB.
How to Approach DevOps Interviews
DevOps interviews test three things: conceptual understanding (can you explain it simply?), practical knowledge (have you actually done it?), and problem-solving (can you debug a broken pipeline?). Always answer with real examples – even if they're from personal projects.
☁️ AWS Interview Questions
EC2 is a virtual machine that you provision and manage – you pay per hour whether the server is doing work or sitting idle. Lambda is serverless – you write a function, AWS runs it when triggered, and you pay only for the milliseconds it actually runs. Use EC2 for long-running workloads, Lambda for event-driven short-lived tasks.
Security Groups are stateful firewalls at the instance level – allow inbound, and the return traffic is automatically allowed. They support ALLOW rules only. NACLs are stateless firewalls at the subnet level – you must explicitly define both inbound and outbound rules. They support both ALLOW and DENY. Use Security Groups for instance-level control and NACLs for blocking specific IPs at the subnet boundary.
A public subnet has a route to the Internet Gateway, so resources inside can be accessed from the internet (and can access it). A private subnet has no direct route to the internet – resources inside are not directly reachable from outside. Private resources use a NAT Gateway if they need to make outbound internet calls (e.g. to download packages).
IAM (Identity and Access Management) controls who can do what in your AWS account. It's the foundation of AWS security. You create users (individuals), groups (collections of users), roles (temporary identities for services to assume), and policies (JSON documents defining permissions). The golden rule is the Principle of Least Privilege – grant only the minimum permissions needed.
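For instance, a least-privilege policy that only allows reading one S3 bucket might look like this (the bucket name is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyOneBucket",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ]
    }
  ]
}
```

Note the two Resource lines: s3:ListBucket applies to the bucket itself, while s3:GetObject applies to the objects inside it.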
S3 is object storage for any type of data. Key classes: Standard (frequent access, highest cost), Standard-IA (infrequent access, lower cost but retrieval fee), Glacier (archival, very low cost, retrieval takes time), and Intelligent-Tiering (auto-moves data between tiers based on access patterns – good when you don't know your access patterns). All classes offer 11 nines (99.999999999%) durability.
🐧 Linux Interview Questions
A process is any running program – it has a PID, uses CPU and memory, and exits when done. A daemon is a background process that runs continuously without user interaction, usually started at boot. Examples: nginx (web server daemon), sshd (SSH daemon). Daemons typically end in 'd' by convention.
chmod 755 script.sh sets permissions using the octal system: 7 (owner) = read+write+execute (4+2+1), 5 (group) = read+execute (4+1), 5 (others) = read+execute. So the owner can do everything, group and others can read and run but not modify. Always use 755 for scripts you want to be executable.
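A quick demo of the octal math (assumes a Linux shell; stat -c is GNU coreutils syntax):

```shell
# Create a throwaway script and apply the 4+2+1 octal permissions
echo '#!/bin/sh' > demo.sh
chmod 755 demo.sh            # owner: rwx (7), group: r-x (5), others: r-x (5)

stat -c '%a' demo.sh         # prints: 755
ls -l demo.sh                # permission string reads -rwxr-xr-x
```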
> redirects output and overwrites the file โ use it carefully, it deletes the existing content. >> redirects output and appends to the file, preserving existing content. Example: echo "line1" > file.txt creates/overwrites; echo "line2" >> file.txt adds to the file.
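A ten-second demo of the difference:

```shell
# '>' truncates (overwrites), '>>' appends
echo "line1" > notes.txt     # notes.txt now contains: line1
echo "line2" >> notes.txt    # notes.txt now contains both lines
wc -l < notes.txt            # counts 2 lines

echo "fresh" > notes.txt     # '>' wipes the previous content first
cat notes.txt                # prints: fresh
```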
🌿 Git Interview Questions
git merge creates a new "merge commit" that combines two branches – it preserves the full history of both branches. git rebase rewrites the history by replaying your commits on top of another branch – it creates a cleaner, linear history but rewrites commit hashes. Rule: use merge for public/shared branches, rebase for local cleanup before pushing.
git stash temporarily saves your uncommitted changes (both staged and unstaged) so you can switch branches or work on something else without committing half-done work. Run git stash pop to restore your saved changes later. Think of it as a clipboard for your work-in-progress.
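A throwaway demo of the workflow (assumes git is installed; run it in a scratch directory):

```shell
# Throwaway repo to demonstrate stash (assumes git is installed)
mkdir stash-demo && cd stash-demo
git init -q
git config user.email "demo@example.com" && git config user.name "Demo"

echo "v1" > app.txt
git add app.txt && git commit -q -m "initial commit"

echo "half-done work" >> app.txt   # an uncommitted change
git stash                          # working tree is clean again
cat app.txt                        # prints: v1

git stash pop                      # the half-done work comes back
cat app.txt                        # the stashed line is restored
```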
git fetch downloads changes from the remote repository but does NOT merge them into your local branch – it just updates your local copy of the remote branches. git pull = git fetch + git merge – it downloads AND merges. Use fetch when you want to see what changed before merging.
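You can see the difference with two throwaway local repos (assumes git is installed; run in a scratch directory):

```shell
# 'upstream' plays the remote; 'local' is your clone
git init -q upstream
( cd upstream \
  && git config user.email up@example.com && git config user.name Upstream \
  && echo one > f.txt && git add f.txt && git commit -q -m "first" )

git clone -q upstream local

# The remote moves ahead by one commit
( cd upstream && echo two >> f.txt && git commit -q -am "second" )

cd local
git config pull.ff only       # silence git's reconcile-strategy hint
git fetch -q                  # downloads the new commit into origin/*
after_fetch=$(cat f.txt)      # working tree unchanged: still just "one"

git pull -q                   # fetch (a no-op now) + merge: fast-forwards
after_pull=$(tail -n 1 f.txt) # the new commit's line is now present
echo "after fetch: $after_fetch / last line after pull: $after_pull"
```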
🐳 Docker Interview Questions
A Docker image is a read-only template – like a recipe or a blueprint. It contains the OS, dependencies, and your application code. A container is a running instance of an image – like a dish cooked from the recipe. You can run many containers from the same image simultaneously. Images are stored in registries (Docker Hub); containers run on your host.
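To make the "recipe" concrete, a minimal image definition might look like this sketch (the base image tag and app filename are assumptions):

```dockerfile
# The image is the recipe: a base OS layer, dependencies, then your code
FROM python:3.12-slim

WORKDIR /app
COPY app.py .

# Every container started from this image runs this command
CMD ["python", "app.py"]
```

docker build -t myapp . bakes the image once; docker run myapp can then start any number of containers from that same image.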
A VM virtualizes an entire computer including its own OS – heavy, slow to start (minutes), uses gigabytes of RAM. Docker containers share the host OS kernel – lightweight, start in seconds, use megabytes. The trade-off: VMs provide stronger isolation (each has its own kernel); containers are faster and more efficient but share the kernel.
Scenario-Based Questions
| Scenario | What They're Testing | Key Points to Cover |
|---|---|---|
| "Your deployment just failed in production at 2 AM – what do you do?" | Incident response, communication | Check logs first → roll back quickly → alert the team → root-cause analysis after service is restored |
| "Your EC2 instance is running out of disk space – how do you fix it?" | Linux, AWS troubleshooting | df -h → find large files with du → extend the EBS volume or clean logs → add a lifecycle policy |
| "Your Jenkins pipeline keeps failing – where do you look first?" | CI/CD debugging | Check the console output → check the SCM connection → verify environment variables and credentials → check agent status |
| "How would you set up a completely new AWS environment from scratch?" | Architecture, IaC | VPC → subnets → IGW → route tables → security groups → EC2/ECS → RDS → CloudWatch alarms → Terraform for everything |
- Can you explain every module in this course to a non-technical person?
- Have you deployed a real project end-to-end with a CI/CD pipeline?
- Can you write a Dockerfile and Docker Compose file from scratch?
- Can you create a VPC with public/private subnets in the AWS Console AND via CLI?
- Do you have a GitHub repo with your projects to show interviewers?
- Have you set up Prometheus + Grafana monitoring on at least one app?
- Can you explain what happens when you type a URL in the browser?
What is DevSecOps?
DevSecOps = Development + Security + Operations. It's the practice of baking security into every stage of the DevOps pipeline – not bolting it on at the end as an afterthought.
The old approach: developers build the app, the security team scans it at the end, finds 500 vulnerabilities, everyone panics. The DevSecOps approach: security checks run automatically at every commit, every build, every deployment. Problems are caught when they're cheap to fix – not after you're in production.
Imagine building a skyscraper and only checking if it's structurally sound after it's finished. That would be insane – you'd build safety into every floor as you go. DevSecOps does the same for software: safety at every layer, from the start.
Security at Every Stage
Essential Security Practices
Secrets Management โ The Right Way
Every week, developers accidentally commit API keys to public GitHub repos. Attackers have bots that scan GitHub 24/7 for leaked credentials. A leaked AWS key can result in a ₹10 lakh bill within hours from crypto miners. Always add .env to your .gitignore!
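A quick way to verify that your .gitignore actually protects the secret (assumes git is installed; the key value is fake):

```shell
# Throwaway repo: prove that an ignored .env never reaches the index
git init -q secrets-demo && cd secrets-demo
echo "AWS_SECRET_ACCESS_KEY=fake-value-for-demo" > .env   # fake secret

echo ".env" > .gitignore       # the one line that prevents the leak
git check-ignore .env          # prints: .env  (git confirms it is ignored)

git add .                      # stages .gitignore, silently skips .env
git status --short             # .gitignore is staged; .env appears nowhere
```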
You've Completed DevOps Zero to Hero! 🎉
You've gone from zero knowledge to understanding the complete DevOps lifecycle. Now it's time to think about where to take this next. The DevOps field is massive – here's a map of the paths ahead.
Your DevOps Career Roadmap
Certifications to Aim For
Tools to Learn Next
| Category | Tool | Why Learn It |
|---|---|---|
| GitOps / CD | ArgoCD | Sync Kubernetes deployments automatically from Git – the modern CD approach |
| Service Mesh | Istio | Advanced networking, traffic management, and security between microservices |
| Config Management | Ansible | Automate server setup and configuration at scale with simple YAML playbooks, no agents required |
| Log Management | ELK Stack | Elasticsearch + Logstash + Kibana – centralized logging for distributed apps |
| Secret Management | HashiCorp Vault | Industry standard for managing secrets, tokens, and certificates |
| Platform | GitHub Actions | CI/CD built right into GitHub – a simpler alternative to Jenkins for many teams |
Build things. Break things. Fix things. No certification or tutorial replaces the learning you get from deploying a real app, watching it fail, and figuring out why. Pick a personal project – even a simple website – and take it all the way through the pipeline you've learned here. That's how DevOps engineers are made.
Build a complete DevOps pipeline from scratch: Create a simple Node.js or Python app → Push to GitHub → Set up a Jenkins CI/CD pipeline → Build a Docker image → Deploy to Kubernetes on AWS (EKS) → Set up Terraform to provision the infrastructure → Monitor with Prometheus + Grafana → Add security scanning with Trivy. This single project will demonstrate everything in this course to any employer.
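The skeleton of such a pipeline, as a declarative Jenkinsfile sketch – every image name, registry, and deployment target below is a placeholder:

```groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            // Bake the Docker image, tagged with the Jenkins build number
            steps { sh 'docker build -t myregistry/myapp:${BUILD_NUMBER} .' }
        }
        stage('Scan') {
            // Trivy scans the image for known vulnerabilities before it ships
            steps { sh 'trivy image myregistry/myapp:${BUILD_NUMBER}' }
        }
        stage('Push') {
            steps { sh 'docker push myregistry/myapp:${BUILD_NUMBER}' }
        }
        stage('Deploy') {
            // Roll the new image out to the Kubernetes deployment
            steps { sh 'kubectl set image deployment/myapp app=myregistry/myapp:${BUILD_NUMBER}' }
        }
    }
}
```

A real version would add credential bindings for the registry and cluster, and a Terraform stage to provision the infrastructure first.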