Linux System Administrator Interview Questions
33+ Linux system administration interview questions covering the file system, permissions, processes, networking, services, and shell scripting, with practical, command-level answers.
Questions: 33+ | Topics: 6 | Est. time: 2 hours
File System & Storage
How is the Linux file system structured?
Linux uses a single root hierarchy /. Key directories:
| Directory | Purpose |
|---|---|
| /bin, /usr/bin | Essential user binaries (ls, cp, bash) |
| /sbin, /usr/sbin | System administration binaries (fdisk, iptables) |
| /etc | System-wide configuration files |
| /var | Variable data (logs: /var/log, spool, mail) |
| /home | User home directories |
| /boot | Kernel, initrd, GRUB (files needed at boot) |
| /proc, /sys | Virtual filesystems exposing kernel and process information |
| /tmp | Temporary files; usually cleared on reboot |
| /dev | Device files (block/character devices) |
| /mnt, /media | Mount points for temporary and removable filesystems |
What is an inode and why does it matter?
An inode is a data structure that stores metadata about a file:
- File type, permissions, owner, group
- Timestamps (atime, mtime, ctime)
- Number of hard links
- Pointers to data blocks on disk
What an inode does NOT store: the filename. Filenames are stored in directory entries, which map names to inode numbers.
Why it matters in practice:
- A filesystem can run out of inodes before running out of disk space: many small files (e.g., thousands of log files) exhaust inodes. Check with df -i.
- Hard links share the same inode; a file's data is freed only when its link count drops to 0 and no process still holds it open.
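The name-to-inode mapping described above can be observed directly. The following is a minimal sketch, assuming GNU coreutils (stat -c) and a scratch directory created with mktemp; the file names are hypothetical.

```shell
# Sketch: a filename is only a directory entry; the inode carries the metadata.
tmp=$(mktemp -d)                          # scratch directory for the demo
echo "hello" > "$tmp/a.txt"

inode_before=$(stat -c %i "$tmp/a.txt")   # inode number of the file
mv "$tmp/a.txt" "$tmp/b.txt"              # rename within the same filesystem
inode_after=$(stat -c %i "$tmp/b.txt")    # new name, same inode

ln "$tmp/b.txt" "$tmp/c.txt"              # add a second name for that inode
link_count=$(stat -c %h "$tmp/b.txt")     # hard link count stored in the inode

echo "inode before rename: $inode_before, after: $inode_after, links: $link_count"
rm -rf "$tmp"
```

Renaming changes only the directory entry, so the inode number stays the same, while adding a hard link raises the link count kept in the inode.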
Explain hard links vs symbolic links.
Hard link (ln source link): A directory entry pointing to the same inode as the original file. Both entries are equal — deleting the original doesn't remove the file; the inode remains until all hard links are deleted. Hard links cannot span filesystems or link directories.
Symbolic link (ln -s source link): A file containing the path to another file (like a shortcut). If the original is deleted, the symlink becomes broken. Symlinks can cross filesystems and can point to directories.
ln /etc/hosts /tmp/hosts-hard # hard link
ln -s /etc/hosts /tmp/hosts-sym # symbolic link
ls -la /tmp/hosts-sym # shows -> /etc/hosts
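To see the difference in behaviour when the original name is removed, here is a small sketch that works in a scratch directory rather than touching /etc/hosts. The directory and file names are assumptions made for the demo.

```shell
# Sketch: delete the original name and compare what each link type still sees.
tmp=$(mktemp -d)
echo "data" > "$tmp/original"

ln    "$tmp/original" "$tmp/hard"     # hard link: second name for the same inode
ln -s "$tmp/original" "$tmp/sym"      # symlink: a small file holding the path

rm "$tmp/original"                    # remove the original name

hard_content=$(cat "$tmp/hard")       # the inode is still referenced by "hard"

# -L: the symlink itself exists; ! -e: its target no longer resolves
if [ -L "$tmp/sym" ] && [ ! -e "$tmp/sym" ]; then
  sym_state="dangling"
else
  sym_state="ok"
fi

echo "hard link content: $hard_content, symlink: $sym_state"
rm -rf "$tmp"
```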
How do you mount and unmount a filesystem?
# View currently mounted filesystems
mount # human-readable
lsblk # block device tree
# Mount a device
mkdir -p /mnt/data
mount /dev/sdb1 /mnt/data # mount ext4/xfs/etc.
mount -t xfs /dev/sdb1 /mnt/data # explicit type
# Unmount
umount /mnt/data
# If "busy": find what's using it
lsof +D /mnt/data
fuser -m /mnt/data
# Persistent mount: add to /etc/fstab
# /dev/sdb1 /mnt/data xfs defaults 0 2
/etc/fstab entries are checked with mount -a — run this after editing fstab to check for errors before rebooting.
How do you check disk usage and what do you do when a filesystem is full?
df -h # filesystem usage (human readable)
df -i # inode usage
du -sh /var/* # disk usage per directory
du -sh * | sort -rh | head -20 # largest directories
# When /var/log is full:
journalctl --vacuum-size=200M # trim systemd journal
find /var/log -name "*.gz" -mtime +30 -delete # old compressed logs
truncate -s 0 /var/log/syslog # zero a log file without deleting it
# Filesystem resize (after extending disk in cloud; if the disk is partitioned, grow the partition first, e.g. growpart /dev/sda 1):
# ext4:
resize2fs /dev/sda1
# xfs (must be mounted):
xfs_growfs /mnt/data
What are LVM and its key concepts?
LVM (Logical Volume Manager) adds an abstraction layer between physical disks and filesystems, enabling flexible resizing, snapshots, and spanning multiple disks.
| Concept | Description |
|---|---|
| PV (Physical Volume) | A disk or partition initialised for LVM (pvcreate /dev/sdb) |
| VG (Volume Group) | Pool of storage from one or more PVs (vgcreate vg0 /dev/sdb) |
| LV (Logical Volume) | Virtual partition carved from a VG (lvcreate -L 50G -n data vg0) |
Extending an LV online (no downtime for ext4/xfs):
lvextend -L +20G /dev/vg0/data
resize2fs /dev/vg0/data # ext4
xfs_growfs /mnt/data # xfs
LVM snapshots: point-in-time consistent copy for backup without unmounting.
How do /proc and /sys differ?
/proc is a virtual filesystem providing a window into the kernel and running processes:
- /proc/cpuinfo: CPU information
- /proc/meminfo: memory statistics
- /proc/<pid>/: per-process information (fd, maps, status)
- /proc/sys/: tunable kernel parameters (also accessible via sysctl)
/sys (sysfs) exposes device and driver model information:
- Device properties, driver binding, power management.
- Used by udev to discover and configure hardware.
Neither directory contains actual files on disk — everything is generated on read by the kernel.
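As a quick illustration, the current shell can inspect its own entry under /proc. This is a sketch that assumes a Linux system with /proc mounted and GNU stat available; the variable names are made up for the example.

```shell
# Sketch: read this shell's own process entry; $$ expands to the shell's PID.
proc_pid=$(awk '/^Pid:/ {print $2}' "/proc/$$/status")
proc_name=$(awk '/^Name:/ {print $2}' "/proc/$$/status")
echo "PID from /proc: $proc_pid (shell reports $$), name: $proc_name"

# Virtual files typically report size 0 because the kernel generates
# their content at read time rather than storing it on disk.
stat -c '%n size=%s' /proc/meminfo
```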
Permissions & Users
Explain Linux file permissions and how the octal notation works.
Every file has three permission triplets: owner, group, others. Each triplet has three bits: read (4), write (2), execute (1).
-rwxr-xr-- 1 alice devs 4096 Apr 7 09:00 script.sh
↑↑↑ ↑↑↑ ↑↑↑
│││ │││ └── others: r-- = 4
│││ └────── group: r-x = 5
└────────── owner: rwx = 7
Octal: chmod 754 script.sh sets owner=rwx(7), group=r-x(5), others=r--(4).
Octal cheatsheet:
- 644: regular file (owner rw, others r)
- 755: executable/directory (owner rwx, others rx)
- 600: private file (owner rw only, e.g., SSH private key)
- 700: private directory
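The octal and symbolic forms can be checked against each other with stat. A minimal sketch, assuming GNU stat and a throwaway file in a mktemp directory:

```shell
# Sketch: set a mode in octal, then read it back in both notations.
tmp=$(mktemp -d)
touch "$tmp/script.sh"

chmod 754 "$tmp/script.sh"
perms=$(stat -c '%a %A' "$tmp/script.sh")    # octal form, then symbolic form
echo "$perms"                                # 754 -rwxr-xr--

chmod u=rw,go=r "$tmp/script.sh"             # symbolic way of writing 644
perms_644=$(stat -c %a "$tmp/script.sh")
echo "$perms_644"

rm -rf "$tmp"
```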
What are SUID, SGID, and sticky bit?
SUID (Set User ID) on an executable: runs as the file's owner, not the invoking user. Used for /bin/passwd (needs root access to /etc/shadow). Octal: chmod 4755.
SGID (Set Group ID) on an executable: runs as the file's group. On a directory: new files inherit the directory's group instead of the creator's primary group. Useful for shared project directories. Octal: chmod 2755.
Sticky bit on a directory: users can only delete their own files even if they have write permission to the directory. Set on /tmp and shared directories. Octal: chmod 1777.
ls -la /tmp # drwxrwxrwt — 't' is the sticky bit
ls -la /bin/passwd # -rwsr-xr-x — 's' is SUID
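The special bits appear as a fourth leading octal digit and as s or t in the symbolic listing. The sketch below sets them on scratch files (hypothetical names) so nothing on the real system is changed; it assumes GNU stat.

```shell
# Sketch: set the sticky bit on a directory and SUID on a file, then inspect.
tmp=$(mktemp -d)

mkdir "$tmp/shared"
chmod 1777 "$tmp/shared"                    # sticky bit + rwx for everyone
sticky=$(stat -c '%a %A' "$tmp/shared")
echo "$sticky"                              # 1777 drwxrwxrwt

touch "$tmp/tool"
chmod 4755 "$tmp/tool"                      # SUID + rwxr-xr-x
suid=$(stat -c '%a %A' "$tmp/tool")
echo "$suid"                                # 4755 -rwsr-xr-x

rm -rf "$tmp"
```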
How do you manage users and groups?
# Users
useradd -m -s /bin/bash -G sudo alice # create user with home dir, sudo group
usermod -aG docker alice # add to group (append with -a)
passwd alice # set password
userdel -r alice # delete user and home directory
# Groups
groupadd developers
groupdel developers
# View
id alice # uid, gid, groups
cat /etc/passwd # all users (name:x:uid:gid:comment:home:shell)
cat /etc/group # group membership
# Sudo
visudo # safely edit /etc/sudoers
# alice ALL=(ALL:ALL) ALL — full sudo
# alice ALL=(ALL) NOPASSWD:/usr/bin/systemctl restart nginx — specific cmd
What is sudo and how does it differ from su?
sudo runs a single command as another user (typically root), using the invoking user's password and the /etc/sudoers policy. Leaves an audit trail in /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS).
su switches the session to another user (requires that user's password). su - starts a full login shell as root.
Best practice: Never log in directly as root. Use sudo for administrative commands. Restrict sudo to specific commands where possible (sudoers per-command rules). Use sudo -l to see what a user is allowed to run.
What is umask?
umask is a mask applied to new file permissions; every bit set in the mask is cleared from the default:
- Default file permissions: 666 (no execute)
- Default directory permissions: 777
- Common umask: 022 → files: 644, directories: 755
umask # show current
umask 027 # set: files 640, directories 750 (no access for others)
Set persistently in /etc/profile, ~/.bashrc, or /etc/login.defs (UMASK).
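The effect of a umask is easy to verify in a scratch directory. This sketch assumes GNU stat and restores the previous umask afterwards; the paths are created with mktemp for the demo only.

```shell
# Sketch: apply umask 027 and check the modes of newly created entries.
tmp=$(mktemp -d)
old_umask=$(umask)                    # remember the current mask

umask 027
touch "$tmp/file"
mkdir "$tmp/dir"
file_mode=$(stat -c %a "$tmp/file")   # 666 with the 027 bits cleared
dir_mode=$(stat -c %a "$tmp/dir")     # 777 with the 027 bits cleared
echo "file: $file_mode, dir: $dir_mode"

umask "$old_umask"                    # restore the original mask
rm -rf "$tmp"
```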
Processes & Services
How do you view and manage processes?
ps aux # all processes (BSD syntax)
ps -ef # all processes (POSIX syntax)
top # interactive process viewer (htop is a friendlier alternative)
pgrep nginx # find PID by name
pidof nginx # same
# Kill
kill -15 <pid> # SIGTERM — graceful shutdown
kill -9 <pid> # SIGKILL — forceful termination (use as last resort)
killall nginx # kill all processes named nginx
pkill -f pattern # kill by command-line pattern
# Background & foreground
command & # run in background
jobs # list background jobs
fg %1 # bring job 1 to foreground
bg %1 # resume job 1 in background
nohup command & # immune to hangup signal; output to nohup.out
Explain systemd and key systemctl commands.
systemd is the init system (PID 1) on modern Linux. It manages services (units), parallelises startup, handles dependencies, and provides logging via journald.
# Service management
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx # reload config without full restart (if supported)
systemctl status nginx # show status, last 10 log lines
# Enable/disable auto-start on boot
systemctl enable nginx
systemctl disable nginx
# Listing
systemctl list-units --type=service --state=running
systemctl list-unit-files
# Logs (journald)
journalctl -u nginx -n 100 # last 100 lines for nginx
journalctl -u nginx -f # follow (tail -f equivalent)
journalctl -b -p err # errors since boot
journalctl --since "1 hour ago"
What happens between pressing Enter on a command and seeing output?
- Bash parses the command line, performs expansions (variables, globbing, command substitution).
- A fork creates a child process (copy of the shell).
- exec replaces the child process with the target binary.
- Kernel loads the binary from disk, maps it to memory.
- Dynamic linker loads shared libraries.
- The program runs; writes to stdout (fd 1), stderr (fd 2).
- Shell waits (blocks) until the child exits (wait syscall).
- Return code stored in $?.
This is the fork-exec model — interviewers often ask about this to test UNIX fundamentals.
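The fork, exec, and wait steps can be made visible from the shell itself. The following is a rough sketch: the exit codes 3 and 5 are arbitrary values chosen for the demo, and the "|| var=$?" form is used only so the snippet also behaves under set -e.

```shell
# Sketch: '&' makes the shell fork a child, which then execs the target program.
sh -c 'exit 3' &
child_pid=$!                      # PID of the forked child

status=0
wait "$child_pid" || status=$?    # wait() collects the child's exit status
echo "child $child_pid exited with status $status"

# A subshell is a forked copy of the shell; exec replaces that copy in place.
sub_status=0
( exec sh -c 'exit 5' ) || sub_status=$?
echo "subshell replaced by exec exited with status $sub_status"
```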
What is a zombie process and an orphan process?
Zombie: A process that has exited but its parent hasn't called wait() to collect its exit status. The zombie exists only as an entry in the process table. Cannot be killed with kill -9 (it's already dead). Fix: fix the parent to call wait(), or kill the parent (init/systemd will reap orphans automatically).
Orphan: A process whose parent has already died. The orphan is adopted by PID 1 (init/systemd), which will eventually wait() for it. Generally harmless.
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/' # find zombie processes (STAT starts with Z)
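A zombie can be produced on purpose to see what it looks like. This is a sketch under a few assumptions: a Linux /proc, a sleep that accepts fractional seconds, and timings chosen arbitrarily for the demo.

```shell
# Sketch: the inner shell starts a short-lived child, prints its PID, then
# execs into 'sleep 2'. That sleep process is now the child's parent but
# never calls wait(), so the child turns into a zombie once it exits.
tmp=$(mktemp -d)
sh -c 'sleep 0.1 & echo "$!"; exec sleep 2' > "$tmp/child.pid" &
parent_pid=$!

sleep 0.5                                  # give the child time to exit
child_pid=$(cat "$tmp/child.pid")

# Field 3 of /proc/<pid>/stat is the process state; Z means zombie.
child_state=$(awk '{print $3}' "/proc/$child_pid/stat")
echo "child $child_pid state: $child_state"

wait "$parent_pid"    # after the parent exits, the zombie is re-parented and reaped
rm -rf "$tmp"
```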
How do you troubleshoot high CPU or memory on a Linux system?
High CPU:
top # which process is using CPU? (or htop)
ps aux --sort=-%cpu | head -10
pidstat -u 1 5 # per-process CPU over 5 samples
perf top # kernel-level profiling
strace -p <pid> # system calls being made
High memory:
free -h # overall usage, swap
ps aux --sort=-%mem | head -10
grep VmRSS /proc/<pid>/status # resident memory for a process
smem -r # shared memory-aware reporting
Is it a memory leak? Compare RSS over time: watch -n 5 'ps -o rss= -p <pid>'
What is the OOM killer and when does it trigger?
The OOM (Out Of Memory) killer is the kernel's last resort when the system runs out of both RAM and swap. It scores each process with an oom_score (higher score = more memory, less vital) and kills the highest-scored process.
journalctl -k | grep -i "Out of memory" # check if OOM killer fired
cat /proc/<pid>/oom_score # check a process's score
echo -1000 > /proc/<pid>/oom_score_adj # exempt a process from the OOM killer (root only; range -1000 to 1000, -17 applied only to the legacy oom_adj file)
Prevention: Set container/cgroup memory limits; size VMs correctly; monitor MemAvailable in /proc/meminfo and alert before hitting swap.
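Monitoring MemAvailable can start as a few lines of shell. This is a sketch only: the 10% threshold is an arbitrary assumption, and a real setup would feed the value into a monitoring system rather than echo it.

```shell
# Sketch: compute available memory as a percentage of total from /proc/meminfo.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
avail_pct=$(( mem_avail_kb * 100 / mem_total_kb ))

threshold_pct=10    # hypothetical alert threshold
if [ "$avail_pct" -lt "$threshold_pct" ]; then
  echo "WARNING: only ${avail_pct}% of memory is available"
else
  echo "OK: ${avail_pct}% of memory is available"
fi
```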
What are cgroups and why are they important?
cgroups (control groups) is a Linux kernel feature that limits, measures, and isolates resource usage (CPU, memory, disk I/O, network) for groups of processes.
It is the foundation for containers — Docker and Kubernetes use cgroups to enforce:
- Per-container CPU shares and limits
- Per-container memory limits (triggers OOM if exceeded)
- Block I/O throttling
cat /proc/<pid>/cgroup # which cgroup a process belongs to
systemd-cgtop # top-like view of cgroup resource usage
Understanding cgroups is critical for troubleshooting containers running out of memory or being CPU-throttled.
Networking
How do you display and configure network interfaces on Linux?
# Modern (iproute2)
ip addr show # list interfaces and IPs
ip addr add 10.0.0.10/24 dev eth0 # add IP
ip addr del 10.0.0.10/24 dev eth0 # remove IP
ip link set eth0 up # bring interface up (use 'down' to disable it)
ip route show # routing table
ip route add default via 10.0.0.1 # add default gateway
# Legacy (deprecated but still common in exams)
ifconfig eth0
route -n
Persistent configuration: /etc/netplan/*.yaml (Ubuntu 18.04+), NetworkManager (nmcli), or /etc/sysconfig/network-scripts/ (RHEL/CentOS).
How do you troubleshoot network connectivity issues?
Systematic triage from the bottom of the network stack up:
# Layer 1/2: Is the interface up?
ip link show eth0
ethtool eth0 # physical link status, speed, duplex
# Layer 3: Do we have an IP and route?
ip addr show eth0
ip route show
# Layer 3: Can we reach the gateway?
ping -c 4 10.0.0.1
# Layer 3: Can we reach a remote IP?
ping -c 4 8.8.8.8
# DNS: Does name resolution work?
nslookup google.com
dig google.com @8.8.8.8 # query specific DNS server
cat /etc/resolv.conf # DNS configuration
# Layer 4: Can we reach the port?
telnet 10.0.0.5 443
nc -zv 10.0.0.5 443
curl -v https://api.example.com/health
# Trace route
traceroute 8.8.8.8
mtr 8.8.8.8 # continuous traceroute with statistics
What are common ss and netstat commands for network diagnostics?
# ss (modern, replaces netstat)
ss -tlnp # TCP listening sockets with process names
ss -tunap # TCP+UDP all with process names
ss -s # summary statistics
# Who's listening on port 443?
ss -tlnp | grep :443
lsof -i :443
# Established connections
ss -tnp state established
# netstat (legacy but still found)
netstat -tlnp # listening TCP
netstat -anp # all connections with processes
What is iptables and how do you add a basic rule?
iptables is the userspace tool for configuring netfilter, the Linux kernel's packet filtering framework (superseded by nftables, but still widely used).
Key chains:
- INPUT: packets destined for the local system
- OUTPUT: packets sent from the local system
- FORWARD: packets routed through the system
# List rules
iptables -L -n -v --line-numbers
# Allow inbound SSH
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
# Allow established/related connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT # conntrack replaces the older '-m state --state' match
# Drop everything else (default DROP policy)
iptables -P INPUT DROP
# Save rules (RHEL/CentOS)
service iptables save
# Ubuntu: iptables-persistent
On modern systems, use firewalld (firewall-cmd) or ufw as frontends that manage the underlying iptables or nftables rules.
How does DNS resolution work on Linux?
- Check /etc/hosts first (configurable via the hosts: line in /etc/nsswitch.conf).
- If not found, send the query to the nameservers in /etc/resolv.conf.
- Recursive resolver (often the router or a cloud-provided DNS) resolves the name.
cat /etc/resolv.conf # nameserver, search domains
cat /etc/nsswitch.conf # resolution order on the hosts: line (files, dns, mdns)
# Clear DNS cache (systemd-resolved)
resolvectl flush-caches # current systemd releases
systemd-resolve --flush-caches # older releases, before resolvectl replaced it
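# Debug resolution
getent hosts google.com # resolves via nsswitch.conf, the way most applications do
strace -e trace=network nslookup google.com # nslookup queries DNS directly and ignores /etc/hosts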
Explain TCP three-way handshake.
The TCP three-way handshake establishes a connection before data transfer:
- SYN — Client sends a SYN packet with an initial sequence number (ISN).
- SYN-ACK — Server responds with its own ISN and acknowledges the client's ISN (ACK = client ISN + 1).
- ACK — Client acknowledges the server's ISN.
Connection is established; data can flow.
Why this matters in troubleshooting:
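- If SYN times out: a firewall is dropping the SYN packets or the host is unreachable. A reachable host with nothing listening on the port normally answers with RST, which shows up as "connection refused" rather than a timeout.
- SYN_SENT state in ss: client waiting for SYN-ACK; remote is unresponsive or the packet is being dropped.
- SYN_RECV state: server received SYN but client hasn't completed the handshake; possible SYN flood.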
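The difference between a refused connection and a timed-out one can be told apart from the exit code of a connection attempt. Below is a sketch using bash's /dev/tcp pseudo-device and the coreutils timeout command; the probe function name, the 2-second limit, and the choice of port 1 on loopback (assumed closed) are all illustrative assumptions.

```shell
# Sketch: classify a TCP connect attempt as open, refused, or timed out.
probe() {
  # $1 = host, $2 = port; opening /dev/tcp/host/port makes bash call connect()
  rc=0
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null || rc=$?
  case "$rc" in
    0)   echo "open" ;;                    # handshake completed
    124) echo "timeout" ;;                 # timeout(1) killed it: no reply to SYN
    *)   echo "refused-or-unreachable" ;;  # immediate failure, e.g. RST received
  esac
}

result=$(probe 127.0.0.1 1)    # port 1 on loopback is assumed to be closed
echo "127.0.0.1:1 -> $result"
```

A "timeout" result points toward filtering or an unreachable host, while an immediate failure suggests the host answered but nothing is listening.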
Shell Scripting
What is the shebang line and why does it matter?
#!/bin/bash
The shebang (#!) followed by the interpreter path on the first line of a script tells the kernel which interpreter to use when the script is executed directly (./script.sh).
Alternatives:
- #!/usr/bin/env bash: portable; finds bash in $PATH rather than assuming /bin/bash.
- #!/usr/bin/env python3
- #!/bin/sh: POSIX shell (more portable; avoid bash-specific syntax).
Without a shebang, the current shell tries to run the script, which may work by accident but is not reliable.
Write a script to find and delete log files older than 30 days.
#!/usr/bin/env bash
set -euo pipefail # exit on error, undefined vars, pipe failures
LOG_DIR="/var/log/myapp"
DAYS=30
# Dry-run first: run without '-delete' (keeping '-print') to verify what would be removed
find "$LOG_DIR" -type f -name "*.log" -mtime +"$DAYS" -print -delete
echo "Cleaned logs older than $DAYS days from $LOG_DIR"
Key scripting practices:
- set -euo pipefail: fail fast on errors; catch undefined variables.
- Quote all variables ("$VAR"): prevents word splitting on filenames with spaces.
- Test with -print before adding -delete.
- Use $(command) not backticks for command substitution.
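The dry-run practice can be rehearsed safely against a scratch directory before pointing the script at real logs. This sketch assumes GNU touch (for relative dates with -d) and GNU find; the directory stands in for /var/log/myapp and the file names are invented.

```shell
# Sketch: build aged test files, preview the match with -print, then delete.
log_dir=$(mktemp -d)      # stand-in for the real log directory
days=30

touch -d '40 days ago' "$log_dir/old app.log"    # space in the name on purpose
touch -d '5 days ago'  "$log_dir/recent.log"

# Dry run: -print only, nothing is removed yet
would_delete=$(find "$log_dir" -type f -name '*.log' -mtime +"$days" -print)
echo "would delete: $would_delete"

# Real run: same expression with -delete added
find "$log_dir" -type f -name '*.log' -mtime +"$days" -delete
remaining=$(ls "$log_dir")
echo "remaining: $remaining"

rm -rf "$log_dir"
```

Because the variables are quoted, the file name containing a space is handled as a single path throughout.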
Explain bash special variables.
| Variable | Meaning |
|---|---|
| $0 | Script name |
| $1 – $9 | Positional arguments (${10} and above need braces) |
| $# | Number of arguments |
| $@ | All arguments as separate strings when quoted as "$@" (use in loops) |
| $* | All arguments joined into one string when quoted as "$*" |
| $? | Exit code of last command |
| $$ | PID of current shell |
| $! | PID of last background process |
| $_ | Last argument of previous command |
if [ "$#" -lt 2 ]; then
    echo "Usage: $0 <source> <destination>"
    exit 1
fi
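The difference between $@ and $* only shows up once quoting is involved. A short sketch, with count_args and demo as hypothetical helper names:

```shell
# Sketch: count how many arguments a callee receives for each expansion form.
count_args() { echo "$#"; }

demo() {
  quoted_at=$(count_args "$@")      # each original argument stays intact
  quoted_star=$(count_args "$*")    # everything joined into a single string
  unquoted=$(count_args $@)         # unquoted: split again on whitespace
}

demo "first arg" second
echo "\"\$@\": $quoted_at, \"\$*\": $quoted_star, unquoted: $unquoted"
```

With the two arguments above, "$@" passes 2 arguments, "$*" passes 1, and the unquoted form passes 3 because "first arg" is split at the space.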
How do you debug a bash script?
bash -x script.sh # trace every command (print before execution)
bash -n script.sh # syntax check only (no execution)
bash -v script.sh # print each line as it's read
# Inside script:
set -x # enable trace mode
set +x # disable trace mode
set -e # exit immediately on error
set -u # treat unset variables as error
For complex bugs: add echo "DEBUG: var=$var" at key points, or use set -x around the suspect section only.
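Scoping the trace to one section keeps the output readable. The sketch below runs a tiny inline script through bash and captures only the trace, which bash writes to stderr; the echoed words are placeholders.

```shell
# Sketch: enable tracing only around the suspect command and capture the trace.
# '2>&1 >/dev/null' sends stderr (the trace) to the capture and drops stdout.
trace=$(bash -c 'echo before; set -x; echo suspect; set +x; echo after' 2>&1 >/dev/null)
printf '%s\n' "$trace"

# Expected trace with the default PS4 prompt of '+ '
expected=$(printf '+ echo suspect\n+ set +x')
```

Only the commands between set -x and set +x appear in the trace; the set +x call itself is the last traced line.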
Troubleshooting
How do you triage a server that is responding slowly?
# 1. Load average
uptime # 1, 5, 15 minute load average
# Sustained load above the CPU count suggests saturation (load also counts tasks blocked in uninterruptible I/O)
# 2. CPU
top # is one process consuming it? Is it user or system CPU? (or htop)
# 3. Memory
free -h # is there swap usage? swap = memory pressure
vmstat 1 5 # swpd, si, so columns (swap in/out)
# 4. Disk I/O
iostat -x 1 5 # await, util% — is a disk at 100% utilisation?
iotop # which process is causing disk I/O
# 5. Network
ss -s # socket summary (counts by protocol and TCP state)
netstat -s | grep -i retrans # TCP retransmit counters
sar -n DEV 1 5 # network throughput
# 6. Application-level
journalctl -u myapp -n 100 --no-pager # service logs
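The first triage step, comparing load average with CPU count, can be scripted. This is a rough sketch: it uses only the 1-minute figure and treats "load greater than CPU count" as the cut-off, which is a simplification rather than a firm rule.

```shell
# Sketch: compare the 1-minute load average with the number of CPUs.
cpus=$(nproc)
read -r load1 load5 load15 rest < /proc/loadavg

# awk handles the floating-point comparison that plain sh arithmetic cannot
overloaded=$(awk -v l="$load1" -v c="$cpus" \
  'BEGIN { if (l + 0 > c + 0) print "yes"; else print "no" }')

echo "load1=$load1 load5=$load5 load15=$load15 cpus=$cpus overloaded=$overloaded"
```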
How do you recover a system when /etc/fstab has an error and the server won't boot?
- At the GRUB menu, select the recovery/single-user mode entry.
- Mount root filesystem read-write: mount -o remount,rw /
- Edit /etc/fstab: vi /etc/fstab, then comment out the bad entry.
- Validate: mount -a (should return no errors).
- Reboot: reboot.
If single-user mode is unavailable: boot from a live CD/recovery image, chroot into the broken installation, fix fstab.
In cloud VMs: detach the OS disk, attach to a recovery VM, mount and fix fstab, reattach.
What is strace and when do you use it?
strace intercepts and records system calls made by a process. Useful when:
- A program fails with no useful error message (trace reveals which syscall failed and why).
- Diagnosing file path issues ("why can't the app find its config?").
- Performance investigation (which syscall is slow?).
strace ls /tmp # trace ls
strace -p 1234 # attach to running process
strace -e openat ls # trace only file open calls
strace -c ls /tmp # summary: count and time per syscall
Typical findings: file not found (ENOENT), permission denied (EACCES), connection refused (ECONNREFUSED) on a specific path or port.
How do you check system logs for errors?
# systemd journal (primary on modern systems)
journalctl -p err -b # errors since last boot
journalctl -p err --since "1 hour ago"
journalctl -u nginx # specific service
journalctl -k # kernel messages (equivalent to dmesg)
# Traditional logs
tail -f /var/log/syslog # Debian/Ubuntu
tail -f /var/log/messages # RHEL/CentOS
tail -f /var/log/auth.log # authentication events (Debian/Ubuntu; /var/log/secure on RHEL/CentOS)
grep -i error /var/log/nginx/error.log | tail -50
