beginner · 1.5 hours · 18 min read

Junior Developer Interview Questions

30+ junior developer interview questions covering Python/JavaScript fundamentals, Git, REST APIs, SQL databases, basic cloud concepts, and problem-solving — with clear, complete answers.

python, javascript, git, rest-api, databases, cloud, problem-solving, junior


Python Fundamentals

What is the difference between a list, tuple, and set in Python?

	List	Tuple	Set
Syntax	`[1, 2, 3]`	`(1, 2, 3)`	`{1, 2, 3}`
Mutable	✅ Yes	❌ No	✅ Yes
Ordered	✅ Yes	✅ Yes	❌ No
Duplicates	✅ Allowed	✅ Allowed	❌ Unique only
Hashable	❌ No	✅ Yes (if elements are)	❌ No

Use a list when order matters and you need to mutate the collection. Use a tuple for fixed data (coordinates, database row), as dictionary keys, or for slightly better performance. Use a set for fast membership testing (x in my_set is O(1)) and deduplication.
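A short sketch of those three use cases, using made-up sample data (the names `visits` and `office_locations` are hypothetical):

```python
# Hypothetical sample data for illustration only.
visits = ["alice", "bob", "alice", "carol", "bob"]

# List: keeps order and duplicates, and can be mutated in place.
visits.append("dave")

# Set: removes duplicates and gives fast membership tests.
unique_visitors = set(visits)
has_carol = "carol" in unique_visitors

# Tuple: fixed data that is hashable, so it can serve as a dictionary key.
office_locations = {(51.5074, -0.1278): "London", (40.7128, -74.0060): "New York"}
city = office_locations[(51.5074, -0.1278)]

# dict.fromkeys deduplicates while keeping first-seen order (a set does not guarantee order).
ordered_unique = list(dict.fromkeys(visits))

print(len(visits), len(unique_visitors), has_carol, city, ordered_unique)
```

If the order of the deduplicated items matters, the `dict.fromkeys` form is usually the safer choice than `set`.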

Explain mutable vs immutable objects in Python.

Immutable objects cannot be changed after creation. An operation that appears to modify one actually creates a new object and rebinds the variable to it.

Immutable: int, float, str, tuple, frozenset, bool

x = "hello"
x += " world"   # creates a new string; original "hello" untouched

Mutable objects can be modified in place.

Mutable: list, dict, set, custom objects

nums = [1, 2, 3]
nums.append(4)   # modifies the same list object in memory

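Why it matters: Passing a mutable object to a function allows the function to modify the caller's data, which can cause subtle bugs. An immutable object cannot be changed in place, so the caller's value is safe (although a tuple that contains a mutable element, such as a list, can still have that element modified).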

def add_item(lst, item):
    lst.append(item)   # modifies the original list!

my_list = [1, 2]
add_item(my_list, 3)
print(my_list)   # [1, 2, 3]
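One common way to avoid surprising the caller is to work on a copy. The sketch below is a hypothetical variant of the function above (the name `add_item_safely` is made up for this example) that returns a new list instead of mutating its argument:

```python
def add_item_safely(items, item):
    """Return a new list with item appended, leaving the caller's list unchanged."""
    new_items = list(items)   # shallow copy of the caller's list
    new_items.append(item)
    return new_items


original = [1, 2]
updated = add_item_safely(original, 3)
print(original)  # [1, 2]
print(updated)   # [1, 2, 3]
```

Note that `list(items)` is a shallow copy: nested mutable elements would still be shared between the two lists.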

What is a dictionary and how is it used?

A dictionary (dict) stores key-value pairs with O(1) average lookup time (hash table under the hood). Keys must be hashable, which in practice means immutable values such as str, int, or a tuple of hashable items.

user = {
    "name": "Alice",
    "age": 30,
    "roles": ["admin", "editor"]
}

# Access
user["name"]            # "Alice"
user.get("email", "")   # safe get with default; avoids KeyError

# Modify
user["age"] = 31
user.setdefault("active", True)  # set only if key absent

# Iterate
for key, value in user.items():
    print(f"{key}: {value}")

# Check membership
"name" in user   # True

# Dict comprehension
squared = {n: n**2 for n in range(1, 6)}

What are list comprehensions and when should you use them?

A list comprehension creates a new list by transforming or filtering an iterable in a single, readable expression.

# Traditional loop
squares = []
for x in range(10):
    if x % 2 == 0:
        squares.append(x ** 2)

# List comprehension — same result
squares = [x ** 2 for x in range(10) if x % 2 == 0]

When to use: For simple transformations and filters. Prefer them over map()/filter() for readability.

When NOT to use: When logic is complex — a regular loop with comments is clearer than a nested 80-character comprehension.

Similar syntax: dict comprehensions {k: v for ...}, set comprehensions {x for ...}, generator expressions (x for ...).
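A brief sketch of those three related forms, using a made-up `words` list:

```python
words = ["apple", "banana", "avocado", "blueberry", "apple"]

# Dict comprehension: map each distinct word to its length.
lengths = {word: len(word) for word in words}

# Set comprehension: collect the distinct first letters.
initials = {word[0] for word in words}

# Generator expression: values are produced lazily, one at a time,
# so sum() never builds an intermediate list.
total_chars = sum(len(word) for word in words)

print(lengths)
print(initials)
print(total_chars)
```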

What is the difference between `==` and `is` in Python?

== checks value equality: do the two objects have the same value?
is checks identity: are the two variables pointing to the same object in memory?

a = [1, 2, 3]
b = [1, 2, 3]

a == b   # True  — same values
a is b   # False — different objects in memory

c = a
a is c   # True  — c is the same object as a

Common mistake: Using is to compare strings or integers can appear to work because CPython caches small integers and interns some strings, but this is an implementation detail and is unreliable. Always use == for value comparisons.

Correct use of is: if x is None: — checking for the singleton None, True, or False.

What is a Python virtual environment and why use one?

A virtual environment is an isolated Python installation with its own packages, separate from the system Python.

python3 -m venv venv        # create
source venv/bin/activate    # activate (Linux/Mac)
venv\Scripts\activate       # activate (Windows)
pip install requests        # installs only into this venv
pip freeze > requirements.txt  # snapshot dependencies
deactivate                  # exit

Why: Different projects need different versions of the same library. Without venv, upgrading requests for project A breaks project B. With venv, each project has its own isolated dependency tree.

Modern alternatives: uv (very fast), poetry, or conda for data science.

Explain exception handling in Python.

try:
    result = int(input("Enter a number: "))
    value = 10 / result
except ValueError:
    print("That's not a valid integer")
except ZeroDivisionError:
    print("Can't divide by zero")
except (TypeError, OverflowError) as e:
    print(f"Unexpected error: {e}")
else:
    # Runs only if no exception was raised
    print(f"Result: {value}")
finally:
    # Always runs — use for cleanup (close files, DB connections)
    print("Done")

Best practices:

Catch specific exceptions, not bare except: (hides bugs).
Use finally to release resources (or prefer with statement for automatic cleanup).
Re-raise with raise if you can't handle the exception here.
Create custom exceptions by subclassing Exception.
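The last two practices can be sketched together. In this minimal example, `ConfigError`, `parse_port`, and `load_port` are hypothetical names invented for illustration, not part of any library:

```python
class ConfigError(Exception):
    """Raised when a configuration value is missing or invalid (hypothetical example)."""


def parse_port(raw: str) -> int:
    """Convert a raw string to a TCP port number, raising ConfigError on bad input."""
    try:
        port = int(raw)
    except ValueError as exc:
        # Translate the low-level error into a domain-specific one,
        # keeping the original exception as the cause.
        raise ConfigError(f"port must be an integer, got {raw!r}") from exc
    if not 1 <= port <= 65535:
        raise ConfigError(f"port out of range: {port}")
    return port


def load_port(raw: str, default: int = 8080) -> int:
    """Fall back to a default for an empty value; let other errors propagate."""
    try:
        return parse_port(raw)
    except ConfigError:
        if raw == "":
            return default
        raise  # cannot handle this case here, so re-raise unchanged
```

Callers can then catch `ConfigError` specifically, without accidentally swallowing unrelated errors.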

What is the difference between `*args` and `**kwargs`?

*args collects extra positional arguments into a tuple. **kwargs collects extra keyword arguments into a dictionary.

def show(*args, **kwargs):
    print(args)    # tuple of positional args
    print(kwargs)  # dict of keyword args

show(1, 2, 3, name="Alice", role="admin")
# (1, 2, 3)
# {'name': 'Alice', 'role': 'admin'}

# Unpacking when calling
nums = [1, 2, 3]
print(*nums)           # 1 2 3

config = {"sep": ", ", "end": "!\n"}
print("a", "b", **config)  # a, b!
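A frequent practical use is forwarding arguments unchanged to another function. The sketch below uses two made-up helpers, `call_with_logging` and `greet`, to show the pattern:

```python
def call_with_logging(func, *args, **kwargs):
    """Call func with whatever arguments were given and report the call."""
    result = func(*args, **kwargs)  # unpack and forward unchanged
    print(f"{func.__name__} called with args={args} kwargs={kwargs} -> {result!r}")
    return result


def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"


message = call_with_logging(greet, "Alice", greeting="Hi")
```

Because the wrapper never names the wrapped function's parameters, it keeps working when that function's signature changes.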

Git & Version Control

What are the three states of a file in Git?

Modified: Changed in the working directory but not staged.
Staged: Marked to go into the next commit (in the index/staging area).
Committed: Safely stored in the local Git database.

Working Directory  →  Staging Area  →  Repository
       (modified)   git add  (staged)  git commit  (committed)

git status          # see which files are in which state
git add file.py     # move from modified → staged
git add -p          # interactively stage hunks (partial file staging)
git commit -m "msg" # move staged → committed
git diff            # working directory vs staged
git diff --staged   # staged vs last commit

What is the difference between `git fetch` and `git pull`?

git fetch: Downloads commits, branches, and tags from the remote but does not modify your working branch. Safe to run at any time.

git pull: Equivalent to git fetch + git merge (or git rebase with --rebase). Immediately updates your current branch.

git fetch origin        # update remote-tracking branches
git log origin/main     # inspect what changed before merging
git merge origin/main   # merge when ready

# vs:
git pull origin main    # fetch + merge in one step

Best practice: Use git fetch + git log origin/main to inspect changes before merging. Use git pull --rebase for a cleaner, linear history.

How do you undo the last commit without losing changes?

git reset --soft HEAD~1   # undo commit; changes stay staged
git reset HEAD~1          # undo commit; changes unstaged (mixed, default)
git reset --hard HEAD~1   # undo commit AND discard changes (destructive!)

# Already pushed to remote? Don't rewrite — use revert instead:
git revert HEAD           # creates a new commit that undoes the last one

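To undo uncommitted changes to a single tracked file:

git restore --staged file.py   # unstage; working directory changes are kept
git checkout HEAD -- file.py   # reset the file to the last commit (discards staged and working directory changes)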

What is `.gitignore` and what should you put in it?

.gitignore tells Git which files and directories to ignore (not track).

Typically ignored:

# Dependencies
node_modules/
venv/
__pycache__/

# Build output
dist/
build/
*.pyc

# Environment & secrets (NEVER commit these)
.env
.env.local
*.pem
secrets.json

# Editor/OS
.DS_Store
.vscode/settings.json
*.swp

Generate a starting .gitignore: gitignore.io or GitHub's language templates.

Committed files are not retroactively ignored — use git rm --cached <file> to stop tracking a previously committed file.

Explain the Git branching workflow for a feature.

Standard flow (GitHub Flow):

# 1. Start from up-to-date main
git checkout main
git pull origin main

# 2. Create a feature branch
git checkout -b feature/add-login

# 3. Make commits
git add auth.py
git commit -m "add JWT authentication"

# 4. Push and open PR
git push -u origin feature/add-login
# Open PR on GitHub; request review

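# 5. After approval, merge to main (squash and merge)
# Switch back to main and update it, then delete the feature branch
git checkout main
git pull origin main
git branch -d feature/add-login   # use -D if Git reports the branch as not fully merged
git push origin --delete feature/add-login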

Branch naming conventions: feature/, fix/, chore/, docs/ prefixes.

REST APIs

What is a REST API and what are its core principles?

REST (Representational State Transfer) is an architectural style for building APIs over HTTP.

Core principles:

Uniform interface: Resources identified by URLs; standard HTTP methods.
Stateless: Each request contains all information needed; server stores no client session.
Client-server: Client and server are independent; evolve separately.
Cacheable: Responses can declare themselves cacheable.
Layered system: Client doesn't know if it's talking to the server directly or through a load balancer.

HTTP methods map to CRUD:

Method	Operation	Idempotent
GET	Read	✅
POST	Create	❌
PUT	Replace	✅
PATCH	Partial update	❌ Not guaranteed
DELETE	Delete	✅
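Idempotency is easier to see with a toy model. The sketch below is not a real server: `store`, `post_user`, `put_user`, and `delete_user` are hypothetical in-memory stand-ins for the corresponding HTTP handlers.

```python
import itertools

store = {}                      # in-memory stand-in for a server-side collection
next_id = itertools.count(1)    # simple ID generator


def post_user(body):
    """POST /users: creates a new resource on every call (not idempotent)."""
    user_id = next(next_id)
    store[user_id] = dict(body)
    return user_id


def put_user(user_id, body):
    """PUT /users/{id}: replaces the resource at a known ID (idempotent)."""
    store[user_id] = dict(body)


def delete_user(user_id):
    """DELETE /users/{id}: repeating the call leaves the same end state."""
    store.pop(user_id, None)


post_user({"name": "Alice"})
post_user({"name": "Alice"})           # a retry creates a duplicate
count_after_posts = len(store)         # 2

put_user(1, {"name": "Alice Smith"})
put_user(1, {"name": "Alice Smith"})   # a retry changes nothing further
count_after_puts = len(store)          # still 2

delete_user(2)
delete_user(2)                         # second call is a no-op
count_after_deletes = len(store)       # 1
```

This is why clients can safely retry a timed-out PUT or DELETE, while retrying a POST may create duplicates.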

What are HTTP status codes? Give common examples.

Status codes indicate the result of an HTTP request:

Range	Category	Examples
2xx	Success	200 OK, 201 Created, 204 No Content
3xx	Redirection	301 Moved Permanently, 302 Found, 304 Not Modified
4xx	Client error	400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 422 Unprocessable Entity, 429 Too Many Requests
5xx	Server error	500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout

Common mistake: Returning 200 OK with {"error": "not found"} in the body — always use the correct status code.
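Python's standard library already knows the registered codes and their reason phrases. As a small sketch (the helper name `describe_status` is made up), the category can be derived from the first digit:

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short description such as '404 Not Found (client error)'."""
    categories = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    category = categories.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # code is not registered in the standard library
    return f"{code} {phrase} ({category})"


for code in (200, 201, 404, 429, 503):
    print(describe_status(code))
```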

How do you consume a REST API in Python?

import requests

# GET request
response = requests.get(
    "https://api.example.com/users/42",
    headers={"Authorization": "Bearer my-token"},
    timeout=10
)

response.raise_for_status()   # raises HTTPError on 4xx/5xx
user = response.json()        # parse JSON body
print(user["name"])

# POST request
new_user = {"name": "Alice", "email": "alice@example.com"}
response = requests.post(
    "https://api.example.com/users",
    json=new_user,             # serialises dict to JSON body, sets Content-Type
    headers={"Authorization": "Bearer my-token"},
    timeout=10
)
response.raise_for_status()
created = response.json()
print(created["id"])

Always: set a timeout, call raise_for_status(), and handle requests.exceptions.RequestException (the base class for the connection, timeout, and HTTP errors raised by the library).

What is JSON and how do you work with it in Python?

JSON (JavaScript Object Notation) is a text format for structured data. It is the most common wire format for REST APIs.

import json

# Python dict → JSON string
data = {"name": "Alice", "scores": [95, 87, 92]}
json_str = json.dumps(data)           # '{"name": "Alice", "scores": [95, 87, 92]}'
pretty   = json.dumps(data, indent=2) # formatted

# JSON string → Python dict
parsed = json.loads(json_str)
print(parsed["name"])   # "Alice"

# File I/O
with open("config.json") as f:
    config = json.load(f)

with open("output.json", "w") as f:
    json.dump(data, f, indent=2)

JSON type mappings: object → dict, array → list, string → str, number → int/float, true/false → True/False, null → None.
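The mapping is not perfectly symmetric, which a round trip makes visible. The `record` dict below is made-up sample data:

```python
import json

record = {
    "name": "Alice",
    "point": (1, 2),        # tuple: JSON has no tuple type
    "active": True,
    "nickname": None,
    1: "integer key",       # non-string key
}

restored = json.loads(json.dumps(record))

print(restored["point"])     # [1, 2]  (the tuple became a list)
print(restored["nickname"])  # None    (null maps back to None)
print("1" in restored)       # True    (the integer key became the string "1")
```

So data that passes through JSON may need converting back if the code depends on tuples or non-string keys.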

What is authentication vs authorisation?

Authentication ("AuthN"): Verifying who you are. "Are you really Alice?"

Methods: username/password, API key, OAuth token, certificate, MFA.

Authorisation ("AuthZ"): Verifying what you're allowed to do. "Alice, can you delete this record?"

Methods: RBAC, ABAC, ACL, scopes in JWT claims.

In an API context:

Client sends credentials → server validates them → server issues a JWT (authentication).
On each subsequent request, client sends the JWT in the Authorization: Bearer <token> header.
Server verifies the token signature and checks the permissions in the token's claims before proceeding (authorisation).
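To make the two steps concrete, here is a deliberately simplified sketch of an HMAC-signed token in the JWT style, built only from the standard library. It is for illustration under stated assumptions: `SECRET`, `issue_token`, `verify_token`, and `can_delete` are hypothetical names, and the sketch omits expiry and other checks that a maintained JWT library would perform.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; real keys come from secure configuration


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    """Restore the stripped padding and decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(header: str, payload: str) -> str:
    digest = hmac.new(SECRET, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return _b64(digest)


def issue_token(username: str, roles: list) -> str:
    """Authentication step: after credentials are validated, sign the claims."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64(json.dumps({"sub": username, "roles": roles}).encode())
    return f"{header}.{payload}.{_sign(header, payload)}"


def verify_token(token: str):
    """Return the claims if the signature is valid, otherwise None."""
    try:
        header, payload, signature = token.split(".")
        expected = _sign(header, payload)
        # Constant-time comparison avoids leaking information through timing.
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return None
        return json.loads(_unb64(payload))
    except ValueError:
        return None  # malformed token


def can_delete(token: str) -> bool:
    """Authorisation step: check the verified claims for the required role."""
    claims = verify_token(token)
    return claims is not None and "admin" in claims.get("roles", [])
```

The key point is the order: the signature is verified first, and only then are the claims trusted for the permission check.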

Databases

What is SQL and what are the core commands?

SQL (Structured Query Language) is the standard language for relational databases.

-- Create table (PostgreSQL syntax; SERIAL and TIMESTAMPTZ differ in other databases)
CREATE TABLE users (
    id    SERIAL PRIMARY KEY,
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- INSERT
INSERT INTO users (name, email) VALUES ('Alice', 'alice@example.com');

-- SELECT
SELECT id, name FROM users WHERE email LIKE '%@example.com' ORDER BY name LIMIT 10;

-- UPDATE
UPDATE users SET name = 'Alice Smith' WHERE id = 1;

-- DELETE
DELETE FROM users WHERE id = 1;

-- JOIN
SELECT orders.id, users.name
FROM orders
JOIN users ON orders.user_id = users.id
WHERE orders.status = 'pending';

What is a primary key and a foreign key?

Primary key: A column (or set of columns) that uniquely identifies each row in a table. Cannot be NULL. Usually an auto-incrementing integer or UUID.

Foreign key: A column in one table that references the primary key of another table, enforcing referential integrity.

CREATE TABLE orders (
    id      SERIAL PRIMARY KEY,
    user_id INT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    total   DECIMAL(10, 2)
);

ON DELETE CASCADE — if a user is deleted, their orders are automatically deleted too. Alternative: ON DELETE RESTRICT (prevents deletion if orders exist).
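Both behaviours can be tried locally with Python's built-in sqlite3 module. This sketch swaps in SQLite for convenience, so the column types differ slightly from the PostgreSQL definition above, and SQLite needs foreign key enforcement switched on explicitly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
        total   REAL
    )
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (user_id, total) VALUES (1, 19.99)")

# Referential integrity: an order for a non-existent user is rejected.
try:
    conn.execute("INSERT INTO orders (user_id, total) VALUES (999, 5.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Cascade: deleting the user also removes that user's orders.
conn.execute("DELETE FROM users WHERE id = 1")
remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(rejected, remaining_orders)
```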

What is an index and when should you add one?

An index is a data structure (usually a B-tree) that speeds up queries on a column at the cost of extra storage and slower writes.

CREATE INDEX idx_users_email ON users(email);
-- Now: SELECT * FROM users WHERE email = 'alice@...' is fast (O(log n))
-- Without index: full table scan (O(n))

Add an index when:

A column is frequently used in WHERE, JOIN ON, or ORDER BY clauses.
The table is large and queries are slow.

Avoid indexing:

Low-cardinality columns (e.g., a boolean is_active with only two values — the DB often ignores it).
Columns on small tables — full scan is faster than index overhead.
Over-indexing: every extra index slows down INSERT/UPDATE/DELETE.

Use EXPLAIN ANALYZE to see if a query is using your index.
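The effect of an index on the query plan can be observed with the built-in sqlite3 module. This sketch uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for `EXPLAIN ANALYZE`; the exact plan wording varies between SQLite versions, so no specific output is shown. The helper `query_plan` is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)


def query_plan(sql: str) -> str:
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description


lookup = "SELECT * FROM users WHERE email = 'user500@example.com'"

plan_before = query_plan(lookup)   # expected to describe a full table scan
conn.execute("CREATE INDEX idx_users_email ON users(email)")
plan_after = query_plan(lookup)    # expected to mention idx_users_email

print("before:", plan_before)
print("after: ", plan_after)
```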

What is the N+1 query problem?

The N+1 problem occurs when you fetch a list of N items and then issue a separate query for each item's related data — totalling N+1 queries.

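# N+1 problem (db is a placeholder for your database client)
users = db.query("SELECT * FROM users LIMIT 100")   # 1 query
for user in users:
    # Parameterised rather than built with an f-string, to avoid SQL injection
    orders = db.query("SELECT * FROM orders WHERE user_id = %s", (user["id"],))  # 100 queries!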

Fix with a JOIN (single query):

results = db.query("""
    SELECT users.name, orders.id, orders.total
    FROM users
    LEFT JOIN orders ON orders.user_id = users.id
    WHERE users.id IN (SELECT id FROM users LIMIT 100)
""")

Or use your ORM's eager loading: SQLAlchemy joinedload(), Django select_related()/prefetch_related().
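The difference in query count can be reproduced with the built-in sqlite3 module. In this sketch the tables, sample rows, and the counting helper `run` are all made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  (id, name) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO orders (user_id, total) VALUES (1, 10.0), (1, 25.5), (2, 7.0);
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two approaches can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 approach: one query for the users, then one more per user.
n_plus_one = {}
for user_id, name in run("SELECT id, name FROM users ORDER BY id"):
    rows = run("SELECT total FROM orders WHERE user_id = ? ORDER BY id", (user_id,))
    n_plus_one[name] = [total for (total,) in rows]
queries_n_plus_one = query_count

# JOIN approach: a single query, grouped in Python.
query_count = 0
joined = {}
for name, total in run("""
    SELECT users.name, orders.total
    FROM users
    LEFT JOIN orders ON orders.user_id = users.id
    ORDER BY users.id, orders.id
"""):
    totals = joined.setdefault(name, [])
    if total is not None:   # LEFT JOIN yields NULL for users without orders
        totals.append(total)
queries_join = query_count

print(queries_n_plus_one, queries_join)  # 4 1
```

Both approaches build the same result; only the number of round trips to the database differs.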

What is a transaction and why is it important?

A transaction is a unit of work that is executed atomically — either all operations succeed, or none do (rollback).

import psycopg2

conn = psycopg2.connect(...)
try:
    with conn:              # commits on success, rolls back on exception (does not close the connection)
        cur = conn.cursor()
        cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
        # If the second UPDATE fails, the first is rolled back — money not lost
except Exception as e:
    print(f"Transaction failed: {e}")

ACID properties:

Atomicity: All or nothing.
Consistency: Database transitions from one valid state to another.
Isolation: Concurrent transactions don't interfere.
Durability: Committed data survives crashes.
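Atomicity can be observed without a database server. The sketch below swaps in the built-in sqlite3 module, with a made-up `accounts` table and a CHECK constraint that forbids negative balances, so that an oversized transfer fails halfway through:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(amount: int) -> bool:
    """Move amount from account 1 to account 2; return False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


ok = transfer(30)        # both updates succeed and are committed
failed = transfer(500)   # the debit violates the CHECK, so the credit is undone too
print(ok, failed, balances())  # True False {1: 70, 2: 80}
```

The credit is applied first on purpose: it shows that an earlier, successful statement is also undone when a later one fails.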

Basic Cloud Concepts

What is the cloud and why do companies use it?

The cloud provides on-demand computing resources (servers, storage, databases, networking) over the internet, paid for as consumed rather than owned.

Why companies use it:

No upfront hardware cost: Start a server in minutes; no capital expenditure.
Scale on demand: Add capacity for peak seasons; reduce it after.
Global reach: Deploy in multiple regions worldwide instantly.
Managed services: No server patching for managed databases, message queues, or ML platforms.
Reliability: Cloud providers offer 99.9%+ SLAs backed by redundant infrastructure.

Major providers: AWS (market leader), Azure (Microsoft), GCP (Google).

What is the difference between a VM and serverless?

Virtual Machine (VM): A full operating system running on shared physical hardware. You manage the OS, runtime, and patches. It is persistent: it keeps running, and keeps being billed, until you stop it, whether or not it is serving requests.

Serverless (Azure Functions, AWS Lambda): You deploy a function; the cloud runs it on-demand, scales to zero, and you pay only for execution time (per invocation).

	VM	Serverless
Management	You manage OS	No server management
Scaling	Manual or auto-scale	Automatic, to zero
Cost	Always-on billing	Per-invocation
Startup	No per-request startup (already running)	Cold start: 100ms–2s
Use case	Long-running, stateful	Event-driven, short tasks

What is object storage and how does it differ from a file system?

Object storage (Azure Blob, AWS S3) stores data as flat objects — each with a unique key, the data, and metadata. No directory hierarchy (paths are just naming conventions).

File system organises data in a hierarchical tree of directories and files, with OS-level access.

	Object Storage	File System
Structure	Flat key-value	Hierarchical tree
Access	HTTP (REST API)	POSIX (open/read/write)
Scale	Virtually unlimited	Constrained by storage volume
Use case	Static files, backups, images	App working files, OS operations
Mount	Generally not mountable	Mounted locally

Object storage is ideal for: web assets, database backups, logs, data lake storage, user-uploaded files.
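The "flat keys with naming conventions" idea can be modelled with a plain dictionary. This is only a toy model, not a cloud SDK: `bucket`, `put_object`, and `list_objects` are hypothetical names, and the object data is placeholder bytes.

```python
# In-memory stand-in for a bucket: a flat mapping from key to (data, metadata).
bucket = {}


def put_object(key: str, data: bytes, **metadata) -> None:
    """Store an object under its full key, together with its metadata."""
    bucket[key] = (data, metadata)


def list_objects(prefix: str = "") -> list:
    """Return keys starting with prefix; this is how 'folders' are emulated."""
    return sorted(key for key in bucket if key.startswith(prefix))


put_object("images/2024/cat.png", b"...", content_type="image/png")
put_object("images/2024/dog.png", b"...", content_type="image/png")
put_object("backups/db.sql.gz", b"...", content_type="application/gzip")

print(list_objects("images/"))  # ['images/2024/cat.png', 'images/2024/dog.png']
print("images/" in bucket)      # False: there is no directory object, only keys
```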

What is a CDN and why would you use it?

A Content Delivery Network caches content (images, CSS, JavaScript, videos) on servers geographically close to users (edge nodes). When a user requests a file, they get it from the nearest edge node rather than the origin server.

Benefits:

Lower latency: Files served from 50 km away vs. 5,000 km away.
Reduced origin load: Most static requests never reach your server.
DDoS absorption: Edge network absorbs large-scale attacks.
HTTPS: CDNs handle TLS termination at the edge.

Common CDNs: Azure CDN, AWS CloudFront, Cloudflare, Fastly.

Problem Solving

How do you approach a programming problem you've never seen before?

A reliable framework:

Understand the problem: Restate it in your own words. Identify inputs, expected outputs, constraints.
Ask clarifying questions: Edge cases? Input size? Performance requirements?
Work through an example: Manually solve a small example on paper/whiteboard.
Identify patterns: Have you seen a similar problem? Does it fit a known pattern (sliding window, frequency map, two pointers)?
Start with a brute-force solution: Get something working first; optimise later.
Optimise: Identify bottlenecks (nested loops = O(n²)?). Consider if a hash map reduces time complexity.
Test: Run your solution against the examples, then think about edge cases (empty input, single element, duplicates, negative numbers).

Interviewers care as much about how you think as the final answer.
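The brute-force and optimise steps can be sketched on a classic exercise chosen here purely as an example: find the indices of two numbers in a list that add up to a target. The function names are made up for illustration.

```python
def two_sum_brute_force(nums, target):
    """Check every pair: O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None


def two_sum_hash_map(nums, target):
    """Remember values already seen: O(n) time, O(n) extra space."""
    seen = {}  # value -> index where it was seen
    for j, value in enumerate(nums):
        i = seen.get(target - value)
        if i is not None:
            return (i, j)
        seen[value] = j
    return None


print(two_sum_brute_force([2, 7, 11, 15], 9))  # (0, 1)
print(two_sum_hash_map([2, 7, 11, 15], 9))     # (0, 1)
```

The second version trades extra memory for speed, which is the kind of trade-off worth saying out loud in an interview.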

Write a function to check if a string is a palindrome.

def is_palindrome(s: str) -> bool:
    """Return True if s reads the same forwards and backwards."""
    # Clean: lowercase, letters/digits only
    cleaned = "".join(c.lower() for c in s if c.isalnum())
    return cleaned == cleaned[::-1]

# Tests
assert is_palindrome("racecar")
assert is_palindrome("A man a plan a canal Panama")
assert not is_palindrome("hello")
assert is_palindrome("")    # empty string edge case

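Two-pointer approach (O(1) extra space, no cleaned copy):

def is_palindrome(s: str) -> bool:
    left, right = 0, len(s) - 1
    while left < right:
        # Skip non-alphanumeric characters instead of building a cleaned copy
        if not s[left].isalnum():
            left += 1
        elif not s[right].isalnum():
            right -= 1
        elif s[left].lower() != s[right].lower():
            return False
        else:
            left += 1
            right -= 1
    return True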

What is Big O notation and why does it matter?

Big O describes how an algorithm's time (or space) requirement grows relative to input size n.

Notation	Name	Example
O(1)	Constant	Dict lookup, array access by index
O(log n)	Logarithmic	Binary search, balanced BST lookup
O(n)	Linear	Linear scan, iterating through a list
O(n log n)	Log-linear	Merge sort, `sorted()` in Python
O(n²)	Quadratic	Nested loops comparing every pair
O(2ⁿ)	Exponential	Naive recursive Fibonacci

Why it matters: An O(n²) algorithm on 1,000 items makes 1,000,000 operations; on 100,000 items it's 10 billion — it just doesn't scale.

Interviewers often accept a working O(n²) solution first, then ask: "How would you optimise this to O(n)?"
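As one illustration of that kind of optimisation, the sketch below checks a list for duplicates in two ways and counts the work done. The function names are made up; note that comparing every pair takes about n²/2 comparisons, which is half of n² but grows at the same quadratic rate.

```python
def has_duplicates_quadratic(items):
    """Compare every pair: n*(n-1)/2 comparisons in the worst case."""
    comparisons = 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            comparisons += 1
            if items[i] == items[j]:
                return True, comparisons
    return False, comparisons


def has_duplicates_linear(items):
    """Track seen values in a set: at most n membership checks."""
    seen = set()
    checks = 0
    for item in items:
        checks += 1
        if item in seen:
            return True, checks
        seen.add(item)
    return False, checks


data = list(range(1000))  # worst case: no duplicates
print(has_duplicates_quadratic(data))  # (False, 499500)
print(has_duplicates_linear(data))     # (False, 1000)
```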