When should I choose a NoSQL database over a traditional RDBMS?

Choose NoSQL when your application needs horizontal scalability for massive data volumes or very high throughput, a flexible or evolving schema (e.g., IoT sensor data, user-generated content), specialized data models (graphs, documents, key-value), high availability during network partitions, or extremely fast reads or writes. For strong ACID guarantees, complex multi-table joins, or strict referential integrity, RDBMS are generally preferred. Most modern architectures use both in a polyglot persistence approach.

What is the "schema-less" nature of NoSQL, and what are its implications?

NoSQL databases are better described as "schema-flexible" or "schema-on-read" rather than truly schema-less. The database itself does not enforce a rigid schema, so documents in the same collection can have different fields. Pros: faster development, easier data model evolution, flexibility for diverse data. Cons: the application must validate and understand the data structure; lack of database-level enforcement can lead to inconsistent data if not carefully managed.

What is the primary difference between sharding and replication?

Sharding (horizontal partitioning) focuses on scalability by distributing data across multiple independent shards, each holding a unique subset of the total data. Replication focuses on availability and fault tolerance by maintaining multiple identical copies of data across nodes -- if one fails, others serve requests. They are often used together: each shard is typically replicated for high availability.

How does the CAP theorem affect NoSQL database design?

The CAP theorem states a distributed system cannot simultaneously guarantee Consistency, Availability, and Partition Tolerance. Since network partitions are inevitable in distributed systems, designers must choose: CP systems (e.g., HBase, MongoDB default) sacrifice availability during partitions to maintain strong consistency, while AP systems (e.g., Cassandra, DynamoDB) remain available but may return stale data, achieving eventual consistency.

How do I choose the right NoSQL database type for my project?

Consider four factors: (1) Data model -- key-value for simple GET/PUT, document for flexible nested objects, column-family for sparse/time-series high-write workloads, graph for relationship-heavy data. (2) Access patterns -- how will you query? By key, content, range, or relationships? (3) Consistency needs -- can you tolerate eventual consistency or do you need strong consistency? (4) Scale requirements -- do you need linear horizontal scaling? A polyglot persistence approach using multiple database types for different parts of an application is often optimal.

What are the trade-offs of eventual consistency in AP systems?

Eventual consistency means writes propagate asynchronously, so reads may temporarily return stale data. The "inconsistency window" is typically milliseconds to seconds. Applications must handle this gracefully: implement read-your-writes consistency for user-facing operations, use conflict resolution strategies (last-write-wins, CRDTs, application-level merging), and design UIs that tolerate brief inconsistencies. The benefit is higher availability and write throughput -- systems like Cassandra can sustain hundreds of thousands of writes per second.

DatabaseCollege

NoSQL Databases

NoSQL databases represent a paradigm shift from traditional relational systems, offering flexible schemas, horizontal scalability, and specialized data models designed for big data, real-time applications, and distributed architectures.

This guide covers the CAP theorem, all four NoSQL categories (key-value, document, column-family, graph), consistency models, sharding and replication, common mistakes, and a 10-question practice quiz.

1. Introduction

In the evolving landscape of data management, NoSQL databases (Not Only SQL) represent a paradigm shift from traditional relational database management systems (RDBMS). Coined in 1998 and re-popularized in 2009, the term describes non-relational data stores designed for specific use cases where RDBMS struggle: massive scale, flexible schemas, and specialized data models.

The advent of big data, cloud computing, real-time web applications, and agile development has driven demand for data management solutions that handle massive volumes of diverse data at high velocity. NoSQL databases offer alternatives or complements to RDBMS for scenarios requiring horizontal scalability, flexible schema design, and models suited to graphs, documents, or simple key-value structures.

In Practice: Polyglot Persistence

A global e-commerce platform might use a relational database for financial transactions and user accounts (where ACID is paramount), a document database for product catalogs with varying attributes, a key-value store for session management and caching, and a graph database for personalized recommendations and fraud detection. This polyglot persistence approach optimizes each component for its specific requirements rather than forcing all data into a single model.

NoSQL vs. RDBMS at a Glance

Dimension	RDBMS	NoSQL
Schema	Rigid, schema-on-write	Flexible, schema-on-read
Scalability	Primarily vertical	Horizontal (scale-out)
Consistency	Strong ACID	Often BASE / tunable
Joins	Native, performant	Avoided via denormalization
Best for	Transactional, relational data	Scale, flexible models, speed

2. Key Definitions

Essential terms for understanding NoSQL databases at the university level.

NoSQL (Not Only SQL)

A broad category of non-relational databases designed for scalability, flexible schemas, and specialized data models. Examples: MongoDB, Cassandra, Redis, Neo4j.

CAP Theorem (Brewer's Theorem)

A distributed system cannot simultaneously guarantee Consistency, Availability, and Partition Tolerance. In a partition, you must choose C or A.

BASE

Basically Available, Soft state, Eventually consistent. Describes relaxed consistency in many NoSQL AP systems, contrasting with ACID.

ACID

Atomicity, Consistency, Isolation, Durability. Guarantees for reliable transactions, central to RDBMS and some NewSQL databases.

Eventual Consistency

Given no new updates, all reads will eventually return the last written value. No guarantee on when -- the "inconsistency window" may be milliseconds to seconds.

Sharding (Horizontal Partitioning)

Distributing a single logical dataset across multiple database instances (shards), each holding a unique subset. Enables horizontal scalability.

Replication

Maintaining multiple identical copies of data across nodes for availability, fault tolerance, and read scalability. Replicas hold the same data.

Denormalization

Intentionally adding redundant data to optimize read performance by avoiding joins. Common in NoSQL -- e.g., embedding a customer's name directly in an order document.

Key-Value Store

Simplest NoSQL model. Data stored as unique key → opaque value. O(1) GET/PUT. Examples: Redis, DynamoDB, Memcached.

Document Store

Stores self-describing JSON/BSON documents with flexible schemas, nested structures, and arrays. Examples: MongoDB, CouchDB.

Column-Family Store

Rows with dynamic column sets grouped into column families. Optimized for high write throughput and sparse data. Examples: Cassandra, HBase.

Graph Database

Stores data as nodes, edges, and properties. Optimized for relationship traversals. Examples: Neo4j, Amazon Neptune.

Vector Clocks

A mechanism for determining causal ordering of events in distributed systems. Each node maintains a vector of logical timestamps to detect concurrent conflicting updates.

Quorum (W + R > N)

A consistency mechanism: if write quorum W plus read quorum R exceeds total replicas N, reads are guaranteed to see the latest write. Enables tunable consistency.

3. The CAP Theorem & Trade-offs

The CAP Theorem, proposed as a conjecture by Eric Brewer in 2000 and formally proven by Gilbert and Lynch in 2002, is a cornerstone of distributed systems design. It states that a distributed data store cannot simultaneously guarantee Consistency (C), Availability (A), and Partition Tolerance (P).

CAP Theorem Triangle

Since network partitions are inevitable in any real distributed system, Partition Tolerance is essentially mandatory. This forces a binary choice between C and A when a partition occurs.

CP Systems (Consistency + Partition Tolerance)

During a partition, the system blocks or errors rather than return stale data.

Reduced availability during partitions
Operations may time out or fail
Use case: banking, inventory, leader election
Examples: HBase, MongoDB (default)

AP Systems (Availability + Partition Tolerance)

During a partition, the system remains operational but may return stale data.

Eventual consistency -- data eventually converges
Requires conflict resolution mechanisms
Use case: social feeds, shopping carts, CDNs
Examples: Cassandra, DynamoDB, CouchDB

PACELC: Extending CAP

The PACELC theorem extends CAP by considering latency vs. consistency trade-offs even when there is no partition: if Partition, choose between A and C; Else, choose between Latency and Consistency.

System	During Partition	No Partition	Examples
PA/EL	Available	Low latency	Cassandra, DynamoDB
PC/EC	Consistent	Strong consistency	Traditional RDBMS, Google Spanner

4. NoSQL Database Categories

NoSQL databases are broadly categorized by their fundamental data models, each optimized for different data types, access patterns, and use cases.

Key-Value Stores

Simplest model. Opaque key → value pairs.

O(1) GET/PUT by key
No query by value
Use: caching, sessions, leaderboards
Redis, DynamoDB, Memcached

Document Stores

Flexible JSON/BSON documents with nesting.

Rich query by document content
Secondary indexes, aggregation
Use: catalogs, CMS, user profiles
MongoDB, CouchDB

Column-Family Stores

Rows with dynamic columns in families.

Very high write throughput
Sparse, time-series data
Use: IoT, analytics, event logging
Cassandra, HBase, BigTable

Graph Databases

Nodes, edges, properties for relationships.

Index-free adjacency for O(1) per-hop traversal
Cypher / SPARQL query languages
Use: social networks, fraud, recommendations
Neo4j, Amazon Neptune

Feature	Key-Value	Document	Column-Family	Graph
Data Model	Key-value pairs	JSON/BSON docs	Rows + column families	Nodes + edges
Querying	By key only	Rich content query	Row key + CF	Traversals
Scalability	Very high	High	Very high	Moderate
Example	Redis, DynamoDB	MongoDB	Cassandra, HBase	Neo4j, Neptune

5. Key-Value Stores Deep Dive

Key-value stores are the simplest NoSQL form: data is an opaque value associated with a unique key. The database has no knowledge of the value's internal structure -- it is a simple lookup table at massive scale.

Core Operations

GET(key)

Retrieve value by key. O(1) average.

PUT(key, value)

Store or update value. O(1) average.

DELETE(key)

Remove key-value pair. O(1) average.

Redis: Session Management Example

Redis Commands

# Store session data with a 30-minute TTL in one atomic command
SET sess:user:12345 '{"userId":"alice","cartItems":["itemA","itemB"]}' EX 1800

# Retrieve session
GET sess:user:12345

# Atomic increment (per-player score counter)
INCR leaderboard:player:alice
INCRBY leaderboard:player:alice 100

# Sorted set for real-time leaderboard
ZADD leaderboard 1500 "alice"
ZADD leaderboard 1200 "bob"
# Top 10 players, highest score first
ZREVRANGE leaderboard 0 9 WITHSCORES

# Delete session on logout
DEL sess:user:12345

SET ... EX: Stores the value and its TTL in a single atomic command -- Redis deletes the key automatically when it expires. (A separate SET followed by EXPIRE is two commands and is not atomic as a pair.)

INCR: Atomic integer increment -- no race conditions even under high concurrency.

ZADD / ZREVRANGE: Sorted sets enable O(log N) leaderboard updates and O(log N + k) top-k retrieval.
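The expiry and increment behavior described above can be illustrated with a small in-memory model. This is a minimal sketch in Python, not how Redis is implemented: the class name `TTLStore`, its method names, and the injectable `clock` parameter are all assumptions made for illustration, and a single-threaded dictionary stands in for the server.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLStore:
    """Toy in-memory key-value store with optional per-key expiry (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, expiry time, or None for "never expires")
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def set(self, key: str, value: str, ex: Optional[float] = None) -> None:
        """Store the value and its TTL in one step."""
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        """Return the value, or None if the key is missing or has expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry: remove the key when it is next read
            return None
        return value

    def incr(self, key: str, amount: int = 1) -> int:
        """Increment an integer value, treating a missing key as 0."""
        current = self.get(key)
        new_value = (int(current) if current is not None else 0) + amount
        # Keep any existing expiry on the key.
        expires_at = self._data[key][1] if key in self._data else None
        self._data[key] = (str(new_value), expires_at)
        return new_value

    def delete(self, key: str) -> bool:
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None


# Usage with a fake clock so that expiry can be observed without waiting.
fake_now = [0.0]
store = TTLStore(clock=lambda: fake_now[0])
store.set("sess:user:12345", '{"userId":"alice"}', ex=1800)
fake_now[0] = 1800.0  # 30 minutes later
session_after_expiry = store.get("sess:user:12345")  # None: the key has expired
```

Injecting the clock is a design choice for testability; a real server also expires keys in the background rather than only on access.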

Consistent Hashing for Distribution

Key-value stores distribute data across nodes using consistent hashing. Keys and nodes are mapped onto a ring, and each key is assigned to the nearest node clockwise. When a node joins or leaves, only nearby keys migrate.

Example: a consistent hashing ring with 4 nodes and 3 keys.

Node positions on the ring: Node A (pos 2), Node B (pos 6), Node D (pos 9), Node C (pos 11)

Key Assignments

Key	Hash	Node
user:alice	4	B
user:bob	8	D
user:carol	13	A

Adding Node D moved only 1 of the 3 keys.
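The clockwise lookup rule can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the class `HashRing` and its methods are hypothetical names, and the ring positions and key hashes are taken directly from the example (A at 2, B at 6, C at 11, D at 9; keys hashing to 4, 8, and 13) rather than computed by a real hash function.

```python
import bisect
from typing import Dict, List


class HashRing:
    """Toy consistent hashing ring with explicitly assigned positions (hypothetical)."""

    def __init__(self) -> None:
        self._positions: List[int] = []     # sorted ring positions
        self._node_at: Dict[int, str] = {}  # ring position -> node name

    def add_node(self, name: str, position: int) -> None:
        bisect.insort(self._positions, position)
        self._node_at[position] = name

    def lookup(self, key_hash: int) -> str:
        """Return the first node at or after key_hash, wrapping around the ring."""
        if not self._positions:
            raise LookupError("ring has no nodes")
        index = bisect.bisect_left(self._positions, key_hash)
        if index == len(self._positions):
            index = 0  # past the last node: wrap around to the first one
        return self._node_at[self._positions[index]]


key_hashes = {"user:alice": 4, "user:bob": 8, "user:carol": 13}

ring = HashRing()
for name, position in [("A", 2), ("B", 6), ("C", 11)]:
    ring.add_node(name, position)
before = {key: ring.lookup(h) for key, h in key_hashes.items()}

ring.add_node("D", 9)  # a new node joins between B and C
after = {key: ring.lookup(h) for key, h in key_hashes.items()}

# Keys whose owner changed when Node D joined.
moved = [key for key in key_hashes if before[key] != after[key]]
```

Only `user:bob` changes owner here, because Node D takes over just the arc between positions 6 and 9. Production systems usually also place several virtual positions per node to smooth out the distribution, which this sketch omits.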

Use Cases

Caching

Store DB query results, API responses, rendered pages. Reduces backend load by orders of magnitude. (Memcached, Redis)

Session Storage

Authentication tokens, shopping cart state, user preferences. Fast lookup by session ID + TTL auto-expiry. (Redis)

Real-Time Leaderboards

Gaming scores, rankings updated in real time. Redis sorted sets provide O(log N) updates and O(log N + k) top-k queries.

Rate Limiting

Track API request counts per user per window using atomic INCR and EXPIRE. Prevents abuse at low latency.
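The rate-limiting use case can be sketched as a fixed-window counter: the first request in a window creates a counter with an expiry, and later requests increment it. The Python below is a minimal single-process sketch, not a Redis client; `FixedWindowRateLimiter`, `allow`, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # user id -> (request count, time at which the counter expires)
        self._counters: Dict[str, Tuple[int, float]] = {}

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        entry = self._counters.get(user_id)
        if entry is None or now >= entry[1]:
            # No live counter: start a new window (counter created with an expiry).
            count, expires_at = 0, now + self._window
        else:
            count, expires_at = entry
        count += 1  # the increment step
        self._counters[user_id] = (count, expires_at)
        return count <= self._limit


# Usage with a fake clock: 3 requests per 60-second window.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(limit=3, window_seconds=60, clock=lambda: fake_now[0])
results = [limiter.allow("alice") for _ in range(4)]  # the fourth call is rejected
```

A fixed window is the simplest variant; it permits short bursts at window boundaries, which sliding-window or token-bucket designs are meant to reduce.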

6. Document Stores Deep Dive

Document databases store data in self-describing documents (typically JSON or BSON) with flexible schemas. Documents can contain nested objects and arrays, naturally mapping to object-oriented application models and eliminating many joins.

Document Store Structure

MongoDB CRUD Operations

MongoDB Query Language (MQL)

// INSERT -- flexible schema, no prior schema definition required
db.products.insertOne({
  "productId": "PROD001",
  "name": "Super Widget",
  "price": 29.99,
  "tags": ["electronics", "home"],
  "manufacturer": { "name": "WidgetCo", "country": "USA" },
  "stock": 150
});

// FIND -- rich query operators
db.products.find({
  "tags": "electronics",          // array containment query
  "price": { "$lt": 50 },         // range operator
  "manufacturer.country": "USA"   // nested field query
});

// UPDATE -- modify specific fields atomically
db.products.updateOne(
  { "productId": "PROD001" },
  { "$set": { "price": 27.50 }, "$inc": { "stock": -1 } }
);

// AGGREGATION PIPELINE -- multi-stage transformations
db.products.aggregate([
  { "$match": { "tags": "electronics" } },
  { "$group": { "_id": "$manufacturer.country",
                "avgPrice": { "$avg": "$price" },
                "count": { "$sum": 1 } } },
  { "$sort": { "avgPrice": -1 } }
]);

Flexible schema: Documents in the same collection can have different fields -- no ALTER TABLE needed.

Nested query: manufacturer.country queries nested fields without a join.

Aggregation pipeline: Stages ($match, $group, $sort) chain to produce complex analytics.
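To make the three pipeline stages concrete, the sketch below reproduces the same match, group, and sort steps over a list of plain Python dictionaries. It is only an analogy for what the database does server-side: the function name `avg_price_by_country` and the sample documents are hypothetical and chosen so the averages are easy to check by hand.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical sample documents with differing fields (flexible schema).
products: List[Dict[str, Any]] = [
    {"name": "Widget", "price": 30.0, "tags": ["electronics", "home"],
     "manufacturer": {"country": "USA"}},
    {"name": "Gadget", "price": 20.0, "tags": ["electronics"],
     "manufacturer": {"country": "USA"}},
    {"name": "Gizmo", "price": 40.0, "tags": ["electronics"],
     "manufacturer": {"country": "Germany"}},
    {"name": "Chair", "price": 80.0, "tags": ["furniture"],
     "manufacturer": {"country": "USA"}, "material": "oak"},
]


def avg_price_by_country(docs: List[Dict[str, Any]], tag: str) -> List[Dict[str, Any]]:
    # Stage 1 (like $match): keep documents whose tags array contains the tag.
    matched = [d for d in docs if tag in d.get("tags", [])]

    # Stage 2 (like $group): collect prices per manufacturer country.
    prices_by_country: Dict[str, List[float]] = defaultdict(list)
    for d in matched:
        prices_by_country[d["manufacturer"]["country"]].append(d["price"])
    rows = [
        {"_id": country, "avgPrice": sum(prices) / len(prices), "count": len(prices)}
        for country, prices in prices_by_country.items()
    ]

    # Stage 3 (like $sort with -1): highest average price first.
    rows.sort(key=lambda row: row["avgPrice"], reverse=True)
    return rows


report = avg_price_by_country(products, "electronics")
```

Each stage consumes the previous stage's output, which is the same chaining idea the pipeline expresses declaratively.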

Indexing Strategies

// Single field index -- O(log N) queries on 'category'
db.products.createIndex({ "category": 1 });

// Compound index -- accelerates queries filtering by both fields
db.products.createIndex({ "category": 1, "price": -1 });

// Nested field index
db.products.createIndex({ "manufacturer.country": 1 });

// Text index for full-text search
db.products.createIndex({ "description": "text" });

7. Column-Family Stores Deep Dive

Column-family stores, inspired by Google's BigTable, organize data into rows accessed by a row key, with each row containing columns grouped into column families. Unlike relational tables, different rows can have entirely different sets of columns within a family -- enabling sparse, wide-column storage ideal for time-series and high write-throughput workloads.

Column-Family Store Data Model

Cassandra: Time-Series IoT Data

Cassandra Query Language (CQL)

-- Table design: composite primary key for time-series
CREATE TABLE sensor_readings (
  device_id    TEXT,
  reading_time TIMESTAMP,
  temperature  FLOAT,
  humidity     FLOAT,
  PRIMARY KEY (device_id, reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- INSERT -- appends a new row (fast, append-only log structure)
INSERT INTO sensor_readings (device_id, reading_time, temperature, humidity)
VALUES ('sensor_A', toTimestamp(now()), 23.5, 60.2);

-- RANGE QUERY -- efficient thanks to composite primary key
SELECT * FROM sensor_readings
WHERE device_id = 'sensor_A'
  AND reading_time >= '2024-01-01 08:00:00'
  AND reading_time <= '2024-01-01 09:00:00';

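-- Tunable consistency (CONSISTENCY is a cqlsh shell command that applies to
-- subsequent statements; client drivers can set the level per statement)
CONSISTENCY QUORUM;  -- QUORUM for both reads and writes gives W+R > N
CONSISTENCY ONE;     -- Fastest, eventual consistency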

Composite primary key: device_id is the partition key (determines shard), reading_time is the clustering key (sorted within partition).

Append-only writes: Cassandra's LSM tree structure makes writes extremely fast -- no in-place updates.

Tunable consistency: Choose per-operation between ONE, QUORUM, or ALL.
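The role of the two key parts can be sketched with ordinary Python data structures: the partition key selects a bucket, and the clustering key keeps rows sorted inside it so that a time-range query becomes a binary search plus a slice. This is a conceptual sketch only; `SensorTable` and its methods are hypothetical names, rows are kept in ascending time order for simplicity, and timestamps are assumed to be ISO-8601 strings in one consistent format so that string order matches time order.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Tuple

Reading = Tuple[str, float]  # (reading_time as ISO-8601 string, temperature)


class SensorTable:
    """Toy model of a table with a partition key and a clustering key."""

    def __init__(self) -> None:
        # partition key (device_id) -> rows sorted by clustering key (reading_time)
        self._partitions: Dict[str, List[Reading]] = defaultdict(list)

    def insert(self, device_id: str, reading_time: str, temperature: float) -> None:
        bisect.insort(self._partitions[device_id], (reading_time, temperature))

    def range_query(self, device_id: str, start: str, end: str) -> List[Reading]:
        """Return readings for one device with start <= reading_time <= end."""
        rows = self._partitions.get(device_id, [])
        low = bisect.bisect_left(rows, (start, float("-inf")))
        high = bisect.bisect_right(rows, (end, float("inf")))
        return rows[low:high]


table = SensorTable()
table.insert("sensor_A", "2024-01-01 07:59:00", 22.9)
table.insert("sensor_A", "2024-01-01 09:00:00", 24.1)
table.insert("sensor_A", "2024-01-01 08:30:00", 23.5)
table.insert("sensor_B", "2024-01-01 08:30:00", 19.0)

window = table.range_query("sensor_A", "2024-01-01 08:00:00", "2024-01-01 09:00:00")
```

The query touches a single partition and a contiguous run of rows, which is why a filter on the partition key plus a range on the clustering key is the efficient access pattern.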

Use Cases

Time-Series Data

Stock prices, metrics, weather data. Composite keys enable efficient range scans by time window.

IoT Sensor Data

Ingesting millions of readings per second from distributed devices. Cassandra's ring topology has no single point of failure.

Event Logging

Immutable audit trails, application logs. Append-only write model is a natural fit.

Large-Scale Analytics

HBase on Hadoop for batch processing. Each column family is stored separately, which compresses well and lets scans read only the families they need.

8. Graph Databases Deep Dive

Graph databases make relationships first-class citizens alongside entities. Data is modeled as nodes (entities), edges (relationships), and properties (key-value metadata on both). This enables efficient traversal of deeply connected data that would require expensive multi-way joins in a relational model.

Graph Database Data Model

Index-Free Adjacency

The key performance differentiator for graph databases: each node directly stores references to its adjacent nodes and edges, rather than relying on an index lookup to find them. Each hop therefore costs O(1), so the cost of a traversal depends on how much of the graph it actually visits -- not on the total graph size. A relational database would typically need a self-join for each hop, which becomes increasingly expensive as depth grows.
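The idea of following stored neighbor references, instead of searching an index, can be sketched with an adjacency-list graph and a breadth-first search. This Python sketch is an illustration only, not how any particular graph database stores data; the `friends` graph and the `shortest_path` function are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional

# Each node holds direct references to its neighbors (hypothetical sample data).
friends: Dict[str, List[str]] = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Charlie"],
    "Charlie": ["Bob", "Dana"],
    "Dana": ["Charlie"],
}


def shortest_path(graph: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns the node names on a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    previous: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the chain of predecessors back to the start.
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in graph.get(node, []):  # one hop: read the stored neighbor list
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


route = shortest_path(friends, "Alice", "Dana")
```

The work done is bounded by the nodes and edges the search actually reaches, regardless of how many unrelated nodes the graph contains.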

Neo4j Cypher Query Language

Cypher (Neo4j)

// Create nodes and relationships
CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (bob:Person {name: 'Bob', age: 25})
CREATE (matrix:Movie {title: 'The Matrix', year: 1999})
CREATE (alice)-[:ACTED_IN {role: 'Neo'}]->(matrix)
CREATE (alice)-[:FRIENDS_WITH {since: 2019}]->(bob)

// Find all friends of Alice
MATCH (a:Person {name: 'Alice'})-[:FRIENDS_WITH]->(f:Person)
RETURN f.name, f.age;

// Find mutual friends of Alice and Bob
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]->(mutual)
      <-[:FRIENDS_WITH]-(bob:Person {name: 'Bob'})
RETURN mutual.name;

// Shortest path between two people
MATCH path = shortestPath(
  (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*]-(target:Person {name: 'Charlie'})
)
RETURN length(path), [n IN nodes(path) | n.name] AS route;

// Recommendation: products bought by people who bought what Alice bought
MATCH (alice:Person {name: 'Alice'})-[:BOUGHT]->(prod:Product)
      <-[:BOUGHT]-(other:Person)-[:BOUGHT]->(rec:Product)
WHERE NOT (alice)-[:BOUGHT]->(rec)
RETURN rec.name, count(*) AS score ORDER BY score DESC LIMIT 5;

Pattern syntax: (a)-[:REL]->(b) visually represents graph patterns in ASCII art.

Variable-length paths: [:FRIENDS_WITH*] traverses any depth without writing recursive SQL.

Recommendation: Multi-hop traversal that would require 3+ self-joins in SQL executes efficiently via index-free adjacency.
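The recommendation pattern can be read as three hops: from a person to their products, to other buyers of those products, to the other products those buyers purchased. The Python sketch below walks the same hops over an in-memory dictionary to show what is being counted; the `purchases` data and the `recommend` function are hypothetical, and a real graph database would evaluate the pattern itself.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# person -> set of products bought (hypothetical sample data)
purchases: Dict[str, Set[str]] = {
    "Alice": {"laptop", "mouse"},
    "Bob": {"laptop", "keyboard", "monitor"},
    "Carol": {"mouse", "keyboard"},
    "Dave": {"desk"},
}


def recommend(data: Dict[str, Set[str]], person: str, limit: int = 5) -> List[Tuple[str, int]]:
    """Score products bought by people who share a purchase with `person`."""
    mine = data.get(person, set())
    scores: Counter = Counter()
    for product in mine:                      # hop 1: person -> product
        for other, theirs in data.items():    # hop 2: product -> other buyers
            if other == person or product not in theirs:
                continue
            for candidate in theirs - mine:   # hop 3: other buyer -> new product
                scores[candidate] += 1        # one point per matching path
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]


top = recommend(purchases, "Alice")
```

Here "keyboard" scores 2 because it is reachable through two different paths (via Bob and the laptop, and via Carol and the mouse), while "monitor" scores 1.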

Use Cases

Social Networks

Friends-of-friends queries, community detection, influence propagation -- all natural graph traversals.

Fraud Detection

Identify suspicious transaction networks by finding cycles or unusual relationship patterns across accounts and devices.

Recommendation Engines

Collaborative filtering via graph traversal -- "people who bought X also bought Y" queries are simple MATCH patterns.

Knowledge Graphs

Representing complex entity relationships (Google Knowledge Graph, enterprise ontologies, drug interaction networks).

9. Consistency Models & Distribution

Consistency in distributed NoSQL systems spans a spectrum from strong guarantees to eventual convergence. Understanding the trade-offs is essential for designing correct distributed applications.

Sharding vs. Replication

Strong Consistency

All reads after a write see the updated value, on every node, immediately.

Achieved via synchronous replication
Higher latency and lower availability
Required for: banking, inventory, auth tokens
Examples: RDBMS, MongoDB default, HBase

Eventual Consistency

Writes propagate asynchronously; reads may return stale data temporarily.

Higher availability and write throughput
Inconsistency window: ms to seconds
Suitable for: social feeds, shopping carts, DNS
Examples: Cassandra, DynamoDB, CouchDB

Strong vs. eventual consistency, side by side: both systems end up consistent (final value 200), but the AP (eventual) system stayed available throughout, while the CP (strong) system blocked during the write.

Quorum-Based Consistency: W + R > N

Example: quorum replication with N=3, W=2, R=2. Because W + R = 4 > N = 3, reads are strongly consistent.

All 3 replicas end up consistent at X=200. The quorum condition W+R>N guarantees correct reads even during partial sync.
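The reason W + R > N works is that any set of W replicas and any set of R replicas must then share at least one member, so every read contacts a replica that saw the latest write. The brute-force check below illustrates this in Python; the function name `quorums_always_overlap` is a hypothetical choice, and the sketch ignores failures, timing, and repair.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every possible write set of size w intersects every read set of size r."""
    replicas = range(n)
    return all(
        bool(set(write_set) & set(read_set))
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# The example configuration: N=3, W=2, R=2.
strong = quorums_always_overlap(3, 2, 2)   # True: 2 + 2 > 3
weak = quorums_always_overlap(3, 1, 1)     # False: a read can miss the written replica

# Enumerate every (W, R) pair for a 5-replica cluster that guarantees overlap.
strong_pairs_n5 = [
    (w, r)
    for w in range(1, 6)
    for r in range(1, 6)
    if quorums_always_overlap(5, w, r)
]
```

The enumerated pairs for N=5 are exactly those with W + R > 5, which matches the formula; lowering W or R below that threshold trades the overlap guarantee for lower latency.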

Vector Clocks & Conflict Resolution

Vector clocks track causality in distributed systems: each node maintains a vector of logical timestamps, one per replica. When two vector clocks are incomparable (neither is an ancestor of the other), they represent concurrent updates that conflict and need resolution.
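The comparison rule can be sketched as follows. This is a minimal illustration in Python, assuming a vector clock is represented as a dictionary from node name to counter with missing entries treated as 0; the names `compare` and `merge` are hypothetical.

```python
from typing import Dict

VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a relative to b: 'equal', 'before', 'after', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(node, 0) <= b.get(node, 0) for node in nodes)
    b_le_a = all(b.get(node, 0) <= a.get(node, 0) for node in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is an ancestor of b
    if b_le_a:
        return "after"       # b is an ancestor of a
    return "concurrent"      # incomparable: a conflict that needs resolution


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used after a conflict has been resolved."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in set(a) | set(b)}


ordered = compare({"A": 2, "B": 1}, {"A": 2, "B": 2})     # "before"
conflict = compare({"A": 3, "B": 1}, {"A": 2, "B": 2})    # "concurrent"
resolved = merge({"A": 3, "B": 1}, {"A": 2, "B": 2})      # {"A": 3, "B": 2}
```

In the conflicting case each clock is ahead on a different node, so neither update could have seen the other.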

Last Write Wins (LWW)

Highest timestamp wins. Simple but can lose data with clock skew. Default in Cassandra.

Application-Level Merge

Database stores all conflicting versions (siblings); application logic selects or merges. Used in CouchDB.

CRDTs

Conflict-free Replicated Data Types (counters, sets) mathematically guarantee convergence without explicit resolution logic.
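As a concrete instance of a CRDT, a grow-only counter (often called a G-Counter) keeps one count per node and merges by taking the per-node maximum, so replicas converge regardless of the order in which they exchange state. The Python below is a minimal sketch; the class and method names are illustrative choices, not a specific library's API.

```python
from typing import Dict


class GCounter:
    """Grow-only counter CRDT: each node increments only its own entry."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Take the per-node maximum; commutative, associative, and idempotent."""
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


# Two replicas accept increments independently, then exchange state.
replica_a = GCounter("A")
replica_b = GCounter("B")
replica_a.increment(3)
replica_b.increment(2)

replica_a.merge(replica_b)
replica_b.merge(replica_a)
replica_a.merge(replica_b)  # merging again changes nothing
```

Both replicas end at 5 without any timestamp comparison, in contrast to last-write-wins, where one of the two concurrent updates would be discarded.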

Tunable Consistency Levels (Cassandra)

Level	Replicas Consulted	Consistency	Latency
ONE	1 replica	Eventual	Lowest
QUORUM	⌊N/2⌋ + 1 replicas	Strong (if W+R>N)	Medium
ALL	All N replicas	Strongest	Highest

10. Common Mistakes

Treating NoSQL as Truly Schema-Less

Assuming no schema means no data discipline

NoSQL databases are schema-flexible, not schema-free. Applications still operate with an implicit schema. Without discipline, collections accumulate inconsistent fields, types, and structures that cause hard-to-debug application errors and make migrations painful.
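One way to keep that implicit schema explicit is to validate documents in the application before writing them. The Python sketch below shows the idea under stated assumptions: the `PRODUCT_SCHEMA` mapping, its field names, and the `validate_product` function are hypothetical, and some databases also offer optional server-side validation that could be used instead.

```python
from typing import Any, Dict, List

# Hypothetical implicit schema: required field -> accepted Python type(s).
PRODUCT_SCHEMA: Dict[str, Any] = {
    "productId": str,
    "name": str,
    "price": (int, float),
    "stock": int,
}


def validate_product(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the document is acceptable."""
    errors: List[str] = []
    for field, expected in PRODUCT_SCHEMA.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(doc[field], bool) or not isinstance(doc[field], expected):
            errors.append(f"wrong type for {field}: {type(doc[field]).__name__}")
    return errors


good = validate_product({"productId": "PROD001", "name": "Super Widget",
                         "price": 29.99, "stock": 150, "tags": ["electronics"]})
bad = validate_product({"productId": "PROD002", "name": "Mini Widget",
                        "price": "29.99"})
```

Extra fields such as `tags` are allowed, which preserves schema flexibility, while the string-typed price and the missing `stock` field are reported before they reach the collection.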

Ignoring Eventual Consistency Implications

Assuming writes are immediately visible everywhere

In AP systems, a write may not be visible on another replica for milliseconds or longer. Applications must handle stale reads gracefully -- implement read-your-writes for critical user operations, and design UIs to tolerate brief inconsistencies.

Poor Shard Key Selection

Choosing a shard key with low cardinality or skewed distribution

A bad shard key (e.g., a boolean field, or a monotonically increasing time-based key in an append-heavy workload) creates hot spots where one shard receives disproportionate traffic while others sit idle. A good shard key distributes data evenly and supports common query patterns.
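The effect of cardinality can be seen by hashing two candidate keys onto a small number of shards. This Python sketch assumes hash-based sharding with a hypothetical `shard_for` function built on SHA-256; real systems use their own partitioners, and the 10,000 sample documents are synthetic.

```python
import hashlib
from collections import Counter
from typing import Iterable


def shard_for(key_value: str, num_shards: int) -> int:
    """Map a shard-key value to a shard number using a stable hash."""
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribution(key_values: Iterable[str], num_shards: int) -> Counter:
    """Count how many documents land on each shard."""
    return Counter(shard_for(value, num_shards) for value in key_values)


NUM_SHARDS = 4

# Low-cardinality key: a boolean flag has only two distinct values.
boolean_keys = [str(i % 2 == 0) for i in range(10_000)]
boolean_spread = distribution(boolean_keys, NUM_SHARDS)

# High-cardinality key: a unique user id per document.
user_id_keys = [f"user-{i}" for i in range(10_000)]
user_id_spread = distribution(user_id_keys, NUM_SHARDS)
```

With the boolean key, at most two shards ever receive data no matter how many shards exist, while the user-id key spreads documents across all four. Hashing helps with even spread but does not by itself address query patterns such as range scans, which still need to be considered when choosing the key.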

Confusing NoSQL with "No Query Language"

Assuming NoSQL means simple or limited querying

NoSQL means "Not Only SQL." Most NoSQL databases have powerful query languages: MQL for MongoDB, CQL for Cassandra, Cypher for Neo4j, SPARQL for RDF stores. These are purpose-built and often more expressive for their respective data models than generic SQL.

Using the Wrong NoSQL Type for the Problem

Forcing all data into a single NoSQL category

Using a key-value store for complex relationship queries, or a graph database for simple high-volume key lookups, leads to poor performance and awkward data modeling. Match the database type to the inherent structure and access patterns of your data -- consider polyglot persistence for complex applications.

Neglecting Data Locality in Graph Databases

Assuming graph traversals are always fast regardless of graph structure

Index-free adjacency is fast when related nodes are co-located in memory. Poorly designed graphs, or graphs distributed across many machines, can suffer performance penalties if traversals constantly cross node boundaries. Understand your query patterns and design the graph layout accordingly.

Practice Quiz

Test your understanding of NoSQL databases with the following questions.

1. Which of the following is NOT one of the three guarantees in the CAP Theorem?

2. A database system that prioritizes Availability and Partition Tolerance over strong Consistency is known as a(n):

3. Which NoSQL database category is best suited for managing user session data and caching due to its O(1) read/write performance by key?

4. Which of the following is a defining characteristic of a Document Store?

5. The formula W + R > N is used in which consistency mechanism?

6. Which NoSQL database type would be most appropriate for modeling complex relationships in a social network or for fraud detection?

7. What is the primary benefit of Index-Free Adjacency in graph databases?

8. Which term describes the process of intentionally adding redundant data to optimize read performance in NoSQL databases?

9. Which database system is a prominent example of a Column-Family Store, known for high write throughput and handling time-series data?

10. What does BASE stand for in the context of distributed NoSQL systems?

Study Tips

Draw the CAP triangle from memory: Label each vertex, place Cassandra and MongoDB in the correct regions, and explain why CA systems cannot exist in practice.
Practice all four data models: Sketch a data model for a social network in each NoSQL type -- key-value, document, column-family, and graph. This reveals each type's strengths and weaknesses concretely.
Run Redis and MongoDB locally: The official Docker images make it trivial to experiment with SET/GET/EXPIRE in Redis and insertOne/find/aggregate in MongoDB within minutes.
Work through the quorum formula: For a 5-node cluster, calculate all combinations of W and R that give strong consistency (W+R>5) vs. eventual consistency (W+R≤5) and their availability implications.
Map use cases to database types: For any application described in an exam scenario, practice identifying which NoSQL type (or combination) fits best and why -- this tests deep conceptual understanding.
