Postgres stores data in tables with a declared schema. Foreign keys (FKs) connect one table to another, and the database enforces them. Writes that violate a foreign key are rejected.
| id (PK) | name |
|---|---|

| id (PK) | user_id (FK) | title |
|---|---|---|
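The foreign-key rule above can be seen in a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs without a server (SQLite needs `PRAGMA foreign_keys = ON`, while Postgres enforces foreign keys by default). The table names `users` and `posts` are assumptions, since the schema above only gives the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only; Postgres always enforces FKs
# Hypothetical table names; the columns mirror the two schemas above.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " title TEXT NOT NULL)"
)

conn.execute("INSERT INTO users (id, name) VALUES (1, 'ada')")
conn.execute("INSERT INTO posts (user_id, title) VALUES (1, 'hello')")  # parent row exists

try:
    # No user with id 999 exists, so the database rejects this write.
    conn.execute("INSERT INTO posts (user_id, title) VALUES (999, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

The rejected insert leaves no row behind, so the table still holds only the valid post.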
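Lookups on indexed columns are fast, roughly O(log N) in table size. Postgres indexes primary keys automatically, but not foreign-key columns; those need an explicit index. Queries that cannot use an index scan every row, O(N).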
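One way to see the difference is to ask the query planner what it intends to do. The sketch below again uses `sqlite3` as a stand-in and its `EXPLAIN QUERY PLAN` statement; in Postgres the analogous tool is `EXPLAIN`, whose output is worded differently. The table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)")
conn.execute("CREATE INDEX idx_posts_user_id ON posts (user_id)")

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan text

# user_id is indexed, so the planner can search the index.
indexed_plan = plan("SELECT title FROM posts WHERE user_id = 1")
# title has no index, so the planner has to scan the whole table.
scan_plan = plan("SELECT id FROM posts WHERE title = 'hello'")

print(indexed_plan)
print(scan_plan)
```

The exact wording of the plan text varies between SQLite versions, but the first plan is a search that names the index and the second is a scan.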
Connections are finite: each one is backed by its own server process, so many clients usually share a connection pool. Row locks serialize concurrent writes to the same row.
If your data has relationships and correctness matters more than throughput, start here.
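Redis is one big hash map. Values are addressed by a key: hash(key) → bucket → value. SET, GET, and DEL on a plain string key are O(1). No schema. Multi-key transactions are limited: MULTI/EXEC runs queued commands atomically but has no rollback, and in Redis Cluster every key in a transaction must hash to the same slot.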
Point lookup by key is O(1). Search by value means walking every key, O(N).
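The gap between the two access paths shows up even in a toy model. The class below is a hypothetical in-memory store, not Redis's actual implementation: it uses a fixed number of buckets and never resizes, so lookups are O(1) only while buckets stay short.

```python
class TinyKV:
    """Toy in-memory key-value store: hash(key) -> bucket -> value."""

    def __init__(self, n_buckets=64):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite the existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):     # only one bucket is inspected
            if k == key:
                return v
        return None

    def delete(self, key):
        bucket = self._bucket(key)
        bucket[:] = [(k, v) for k, v in bucket if k != key]

    def keys_with_value(self, value):
        # Values are not indexed: every bucket and every entry is visited, O(N).
        return [k for bucket in self.buckets for k, v in bucket if v == value]


store = TinyKV()
store.set("session:1", "alice")
store.set("session:2", "bob")
store.set("session:3", "alice")
```

`store.get("session:2")` touches a single bucket, while `store.keys_with_value("alice")` has to visit all of them.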
Sub-millisecond latency at huge throughput. Hot keys and memory pressure can break that.
Best for caches, ephemeral state, and simple keyed lookups. Rarely your source of truth.
Cassandra groups data by a row key (the partition key in CQL terms). Rows are schema-flexible and often sparse. Writes are upserts at the column level: insert with an existing row key and the new columns merge into the existing row.
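Column-level upserts can be sketched with a dictionary of dictionaries. This is a toy model with a hypothetical `upsert` helper, not Cassandra's storage engine; in particular, it ignores the per-column write timestamps Cassandra uses to decide which write wins.

```python
from collections import defaultdict

# Toy wide-column table: row key -> {column name -> value}.
table = defaultdict(dict)

def upsert(row_key, **columns):
    """Write columns for a row key; columns not mentioned keep their old values."""
    table[row_key].update(columns)

upsert("user:42", name="Ada", city="London")
upsert("user:42", city="Paris", plan="pro")   # same row key: columns merge in
upsert("user:43", name="Lin")                 # sparse row: only one column is set
```

After the second write, `user:42` still has its `name`, has a new `city`, and has gained `plan`.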
The row key hashes to a node. Query with the row key, one node answers. Query without it, every node has to scan.
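A rough sketch of the routing, assuming four nodes and simple modulo placement. Real Cassandra places keys on a token ring rather than taking a modulo; the names `node_for`, `read_by_key`, and `read_by_column` are hypothetical.

```python
import hashlib

NODES = 4
nodes = [dict() for _ in range(NODES)]   # each dict plays the role of one node

def node_for(row_key):
    """Map a row key to a node index with a stable hash."""
    digest = hashlib.sha256(row_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NODES

def write(row_key, row):
    nodes[node_for(row_key)][row_key] = row

def read_by_key(row_key):
    # The hash says which node owns the key, so one node is contacted.
    return nodes[node_for(row_key)].get(row_key), 1

def read_by_column(column, value):
    # Without a row key there is nothing to hash: every node scans its rows.
    hits = [r for n in nodes for r in n.values() if r.get(column) == value]
    return hits, len(nodes)

for i in range(100):
    write(f"user:{i}", {"id": i, "city": "Paris" if i % 10 == 0 else "Oslo"})

row, contacted_by_key = read_by_key("user:7")
paris, contacted_by_scan = read_by_column("city", "Paris")
```

The second value returned by each read is the number of nodes contacted: one for the keyed read, all four for the read without a key.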
Throughput scales close to linearly with nodes, as long as writes spread evenly across row keys. Concentrate writes on one row key and you pin a single node. That's a hot partition.
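The skew is easy to reproduce by counting where writes land. This sketch assumes four nodes and hash-modulo placement, and the `sensor:` key names are made up for illustration.

```python
import hashlib
from collections import Counter

NODES = 4

def node_for(row_key):
    """Map a row key to a node index with a stable hash (simplified placement)."""
    digest = hashlib.sha256(row_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NODES

# 10,000 writes spread over distinct row keys land on all nodes fairly evenly.
spread = Counter(node_for(f"sensor:{i}") for i in range(10_000))

# The same 10,000 writes against a single row key all land on one node.
hot = Counter(node_for("sensor:global") for _ in range(10_000))
```

Adding nodes would lower every count in `spread`, but the single entry in `hot` would stay at 10,000.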
Wide-column is a specialist. Pick it when you have massive write volume and already know your access pattern.
In object storage, a bucket holds objects addressed by a key like photos/2024/sunset.jpg. Objects are immutable blobs with metadata. The operations are PUT, GET, and DELETE. There are no partial updates: changing an object means replacing it whole.
GET by exact key is one index lookup. Filter by content or size and you have to walk every object in the bucket.
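The interface is small enough to model in a few lines. `ToyBucket` below is a hypothetical in-memory stand-in, not any provider's API; it only illustrates whole-object replacement and the cost of filtering on something other than the key.

```python
class ToyBucket:
    """Toy object store: key -> (immutable bytes, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # A PUT replaces the whole object; there is no partial update.
        self._objects[key] = (bytes(data), dict(metadata))

    def get(self, key):
        return self._objects[key][0]          # one lookup by exact key

    def delete(self, key):
        self._objects.pop(key, None)

    def find_larger_than(self, size):
        # Nothing is indexed by size, so every object is inspected.
        return sorted(k for k, (data, _) in self._objects.items() if len(data) > size)


bucket = ToyBucket()
bucket.put("photos/2024/sunset.jpg", b"\xff\xd8" + b"\x00" * 100, content_type="image/jpeg")
bucket.put("photos/2024/notes.txt", b"hi", content_type="text/plain")

# "Editing" an object means uploading a full replacement under the same key.
bucket.put("photos/2024/notes.txt", b"hi there", content_type="text/plain")
```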
Parallel GETs and PUTs scale almost for free. The pain shows up on a LIST over a huge bucket, where you page through keys one window at a time.
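Paged listing can be sketched as a loop over a continuation token. The functions `list_page` and `list_all` are hypothetical, and the 1000-key window is an assumed page size; this toy re-sorts on every call, which a real store would avoid by keeping keys ordered.

```python
def list_page(keys, prefix="", start_after="", max_keys=1000):
    """Return one window of keys after `start_after`, plus a token for the next call."""
    matching = sorted(k for k in keys if k.startswith(prefix) and k > start_after)
    page = matching[:max_keys]
    # A token is returned only when more keys remain beyond this page.
    next_token = page[-1] if len(matching) > max_keys else None
    return page, next_token

def list_all(keys, prefix="", max_keys=1000):
    """Walk the listing one page at a time; the call count grows with the key count."""
    token, calls, out = "", 0, []
    while token is not None:
        page, token = list_page(keys, prefix, token, max_keys)
        out.extend(page)
        calls += 1
    return out, calls

keys = [f"photos/2024/img{i:05d}.jpg" for i in range(2500)]
all_keys, calls = list_all(keys, prefix="photos/2024/", max_keys=1000)
```

Each page depends on the token from the previous one, so the calls cannot be issued in parallel the way independent GETs can.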
The default home for big files. Cheap, durable, and effectively unlimited in capacity. Everything else is a tradeoff.
In a vector database, each item is a high-dimensional embedding vector (e.g., 1536 dims) plus a small payload. The DB builds an approximate nearest neighbor (ANN) index (HNSW, IVF) so similarity search skips most vectors instead of scanning them all.
ANN search uses the index to narrow down to a handful of candidates near the query, then checks just those. Brute-force KNN compares the query to every vector.
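The contrast can be sketched in plain Python. `ToyIVF` below is a deliberately crude IVF-style index: it picks centroids by random sampling (real IVF implementations train them, typically with k-means), and the data is random 8-dimensional vectors rather than real embeddings.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_brute_force(query, vectors, k):
    """Exact KNN: compare the query to every vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))
    return ranked[:k], len(vectors)            # (ids, number of comparisons)

class ToyIVF:
    """IVF-style index: bucket vectors by nearest centroid, search only a few buckets."""

    def __init__(self, vectors, n_lists, rng):
        self.vectors = vectors
        # Assumption: centroids are random samples instead of trained cluster centres.
        self.centroids = rng.sample(vectors, n_lists)
        self.lists = [[] for _ in range(n_lists)]
        for i, v in enumerate(vectors):
            self.lists[self._nearest_lists(v, 1)[0]].append(i)

    def _nearest_lists(self, v, n):
        order = sorted(range(len(self.centroids)), key=lambda c: dist(v, self.centroids[c]))
        return order[:n]

    def search(self, query, k, n_probe=2):
        # Narrow down to the vectors in the n_probe closest lists, then rank just those.
        candidates = [i for c in self._nearest_lists(query, n_probe) for i in self.lists[c]]
        ranked = sorted(candidates, key=lambda i: dist(query, self.vectors[i]))
        return ranked[:k], len(self.centroids) + len(candidates)

rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(2000)]
query = [rng.gauss(0, 1) for _ in range(8)]

exact_ids, exact_cost = knn_brute_force(query, vectors, k=5)
index = ToyIVF(vectors, n_lists=40, rng=rng)
approx_ids, approx_cost = index.search(query, k=5, n_probe=4)
recall = len(set(exact_ids) & set(approx_ids)) / 5   # fraction of true neighbours found
```

Raising `n_probe` checks more lists, which moves `recall` toward 1.0 and `approx_cost` toward the brute-force cost.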
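ANN trades a little recall for a lot of speed. Filtered queries can fall off a cliff when the filter is selective and the index is not filter-aware: most of the nearest candidates fail the filter, so the engine returns too few results or falls back to scanning.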
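The filter problem can be illustrated without an index at all. In this sketch an exact top-k stands in for whatever the ANN index would return, so the only effect on display is the order of filtering and searching. The `lang` payload field and the 1% match rate are assumptions made for the example.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(1)
# Each item is (vector, payload). Only 1% of items carry the tag we filter on.
items = [([rng.gauss(0, 1) for _ in range(8)], {"lang": "is" if i % 100 == 0 else "en"})
         for i in range(5000)]
query = [rng.gauss(0, 1) for _ in range(8)]
k = 10

def top_k(candidates, k):
    return sorted(candidates, key=lambda i: dist(query, items[i][0]))[:k]

def matches(i):
    return items[i][1]["lang"] == "is"

# Post-filtering: take the k nearest overall, then drop those that fail the filter.
post_filtered = [i for i in top_k(range(len(items)), k) if matches(i)]

# Pre-filtering: apply the filter first, then search the survivors exhaustively.
pre_filtered = top_k([i for i in range(len(items)) if matches(i)], k)
```

With a filter this selective, `post_filtered` usually holds far fewer than `k` results, while `pre_filtered` returns a full `k` at the cost of a scan over every matching item.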
Vector storage is for similarity, not lookup. Pick it when "close enough" is the question and an exact key won't do.
In a graph database, data is nodes with labels (Person, Movie, City) and edges between them (FRIENDS_WITH, ACTED_IN, LIVES_IN). Both nodes and edges carry properties, and edges are first-class citizens you can traverse in either direction.
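A minimal property-graph model, as a sketch: `ToyGraph` is a hypothetical in-memory structure, not any graph database's storage format. It keeps two adjacency maps so that an edge can be followed from either end, and the sample data is illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    label: str
    props: dict = field(default_factory=dict)

@dataclass
class Edge:
    src: str
    rel: str
    dst: str
    props: dict = field(default_factory=dict)

class ToyGraph:
    def __init__(self):
        self.nodes = {}
        self.out_edges = defaultdict(list)   # source id -> edges leaving it
        self.in_edges = defaultdict(list)    # destination id -> edges arriving at it

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = Node(node_id, label, props)

    def add_edge(self, src, rel, dst, **props):
        edge = Edge(src, rel, dst, props)
        self.out_edges[src].append(edge)     # stored under both endpoints so that
        self.in_edges[dst].append(edge)      # either direction is a direct lookup

g = ToyGraph()
g.add_node("keanu", "Person", name="Keanu Reeves")
g.add_node("matrix", "Movie", title="The Matrix")
g.add_edge("keanu", "ACTED_IN", "matrix", role="Neo")

movies = [e.dst for e in g.out_edges["keanu"] if e.rel == "ACTED_IN"]   # forward
cast = [e.src for e in g.in_edges["matrix"] if e.rel == "ACTED_IN"]     # backward
```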
Start at a known node and hop along edges. The DB walks the local neighborhood instead of touching the whole dataset.
Shallow traversals stay fast on huge graphs because each hop is local. Push depth too far and the visited set fans out fast.
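The fan-out is easy to quantify with a depth-limited breadth-first walk. The graph here is an assumed toy: a tree in which every node has 10 neighbours one level down. Real graphs have cycles and uneven degree, but the growth pattern is similar.

```python
from collections import deque

def neighborhood(adj, start, max_depth):
    """Breadth-first walk from `start`; returns {node: depth} within max_depth hops."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue                       # do not expand past the depth limit
        for nxt in adj.get(node, ()):
            if nxt not in depth:           # each node is visited once
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth

def children(node):
    # Gives every node 10 children with ids that never collide.
    return [node * 10 + d for d in range(1, 11)]

# Build four levels below the root node 0.
adj = {}
frontier = [0]
for _ in range(4):
    next_frontier = []
    for n in frontier:
        adj[n] = children(n)
        next_frontier.extend(adj[n])
    frontier = next_frontier

visited_by_depth = {d: len(neighborhood(adj, 0, d)) for d in range(1, 5)}
print(visited_by_depth)   # {1: 11, 2: 111, 3: 1111, 4: 11111}
```

Each extra hop multiplies the visited set by roughly the average degree, which is why one or two hops stay cheap and five or six may not.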
Pick a graph database when relationships are the point and multi-hop questions about who connects to whom need to be cheap.