Here’s a text-only, rows/columns comparison of the five NoSQL databases you asked about: no diagrams, just tables.
1) Overview & CAP posture
| Database | Primary Data Model | CAP Lean (typical) | Consistency (default) | Tunability | Transactions |
|---|---|---|---|---|---|
| MongoDB | Document (BSON/JSON) | CP by default; AP-ish if reading from secondaries | Reads and writes go to the primary; default readConcern is “local”, default writeConcern is “majority” (5.0+) | readConcern/writeConcern, readPreference | ACID multi-document (incl. sharded, 4.2+) |
| Cassandra | Wide-column | AP (high availability, partition tolerant) | Eventual; per-op CL (e.g., LOCAL_QUORUM) | Tunable CL for reads/writes, RF | Per-partition atomic writes; LWT (Paxos) for conditional updates |
| Elasticsearch | Indexed documents (search) | AP (NRT search, high availability) | Primary→replica; eventual across replicas | Acks, refresh interval, durability settings | Per-doc atomic only |
| Couchbase | Document + Key/Value (memory-first) | AP-leaning (durability/consistency tunable) | KV strong per-key; queries NRT unless request_plus | Durability levels, scan_consistency | Distributed ACID transactions |
| Neo4j | Property graph | CP in core (leader/followers); AP-like reads on replicas | Strong on core; eventual on read replicas | Causal consistency APIs, topology | ACID |
CAP note: All five run as distributed clusters, so partition tolerance is a given. The “CAP Lean” column reflects what the common defaults give up when a partition actually occurs: consistency (AP) or availability (CP).
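
To make the “Tunability” column concrete, here is a minimal sketch of the usual quorum-overlap rule: with replication factor RF, a read that contacts R replicas must overlap a write acknowledged by W replicas whenever R + W > RF. The function names are made up for illustration and are not any driver's API; the sketch also ignores node failures, hinted handoff, and timestamp ties.

```python
def quorum(rf: int) -> int:
    """Replicas needed for a QUORUM-style level: floor(rf / 2) + 1."""
    return rf // 2 + 1


def read_sees_latest_write(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


def overlap_table(rf: int, levels: dict) -> dict:
    """Map (write_level, read_level) -> whether the read is guaranteed to overlap."""
    return {
        (w_name, r_name): read_sees_latest_write(r, w, rf)
        for w_name, w in levels.items()
        for r_name, r in levels.items()
    }


RF = 3  # assumed replication factor for the example
LEVELS = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}

for (w_name, r_name), ok in overlap_table(RF, LEVELS).items():
    print(f"write={w_name:<6} read={r_name:<6} overlap guaranteed: {ok}")
```

With RF = 3, QUORUM writes plus QUORUM reads overlap (2 + 2 > 3), while ONE plus ONE does not. That is why the per-operation consistency level, not the database alone, decides how “C” or “A” a given request behaves.
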
2) Internals & scaling
| Database | Replication / Consensus | Partitioning / Scaling | Storage / Index Core | Query Interfaces |
|---|---|---|---|---|
| MongoDB | Replica sets (PV1, Raft-like election); CSRS for metadata | Sharding: hashed / ranged / zone; balancer | WiredTiger (B-Tree), secondary indexes; text/geo | CRUD, Aggregation Pipeline, $lookup, text/geo, Change Streams |
| Cassandra | Gossip membership; hints; Paxos for LWT; anti-entropy via Merkle trees | Consistent-hash ring (token ranges); linear scale; multi-region active-active | LSM / SSTables; leveled/size-tiered compaction | CQL (partition-key and clustering-key oriented) |
| Elasticsearch | Cluster coordination (Zen2, Raft-like); primary/replica shards | Hash-based shards; routing by id/field; cross-cluster search/replication | Lucene inverted index, doc values; translog; segment merges | Full-text DSL, filters, aggregations, geo (REST/HTTP) |
| Couchbase | vBuckets + replicas; DCP change streams; XDCR for x-DC | 1024 vBuckets per bucket hashed across nodes; multi-dimensional scaling by service | Couchstore/Magma; GSI indexes; FTS service | SQL++ (formerly N1QL; SQL for JSON), KV API, FTS, Analytics, Eventing |
| Neo4j | Core cluster uses Raft (leader/followers); read replicas | Fabric for multi-graph; replicas for read scale | Native graph store; schema indexes; optional Lucene for text | Cypher; procedures (APOC); GDS library APIs |
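
The “Partitioning / Scaling” column mentions hashing keys into a fixed set of buckets that are then assigned to nodes. The sketch below illustrates that two-step idea in the style of vBuckets. It assumes a plain `crc32(key) % 1024` hash and a round-robin bucket map purely for illustration; real clients and rebalancers differ in detail, and every name here is hypothetical.

```python
import zlib
from collections import Counter
from typing import List, Sequence

NUM_BUCKETS = 1024  # fixed bucket count, independent of cluster size


def bucket_for(key: str) -> int:
    """Step 1: hash the key to a bucket (assumed hash: CRC32 mod NUM_BUCKETS)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def build_bucket_map(nodes: Sequence[str]) -> List[str]:
    """Step 2: assign each bucket to a node (naive round-robin for the sketch)."""
    return [nodes[b % len(nodes)] for b in range(NUM_BUCKETS)]


def node_for(key: str, bucket_map: Sequence[str]) -> str:
    """Route a key: key -> bucket -> owning node."""
    return bucket_map[bucket_for(key)]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
bucket_map = build_bucket_map(nodes)
buckets_per_node = Counter(bucket_map)

print(buckets_per_node)
print("user::42 ->", bucket_for("user::42"), "->", node_for("user::42", bucket_map))
```

The point of the indirection is that a key's bucket never changes when nodes join or leave; only the bucket-to-node map is updated. The round-robin map here would move many buckets on a resize, whereas a production rebalancer would try to move as few as possible.
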
3) Features, use cases & cautions
| Database | Notable Features | Top Use Cases | Cautions / Pitfalls |
|---|---|---|---|
| MongoDB | Time-series, TTL, schema validation, Client/Field-Level & Queryable Encryption, Atlas Search | Catalogs, user profiles, event logs, microservices backends, operational analytics | Bad shard key → hot shards; scatter-gather when a query omits the shard key; documents approaching the 16 MB limit or unbounded arrays |
| Cassandra | Multi-region active-active, incremental repair, materialized views (with care) | IoT/time-series, session stores, metrics, high-write feeds | Query-driven data modeling; no joins; LWT adds latency; wide partitions can hurt |
| Elasticsearch | Relevance scoring, analyzers, ILM, CCR/CCS, observability ecosystem | Search, log analytics/observability, e-commerce search, SIEM | Not an OLTP store; mapping explosion (too many fields) and aggregations on high-cardinality fields can exhaust heap; refresh interval trades search freshness for indexing throughput |
| Couchbase | Built-in cache, sub-document ops, Sync Gateway (mobile), flexible services (KV/Query/FTS/Analytics) | Low-latency KV & sessions, personalization, catalogs, mobile sync | Pick scan_consistency wisely; index sizing/placement matters; cross-service resource isolation |
| Neo4j | Graph algorithms (PageRank, community detection, shortest path), path traversals, APOC | Fraud detection, recommendations, knowledge graphs, network analysis | Writes for a database go through a single leader, so write scale-out needs careful partitioning (e.g., Fabric); dense “supernode” relationships can bottleneck |
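
The “bad shard key → hot shards” caution is easiest to see with a monotonically increasing key. The sketch below compares a range partitioner with a hashed one on such keys. The split points, shard count, and SHA-256-based hash are assumptions chosen for illustration and do not reproduce any database's actual partitioner.

```python
import bisect
import hashlib
from collections import Counter

SPLIT_POINTS = [250, 500, 750]  # hypothetical range boundaries -> 4 shards
NUM_SHARDS = len(SPLIT_POINTS) + 1


def range_shard(key: int) -> int:
    """Range partitioning: the shard is the interval the key falls into."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def hashed_shard(key: int, shards: int = NUM_SHARDS) -> int:
    """Hashed partitioning: the shard is derived from a hash of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shards


# Monotonically increasing keys, e.g. timestamps or auto-increment ids.
new_keys = range(1000, 1100)

range_load = Counter(range_shard(k) for k in new_keys)
hashed_load = Counter(hashed_shard(k) for k in new_keys)

print("range :", dict(range_load))
print("hashed:", dict(hashed_load))
```

Under these assumptions every new key lands on the last range shard, while the hashed variant spreads the same writes across shards. The trade-off is that hashing gives up efficient range scans over the key, so the choice depends on the query pattern.
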
How to pick quickly (text-only cheat sheet)
- Need strong consistency + rich JSON + aggregations → MongoDB (CP).
- Need planet-scale writes + tunable consistency → Cassandra (AP).
- Need full-text search/analytics over logs → Elasticsearch (AP).
- Need millisecond KV + SQL-for-JSON + caching → Couchbase (AP-leaning).
- Need deep relationship queries/graph algorithms → Neo4j (CP core).
If you want this exported as a single HTML/CSV table, say “export HTML” or “export CSV” and I’ll generate the file immediately.