pg_mentat: bringing Mozilla's Mentat back inside PostgreSQL.
I have a soft spot for Datomic. The data model — immutable facts indexed by entity, attribute, value, and transaction; schema as a first-class set of attribute definitions; Datalog as the query language; time travel free with the model — is the one piece of database design from the last twenty years that has aged better than I expected. When Cognitect built it on top of DynamoDB and Cassandra and Postgres back-ends, the engineering said "this is a model worth keeping around" loud enough that Mozilla's mentat project tried to ship a desktop / mobile version of the same idea. Then Firefox's roadmap moved on, mentat got archived, and the model went looking for a new home.
pg_mentat is that home. It is
a PostgreSQL extension, written in Rust on
pgrx 0.17, that
implements Datomic's data model entirely inside Postgres: immutable
datoms, schema-first attributes, the Datalog query compiler, the pull
API, time travel as as-of/since/history/tx-range, and ACID
transactions because Postgres gives them to me for free. Supports
PostgreSQL 13 through 18. An optional companion daemon, mentatd,
speaks the Datomic client wire protocol over HTTP for applications
that already expect it.
the data model in two minutes for the SQL crowd
A datom is a 5-tuple [entity, attribute, value, tx, op]. Entity
is a stable id. Attribute is a name with a type and a cardinality
(one or many). Value is the obvious thing. Tx is the transaction
that asserted (or retracted) the fact. Op is whether the fact was
asserted or retracted.
That's it. All of your data is datoms. A row in a SQL table becomes
a small set of datoms — one per non-null column — that share an
entity id. An update becomes a retraction of the old datom and an
assertion of the new one, both at the same tx. A "row that did
not exist before time T" is an entity whose datoms with tx ≤ T
contain no assertions.
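To make the row-to-datoms mapping concrete, here is a minimal sketch in plain Python. It illustrates the model only and is not pg_mentat's storage layout or API: the `Datom` tuple, the `row_to_datoms` and `update` helpers, and the `person/...` attribute names are all hypothetical, and every attribute is assumed to be cardinality-one.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    """One immutable fact: [entity, attribute, value, tx, op]."""
    e: int        # stable entity id
    a: str        # attribute name
    v: Any        # value
    tx: int       # transaction that recorded the fact
    added: bool   # op: True = asserted, False = retracted


def row_to_datoms(entity: int, row: dict, tx: int) -> list[Datom]:
    """Turn one SQL-style row into one datom per non-null column."""
    return [Datom(entity, attr, val, tx, True)
            for attr, val in row.items() if val is not None]


def update(log: list[Datom], entity: int, attr: str, new: Any, tx: int) -> None:
    """Append a retraction of the live value and an assertion of the new
    one, both stamped with the same tx. Nothing is overwritten."""
    current = None
    for d in log:  # replay in log order to find the live value
        if d.e == entity and d.a == attr:
            current = d.v if d.added else None
    if current is not None:
        log.append(Datom(entity, attr, current, tx, False))
    log.append(Datom(entity, attr, new, tx, True))


# Hypothetical row: the NULL phone column produces no datom at all.
log = row_to_datoms(
    1,
    {"person/name": "Ada", "person/email": "ada@example.com", "person/phone": None},
    tx=100,
)
update(log, 1, "person/email", "ada@newmail.example", tx=101)
for d in log:
    print(d)
```

The log only ever grows: the old email is still there at tx 100, followed by its retraction and the new assertion at tx 101.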
This is a strange shape if you have spent your career on tables. It turns out to also be a powerful shape, because:
- Time travel is free. as-of T is just a filter on tx.
- Schema migrations are additive. Adding an attribute is asserting a datom about it; no ALTER TABLE lock dance.
- Audit logs are the data. You did not have to build them.
- Reverse references are first-class. "Who points at this entity" is a single index lookup.
- The query language is the same shape as the data. Datalog patterns are 5-tuples with logic variables; nothing has to be translated.
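Several of these properties can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how pg_mentat executes queries: the `as_of` and `match` helpers and the sample attributes are hypothetical, the log is assumed to be ordered by tx, patterns match only the entity/attribute/value positions for brevity, and the reverse-reference lookup is a linear scan where a real engine would use an index.

```python
from typing import Any, NamedTuple


class Datom(NamedTuple):
    e: int
    a: str
    v: Any
    tx: int
    added: bool  # True = asserted, False = retracted


# Hypothetical log, ordered by tx.
LOG = [
    Datom(1, "person/name", "Ada", 100, True),
    Datom(1, "person/email", "ada@example.com", 100, True),
    Datom(2, "person/name", "Grace", 100, True),
    Datom(2, "person/manager", 1, 100, True),
    Datom(1, "person/email", "ada@example.com", 101, False),
    Datom(1, "person/email", "ada@newmail.example", 101, True),
]


def as_of(log, t):
    """Facts live at transaction t: filter on tx, then apply retractions."""
    live = set()
    for d in log:
        if d.tx > t:
            continue  # the whole time-travel mechanism is this filter
        fact = (d.e, d.a, d.v)
        if d.added:
            live.add(fact)
        else:
            live.discard(fact)
    return live


def is_var(term):
    """Logic variables are strings that start with '?', as in Datalog."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, facts):
    """Yield one binding dict per fact that unifies with an (e, a, v) pattern."""
    for fact in facts:
        binding = {}
        for term, value in zip(pattern, fact):
            if is_var(term):
                if binding.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break      # constant in the pattern does not match
        else:
            yield binding


# Same pattern, two points in time.
emails_then = sorted(b["?v"] for b in match((1, "person/email", "?v"), as_of(LOG, 100)))
emails_now = sorted(b["?v"] for b in match((1, "person/email", "?v"), as_of(LOG, 101)))
# Reverse reference: who points at entity 1?
reports = sorted(b["?e"] for b in match(("?e", "person/manager", 1), as_of(LOG, 101)))
print(emails_then, emails_now, reports)
```

The pattern has the same shape as the facts it matches, with variables standing in for the unknown positions, which is why no translation layer is needed between the query and the data.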
why a Postgres extension and not yet another standalone Datomic clone
Three reasons.
First, Postgres already has the storage, the WAL, the replication, the backup, the role system, and the connection pool that any production-grade database has to have, and that Mozilla's mentat explicitly did not. Re-implementing those is years of work that buys you nothing if your users already have a Postgres operator on call.
Second, Postgres already has my other data. An application that wants the Datomic data model also wants to JOIN against the rest of the data it owns, which is in Postgres tables. Running both inside the same instance — same transaction, same backup, same monitoring — removes the two-store consistency problem that kills most polyglot deployments.
Third, pgrx makes this approachable. pg_mentat is on the order
of tens of thousands of lines of Rust, not hundreds of thousands,
because pgrx absorbs the boilerplate that hand-rolled C extensions
used to spend half their bulk on.
what survived from the original mentat
The query AST, the EDN reader, and the schema model survived almost intact. The shape of attributes and the way Datalog patterns compile to executor steps follow Mozilla's design. Where I had to diverge was the storage layer (entirely Postgres-native; no SQLite underneath like the original) and the transaction layer (Postgres transactions all the way down, no separate transaction id space).
where it goes from here
The 1.3.0 release introduced Datalog where-fns that bridge into
other PostgreSQL extensions: pg_trgm, rum, fuzzystrmatch,
pgvector,
pg_infer,
PostGIS. These are soft dependencies:
nothing pg_mentat ships requires any of them, but the where-fns
light up automatically when they are present. This is the bet I am
making about the Postgres extension family. My extensions are
designed to compose with each other and with the broader ecosystem;
none of them tries to swallow another's responsibilities.
A worked example. If you have pg_trgm installed, the where-fn
(trgm-similar? ?text "input string" 0.3) does what you would
expect, returning true when the trigram similarity exceeds the
threshold. If you have pgvector, (cosine-similar? ?embedding v 0.85) filters by cosine distance. If you have pg_infer,
(model-implies? "qwen05b" ?subject ?object 0.7) filters by a
threshold on a learned association from the model's vindex. None
of these are SQL — they are Datalog patterns, expressed against
datoms, that happen to call into other extensions through the
where-fn extension point.
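To give a feel for what a threshold such as 0.3 means, here is a rough Python approximation of trigram similarity in the style of pg_trgm, applied as a predicate over a set of variable bindings. It is a sketch under assumptions, not pg_mentat's or pg_trgm's implementation: the `trgm_similar` function and the sample bindings are hypothetical, and the tokenizer handles ASCII letters and digits only.

```python
import re


def trigrams(text: str) -> set[str]:
    """Trigram set in the style of pg_trgm: lowercase the text, split on
    non-alphanumerics, and pad each word with two leading spaces and one
    trailing space before slicing it into 3-character windows."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def trgm_similar(text: str, query: str, threshold: float) -> bool:
    """Hypothetical stand-in for a where-fn such as
    (trgm-similar? ?text "input string" 0.3): true when the similarity
    exceeds the threshold."""
    return similarity(text, query) > threshold


# Hypothetical bindings produced by earlier patterns in a query.
bindings = [
    {"?e": 1, "?text": "PostgreSQL"},
    {"?e": 2, "?text": "Postgres"},
    {"?e": 3, "?text": "SQLite"},
]
kept = [b["?e"] for b in bindings if trgm_similar(b["?text"], "postgres", 0.3)]
print(kept)
```

The predicate simply keeps or drops each binding, which is all a where-fn has to do from the query's point of view; in pg_mentat the scoring itself would come from the installed extension rather than from code like this.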
status
pg_mentat is at 1.3.0 and tested against PostgreSQL 13 through 18. Apache 2.0. The repo is at https://github.com/gburd/pg_mentat with a Codeberg mirror.
A Datomic-cloud-shaped person looks at this and says "but Datomic ships transactor / peers / storage as one box, and pg_mentat is running in a Postgres backend." That is correct. The architecture is different. What it shares with Datomic is the data model and the query language, not the deployment shape. If your application wanted Datomic-the-deployment, the right tool is still Datomic Pro or XTDB. If your application wanted Datomic-the-data-model in the database it already runs, that is what pg_mentat is for.