Medallion Architecture: the pattern that organizes your Lakehouse
How the Bronze, Silver and Gold layers turn a chaotic data lake into a reliable, auditable platform.
The medallion architecture is probably the most useful pattern I've adopted on Lakehouse projects. It solves an old problem: how do you keep a data lake organized, reliable and auditable when dozens of sources dump data of varying quality into it every day?
The idea is simple — organize data into three progressive quality layers: Bronze, Silver and Gold.
Bronze — the raw truth
The Bronze layer stores data exactly as it arrived from the source, untransformed. It's your immutable historical record. If a downstream pipeline has a bug, you can always reprocess from Bronze without going back to the original source — which may no longer exist.
Good practices for Bronze:
- Append-only ingestion, preserving the full history.
- Add ingestion metadata to every record (ingestion timestamp, source file, content hash).
- Don't enforce a rigid schema — capture first, validate later.
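To make these practices concrete, here is a minimal sketch of an append-only Bronze writer using only the Python standard library. The JSON Lines file format and the names `to_bronze_record`, `append_to_bronze`, `raw`, `ingested_at` and `sha256` are my own assumptions for illustration; a real Lakehouse would more likely write to a table format on object storage.

```python
import hashlib
import json
from datetime import datetime, timezone


def to_bronze_record(raw_line: str, source_file: str) -> dict:
    """Wrap one raw input line with ingestion metadata, leaving the payload untouched."""
    return {
        "raw": raw_line,  # stored exactly as received: no parsing, no typing
        "source_file": source_file,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(raw_line.encode("utf-8")).hexdigest(),
    }


def append_to_bronze(bronze_path: str, raw_lines: list[str], source_file: str) -> int:
    """Append records to a JSON Lines file (hypothetical storage); existing rows are never rewritten."""
    with open(bronze_path, "a", encoding="utf-8") as f:  # "a" = append-only
        for line in raw_lines:
            f.write(json.dumps(to_bronze_record(line, source_file)) + "\n")
    return len(raw_lines)
```

Note that malformed lines are stored too. Nothing is validated here, so a later bug fix in Silver can always replay the original input.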
Silver — clean and conformed
In the Silver layer, data is cleaned, deduplicated, typed and conformed to a consistent model. This is where you apply quality rules, resolve keys and join related sources.
The Silver layer is where most analytics engineers and data scientists should work. It's trustworthy enough for exploration, yet still granular.
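As a rough illustration of the cleaning, typing and deduplication described above, the sketch below turns already-parsed Bronze rows into Silver rows. The column names (`order_id`, `country`, `amount`, `order_date`, `ingested_at`) and the quality rule (non-empty key, non-negative amount) are hypothetical. It also assumes `ingested_at` is an ISO 8601 string in a single timezone, so string comparison matches time order.

```python
from datetime import date


def to_silver(bronze_rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Type, validate and deduplicate Bronze rows; return (clean, rejected)."""
    latest: dict[str, dict] = {}
    rejected: list[dict] = []
    for row in bronze_rows:
        try:
            typed = {
                "order_id": row["order_id"].strip(),
                "country": row["country"].strip().upper(),  # conform country codes
                "amount": float(row["amount"]),
                "order_date": date.fromisoformat(row["order_date"]),
                "ingested_at": row["ingested_at"],
            }
        except (KeyError, ValueError, AttributeError):
            rejected.append(row)  # keep bad rows for inspection instead of dropping them
            continue
        # Hypothetical quality rule: a key must exist and amounts cannot be negative.
        if not typed["order_id"] or typed["amount"] < 0:
            rejected.append(row)
            continue
        # Deduplicate on the business key, keeping the most recently ingested version.
        current = latest.get(typed["order_id"])
        if current is None or typed["ingested_at"] > current["ingested_at"]:
            latest[typed["order_id"]] = typed
    return list(latest.values()), rejected
```

Returning the rejected rows separately, rather than silently discarding them, is a design choice of this sketch: it keeps the quality rules auditable.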
Gold — business-ready
The Gold layer holds aggregations and dimensional models ready for consumption: fact tables, dimensions and business metrics that feed executive dashboards. It's optimized for reads and performance.
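A Gold table can be as simple as one aggregation over Silver. The sketch below computes monthly revenue per country. The metric, the function name `revenue_by_country_month` and the input columns are assumptions for illustration, not a prescribed model.

```python
from collections import defaultdict


def revenue_by_country_month(silver_rows: list[dict]) -> list[dict]:
    """Aggregate Silver order rows into one business metric row per country and month."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for row in silver_rows:
        month = row["order_date"].strftime("%Y-%m")  # order_date is a datetime.date
        totals[(row["country"], month)] += row["amount"]
    # Sorted output keeps the result deterministic for dashboards and tests.
    return [
        {"country": country, "month": month, "revenue": round(total, 2)}
        for (country, month), total in sorted(totals.items())
    ]
```

Because the function reads only typed, deduplicated Silver rows, it contains no cleaning logic at all. That is the point of the layering.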
The golden rule: each layer reads only from the layer directly before it. Silver reads from Bronze, Gold reads from Silver, and data never flows backward, so Bronze never reads from Silver. This keeps lineage clear and reprocessing predictable.
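This rule is easy to check automatically if you keep a map of which table reads from which. The sketch below is one possible check, taking the rule strictly. It assumes a hypothetical `layer.table` naming convention and a `source` prefix for external systems; `ALLOWED_UPSTREAM` and `find_lineage_violations` are illustrative names.

```python
# Each layer and the only layer it may read from (strict reading of the rule).
ALLOWED_UPSTREAM = {"bronze": "source", "silver": "bronze", "gold": "silver"}


def find_lineage_violations(reads: dict[str, list[str]]) -> list[str]:
    """Return one message per dependency that skips a layer or flows backward."""
    violations = []
    for table, upstreams in reads.items():
        layer = table.split(".", 1)[0]
        expected = ALLOWED_UPSTREAM.get(layer)  # None for unknown layers: everything is flagged
        for upstream in upstreams:
            upstream_layer = upstream.split(".", 1)[0]
            if upstream_layer != expected:
                violations.append(f"{table} reads {upstream}")
    return violations


reads = {
    "bronze.orders": ["source.erp_orders"],
    "silver.orders": ["bronze.orders"],
    "gold.revenue": ["silver.orders", "bronze.orders"],  # skips Silver
}
print(find_lineage_violations(reads))  # ['gold.revenue reads bronze.orders']
```

A check like this could run in CI, so a shortcut from Gold straight to Bronze is caught before it reaches production.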
Why it matters
On a multi-country project for a global industrial company, this separation is what let us extend the platform from Germany and Spain to Brazil, Portugal, India and China without rewriting the business logic. Each country's quirks lived in Bronze and Silver; the Gold layer delivered consistent KPIs to global management.
The medallion architecture isn't a silver bullet, but it gives you something rare in data engineering: predictability. And predictability is what lets you sleep soundly when the pipeline runs at 3am.