SchemaX
SchemaX is a modern, low-touch, interactive schema migration and management tool for data catalogs. It's Git- and CI/CD-friendly: you design schemas in the VS Code designer or define them in code, store changes as versioned operations and snapshots, and generate or apply SQL per environment. Use it in full mode (create and manage catalogs, schemas, tables, and governance) or governance-only mode (comments, tags, grants, row filters, column masks on existing objects). Unity Catalog is supported today; Hive and PostgreSQL are planned.
New here? Start with Prerequisites and Quickstart.
Only need CI/CD? Go to Git and CI/CD setup to run SchemaX in Azure DevOps or GitHub Actions (non-interactive).
SchemaX is open source (not a Databricks Labs project). When published:
- Python SDK & CLI → PyPI (pip install schemaxpy)
- VS Code extension → VS Code Marketplace (search "SchemaX")
Until then, install from source (see Quickstart and Development).
What is schema management and migration?
- Schema management is defining and maintaining the structure of your data catalog—catalogs, schemas, tables, views, and their metadata (columns, types, comments, constraints). In Unity Catalog this is the layer that decides what exists and who can access it.
- Schema migration is applying changes to that structure over time in a controlled way (e.g. add a column, add a table, change grants) and tracking those changes so they can be reviewed, versioned, and deployed per environment (dev → test → prod).
- Without it, schema changes are ad hoc, hard to audit, and risky when promoting to production. With it, changes are declarative, in Git, and applied consistently via CI/CD or manual apply.
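To make the idea of versioned operations concrete, here is a minimal sketch of how an ordered change log could be turned into per-environment SQL. The `Operation` record, its field names, and the `pending_sql` helper are assumptions made for illustration; they are not SchemaX's actual file format or API.

```python
from dataclasses import dataclass


# Hypothetical operation record; not SchemaX's real on-disk format.
@dataclass(frozen=True)
class Operation:
    version: int   # position in the change log
    kind: str      # "add_table" or "add_column" in this sketch
    target: str    # fully qualified object name
    detail: str    # column definitions for the statement


def to_sql(op: Operation) -> str:
    """Render one operation as a SQL statement (illustrative subset only)."""
    if op.kind == "add_table":
        return f"CREATE TABLE {op.target} ({op.detail})"
    if op.kind == "add_column":
        return f"ALTER TABLE {op.target} ADD COLUMN {op.detail}"
    raise ValueError(f"unsupported operation kind: {op.kind}")


def pending_sql(ops, deployed_version):
    """Return SQL for operations newer than the environment's deployed version."""
    todo = sorted(
        (op for op in ops if op.version > deployed_version),
        key=lambda op: op.version,
    )
    return [to_sql(op) for op in todo]
```

Because each environment only records the last version it received, the same change log can be replayed against dev, test, and prod, and each one gets exactly the statements it is missing.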
When to use SchemaX
- Full creation and management — Greenfield or brownfield projects where you want to design and deploy catalogs, schemas, tables, and governance from one place. Classic "schema as code" with visual design and SQL generation.
- Spark / Delta Live Tables — If tables and schemas are created by Spark or DLT, creation may be less relevant for SchemaX. Governance is still valuable: comments, tags, grants, row filters, column masks on those existing objects. Use SchemaX in governance-only mode so you version and deploy only governance DDL, not CREATE TABLE.
- Databricks Apps and Lakebase — When building applications on the Databricks platform (e.g. Databricks Apps with Streamlit, Dash, or apps that combine Lakebase with the lakehouse), the Unity Catalog layer—catalogs, schemas, tables, and governance (grants, tags, row filters)—still needs to be consistent across environments and versioned. SchemaX manages that UC layer (full or governance-only) and fits into Git and CI/CD so app deployments and schema deployments stay aligned. (SchemaX manages Unity Catalog; Lakebase's own Postgres schema is separate.)
See Environments and deployment scope for full mode vs governance-only and how to scope deployments to existing objects.
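The difference between the two modes can be pictured as a filter over the operations that a deployment is allowed to emit. The sketch below is illustrative only: the operation kind names and the mode strings are assumptions, not SchemaX's real configuration keys.

```python
# Hypothetical operation kinds; SchemaX's real operation names may differ.
GOVERNANCE_KINDS = frozenset(
    {"comment", "tag", "grant", "row_filter", "column_mask"}
)


def select_operations(operations, mode):
    """Keep only the operations a deployment mode is allowed to emit.

    operations: list of (kind, target) tuples.
    mode: "full" or "governance_only" (names assumed for this sketch).
    """
    if mode == "full":
        return list(operations)
    if mode == "governance_only":
        # Drop anything that would create or alter object structure.
        return [(kind, target) for kind, target in operations
                if kind in GOVERNANCE_KINDS]
    raise ValueError(f"unknown mode: {mode!r}")
```

In governance-only mode a `create_table` operation would be dropped, while a tag or grant on the same table would still be deployed.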
Features
Visual Schema Designer (VS Code Extension)
- Intuitive interface for schema modeling
- Provider-based: Unity Catalog (now), Hive/PostgreSQL (coming soon)
- Dependency-aware creation order for views and materialized views (automatic extraction from SQL; optional manual override in the Designer)
- Data governance: constraints, tags, grants, row filters, column masks
- External tables, partitioning, liquid clustering
- Snapshot-based versioning with semantic versions
- Real-time SQL generation
- Bulk operations: apply the same grant or tag to all objects in a catalog or schema scope from the detail panel (one action for many tables/schemas)
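As a rough picture of what a bulk grant expands to, the sketch below generates one GRANT statement per table inside a schema scope. The `bulk_grant_sql` function and its parameters are hypothetical, and the SQL SchemaX actually generates may differ; only the general `GRANT ... ON TABLE ... TO` statement shape is taken from Unity Catalog.

```python
def bulk_grant_sql(tables, scope, privilege, principal):
    """Build one GRANT per table whose qualified name falls under `scope`.

    tables: fully qualified names such as "main.sales.orders".
    scope: a catalog ("main") or schema ("main.sales") prefix.
    """
    prefix = scope + "."  # trailing dot avoids matching "main.salesx"
    return [
        f"GRANT {privilege} ON TABLE {name} TO `{principal}`"
        for name in tables
        if name.startswith(prefix)
    ]
```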
Python SDK & CLI
- Command-line tools for automation and CI/CD
- Python API for custom workflows
- Deployment tracking with database-backed audit trail
- Automatic and manual rollback
- Snapshot lifecycle: create, validate, rebase
- Schema validation, circular dependency detection, and dependency-aware view/MV ordering
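Dependency-aware ordering and circular dependency detection are both instances of topological sorting. The following sketch uses Python's standard `graphlib` module to show the idea; the `creation_order` function and the dependency map are hypothetical and do not reflect how SchemaX implements this internally.

```python
from graphlib import CycleError, TopologicalSorter


def creation_order(dependencies):
    """Order objects so that each one is created after what it reads from.

    dependencies: maps an object name to the set of objects it depends on.
    Raises ValueError if the dependencies contain a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # graphlib puts the offending cycle in the second element of args.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
```

For example, with `{"v_summary": {"v_orders"}, "v_orders": {"orders"}}` the table `orders` comes before `v_orders`, which comes before `v_summary`. Two views that select from each other raise an error instead of producing an order.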
Quick links
For users — Get started and use SchemaX:
- Prerequisites — What you need before starting
- Quickstart — Get started with the extension and CLI
- Setup — Install extension, CLI, open a project
- Authentication — How the CLI authenticates to Databricks (same locally and in CI/CD)
- Environments and deployment scope — Configure environments, governance-only mode, existing objects
- Unity Catalog grants — Manage GRANT/REVOKE on catalogs, schemas, tables, views
- Only need CI/CD? → Git and CI/CD setup — Run SchemaX in Azure DevOps or GitHub Actions (non-interactive)
- Workflows — Greenfield, brownfield, apply and rollback
- CLI Reference — Commands and options
- Architecture — Design and patterns
- FAQ — Common questions
For contributors — Develop, test, and contribute:
- Development — Build, run, and extend SchemaX
- Testing — How to run and write tests
- Provider contract — Implement a new catalog provider
- Contributing — Code style, PR process, commit signing
Supported providers
| Provider | Status | Hierarchy |
|---|---|---|
| Unity Catalog | Available | Catalog → Schema → Table/View/Volume/Function/Materialized View |
| Hive Metastore | Coming soon | Database → Table |
| PostgreSQL | Coming soon | Database → Schema → Table |