SchemaX Quickstart Guide

Complete guide to get started with SchemaX — both the VS Code extension and Python SDK/CLI. See Prerequisites for what you need before starting.

Simple path (5 steps)

  1. Open a project — Open a folder in VS Code (or Extension Development Host) that will hold your SchemaX project.
  2. Add a table — Open the designer (SchemaX: Open Designer), add a catalog and schema if needed, then add a table and columns (see Your first schema below).
  3. Generate SQL — Use SchemaX: Generate SQL Migration to export DDL to .schemax/migrations/ (or run schemax sql in the project directory).
  4. Apply (or run in pipeline) — Run schemax apply --target <env> to execute the SQL against Databricks, or run the same in CI/CD (see Git and CI/CD setup).
  5. Create a snapshot (optional) — Use SchemaX: Create Snapshot or schemax snapshot create to version your state.

SchemaX can run in full mode (create and manage catalogs, schemas, tables, and governance) or governance-only mode (comments, tags, grants, row filters, column masks on existing objects). See Environments and deployment scope for how to configure this.

Providers

SchemaX uses a provider-based architecture to support different data catalog systems.

Supported Providers

| Provider | Status | When to Use |
|---|---|---|
| Unity Catalog | ✅ Available (v1.0) | Databricks Unity Catalog projects |
| Hive Metastore | 🔜 Coming Q1 2026 | Apache Hive / legacy Databricks |
| PostgreSQL | 🔜 Coming Q1 2026 | PostgreSQL with Lakebase extensions |

Default Provider

Unity Catalog is the default provider for all new projects. When you create a new SchemaX project (by opening the designer for the first time), it automatically initializes with Unity Catalog.

Provider Selection (Future)

In v0.3.0+, you'll be able to select a provider when creating a new project:

# CLI (future)
schemax init --provider unity # Unity Catalog (default)
schemax init --provider hive # Hive Metastore
schemax init --provider postgres # PostgreSQL

# For now, all projects use Unity Catalog
schemax init

For this quickstart, we'll use Unity Catalog (the current provider).


VS Code Extension

Installation & Launch

If you installed the extension from the VS Code Marketplace or a .vsix file:

  1. Open VS Code
  2. File → Open Folder — open or create your project folder (e.g., ~/my-schema-project)
  3. Press Cmd+Shift+P, then run SchemaX: Open Designer

That's it — the extension is already loaded.


If you're running SchemaX from source (contributing to SchemaX itself):

cd /path/to/schemax-vscode
code .

Press F5 (or Fn+F5 on Mac) to launch the Extension Development Host, then in that new window open your project folder.


Using the Designer

Step 1: Launch SchemaX Designer

  1. Press Cmd+Shift+P (Mac) or Ctrl+Shift+P (Windows/Linux)
  2. Type: SchemaX: Open Designer
  3. Press Enter

The visual designer opens!

Step 2: Create Your First Catalog

  1. Click "Add Catalog" button
  2. Enter name: main
  3. Click OK

Step 3: Add a Schema

  1. Select the main catalog in the tree
  2. Click "Add Schema" button
  3. Enter name: sales
  4. Click OK

Step 4: Add a Table

  1. Select the sales schema
  2. Click "Add Table" button
  3. Enter name: customers
  4. Select format: delta
  5. Click OK

Step 5: Add Columns

  1. Select the customers table
  2. Click "Add Column" button
  3. Fill in details:
    • Name: customer_id
    • Type: BIGINT
    • Nullable: No
    • Comment: Primary key
  4. Add more columns as needed

Step 6: Add a View (Optional)

  1. Select the sales schema
  2. Click "+" button → Choose "View"
  3. Enter SQL definition:
    SELECT customer_id, COUNT(*) as order_count
    FROM customers
    GROUP BY customer_id
  4. View name will be auto-extracted
  5. Click OK

Note: SchemaX automatically:

  • Extracts dependencies from your view SQL
  • Qualifies table references with fully-qualified names (FQN)
  • Orders SQL generation by dependency: tables are created before views, and both tables and views are created before materialized views
  • Detects circular dependencies between views

You can edit dependencies manually in the view or materialized view detail panel (Dependencies section → Edit) if you need to fix extraction gaps or force creation order.
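
The ordering and cycle detection described above can be pictured as a topological sort over the dependency graph. The sketch below is not SchemaX's implementation. It uses Python's standard graphlib module, and the creation_order helper and the object names are hypothetical, chosen only to illustrate the idea.

```python
from graphlib import CycleError, TopologicalSorter


def creation_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return object names ordered so that every dependency comes before its dependents.

    `dependencies` maps each object to the set of objects it reads from.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # graphlib reports the offending cycle as the second element of args.
        cycle = " -> ".join(exc.args[1])
        raise ValueError(f"circular dependency: {cycle}") from exc


# Hypothetical objects: a table, a view on the table, and a view on that view.
deps = {
    "main.sales.customers": set(),
    "main.sales.customer_orders": {"main.sales.customers"},
    "main.sales.top_customers": {"main.sales.customer_orders"},
}
print(creation_order(deps))
```

Two views that select from each other would raise ValueError here, which is the kind of situation the manual dependency editor is meant to help untangle.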

Step 7: Create a Snapshot

  1. Press Cmd+Shift+P
  2. Type: SchemaX: Create Snapshot
  3. Enter name: v0.1.0
  4. Enter comment: Initial schema

Your schema is now versioned!

Bulk operations

To apply the same grant or tag to all objects in a catalog or schema, select the catalog or schema in the tree and click Bulk operations in the detail panel. Choose Add grant (principal + privileges) or Add tag, then Apply. See Unity Catalog grants — Bulk grants and tags.

Checking the Files

ls -la .schemax/
cat .schemax/project.json
cat .schemax/changelog.json
ls -la .schemax/snapshots/

Available Commands

  • SchemaX: Open Designer - Launch visual designer
  • SchemaX: Create Snapshot - Version your schema
  • SchemaX: Generate SQL Migration - Export to SQL
  • SchemaX: Show Last Emitted Changes - View operations

Python SDK & CLI

Installation

pip install schemaxpy

The SDK is published on PyPI. For a development install (contributing to SchemaX itself), see the Development guide.

Verify Installation

schemax --version

CLI Commands

Validate Schema

cd your-project
schemax validate

Output:

Validating project files...
✓ project.json (version 4)
✓ changelog.json (5 operations)

Project: my_project
Catalogs: 1
Schemas: 1
Tables: 2

✓ Schema files are valid

Generate SQL

# Output to stdout
schemax sql

# Save to file
schemax sql --output migration.sql

# View the file
cat migration.sql

Apply to Databricks

# Preview (dry-run)
schemax apply --target dev --profile my-profile --warehouse-id abc123 --dry-run

# Apply (tracks deployment automatically)
schemax apply --target dev --profile my-profile --warehouse-id abc123

# CI/CD (non-interactive)
schemax apply --target prod --profile my-profile --warehouse-id abc123 --no-interaction

Deployment tracking is built into schemax apply — no separate record step needed.
See Git and CI/CD setup for pipeline examples.
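
If you drive the CLI from a Python script, one option is to build the argument list in a small helper and run a dry-run before the real apply. This is a minimal sketch: it uses only the flags shown above, the helper names are hypothetical, and it assumes schemax apply exits with a non-zero status on failure.

```python
import shlex
import subprocess


def build_apply_command(target, profile, warehouse_id, *, dry_run=False, no_interaction=False):
    """Build the argv list for `schemax apply` from the flags shown in this guide."""
    cmd = [
        "schemax", "apply",
        "--target", target,
        "--profile", profile,
        "--warehouse-id", warehouse_id,
    ]
    if dry_run:
        cmd.append("--dry-run")
    if no_interaction:
        cmd.append("--no-interaction")
    return cmd


def preview_then_apply(target, profile, warehouse_id):
    """Run a dry-run first, then the real apply only if the dry-run succeeded.

    Assumption: a failed run exits non-zero, so check=True raises CalledProcessError.
    """
    subprocess.run(build_apply_command(target, profile, warehouse_id, dry_run=True), check=True)
    subprocess.run(build_apply_command(target, profile, warehouse_id, no_interaction=True), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Databricks access.
    print(shlex.join(build_apply_command("dev", "my-profile", "abc123", dry_run=True)))
```

Passing the arguments as a list avoids shell quoting problems with profile names or warehouse IDs.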

Python API

Create a script to use SchemaX programmatically:

#!/usr/bin/env python3
from pathlib import Path
from schemax.core.storage import load_current_state, read_project
from schemax.providers.base.operations import Operation

# Load schema with provider
workspace = Path.cwd()
state, changelog, provider = load_current_state(workspace)

# Show summary
project = read_project(workspace)
print(f"Project: {project['name']}")
print(f"Provider: {provider.info.name}")
if "catalogs" in state:
    print(f"Catalogs: {len(state['catalogs'])}")
print(f"Pending operations: {len(changelog['ops'])}")

# Generate SQL only when the changelog contains pending operations
if changelog["ops"]:
    operations = [Operation(**op) for op in changelog["ops"]]
    generator = provider.get_sql_generator(state)
    sql = generator.generate_sql(operations)
    Path("migration.sql").write_text(sql)
    print("✓ SQL generated: migration.sql")

Your First Schema

Let's create a complete example from scratch.

Step 1: Create Project Directory

mkdir ~/my-first-schema
cd ~/my-first-schema

Step 2: Open in VS Code

code .

Step 3: Launch SchemaX (in Extension Development Host)

  1. Press F5 in the main VS Code window
  2. In the new window, open the ~/my-first-schema folder
  3. Press Cmd+Shift+P, then run SchemaX: Open Designer

Step 4: Build Schema

Create Catalog: ecommerce

Create Schema: production

Create Tables:

Table 1: customers

  • Columns:
    • id (BIGINT, NOT NULL, Primary Key)
    • email (STRING, NOT NULL, Unique)
    • name (STRING, NOT NULL)
    • created_at (TIMESTAMP, NOT NULL)
  • Properties:
    • delta.enableChangeDataFeed = true
  • Constraints:
    • PRIMARY KEY (id)

Table 2: orders

  • Columns:
    • id (BIGINT, NOT NULL, Primary Key)
    • customer_id (BIGINT, NOT NULL, Foreign Key)
    • amount (DECIMAL(10,2), NOT NULL)
    • status (STRING, NOT NULL)
    • created_at (TIMESTAMP, NOT NULL)
  • Constraints:
    • PRIMARY KEY (id)
    • FOREIGN KEY (customer_id) REFERENCES customers(id)

Step 5: Create Snapshot

Press Cmd+Shift+P, then run SchemaX: Create Snapshot

  • Name: v1.0.0
  • Comment: Initial e-commerce schema

Step 6: Verify Files

tree .schemax/

Output:

.schemax/
├── changelog.json
├── project.json
└── snapshots/
    └── v1.0.0.json

Generating SQL

From VS Code

  1. Make some changes (add columns, tables, etc.)
  2. Press Cmd+Shift+P
  3. Type: SchemaX: Generate SQL Migration
  4. Review the SQL file that opens

The SQL is saved to:

.schemax/migrations/migration_YYYY-MM-DD_HH-MM-SS.sql
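
Because the timestamp in the file name is zero-padded, sorting the names alphabetically also sorts them chronologically. The sketch below relies on that to find the most recent migration. The helper names are hypothetical, and whether the extension stamps files in local time or UTC is not stated here, so treat the formatting helper as an illustration of the naming pattern only.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def migration_name(when: datetime) -> str:
    """Format a file name following the migration_YYYY-MM-DD_HH-MM-SS.sql pattern."""
    return when.strftime("migration_%Y-%m-%d_%H-%M-%S.sql")


def latest_migration(workspace: Path) -> Optional[Path]:
    """Return the newest migration file in the workspace, or None if there is none."""
    migrations = workspace / ".schemax" / "migrations"
    if not migrations.is_dir():
        return None
    # Zero-padded timestamps mean lexicographic order equals chronological order.
    files = sorted(migrations.glob("migration_*.sql"))
    return files[-1] if files else None


if __name__ == "__main__":
    print(migration_name(datetime(2025, 10, 13, 12, 0, 0)))
    print(latest_migration(Path.cwd()))
```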

From CLI

# Generate SQL from changelog
schemax sql --output deploy.sql

# Review
cat deploy.sql

Example Generated SQL

The schemax sql command produces idempotent DDL. Columns are included inline in the CREATE TABLE statement, so no separate ADD COLUMN step is needed. Statements are separated by semicolons, so the file can be run in any SQL tool or applied via schemax apply.

-- Op: op_abc123 (2025-10-13T12:00:00Z)
-- Type: add_catalog
CREATE CATALOG IF NOT EXISTS `ecommerce`;

-- Op: op_def456 (2025-10-13T12:01:00Z)
-- Type: add_schema
CREATE SCHEMA IF NOT EXISTS `ecommerce`.`production`;

-- Op: op_ghi789 (2025-10-13T12:02:00Z) | op_jkl012 (2025-10-13T12:03:00Z) | ...
-- Type: add_table + add_column (batched)
CREATE TABLE IF NOT EXISTS `ecommerce`.`production`.`customers` (
  `id` BIGINT NOT NULL COMMENT 'Primary key',
  `email` STRING NOT NULL,
  `name` STRING NOT NULL,
  `created_at` TIMESTAMP NOT NULL
) USING DELTA;
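
The Op and Type comments make it possible to map each statement back to the operations that produced it. The following sketch parses a file in the format shown above. It is an illustration, not part of the SchemaX SDK: the trace_statements helper is hypothetical, the op ID pattern is inferred from the sample IDs, and the statement splitting is naive (it treats any line ending in a semicolon as the end of a statement).

```python
import re

# Matches IDs shaped like the sample above (op_abc123); the real ID format may differ.
OP_ID = re.compile(r"\bop_\w+")


def trace_statements(sql: str) -> list[dict]:
    """Pair each SQL statement with the op IDs and type named in its leading comments."""
    results = []
    op_ids, op_type, body = [], None, []
    for line in sql.splitlines():
        text = line.strip()
        if text.startswith("-- Op:"):
            op_ids.extend(OP_ID.findall(text))
        elif text.startswith("-- Type:"):
            op_type = text[len("-- Type:"):].strip()
        elif text and not text.startswith("--"):
            body.append(text)
            # Naive statement boundary: a line that ends with a semicolon.
            if text.endswith(";"):
                results.append({"ops": op_ids, "type": op_type, "sql": " ".join(body)})
                op_ids, op_type, body = [], None, []
    return results


SAMPLE = """\
-- Op: op_abc123 (2025-10-13T12:00:00Z)
-- Type: add_catalog
CREATE CATALOG IF NOT EXISTS `ecommerce`;

-- Op: op_ghi789 (2025-10-13T12:02:00Z) | op_jkl012 (2025-10-13T12:03:00Z) | ...
-- Type: add_table + add_column (batched)
CREATE TABLE IF NOT EXISTS `ecommerce`.`production`.`customers` (
  `id` BIGINT NOT NULL COMMENT 'Primary key',
  `email` STRING NOT NULL
) USING DELTA;
"""

for entry in trace_statements(SAMPLE):
    print(entry["type"], entry["ops"])
```

A semicolon inside a string literal or comment at the end of a line would confuse this splitter, so it is suited to inspection rather than execution.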

Apply to Databricks

Use schemax apply to execute the SQL against Databricks — it handles authentication, executes each statement individually, and records the deployment automatically:

schemax apply \
  --target dev \
  --profile my-databricks-profile \
  --warehouse-id <warehouse-id>

See Git and CI/CD setup for pipeline examples, or continue below for a basic GitHub Actions template.


CI/CD Integration

schemax apply with --no-interaction is the recommended way to run SchemaX in pipelines. It handles SQL generation (or accepts a pre-generated file), executes statements one-by-one against Databricks, and records the deployment — no separate deploy step needed.

For detailed pipeline setup including Azure DevOps, see Git and CI/CD setup.

GitHub Actions (quick example)

name: Deploy Schema

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install SchemaX
        run: pip install schemaxpy

      - name: Validate schema
        run: schemax validate

      - name: Apply to dev
        env:
          DATABRICKS_HOST: ${{ secrets.DATABRICKS_HOST }}
          DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
        run: |
          schemax apply \
            --target dev \
            --profile DEFAULT \
            --warehouse-id ${{ secrets.WAREHOUSE_ID }} \
            --no-interaction

GitLab CI (quick example)

deploy-dev:
  image: python:3.11
  script:
    - pip install schemaxpy
    - schemax validate
    - schemax apply
      --target dev
      --profile DEFAULT
      --warehouse-id $WAREHOUSE_ID
      --no-interaction
  variables:
    DATABRICKS_HOST: $DATABRICKS_HOST
    DATABRICKS_TOKEN: $DATABRICKS_TOKEN
  only:
    - main

Troubleshooting

VS Code Extension

Problem: Extension commands not appearing

Solution:

  1. Make sure you have the SchemaX extension installed (check the Extensions panel)
  2. Open a folder in VS Code (the extension activates only when a workspace is open)
  3. Check the "SchemaX" output channel (View → Output → SchemaX) for errors

If running from source (development):

  1. Make sure you pressed F5 in the schemax-vscode repo window
  2. Look for "Extension Development Host" window title
  3. Open a folder in the Extension Development Host window

Problem: F5 doesn't work (development only)

Solution:

# Make sure you're in the right directory
cd /path/to/schemax-vscode
code .

# Wait for VS Code to fully load, then press F5 or Run → Start Debugging

Problem: Webview doesn't open

Solution:

  1. Check "SchemaX" output channel (View → Output → SchemaX)
  2. Look for build errors
  3. Rebuild: cd packages/vscode-extension && npm run build

Python CLI

Problem: schemax command not found

Solution:

# Check if installed
pip list | grep schemax

# Reinstall
pip install --upgrade schemaxpy

# Verify
which schemax
schemax --version

Problem: Import errors

Solution:

pip install --upgrade schemaxpy

Problem: Validation fails

Solution:

# Check file structure
ls .schemax/

# Validate JSON
python -m json.tool .schemax/project.json
python -m json.tool .schemax/changelog.json

# Check permissions
ls -la .schemax/
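
The same checks can be scripted, which is handy when a validation failure needs a precise location. This is a minimal sketch using only the standard library. The check_schemax_files helper is hypothetical, and it only confirms that the two files exist, are readable, and parse as JSON. It does not replicate the checks schemax validate performs.

```python
import json
from pathlib import Path


def check_schemax_files(workspace: Path) -> dict[str, str]:
    """Return {file name: problem} for project files that are missing, unreadable, or not valid JSON."""
    problems = {}
    for name in ("project.json", "changelog.json"):
        path = workspace / ".schemax" / name
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            problems[name] = "missing"
        except PermissionError:
            problems[name] = "not readable (check permissions)"
        except json.JSONDecodeError as exc:
            problems[name] = f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return problems


if __name__ == "__main__":
    issues = check_schemax_files(Path.cwd())
    print(issues or "project.json and changelog.json parse cleanly")
```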

SQL Generation

Problem: No SQL generated

Solution:

  • Make sure there are operations in the changelog
  • Check: cat .schemax/changelog.json
  • Create some changes in the designer first

Problem: SQL has errors

Solution:

  • Review the generated SQL
  • Check operation IDs in comments to trace back
  • Verify table/column names in the visual designer

Next Steps

  1. Explore Examples: Check examples/basic-schema/
  2. Environments & scope: See Environments and deployment scope for governance-only mode and existing catalogs.
  3. Grants: See Unity Catalog grants for managing GRANT/REVOKE on catalogs, schemas, tables, and views.
  4. Read Architecture: See Architecture.
  5. Project Lifecycle & Workflows: See Workflows for single/multi-dev, greenfield/brownfield, and rollback timelines.
  6. Set Up CI/CD: Use templates in examples/github-actions/
  7. Join Community: GitHub Discussions

Quick Reference

VS Code Commands

| Command | What It Does |
|---|---|
| SchemaX: Open Designer | Open visual designer |
| SchemaX: Create Snapshot | Version your schema |
| SchemaX: Generate SQL Migration | Export to SQL file |
| SchemaX: Show Last Emitted Changes | View pending operations |

CLI Commands

| Command | What It Does |
|---|---|
| schemax validate | Check schema files |
| schemax sql | Generate SQL migration file |
| schemax apply | Execute SQL against Databricks (with tracking) |
| schemax rollback | Rollback a deployment |
| schemax snapshot create | Create a versioned snapshot |
| schemax diff | Compare two snapshot versions |

File Structure

.schemax/
├── project.json       # Metadata & configuration
├── changelog.json     # Pending operations
├── snapshots/         # Version snapshots
│   └── v*.json
└── migrations/        # Generated SQL
    └── migration_*.sql

You're all set! Start building your schemas! 🚀