Core Builder#
The CreateBuilder provides a fluent API for generating DDL statements. It works in two stages:

1. Build: chain methods to construct an abstract syntax tree (AST)
2. Generate: call .sql() to translate the AST into dialect-specific SQL
# Stage 1: Build the AST
builder = (
create("table")
.name("users")
.column("id", "INT", pk=True, not_null=True)
.column("name", "VARCHAR(100)")
)
# Stage 2: Generate SQL for a specific dialect
sql = builder.sql(dialect="postgres")
# Output: CREATE TABLE users (id INT PRIMARY KEY NOT NULL, name VARCHAR(100))
Method Classification#
Methods are classified into two categories:
- Universal: work with any dialect
- Dialect-specific: only work with certain dialects; using them with an unsupported dialect raises an UnsupportedError at generation time
Methods Reference#
Universal Methods#
These methods work with any SQL dialect and never raise UnsupportedError.
name()#
Set the table or view name. This is the only required method — without it,
to_ast() raises ASTBuildError.
.name("schema.table_name") # Fully qualified
.name("users") # Simple name
column()#
Add a column definition with optional constraints.
.column("name", "VARCHAR(100)")
.column("id", "INT", pk=True, not_null=True)
.column("price", "DECIMAL(10,2)", default=0)
Parameters:

- name (str): Column name
- dtype (str): Data type (any string; SQLGlot translates it per dialect)
- not_null (bool, optional): Add NOT NULL constraint
- pk (bool, optional): Add PRIMARY KEY constraint (sets not_null=True)
- unique (bool, optional): Add UNIQUE constraint
- default (optional): Default value (int, float, str, bool)
columns()#
Add multiple columns at once. Equivalent to calling .column() multiple times.
.columns(
("id", "INT", "pk"),
("name", "VARCHAR(100)"),
("created_at", "TIMESTAMP"),
)
primary_key()#
Add a table-level PRIMARY KEY constraint. Differs from pk=True in .column()
because it creates a standalone constraint, not a column-level one.
.primary_key("user_id", "tenant_id") # Composite key
unique_key()#
Add a table-level UNIQUE constraint.
.unique_key("email")
.unique_key("phone", "country_code") # Composite unique
Parameters:

- *cols (str): Column names for the unique constraint
- name (str, optional): Constraint name
.unique_key("email", name="uq_users_email") # Named constraint
foreign_key()#
Add a table-level FOREIGN KEY constraint.
.foreign_key("user_id", references=("users", ("id",)))
.foreign_key(
"order_id", "product_id",
references=("order_products", ("id", "product_id"))
)
Parameters:

- *cols (str): Column names on the local table
- references (tuple): Tuple of (referenced_table, (col1, col2, ...))
- on_delete (str, optional): ON DELETE action ("CASCADE", "SET NULL", etc.)
- on_update (str, optional): ON UPDATE action
- name (str, optional): Constraint name
.foreign_key(
"user_id",
references=("users", ("id",)),
on_delete="CASCADE",
name="fk_orders_user"
)
Note
On Spark/Delta tables, foreign key constraints are not natively supported by the database engine and are not emitted in the output.
check()#
Add a table-level CHECK constraint.
.check("price > 0")
.check("salary > 0 AND salary < 1000000")
Parameters:

- condition (str): SQL condition expression
- name (str, optional): Constraint name
.check("price > 0", name="chk_positive_price")
Note
On Spark/Delta tables (created with .using("delta")), CHECK constraints
are stored as delta.constraints.{name} TBLPROPERTIES instead of SQL
constraints, since Spark does not support SQL CHECK constraints natively.
if_not_exists()#
Add IF NOT EXISTS clause to the CREATE statement.
.if_not_exists()
temporary()#
Mark the table as TEMPORARY. The generated SQL varies by dialect:
- Postgres: CREATE TEMPORARY TABLE
- SQLite: CREATE TEMPORARY TABLE
- Spark: CREATE TEMPORARY VIEW
- Others: raises UnsupportedError
.temporary()
comment()#
Add a table comment. Not supported by SQLite (silently ignored).
.comment("User authentication and profile data")
as_select()#
Set the SELECT clause for CTAS (Create Table As Select) or view definitions. Requires a SQLGlot expression, not a raw string.
from sqlglot import exp
.as_select(exp.select("id", "name").from_("source_table"))
# For views:
.as_select(exp.select("*").from_("users").where("active = true"))
Dialect-Specific Methods#
These methods only work with certain dialects. Using them with unsupported
dialects raises sqlglot.errors.UnsupportedError at generation time.
using()#
Set the USING clause for file-based tables. Typically used with .partitioned_by()
and .location().
Supported dialects: Spark, Hive
.using("delta") # Delta Lake format
.using("parquet") # Parquet files
.using("orc") # ORC files
partitioned_by()#
Add PARTITIONED BY clause. In Spark/Hive, partition columns must also be defined
in .columns(). In BigQuery, they are declared only here.
Supported dialects: BigQuery, Spark, Hive
# Spark/Hive: partition columns must be in .columns() too
.column("event_date", "DATE")
.partitioned_by("event_date")
# BigQuery: only here (not in .columns())
.partitioned_by("created_at")
location()#
Set the LOCATION path for external tables.
Supported dialects: Spark, Hive
.location("s3://warehouse/db/table/")
.location("/mnt/data/table")
stored_as()#
Set the STORED AS clause for Hive tables.
Supported dialects: Hive
.stored_as("PARQUET")
.stored_as("ORC")
.stored_as("TEXTFILE")
row_format()#
Set the ROW FORMAT clause for Hive tables.
Supported dialects: Hive
.row_format("DELIMITED")
.row_format("SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'")
tblproperties()#
Add TBLPROPERTIES key-value pairs. Used for table metadata, SerDe properties, Delta Lake features, etc.
Supported dialects: Spark, Hive
.tblproperties({"delta.enableChangeDataFeed": "true"})
.tblproperties({"format": "csv", "delimiter": ","})
Error Handling#
When using dialect-specific methods with unsupported dialects, SQLGlot raises
UnsupportedError at generation time (when .sql() is called), not at
build time.
# This works fine — we only build the AST
builder = (
create("table")
.name("events")
.column("id", "INT")
.partitioned_by("date") # Works in memory
)
# This raises UnsupportedError
sql = builder.sql(dialect="postgres")
# UnsupportedError: Partitioning is not supported for postgres
Best practices:

- Use try/except around .sql() if targeting multiple dialects
- Check the method classification table before using dialect-specific features
- Refer to docs/dialects.rst for a complete feature matrix
Output Methods#
to_ast()#
Return the SQLGlot exp.Create AST. Use this for debugging or further
manipulation with SQLGlot APIs.
ast = builder.to_ast()
# Returns: exp.Create instance
sql()#
Generate the SQL string with optional dialect-specific translation.
sql = builder.sql() # Dialect-agnostic SQL
sql = builder.sql(dialect="postgres") # PostgreSQL-specific
sql = builder.sql(dialect="spark", pretty=True) # Pretty-printed
Parameters:

- dialect (str, optional): SQL dialect (default: None = generic)
- pretty (bool, optional): Enable pretty printing
- indent (int, optional): Spaces per indent level (default: 2)
- pad (int, optional): Alignment padding (default: 2)
- max_text_width (int, optional): Max line width before wrapping (default: 80)
DDL Inspection#
The .build() method returns an immutable DDL object that exposes all
defined properties for inspection. This is useful for generating metadata,
validating schemas, or integrating with other tools.
ddl = (
create("table")
.name("users")
.column("id", "INT", pk=True, not_null=True)
.column("name", "VARCHAR(100)")
.column("created_at", "TIMESTAMP")
.partitioned_by("created_at")
.location("s3://warehouse/users/")
.using("delta")
.tblproperties({"delta.autoOptimize": "true"})
.build()
)
# Inspect properties
ddl.table_name # → "users"
ddl.kind # → "TABLE"
ddl.columns # → (ColumnDef(...), ColumnDef(...), ...)
ddl.primary_keys # → ("id",)
ddl.partition_cols # → ("created_at",)
ddl.location # → "s3://warehouse/users/"
ddl.file_format # → "DELTA"
ddl.tblproperties # → {"delta.autoOptimize": "true"}
# .sql() and .to_ast() are also available on DDL
ddl.sql(dialect="spark")
ddl.to_ast()
The DDL object also exposes:

- if_not_exists: Whether IF NOT EXISTS was set
- temporary: Whether TEMPORARY was set
- comment: Table comment
- unique_keys: Tuple of UniqueDef objects
- foreign_keys: Tuple of ForeignKeyDef objects
- checks: Tuple of CheckDef objects
ddl = (
create("table")
.name("users")
.column("id", "INT")
.column("email", "VARCHAR(100)")
.unique_key("email", name="uq_users_email")
.build()
)
ddl.unique_keys # → (UniqueDef(columns=('email',), name='uq_users_email'),)
Builder Inspection Properties#
The builder itself also exposes inspection properties for convenience:
builder = (
create("table")
.name("users")
.column("id", "INT", pk=True)
.partitioned_by("id")
)
builder.table_name # → "users"
builder.columns_defs # → [ColumnDef(...)]
builder.primary_keys # → ("id",)
builder.partition_columns # → ("id",)
builder.unique_keys # → ()
ColumnDef#
The ColumnDef NamedTuple represents a column definition:
| Attribute | Description |
|---|---|
| name | Column name (str) |
| dtype | Data type string (str) |
| not_null | Whether NOT NULL constraint is set (bool) |
| pk | Whether PRIMARY KEY constraint is set (bool) |
| unique | Whether UNIQUE constraint is set (bool) |
| default | Default value (Any) |
UniqueDef#
The UniqueDef dataclass represents a unique constraint definition:
| Attribute | Description |
|---|---|
| columns | Tuple of column names (tuple[str, …]) |
| name | Constraint name, or None if unnamed (str or None) |
ForeignKeyDef#
The ForeignKeyDef dataclass represents a foreign key constraint definition:
| Attribute | Description |
|---|---|
| columns | Local column names (tuple[str, …]) |
| | Referenced table name (str) |
| | Referenced column names (tuple[str, …]) |
| on_delete | ON DELETE action, or None (str or None) |
| on_update | ON UPDATE action, or None (str or None) |
| name | Constraint name, or None if unnamed (str or None) |
CheckDef#
The CheckDef dataclass represents a CHECK constraint definition:
| Attribute | Description |
|---|---|
| condition | SQL condition expression (str) |
| name | Constraint name, or None if unnamed (str or None) |
IndexBuilder and create_index()#
The IndexBuilder class provides a fluent API for generating standalone
CREATE INDEX statements. Use create_index() to create a new builder.
from ddlglot import create_index
sql = (
create_index("idx_users_email")
.on("users", "email")
.unique()
.sql(dialect="postgres")
)
# Output: CREATE UNIQUE INDEX idx_users_email ON users(email)
Method Reference#
| Method | Description |
|---|---|
| on() | Set table and column names (required) |
| unique() | Mark as UNIQUE index |
| using() | Set USING type (e.g., "btree", "hash") |
| where() | Add WHERE clause (partial index) |
| include() | Add INCLUDE column (covering index) |
| comment() | Add index comment |
| sql() | Generate SQL string |
| to_ast() | Return SQLGlot AST |
| build() | Return IndexDef |
IndexDef#
The IndexDef dataclass represents an index definition:
| Attribute | Description |
|---|---|
| name | Index name (str) |
| table | Table name, or None (str or None) |
| columns | Column names (tuple[str, …]) |
| unique | Whether UNIQUE (bool) |
| using | USING type, or None (str or None) |
| where | WHERE condition, or None (str or None) |
| include | INCLUDE columns (tuple[str, …]) |
| comment | Comment text, or None (str or None) |
IndexDef has a .build() method that returns an IndexBuilder,
enabling round-trip inspection and modification:
index_def = create_index("idx").on("users", "email").build()
rebuilt = index_def.build() # Returns IndexBuilder
Supported Dialects#
ddlglot supports all SQLGlot dialects:
- postgres / postgresql
- spark / sparksql
- hive
- bigquery
- duckdb
- sqlite
- mysql
- snowflake
- redshift
- and many more
SQLGlot automatically handles dialect-specific syntax translation at generation time.