Architecture¶
Design Objectives¶
sqlalchemy-cubrid is designed to provide a robust, modern interface between SQLAlchemy and the CUBRID database. Its core design goals include:
- Full SQLAlchemy 2.0 dialect implementation
- Dual-driver support (C-extension CUBRIDdb + pure Python pycubrid)
- Schema reflection (tables, columns, constraints, indexes, comments)
- Custom DML extensions (ON DUPLICATE KEY UPDATE, MERGE, REPLACE)
- Alembic migration support
- PEP 561 typed
High-Level Flow¶
Phase 1: Engine Creation & Connection¶
This phase covers how SQLAlchemy discovers the CUBRID dialect and establishes a physical connection to the CUBRID Server.
sequenceDiagram
participant App
participant SA as SQLAlchemy
participant Registry as Dialect Registry
participant Dialect as CubridDialect
participant Driver as pycubrid / CUBRIDdb
participant DB as CUBRID Server
App->>SA: create_engine("cubrid+pycubrid://dba@host:33000/testdb")
rect rgb(230, 245, 255)
note over SA, Registry: Phase 1 — Dialect Discovery
SA->>Registry: Lookup "cubrid.pycubrid" entry point
Registry-->>SA: PyCubridDialect class
SA->>Dialect: Instantiate dialect
Dialect->>Driver: import_dbapi() → import pycubrid
end
App->>SA: engine.connect()
rect rgb(230, 255, 230)
note over SA, DB: Phase 2 — Physical Connection
SA->>Dialect: create_connect_args(url)
Dialect-->>SA: (host, port, database, user, password)
SA->>Driver: pycubrid.connect(host, port, database, user, password)
Driver->>DB: CAS handshake + OpenDatabase
DB-->>Driver: Session established
Driver-->>SA: Connection object
Dialect->>Driver: on_connect() → set autocommit=False
end
SA-->>App: Connection
Phase 2: SQL Compilation¶
This phase describes the transformation of SQLAlchemy Expression Language constructs into CUBRID-compatible SQL strings and their subsequent execution.
sequenceDiagram
participant App
participant SA as SQLAlchemy Core
participant Compiler as CubridSQLCompiler
participant TypeCompiler as CubridTypeCompiler
participant Driver as pycubrid
participant CAS as CAS Process
App->>SA: conn.execute(select(users).where(users.c.id == 1))
rect rgb(255, 245, 230)
note over SA, TypeCompiler: SQL Compilation
SA->>Compiler: process(select_statement)
Compiler->>Compiler: visit_select() → column clause
Compiler->>Compiler: limit_clause() → LIMIT/OFFSET (no FETCH FIRST)
Compiler->>Compiler: for_update_clause()
Compiler->>TypeCompiler: process column types
TypeCompiler-->>Compiler: CUBRID SQL type strings
Compiler-->>SA: "SELECT users.id, ... FROM users WHERE users.id = ?"
end
rect rgb(230, 255, 230)
note over SA, CAS: Execution
SA->>Driver: cursor.execute(sql, params)
Driver->>CAS: PrepareAndExecute packet
CAS-->>Driver: Result rows
Driver-->>SA: DB-API cursor with results
end
SA-->>App: CursorResult
Schema Reflection¶
The reflection process allows SQLAlchemy to inspect an existing CUBRID database and reconstruct Table objects automatically.
sequenceDiagram
participant App
participant SA as SQLAlchemy
participant Dialect as CubridDialect
participant CAS as CAS Process
App->>SA: metadata.reflect(engine)
SA->>Dialect: get_table_names(connection)
Dialect->>CAS: SHOW TABLES
CAS-->>Dialect: Table list
loop For each table
SA->>Dialect: get_columns(connection, table_name)
Dialect->>CAS: SHOW COLUMNS IN "table_name"
CAS-->>Dialect: Column definitions
SA->>Dialect: get_pk_constraint(connection, table_name)
Dialect->>CAS: SHOW INDEX IN "table_name" (filter PRIMARY)
CAS-->>Dialect: PK constraint info
SA->>Dialect: get_foreign_keys(connection, table_name)
Dialect->>CAS: SELECT from db_constraint
CAS-->>Dialect: FK constraints
SA->>Dialect: get_indexes(connection, table_name)
Dialect->>CAS: SHOW INDEX IN "table_name"
CAS-->>Dialect: Index definitions
end
SA-->>App: MetaData with reflected tables
Module Boundaries¶
The package is organized into specialized modules, each handling a specific aspect of the dialect's functionality.
flowchart TD
init["__init__.py<br/>Public API: types, insert(), merge()"]
dialect["dialect.py<br/>CubridDialect: reflection, connection, isolation"]
pycubrid_d["pycubrid_dialect.py<br/>PyCubridDialect: pure Python variant"]
compiler["compiler.py<br/>SQL, DDL, Type compilers"]
base["base.py<br/>ExecutionContext, IdentifierPreparer"]
dml["dml.py<br/>ODKU, MERGE, REPLACE constructs"]
types["types.py<br/>CUBRID type system"]
req["requirements.py<br/>SA test requirement flags"]
alembic_mod["alembic_impl.py<br/>CubridImpl DDL operations"]
init --> types
init --> dml
dialect --> base
dialect --> compiler
dialect --> types
pycubrid_d --> dialect
compiler --> types
compiler --> base
%% External dependencies
sa["SQLAlchemy 2.0"]
pycubrid_pkg["pycubrid (driver)"]
alembic_pkg["Alembic"]
dialect -.-> sa
pycubrid_d -.-> pycubrid_pkg
compiler -.-> sa
alembic_mod -.-> alembic_pkg
req -.-> sa
Module Descriptions¶
__init__.py¶
Defines the public API boundary, exporting CUBRID-specific types and DML extensions like insert(), merge(), and replace(). It serves as the primary entry point for users of the dialect.
dialect.py¶
Contains the base CubridDialect class, implementing core logic for schema reflection, connection management, and transaction isolation levels. It defaults to the C-extension driver CUBRIDdb.
pycubrid_dialect.py¶
Implements the PyCubridDialect variant, which uses the pure Python pycubrid driver. It overrides connection argument parsing and connection-time initialization logic.
compiler.py¶
Houses the SQL, DDL, and Type compilers. It translates SQLAlchemy's abstract syntax trees into CUBRID-specific SQL dialects, handling nuances like LIMIT/OFFSET and FOR UPDATE clauses.
base.py¶
Provides the CubridExecutionContext for statement execution state and the CubridIdentifierPreparer for handling CUBRID's lowercase identifier folding and quoting rules.
dml.py¶
Defines custom DML constructs for CUBRID-specific features such as ON DUPLICATE KEY UPDATE (ODKU), MERGE INTO, and REPLACE INTO.
types.py¶
Implements the CUBRID-specific type system, mapping SQLAlchemy's generic types to CUBRID's internal types like SET, MULTISET, and BIT.
requirements.py¶
Defines feature flags used by the SQLAlchemy test suite to determine which behavioral tests should be executed against a CUBRID backend.
alembic_impl.py¶
Provides the CubridImpl class for Alembic, enabling DDL migration support and defining CUBRID's lack of transactional DDL capabilities.
Dialect Discovery¶
SQLAlchemy uses entry points to discover and load the appropriate dialect class based on the provided connection URL.
flowchart TD
url["Connection URL<br/>cubrid+pycubrid://dba@host:33000/db"]
parse["SQLAlchemy URL Parser<br/>backend=cubrid, driver=pycubrid"]
entry["Entry Point Lookup<br/>sqlalchemy.dialects → cubrid.pycubrid"]
url --> parse
parse --> entry
entry -->|"cubrid://"| cubrid_dialect["CubridDialect<br/>(C-extension CUBRIDdb)"]
entry -->|"cubrid.cubrid://"| cubrid_dialect
entry -->|"cubrid+pycubrid://"| pycubrid_dialect["PyCubridDialect<br/>(Pure Python pycubrid)"]
cubrid_dialect --> import_c["import CUBRIDdb"]
pycubrid_dialect --> import_py["import pycubrid"]
alembic_entry["Entry Point: alembic.ddl → cubrid"]
alembic_entry --> alembic_impl["CubridImpl<br/>transactional_ddl = False"]
Two-Driver Architecture¶
The dialect supports both the legacy C-extension driver and the modern pure Python driver through a hierarchical class structure.
flowchart TD
sa_default["sqlalchemy.engine.default<br/>DefaultDialect"]
cubrid_base["CubridDialect<br/>dialect.py<br/>• reflection<br/>• isolation levels<br/>• type mapping<br/>• import_dbapi() → CUBRIDdb"]
pycubrid_variant["PyCubridDialect<br/>pycubrid_dialect.py<br/>• import_dbapi() → pycubrid<br/>• create_connect_args()<br/>• on_connect()<br/>• do_ping()"]
sa_default --> cubrid_base
cubrid_base --> pycubrid_variant
cubrid_base -.->|"loads"| cci["CUBRIDdb<br/>(C-extension driver)"]
pycubrid_variant -.->|"loads"| pure["pycubrid<br/>(Pure Python driver)"]
Key Design Decisions¶
- SQLAlchemy < 2.2 pin: Uses private SA APIs (
select._limit_clause,select._offset_clause,select._for_update_arg,coercions._is_literal,BindParameter._with_binary_element_type) at compiler.py:71, 81-82, 144, 150-151 — requires version pinning until public alternatives exist. - BOOLEAN → SMALLINT mapping: CUBRID has no native BOOLEAN — dialect maps to
SMALLINT(0/1). - No JSON type mapping (yet): CUBRID 10.2+ supports JSON natively, but the dialect doesn't map it yet — JSON columns can still be used via raw SQL.
transactional_ddl = False: CUBRID auto-commits DDL statements — Alembic cannot roll back failed migrations.supports_statement_cache = True: Required for SA 2.0 performance — dialect is cache-safe.- Lowercase identifier folding: CUBRID folds to lowercase (not SQL-standard uppercase) —
CubridIdentifierPreparerhandles this. - No RELEASE SAVEPOINT: CUBRID doesn't support it —
do_release_savepoint()is a no-op.
Public API Boundary¶
# DML Extensions
insert() # Insert with .on_duplicate_key_update()
merge() # MERGE INTO ... USING ... ON ... WHEN MATCHED/NOT MATCHED
replace() # REPLACE INTO
trace_query() # Query tracing utility
# Types (CUBRID-specific)
STRING, BIT, CLOB, BLOB, SET, MULTISET, SEQUENCE, MONETARY, OBJECT
NCHAR, NVARCHAR, DOUBLE_PRECISION, REAL
# Types (standard, re-exported)
SMALLINT, INTEGER, BIGINT, NUMERIC, DECIMAL, FLOAT, DOUBLE
CHAR, VARCHAR, DATE, TIME, TIMESTAMP, DATETIME
# Entry Points (registered via pyproject.toml)
cubrid:// → CubridDialect
cubrid.cubrid:// → CubridDialect
cubrid+pycubrid:// → PyCubridDialect
cubrid (alembic) → CubridImpl
What This Package Owns / Does Not Own¶
Owns¶
- SQLAlchemy dialect for CUBRID
- SQL compilation (SELECT/INSERT/UPDATE/DELETE with CUBRID syntax)
- DDL compilation
- Type mapping
- Schema reflection
- DML extensions (ODKU, MERGE, REPLACE)
- Alembic DDL support
- Identifier quoting
Does Not Own¶
- The CUBRID driver itself (use pycubrid or CUBRIDdb)
- Connection pooling (SQLAlchemy handles this)
- ORM model definitions (user code)
- The CAS wire protocol (pycubrid handles this)
- Query optimization (CUBRID server handles this)