Python SDK
The OpenMetadata Python SDK is a Python interface to the OpenMetadata API. It provides typed operations for creating, retrieving, listing, updating, and deleting metadata entities from your Python applications.
Installation
Install the OpenMetadata Python SDK using pip:
pip install openmetadata-ingestion
Quick Start
Basic Connection
from metadata.sdk import configure
# Configure with your host and JWT token
configure(
    host="http://localhost:8585/api",
    jwt_token="your-jwt-token"
)
You can also configure from environment variables (OPENMETADATA_HOST and OPENMETADATA_JWT_TOKEN):
from metadata.sdk import configure
# Reads OPENMETADATA_HOST and OPENMETADATA_JWT_TOKEN automatically
configure()
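Because the argument-free form depends on both environment variables being present, it can help to check them before startup and fail with a clear message. The following is a minimal sketch of such a check; `read_openmetadata_env` is a hypothetical helper and not part of the SDK, and it only inspects the environment without contacting a server.

```python
import os
from typing import Mapping, Optional, Tuple

REQUIRED_VARS = ("OPENMETADATA_HOST", "OPENMETADATA_JWT_TOKEN")


def read_openmetadata_env(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (host, jwt_token) from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of the SDK.
    """
    env = os.environ if env is None else env
    # Treat unset and blank values the same way.
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["OPENMETADATA_HOST"].strip(), env["OPENMETADATA_JWT_TOKEN"].strip()


# Demo with an explicit mapping instead of the real process environment.
demo_env = {
    "OPENMETADATA_HOST": "http://localhost:8585/api",
    "OPENMETADATA_JWT_TOKEN": "your-jwt-token",
}
host, _token = read_openmetadata_env(demo_env)
print(f"Environment looks complete; host is {host}")
```

In an application, you would call `read_openmetadata_env()` with no arguments and then `configure()`, so that a missing variable is reported before any API call is attempted.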
Working with Entities
from metadata.sdk import Tables, DatabaseServices
# Get all database services
services = list(DatabaseServices.list().auto_paging_iterable())
print(f"Found {len(services)} database services")
# Get a specific table by name
table = Tables.retrieve_by_name("your-service.your-database.your-schema.your-table")
if table:
    print(f"Table: {table.name}")
    print(f"Columns: {len(table.columns) if table.columns else 0}")
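The name passed to `retrieve_by_name` above is a dot-separated fully qualified name made of service, database, schema, and table. As a rough sketch, the name can be assembled from its parts. `build_fqn` is a hypothetical helper, and the rule of wrapping a part that contains a dot in double quotes is an assumption you should verify against your server.

```python
from typing import Iterable


def build_fqn(parts: Iterable[str]) -> str:
    """Join name parts into a dot-separated fully qualified name.

    Assumption: a part that itself contains a dot is wrapped in double
    quotes so the dot is not read as a separator.
    """
    quoted = []
    for part in parts:
        if not part:
            raise ValueError("FQN parts must be non-empty")
        quoted.append(f'"{part}"' if "." in part else part)
    return ".".join(quoted)


table_fqn = build_fqn(["your-service", "your-database", "your-schema", "your-table"])
print(table_fqn)  # your-service.your-database.your-schema.your-table
```

The resulting string can then be passed to `Tables.retrieve_by_name(table_fqn)`.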
Core Functionality
Entity Management
The Python SDK provides full CRUD operations for all OpenMetadata entities:
Create or Update Entities
from metadata.sdk import Tables
from metadata.generated.schema.api.data.createTable import CreateTableRequest
# Create table request
create_table = CreateTableRequest(
    name="sample_table",
    # Fully qualified name of the parent schema
    databaseSchema="your_service.your_database.your_schema",
    columns=[
        # Define your columns here
    ],
    description="Sample table created via Python SDK"
)
# Create the table
table = Tables.create(create_table)
Retrieve Entities
from metadata.sdk import Tables
# Get by ID
table = Tables.retrieve("uuid-here")
# Get by fully qualified name
table = Tables.retrieve_by_name("service.database.schema.table")
# Get with specific fields
table = Tables.retrieve_by_name(
    "service.database.schema.table",
    fields=["owners", "tags"]
)
List All Entities
from metadata.sdk import Tables
# Auto-paginating iterator for large datasets
for table in Tables.list().auto_paging_iterable():
    print(f"Processing table: {table.name}")
List with Filters
from metadata.sdk import Tables
from metadata.sdk.entities.tables import TableListParams
# List with filters and field selection
params = TableListParams.builder().limit(50).fields(["owners", "tags"]).build()
tables = Tables.list(params)
Update Entities
from metadata.sdk import Tables
table = Tables.retrieve_by_name("service.database.schema.table")
table.description = "Updated description"
updated = Tables.update(str(table.id), table)
Delete Entities
from metadata.sdk import Tables
# Soft delete
Tables.delete("uuid-here")
# Hard delete with recursive removal of children
Tables.delete("uuid-here", recursive=True, hard_delete=True)
Entity References
from metadata.sdk import to_entity_reference, Tables
# Retrieve the entity first, then get a reference
table = Tables.retrieve_by_name("service.database.schema.table")
ref = to_entity_reference(table)
# Use in other entity creation
if ref:
    print(f"Table reference ID: {ref.id}")
Advanced Features
Error Handling
from metadata.ingestion.ometa.client import APIError
from metadata.sdk import Tables
try:
    table = Tables.retrieve("table-id")
except APIError as e:
    if e.status_code == 404:
        print("Table not found")
    elif e.status_code == 401:
        print("Authentication failed")
    else:
        print(f"Error: {e}")
Common Use Cases
Data Discovery
from metadata.sdk import Tables
# Iterate all tables and filter for a keyword
matching_tables = [
    table for table in Tables.list().auto_paging_iterable()
    if "customer" in table.name.lower()
]
for table in matching_tables:
    print(f"Found customer table: {table.fullyQualifiedName}")
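The list comprehension above scans every table before printing anything. When only the first few matches are needed, a lazy variant can stop early. The sketch below uses `SimpleNamespace` objects as stand-ins for table entities so it runs without a server; `matching_tables_lazy` is a hypothetical helper name.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Any, Iterable, Iterator


def matching_tables_lazy(tables: Iterable[Any], keyword: str) -> Iterator[Any]:
    """Lazily yield tables whose name contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return (t for t in tables if needle in t.name.lower())


# Stand-in objects with a .name attribute; real code would pass
# Tables.list().auto_paging_iterable() instead of this list.
sample = [
    SimpleNamespace(name=n)
    for n in ["Customer_Orders", "payments", "customers_dim", "customer_360"]
]

# Take only the first two matches; the rest of the input is never examined.
first_two = list(islice(matching_tables_lazy(sample, "customer"), 2))
print([t.name for t in first_two])  # ['Customer_Orders', 'customers_dim']
```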
Bulk Operations
from metadata.sdk import Tables
# Bulk update table descriptions
for table in Tables.list().auto_paging_iterable():
    if not table.description:
        table.description = f"Production table: {table.name}"
        Tables.update(str(table.id), table)
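Since a bulk update writes to every matching table, it can be useful to compute the planned changes first and review them before applying anything. The following dry-run sketch mirrors the selection logic above. `TableStub` and `plan_description_updates` are hypothetical names used only for illustration, and no SDK calls are made.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TableStub:
    """Minimal stand-in for a table entity (hypothetical, for illustration)."""

    id: str
    name: str
    description: Optional[str] = None


def plan_description_updates(tables: Iterable[TableStub]) -> List[Tuple[str, str]]:
    """Return (table_id, new_description) pairs for tables lacking a description."""
    return [
        (t.id, f"Production table: {t.name}")
        for t in tables
        if not t.description  # None and empty strings both count as missing
    ]


plan = plan_description_updates([
    TableStub("1", "orders", "Order facts"),
    TableStub("2", "customers"),
    TableStub("3", "refunds", ""),
])
for table_id, description in plan:
    print(f"would update {table_id}: {description}")
```

Once the plan looks right, the same pairs can drive the actual `Tables.update` calls.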
Lineage Management
from metadata.sdk.api import Lineage
lineage = Lineage.get_lineage(
    "service.database.schema.table",
    upstream_depth=1,
    downstream_depth=1
)
if lineage:
    print(f"Upstream edges: {len(lineage.get('upstreamEdges', []))}")
    print(f"Downstream edges: {len(lineage.get('downstreamEdges', []))}")
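Beyond counting edges, you may want the identifiers of the neighboring entities. The sketch below assumes the lineage result is a dictionary whose `upstreamEdges` and `downstreamEdges` lists contain dictionaries with `fromEntity` and `toEntity` keys. That edge shape is an assumption here, so check it against the response from your server. `neighbor_ids` is a hypothetical helper.

```python
from typing import Any, Dict, Set


def neighbor_ids(lineage: Dict[str, Any]) -> Dict[str, Set[str]]:
    """Collect the IDs on the far side of each upstream and downstream edge.

    Assumes each edge is a dict with 'fromEntity' and 'toEntity' keys.
    """
    upstream = {
        str(edge["fromEntity"])
        for edge in (lineage.get("upstreamEdges") or [])
        if "fromEntity" in edge
    }
    downstream = {
        str(edge["toEntity"])
        for edge in (lineage.get("downstreamEdges") or [])
        if "toEntity" in edge
    }
    return {"upstream": upstream, "downstream": downstream}


# Hand-written sample in the assumed shape; "t" is the table being inspected.
sample_lineage = {
    "upstreamEdges": [
        {"fromEntity": "a", "toEntity": "t"},
        {"fromEntity": "b", "toEntity": "t"},
    ],
    "downstreamEdges": [{"fromEntity": "t", "toEntity": "c"}],
}
result = neighbor_ids(sample_lineage)
print(sorted(result["upstream"]), sorted(result["downstream"]))  # ['a', 'b'] ['c']
```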
API Reference
The Python SDK provides a comprehensive API based on the OpenMetadata data model:
Core Classes
- Entity classes (Tables, Databases, DatabaseSchemas, DatabaseServices, Users, etc.): Static-method interfaces for each entity type; no instantiation required
- configure(): One-time global setup for host and JWT token
- to_entity_reference(entity): Convert a retrieved entity to an EntityReference for use in relationships
- Entity request classes: Pydantic-based typed request objects (e.g., CreateTableRequest)
Key Methods
Each entity class exposes the same consistent interface:
- EntityClass.create(request): Create a new entity
- EntityClass.retrieve(entity_id): Retrieve entity by UUID
- EntityClass.retrieve_by_name(fqn, fields=[]): Retrieve entity by fully qualified name
- EntityClass.list(params=None): List entities (returns a pageable result)
- EntityClass.list().auto_paging_iterable(): Auto-paginating generator for all entities
- EntityClass.update(entity_id, entity): Update an existing entity
- EntityClass.delete(entity_id, recursive=False, hard_delete=False): Delete an entity
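One consequence of this shared interface is that helper code can be written once and reused across entity types. The sketch below counts entities through `list().auto_paging_iterable()` using duck typing. `FakeTables` and `FakeUsers` are stand-ins so the example runs without a server, and `count_entities` is a hypothetical helper.

```python
from typing import Any, Iterator, List


def count_entities(entity_cls: Any) -> int:
    """Count entities via the shared list().auto_paging_iterable() interface."""
    return sum(1 for _ in entity_cls.list().auto_paging_iterable())


class _FakePage:
    """Stand-in for the pageable result returned by list()."""

    def __init__(self, items: List[str]) -> None:
        self._items = items

    def auto_paging_iterable(self) -> Iterator[str]:
        return iter(self._items)


class FakeTables:
    """Stand-in for an entity class such as Tables (no server needed)."""

    @staticmethod
    def list() -> _FakePage:
        return _FakePage(["orders", "customers", "refunds"])


class FakeUsers:
    """Stand-in for an entity class such as Users."""

    @staticmethod
    def list() -> _FakePage:
        return _FakePage(["alice"])


for cls in (FakeTables, FakeUsers):
    print(f"{cls.__name__}: {count_entities(cls)}")
```

With the real SDK, the same function would be called as `count_entities(Tables)` or `count_entities(Users)`.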
Type Safety
The Python SDK is built on generated Pydantic models, providing:
- Type hints for better IDE support
- Runtime validation of data structures
- Auto-completion for entity properties
- Earlier error detection when the type hints are checked by an IDE or a static type checker
from metadata.sdk import Tables
from metadata.generated.schema.entity.data.table import Table
# Type-safe retrieval — IDE provides auto-completion and type checking
table: Table = Tables.retrieve_by_name("service.database.schema.table")
if table:
    columns_count: int = len(table.columns) if table.columns else 0
Best Practices
- Configure once: Call configure() once at application startup and reuse it globally; there is no need to pass a client object around
- Error handling: Always handle APIError exceptions for robust integrations
- Pagination: Use .auto_paging_iterable() for large datasets to avoid loading everything into memory at once
- Performance: Specify only the required fields when fetching entities (e.g., fields=["owners", "tags"])
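The "configure once" advice can be enforced in code when several modules or threads might each try to run setup. The following is a minimal sketch of a run-once guard. `OnceInitializer` is a hypothetical helper, and in real code the setup callable would wrap `configure()`.

```python
import threading
from typing import Callable, List


class OnceInitializer:
    """Run a setup callable at most once, even when called from several threads."""

    def __init__(self, setup: Callable[[], None]) -> None:
        self._setup = setup
        self._lock = threading.Lock()
        self._done = False

    def ensure(self) -> None:
        if self._done:
            return
        with self._lock:
            # Re-check inside the lock in case another thread finished first.
            if not self._done:
                self._setup()
                self._done = True


# Demo: the setup callable records each run; real code would call configure().
setup_runs: List[str] = []
initializer = OnceInitializer(lambda: setup_runs.append("configured"))
initializer.ensure()
initializer.ensure()
print(setup_runs)  # ['configured']
```

Any module that needs the SDK can call `initializer.ensure()` before its first request without worrying about whether setup has already happened.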