Documentation Index
Fetch the complete documentation index at: https://openmetadata-feat-feat-2mbfixdeploy.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Approach
In order to implement the Incremental Extraction for Redshift we rely on the SYS_QUERY_HISTORY to get the latest DDL changes.
We then match the query_text with different regex to get the Schema and Table name and understand if it was a deletion.
If no schema is present in the query we infer that it is the default for the database.
Requisites
The user should be a Superuser to be able to see all the records.
Used Regex
Table Name
Create Table
rf"^.*CREATE\s+(LOCAL\s+|EXTERNAL\s+)?(TEMPORARY\s+|TEMP\s+)?TABLE\s+(IF\s+NOT\s+EXISTS\s+)?(?P<table>{TABLE_NAME_RE}).*$"
Alter Table
rf"^.*(ALTER\s+TABLE)\s+(?P<table>{TABLE_NAME_RE}).*$"
Drop Table
rf"^.*DROP\s+TABLE\s+(IF\s+EXISTS\s+)?(?P<table>{TABLE_NAME_RE}).*$"
Create View
rf"^.*CREATE\s+(OR\s+REPLACE\s+)?(EXTERNAL\s+|MATERIALIZED\s+)?VIEW\s+(?P<table>{TABLE_NAME_RE}).*$"
Alter View
rf"^.*ALTER\s+(EXTERNAL\s+)?VIEW\s+(?P<table>{TABLE_NAME_RE}).*$"
Drop View
rf"^.*DROP\s+(EXTERNAL\s+|MATERIALIZED\s+)?VIEW\s+(IF\s+EXISTS\s+)?(?P<table>{TABLE_NAME_RE}).*$"
# Not supporting Comment changes on Constraint
rf"^.*COMMENT\s+ON\s+(TABLE|COLUMN|VIEW)\s+(?P<table>{TABLE_NAME_RE}).*$"
Used Query
SELECT
query_text
FROM SYS_QUERY_HISTORY
WHERE status = 'success'
AND (
query_type = 'DDL' OR
(query_type = 'UTILITY' AND query_text ilike '%%COMMENT ON%%') OR
(query_type = 'CTAS' AND query_text ilike '%%CREATE TABLE%%')
)
and database_name = '{database}'
and end_time >= '{start_date}'
ORDER BY end_time DESC