Identifying or verifying primary keys (PKs) across numerous tables in Snowflake manually can be tedious and time-consuming. You are probably familiar with the basic method for retrieving primary keys using the INFORMATION_SCHEMA command. This article will introduce you to a more efficient approach, using Snowflake’s DDL commands.
One common challenge is identifying the primary keys in Snowflake tables without manually inspecting each table. Using the DESCRIBE TABLE command works, but it becomes impractical for a large number of tables. Automating this process can save significant time and reduce errors.
To automate the retrieval of primary keys, you can leverage Snowflake’s INFORMATION_SCHEMA views and the GET_DDL function. Here’s a step-by-step guide to doing this efficiently:
Snowflake’s INFORMATION_SCHEMA provides views like TABLE_CONSTRAINTS and REFERENTIAL_CONSTRAINTS which can be queried to get metadata about table constraints, including primary keys.
The GET_DDL function generates the Data Definition Language (DDL) statements for database objects. You can parse the output to extract primary key information.
If we want to find out the primary keys of a larger number of tables stored in multiple schemas, we can use the SHOW PRIMARY KEYS in schema statement.
Here’s a practical example of how you can implement this:
You can query the INFORMATION_SCHEMA.TABLE_CONSTRAINTS to find primary keys for multiple tables. This is a basic and straightforward method.
If you prefer to use GET_DDL, here’s an example query that extracts the primary key from a table’s DDL:
WITH t1 AS (
SELECT 'table1' AS "TABLE_NAME",
GET_DDL('TABLE', '<schema_name>.<table_name>') AS DDL
)
SELECT
"TABLE_NAME",
REPLACE(REPLACE(REPLACE(REPLACE(SPLIT_PART(T.DDL, 'primary key', 2), ';', ''), '(', ''), ')', ''), '"', '') AS PK
FROM t1 AS T;
If you need to find primary keys across multiple schemas, you can use the SHOW PRIMARY KEYS command. Here’s an example:
SHOW PRIMARY KEYS IN SCHEMA "KEBOOLA_00"."schemaA";
SHOW PRIMARY KEYS IN SCHEMA "KEBOOLA_00"."schemaB";
CREATE OR REPLACE TABLE PRIMARY_KEYS AS (
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID(-2)))
UNION ALL
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID(-1)))
) x;
This script shows how to combine results from multiple SHOW PRIMARY KEYS commands into a single table for easier analysis.
By using INFORMATION_SCHEMA and GET_DDL, you can automate the retrieval of primary keys from multiple tables in Snowflake. This approach saves time and minimizes manual effort, allowing you to focus on more critical tasks. Whether you choose to query the schema views directly or parse the DDL statements, both methods provide efficient solutions for managing your database schema.
Feel free to experiment with these queries and adapt them to fit your specific requirements.
Happy querying and thank you Renée for asking this in Revolt BI Slack!