@michellebonat
Created October 7, 2025 19:40
# SQL query for PG data schema
Here are two turnkey options for pulling the data schema (table names, columns, data types, etc.) out of a PostgreSQL database: a **bash script** that uses `psql \copy` to write CSVs locally, and a **Python script** that does the same (and zips the results). Both cover **tables/columns/PKs/FKs/indexes, views, materialized views, sequences, and triggers**.
Here’s where and how to set your own **file path** depending on which version you’re using:
---
## 🧩 If you’re using the **Python script**
When you run:
```bash
python export_postgres_schema.py --host your-host --db your_db --user your_user --password 'your_pass' --out ./schema_export
```
* The `--out` argument controls **where the CSVs and ZIP will be saved.**
* You can replace `./schema_export` with any absolute or relative path, for example:
```bash
--out /Users/michelle/Desktop/postgres_schema
```
or
```bash
--out "C:\Users\Michelle\Documents\schema_exports"
```
The script will automatically:
* Create that folder if it doesn’t exist ✅
* Write all the CSVs and a ZIP file into it ✅
(`01_tables_columns_keys_indexes.csv`, `02_views.csv`, etc., and `postgres_schema_export.zip`)
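The folder-creation behavior described above comes down to one standard-library call; a minimal sketch (the helper name `ensure_out_dir` is just for illustration, not part of the script):

```python
import os

def ensure_out_dir(out: str) -> str:
    """Create the output directory (and any parents) if missing; return its path."""
    os.makedirs(out, exist_ok=True)  # no error if the folder already exists
    return out
```

Because of `exist_ok=True`, re-running the export into the same `--out` path is harmless.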
---
## 💻 If you’re using the **bash script (psql version)**
In the command line when you run it:
```bash
bash export_postgres_schema.sh --host your-host --db your_db --user your_user --out ./schema_export
```
You can set the `--out` flag to any directory you like:
```bash
--out /Users/michelle/Desktop/schema_docs
```
That folder will be created automatically and will hold the exported CSV files.
---
## 🔎 Quick check
After the script runs, you’ll see output like:
```
Done. Files in /Users/michelle/Desktop/schema_docs
```
and that directory will contain:
```
01_tables_columns_keys_indexes.csv
02_views.csv
03_materialized_views.csv
04_sequences.csv
05_triggers.csv
postgres_schema_export.zip (Python version only)
```
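If you want to script that quick check instead of eyeballing the folder, a small sketch (it assumes only the file names listed above; `missing_exports` is a hypothetical helper):

```python
from pathlib import Path

# Expected CSV names, taken from the listing above
EXPECTED = [
    "01_tables_columns_keys_indexes.csv",
    "02_views.csv",
    "03_materialized_views.csv",
    "04_sequences.csv",
    "05_triggers.csv",
]

def missing_exports(out_dir: str) -> list:
    """Return the expected CSVs that are absent from out_dir."""
    out = Path(out_dir)
    return [name for name in EXPECTED if not (out / name).is_file()]
```

An empty return value means the export produced all five files.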
---
# Option A — Bash (psql) one-shot exporter
Save as `export_postgres_schema.sh`, then run:
```bash
bash export_postgres_schema.sh \
--host your-host \
--db your_db \
--user your_user \
--out ./schema_export \
--schema public,analytics # optional comma-separated list; omit for all user schemas
```
```bash
#!/usr/bin/env bash
set -euo pipefail
# ---- args ----
HOST=""
DB=""
USER=""
OUT="./schema_export"
SCHEMAS="" # comma-separated list, e.g., "public,analytics"
while [[ $# -gt 0 ]]; do
  case $1 in
    --host) HOST="$2"; shift 2;;
    --db) DB="$2"; shift 2;;
    --user) USER="$2"; shift 2;;
    --out) OUT="$2"; shift 2;;
    --schema) SCHEMAS="$2"; shift 2;;
    *) echo "Unknown arg: $1"; exit 1;;
  esac
done
if [[ -z "$HOST" || -z "$DB" || -z "$USER" ]]; then
  echo "Usage: $0 --host <host> --db <db> --user <user> [--out <dir>] [--schema public,analytics]"
  exit 1
fi
mkdir -p "$OUT"
# Build schema filter fragments
EXCL_SYS="NOT IN ('pg_catalog','information_schema')"
IN_FILTER=""
if [[ -n "$SCHEMAS" ]]; then
  # quote each schema name
  IFS=',' read -ra A <<< "$SCHEMAS"
  QUOTED=$(printf "'%s'," "${A[@]}")
  QUOTED="${QUOTED%,}"
  IN_FILTER="AND n.nspname IN ($QUOTED)"
fi
# ---------- 01: Tables / Columns / PKs / FKs / Indexes ----------
read -r -d '' Q1 <<'SQL' || true  # read exits nonzero at EOF; don't trip set -e
WITH columns AS (
SELECT
c.table_schema,
c.table_name,
c.column_name,
c.data_type,
c.character_maximum_length,
c.is_nullable,
c.column_default,
pgd.description AS column_description,
c.ordinal_position
FROM information_schema.columns c
LEFT JOIN pg_catalog.pg_statio_all_tables st
ON c.table_schema = st.schemaname AND c.table_name = st.relname
LEFT JOIN pg_catalog.pg_description pgd
ON pgd.objoid = st.relid AND pgd.objsubid = c.ordinal_position
WHERE c.table_schema NOT IN ('pg_catalog','information_schema')
),
pkeys AS (
SELECT kcu.table_schema, kcu.table_name,
string_agg(kcu.column_name, ', ' ORDER BY kcu.ordinal_position) AS primary_keys
FROM information_schema.table_constraints tco
JOIN information_schema.key_column_usage kcu
ON kcu.constraint_name = tco.constraint_name
WHERE tco.constraint_type = 'PRIMARY KEY'
GROUP BY kcu.table_schema, kcu.table_name
),
fkeys AS (
SELECT
kcu.table_schema, kcu.table_name, kcu.column_name,
ccu.table_schema AS foreign_table_schema,
ccu.table_name AS foreign_table,
ccu.column_name AS foreign_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
ON ccu.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
),
indexes AS (
SELECT
schemaname AS table_schema,
tablename AS table_name,
indexname AS index_name,
indexdef AS index_definition
FROM pg_indexes
WHERE schemaname NOT IN ('pg_catalog','information_schema')
)
SELECT
col.table_schema,
col.table_name,
col.column_name,
col.data_type,
col.character_maximum_length,
col.is_nullable,
col.column_default,
col.column_description,
pk.primary_keys,
fk.foreign_table_schema,
fk.foreign_table,
fk.foreign_column,
idx.index_name,
idx.index_definition
FROM columns col
LEFT JOIN pkeys pk ON col.table_schema = pk.table_schema AND col.table_name = pk.table_name
LEFT JOIN fkeys fk ON col.table_schema = fk.table_schema AND col.table_name = fk.table_name AND col.column_name = fk.column_name
LEFT JOIN indexes idx ON col.table_schema = idx.table_schema AND col.table_name = idx.table_name
ORDER BY col.table_schema, col.table_name, col.ordinal_position;
SQL
# apply the schema IN() filter if provided (Q1 uses a quoted heredoc, so substitute here)
if [[ -n "$SCHEMAS" ]]; then
  Q1="${Q1//c.table_schema NOT IN ('pg_catalog','information_schema')/c.table_schema NOT IN ('pg_catalog','information_schema') AND c.table_schema IN ($QUOTED)}"
  Q1="${Q1//schemaname NOT IN ('pg_catalog','information_schema')/schemaname NOT IN ('pg_catalog','information_schema') AND schemaname IN ($QUOTED)}"
fi
psql -h "$HOST" -U "$USER" -d "$DB" -c "\copy ( $Q1 ) TO '$OUT/01_tables_columns_keys_indexes.csv' WITH CSV HEADER"
# ---------- 02: Views ----------
read -r -d '' Q2 <<SQL || true
WITH views AS (
SELECT
n.nspname AS view_schema,
c.relname AS view_name,
pg_get_viewdef(c.oid, true) AS view_definition,
obj_description(c.oid, 'pg_class') AS view_description,
c.oid AS view_oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'v'
AND n.nspname $EXCL_SYS
${IN_FILTER}
),
deps AS (
SELECT
r.refobjid AS view_oid,
string_agg(DISTINCT (dn.nspname || '.' || d.relname), ', ' ORDER BY dn.nspname, d.relname) AS depends_on
FROM pg_depend r
JOIN pg_class d ON d.oid = r.objid
JOIN pg_namespace dn ON dn.oid = d.relnamespace
WHERE r.classid = 'pg_rewrite'::regclass
AND d.relkind IN ('r','p','v','m','f')
GROUP BY r.refobjid
)
SELECT
v.view_schema,
v.view_name,
v.view_description,
v.view_definition,
coalesce(deps.depends_on, '') AS depends_on
FROM views v
LEFT JOIN deps ON deps.view_oid = v.view_oid
ORDER BY v.view_schema, v.view_name;
SQL
psql -h "$HOST" -U "$USER" -d "$DB" -c "\copy ( $Q2 ) TO '$OUT/02_views.csv' WITH CSV HEADER"
# ---------- 03: Materialized Views ----------
read -r -d '' Q3 <<SQL || true
WITH m AS (
SELECT
schemaname AS matview_schema,
matviewname AS matview_name,
definition AS matview_definition,
ispopulated AS is_populated,
tablespace,
matviewowner AS owner
FROM pg_matviews
WHERE schemaname $EXCL_SYS
${IN_FILTER/AND n.nspname/AND schemaname}
),
idx AS (
SELECT
schemaname AS matview_schema,
tablename AS matview_name,
string_agg(indexname || ' :: ' || indexdef, ' | ' ORDER BY indexname) AS matview_indexes
FROM pg_indexes
WHERE schemaname $EXCL_SYS
${IN_FILTER/AND n.nspname/AND schemaname}
GROUP BY schemaname, tablename
)
SELECT
m.matview_schema,
m.matview_name,
m.owner,
m.tablespace,
m.is_populated,
m.matview_definition,
coalesce(idx.matview_indexes, '') AS matview_indexes
FROM m
LEFT JOIN idx ON idx.matview_schema = m.matview_schema AND idx.matview_name = m.matview_name
ORDER BY m.matview_schema, m.matview_name;
SQL
psql -h "$HOST" -U "$USER" -d "$DB" -c "\copy ( $Q3 ) TO '$OUT/03_materialized_views.csv' WITH CSV HEADER"
# ---------- 04: Sequences ----------
read -r -d '' Q4 <<SQL || true
WITH seqs AS (
SELECT
s.sequence_schema,
s.sequence_name,
s.data_type,
s.start_value,
s.minimum_value,
s.maximum_value,
s.increment,
s.cycle_option
FROM information_schema.sequences s
WHERE s.sequence_schema $EXCL_SYS
${IN_FILTER/AND n.nspname/AND s.sequence_schema}
),
owned AS (
SELECT
n.nspname AS sequence_schema,
c.relname AS sequence_name,
n2.nspname AS owned_by_table_schema,
c2.relname AS owned_by_table,
a.attname AS owned_by_column
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_depend d ON d.objid = c.oid AND d.deptype = 'a'
LEFT JOIN pg_class c2 ON c2.oid = d.refobjid
LEFT JOIN pg_namespace n2 ON n2.oid = c2.relnamespace
LEFT JOIN pg_attribute a ON a.attrelid = c2.oid AND a.attnum = d.refobjsubid
WHERE c.relkind = 'S'
AND n.nspname $EXCL_SYS
${IN_FILTER}
)
SELECT
seqs.sequence_schema,
seqs.sequence_name,
seqs.data_type,
seqs.start_value,
seqs.minimum_value,
seqs.maximum_value,
seqs.increment,
seqs.cycle_option,
COALESCE(owned.owned_by_table_schema,'') AS owned_by_table_schema,
COALESCE(owned.owned_by_table,'') AS owned_by_table,
COALESCE(owned.owned_by_column,'') AS owned_by_column
FROM seqs
LEFT JOIN owned
ON owned.sequence_schema = seqs.sequence_schema
AND owned.sequence_name = seqs.sequence_name
ORDER BY seqs.sequence_schema, seqs.sequence_name;
SQL
psql -h "$HOST" -U "$USER" -d "$DB" -c "\copy ( $Q4 ) TO '$OUT/04_sequences.csv' WITH CSV HEADER"
# ---------- 05: Triggers ----------
read -r -d '' Q5 <<SQL || true
WITH triggers AS (
SELECT
nt.nspname AS table_schema,
ct.relname AS table_name,
t.tgname AS trigger_name,
pg_get_triggerdef(t.oid, true) AS trigger_definition,
CASE t.tgenabled
WHEN 'O' THEN 'ENABLED'
WHEN 'D' THEN 'DISABLED'
WHEN 'R' THEN 'REPLICA'
WHEN 'A' THEN 'ALWAYS'
END AS enabled_state,
np.nspname AS function_schema,
p.proname AS function_name,
t.oid AS trigger_oid,
t.tgconstraint AS constraint_oid
FROM pg_trigger t
JOIN pg_class ct ON ct.oid = t.tgrelid
JOIN pg_namespace nt ON nt.oid = ct.relnamespace
JOIN pg_proc p ON p.oid = t.tgfoid
JOIN pg_namespace np ON np.oid = p.pronamespace
WHERE NOT t.tgisinternal
AND nt.nspname $EXCL_SYS
${IN_FILTER/AND n.nspname/AND nt.nspname}
),
constraint_flags AS (
SELECT
tr.trigger_oid,
(pc.oid IS NOT NULL) AS is_constraint_trigger
FROM triggers tr
LEFT JOIN pg_constraint pc ON pc.oid = tr.constraint_oid
)
SELECT
tr.table_schema,
tr.table_name,
tr.trigger_name,
tr.enabled_state,
tr.function_schema,
tr.function_name,
cf.is_constraint_trigger,
tr.trigger_definition
FROM triggers tr
LEFT JOIN constraint_flags cf ON cf.trigger_oid = tr.trigger_oid
ORDER BY tr.table_schema, tr.table_name, tr.trigger_name;
SQL
psql -h "$HOST" -U "$USER" -d "$DB" -c "\copy ( $Q5 ) TO '$OUT/05_triggers.csv' WITH CSV HEADER"
echo "Done. CSVs written to: $OUT"
```
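Once the bash exporter has run, the main CSV has one row per column (multiplied by index and FK matches), so a quick summary needs deduplication. A hedged sketch using only the standard library (`tables_per_schema` is a hypothetical helper, not part of either script; it assumes the `table_schema`/`table_name` headers produced above):

```python
import csv
from collections import Counter

def tables_per_schema(csv_path: str) -> Counter:
    """Count distinct tables per schema in 01_tables_columns_keys_indexes.csv.

    Rows repeat per column/index/FK, so deduplicate on (schema, table) first.
    """
    seen = set()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            seen.add((row["table_schema"], row["table_name"]))
    return Counter(schema for schema, _ in seen)
```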
---
# Option B — Python (single script; also creates a ZIP)
Save as `export_postgres_schema.py`, install deps, run:
```bash
pip install psycopg2-binary pandas
python export_postgres_schema.py --host your-host --db your_db --user your_user --password 'your_pass' --out ./schema_export --schema public,analytics
```
```python
#!/usr/bin/env python3
import argparse, os, zipfile
import pandas as pd
import psycopg2
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--host", required=True)
    ap.add_argument("--db", required=True)
    ap.add_argument("--user", required=True)
    ap.add_argument("--password", required=True)
    ap.add_argument("--out", default="./schema_export")
    ap.add_argument("--schema", default="")  # comma-separated
    args = ap.parse_args()

    os.makedirs(args.out, exist_ok=True)

    conn = psycopg2.connect(
        host=args.host, dbname=args.db, user=args.user, password=args.password
    )

    def in_list(alias):
        if not args.schema:
            return ""
        items = ", ".join("'" + s.strip() + "'" for s in args.schema.split(","))
        # alias is the schema column to filter on (e.g., c.table_schema, n.nspname, schemaname)
        if alias.endswith(".table_schema") or alias.endswith(".sequence_schema"):
            return f" AND {alias} IN ({items})"
        if alias.endswith(".nspname"):  # pg_namespace aliases
            return f" AND {alias} IN ({items})"
        if alias == "schemaname":  # pg_indexes / pg_matviews columns
            return f" AND {alias} IN ({items})"
        return ""

    q1 = f"""
WITH columns AS (
SELECT
c.table_schema,
c.table_name,
c.column_name,
c.data_type,
c.character_maximum_length,
c.is_nullable,
c.column_default,
pgd.description AS column_description,
c.ordinal_position
FROM information_schema.columns c
LEFT JOIN pg_catalog.pg_statio_all_tables st
ON c.table_schema = st.schemaname AND c.table_name = st.relname
LEFT JOIN pg_catalog.pg_description pgd
ON pgd.objoid = st.relid AND pgd.objsubid = c.ordinal_position
WHERE c.table_schema NOT IN ('pg_catalog','information_schema'){in_list('c.table_schema')}
),
pkeys AS (
SELECT kcu.table_schema, kcu.table_name,
string_agg(kcu.column_name, ', ' ORDER BY kcu.ordinal_position) AS primary_keys
FROM information_schema.table_constraints tco
JOIN information_schema.key_column_usage kcu
ON kcu.constraint_name = tco.constraint_name
WHERE tco.constraint_type = 'PRIMARY KEY'
GROUP BY kcu.table_schema, kcu.table_name
),
fkeys AS (
SELECT
kcu.table_schema, kcu.table_name, kcu.column_name,
ccu.table_schema AS foreign_table_schema,
ccu.table_name AS foreign_table,
ccu.column_name AS foreign_column
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
ON ccu.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
),
indexes AS (
SELECT
schemaname AS table_schema,
tablename AS table_name,
indexname AS index_name,
indexdef AS index_definition
FROM pg_indexes
WHERE schemaname NOT IN ('pg_catalog','information_schema'){in_list('schemaname')}
)
SELECT
col.table_schema,
col.table_name,
col.column_name,
col.data_type,
col.character_maximum_length,
col.is_nullable,
col.column_default,
col.column_description,
pk.primary_keys,
fk.foreign_table_schema,
fk.foreign_table,
fk.foreign_column,
idx.index_name,
idx.index_definition
FROM columns col
LEFT JOIN pkeys pk ON col.table_schema = pk.table_schema AND col.table_name = pk.table_name
LEFT JOIN fkeys fk ON col.table_schema = fk.table_schema AND col.table_name = fk.table_name AND col.column_name = fk.column_name
LEFT JOIN indexes idx ON col.table_schema = idx.table_schema AND col.table_name = idx.table_name
ORDER BY col.table_schema, col.table_name, col.ordinal_position;
"""
    q2 = f"""
WITH views AS (
SELECT
n.nspname AS view_schema,
c.relname AS view_name,
pg_get_viewdef(c.oid, true) AS view_definition,
obj_description(c.oid, 'pg_class') AS view_description,
c.oid AS view_oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'v'
AND n.nspname NOT IN ('pg_catalog','information_schema'){in_list('n.nspname')}
),
deps AS (
SELECT
r.refobjid AS view_oid,
string_agg(DISTINCT (dn.nspname || '.' || d.relname), ', ' ORDER BY dn.nspname, d.relname) AS depends_on
FROM pg_depend r
JOIN pg_class d ON d.oid = r.objid
JOIN pg_namespace dn ON dn.oid = d.relnamespace
WHERE r.classid = 'pg_rewrite'::regclass
AND d.relkind IN ('r','p','v','m','f')
GROUP BY r.refobjid
)
SELECT
v.view_schema,
v.view_name,
v.view_description,
v.view_definition,
coalesce(deps.depends_on, '') AS depends_on
FROM views v
LEFT JOIN deps ON deps.view_oid = v.view_oid
ORDER BY v.view_schema, v.view_name;
"""
    q3 = f"""
WITH m AS (
SELECT
schemaname AS matview_schema,
matviewname AS matview_name,
definition AS matview_definition,
ispopulated AS is_populated,
tablespace,
matviewowner AS owner
FROM pg_matviews
WHERE schemaname NOT IN ('pg_catalog','information_schema'){in_list('schemaname')}
),
idx AS (
SELECT
schemaname AS matview_schema,
tablename AS matview_name,
string_agg(indexname || ' :: ' || indexdef, ' | ' ORDER BY indexname) AS matview_indexes
FROM pg_indexes
WHERE schemaname NOT IN ('pg_catalog','information_schema'){in_list('schemaname')}
GROUP BY schemaname, tablename
)
SELECT
m.matview_schema,
m.matview_name,
m.owner,
m.tablespace,
m.is_populated,
m.matview_definition,
coalesce(idx.matview_indexes, '') AS matview_indexes
FROM m
LEFT JOIN idx ON idx.matview_schema = m.matview_schema AND idx.matview_name = m.matview_name
ORDER BY m.matview_schema, m.matview_name;
"""
    q4 = f"""
WITH seqs AS (
SELECT
s.sequence_schema,
s.sequence_name,
s.data_type,
s.start_value,
s.minimum_value,
s.maximum_value,
s.increment,
s.cycle_option
FROM information_schema.sequences s
WHERE s.sequence_schema NOT IN ('pg_catalog','information_schema'){in_list('s.sequence_schema')}
),
owned AS (
SELECT
n.nspname AS sequence_schema,
c.relname AS sequence_name,
n2.nspname AS owned_by_table_schema,
c2.relname AS owned_by_table,
a.attname AS owned_by_column
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_depend d ON d.objid = c.oid AND d.deptype = 'a'
LEFT JOIN pg_class c2 ON c2.oid = d.refobjid
LEFT JOIN pg_namespace n2 ON n2.oid = c2.relnamespace
LEFT JOIN pg_attribute a ON a.attrelid = c2.oid AND a.attnum = d.refobjsubid
WHERE c.relkind = 'S'
AND n.nspname NOT IN ('pg_catalog','information_schema'){in_list('n.nspname')}
)
SELECT
seqs.sequence_schema,
seqs.sequence_name,
seqs.data_type,
seqs.start_value,
seqs.minimum_value,
seqs.maximum_value,
seqs.increment,
seqs.cycle_option,
COALESCE(owned.owned_by_table_schema,'') AS owned_by_table_schema,
COALESCE(owned.owned_by_table,'') AS owned_by_table,
COALESCE(owned.owned_by_column,'') AS owned_by_column
FROM seqs
LEFT JOIN owned
ON owned.sequence_schema = seqs.sequence_schema
AND owned.sequence_name = seqs.sequence_name
ORDER BY seqs.sequence_schema, seqs.sequence_name;
"""
    q5 = f"""
WITH triggers AS (
SELECT
nt.nspname AS table_schema,
ct.relname AS table_name,
t.tgname AS trigger_name,
pg_get_triggerdef(t.oid, true) AS trigger_definition,
CASE t.tgenabled
WHEN 'O' THEN 'ENABLED'
WHEN 'D' THEN 'DISABLED'
WHEN 'R' THEN 'REPLICA'
WHEN 'A' THEN 'ALWAYS'
END AS enabled_state,
np.nspname AS function_schema,
p.proname AS function_name,
t.oid AS trigger_oid,
t.tgconstraint AS constraint_oid
FROM pg_trigger t
JOIN pg_class ct ON ct.oid = t.tgrelid
JOIN pg_namespace nt ON nt.oid = ct.relnamespace
JOIN pg_proc p ON p.oid = t.tgfoid
JOIN pg_namespace np ON np.oid = p.pronamespace
WHERE NOT t.tgisinternal
AND nt.nspname NOT IN ('pg_catalog','information_schema'){in_list('nt.nspname')}
),
constraint_flags AS (
SELECT
tr.trigger_oid,
(pc.oid IS NOT NULL) AS is_constraint_trigger
FROM triggers tr
LEFT JOIN pg_constraint pc ON pc.oid = tr.constraint_oid
)
SELECT
tr.table_schema,
tr.table_name,
tr.trigger_name,
tr.enabled_state,
tr.function_schema,
tr.function_name,
cf.is_constraint_trigger,
tr.trigger_definition
FROM triggers tr
LEFT JOIN constraint_flags cf ON cf.trigger_oid = tr.trigger_oid
ORDER BY tr.table_schema, tr.table_name, tr.trigger_name;
"""
    outputs = [
        ("01_tables_columns_keys_indexes.csv", q1),
        ("02_views.csv", q2),
        ("03_materialized_views.csv", q3),
        ("04_sequences.csv", q4),
        ("05_triggers.csv", q5),
    ]
    for fname, q in outputs:
        df = pd.read_sql(q, conn)
        df.to_csv(os.path.join(args.out, fname), index=False)

    # zip everything
    zip_path = os.path.join(args.out, "postgres_schema_export.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
        for fname, _ in outputs:
            z.write(os.path.join(args.out, fname), fname)

    conn.close()
    print(f"Done. Files in {args.out} and ZIP at {zip_path}")

if __name__ == "__main__":
    main()
```
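To confirm the ZIP the Python version writes actually contains all five CSVs, a small check is enough (`zip_contents` is a hypothetical helper; the archive name comes from the script above):

```python
import zipfile

def zip_contents(zip_path: str) -> list:
    """Return the sorted member names of the export ZIP."""
    with zipfile.ZipFile(zip_path) as z:
        return sorted(z.namelist())

# e.g. zip_contents("./schema_export/postgres_schema_export.zip")
```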
---
## Notes & tweaks
* Both variants default to **excluding** `pg_catalog` and `information_schema`. Use `--schema` to restrict the export to specific schemas.
* Both **save locally** via the client (no write access to the server's filesystem needed).
* Recent pandas versions warn when `read_sql` is given a raw psycopg2 connection rather than a SQLAlchemy engine; the export still works.
* Safe for large DBs, but if you have many objects, consider adding `LIMIT` to the queries while testing.
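One way to apply that LIMIT-while-testing idea without hand-editing each query — a hypothetical wrapper, not part of either script (each query above ends with a semicolon, which it strips first):

```python
def with_limit(query: str, n: int = 100) -> str:
    """Append a LIMIT clause to a query, handling a trailing semicolon."""
    body = query.rstrip().rstrip(";").rstrip()
    return f"{body}\nLIMIT {n};"

# In the Python script: df = pd.read_sql(with_limit(q1), conn)
```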