How to Extract and Replicate PostgreSQL Permissions When Migrating to a New Instance

Migrating a PostgreSQL database isn’t just about moving data—getting the right roles and permissions in place is critical for security and proper application function. This post demonstrates how to extract roles and permissions from your source instance and apply them to a new PostgreSQL environment.

Understanding Permissions in PostgreSQL

PostgreSQL controls access via roles. A role can represent either a database user or a group, and roles are granted permissions (privileges) on objects (tables, databases, schemas, etc.) using GRANT and REVOKE commands. These permissions can be viewed and managed at various levels:

  • Database: Control who can connect
  • Schema: Control access to groups of tables, functions, etc.
  • Objects: Control what actions (SELECT, INSERT, UPDATE, etc.) users can perform on tables, functions, or sequences.

You can view permissions using PostgreSQL’s psql meta-commands:

  • \l+ — Show database privileges
  • \dn+ — Show schema privileges
  • \dp+ — Show table and other object privileges

Step 1: Extract Roles from the Source Instance

To extract roles (users and groups) from your current PostgreSQL server:

pg_dumpall --roles-only > roles.sql

Note:

  • This command will export all roles (but not their passwords in managed services like Azure Database for PostgreSQL).
  • In cloud managed systems, you might not have the ability to extract passwords; you’ll need to set them manually on the target instance.

Step 2: Extract Role & Object Permissions

Extracting object-level permissions (like all GRANT and REVOKE statements) can be done while dumping database schema:

pg_dump -h <source_server> -U <username> -d <dbname> -s > db_schema.sql

Next, filter permission statements:

If you’re working in PowerShell (Windows), run:

pg_dump -h "psql-prod-01.postgres.database.azure.com" -U pgadmin -d "prod_db" -s | Select-String -Pattern "^(GRANT|REVOKE|ALTER DEFAULT PRIVILEGES)" -Path "C:\Path\to\db_schema.sql" | ForEach-Object { $_.Line } > C:\Path\to\permissions.sql

If you’re working on Mac/Linux, run:

pg_dump -h "psql-prod-01.postgres.database.azure.com" -U pgadmin -d "prod_db" -s | grep -E '^(GRANT|REVOKE|ALTER DEFAULT PRIVILEGES)' > perms.sql

This extracts all lines related to granting or revoking privileges and puts them in perms.sql.

Step 3: Prepare and Edit Scripts

  • Review the extracted roles.sql and `permissions.sql:
    • Remove any references to unsupported roles (like postgres superuser in cloud environments).
    • Plan to set user passwords manually if they weren’t included.

Step 4: Copy Roles and Permissions to the New Instance

  1. Recreate roles:textpsql -h <target_server> -U <admin_user> -f roles.sql
    • Remember to set or update passwords for each user after creation.
  2. Apply object-level permissions:textpsql -h <target_server> -U <admin_user> -d <target_db> -f perms.sql

Step 5: Validate Permissions

Connect as each role or user to ensure operations work as expected:

  • Use \dp tablename in psql to check table permissions.
  • Use the information_schema views (e.g., role_table_grants) to query permissions programmatically:SELECT grantee, privilege_type, table_name FROM information_schema.role_table_grants;

Decoding a Python Script: An Improv-Inspired Guide for Beginners

By Vinay Rahul Are, Python Enthusiast \& Improv Comedy Fan


Introduction

Learning Python can feel intimidating—unless you approach it with a sense of play! Just like improv comedy, Python is about saying “yes, and…” to new ideas, experimenting, and having fun. In this post, I’ll walk you through a real-world Python script, breaking down each part so you can understand, explain, and even perform it yourself!


The Script’s Purpose

The script we’ll explore automates the process of running multiple SQL files against an Amazon Redshift database. For each SQL file, it:

  • Executes the file’s SQL commands on Redshift
  • Logs how many rows were affected, how long it took, and any errors
  • Moves the file to a “Done” folder when finished

It’s a practical tool for data engineers, but the structure and logic are great for any Python beginner to learn from.


1. The “Show Description” (Docstring)

At the top, you’ll find a docstring—a big comment block that tells you what the script does, what you need to run it, and how to use it.

"""
Batch Redshift SQL Script Executor with Per-Script Logging, Timing, and Post-Execution Archiving

Pre-requisites:
---------------
1. Python 3.x installed on your machine.
2. The following Python packages must be installed:
    - psycopg2-binary
3. (Recommended) Use a virtual environment to avoid dependency conflicts.
4. Network access to your Amazon Redshift cluster.

Installation commands:
----------------------
python -m venv venv
venv\Scripts\activate        # On Windows
pip install psycopg2-binary

Purpose:
--------
This script automates the execution of multiple .sql files against an Amazon Redshift cluster...
"""

2. Importing the “Cast and Crew” (Modules)

Every show needs its cast. In Python, that means importing modules:

import os
import glob
import psycopg2
import getpass
import shutil
import time
  • os, glob, shutil: Handle files and folders
  • psycopg2: Talks to the Redshift database
  • getpass: Securely prompts for passwords
  • time: Measures how long things take

3. The “Stage Directions” (Configuration)

Before the curtain rises, set your stage:

HOST = '<redshift-endpoint>'
PORT = 5439
USER = '<your-username>'
DATABASE = '<your-database>'
SCRIPT_DIR = r'C:\redshift_scripts'
DONE_DIR = os.path.join(SCRIPT_DIR, 'Done')
  • Replace the placeholders with your actual Redshift details and script folder path.

4. The “Comedy Routine” (Function Definition)

The main function, run_sql_script, is like a well-rehearsed bit:

def run_sql_script(script_path, conn):
    log_path = os.path.splitext(script_path)[0] + '.log'
    with open(script_path, 'r', encoding='utf-8') as sql_file, open(log_path, 'w', encoding='utf-8') as log_file:
        sql = sql_file.read()
        log_file.write(f"Running script: {script_path}\n")
        start_time = time.perf_counter()
        try:
            with conn.cursor() as cur:
                cur.execute(sql)
                end_time = time.perf_counter()
                elapsed_time = end_time - start_time
                rows_affected = cur.rowcount if cur.rowcount != -1 else 'Unknown'
                log_file.write(f"Rows affected: {rows_affected}\n")
                log_file.write(f"Execution time: {elapsed_time:.2f} seconds\n")
                conn.commit()
                log_file.write("Execution successful.\n")
        except Exception as e:
            end_time = time.perf_counter()
            elapsed_time = end_time - start_time
            log_file.write(f"Error: {str(e)}\n")
            log_file.write(f"Execution time (until error): {elapsed_time:.2f} seconds\n")
            conn.rollback()
  • Reads the SQL file
  • Logs what’s happening
  • Measures execution time
  • Handles success or errors gracefully

5. The “Main Event” (main function)

This is the showrunner, making sure everything happens in order:

def main():
    password = getpass.getpass("Enter your Redshift password: ")
    if not os.path.exists(DONE_DIR):
        os.makedirs(DONE_DIR)
    sql_files = glob.glob(os.path.join(SCRIPT_DIR, '*.sql'))
    conn = psycopg2.connect(
        host=HOST,
        port=PORT,
        user=USER,
        password=password,
        dbname=DATABASE
    )
    for script_path in sql_files:
        print(f"Running {script_path} ...")
        run_sql_script(script_path, conn)
        try:
            shutil.move(script_path, DONE_DIR)
            print(f"Moved {script_path} to {DONE_DIR}")
        except Exception as move_err:
            print(f"Failed to move {script_path}: {move_err}")
    conn.close()
    print("All scripts executed.")
  • Prompts for your password (no peeking!)
  • Makes sure the “Done” folder exists
  • Finds all .sql files
  • Connects to Redshift
  • Runs each script, logs results, and moves the file when done

6. The “Curtain Call” (Script Entry Point)

This line ensures the main event only happens if you run the script directly:

if __name__ == "__main__":
    main()

7. Explaining the Script in Plain English

“This script automates running a bunch of SQL files against a Redshift database. For each file, it logs how many rows were affected, how long it took, and any errors. After running, it moves the file to a ‘Done’ folder so you know it’s finished. It’s organized with clear sections for setup, reusable functions, and the main execution flow.”


8. Why This Structure?

  • Imports first: So all your helpers are ready before the show starts.
  • Functions: Keep the code neat, reusable, and easy to understand.
  • Main block: Keeps your script from running accidentally if imported elsewhere.
  • Comments and docstrings: Make it easy for others (and future you) to understand what’s going on.

9. Final Thoughts: Python is Improv!

Just like improv, Python is best learned by doing. Try things out, make mistakes, and remember: if your code “crashes,” it’s just the computer’s way of saying, “Yes, and…let’s try that again!”

If you want to dig deeper into any part of this script, just ask in the comments below. Happy coding—and yes, and… keep learning!


How Increasing Azure PostgreSQL IOPS Supercharged Our Bulk Insert Performance

Loading millions of records into a cloud database can be a frustratingly slow task—unless you identify where your bottlenecks are. In this post, I will share how we significantly improved our insertion speeds on Azure Database for PostgreSQL Flexible Server by adjusting a single, often-overlooked setting: provisioned IOPS.


The Challenge: Slow Inserts Despite Low CPU

We were running a large data migration from Databricks to Azure Database for PostgreSQL Flexible Server. Our setup:

  • Instance: Memory Optimized, E8ds_v4 (8 vCores, 64 GiB RAM, 256 GiB Premium SSD)
  • Insert Method: 8 parallel threads from Databricks, each batching 50,000 rows

Despite this robust configuration, our insert speeds were disappointing. Monitoring showed:

  • CPU usage: ~10%
  • Disk IOPS: 100% utilization

Clearly, our CPU wasn’t the problem—disk I/O was.


The Bottleneck: Disk IOPS Saturation

Azure Database for PostgreSQL Flexible Server ties write performance directly to your provisioned IOPS (Input/Output Operations Per Second). PostgreSQL is forced to queue up write operations when your workload hits this limit, causing inserts to slow down dramatically.

Key signs you’re IOPS-bound:

  • Disk IOPS metric at or near 100%
  • Low CPU and memory utilization
  • Inserts (and possibly other write operations) are much slower than expected

The Fix: Increase Provisioned IOPS

We increased our provisioned IOPS from 1,100 to 5,000 using the Azure Portal:

  1. Go to your PostgreSQL Flexible Server in Azure.
  2. Select Compute + storage.
  3. Adjust the IOPS slider (or enter a higher value if using Premium SSD v2).
  4. Save changes—no downtime required.

Result:
Insert speeds improved immediately and dramatically. Disk performance no longer throttled the database, and we could fully utilize our CPU and memory resources.


Lessons Learned & Best Practices

  • Monitor your bottlenecks: Always check disk IOPS, CPU, and memory during heavy data loads.
  • Scale IOPS with workload: Azure lets you increase IOPS on the fly. For bulk loads, temporarily raising IOPS can save hours or days of processing time.
  • Batch and parallelize wisely: Match your parallel threads to your vCPU count, but remember that IOPS is often the true limiter for bulk writes.
  • Optimize indexes and constraints: Fewer indexes mean fewer writes per insert. Drop non-essential indexes before bulk loads and recreate them afterward.

Conclusion:
If your PostgreSQL inserts are slow on Azure, check your disk IOPS. Increasing provisioned IOPS can unlock the performance your hardware is capable of—sometimes, it’s the simplest tweak that makes the biggest difference.


pg_whoisactive

/*
================================================================================
pg_whoisactive - PostgreSQL Session Monitoring Tool
================================================================================

DESCRIPTION:
    A comprehensive PostgreSQL inspired by Adam Machanic's sp_whoisactive stored 
    procedure used on SQL server. This function provides detailed real-time information about active 
    database sessions, including:
    
    - Current executing queries and their duration
    - Session blocking and lock information  
    - I/O statistics and resource usage
    - Connection details and client information
    - Transaction status and isolation levels
    - Wait states and performance metrics

CAPABILITIES:
    ✓ Monitor active and idle database sessions
    ✓ Identify blocking chains and deadlock scenarios
    ✓ Track query execution times and resource consumption
    ✓ Display I/O statistics (reads, writes, temp usage)
    ✓ Show connection information (host, application, user)
    ✓ Filter sessions by various criteria
    ✓ Format output for easy analysis
    ✓ Help system with parameter documentation

PARAMETERS:
    show_own_spid (BOOLEAN, default: FALSE)
        - Include the current session in results
        
    show_system_spids (BOOLEAN, default: FALSE)  
        - Include system/background processes
        
    show_sleeping_spids (INTEGER, default: 1)
        - 0: Show only active sessions
        - 1: Show active + idle sessions with open transactions
        - 2: Show all sessions (including completely idle)
        
    get_full_inner_text (BOOLEAN, default: FALSE)
        - TRUE: Display complete SQL query text
        - FALSE: Truncate long queries to 100 characters
        
    get_plans (BOOLEAN, default: FALSE)
        - Reserved for future query plan integration
        
    get_locks (BOOLEAN, default: FALSE)
        - Include detailed lock information in wait_info column
        
    get_avg_time (BOOLEAN, default: FALSE)
        - Reserved for future average execution time tracking
        
    get_additional_info (BOOLEAN, default: FALSE)
        - Reserved for additional session metadata
        
    find_block_leaders (BOOLEAN, default: TRUE)
        - Identify and report blocking relationships between sessions
        
    delta_interval (INTEGER, default: 5)
        - Reserved for future delta calculations between samples
        
    format_output (INTEGER, default: 1)
        - 1: Normal output format
        - 2: Reserved for detailed format
        
    destination_table (TEXT, default: NULL)
        - Reserved for future table output functionality
        
    help (BOOLEAN, default: FALSE)
        - Display help information and parameter descriptions

OUTPUT COLUMNS:
    dd_hh_mm_ss              - Session duration (DD:HH:MM:SS or HH:MM:SS)
    session_id               - PostgreSQL process ID (PID)
    sql_text                 - Current SQL statement being executed
    sql_command              - SQL command type (SELECT, INSERT, UPDATE, etc.)
    login_name               - Database username
    wait_info                - Current wait state or blocking information
    tran_log_writes          - Transaction information and duration
    cpu_time                 - CPU time used (reserved for future use)
    tempdb_allocations       - Temporary file count
    tempdb_current           - Current temporary space usage
    blocking_session_id      - PID of session blocking this one
    blocked_session_count    - Number of sessions this one is blocking
    reads                    - Read I/O statistics
    writes                   - Write I/O statistics  
    context_switches         - Context switch information (reserved)
    physical_io              - Physical I/O metrics (reserved)
    query_cost               - Query execution cost (reserved)
    status                   - Current session status
    isolation_level          - Transaction isolation level (reserved)
    open_tran_count          - Number of open transactions
    percent_complete         - Query completion percentage (reserved)
    host_name                - Client hostname or IP address
    database_name            - Current database name
    program_name             - Client application name
    start_time               - When the current query started
    login_time               - When the session was established
    request_id               - Request identifier (uses PID)
    collection_time          - Timestamp when data was collected

USAGE EXAMPLES:
    -- Basic monitoring (most common use case)
    SELECT * FROM pg_whoisactive();
    
    -- Show all sessions including idle ones
    SELECT * FROM pg_whoisactive(show_sleeping_spids := 2);
    
    -- Focus on blocking issues
    SELECT * FROM pg_whoisactive(find_block_leaders := true, get_locks := true)
    WHERE blocking_session_id IS NOT NULL OR blocked_session_count > 0;
    
    -- Detailed analysis of active sessions
    SELECT * FROM pg_whoisactive(
        show_sleeping_spids := 0,
        get_full_inner_text := true,
        get_locks := true
    );
    
    -- Monitor specific session
    SELECT * FROM pg_whoisactive() WHERE session_id = 12345;
    
    -- Get help
    SELECT * FROM pg_whoisactive(help := true);

COMPANION FUNCTIONS:
    pg_blocking_tree() - Shows hierarchical blocking relationships

COMPATIBILITY:
    - PostgreSQL 9.6+
    - Requires pg_stat_activity and pg_locks system views
    - Works with all PostgreSQL editions

AUTHOR: Database Monitoring Tool
VERSION: 1.0
CREATED: 2024
================================================================================
*/

CREATE OR REPLACE FUNCTION public.pg_whoisactive(
    show_own_spid BOOLEAN DEFAULT FALSE,           -- Show current session
    show_system_spids BOOLEAN DEFAULT FALSE,       -- Show system processes  
    show_sleeping_spids INTEGER DEFAULT 1,         -- 0=no idle, 1=sleeping with open trans, 2=all sleeping
    get_full_inner_text BOOLEAN DEFAULT FALSE,     -- Show full query text
    get_plans BOOLEAN DEFAULT FALSE,               -- Get query plans (if available)
    get_locks BOOLEAN DEFAULT FALSE,               -- Show lock information
    get_avg_time BOOLEAN DEFAULT FALSE,            -- Show average execution times
    get_additional_info BOOLEAN DEFAULT FALSE,     -- Show additional session info
    find_block_leaders BOOLEAN DEFAULT TRUE,       -- Identify blocking chain leaders
    delta_interval INTEGER DEFAULT 5,              -- Seconds between samples for delta calculations
    format_output INTEGER DEFAULT 1,               -- 1=normal, 2=detailed
    destination_table TEXT DEFAULT NULL,           -- Future: output to table
    help BOOLEAN DEFAULT FALSE                     -- Show help information
)
RETURNS TABLE (
    dd_hh_mm_ss TEXT,                    -- Session duration
    session_id INTEGER,                  -- Process ID
    sql_text TEXT,                       -- Current SQL statement
    sql_command TEXT,                    -- Command type (SELECT, INSERT, etc.)
    login_name TEXT,                     -- Username
    wait_info TEXT,                      -- What the session is waiting for
    tran_log_writes TEXT,               -- Transaction info
    cpu_time TEXT,                      -- CPU time (if available)
    tempdb_allocations TEXT,            -- Temp space usage
    tempdb_current TEXT,                -- Current temp usage
    blocking_session_id INTEGER,        -- Session that's blocking this one
    blocked_session_count INTEGER,      -- How many sessions this one is blocking
    reads TEXT,                         -- Read statistics
    writes TEXT,                        -- Write statistics
    context_switches TEXT,              -- Context switch info
    physical_io TEXT,                   -- Physical I/O
    query_cost TEXT,                    -- Query cost (if available)
    status TEXT,                        -- Session status
    isolation_level TEXT,               -- Transaction isolation level
    open_tran_count INTEGER,            -- Number of open transactions
    percent_complete TEXT,              -- Progress (if available)
    host_name TEXT,                     -- Client hostname/IP
    database_name TEXT,                 -- Current database
    program_name TEXT,                  -- Application name
    start_time TIMESTAMP WITH TIME ZONE, -- When session started
    login_time TIMESTAMP WITH TIME ZONE, -- When user logged in
    request_id INTEGER,                  -- Request ID
    collection_time TIMESTAMP WITH TIME ZONE -- When this data was collected
) 
LANGUAGE plpgsql
AS $function$

-- pg_whoisactive - PostgreSQL equivalent of SQL Server's sp_whoisactive
-- A comprehensive session monitoring tool for PostgreSQL

DECLARE
    rec RECORD;
    blocking_info RECORD;
    blocked_count INTEGER;
    lock_info TEXT;
    wait_description TEXT;
    query_text TEXT;
    cmd_type TEXT;
    duration_formatted TEXT;
    days INTEGER;
    hours INTEGER;
    minutes INTEGER;
    seconds INTEGER;
    total_seconds NUMERIC;
    reads_value TEXT;
    writes_value TEXT;
    temp_files_value TEXT;
    temp_bytes_value TEXT;
BEGIN
    -- Show help if requested
    IF help THEN
        RETURN QUERY
        SELECT 
            'HELP'::TEXT as dd_hh_mm_ss,
            0 as session_id,
            'pg_whoisactive - PostgreSQL equivalent of sp_whoisactive'::TEXT as sql_text,
            'HELP'::TEXT as sql_command,
            'show_own_spid: Include current session (default: false)'::TEXT as login_name,
            'show_system_spids: Include system processes (default: false)'::TEXT as wait_info,
            'show_sleeping_spids: 0=no idle, 1=idle with trans, 2=all idle (default: 1)'::TEXT as tran_log_writes,
            'get_full_inner_text: Show complete SQL text (default: false)'::TEXT as cpu_time,
            'get_locks: Include lock information (default: false)'::TEXT as tempdb_allocations,
            'find_block_leaders: Identify blocking leaders (default: true)'::TEXT as tempdb_current,
            NULL::INTEGER as blocking_session_id,
            NULL::INTEGER as blocked_session_count,
            'Usage: SELECT * FROM pg_whoisactive();'::TEXT as reads,
            'Advanced: SELECT * FROM pg_whoisactive(show_sleeping_spids := 2, get_locks := true);'::TEXT as writes,
            NULL::TEXT as context_switches,
            NULL::TEXT as physical_io,
            NULL::TEXT as query_cost,
            NULL::TEXT as status,
            NULL::TEXT as isolation_level,
            NULL::INTEGER as open_tran_count,
            NULL::TEXT as percent_complete,
            NULL::TEXT as host_name,
            NULL::TEXT as database_name,
            NULL::TEXT as program_name,
            NULL::TIMESTAMP WITH TIME ZONE as start_time,
            NULL::TIMESTAMP WITH TIME ZONE as login_time,
            NULL::INTEGER as request_id,
            now() as collection_time;
        RETURN;
    END IF;

    -- Main query logic
    FOR rec IN 
        SELECT 
            a.pid,
            a.usename,
            a.application_name,
            a.client_addr,
            a.client_hostname,
            a.backend_start,
            a.xact_start,
            a.query_start,
            a.state_change,
            a.state,
            a.query,
            a.datname,
            a.backend_type,
            -- Get I/O stats if available
            COALESCE(s.blks_read, 0) as blks_read,
            COALESCE(s.blks_hit, 0) as blks_hit,
            COALESCE(s.tup_returned, 0) as tup_returned,
            COALESCE(s.tup_fetched, 0) as tup_fetched,
            COALESCE(s.tup_inserted, 0) as tup_inserted,
            COALESCE(s.tup_updated, 0) as tup_updated,
            COALESCE(s.tup_deleted, 0) as tup_deleted,
            COALESCE(s.temp_files, 0) as temp_files,
            COALESCE(s.temp_bytes, 0) as temp_bytes
        FROM pg_stat_activity a
        LEFT JOIN pg_stat_database s ON s.datname = a.datname
        WHERE 
            -- Filter current session
            (show_own_spid = TRUE OR a.pid <> pg_backend_pid())
            -- Filter system processes  
            AND (show_system_spids = TRUE OR (a.datname IS NOT NULL AND a.backend_type = 'client backend'))
            -- Filter idle connections based on show_sleeping_spids parameter
            AND (
                show_sleeping_spids = 2 OR  -- Show all
                (show_sleeping_spids = 1 AND (a.state <> 'idle' OR a.xact_start IS NOT NULL)) OR  -- Show active or idle in transaction
                (show_sleeping_spids = 0 AND a.state = 'active')  -- Show only active
            )
        ORDER BY 
            CASE WHEN a.state = 'active' THEN 1 ELSE 2 END,  -- Active sessions first
            a.query_start DESC NULLS LAST
    LOOP
        -- Calculate and format session duration
        total_seconds := EXTRACT(EPOCH FROM (COALESCE(now() - rec.backend_start, INTERVAL '0')));
        days := FLOOR(total_seconds / 86400);
        hours := FLOOR((total_seconds % 86400) / 3600);
        minutes := FLOOR((total_seconds % 3600) / 60);
        seconds := FLOOR(total_seconds % 60);
        
        IF days > 0 THEN
            duration_formatted := LPAD(days::TEXT, 2, '0') || ':' || LPAD(hours::TEXT, 2, '0') || ':' || LPAD(minutes::TEXT, 2, '0') || ':' || LPAD(seconds::TEXT, 2, '0');
        ELSE
            duration_formatted := LPAD(hours::TEXT, 2, '0') || ':' || LPAD(minutes::TEXT, 2, '0') || ':' || LPAD(seconds::TEXT, 2, '0');
        END IF;

        -- Process SQL text
        IF get_full_inner_text THEN
            query_text := rec.query;
        ELSE
            query_text := CASE 
                WHEN rec.query IS NULL THEN NULL
                WHEN LENGTH(rec.query) > 100 THEN LEFT(rec.query, 97) || '...'
                ELSE rec.query 
            END;
        END IF;

        -- Extract command type
        IF rec.query IS NOT NULL AND TRIM(rec.query) <> '' THEN
            cmd_type := UPPER(SPLIT_PART(TRIM(rec.query), ' ', 1));
            IF cmd_type = '' THEN cmd_type := NULL; END IF;
        ELSE
            cmd_type := NULL;
        END IF;

        -- Determine wait info and status
        wait_description := CASE 
            WHEN rec.state = 'active' THEN 'RUNNABLE'
            WHEN rec.state = 'idle' THEN 'IDLE'
            WHEN rec.state = 'idle in transaction' THEN 'IDLE IN TRANSACTION'
            WHEN rec.state = 'idle in transaction (aborted)' THEN 'IDLE IN TRANSACTION (ABORTED)'
            WHEN rec.state = 'fastpath function call' THEN 'FASTPATH FUNCTION CALL'
            WHEN rec.state = 'disabled' THEN 'DISABLED'
            ELSE UPPER(rec.state)
        END;

        -- Find blocking information if requested
        IF find_block_leaders THEN
            -- Check if this session is blocked
            SELECT b.blocking_pid INTO blocking_info
            FROM (
                SELECT 
                    blocked_locks.pid as blocked_pid,
                    blocking_locks.pid as blocking_pid
                FROM pg_catalog.pg_locks blocked_locks
                    JOIN pg_catalog.pg_locks blocking_locks ON (
                        blocking_locks.locktype = blocked_locks.locktype
                        AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
                        AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
                        AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
                        AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
                        AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
                        AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
                        AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
                        AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
                        AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
                        AND blocking_locks.virtualtransaction IS NOT DISTINCT FROM blocked_locks.virtualtransaction
                    )
                WHERE NOT blocked_locks.granted 
                    AND blocking_locks.granted
                    AND blocked_locks.pid <> blocking_locks.pid
            ) b
            WHERE b.blocked_pid = rec.pid;

            -- Count how many sessions this one is blocking
            SELECT COUNT(*) INTO blocked_count
            FROM (
                SELECT 
                    blocked_locks.pid as blocked_pid,
                    blocking_locks.pid as blocking_pid
                FROM pg_catalog.pg_locks blocked_locks
                    JOIN pg_catalog.pg_locks blocking_locks ON (
                        blocking_locks.locktype = blocked_locks.locktype
                        AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
                        AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
                        AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
                        AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
                        AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
                        AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
                        AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
                        AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
                        AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
                        AND blocking_locks.virtualtransaction IS NOT DISTINCT FROM blocked_locks.virtualtransaction
                    )
                WHERE NOT blocked_locks.granted 
                    AND blocking_locks.granted
                    AND blocked_locks.pid <> blocking_locks.pid
            ) b
            WHERE b.blocking_pid = rec.pid;
        ELSE
            blocking_info := NULL;
            blocked_count := 0;
        END IF;

        -- Get lock information if requested
        IF get_locks THEN
            SELECT STRING_AGG(
                l.locktype || ':' || l.mode || 
                CASE WHEN l.granted THEN ':GRANTED' ELSE ':WAITING' END, 
                ', '
            ) INTO lock_info
            FROM pg_locks l
            WHERE l.pid = rec.pid;
        ELSE
            lock_info := NULL;
        END IF;

        -- Format read/write statistics
        reads_value := CASE 
            WHEN rec.blks_read > 0 OR rec.blks_hit > 0 THEN 
                'Logical: ' || (rec.blks_read + rec.blks_hit)::TEXT || ', Physical: ' || rec.blks_read::TEXT
            ELSE NULL 
        END;

        writes_value := CASE 
            WHEN (rec.tup_inserted + rec.tup_updated + rec.tup_deleted) > 0 THEN 
                'Rows: ' || (rec.tup_inserted + rec.tup_updated + rec.tup_deleted)::TEXT
            ELSE NULL 
        END;

        -- Format temp usage
        temp_files_value := CASE 
            WHEN rec.temp_files > 0 THEN rec.temp_files::TEXT || ' files'
            ELSE NULL 
        END;

        temp_bytes_value := CASE 
            WHEN rec.temp_bytes > 0 THEN pg_size_pretty(rec.temp_bytes)
            ELSE NULL 
        END;

        -- Return the row
        RETURN QUERY
        SELECT 
            duration_formatted,                                         -- dd_hh_mm_ss
            rec.pid,                                                   -- session_id
            query_text,                                                -- sql_text
            cmd_type,                                                  -- sql_command
            rec.usename::TEXT,                                         -- login_name
            CASE 
                WHEN blocking_info.blocking_pid IS NOT NULL THEN 
                    'BLOCKED BY: ' || blocking_info.blocking_pid::TEXT
                WHEN get_locks AND lock_info IS NOT NULL THEN lock_info
                ELSE wait_description 
            END,                                                       -- wait_info
            CASE 
                WHEN rec.xact_start IS NOT NULL THEN 
                    'Open: ' || EXTRACT(EPOCH FROM (now() - rec.xact_start))::INTEGER::TEXT || 's'
                ELSE NULL 
            END,                                                       -- tran_log_writes
            NULL::TEXT,                                                -- cpu_time (not available in PostgreSQL)
            temp_files_value,                                          -- tempdb_allocations
            temp_bytes_value,                                          -- tempdb_current
            blocking_info.blocking_pid,                                -- blocking_session_id
            CASE WHEN blocked_count > 0 THEN blocked_count ELSE NULL END, -- blocked_session_count
            reads_value,                                               -- reads
            writes_value,                                              -- writes
            NULL::TEXT,                                                -- context_switches (not available)
            NULL::TEXT,                                                -- physical_io
            NULL::TEXT,                                                -- query_cost (would need EXPLAIN)
            wait_description,                                          -- status
            NULL::TEXT,                                                -- isolation_level (would need transaction info)
            CASE WHEN rec.xact_start IS NOT NULL THEN 1 ELSE 0 END,   -- open_tran_count
            NULL::TEXT,                                                -- percent_complete (not available)
            COALESCE(rec.client_hostname, rec.client_addr::TEXT, 'local'), -- host_name
            rec.datname::TEXT,                                         -- database_name
            COALESCE(rec.application_name::TEXT, 'unknown'),           -- program_name
            rec.query_start,                                           -- start_time
            rec.backend_start,                                         -- login_time
            rec.pid,                                                   -- request_id (using pid)
            now();                                                     -- collection_time
    END LOOP;
END;
$function$;

-- Create a helper function for quick blocking analysis
CREATE OR REPLACE FUNCTION pg_blocking_tree()
RETURNS TABLE (
    level INTEGER,
    blocking_pid INTEGER,
    blocked_pid INTEGER,
    blocked_user TEXT,
    blocked_query TEXT,
    blocked_duration TEXT
)
LANGUAGE sql
AS $$
/*
================================================================================
pg_blocking_tree - Hierarchical Blocking Chain Analysis
================================================================================

DESCRIPTION:
    Displays blocking relationships in a hierarchical tree format, showing
    which sessions are blocking others and the chain of dependencies.
    
RETURNS:
    level               - Depth in the blocking hierarchy
    blocking_pid        - Session causing the block
    blocked_pid         - Session being blocked
    blocked_user        - Username of blocked session
    blocked_query       - SQL being blocked
    blocked_duration    - How long the session has been blocked

USAGE:
    SELECT * FROM pg_blocking_tree();
================================================================================
*/
    WITH RECURSIVE blocking_tree AS (
        -- Find the root blockers (sessions that block others but aren't blocked themselves)
        SELECT 
            1 as level,
            blocking_locks.pid as blocking_pid,
            blocked_locks.pid as blocked_pid,
            blocked_activity.usename as blocked_user,
            blocked_activity.query as blocked_query,
            EXTRACT(EPOCH FROM (now() - blocked_activity.query_start))::INTEGER::TEXT || 's' as blocked_duration
        FROM pg_catalog.pg_locks blocked_locks
            JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
            JOIN pg_catalog.pg_locks blocking_locks ON (
                blocking_locks.locktype = blocked_locks.locktype
                AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
                AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
                AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
                AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
                AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
                AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
                AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
                AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
                AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
                AND blocking_locks.virtualtransaction IS NOT DISTINCT FROM blocked_locks.virtualtransaction
            )
        WHERE NOT blocked_locks.granted 
            AND blocking_locks.granted
            AND blocked_locks.pid <> blocking_locks.pid
    )
    SELECT * FROM blocking_tree
    ORDER BY level, blocking_pid, blocked_pid;
$$;

/*
================================================================================
USAGE EXAMPLES AND REFERENCE
================================================================================

-- BASIC MONITORING
--------------------------------------------------------------------------------
-- View all active sessions (most common usage)
SELECT * FROM pg_whoisactive();

-- Show only currently executing queries
SELECT * FROM pg_whoisactive(show_sleeping_spids := 0);

-- Include all sessions (even completely idle ones)
SELECT * FROM pg_whoisactive(show_sleeping_spids := 2);


-- BLOCKING ANALYSIS
--------------------------------------------------------------------------------
-- Quick check for blocking issues
SELECT * FROM pg_whoisactive(find_block_leaders := true) 
WHERE blocking_session_id IS NOT NULL OR blocked_session_count > 0;

-- Detailed blocking analysis with lock information
SELECT session_id, login_name, sql_text, wait_info, blocking_session_id, blocked_session_count
FROM pg_whoisactive(get_locks := true, find_block_leaders := true)
WHERE blocking_session_id IS NOT NULL OR blocked_session_count > 0;

-- Show complete blocking hierarchy
SELECT * FROM pg_blocking_tree();


-- DETAILED SESSION ANALYSIS  
--------------------------------------------------------------------------------
-- Full query text and detailed information
SELECT * FROM pg_whoisactive(
    get_full_inner_text := true,
    get_locks := true,
    find_block_leaders := true
);

-- Monitor specific session by PID
SELECT * FROM pg_whoisactive() WHERE session_id = 12345;

-- Long-running transactions
SELECT session_id, login_name, sql_command, tran_log_writes, dd_hh_mm_ss
FROM pg_whoisactive()
WHERE tran_log_writes IS NOT NULL
ORDER BY dd_hh_mm_ss DESC;


-- PERFORMANCE MONITORING
--------------------------------------------------------------------------------
-- Sessions with high I/O activity
SELECT session_id, login_name, sql_command, reads, writes, tempdb_current
FROM pg_whoisactive()
WHERE reads IS NOT NULL OR writes IS NOT NULL OR tempdb_current IS NOT NULL;

-- Currently active sessions sorted by duration
SELECT session_id, dd_hh_mm_ss, login_name, sql_text, status
FROM pg_whoisactive(show_sleeping_spids := 0)
ORDER BY dd_hh_mm_ss DESC;


-- TROUBLESHOOTING QUERIES
--------------------------------------------------------------------------------
-- Find sessions from specific application
SELECT * FROM pg_whoisactive() WHERE program_name LIKE '%myapp%';

-- Find sessions from specific user
SELECT * FROM pg_whoisactive() WHERE login_name = 'problem_user';

-- Find sessions connected from specific host
SELECT * FROM pg_whoisactive() WHERE host_name LIKE '%problemhost%';

-- Sessions in specific database
SELECT * FROM pg_whoisactive() WHERE database_name = 'production_db';


-- ADMINISTRATIVE TASKS
--------------------------------------------------------------------------------
-- Help and parameter reference
SELECT * FROM pg_whoisactive(help := true);

-- Include system processes for maintenance operations
SELECT * FROM pg_whoisactive(show_system_spids := true);

-- Monitor your own session activity
SELECT * FROM pg_whoisactive(show_own_spid := true) WHERE session_id = pg_backend_pid();

================================================================================
*/

ChatGPT prompt to create How-To Guide Builder

This prompt assists in creating a complete how-to guide for any topic, specifically tailored to the target audience’s skill level (beginner, intermediate, or advanced) and the desired content format (blog post, video script, infographic, etc.).

<System>
You are an expert technical writer, educator, and SEO strategist. Your job is to generate a full, structured, and professional how-to guide based on user inputs: TOPIC, SKILLLEVEL, and FORMAT. Tailor your output to match the intended audience and content style.
</System>

<Context>
The user wants to create an informative how-to guide that provides step-by-step instructions, insights, FAQs, and more for a specific topic. The guide should be educational, comprehensive, and approachable for the target skill level and content format.
</Context>

<Instructions>
1. Begin by identifying the TOPIC, SKILLLEVEL, and FORMAT provided.
2. Research and list the 5-10 most common pain points, questions, or challenges learners face related to TOPIC.
3. Create a 5-7 section outline breaking down the how-to process of TOPIC. Match complexity to SKILLLEVEL.
4. Write an engaging introduction:
   - Explain why TOPIC is important or beneficial.
   - Clarify what the reader will achieve or understand by the end.
5. For each main section:
   - Explain what needs to be done.
   - Mention any warnings or prep steps.
   - Share 2-3 best practices or helpful tips.
   - Recommend tools or resources if relevant.
6. Add a troubleshooting section with common mistakes and how to fix them.
7. Include a “Frequently Asked Questions” section with concise answers.
8. Add a “Next Steps” or “Advanced Techniques” section for progressing beyond basics.
9. If technical terms exist, include a glossary with beginner-friendly definitions.
10. Based on FORMAT, suggest visuals (e.g. screenshots, diagrams, timestamps) to support content delivery.
11. End with a conclusion summarizing the key points and motivating the reader to act.
12. Format the final piece according to FORMAT (blog post, video script, infographic layout, etc.), and include a table of contents if length exceeds 1,000 words.
</Instructions>

<Constrains>
- Stay within the bounds of the SKILLLEVEL.
- Maintain a tone and structure appropriate to FORMAT.
- Be practical, user-friendly, and professional.
- Avoid jargon unless explained in glossary.
</Constrains>

<Output Format>
Deliver the how-to guide as a completed piece matching FORMAT, with all structural sections in place.
</Output Format>

<Reasoning>
Apply Theory of Mind to analyze the user's request, considering both logical intent and emotional undertones. Use Strategic Chain-of-Thought and System 2 Thinking to provide evidence-based, nuanced responses that balance depth with clarity. 
</Reasoning>
<User Input>
Reply with: "Please enter your {prompt subject} request and I will start the process," then wait for the user to provide their specific {prompt subject}  process request.
</User Input>

Prompt Use Case:

A database engineer wants to create a runbook to troubleshoot MySQL replication issues.