How to Extract and Replicate PostgreSQL Permissions When Migrating to a New Instance

Migrating a PostgreSQL database isn’t just about moving data—getting the right roles and permissions in place is critical for security and proper application function. This post demonstrates how to extract roles and permissions from your source instance and apply them to a new PostgreSQL environment.

Understanding Permissions in PostgreSQL

PostgreSQL controls access via roles. A role can represent either a database user or a group, and privileges on objects (databases, schemas, tables, etc.) are granted to and revoked from roles with the GRANT and REVOKE commands. These permissions can be viewed and managed at several levels:

  • Database: Control who can connect
  • Schema: Control access to groups of tables, functions, etc.
  • Objects: Control what actions (SELECT, INSERT, UPDATE, etc.) users can perform on tables, functions, or sequences.
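
For example, privileges at each level are managed with GRANT and REVOKE statements like the following (the role, database, schema, and table names here are hypothetical):

GRANT CONNECT ON DATABASE prod_db TO app_reader;               -- database level
GRANT USAGE ON SCHEMA reporting TO app_reader;                 -- schema level
GRANT SELECT, INSERT ON TABLE reporting.orders TO app_reader;  -- object level
REVOKE INSERT ON TABLE reporting.orders FROM app_reader;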

You can view permissions using PostgreSQL’s psql meta-commands:

  • \l+ — Show database privileges
  • \dn+ — Show schema privileges
  • \dp — Show table, view, and sequence privileges

Step 1: Extract Roles from the Source Instance

To extract roles (users and groups) from your current PostgreSQL server:

pg_dumpall --roles-only > roles.sql

Note:

  • This command exports all roles with their attributes and memberships, but not their passwords when the source is a managed service like Azure Database for PostgreSQL.
  • In cloud-managed services you typically can't extract password hashes (and you may need to add --no-role-passwords if the dump is blocked from reading pg_authid); plan to set passwords manually on the target instance.
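
For reference, the dump is plain SQL; a roles.sql produced this way typically contains statements along these lines (the role names here are hypothetical):

CREATE ROLE app_user;
ALTER ROLE app_user WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN NOREPLICATION;
GRANT app_readers TO app_user;   -- role membership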

Step 2: Extract Role & Object Permissions

Object-level permissions (the GRANT, REVOKE, and ALTER DEFAULT PRIVILEGES statements) can be extracted while dumping the database schema:

pg_dump -h <source_server> -U <username> -d <dbname> -s > db_schema.sql

Next, filter permission statements:

If you’re working in PowerShell (Windows), run:

pg_dump -h "psql-prod-01.postgres.database.azure.com" -U pgadmin -d "prod_db" -s | Select-String -Pattern "^(GRANT|REVOKE|ALTER DEFAULT PRIVILEGES)" -Path "C:\Path\to\db_schema.sql" | ForEach-Object { $_.Line } > C:\Path\to\permissions.sql

If you’re working on Mac/Linux, run:

pg_dump -h "psql-prod-01.postgres.database.azure.com" -U pgadmin -d "prod_db" -s | grep -E '^(GRANT|REVOKE|ALTER DEFAULT PRIVILEGES)' > perms.sql

This extracts all lines related to granting or revoking privileges and puts them in perms.sql.

Step 3: Prepare and Edit Scripts

  • Review the extracted roles.sql and perms.sql files:
    • Remove any references to roles you can't create on the target (such as the postgres superuser in cloud environments).
    • Plan to set user passwords manually if they weren't included in the dump.

Step 4: Copy Roles and Permissions to the New Instance

  1. Recreate roles:
psql -h <target_server> -U <admin_user> -f roles.sql
    • Remember to set or update passwords for each user after creation (see the example below).
  2. Apply object-level permissions:
psql -h <target_server> -U <admin_user> -d <target_db> -f perms.sql
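
Since passwords aren't carried over from a managed source, set them on the target with ALTER ROLE; a minimal example (the role name is hypothetical):

ALTER ROLE app_user WITH LOGIN PASSWORD 'choose-a-strong-password';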

Step 5: Validate Permissions

Connect as each role or user to ensure operations work as expected:

  • Use \dp tablename in psql to check table permissions.
  • Use the information_schema views (e.g., role_table_grants) to query permissions programmatically:
SELECT grantee, privilege_type, table_name FROM information_schema.role_table_grants;
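
You can also spot-check a single role against a single table with PostgreSQL's built-in privilege functions; a quick sketch (role and table names are hypothetical):

SELECT has_table_privilege('app_reader', 'reporting.orders', 'SELECT');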

How I used ChatGPT to prep me for AI-900 certification

Prompt:

You are my personal tutor for the topic AI-900, Azure AI fundamentals. Please help me learn the topic by having a learning dialogue with me. Guide me through the content, create examples, ask questions and let me work out the concepts. Only if I cannot answer your questions and analogies help me with hints and more guidance.

That’s it, that’s all the post.

Decoding a Python Script: An Improv-Inspired Guide for Beginners

By Vinay Rahul Are, Python Enthusiast & Improv Comedy Fan


Introduction

Learning Python can feel intimidating—unless you approach it with a sense of play! Just like improv comedy, Python is about saying “yes, and…” to new ideas, experimenting, and having fun. In this post, I’ll walk you through a real-world Python script, breaking down each part so you can understand, explain, and even perform it yourself!


The Script’s Purpose

The script we’ll explore automates the process of running multiple SQL files against an Amazon Redshift database. For each SQL file, it:

  • Executes the file’s SQL commands on Redshift
  • Logs how many rows were affected, how long it took, and any errors
  • Moves the file to a “Done” folder when finished

It’s a practical tool for data engineers, but the structure and logic are great for any Python beginner to learn from.


1. The “Show Description” (Docstring)

At the top, you’ll find a docstring—a big comment block that tells you what the script does, what you need to run it, and how to use it.

"""
Batch Redshift SQL Script Executor with Per-Script Logging, Timing, and Post-Execution Archiving

Pre-requisites:
---------------
1. Python 3.x installed on your machine.
2. The following Python packages must be installed:
    - psycopg2-binary
3. (Recommended) Use a virtual environment to avoid dependency conflicts.
4. Network access to your Amazon Redshift cluster.

Installation commands:
----------------------
python -m venv venv
venv\Scripts\activate        # On Windows
pip install psycopg2-binary

Purpose:
--------
This script automates the execution of multiple .sql files against an Amazon Redshift cluster...
"""

2. Importing the “Cast and Crew” (Modules)

Every show needs its cast. In Python, that means importing modules:

import os
import glob
import psycopg2
import getpass
import shutil
import time
  • os, glob, shutil: Handle files and folders
  • psycopg2: Talks to the Redshift database
  • getpass: Securely prompts for passwords
  • time: Measures how long things take

3. The “Stage Directions” (Configuration)

Before the curtain rises, set your stage:

HOST = '<redshift-endpoint>'
PORT = 5439
USER = '<your-username>'
DATABASE = '<your-database>'
SCRIPT_DIR = r'C:\redshift_scripts'
DONE_DIR = os.path.join(SCRIPT_DIR, 'Done')
  • Replace the placeholders with your actual Redshift details and script folder path.

4. The “Comedy Routine” (Function Definition)

The main function, run_sql_script, is like a well-rehearsed bit:

def run_sql_script(script_path, conn):
    log_path = os.path.splitext(script_path)[0] + '.log'
    with open(script_path, 'r', encoding='utf-8') as sql_file, open(log_path, 'w', encoding='utf-8') as log_file:
        sql = sql_file.read()
        log_file.write(f"Running script: {script_path}\n")
        start_time = time.perf_counter()
        try:
            with conn.cursor() as cur:
                cur.execute(sql)
                end_time = time.perf_counter()
                elapsed_time = end_time - start_time
                rows_affected = cur.rowcount if cur.rowcount != -1 else 'Unknown'
                log_file.write(f"Rows affected: {rows_affected}\n")
                log_file.write(f"Execution time: {elapsed_time:.2f} seconds\n")
                conn.commit()
                log_file.write("Execution successful.\n")
        except Exception as e:
            end_time = time.perf_counter()
            elapsed_time = end_time - start_time
            log_file.write(f"Error: {str(e)}\n")
            log_file.write(f"Execution time (until error): {elapsed_time:.2f} seconds\n")
            conn.rollback()
  • Reads the SQL file
  • Logs what’s happening
  • Measures execution time
  • Handles success or errors gracefully

5. The “Main Event” (main function)

This is the showrunner, making sure everything happens in order:

def main():
    password = getpass.getpass("Enter your Redshift password: ")
    if not os.path.exists(DONE_DIR):
        os.makedirs(DONE_DIR)
    sql_files = glob.glob(os.path.join(SCRIPT_DIR, '*.sql'))
    conn = psycopg2.connect(
        host=HOST,
        port=PORT,
        user=USER,
        password=password,
        dbname=DATABASE
    )
    for script_path in sql_files:
        print(f"Running {script_path} ...")
        run_sql_script(script_path, conn)
        try:
            shutil.move(script_path, DONE_DIR)
            print(f"Moved {script_path} to {DONE_DIR}")
        except Exception as move_err:
            print(f"Failed to move {script_path}: {move_err}")
    conn.close()
    print("All scripts executed.")
  • Prompts for your password (no peeking!)
  • Makes sure the “Done” folder exists
  • Finds all .sql files
  • Connects to Redshift
  • Runs each script, logs results, and moves the file when done

6. The “Curtain Call” (Script Entry Point)

This line ensures the main event only happens if you run the script directly:

if __name__ == "__main__":
    main()

7. Explaining the Script in Plain English

“This script automates running a bunch of SQL files against a Redshift database. For each file, it logs how many rows were affected, how long it took, and any errors. After running, it moves the file to a ‘Done’ folder so you know it’s finished. It’s organized with clear sections for setup, reusable functions, and the main execution flow.”


8. Why This Structure?

  • Imports first: So all your helpers are ready before the show starts.
  • Functions: Keep the code neat, reusable, and easy to understand.
  • Main block: Keeps your script from running accidentally if imported elsewhere.
  • Comments and docstrings: Make it easy for others (and future you) to understand what’s going on.

9. Final Thoughts: Python is Improv!

Just like improv, Python is best learned by doing. Try things out, make mistakes, and remember: if your code “crashes,” it’s just the computer’s way of saying, “Yes, and…let’s try that again!”

If you want to dig deeper into any part of this script, just ask in the comments below. Happy coding—and yes, and… keep learning!


Granting dbo Access to a User on All SQL Server Databases with dbatools

Need to give a user full control (dbo/db_owner) across every database in your SQL Server? Here’s how you can do it quickly using PowerShell and dbatools.


Why db_owner?
Adding a user to the db_owner role in each database gives them broad permissions to manage all aspects of those databases—ideal for trusted developers or DBAs in non-production environments.


Quick Steps with dbatools

  1. Make sure the login exists at the server level:
New-DbaLogin -SqlInstance "YourInstance" -Login "YourUser"
  2. Loop through all databases and assign db_owner:
$instance = "YourInstance"
$login = "YourUser"

Get-DbaDatabase -SqlInstance $instance | Where-Object { -not $_.IsSystemObject } | ForEach-Object {
    New-DbaDbUser -SqlInstance $instance -Database $_.Name -Login $login -User $login -Force
    Add-DbaDbRoleMember -SqlInstance $instance -Database $_.Name -Role "db_owner" -User $login
}
  • This script creates the user in each database (if needed) and adds them to the db_owner role.

T-SQL Alternative

You can also use T-SQL:

USE master;
GO

DECLARE @DatabaseName NVARCHAR(128)
DECLARE @SQL NVARCHAR(MAX)
DECLARE @User NVARCHAR(128)
SET @User = 'YourUser'

DECLARE db_cursor CURSOR FOR
SELECT name FROM sys.databases WHERE database_id > 4 -- Exclude system DBs

OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @DatabaseName
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = 'USE [' + @DatabaseName + ']; ' +
               'IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = N''' + @User + ''') ' +
               'CREATE USER [' + @User + '] FOR LOGIN [' + @User + ']; ' +
               'EXEC sp_addrolemember N''db_owner'', [' + @User + '];'
    EXEC sp_executesql @SQL
    FETCH NEXT FROM db_cursor INTO @DatabaseName
END
CLOSE db_cursor
DEALLOCATE db_cursor
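
To confirm the grants landed, you can list the members of db_owner in each database; here's a minimal check for a single database (the database name is a placeholder):

USE [YourDatabase];
GO
SELECT rp.name AS role_name, mp.name AS member_name
FROM sys.database_role_members AS drm
JOIN sys.database_principals AS rp ON rp.principal_id = drm.role_principal_id
JOIN sys.database_principals AS mp ON mp.principal_id = drm.member_principal_id
WHERE rp.name = N'db_owner';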

How Increasing Azure PostgreSQL IOPS Supercharged Our Bulk Insert Performance

Loading millions of records into a cloud database can be a frustratingly slow task—unless you identify where your bottlenecks are. In this post, I will share how we significantly improved our insertion speeds on Azure Database for PostgreSQL Flexible Server by adjusting a single, often-overlooked setting: provisioned IOPS.


The Challenge: Slow Inserts Despite Low CPU

We were running a large data migration from Databricks to Azure Database for PostgreSQL Flexible Server. Our setup:

  • Instance: Memory Optimized, E8ds_v4 (8 vCores, 64 GiB RAM, 256 GiB Premium SSD)
  • Insert Method: 8 parallel threads from Databricks, each batching 50,000 rows

Despite this robust configuration, our insert speeds were disappointing. Monitoring showed:

  • CPU usage: ~10%
  • Disk IOPS: 100% utilization

Clearly, our CPU wasn’t the problem—disk I/O was.


The Bottleneck: Disk IOPS Saturation

Azure Database for PostgreSQL Flexible Server ties write performance directly to your provisioned IOPS (Input/Output Operations Per Second). PostgreSQL is forced to queue up write operations when your workload hits this limit, causing inserts to slow down dramatically.

Key signs you’re IOPS-bound:

  • Disk IOPS metric at or near 100%
  • Low CPU and memory utilization
  • Inserts (and possibly other write operations) are much slower than expected
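
If you want to confirm this from inside PostgreSQL rather than the Azure metrics blade, pg_stat_database exposes cumulative I/O wait times (in milliseconds) when track_io_timing is enabled; a quick sketch, with a placeholder database name:

SELECT datname,
       blk_read_time  AS read_wait_ms,
       blk_write_time AS write_wait_ms
FROM pg_stat_database
WHERE datname = 'prod_db';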

The Fix: Increase Provisioned IOPS

We increased our provisioned IOPS from 1,100 to 5,000 using the Azure Portal:

  1. Go to your PostgreSQL Flexible Server in Azure.
  2. Select Compute + storage.
  3. Adjust the IOPS slider (or enter a higher value if using Premium SSD v2).
  4. Save changes—no downtime required.

Result:
Insert speeds improved immediately and dramatically. Disk performance no longer throttled the database, and we could fully utilize our CPU and memory resources.


Lessons Learned & Best Practices

  • Monitor your bottlenecks: Always check disk IOPS, CPU, and memory during heavy data loads.
  • Scale IOPS with workload: Azure lets you increase IOPS on the fly. For bulk loads, temporarily raising IOPS can save hours or days of processing time.
  • Batch and parallelize wisely: Match your parallel threads to your vCPU count, but remember that IOPS is often the true limiter for bulk writes.
  • Optimize indexes and constraints: Fewer indexes mean fewer writes per insert. Drop non-essential indexes before bulk loads and recreate them afterward.
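
For example, a bulk load can be wrapped like this, dropping a non-essential index first and rebuilding it afterward (table and index names are hypothetical):

-- before the bulk load
DROP INDEX IF EXISTS idx_orders_customer_id;

-- ... run the bulk insert from Databricks ...

-- after the bulk load
CREATE INDEX idx_orders_customer_id ON orders (customer_id);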

Conclusion:
If your PostgreSQL inserts are slow on Azure, check your disk IOPS. Increasing provisioned IOPS can unlock the performance your hardware is capable of—sometimes, it’s the simplest tweak that makes the biggest difference.