Decoding a Python Script: An Improv-Inspired Guide for Beginners

By Vinay Rahul Are, Python Enthusiast & Improv Comedy Fan


Introduction

Learning Python can feel intimidating—unless you approach it with a sense of play! Just like improv comedy, Python is about saying “yes, and…” to new ideas, experimenting, and having fun. In this post, I’ll walk you through a real-world Python script, breaking down each part so you can understand, explain, and even perform it yourself!


The Script’s Purpose

The script we’ll explore automates the process of running multiple SQL files against an Amazon Redshift database. For each SQL file, it:

  • Executes the file’s SQL commands on Redshift
  • Logs how many rows were affected, how long it took, and any errors
  • Moves the file to a “Done” folder when finished

It’s a practical tool for data engineers, but the structure and logic are great for any Python beginner to learn from.


1. The “Show Description” (Docstring)

At the top, you’ll find a docstring—a big comment block that tells you what the script does, what you need to run it, and how to use it.

"""
Batch Redshift SQL Script Executor with Per-Script Logging, Timing, and Post-Execution Archiving

Pre-requisites:
---------------
1. Python 3.x installed on your machine.
2. The following Python packages must be installed:
    - psycopg2-binary
3. (Recommended) Use a virtual environment to avoid dependency conflicts.
4. Network access to your Amazon Redshift cluster.

Installation commands:
----------------------
python -m venv venv
venv\Scripts\activate        # On Windows
source venv/bin/activate     # On macOS/Linux
pip install psycopg2-binary

Purpose:
--------
This script automates the execution of multiple .sql files against an Amazon Redshift cluster...
"""

2. Importing the “Cast and Crew” (Modules)

Every show needs its cast. In Python, that means importing modules:

import os
import glob
import psycopg2
import getpass
import shutil
import time

  • os, glob, shutil: Handle files and folders
  • psycopg2: Talks to the Redshift database
  • getpass: Securely prompts for passwords
  • time: Measures how long things take
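
A quick aside on that last cast member: the script uses time.perf_counter(), a high-resolution clock made for measuring elapsed time (unlike time.time(), it won't jump if the system clock changes). Here's the same timing pattern in isolation, with a cheap computation standing in for a long-running query:

```python
import time

start = time.perf_counter()
total = sum(range(1_000_000))  # stand-in for a long-running query
elapsed = time.perf_counter() - start

print(f"Computed {total} in {elapsed:.4f} seconds")
```

The start/stop bookkeeping is exactly what run_sql_script does around cur.execute().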

3. The “Stage Directions” (Configuration)

Before the curtain rises, set your stage:

HOST = '<redshift-endpoint>'
PORT = 5439
USER = '<your-username>'
DATABASE = '<your-database>'
SCRIPT_DIR = r'C:\redshift_scripts'
DONE_DIR = os.path.join(SCRIPT_DIR, 'Done')

  • Replace the placeholders with your actual Redshift details and script folder path.
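
If you'd rather not edit the source every time the stage changes, one common alternative (a sketch, not part of the original script; the environment variable names here are mine) is to read the settings from environment variables with placeholder fallbacks:

```python
import os

# Hypothetical variable names -- pick whatever convention your team uses.
HOST = os.environ.get("REDSHIFT_HOST", "<redshift-endpoint>")
PORT = int(os.environ.get("REDSHIFT_PORT", "5439"))
USER = os.environ.get("REDSHIFT_USER", "<your-username>")
DATABASE = os.environ.get("REDSHIFT_DATABASE", "<your-database>")
SCRIPT_DIR = os.environ.get("REDSHIFT_SCRIPT_DIR", r"C:\redshift_scripts")
DONE_DIR = os.path.join(SCRIPT_DIR, "Done")
```

This keeps credentials and machine-specific paths out of version control.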

4. The “Comedy Routine” (Function Definition)

The main function, run_sql_script, is like a well-rehearsed bit:

def run_sql_script(script_path, conn):
    log_path = os.path.splitext(script_path)[0] + '.log'
    with open(script_path, 'r', encoding='utf-8') as sql_file, open(log_path, 'w', encoding='utf-8') as log_file:
        sql = sql_file.read()
        log_file.write(f"Running script: {script_path}\n")
        start_time = time.perf_counter()
        try:
            with conn.cursor() as cur:
                cur.execute(sql)
                end_time = time.perf_counter()
                elapsed_time = end_time - start_time
                rows_affected = cur.rowcount if cur.rowcount != -1 else 'Unknown'
                log_file.write(f"Rows affected: {rows_affected}\n")
                log_file.write(f"Execution time: {elapsed_time:.2f} seconds\n")
                conn.commit()
                log_file.write("Execution successful.\n")
        except Exception as e:
            end_time = time.perf_counter()
            elapsed_time = end_time - start_time
            log_file.write(f"Error: {str(e)}\n")
            log_file.write(f"Execution time (until error): {elapsed_time:.2f} seconds\n")
            conn.rollback()

  • Reads the SQL file
  • Logs what’s happening
  • Measures execution time
  • Handles success or errors gracefully
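
One small trick worth pausing on: the function's first line derives the log file's name by swapping the .sql extension for .log. Here's that idea in isolation (the helper name is mine, not the script's):

```python
import os

def log_path_for(script_path):
    # os.path.splitext splits "queries/load_sales.sql" into
    # ("queries/load_sales", ".sql"); keep the root, append ".log".
    return os.path.splitext(script_path)[0] + ".log"

print(log_path_for("queries/load_sales.sql"))  # → queries/load_sales.log
```

So every script automatically gets a sibling log file in the same folder, with the same base name.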

5. The “Main Event” (main function)

This is the showrunner, making sure everything happens in order:

def main():
    password = getpass.getpass("Enter your Redshift password: ")
    os.makedirs(DONE_DIR, exist_ok=True)
    sql_files = glob.glob(os.path.join(SCRIPT_DIR, '*.sql'))
    conn = psycopg2.connect(
        host=HOST,
        port=PORT,
        user=USER,
        password=password,
        dbname=DATABASE
    )
    for script_path in sql_files:
        print(f"Running {script_path} ...")
        run_sql_script(script_path, conn)
        try:
            shutil.move(script_path, DONE_DIR)
            print(f"Moved {script_path} to {DONE_DIR}")
        except Exception as move_err:
            print(f"Failed to move {script_path}: {move_err}")
    conn.close()
    print("All scripts executed.")

  • Prompts for your password (no peeking!)
  • Makes sure the “Done” folder exists
  • Finds all .sql files
  • Connects to Redshift
  • Runs each script, logs results, and moves the file when done
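
One subtlety: glob.glob makes no promise about ordering, so if your scripts must run in a particular sequence, sort them (numeric prefixes like the ones below are just a naming suggestion). Here's a dry run of main()'s file handling with the database calls left out:

```python
import glob
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as script_dir:
    done_dir = os.path.join(script_dir, "Done")
    os.makedirs(done_dir, exist_ok=True)

    # Create some stand-in scripts, deliberately out of order.
    for name in ("02_transform.sql", "01_load.sql"):
        open(os.path.join(script_dir, name), "w").close()

    # sorted() gives a deterministic, lexicographic run order.
    for script_path in sorted(glob.glob(os.path.join(script_dir, "*.sql"))):
        shutil.move(script_path, done_dir)

    archived = sorted(os.listdir(done_dir))

print(archived)  # → ['01_load.sql', '02_transform.sql']
```

Everything else in main() is the Redshift connection wrapped around exactly this loop.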

6. The “Curtain Call” (Script Entry Point)

This line ensures the main event only happens if you run the script directly:

if __name__ == "__main__":
    main()
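
If you want to see the guard in action without two files, here's a little self-contained experiment (a sketch of mine, not from the script) that runs the same module once as __main__ and once under another name:

```python
import contextlib
import io
import os
import runpy
import tempfile

demo = (
    'def main():\n'
    '    print("hello from main")\n'
    '\n'
    'if __name__ == "__main__":\n'
    '    main()\n'
)

capture = io.StringIO()
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.py")
    with open(path, "w") as f:
        f.write(demo)
    with contextlib.redirect_stdout(capture):
        runpy.run_path(path, run_name="__main__")  # guard fires
        runpy.run_path(path, run_name="demo")      # guard stays quiet

print(capture.getvalue().count("hello from main"))  # → 1
```

The message prints exactly once: importing the module doesn't accidentally kick off the whole show.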

7. Explaining the Script in Plain English

“This script automates running a bunch of SQL files against a Redshift database. For each file, it logs how many rows were affected, how long it took, and any errors. After running, it moves the file to a ‘Done’ folder so you know it’s finished. It’s organized with clear sections for setup, reusable functions, and the main execution flow.”


8. Why This Structure?

  • Imports first: So all your helpers are ready before the show starts.
  • Functions: Keep the code neat, reusable, and easy to understand.
  • Main block: Keeps your script from running accidentally if imported elsewhere.
  • Comments and docstrings: Make it easy for others (and future you) to understand what’s going on.

9. Final Thoughts: Python is Improv!

Just like improv, Python is best learned by doing. Try things out, make mistakes, and remember: if your code “crashes,” it’s just the computer’s way of saying, “Yes, and…let’s try that again!”

If you want to dig deeper into any part of this script, just ask in the comments below. Happy coding—and yes, and… keep learning!


Generating a User Creation Script for Redshift

"""
This script generates a user creation script and a list of ALTER GROUP scripts to add the user to the corresponding groups.

Sample Execution:
python Generate_user_creation_script.py
Enter the username for the new user: test_user
Enter the schema names (comma-separated, e.g., 'etl,hr,public'): "etl","hr","public"

Output:
CREATE USER "test_user" PASSWORD 'Hidden1!';
ALTER GROUP etl_ro ADD USER "test_user";
ALTER GROUP hr_ro ADD USER "test_user";
ALTER GROUP public_ro ADD USER "test_user";

"""

def strip_and_parse_schemas(schemas_string):
    """
    Strips the double quotes and commas from a given string of schema names and returns a list of schema names.
    
    :param schemas_string: String of schema names, e.g. '"etl","hr","public"'
    :return: List of schema names ['etl', 'hr', 'public']
    """
    schemas_list = schemas_string.replace('"', '').split(',')
    return schemas_list

def generate_create_user_script(username, password):
    """
    Generates SQL script to create a new user with a given username and password.
    
    :param username: The username of the new user
    :param password: The password for the new user
    :return: SQL statement for user creation
    """
    create_user_script = f'CREATE USER "{username}" PASSWORD \'{password}\';'
    return create_user_script

def generate_alter_group_scripts(schemas, username):
    """
    Generates SQL statements to add the user to the pre-existing groups for each schema.
    
    :param schemas: List of schema names
    :param username: The username to be added to the groups
    :return: List of ALTER GROUP SQL statements
    """
    alter_group_scripts = []
    for schema in schemas:
        group_name = f"{schema}_ro"  # Dynamically generate the group name (e.g., "etl_ro")
        alter_group_script = f'ALTER GROUP {group_name} ADD USER "{username}";'
        alter_group_scripts.append(alter_group_script)
    return alter_group_scripts

def print_scripts(schemas, username, password):
    """
    Prints the user creation script and the ALTER GROUP scripts to add the user to the corresponding groups.
    
    :param schemas: List of schema names
    :param username: The username of the new user
    :param password: The password for the new user
    """
    print(generate_create_user_script(username, password))
    alter_group_scripts = generate_alter_group_scripts(schemas, username)
    for script in alter_group_scripts:
        print(script)

def main():
    # Prompt for username input
    username = input("Enter the username for the new user: ")
    
    # Prompt for schema array input (comma-separated string)
    schemas_string = input("Enter the schema names (comma-separated, e.g., 'etl,hr,public'): ")
    
    # Strip and parse the schema names
    schemas = strip_and_parse_schemas(schemas_string)
    
    # Set the password (or prompt for it if you'd like)
    password = "Hidden1!"  # You can customize this if needed
    
    # Generate and print the scripts
    print_scripts(schemas, username, password)

if __name__ == "__main__":
    main()
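
One optional tweak, sketched here rather than taken from the original script: reuse the getpass trick from the batch executor so the password isn't hard-coded. The reader parameter is my addition purely so the helper can be exercised without a terminal prompt:

```python
import getpass

def prompt_password(default="Hidden1!", reader=getpass.getpass):
    # Fall back to the placeholder when the user just presses Enter.
    entered = reader("Enter a password for the new user (blank for default): ")
    return entered or default
```

In main() you would then write password = prompt_password() instead of the hard-coded assignment.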