Difference between users, groups, and roles on Postgres and Redshift

Postgres

We’ll start here, because Postgres generally sets the standard for relational databases.

In Postgres 9, users and groups were replaced with roles (Postgres have been very kind, and have maintained user and group as aliases for role for backwards compatibility). The difference between users and roles? Simply put, users can now be considered as roles with the ability to login.

  • Membership: Any role can be a member of another role, so you can create a hierarchical structure with as many tiers as you’d like.
  • Ownership: Any role can own relations.
  • Privileges: Any role can have privileges granted to it.
  • Inheritance: Up to you! Roles can be configured such that any members automatically inherit the privileges of the role, or can be configured to force a member to explicitly change their role in order to use those privileges.

Redshift

Redshift forked from Postgres somewhere around version 8. As such, roles look more like an old-school Postgres database rather than the role-based world of Postgres 9 – on Redshift, users and groups are separate entities.

  • Membership: Only a user can be a member of groups, i.e. a group cannot be a member of other groups, so you can only have a two-tiered structure.
  • Ownership: Users, rather than groups, own relations.
  • Privileges: Both users and groups can have privileges granted to them.
  • Inheritance: Users automatically inherit all privileges of any groups they are a member of.

Show Actual Maximum Column Lengths In Redshift

#!/bin/bash
#
# Many third-party data sources do lazy character column definitions, making them varchar(65535), which is stoopid.
# This little script will read in the DDL for a table, parse out the columns defined like this and then hork out a
# SQL script that discovers the actual max length of the actual data for more efficient DDL.  Suggested use is to
# run this script and output it to a xxxx.sql file, then \i that file in Redshift while in the correct schema.
#
# Assumes tablename will be passed on the command line and a file, tablename.sql will exist in this dir.

TABLENAME=${1}

if [ "${TABLENAME}" == "" ]
then
	echo -e "\nUSAGE: $0 redshiftTableName\n\nExpects redshiftTableName.sql to exist in this directory.\n\n"
	exit 1
fi

COLNAMES=`grep 65535 ${TABLENAME}.sql | cut -f2 -d","|cut -f1 -d" "`

for COL in ${COLNAMES}
do
 echo "SELECT MAX(LEN(${COL})) AS max_${COL}  FROM ${TABLENAME};"
done

Various Queries For Showing User Permissions To Objects

SELECT
   u.usename,
   s.schemaname,
   has_schema_privilege (u.usename, s.schemaname, 'create') AS user_has_select_permission,
   has_schema_privilege (u.usename, s.schemaname, 'usage') AS user_has_usage_permission
FROM
   pg_user u
   CROSS JOIN
      (SELECT DISTINCT
         schemaname
      FROM
         pg_tables) s
WHERE u.usename = 'tylerbullen'
   AND s.schemaname = 'vacasa';
SELECT
   u.usename,
   t.schemaname || '.' || t.tablename,
   has_table_privilege (u.usename, t.tablename, 'select') AS user_has_select_permission,
   has_table_privilege (u.usename, t.tablename, 'insert') AS user_has_insert_permission,
   has_table_privilege (u.usename, t.tablename, 'update') AS user_has_update_permission,
   has_table_privilege (u.usename, t.tablename, 'delete') AS user_has_delete_permission,
   has_table_privilege (
      u.usename,
      t.tablename,
      'references'
   ) AS user_has_references_permission
FROM
   pg_user u
   CROSS JOIN pg_tables t
WHERE u.usename = 'tylerbullen' --    AND t.tablename = 'myTableName'
    AND t.schemaname = 'vacasa';
SELECT
   u.usename,
   t.schemaname || '.' || t.tablename,
   has_table_privilege (u.usename, t.tablename, 'select') AS user_has_select_permission,
   has_table_privilege (u.usename, t.tablename, 'insert') AS user_has_insert_permission,
   has_table_privilege (u.usename, t.tablename, 'update') AS user_has_update_permission,
   has_table_privilege (u.usename, t.tablename, 'delete') AS user_has_delete_permission,
   has_table_privilege (
      u.usename,
      t.tablename,
      'references'
   ) AS user_has_references_permission
FROM
   pg_user u
   CROSS JOIN pg_tables t
WHERE --    u.usename = 'tylerbullen'
--    AND t.tablename = 'myTableName'
    user_has_insert_permission = 't'
   AND t.schemaname = 'vacasa';

Generate Overall Rowcount Tallies In Redshift

#SCHEMA="${2}"

STARTIME=`date`

echo "Starting rowcount update at $STARTIME"

# fivetran database
DBASE="fivetran"
SCHEMALIST="greenhouse hubspot jira mandrill marketo"
# Truncate rollup table
echo "Deleting from rollup table for ${DBASE}  schema..."
PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "TRUNCATE TABLE admin.overall_rowcount;"


# Iterate through schemata
for SCHEMA in ${SCHEMALIST}
do

	TABLES=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "select table_name from information_schema.tables where table_schema='${SCHEMA}' and table_type='BASE TABLE' ORDER BY table_name;"`

	for TAB in $TABLES
	do
		echo ">>> Working on ${DBASE}.${SCHEMA}.${TAB}..."

		TABLINE=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -n -q -d${DBASE} -c "insert into admin.overall_rowcount select '${SCHEMA}','$TAB',count(*) from ${SCHEMA}.${TAB}"`
	done
done
echo "*** All done with database ${DBASE}."



# eng_metrics database
DBASE="eng_metrics"
SCHEMALIST="jira"
# Truncate rollup table
echo "Deleting from rollup table for ${DBASE}  schema..."
PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "TRUNCATE TABLE admin.overall_rowcount;"

# Iterate through schemata
for SCHEMA in ${SCHEMALIST}
do
	TABLES=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "select table_name from information_schema.tables where table_schema='${SCHEMA}' and table_type='BASE TABLE' ORDER BY table_name;"`

	for TAB in $TABLES
	do
		echo ">>> Working on ${DBASE}.${SCHEMA}.${TAB}..."

		TABLINE=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -n -q -d${DBASE} -c "insert into admin.overall_rowcount select '${SCHEMA}','$TAB',count(*) from ${SCHEMA}.${TAB}"`
	done
done
echo "*** All done with database ${DBASE}."




# warehouse database
DBASE="warehouse"

SCHEMALIST="adwords_master adwords_owner aircall analysis_ready bing_ads common data_staging etl_testing facebook_ads finance google_adwords google_analytics high_risk housekeeping hubspot looker mailing marketo owner_portal_prod phones property_nexus prospect public rates reporting reporting_prod s3_files secure secure_scrubbed segment segment_marketo split_io_production splitio vacasa vacasa_gps_service vacasa_prod vacasa_units_svc_prod"
# Truncate rollup table
echo "Deleting from rollup table for ${DBASE}  schema..."
PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "TRUNCATE TABLE admin.overall_rowcount;"

# Iterate through schemata
for SCHEMA in ${SCHEMALIST}
do

	TABLES=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -d${DBASE} -n -q -c "select table_name from information_schema.tables where table_schema='${SCHEMA}' and table_type='BASE TABLE' ORDER BY table_name;"`

	for TAB in $TABLES
	do
		echo ">>> Working on ${DBASE}.${SCHEMA}.${TAB}..."

		TABLINE=`PGPASSWORD=xxxxxxxxxx psql -t -h  warehouse.vacasa.services -p 5439 -Uvacasaroot -n -q -d${DBASE} -c "insert into admin.overall_rowcount select '${SCHEMA}','$TAB',count(*) from ${SCHEMA}.${TAB}"`
	done
done
echo "*** All done with database ${DBASE}."

ENDTIME=`date`

echo "All done with every damned thing."
echo "Startime was: $STARTIME"
echo "Endtime was : $ENDTIME"

Inelegant Duplicate Row Remover For Redshift

drop table if exists example.reservation_finance_item_dedupe;

create table example.reservation_finance_item_dedupe as (select distinct * from example.reservation_finance_item);

select count(*) from example.reservation_finance_item;
select count(*) from example.reservation_finance_item_dedupe;

drop table if exists example.reservation_finance_item_old;

set search_path=example;

alter table reservation_finance_item rename to reservation_finance_item_old;
alter table reservation_finance_item_dedupe rename to reservation_finance_item;

select count(*) from example.reservation_finance_item;
select count(*) from example.reservation_finance_item_old;