Deploy an AWS EC2 instance using Terraform

Recently, I needed to create a test VM to play around with a POC. I figured why not use Terraform to recreate a test VM with a pre-set configuration each time, rather than doing it through the GUI with multiple button clicks.

Prerequisites:

  • Install Terraform on your machine. (I will cover the basics of that in another post; there are plenty of tutorials online and it is pretty straightforward.)
  • You will need to do the following step only once: go to the AWS console and create an IAM user with permissions to deploy EC2 instances. Store the access_key and secret_key in a file named “credentials” without a .txt extension.
  • Create a folder named “.aws” and copy the credentials file into this newly created folder. On my machine, I created it under C:/Users/username/.aws/
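For reference, the credentials file uses the standard AWS INI-style format. A minimal sketch (the key values below are placeholders; use the keys from your own IAM user):

```ini
[default]
aws_access_key_id     = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```

The AWS provider will pick up the `default` profile from this file automatically.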

Steps:

  1. Open Git Bash and navigate to the folder where you will store the Terraform code (example: C:\Users\username\Documents\Github\terraform). You can store your code pretty much anywhere; the above is just my example.
  2. Create a file named variables.tf (this is where you define the availability zone in which you would like to deploy your EC2 instance)
#Variable for AZ
variable "availability_zone" {
    type = string
    default = "us-east-1a"
}

3. Next, create a file named main.tf (You will define the infrastructure piece here)

provider "aws" {
	region = "us-east-1"
}

#AWS Instance
resource "aws_instance" "example" {
    ami = data.aws_ami.windows.id
    instance_type = "t2.micro"
    availability_zone = var.availability_zone
  
  lifecycle {
    ignore_changes = [ami]
  }
}

#AMI Filter for Windows Server 2019 Base
data "aws_ami" "windows" {
  most_recent = true

  filter {
    name   = "name"
    values = ["Windows_Server-2019-English-Full-Base-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["801119661308"] # Amazon
}

#EBS Volume and Attachment

# resource "aws_ebs_volume" "example" {
#   availability_zone = var.availability_zone
#   size              = 30
# }

# resource "aws_volume_attachment" "ebs_att" {
#   device_name = "/dev/sdh"
#   volume_id   = aws_ebs_volume.example.id
#   instance_id = aws_instance.example.id
# }
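Optionally, you can add an output block to main.tf to surface useful details after an apply. A small sketch (the output name is my own choice; public_ip is a standard attribute of aws_instance):

```hcl
# Show the instance's public IP after terraform apply
output "instance_public_ip" {
  value = aws_instance.example.public_ip
}
```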

4. Once you have the .tf files in place, run the command below to install the necessary provider plugins. Terraform downloads them into a .terraform folder inside your working directory.

terraform init

5. After this, run the below statement to show the deployment plan

terraform plan

6. This will allow you to review all the items that Terraform is going to create for you. I did occasionally have an issue reaching AWS at this step; if the plan fails with a connectivity error, simply re-run it.

7. Once everything looks good, run the following to create the EC2 instance. Type “yes” when prompted.

terraform apply

8. If you ever need to destroy the VM you created, run the following:

terraform destroy -target=aws_instance.example

I named my instance resource "example" in the code above. You will need to substitute whatever resource name you used on your end.

That is all there is to creating an EC2 instance using Terraform. You can also adjust the code to deploy more than one instance. The above is a very simple example.
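For instance, adding the count meta-argument to the resource above is one way to deploy multiple instances. A sketch, assuming the same data source and variable as before:

```hcl
resource "aws_instance" "example" {
  count             = 3 # deploys three identical instances
  ami               = data.aws_ami.windows.id
  instance_type     = "t2.micro"
  availability_zone = var.availability_zone
}
```

With count set, the instances are addressed as aws_instance.example[0], aws_instance.example[1], and so on.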

My next idea is to use Terraform to deploy an EC2 instance, and then use something like dbatools or PowerShell to download and install SQL Server on it without ever logging into the VM.

9. If you would like to reproduce the above result later, you can save the plan using the command below:

terraform apply -out=exampleEC2.plan
#Use the below to apply the saved plan
terraform apply "exampleEC2.plan"

How do you overcome CPU bottlenecks – under utilization / over utilization?

Let's take a sample query (cost threshold for parallelism 10, MAXDOP 0):

SELECT * FROM Sales.SalesOrderDetail 
ORDER BY UnitPrice DESC

This query ran in parallel on my test machine (16 CPUs / 128 GB RAM).

I ran the query 5 times; it takes about 15 seconds to complete.

Wait statistics

Now I am going to run it with OPTION (MAXDOP 8):

SELECT * FROM Sales.SalesOrderDetail 
ORDER BY UnitPrice DESC
OPTION (MAXDOP 8)

It took about 15 seconds to complete (same as above).

There is a common misconception that giving a query more CPUs makes it run faster. In most cases, that is not actually true.

Please note that when your query uses 16 CPUs, it is actually doing a lot more work than you might think: the work is divided across 16 worker threads, and coordinating and merging results across more CPUs can itself add work.

Also, note that thread 0 does not process the query itself; thread 0 is the coordinator thread.

For an OLTP workload, I would recommend a MAXDOP of 1 or 2; anything higher tends to slow down queries in general. For a data warehouse workload, a higher MAXDOP is recommended.
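If you decide to change the server-wide setting, MAXDOP can be adjusted with sp_configure (a standard SQL Server procedure; the value of 2 below is just the OLTP-style example from above):

```sql
-- Enable advanced options so 'max degree of parallelism' is visible
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Cap parallelism at 2 for an OLTP-style workload
EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;
```

Query hints like OPTION (MAXDOP 8) override this server-level setting for an individual query.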

How do you overcome Blocking in SQL Server?

  • Let's say we create a database with both the data and log files on a single drive, then insert one million records into a table. The insert takes about 2 seconds, and querying the table takes about 8 to 10 seconds to retrieve the data.
  • Now, let's create the database with the data and log files on separate drives and do the same thing: insert the rows and run a select query. What do you think happens now? This time the insert took longer, about 45 seconds. Why did it run longer? Checking the top waits, I see CXPACKET and ASYNC_NETWORK_IO: CXPACKET shows the data is being inserted in parallel, and ASYNC_NETWORK_IO shows the application is slow in retrieving the data. Looking back at the first scenario, the top wait is WRITELOG; the insert statement had to wait 10 seconds out of its 12-second execution time to complete, so the slowness there was caused by latency on the disk. In a real-world scenario, you would investigate by talking to a system admin or network admin to check whether a particular disk is slow (measure the latency on the disk) or whether a network adapter is slow.
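To check the top waits yourself, a quick look at sys.dm_os_wait_stats (a standard SQL Server DMV) is enough. Note the counters are cumulative since the last restart:

```sql
-- Top 10 wait types by total wait time
SELECT TOP 10
    wait_type,
    wait_time_ms,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```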

Get status of SQL server services on one or more servers remotely

'SQLSERVER01','SQLSERVER02','SQLSERVER03' | Get-DbaService -Type Engine, Agent | Out-GridView

This will show the state of the Engine and Agent services for each server in a grid view.

The great thing about this approach is that I can swap out Get-DbaService for Restart-DbaService. This will restart the SQL services remotely without having to RDP into a machine.

'SQLSERVER01','SQLSERVER02','SQLSERVER03' | Restart-DbaService -Type Engine, Agent | Out-GridView

If you don’t have a monitoring tool, this is a great way to check whether the services are running on a specific SQL server when you are troubleshooting a connection issue.
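As a variation on the same idea, you could filter for services that are not running (same dbatools cmdlet; the server names are placeholders):

```powershell
# List only Engine services that are not in a Running state
'SQLSERVER01','SQLSERVER02','SQLSERVER03' |
    Get-DbaService -Type Engine |
    Where-Object { $_.State -ne 'Running' }
```

An empty result here means the Engine service is up everywhere.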

Update SQL service account on multiple SQL servers using dbatools

Typically, I use the following statement whenever I want to update a SQL service account:

Get-DbaService sql1 -Type Engine -Instance MSSQLSERVER | Update-DbaServiceAccount -Username 'MyDomain\sqluser1'

This configures the SQL Server engine service on the machine sql1 to run under MyDomain\sqluser1, and it will prompt for the account password.

Here is a simple one liner to update SQL service account on multiple servers:

'SQLSERVER01','SQLSERVER02','SQLSERVER03'| Get-DbaService -Type Engine,Agent -Instance MSSQLSERVER | Update-DbaServiceAccount -Username 'MyDomain\sqluser1'

Because -Type includes both Engine and Agent, both services will be set to run as that domain user.