Importing from a CSV to a database table

A common request I get is to load the contents of a CSV file into a database table. Until I came across dbatools, I used to perform this operation through the GUI.

dbatools offers two ways to do this:

  1. Import-DbaCsv
  2. Import-Csv and Write-DbaDataTable

Which one should you choose? It is largely a personal choice, but I have found that Import-DbaCsv is the better choice for larger CSVs because it is optimized to keep memory usage low.

When you use Import-DbaCsv, it streams rows from the file straight into a bulk copy operation, moving the data efficiently between the disk and SQL Server without loading the entire file into memory.

The output from the command shows the table information, the number of rows copied, and even how quickly they were copied. This is useful when you are testing a script with a smaller set of data, as you can extrapolate the time it will take to load the data in your production environment.

Get-ChildItem -Path C:\temp\data.csv | Import-DbaCsv -SqlInstance $sqlinstance1 -Database StagingDB -Table data1

ComputerName  : SQLDEV01
InstanceName  : MSSQLSERVER
SqlInstance   : SQLDEV01
Database      : StagingDB
Table         : data1
Schema        : dbo
RowsCopied    : 2450
Elapsed       : 55.16 ms
RowsPerSecond : 44663
Path          : C:\temp\data.csv
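
Since the command returns an object, you can capture the result and do that extrapolation in code. A minimal sketch; $prodRows is a hypothetical production row count:

$result = Import-DbaCsv -Path C:\temp\data.csv -SqlInstance $sqlinstance1 -Database StagingDB -Table data1
# Estimate the production load time from the measured throughput (a rough, linear extrapolation)
$prodRows = 10000000
'{0:N0} seconds estimated for {1:N0} rows' -f ($prodRows / $result.RowsPerSecond), $prodRows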

The beauty of the command is that you can even import multiple CSVs at once. Pipe in several files and, with -AutoCreateTable and no -Table parameter, each file is loaded into a table named after the file:

Get-ChildItem C:\temp\data*.csv | Import-DbaCsv -SqlInstance sql2017 -Database StagingDB -AutoCreateTable

ComputerName  : SQLDEV01
InstanceName  : MSSQLSERVER
SqlInstance   : SQLDEV01
Database      : StagingDB
Table         : data-tracks1
Schema        : dbo
RowsCopied    : 2450
Elapsed       : 73.02 ms
RowsPerSecond : 33712
Path          : C:\temp\data-tracks1.csv

ComputerName  : SQLDEV01
InstanceName  : MSSQLSERVER
SqlInstance   : SQLDEV01
Database      : StagingDB
Table         : data-tracks2
Schema        : dbo
RowsCopied    : 1312
Elapsed       : 65.41 ms
RowsPerSecond : 20160
Path          : C:\temp\data-tracks2.csv

Using Import-Csv with Write-DbaDataTable

Import-Csv is a powerful command that turns the text within CSV files into objects. Generally, Import-Csv is piped right to Write-DbaDataTable:

# Import the CSV and write to the database
Import-Csv -Path .\DatabaseServerinfo.csv | Write-DbaDataTable -SqlInstance $sqlinstance1 -Database StagingDB -Table Databases -AutoCreateTable
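
Because Import-Csv emits real objects, you can also filter or reshape rows in flight before they reach the table. A sketch, assuming the CSV has a hypothetical SizeMB column:

# Keep only the larger databases before writing to the staging table
Import-Csv -Path .\DatabaseServerinfo.csv |
    Where-Object { [int]$_.SizeMB -gt 100 } |
    Write-DbaDataTable -SqlInstance $sqlinstance1 -Database StagingDB -Table LargeDatabases -AutoCreateTable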

Piping to Write-DbaDataTable is convenient and extremely fast for small batches, but it slows down for larger datasets because each row travels the pipeline as a PowerShell object (similar to SQL's RBAR concept). If you intend to import a large dataset, use Import-DbaCsv instead, as shown below:

Import-DbaCsv -Path \\NetworkPath\server\bigdataset.csv -SqlInstance sql2017 -Database StagingDB -Table NewTable -AutoCreateTable

What are a few ways to fix locking and blocking in SQL Server?

  1. Have enough indexes to make queries go fast, but not so many that they make your queries go slow. That alone won't fix the 5,000-row problem, though: SQL Server will escalate to a table-level lock if a single statement needs to lock more than roughly 5,000 rows.
  2. Keep your transactions as short as possible. That helps, but you have to modify the query to make it happen.
  3. Last is changing your default isolation level to RCSI (Read Committed Snapshot Isolation). This is a database-level setting, and you need to account for extra tempdb usage because the version store has to keep track of row versions for all those transactions (see the sketch after this list).
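
Enabling RCSI is one ALTER DATABASE per database. A minimal sketch using dbatools' Invoke-DbaQuery; StagingDB stands in for your database, and WITH ROLLBACK IMMEDIATE kicks out open transactions, so schedule it accordingly:

# Enable RCSI on one database; note this is a database-level setting, not server-level
Invoke-DbaQuery -SqlInstance $sqlinstance1 -Database master -Query "ALTER DATABASE StagingDB SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;"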

De-dupe and eliminate the indexes you don't need, then add the indexes that are desperately needed to help the queries go faster. That first point alone eliminates almost every blocking problem.

Typically, the users I see with blocking issues either had no indexes whatsoever or had dozens of them.
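
If you want to watch the blocking live while you work through those fixes, sp_WhoIsActive is the usual tool, and dbatools can install and run it for you. A sketch, assuming the machine can download the procedure and $sqlinstance1 is defined as earlier:

# Install Adam Machanic's sp_WhoIsActive into master, then run it to see current sessions and blocking
Install-DbaWhoIsActive -SqlInstance $sqlinstance1 -Database master
Invoke-DbaWhoIsActive -SqlInstance $sqlinstance1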

Proxy (1) is not allowed for subsystem “SSIS” and user “xxxUser”

ERROR:

Proxy (1) is not allowed for subsystem “SSIS” and user “xxxUser”. Grant permission by calling sp_grant_proxy_to_subsystem or sp_grant_login_to_proxy.

EXPLANATION:

The content of the error message is in fact clear: the login does not have the right to use the proxy account in question. I got this error when trying to create a job.

SOLUTION:

-- The proxy name below is a placeholder; substitute the proxy referenced in the job step
EXEC msdb.dbo.sp_grant_login_to_proxy @proxy_name = N'YourProxyName', @login_name = N'xxxUser';
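
To verify the grant, msdb ships a helper procedure that lists proxy-to-login mappings. A quick check via dbatools (sp_enum_login_for_proxy is a standard msdb procedure):

# List which logins are allowed to use which proxies
Invoke-DbaQuery -SqlInstance $sqlinstance1 -Database msdb -Query "EXEC dbo.sp_enum_login_for_proxy;"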

Connect from RStudio to SQL Server

Recently, I had to assist one of our newly added Data Science team members with connecting R to SQL Server. This was the first time I had to deal with an R setup.

One of the things I found out was that RStudio did not use LDAP for authentication. After much effort, we settled on granting the users access with native SQL authentication.
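
If you go the same route, dbatools can create the SQL-authenticated login for you. A hypothetical sketch; rstudio_user is a made-up name, and you are prompted for the password instead of hard-coding it:

# Create a SQL-authenticated login for the Data Science team
New-DbaLogin -SqlInstance $sqlinstance1 -Login rstudio_user -SecurePassword (Read-Host -AsSecureString -Prompt 'Password')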

To install and load the RODBC package, do the following:

  • Open the RStudio console (make sure the R version is at least 3.1.3; if it isn't, use the updateR() function)
  • Run the following command: install.packages("RODBC")
  • Run the following command: library(RODBC)

The connection example below also relies on the DBI and odbc packages, so install those the same way if they are missing: install.packages(c("DBI", "odbc")).

You will need to load the following libraries:

library(rstudioapi)
library(RODBC)

After this, run the following command to connect to SQL Server:

con <- DBI::dbConnect(odbc::odbc(),
                      Driver = "SQL Server",
                      Database = "enter_db_name",
                      UID = "enter_user_name",
                      PWD = rstudioapi::askForPassword(),
                      Server = "enter_IP_Address",
                      Port = 1433)

The rstudioapi::askForPassword function will prompt the user for a password and reduce the need to store passwords in code.

That's it! After doing this, users were able to connect to SQL Server from RStudio and run queries.

Credit: https://support.rstudio.com/hc/en-us/articles/214510788-Setting-up-R-to-connect-to-SQL-Server-
