R: Database Access

In today’s post, I’m going to use RODBC to connect to SQL Server and perform a few operations.

Pre-Requisites

There are a couple of steps we’re going to have to do in order to get RODBC to work with SQL Server.

Grab The Notebook

The first step is to grab the Basic SQL Server Connection notebook from my GitHub repo.  We’ll follow along with this notebook throughout the blog post.

Grab The Database Backup

If you want to follow along, I have a backup of PASSData which you can download from the GitHub repo.  It’s about 20 MB and does require SQL Server 2016.  If you don’t want to download this file or don’t have SQL Server 2016 installed, you can still get most of the benefit from this notebook by using your own DSN and writing your own queries.

Set Up Your DSN

If You’re On Linux

First, if you’re running on Linux, you’ve got a bit of work to do.  For this post, I’m going to assume that you’re running Windows, but if you are on Linux, follow the instructions above and you should be able to follow along.  You’ll probably have to change the connection to use SQL authentication instead of Windows authentication, but that’s a minor one-line change in the conn <- obdcConnect("PASSDATA") block.

If You’re On Windows

Before you begin connecting to a SQL Server instance, you need to set up an ODBC connection. To do this in Windows, go to the ODBC Data Sources application and create a SYSTEM-level data source.

Once you have that data source loaded, you can use RODBC to connect to SQL Server using that DSN. In this case, I created a data source named “PASSData” which connects to a local SQL Server instance’s PASSData database.

Learning What RODBC Can Do

First, we need to get RODBC from CRAN.  RODBC is a package which allows you to connect to ODBC-compliant databases using R.  This package allows you to connect to a wide variety of database products, which has its positives and negatives.  The big positive is that you don’t need to use SQL Server on the back end:  you can set up an Oracle, MySQL, Postgres, or a large number of other products.  The big negative is that this means less effort can be put into vendor-specific extensions.  RODBC is not a least-common-denominator product, but won’t be optimized for every product’s use cases.

We install RODBC with the following commands:

install.packages("RODBC", repos = "http://cran.us.r-project.org")
library(RODBC)

If you’d like more details on RODBC, I recommend checking out the documentation.

help(RODBC)
RShowDoc("RODBC", package="RODBC")

RODBC provides the odbcConnect function, which lets you connect to a database using a DSN or a connection string.  We’re going to use the DSN we created above.

Data Retrieval

We have a couple methods for getting data.  First, we can get metadata using built-in methods.  Here’s how to get a list of all tables:

sqlTables(conn, schema = "dbo")

Once you have that, you can easily grab all of the rows in a table using the sqlFetch command:

registrations <- sqlFetch(conn, "dbo.SQLSatRegistrations")

If you want a more fine-grained query or want to write SQL to perform some of the early data cleansing, write a sqlQuery:

sessions <- sqlQuery(conn, paste("SELECT TOP(50) EventName, EventDate, SessionName",
                                 "FROM dbo.SQLSatSessions",
                                 "WHERE SatNum = 217",
                                 "ORDER BY SessionName DESC"))

Other Operations

You can also insert and update data, create tables, and manage tables using RODBC, but for the most part, aside from inserting cleansed data, I’d personally shy away from using a data analysis tool for database management.

At any rate, once you’re done with the connection, close it using the close command:

close(conn)

Conclusion

Given the continued immense popularity of relational databases, it makes sense to know how to connect to a relational database using R.  RODBC is just one package which is up to the task—RJDBC and RSQLServer are two others.  If you find yourself having performance problems with RODBC on SQL Server, try the others.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s