Upcoming Events: SQL Saturday Madison

Key Details

What: SQL Saturday Madison
Where: 1000 Edgewood College Drive, Madison
When: Saturday, April 6th, all day
Admission is free. Sign up at the SQL Saturday website.

What I’m Presenting

11:00 AM — 12:00 PM — APPLY Yourself
02:45 PM — 03:45 PM — Eyes on the Prize: Simple and Effective Dashboard Visualization Techniques

SQL Saturday Madison is in a new venue this year, so don’t book your hotel thinking it’s where it was last year and end up a few miles away from the new spot. And especially don’t do it as a non-refundable booking. Reading is fundamental.

PolyBase Revealed: the DW Databases

Today is a fairly short post covering a trio of databases you might not even know you have: DWConfiguration, DWDiagnostics, and DWQueue. The PolyBase installer drops all three of these on your instance. Let’s go in ascending order of the number of useful tables.

DWQueue

The DWQueue database has two tables, neither of which I’ve ever seen populated in an on-prem SQL Server instance. It is probably useful in some internal system context and I have seen a procedure named dbo.MessageQueueReceived get called, along with an extended stored procedure whose name I was unable to find. Don’t look for enlightenment here.

DWDiagnostics

The DWDiagnostics database has a dozen tables and if you look at row counts, most of them have data. But not all of that data is directly useful to us.

The most useful table here is dbo.pdw_errors, which contains error information and messages for each error which has occurred on our system. If you’re used to the standard PolyBase error messages (which are a mess), you won’t get too much more info than what you see in them, but if you’re monitoring a busy system or one which swallows errors, at least you can see what’s going wrong.

One of the fun things about these tables is that it looks like PolyBase pre-allocates rows. For example, I have 10,000 rows in dbo.pdw_errors but only 175 of them have my machine name; the rest have a machine name of N and fake data for everything, as well as a DateTimePublished value which was probably when I originally installed PolyBase.

There are also 10,000 rows in dbo.pdw_component_alerts_data and dbo.pdw_os_event_logs but those two appear to do nothing for on-prem SQL Server running the PolyBase engine. The other tables aren’t worth mentioning.

DWConfiguration

The DWConfiguration database has two potentially interesting tables, which is twice as many as its nearest competitor.

First up is dbo.configuration_properties, which looks to be the values PolyBase uses for costing Hadoop operations, setting pdw_user as the default username, and so on. I might mess with some of this on a VM one of these days just to see what happens.

The other table is dbo.distribution, which lays out distributions. There are 8 rows in this table, which corresponds to the number of files created in most Hadoop write operations. A wild guess says there are 60 rows on Azure SQL Data Warehouse, but that’s mere speculation.

Ignore dbo.pdw_sp_configure as that, like Europa, is not meant for you and you should attempt no landing.

Conclusion

These databases seem like they’re primarily for Microsoft’s Analytics Platform System or Azure SQL Data Warehouse. There are a couple of tables which are interesting to us for on-prem PolyBase usage but not many. The important thing, though, is that even though these tables aren’t interesting to us, messing with them can have deleterious consequences for your SQL Server instance, so just leave them be.