Today’s post is a fairly short one. Let’s say you want to connect to Azure Data Lake Storage Gen2 from SQL Server 2022. We saw in a recent post how to connect to Azure Blob Storage. When it comes to Azure Data Lake Storage, the story is almost the same, though there are a couple pitfalls you will want to avoid.

Just the Highlights

First up, let’s create an external data source. I’m going to start with the proper way to do this and then show an improper way which will not work.

Gallant Says Do this

If you want to create an external data source to Azure Data Lake Storage, create an external data source like so:

CREATE EXTERNAL DATA SOURCE MarsFarming WITH
(
	LOCATION = 'adls://bdadatalake.dfs.core.windows.net/synapse/marsfarming_curated',
	CREDENTIAL = DataLakeCredential
);

In this case, I’m connecting to my ADLS Gen2 storage account (and SQL Server knows this because of the adls:// protocol) named bdadatalake, in a container called synapse and a folder called marsfarming_curated. Connect with my database-scoped credential (which should be a Shared Access Signature token) and you’re golden.

Goofus Leads You Astray

If you’re moving fast and breaking things, you might try to replicate the connection for Azure Blob Storage. Here’s a valid Azure Blob Storage connection:

CREATE EXTERNAL DATA SOURCE AzureNCPopBlob WITH
(
	LOCATION = 'abs://ncpop@cspolybaseblob.blob.core.windows.net',
	CREDENTIAL = AzureStorageCredential
);

Note the abs:// protocol, indicating that this is a blob storage account. Also note that the end of the domain is blob.core.windows.net. Now, if you’re typing quick and just want to change an existing example, you might try to create this totally incorrect external data source:

CREATE EXTERNAL DATA SOURCE MarsFarming WITH
(
	LOCATION = 'adls://bdadatalake.blob.core.windows.net/synapse/marsfarming_curated',
	CREDENTIAL = DataLakeCredential
);

This will create successfully and will even let you create an external table, but as soon as you try to access data from that external table, bam!

Msg 16561, External table ‘dbo.ArabilityScore’ is not accessible because content of directory cannot be listed.

This confused me for a bit because I definitely had contents and I definitely could list them in Azure Storage Explorer using the same SAS token. Only after fixing the domain, changing the URL from blob.core.windows.net to dfs.core.windows.net did things start working as I expected.

Conclusion

If you’re using Azure Blob Storage, make sure your protocol is abs:// and your domain ends with blob.core.windows.net. If you’re using Azure Data Lake Storage Gen2, make sure your protocol is adls:// and your domain ends with dfs.core.windows.net. Otherwise, you’ll get a weird error.

One thought on “SQL Server 2022 Data Virtualization: ADLS

Leave a comment