Back with SQL Server 2019 CU2, I reported an error with PolyBase connecting to Excel when trying to select TOP(10) from the table. I’m using the Microsoft Access Database Engine 2016 Redistributable’s Excel driver.
Here’s a sample external data source and table statement:
CREATE EXTERNAL DATA SOURCE VolcanoType WITH
(
LOCATION = 'odbc://noplace',
CONNECTION_OPTIONS = 'Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)}; DBQ=C:\SourceCode\Polybase\Scripts\04 - Data Virtualization\VolcanoTypes.xlsx'
);
CREATE EXTERNAL TABLE dbo.VolcanoType
(
Type NVARCHAR(100),
Description NVARCHAR(1000)
)
WITH
(
LOCATION = '[VolcanoTypes$]',
DATA_SOURCE = VolcanoType
);
Prior to CU5, I could run SELECT * FROM dbo.VolcanoType
successfully, but trying SELECT TOP(10) * FROM dbo.VolcanoType
would return the following error:
Msg 7320, Level 16, State 110, Line 1
Cannot execute the query “Remote Query” against OLE DB provider “MSOLEDBSQL” for linked server “(null)”. 105082;Generic ODBC error: [Microsoft][ODBC Excel Driver] Syntax error (missing operator) in query expression ‘`T_0`.`Type` `Type`’. .
Now, I could get this to work by turning off external pushdown: SELECT TOP(10) * FROM dbo.VolcanoType OPTION(DISABLE EXTERNALPUSHDOWN);
But now with CU5, I was excited to see bug reference 13525968 corrected:
Disables PolyBase Generic ODBC External Data Sources’ default behavior of pushing down the TOP operator and calling the SQLRowCount function
After installing CU5, I can run a TOP()
operation hitting Excel and it won’t fail, so that’s good. Unfortunately, another bug still hasn’t been corrected so I’m hoping they are able to correct that in a future CU.
2 thoughts on “PolyBase and Excel: TOP Now Works”