- Michael Swart has a good post regarding data reading and yield. Unless you have snapshot isolation on, yield has the potential to blow up pretty quickly.
- FatherJack reminds you always to make sure your jobs notify you if there is a problem.
- Gail Shaw has a great series on advanced indexing: part 1, part 2, part 3.
- Robert Young wants us to expand our database model beyond the relational model and incorporate inferential statistical models. This is actually meaningful for us at work: our organization does a lot of statistical modeling using SPSS and SAS, so it’s interesting to see what is easy in SQL versus what is easy in SPSS. Even something like calculating a median is relatively difficult in SQL Server; more advanced statistical analysis is harder still. The way I see it, statistical analysis tools could (and perhaps even should) be built on top of relational databases or OLAP warehouses. Making it easier for data analysts to plug into SQL Server and do their work cuts down on problems like piles of flat-file data sets, overlapping data, and out-of-date data. Anyhow, important notes: IBM bought SPSS, so I could see that integration showing up in DB2; in addition, Oracle is incorporating R into their database. SQL Server could also include R (it is open source, after all), and this would be a good opportunity to get in on the ground floor.
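
To make the median point a bit more concrete, here is a minimal sketch of the gap between the two worlds. It is an illustration only: it uses an in-memory SQLite database through Python's `sqlite3` module as a stand-in, not SQL Server, and the `samples` table, its `x` column, and the sample values are all made up for the example. The SQL side has to sort the rows and average the one or two in the middle; the statistics-library side is a single function call.

```python
import sqlite3
import statistics

# Hypothetical sample data, invented for this illustration.
values = [3, 1, 4, 1, 5, 9, 2, 6]

# In-memory SQLite database as a stand-in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (x REAL NOT NULL)")
conn.executemany("INSERT INTO samples (x) VALUES (?)", [(v,) for v in values])

# Median in plain SQL: sort the rows, skip to the middle, and average
# the one middle row (odd count) or the two middle rows (even count).
MEDIAN_SQL = """
SELECT AVG(x) FROM (
    SELECT x FROM samples
    ORDER BY x
    LIMIT 2 - (SELECT COUNT(*) FROM samples) % 2
    OFFSET (SELECT (COUNT(*) - 1) / 2 FROM samples)
)
"""
sql_median = conn.execute(MEDIAN_SQL).fetchone()[0]

# The same answer from a statistics library is one call.
py_median = statistics.median(values)

print(sql_median, py_median)
```

Both approaches agree on this data, but the SQL version takes a nested query and some arithmetic on row counts to get there, which is the kind of friction a built-in statistical layer would remove.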