This next anti-pattern is another pet peeve, though not as bad as ANSI-89 join syntax. This one is using SELECT * in queries. For ad hoc queries that you just run to gather diagnostics, SELECT * isn’t necessarily too bad. As far as line-of-business work goes, however, it’s a bad idea for a couple of reasons.
The first reason is that SELECT * depends upon a specific column order. Suppose I have a table which has ColumnA, ColumnB, and ColumnD, and I have a query which uses SELECT * to get the column information. At this point, any structural change will silently break my expectations of that query. I, as a developer, expect to get back columns ColumnA, ColumnB, and ColumnD, in that order. But if somebody adds ColumnC in the middle, my SELECT * now returns four columns, and the layer which receives that result set could break, especially if it’s a relatively fragile layer like SSIS.
It’s not just column additions which break things, however. Deleting a column or even changing column order could cause your query to fail. But SSMS will never tell you about this error at design time because “SELECT *” is still valid syntax. In contrast, suppose you had SELECT ColumnA, ColumnB, ColumnD as your clause. In this case, I can do anything short of dropping one of those columns and the query would still work as expected. Adding new columns does not alter the result set, nor does revising the table to change column order (for whatever reason you may want to do that).
The second reason this is a bad idea is that it typically returns too many columns. If you want an optimal query, you want to get back only those columns which are necessary for the result set. If I only need ColumnA, why pull back excess columns that my consuming layer would just ignore? Instead, I can just get the one necessary column, resulting in less data passed across the wire between servers. I can also take advantage of a covering index which includes ColumnA, whereas SELECT * makes covering indexes much less likely (or useful).
Unfortunately, a lot of Object-Relational Mapping solutions tend to want to pull back full table sets. Then, when some developer is doing a table scan and sending a billion rows across the wire to his app server just to get the one column from one row that he needs, he complains about how slow the database is…
The easy fix for this problem is to name explicitly the columns you want. This involves more upfront development effort, but honestly, at this point, there are tools which will automatically convert “SELECT *” into “SELECT [each column]” for you. I use one called SQL Complete, and it does the job just fine.