36 Chambers – The Legendary Journeys: Execution to the max!

March 28, 2013

Do You Need That Trigger?

Filed under: Database Administration, Programming & Work — Kevin Feasel @ 6:00 pm

Merrill Aldrich has a wonderful flow chart pertaining to database triggers.  The short answer is “No.”

At one place I have worked, the primary production database had over 200 triggers, mostly to handle auditing and logging.  Why couldn’t they do something like Service Broker or change tracking or some asynchronous mechanism to keep things up to date without triggers everywhere?  Good question…

March 11, 2013

Health Care IT Rants

Filed under: Database Administration — Kevin Feasel @ 6:00 pm

I would rant about the state of IT in the health care industry, but Sean McCown does it so much better.

In rant #1 (not directly health care related), McCown asks why it’s so hard to adopt Powershell and notes that part of the problem is that proponents aren’t selling its biggest advantage.  At my present job, we aren’t allowed to use Powershell; my manager and I have made the case, but the higher-ups (who are more comfortable with circa 2000 technology than today’s) just aren’t interested.  They’d rather have people waste time doing point-and-click activities, perform unnecessary and mistake-prone manual operations, and use the wrong tool for the job than adapt to what’s already here.  This makes absolutely no business sense—they’re definitely wasting time and money on things that should be automated, such as database and report deployments, server log reading (collecting the logs on a central syslog server and e-mailing out whenever certain threshholds get met), and regular maintenance.

Rant #2, however, ties directly to health care IT.  I’ve worked in health care IT for most of my career (including time at a couple of hospitals, a health-related state agency, and a health insurance provider).  Every word of this rant rings true to me.  Between subject-matter expert bleedover into everything expert, HIPAA (and a slew of other federal and state regulations), and extreme risk aversion, it’s amazing that things actually ever happen.

March 8, 2013

T-SQL To Check For Directories

Filed under: Database Administration — Kevin Feasel @ 6:00 pm

I needed to use T-SQL to check for backup folders and create them if missing.  Yes, I would normally use Powershell, but I’m saving up a rant for later on that topic…

This script by Jugal Shah worked quite well.  Combined with Aaron Bertrand’s sp_ForEachDb, you can easily find and create missing backup directories.  For example, suppose that you keep your backups at D:\Backups\[database name]\.  After creating a new database called DB, there is no folder under D:\Backups\ named DB.  If we tried to back up to that directory, we would get an error.  Before running backups, you can run this code to generate folders given a particular structure. 

declare @BackupDirectoryPath nvarchar(255) = 'D:\Backups\?';
declare @DatabaseList nvarchar(max);
select @DatabaseList = STUFF((select ', ' + name from sys.databases where state_desc = 'ONLINE' and source_database_id IS NULL and name <> 'tempdb' FOR XML PATH ('')), 1, 1, '');
declare @sql nvarchar(max) = N'

	declare @file_results table
	(
		file_exists int,
		file_is_a_directory int,
		parent_directory_exists int
	);

	insert into @file_results
	(
		file_exists,
		file_is_a_directory,
		parent_directory_exists
	)
	exec master.dbo.xp_fileexist ''' + @BackupDirectoryPath + N''';

	if exists (select * from @file_results where file_exists = 0 AND file_is_a_directory = 0)
	begin
		EXECUTE master.dbo.xp_create_subdir ''' + @BackupDirectoryPath + N''';
	end

';

exec dbo.sp_ForEachDb
	@command = @sql,
	@suppress_quotename = 1,
	@database_list = @DatabaseList,
	@print_command_only = 0;

Note that the question mark in the code will be substituted with the database name.  Also note that because this is for backups, I wanted to get online databases (can’t back up offline datbases) which are not snapshots (can’t back those up) and are not tempdb (can’t back that up).

February 15, 2013

Formal Formatting Standards Are Overrated

Filed under: Database Administration, Programming & Work — Kevin Feasel @ 6:00 pm

Grant Fritchey has a blog post about naming standards, including an excerable tblDdltbl.  The comments thread then becomes people defending their particular naming and formatting standards.

Let me explain my standard:  I try to be consistent, except when I’m not.  Anything more than that runs the risk of hitting rationalist stupidity, like tblDdltbl.

So what does this mean?  Well, I have a particular naming format for tables:  give the table a clear, unambiguous name that describes the entity this table implements.  Anything more than that is unnecessary cruft, and if I ever have to look at a chart to translate a table name, I know I’ve failed.  We get 128 characters for our names, so even very long names work, and auto-complete saves your precious fingertips so you can spend more time chatting on Facebook.

Okay, so what about stored procedures?  Well, name them in a way that makes sense.  I like Grant’s [Object][Action] combination, but be consistent…except when it makes sense not to be consistent.  Sometimes, [Object][Action][Predicate] is necessary, but that doesn’t mean you have to re-name all of those [Object][Action] procedures.  And if you want to use [Action][Object] instead, that’s fine:  just be consistent.

With database objects out of the way, what about formatting?  I could write an MLA-length document on precise formatting of SQL code, but what’s the benefit?  Is there any point in indenting here versus there, commas here versus there, or splitting out into lines here versus there?  Maybe.  But here’s the problem:  we don’t really have any solid metrics on this, so all of this comes down to stylistic preferences.  You may see your preferred formatting pattern as the pattrern, but that’s little more than an opinion based on the relatively small sample of formatting styles to which you have been exposed.  Furthermore, even if you write up a long style guide, chances are you’ll either hit the shoals of stupidity (tblDdltbl) or have gaping holes that you never thought about, thereby requiring actual independent thought and the risk of non-consistent code styles.

Let’s give an example of that now.  There are certain things which I believe make SQL code easier to read:  columns on separate lines, primary clauses (e.g., SELECT, FROM) outdented and the only thing on the line, subqueries following consistent indentation patterns, etc.  When I write a procedure or view, I tend to do that, except when I don’t:  for example, trivial NOT EXISTS subqueries might end up on one line rather than broken out.  But what constitutes a trivial NOT EXISTS clause?  Just a SELECT and FROM?  SELECT and FROM with one element in the WHERE clause?  Maybe an INNER JOIN somewhere?  Would splitting this out into several lines make that particular procedure significantly more difficult to read?  And if not, why should I put much mental effort into following The One True System?

I get the notion of trying to have a single coding style which everybody can follow to try to make it easier to read other developers’ code, but we live in the world of auto-formatting.  If you don’t like the format, it’s trivial to reformat code to your preference.  Or, better yet, get over it:  the interpreter doesn’t really care how you format your code, so if it makes sense to you, go for it.  Just don’t pretend that what makes sense to you is scientifically superior (unless you have the studies to back this up) or even makes sense to everybody else.

January 11, 2013

Parentheses Around Joins

Filed under: Database Administration — Kevin Feasel @ 7:00 pm

This is something that I just learned about recently.

Here’s the scenario:  we have four tables.  A User has separate OAuth authorization types (for example, Facebook, Google, and Twitter), which we track through the UserAuthorization table.   Finally, each user has a number of items that they have to offer, but not all items need to be tied to a user.

The question that the following query tries to answer is, which items are either not owned by a user at all, or are owned by a user who has linked their Twitter account to the application?  In this case, we only want to get the Username value for users who have Twitter; if they don’t have Twitter, we don’t want to see the username at all.  This is kind of a silly query, but bear with me.  It turns out that you can use parentheses to direct inner and outer joins, allowing us to answer exactly this question.

select
	i.ID,
	u.Username
from
	dbo.Item i
	LEFT OUTER JOIN
	(
		dbo.[User] u
		INNER JOIN dbo.UserAuthorization ua on ua.UserId = u.Id
		INNER JOIN Lookup.OAuthType ot
			on ua.OAuthTypeId = ot.id
			and ot.Name = 'Twitter'
	) on i.UserId = u.Id;

This nested join means that we only get records back from the User table if there is a Twitter handle, but we do not exclude any records from the Item table; neither do we include any users who do not have Twitter handles.

There are other ways of doing this, such as two queries using a UNION ALL.  This is a succinct way of doing that, however, and could potentially be better for performance, because it would not need to scan the Item table twice (once for each side of the UNION ALL).

January 4, 2013

DBA Notes

Filed under: Database Administration — Kevin Feasel @ 7:00 pm

A few SQL Server notes on the passing scene:

* Maria Zakourdaev has a great roundup of analytical databases.
* Simon Sabin shares his technique for trying to do clean ETL loads.  Along with that, he shares a Connect item which could potentially be awesome.  The problem is that, as the comments show, this is something that the people at Microsoft need to think over in considerable detail.  Different people have different thoughts on how what the table structure should look like and how exactly it should work.  I think this would be an awesome feature in the next version of SQL Server, but given the Microsoft reluctance to back out from old decisions, they need to do it the right way.
* Nic Cain rants about SCCM.  Microsoft is a huge company, but here’s one thing I really wish they would do:  have their vendor-style database products go through the SQL Server team, or at least qualified database administrators.  SCCM and Sharepoint are two big examples of big, popular products which fall flat when it comes to the SQL Server side of things.

January 2, 2013

Database Insecurity

Filed under: (In)Security, Database Administration — Kevin Feasel @ 7:00 pm

Let’s take a look at some of the database breaches of 2012.  There are a few interesting points to consider.  The first is, do not store anything that you do not absolutely need to.  If you do not absolutely need to store credit card information, do not store it.  I don’t care if your shop is supposedly PCI compliant (fat lot of good that does you); get rid of the credit card number after you use it.  This might not work as well for ongoing subscription-based services, but it saves the embarrassment (not to mention potential legal liability) from losing this data.  Even if the information is encrypted (or “encrypted” if you try to roll your own encryption mechanism and get it wrong), if the key is available to the attacker, your encryption is worthless.

From there, if you do have to store information, store it securely.  Look at internal attackers:  volunteers, disgruntled employees, corporate spies, malicious governments, that kind of thing.  What is the minimum amount of information these people need?  Look at what the organization is collecting.  Is there a reason to collect all of that?  By “a reason,” I mean something more than “Marketing said so.”

In the article, the author asks, “The more I read, the more I thought: ‘Why don’t the database administrators understand how vulnerable their records really are?’”  There are a couple of reasons for this.  The first one is that many database administrators do not understand security very well.  They may not know of built-in tools (such as SQL Server’s Transparent Database Encryption) or paid products that help secure the database.  They may not realize that you should encrypt and protect the backups the same as you do the servers.  But the second reason is that database administrators provide a service.  It is not necessarily their job—or they may not be allowed—to provide that service in the most secure manner.  Third-party products have specified schemas and don’t want client DBAs to muck about with them.  They may force poor practices (like giving a web-based account sysadmin rights) or have backdoors built in.

In addition to vendor-related problems, don’t forget about the rest of the company.  Developers creating databases likely will not understand good security practices, and so without an energetic DBA (with enough organizational cache to pull this off!) to step in and say “What you’re doing is unsafe; here is a better way,” bad stuff gets through.  Poor decision-making higher up the line can also hurt things:  those sales and marketing guys need direct access to the database from their laptops and the brass say to put the database server out in the DMZ.  This is a horrible, horrible idea, but unfortunately quite common.  Alternatively, the more accounts you create, the more likely somebody’s going to have a bad password.  And when your CIO or CEO says that everybody needs to be a sysadmin (because it’s the business’s database, not IT’s, so the business people should be able to do what they need!), that guy in sales with a password which matches his custom license plate means your CIO gets to explain how your company “follows security best practices” even though they got owned and happened to lose a million credit cards and plaintext passwords.

December 25, 2012

Automated Backup Tuning

Filed under: Database Administration — Kevin Feasel @ 6:00 pm

Nic Cain has an excellent script to automate backup tuning.  He published this literally the same day that I had to start tuning a large database’s backups, so I was cursing him for not releasing it 24 hours earlier…

December 23, 2012

“Scheduling” An E-Mail For A Long-Running Job

Filed under: Database Administration — Kevin Feasel @ 6:00 pm

Here’s the scenario:  you have a SQL Server Agent job which is currently running and you want to know when it completes.  You can use Adam Machanic’s sp_whoisactive (or a less civilized alternative, but honestly, who wantst that?) and hit F5 like a monkey on crack until you see the SPID running this job ends.  Or you could do the same thing with the Job Activity monitor, but again, I’d like to do some more work in the meantime.  So here’s my hacked-up one-off solution, which I ran in an SSMS query window:

declare @JobName sysname = 'MY JOB NAME';
declare @Now datetime = current_timestamp;

while not exists
(
	select *
	from
		msdb.dbo.sysjobhistory jh
		inner join msdb.dbo.sysjobs j on jh.job_id = j.job_id
	where
		j.name = @JobName
		and run_date = convert(char(8), @Now, 112)
		and STUFF(STUFF(RIGHT('000000' + CONVERT(VARCHAR(8), run_time), 6), 5, 0, ':'), 3, 0, ':') &gt; cast(@Now as time)
)
begin
	declare @msg nvarchar(500) = N'Still waiting:  ' + cast(current_timestamp as varchar(32))
	raiserror(@msg, 10, 1) with nowait;
	waitfor delay '00:01:00'
end

declare @JobRunMessage nvarchar(250) = N'The job you are waiting for (' + @JobName + N') has completed.'
exec msdb..sp_send_dbmail @recipients='my e-mail address', @subject='Job completed', @body=@JobRunMessage;

This is a fairly simple, light-weight script.  Given a job name, it searches through the SQL Server Agent history for an outcome later than the current time.  Once it gets one, the loop ends and we get an e-mail.

If you’re doing this often, you might simply want a notification e-mail sent when the job completes.  You can set that up in the Notifications tab on your job.  In my case, I didn’t want this because this is a job which runs every minute and occasionally has a large load to process, and I simply wanted a one-time solution.

December 21, 2012

Finding References To Particular Columns In Procedures And Views

Filed under: Database Administration — Kevin Feasel @ 6:00 pm

Here’s the scenario:  you want to find all of the cases in which a particular column name gets used.  Here’s the simple version, built off of sys.sql_modules:

declare @TableName sysname = 'MyTable';
declare @ColumnName sysname = 'MyColumn';

select
	object_schema_name(object_id) as SchemaName,
	object_name(object_id) as TableName,
	definition
from
	sys.sql_modules
where
	definition like '%' + @TableName + '%'
	and definition like '%' + @ColumnName + '%';

This works just fine if you have a unique column name or you’re willing to wade through a number of false positives.  In my case, though, I had to do this on a common column name and often-used table, so that there were well over 100 results.  In this case, it’s time to pull out the big guns:  dm_sql_referencing_entities and dm_sql_referenced_entities.  These are two DMFs which take a couple of parameters:  schema+object name, and type.  Here is a query which looks at all modules (views, functions, procedures, triggers, etc.) and returns the items which reference @SchemaName.@TableName.@ColumnName:

declare @SchemaName sysname = 'dbo';
declare @TableName sysname = 'MyTable';
declare @ColumnName sysname = 'MyColumn';

select
	referencing.referencing_schema_name + '.' + referencing_entity_name as CallingProcess,
	so.type,
	so.type_desc
from
	sys.dm_sql_referencing_entities(@SchemaName + '.' + @TableName, 'object') referencing
	inner join sys.objects so on referencing.referencing_id = so.object_id
where
	referencing.referencing_entity_name not in
	(
		'List Broken Objects Here'
	)
	and exists
	(
		select
			*
		from
			sys.dm_sql_referenced_entities(referencing_schema_name + '.' + referencing_entity_name, 'object') referenced
		where
			referenced_entity_name = @TableName
			and referenced.referenced_minor_name = @ColumnName
	)

This procedure first gets all of the referencing entities for @SchemaName.@TableName.  That is, what modules (views, procedures, triggers, check constraints, etc.) reference @SchemaName.@TableName anywhere in their definition?  From there, we perform a couple of filters.  The important filter is the “and exists” clause.  What we’re doing is getting all of the referenced entities for each of those referencing entities.  The reason we have to do this is that sys.dm_sql_referencing_entities keeps the column name in a field called referenced_minor_name.  Because we’re looking for a particular column, we need to get that, and it’s not enough simply to look for references to our schema and table.

Above that I have a snippet which filters out bad rows.  The problem is that procedures and views which are not schemabound are allowed to have bad references.  When you try to get the referenced entities, you’ll end up with an error like:

Msg 2020, Level 16, State 1, Line 1
The dependencies reported for entity “dbo.MyBrokenprocedure” might not include references to all columns. This is either because the entity references an object that does not exist or because of an error in one or more statements in the entity. Before rerunning the query, ensure that there are no errors in the entity and that all objects referenced by the entity exist.

If you get that, there’s a good chance that you might need to rewrite (or remove) that module because it isn’t working. 

Caveats:

  • There are exceptions to the statement above.  For example, suppose that your procedure uses a linked server but your account does not have permission to access that linked server connection.  In that case, you’ll see the same error even though the procedure itself may work correctly when the correct person runs it.
  • Columns do not appear in the sys.dm_sql_referenced_entities DMF when using linked server connections.  You can see the tables, but no further.
  • This method really is designed to work within a single database, not to span across databases.
  • This method might not get all of the references.  I did notice a case in which a procedure did not have columns show up even though they definitely were used.  It might have had to do with the columns only being used in a WHERE clause or not being part of the final select/insert statement, but I’m not positive.

 

Older Posts »

Theme: Silver is the New Black. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 44 other followers