Monday, July 18, 2005

The Shade Tree Developer is moving to CodeBetter.com

Sorry for the inconvenience, but I'm moving The Shade Tree Developer to CodeBetter. Besides keeping some great company at CodeBetter, I'll get a bit better blogging infrastructure, like trackbacks and categories, so you can skip over my recent rantings about legacy code and get straight to the TDD content. I might also try to remember enough about CSS to get a stylesheet that doesn't make my code samples look fugly. I'll put forwarding links on the blogspot location for the older posts.

The new address is http://codebetter.com/blogs/jeremy.miller/.

The new RSS feed is http://codebetter.com/blogs/jeremy.miller/rss.aspx.

An Atom feed is available at http://codebetter.com/blogs/jeremy.miller/atom.aspx.

Friday, July 15, 2005

Resources from the Continuous Integration talk last night

If you were at the Continuous Integration talk last night at the Austin SPIN, here is the set of resources I mentioned. The slides from the presentation are here or at http://www.jwspro.net/aspin/. Thanks for coming out last night in the rain and putting up with me.

Articles

Tools

  • I published my company's .Net tool usage with descriptions and links here.
  • CruiseControl (Continuous Integration for Java)

Books

I know there have been a couple of books published on CI now, but I don't know any of them well enough to recommend one. I heartily recommend Michael Feathers's Working Effectively with Legacy Code for retroactively applying Test Driven Development and CI.

Thursday, July 14, 2005

Bunch of Good Links

First, maybe the best blog post I've ever read, from Mike Spille. I've been guilty of the "guardrail to guardrail" effect a time or two. On a project last year we probably did some dumb stuff because of someone else's negative experience with a certain technology on his previous project.

Read the comments too --> "BTW, you left out the obligatory insult to VB."


In a similar vein, James Bach warns us against being absolutist about "Best Practices." One of my favorite quotes is "A maturity model is basically a gang of best practices hooked on crystal meth." Just to add my 2 cents' worth, I think it's best to always keep an eye on first causes. I've written before that a centralized architecture group handing down black and white mandatory best practices from Mount Olympus to the pitiful wretches in the trenches is a terrible organizational anti-pattern. There is always some kind of exception to every rule, and the guys doing the actual work should be capable of exercising their own judgment.

Lastly, integration guru Gregor Hohpe asks what the real difference is between coding and configuration. Here you go Gregor, the difference is that I can write unit tests for code and use a debugger to troubleshoot code. I don't know why people keep inviting Gregor to these kinds of conferences because he just mocks the proceedings in his blog later.

My StructureMap tool is very configuration intensive and I have an unnatural tolerance for external XML configuration. To compensate, I've tried to support .Net attributes for some of the configuration so it stays closer to the code. I've also had to invest a lot of energy in creating better tools for troubleshooting and validating configuration files in StructureMap. Since I dogfood StructureMap, expect more and better troubleshooting tools (because my colleague gets irritated otherwise).
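For anybody curious what the attribute approach looks like, here is a minimal sketch. The [Pluggable] and [DefaultConstructor] attributes are the same ones that show up in the DataSession code later in these posts (I'm assuming they live in the root StructureMap namespace); the shipping calculator classes are made up for the example, and a real StructureMap setup may still need some family-level configuration elsewhere.

using StructureMap;

// Hypothetical plugin interface and implementation, just to show the idea
// of pushing configuration closer to the code with attributes.
public interface IShippingCalculator
{
    decimal CalculateCost(decimal weightInPounds);
}

// [Pluggable] registers this class as a named concrete type, and
// [DefaultConstructor] tells StructureMap which constructor to use.
[Pluggable("Standard")]
public class StandardShippingCalculator : IShippingCalculator
{
    private readonly decimal _ratePerPound;

    [DefaultConstructor]
    public StandardShippingCalculator() : this(1.25m)
    {
    }

    public StandardShippingCalculator(decimal ratePerPound)
    {
        _ratePerPound = ratePerPound;
    }

    public decimal CalculateCost(decimal weightInPounds)
    {
        return weightInPounds * _ratePerPound;
    }
}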

Tuesday, July 12, 2005

Impressions from the VSTS talk last night

I went to a talk on Visual Studio Team System last night given by Chris Menegay at the Austin DotNet Group. I tried to go into this with an open mind, but my bias is clearly with open source tools and agile processes. I'm still not convinced VSTS provides any value beyond our current tools and process, but it'll help somebody. Either way, I think VSTS is a valuable addition to the .Net world. If nothing else, it's helped introduce lifecycle practices to shops that have none and started some healthy conversations.

I wrote this several months ago, and nothing I saw last night substantially changes my opinion.

http://jeremydmiller.blogspot.com/2005/04/bypass-vsts-and-get-yourself-poor-mans.html

Without further rambling, cue the "wah, wahhh, wahhhhhh" theme music and let the cliche begin...

The Good

  • I think VSTS could be great for project tracking. Being able to intelligently log checkins against a user story or a bug fix, along with the time spent on the task, could be a great boon to project management. The Heisenberg Principle always applies to this kind of tracking: if I, the developer, have to go out of my way, my velocity is going to be slower. VSTS might change the cost/benefit equation somewhat. A lot of the value of any Agile process comes from the ability to accurately forecast user stories during iteration planning, and a good "yesterday's weather" report from past iterations could certainly help. We're still going to take a look at Trac as an alternative, though.
  • The code coverage visualization is awesome. It's not clear yet, but it might be possible to create test coverage reports from other tools outside of VSTS (NUnit, NFit, etc.)
  • It is an integrated suite of tools that will be supported by MS. I personally don't buy the argument that assembling a stack of NUnit/NAnt/CC.NET/NDoc/Subversion is really that hard, but I've been immersed in the agile world for awhile and I'm comfortable doing that (I've also worked with some of the NUnit and CC.NET folks too, so I'm not really impartial). For a shop with no experience in these tools or with a hostile policy towards open source, I can see the value of buying the whole stack in one place.



The Bad
  • Call me close-minded, but I wouldn't touch MSF with a ten foot pole and a firewall in between. The process templates are editable, but it doesn't sound like there is a GUI yet for doing the editing. I'll be interested to see if anybody (with much more patience and ambition than me) tries to make an XP or Scrum template.
  • At this point it doesn't look like you can use any source control system other than what comes with VSTS. As much of a pain as it has been trying to move from VSS to Subversion, I'm not very excited about another migration. The source control is clearly better than VSS (how could it not be?), but I didn't see any particular value over Subversion today.
  • VSTS does not support Continuous Integration, only scheduled builds. You'll need to continue using CruiseControl.Net. I fully expect the CC.NET guys to have a plugin for VSTS and/or MSBuild for CC.NET by the end of the year.
  • The web testing looks weak to me. I still think we're looking towards either Selenium or Watir.


The Ugly
  • From the initial impression, the MSF for Agile process looks like a heavyweight process. It is clearly meant as an iterative process. While that's certainly an improvement on the older MSF, iterative alone doesn't automatically translate to being agile.
  • VSTS is clearly aimed at very large companies using laborious, high ceremony processes--everything that my employer is not. I think process automation is a great thing, but if your process is so complex that nobody can know the whole, I think you've got some serious issues.
  • Pricing. When alternative tools are available and generally free, I have a hard time justifying the cost.
  • The one piece of the presentation where I thought Menegay was completely off his rocker was the idea of VSTS becoming a directed workflow tool that tells everybody (especially developers) exactly what they should be doing. As a tracking tool for project record keeping, I think VSTS is great. Using something like VSTS as your primary communication channel is absurd. Turning developers into mindless zombies doing exactly what the VSTS workflow tells them to do sounds like a recipe for disaster to me, not to mention a steep drop off in developer retention.

Monday, July 11, 2005

Don't play with strange databases...

...you don't know where they've been, or who's been touching them.

I've been burned pretty badly in the past by making assumptions about existing databases. Database documentation isn't always helpful either, because it is often out of date or just plain wrong to begin with. Even if the documentation is "correct," the corresponding code may be using the database differently than the designer intended. Even worse, you often have to interpret the data, and that's dangerous when you don't know the database. The only accurate source of information is usually the subject matter expert for the system.

What's the answer? I don't know. All I can say is to approach someone else's database with great caution. Kinda like going up to a big, mean looking dog and trying to make friends. I wouldn't make any kind of assumption about the meaning of any table or field. Put some effort into understanding the schema and expect to go back and forth a little bit.

It's the year 2005. Isn't the existence of SOA supposed to eliminate the need to be doing integration directly against a database? I'm writing a little code to integrate our main application with another 3rd party system. The only way I have to do this right now is to just write SQL queries against their database schema. I feel dirty.

Even worse is when shops start writing their own extensions and back door queries against a 3rd party application. That's a great idea. Take a database that you're not supposed to touch directly, and that you don't fully understand, and write all kinds of custom code into it. That'll really make it easy to upgrade the software package later.

Don't Make It Worse

At a previous employer they used an ancient inventory system written in a rare 3GL language. If you think of your IT infrastructure as the cardiovascular system of a manufacturer, the inventory system is the heart muscle; everything else talks to the inventory system. The business logic of the system was bound up in the UI screens, so the only front door integration point was a screen emulation package from the vendor that was only certified for low volumes. At this point an intelligent IT organization in the SOA era would start looking for alternatives. My old employer beat the integration problem by writing hundreds of PL/SQL procedures to duplicate the business logic in the screens in lieu of a real service layer. To support other functions they added about 300 custom tables to the out-of-the-box database. Just to make things worse, the database customizations were different from region to region.

At one point I needed to write an integration from the inventory system to my application. I went and asked the SME where I could find a certain piece of information. He told me to look in table A and I turned around to leave. He then said, "you could also get it out of table B, or table C come to think of it." Oh, ouch. When I left they were struggling with a nightly batch job that was taking about 25 hours to run and locking all kinds of database tables. They've experienced some problems with scalability; who would have guessed?

Even though the system is horribly obsolete and arguably an opportunity cost slowing down new development, they have no chance of upgrading to the newer J2EE version with web services because their database is way too wired. Idiots.

Harris Boyce warns us that O/R isn't a silver bullet

Harris Boyce wrote a response to my ADO.NET contagion post: "Soul Vaccination" for Data Access Layers.

I think that Harris is quite right in warning us about overusing Object/Relational mapping.

While there are still plenty of scenarios where I would forgo an O/R approach for the classic data access layer, I still think that the "Persistence Ignorant" (POCO/PONO/POO whatever) approach for business classes is the way to go for the sake of testability. I think the decision hinges primarily on the amount of business logic in your application (and secondarily on the comfort level of the developers with OO coding in general). Even if I'm not using an O/R tool, I usually use some type of Inversion of Control to keep the rest of the code loosely-coupled from the data access.
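As a quick illustration of what I mean by persistence ignorance plus Inversion of Control, here is a minimal sketch. The Invoice, IInvoiceRepository, and LateFeeCalculator names are made up; the point is that the business class depends only on an interface, so it can be unit tested with a stub and hooked up to either a hand-rolled data access class or an O/R tool at runtime.

using System.Collections;

// Hypothetical persistence-ignorant domain class and repository interface.
public class Invoice
{
    public decimal Amount;
}

public interface IInvoiceRepository
{
    IList FindUnpaidInvoices(string customerId);
}

// The business class never touches ADO.NET or an O/R tool directly.
// The concrete repository is supplied from the outside (by StructureMap,
// another container, or a test).
public class LateFeeCalculator
{
    private readonly IInvoiceRepository _repository;

    public LateFeeCalculator(IInvoiceRepository repository)
    {
        _repository = repository;
    }

    public decimal CalculateLateFees(string customerId)
    {
        decimal total = 0;
        foreach (Invoice invoice in _repository.FindUnpaidInvoices(customerId))
        {
            // Arbitrary 5% late fee, purely for the example.
            total += invoice.Amount * 0.05m;
        }
        return total;
    }
}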

Before diving into O/R mapping, you might take a look at these resources. Understanding some of the underlying patterns and mechanisms of O/R mapping might help alleviate some of the pain or help in making design decisions.

  • Patterns of Enterprise Application Architecture by Martin Fowler has a couple of good chapters about O/R patterns. The chapter on organizing business logic is a canonical read for deciding whether or not to use a Domain Model approach that would lead to using O/R for persistence.
  • Applying UML and Patterns by Craig Larman has a great chapter on creating a small persistence framework. This book is my choice as the best introduction to OO programming around.
  • This article from Scott Ambler introduces the basics of O/R
  • My article on C# Today from a couple of years ago, though it's derivative of the work above.

Sunday, July 10, 2005

Continuous Integration Presentation at the Austin SPIN

I'm giving a presentation on Continuous Integration at the Austin SPIN meeting on Thursday, July 14th. The meeting info is here. It's an adaptation of the talk Steve Donie and I did for Austin Agile a while back, just less .Net-centric for a broader audience.

I'm excited (OT)

Communications technology is undeniably changing our world. "The World is Flat" type effects may act as a downward force limiting our salaries, but my wife just let me sign up for the NFL Sunday Ticket and that's pretty darn cool. I live in Texas, so every Sunday at noon my football viewing choices consist of mediocre Cowboys on one channel and mediocre (but improving) Texans on the other channel. I grew up in Missouri, so I'm a lifetime Kansas City Chiefs fan. Thanks to a little technology I'm gonna be watching my beloved Chiefs in my living room this year instead of trying to sneak over to a sports bar to see the game.

And if the Chiefs' new linebackers still can't stop my Grandmother from running up the middle, I'll just change to a different game.

Friday, July 08, 2005

When I was a Mort...

Yesterday at lunch we were discussing (mocking) Rhocky Lhotka's defense of "Morts" on his blog. What the heck is a Mort, you ask? A Mort is supposed to be an opportunistic programmer who is most concerned with delivering business functionality rather than wasting any time on silly ivory tower concepts like technical quality. Unfortunately the term "Mort" has become a pejorative synonymous with low-skill developers using data-aware widgets to drag'n drop their way to one-tier applications. I'll admit that I commonly use the term Mort as a putdown, but there was a day and time when I was a Mort, too.

My first programming experience was writing some ASP and Access tools for my engineering group in the late 90's. At that time the engineering and construction world was pretty crude in terms of IT automation. There were a lot of data silos, and quite often the most junior engineer (always me) was stuck manually typing information contained in one database into Excel sheets. We had to generate a lot of paperwork and track a lot of audit style information. I hated the paperwork aspect of engineering (hence my gravitation towards agile processes), so I set out to create some automated tools to create the Word and Excel documents from information in an Access database that was edited by ASP pages. This work led me to a position with the project automation group creating an ASP system to verify and manage a very poorly written, but mission critical, data exchange.

I clearly created a lot of business value and I was pretty proud of myself. Then I left the company to relocate to Austin, and everything I left behind collapsed in a few months because no one could support it. Some of it was rewritten by actual IT folks (and consider this a long overdue apology to you all), but most of it just disappeared. So what did I do wrong? Here's a laundry list of really bad things I did because I just didn't know much about good practices in software development.

  • Source control? What's that?
  • (Boss) Jeremy, where is this stuff running? (Me) It's running on personal web server on my box. Hey wait a minute, what's that awful sound coming from my hard drive and why won't my box reboot anymore?
  • (Boss) Do you have that stuff backed up? (Me) What does 'back up' mean?
  • Coding directly into a production database of a 3rd party product that was notorious for database corruption on projects ranging from little 100 million dollar projects to multi-billion dollar projects
  • ActiveX controls on ASP pages. Believe it or not we had some issues with installation.
  • ASP pages running against MS Access via ODBC in production
  • Connecting directly to production Oracle databases from ActiveX controls on an ASP page
  • Using the old Remote Data Service (RDS) library. Great stuff for productivity, huge security hole. Using RDS meant that I had the user name and password for the fragile Oracle production database embedded in each ASP page where any yokel could go "View Source" and hack our sensitive data.

I was focused on business needs and delivered, but I was dramatically ignorant of anything approaching decent development practices or design. I distinctly remember one humbling episode where I asked a senior automation expert for help comparing two different databases. He took a very short look at my several hundred lines of VBScript and banged out a SQL statement that did exactly what I needed in one line of code. His exact words were "you've got good ideas, but you would be much more effective if you just knew more." Ouch.

All right, I just shared some of the truly stupid and ignorant things I did in my Mort days. What did you do in your Mort days?

What's so wrong with being a Mort? After all, they're pretty common and they definitely add value to their employers. They're usually closer to the business and have a better understanding of the problem domain. For one thing, the Morts of the world are in mortal danger if I have to work with any more awful Mort-ish code (regardless of what language it's written in). For another, more serious reason, Mort jobs are the first developer jobs to be offshored. Lastly, if you're developing software, learn about your craft and climb the skill ladder. The whole Mort/Elvis/Einstein taxonomy may be crap, but the more you know the more effective you'll be, and that seems pretty pragmatic to me. Make your great business solutions stand the test of time.

Thursday, July 07, 2005

Great CMM quote at lunch today

At the Agile Austin lunch today we had a great quote. We were discussing using a continuous integration build to auto-generate documentation specifically for CMM compliance when somebody said "[doing CMM paperwork] is like paying protection money to the mafia."

Quarantine the ADO.NET Contagion in your Code

From my perspective the .Net world is clearly becoming more sophisticated in our approach to persistence. Reusable persistence solutions like NHibernate or NEO are starting to become more popular, reducing the need or desire to work with ADO.NET directly.

However, most of the .Net systems I encounter still have old-fashioned data access layers rolled by hand. That doesn't have to be bad; ADO.NET is pretty easy stuff, and the .Net community is generally pretty strong in data access. The problem is that ADO.NET usage has to be done right, every single time. Forget to close a connection or an IDataReader somewhere, add enough volume through the system, and watch your application roll over and crash because the database has too many open connections. The same goes for exception handling: you have to write the same try/catch/finally code every single time you use ADO.NET.
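Just to illustrate the kind of duplication I'm talking about, here is a sketch of the ceremony that tends to get copied into every data access method when there is no central execution class (the table, column, and class names are made up):

using System.Data;
using System.Data.SqlClient;

// Hypothetical data access method showing the repeated ADO.NET ceremony.
// Miss the finally block in just one of these and the connection leaks.
public class OrderCounts
{
    private readonly string _connectionString;

    public OrderCounts(string connectionString)
    {
        _connectionString = connectionString;
    }

    public int CountOrdersFor(string customerId)
    {
        IDbConnection connection = new SqlConnection(_connectionString);
        IDbCommand command = connection.CreateCommand();
        command.CommandText = "SELECT COUNT(*) FROM Orders WHERE CustomerId = @CustomerId";

        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = "@CustomerId";
        parameter.Value = customerId;
        command.Parameters.Add(parameter);

        try
        {
            connection.Open();
            return (int) command.ExecuteScalar();
        }
        finally
        {
            connection.Close();
        }
    }
}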

The second problem is a little more endemic. Because ADO.NET is so easy to use, developers coding a web page or a business project give in to temptation and write the data access code right where it is needed instead of separating out the data access concern with a little thought. These people are often colloquially known as "Morts." To the Morts out there: please stop doing this, especially if you're writing code that I have to support. Just in case it's not obvious, writing ADO.NET code straight into a business object basically eliminates any possibility of writing easy unit tests.

My advice is to simply count the number of times the following lines of code show up in your application. Offhand, I'd say this code shouldn't show up more than 3-5 times in the entire codebase.

  • connection.Open();
  • command.ExecuteNonQuery();
  • command.ExecuteReader();
  • connection.Close();

You can eliminate a grotesque amount of duplicated code by simply centralizing the low-level ADO.NET manipulation into one or a very few classes. It's much easier to write the error handling and connection management once than it is to get it right everywhere, and it gives you one place to build a better answer for tracing and instrumentation. All other higher-level data access layer classes call into the central execution class to actually execute database commands. With the possible exception of DataSets, nothing from the System.Data namespace should ever leak out beyond the classes that are specific to data access.



Here's an example of the inner code execution class from our data access layer:





using System;
using System.Data;
using StructureMap.DataAccess.ExecutionStates;

namespace StructureMap.DataAccess
{
    [Pluggable("Default")]
    public class DataSession : IDataSession
    {
        private readonly IDatabaseEngine _database;
        private readonly ICommandFactory _factory;
        private readonly IExecutionState _defaultState;
        private readonly ITransactionalExecutionState _transactionalState;
        private IExecutionState _currentState;
        private readonly ICommandCollection _commands;
        private readonly ReaderSourceCollection _readerSources;

        [DefaultConstructor]
        public DataSession(IDatabaseEngine database)
            : this(database,
                   new CommandFactory(database),
                   new AutoCommitExecutionState(database.GetConnection()),
                   new TransactionalExecutionState(database.GetConnection()))
        {
        }

        /// <summary>
        /// Testing constructor
        /// </summary>
        /// <param name="database"></param>
        /// <param name="factory"></param>
        /// <param name="defaultState"></param>
        /// <param name="transactionalState"></param>
        public DataSession(
            IDatabaseEngine database,
            ICommandFactory factory,
            IExecutionState defaultState,
            ITransactionalExecutionState transactionalState)
        {
            _database = database;
            _factory = factory;
            _defaultState = defaultState;
            _transactionalState = transactionalState;

            _currentState = _defaultState;

            _commands = new CommandCollection(this, _factory);
            _readerSources = new ReaderSourceCollection(this, _factory);
        }

        public int ExecuteCommand(IDbCommand command)
        {
            try
            {
                return _currentState.Execute(command);
            }
            catch (Exception ex)
            {
                throw new CommandFailureException(command, ex);
            }
        }

        public int ExecuteSql(string sql)
        {
            IDbCommand command = createCommand(sql);
            return this.ExecuteCommand(command);
        }

        private IDbCommand createCommand(string sql)
        {
            IDbCommand command = _database.GetCommand();
            command.CommandType = CommandType.Text;
            command.CommandText = sql;
            return command;
        }

        public IDataReader ExecuteReader(IDbCommand command)
        {
            try
            {
                return _currentState.ExecuteReader(command);
            }
            catch (Exception ex)
            {
                throw new CommandFailureException(command, ex);
            }
        }

        public IDataReader ExecuteReader(string sql)
        {
            IDbCommand command = createCommand(sql);
            return _currentState.ExecuteReader(command);
        }

        public ICommandCollection Commands
        {
            get { return _commands; }
        }

        public IReaderSourceCollection ReaderSources
        {
            get { return _readerSources; }
        }

        public void Initialize(IInitializable initializable)
        {
            initializable.Initialize(_database);
        }
    }
}

using System;
using System.Data;

namespace StructureMap.DataAccess.ExecutionStates
{
    public class AutoCommitExecutionState : IExecutionState
    {
        private readonly IDbConnection _connection;

        public AutoCommitExecutionState(IDbConnection connection)
        {
            _connection = connection;
        }

        public int Execute(IDbCommand command)
        {
            try
            {
                setupConnection(command);
                return command.ExecuteNonQuery();
            }
            finally
            {
                cleanupConnection(command);
            }
        }

        private void setupConnection(IDbCommand command)
        {
            command.Connection = _connection;
            _connection.Open();
        }

        private void cleanupConnection(IDbCommand command)
        {
            try
            {
                command.Connection = null;
                _connection.Close();
            }
            catch (Exception)
            {
            }
        }

        public IDataReader ExecuteReader(IDbCommand command)
        {
            try
            {
                setupConnection(command);
                return command.ExecuteReader(CommandBehavior.CloseConnection);
            }
            catch (Exception)
            {
                cleanupConnection(command);
                throw;
            }
        }
    }
}






In this example, all data access classes have to use a class called DataSession to execute an IDbCommand object or a SQL string. The DataSession class delegates to an IExecutionState object. The AutoCommitExecutionState deals with all of the connection opening and closing, as well as providing a consistent error handler to report the actual SQL (with parameters) that failed. By centralizing this operation in one spot, we can add extra code for more instrumentation or simply change the error handling across the board. Most importantly, other classes aren't cluttered with so much ADO.NET manipulation.
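
To make the usage a little more concrete, here is a sketch of how a higher-level data access class might call into DataSession. The InvoiceDataAccess class and its SQL are invented for the example; the only thing taken from the code above is the ExecuteReader(string) method, which I'm assuming is exposed on the IDataSession interface.

using System.Collections;
using System.Data;

namespace StructureMap.DataAccess.Example
{
    // Hypothetical higher-level data access class.  It never opens or closes
    // a connection itself; it just hands SQL to the DataSession and lets the
    // execution state deal with connections and error handling.
    public class InvoiceDataAccess
    {
        private readonly IDataSession _session;

        public InvoiceDataAccess(IDataSession session)
        {
            _session = session;
        }

        public IList FindOverdueInvoiceNumbers()
        {
            ArrayList invoiceNumbers = new ArrayList();
            IDataReader reader =
                _session.ExecuteReader("SELECT InvoiceNumber FROM Invoice WHERE DueDate < GETDATE()");

            try
            {
                while (reader.Read())
                {
                    invoiceNumbers.Add(reader.GetString(0));
                }
            }
            finally
            {
                reader.Close();
            }

            return invoiceNumbers;
        }
    }
}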




My Poor Friend

One of my previous career stops was at a very bad internal IT shop. The leadership in our division was making it painfully clear that developers were merely a commodity, and undeserving of anything like good working conditions, raises, or a career path. After years of abuse working on a truly horrendous VB6 system (inspired by the architectural writings of a "Super Mort"), a friend of mine named Scott secured a transfer to a different organization. Scott was baited by promises of getting to do hard core J2EE development, the hottest thing going at the time.



Upon arrival at the new digs, Scott was told they needed him to help clean up the existing system first to improve stability before he could start on the new J2EE work. It turned out that the existing system was a 900-page classic ASP application, written by shadow IT people, that had grown into a mission critical application. The instability problems turned out to be because none of the shadow IT developers had closed any of the ADO connections in the ASP code, causing the database (I don't know what the database was, but it wouldn't surprise me if it had been MS Access) to crash and burn with too many open connections. Scott's new job? Go through each and every ASP page and add code by hand to close connections, commands, and recordsets.

I know what you're thinking: just write a fancy Perl script with some regex magic and move on. No such luck; the code was too inconsistent. Just in case you've never been afflicted by classic ASP code, take the worst code you've ever written and put it into a blender. That's about what you'll find in ASP (my ASP code circa 1998 was pretty smelly).


I Don't Think Much of the DAAB

Just to be opinionated, I don't see much value in the original Data Access Application Block or its improved replacement in the Enterprise Library. It's still a low-level API without much there to guarantee good data access practices. I know a lot of people like the application blocks and they're better than nothing, but there are so many better tools out there. For that matter, I just don't think it is difficult to make a data access layer specific to your application that is superior to the DAAB in terms of connection management, transaction management, error handling, and instrumentation.

Wednesday, July 06, 2005

Attitudes towards Stored Procedures

My philosophy towards stored procedures has shifted so much over the past five years that I'm not sure I have a strong opinion on them anymore. At one point I would have told you with great fervor that you had to be doing data access through stored procedures to have a good application architecture; now I'm not so sure. On the other hand, I've seen developers write hundreds of lines of procedural code to do set-based manipulations that would have been relatively simple to express as a SQL statement.

For me today the decision to use stored procedures comes down to the particulars of the application, the comfort level of the rest of the team, the relationship with an application DBA (if he/she exists), and the relative pain or ease of configuration management for database code. For whatever reasons, a lot, if not most, shops do a terrible job with configuration management for database code and structure. As a consultant I worked on an engagement with a client that went through way too much manual process rigamarole just to get a stored procedure moved into the development server. Of course, they handled the database code moves in an unrelated process separate from the code, allowing all kinds of bad things to happen.

I spent a lot of my early career writing Oracle PL/SQL, so I'm pretty comfortable with writing and using stored procedures in general. And to be honest, I think a lot of ex-DNA developers (victims) like me are inclined to write sprocs because they were often easier to change and deploy than COM(+) DLLs, especially with a clustered server architecture. Case in point: my current employer has a deep VB6/ASP/Sql Server background, and stored procedures in T-SQL run rampant throughout our architecture. Developers who aren't comfortable with object oriented programming will often use stored procedures as an extensibility mechanism (I don't like this, but it does work).

When I moved to a shop doing TDD that was mostly stocked with J2EE veterans I was a little shocked at their disdain for stored procedures. In their world order, stored procedures were evil. Not very many of them could explain exactly why stored procedures were problematic, but they all knew it for established fact. Later on I started to understand their point of view. J2EE has a multitude of proven persistence mechanisms that isolate middle tier code from the database and eliminate a lot of manual SQL creation. In most cases, there was also a fear that using stored procedures would inevitably lead to business logic in the database. A general discomfort with relational databases probably explains a lot of the hostility towards stored procedures in some developer circles.

For many people the most important reason was that stored procedures are inherently more work to test than plain old objects. Just as importantly, unit tests that touch the database are slower to execute. A domain model approach coupled with some kind of metadata-driven persistence mechanism can lead to much easier unit testing and faster development overall. Stored procedures often simply interfere with persistence strategies.

Some O/R tools support stored procedures, but it’s often more trouble than it’s worth. If you want both O/R mapping and stored procedures, I’d actually recommend rolling your own tool. Writing a complete solution like NHibernate is hard, but creating a just good enough custom tool isn’t that bad. Reflection in .Net is pretty easy to use, and the System.Convert class makes type coercion relatively painless. If your domain model is small, just hard code the database mapper classes and move on with your life.
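
As a rough sketch of what a "just good enough" hand-rolled mapper looks like, here is the reflection plus System.Convert trick. The Customer class and column names are made up, and a real mapper would have to deal with nulls, keys, and updates, but this is the heart of it:

using System;
using System.Data;
using System.Reflection;

// Hypothetical domain class; column names in the reader are assumed to
// match the property names exactly.
public class Customer
{
    private string _name;
    private int _creditLimit;

    public string Name
    {
        get { return _name; }
        set { _name = value; }
    }

    public int CreditLimit
    {
        get { return _creditLimit; }
        set { _creditLimit = value; }
    }
}

public class CustomerMapper
{
    // Walks the properties of Customer and copies the matching column from
    // the IDataReader, using System.Convert to coerce the database type.
    public Customer Map(IDataReader reader)
    {
        Customer customer = new Customer();

        foreach (PropertyInfo property in typeof(Customer).GetProperties())
        {
            object rawValue = reader[property.Name];
            object converted = Convert.ChangeType(rawValue, property.PropertyType);
            property.SetValue(customer, converted, null);
        }

        return customer;
    }
}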

Enter the DBA

Another obvious factor in an organization’s attitude towards stored procedures is the relative strength and influence of DBA’s. Developer/DBA conflict can be even more harmful than the spitball fights between developers and testers. If a DBA is the only person allowed to write SQL, or has to approve every piece of SQL, you’re going to go slower. I hate having this kind of territorial bottleneck in the development process, but when you’re stuck with this situation my advice is to:

  1. Keep your DBA engaged in the development work of the rest of the team. Don’t let them work in a vacuum. It helps tremendously if the DBA actually has a stake in the success of the project.
  2. Aggressively use mocks or stubs to build business logic and user interfaces so you aren't stuck waiting on a DBA (see the sketch after this list). This also has the advantage of firming up the data access requirements before engaging the DBA.
  3. The DBA must follow the exact same configuration management practices as developers. Ideally, you want the DBA to check any changes into source control, monitor the continuous integration build, and fix any problems from their check-ins. Stored procedures are just code and should be treated as code. Make sure you help the DBA out with check-in policies because it is a different world for most of them. It's also a good way to give the DBAs some visibility into the project at large. I'm plenty paranoid about this one because of some early issues on a project last year when the DBA decided to start optimizing some PL/SQL and broke the application for a while.
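
For point 2, here is the kind of thing I mean: a hand-rolled stub standing in for a data access interface so the business logic can be written and unit tested before the real stored procedures exist. The IOrderGateway interface, the stub, and the NUnit test are all invented for illustration.

using System.Collections;
using NUnit.Framework;

// Hypothetical gateway interface that will eventually be backed by
// DBA-written stored procedures.
public interface IOrderGateway
{
    IList FindOpenOrders(string customerId);
}

// Hand-rolled stub; it just returns whatever the test puts into it.
public class StubOrderGateway : IOrderGateway
{
    public IList OpenOrders = new ArrayList();

    public IList FindOpenOrders(string customerId)
    {
        return OpenOrders;
    }
}

// Business class under test, completely ignorant of the database.
public class OrderCounter
{
    private readonly IOrderGateway _gateway;

    public OrderCounter(IOrderGateway gateway)
    {
        _gateway = gateway;
    }

    public int CountOpenOrders(string customerId)
    {
        return _gateway.FindOpenOrders(customerId).Count;
    }
}

[TestFixture]
public class OrderCounterTester
{
    [Test]
    public void CountsTheOrdersReturnedByTheGateway()
    {
        StubOrderGateway stub = new StubOrderGateway();
        stub.OpenOrders.Add("order 1");
        stub.OpenOrders.Add("order 2");

        OrderCounter counter = new OrderCounter(stub);

        Assert.AreEqual(2, counter.CountOpenOrders("ACME"));
    }
}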

Tuesday, July 05, 2005

Good and Evil in the Garden of Stored Procedures

Periodically I see or engage in an argument over the usage of stored procedures. I think a lot of these types of arguments are somewhat unproductive because the combatants are thinking and talking about radically different things. Besides, I think the argument over the proper role of stored procedures in data access is often obfuscated by the misuse of stored procedures for things other than data access. My opinion has changed quite a bit through the years, so I try to stay out of these arguments.

However, my development group has to make a change soon in our philosophy for using stored procedures (or hit a brick wall). The way my company has used T-SQL in the past is jeopardizing our future development (besides, I’m an Oracle guy and I think T-SQL is butt-ugly). We’re a product development company that has grown by leaps and bounds from a startup venture and the existing codebase has grown somewhat chaotically. To keep the growth going and accommodate new customers both here and abroad we’re going to need to localize the application for different languages and port our main application to Oracle. Toss in the ongoing adoption of TDD and continuous integration as a means of creating better software quicker, and the usage of stored procedures has to be reevaluated.

What’s Good about Stored Procedures

Before I start ranting about the grotesque misuse of stored procedures I’m dealing with, here is what I think is useful about stored procedures.

Using stored procedures externalizes the SQL from the compiled code. I really like having the SQL in an easy-to-read format instead of built up by string concatenation in the code. This is also great if you need to optimize the stored procedure for performance. I haven't seen that come up very often, but it only seems to happen on systems where the SQL is embedded in the code. There's also the issue of allowing a DBA to alter or write stored procedures independently of the code. I have generally had control of both database and code throughout my career, so I'm always suspicious of DBAs writing stored procedures instead of developers. I do think that dynamically constructed SQL can result in systems that are hard to debug, but that can be beaten with just a little judicious instrumentation in your data access layer.

Performance is an advantage of stored procedures, but maybe not enough by itself to justify them over other approaches. I think the performance advantage is either largely mitigated by the ability of newer database engines to cache execution plans, or simply a case of premature optimization. If you have an opportunity to batch up a set of updates or simply reduce the amount of information you send over the wire to the database, a stored procedure can certainly lead to a performance improvement. Otherwise, I think the stored procedure advantage is too slight to be a serious consideration.

Security is another frequently cited advantage of stored procedures. For one thing, using stored procedures quickly mitigates vulnerability to SQL injection attacks. Beyond that, the database can be tightened down so that no ad hoc SQL can be executed at all, stopping some security breaches right at the start. Personally, I abhor that idea because it drags down the development team's velocity, and I don't see why it's necessary, but it is an option.

Most of this goodness can be happily achieved with nary a stored procedure. Parameterized queries can eliminate SQL injection vulnerabilities while providing comparable performance. The data access layer we built for our last project (I might release it as part of StructureMap someday) already externalizes SQL from the code for ease of maintenance and decoupling from the database.
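
To be concrete about parameterized queries, here is the pattern I mean. The table and column names are invented, and I'm assuming the caller manages opening the connection; the point is that user input travels as a parameter value rather than as concatenated SQL text.

using System.Data;

// Hypothetical finder class.  Because the name travels as a parameter,
// the database treats it strictly as data, so input such as
// "'; DROP TABLE Customer; --" cannot change the shape of the statement,
// and the engine can still reuse the cached execution plan.
public class CustomerFinder
{
    private readonly IDbConnection _connection;

    public CustomerFinder(IDbConnection connection)
    {
        _connection = connection;
    }

    // Assumes the connection has already been opened by the caller (or by a
    // central execution class like the one described earlier).  The @Name
    // placeholder syntax assumes the SQL Server provider; OleDb and ODBC use
    // '?' markers instead.
    public IDataReader FindByName(string name)
    {
        IDbCommand command = _connection.CreateCommand();
        command.CommandText = "SELECT * FROM Customer WHERE Name = @Name";

        IDbDataParameter parameter = command.CreateParameter();
        parameter.ParameterName = "@Name";
        parameter.Value = name;
        command.Parameters.Add(parameter);

        return command.ExecuteReader();
    }
}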


When Stored Procedures Go Bad

While there is nothing inherently evil about stored procedures, developers often misuse stored procedures. I think stored procedures should be nothing but basic CRUD. I wrote in an earlier post that “IF/THEN/ELSE” logic in stored procedures is a design smell.

Stored procedures are a terrible place to create business logic. Development environments for the .Net languages or Java are far better suited for developing applications. Intellisense support, debugging, and refactoring are all inferior with stored procedure languages. T-SQL is a procedural language that pretty well forces you into the transaction script style of organizing business logic. One thing I’m observing in the reams of T-SQL I’m wrestling with is the obscene amount of code duplication. Now that we have to support Oracle as well, this duplication stands out like a sore thumb. An OO language can do a much better job of code reuse.

Here’s my list of things I’ve seen put into stored procedures that should never be in a stored procedure.

  1. User input validation that creates the validation messages seen on user screens. I can’t even begin to describe how stupid this is.
  2. Creating HTML, enough said.
  3. Security authorization logic. This is definitely a middle tier responsibility. I’ve used joins to some sort of permission table in select statements to filter records. I think that’s fine, but using procedural logic or even role checks inside stored procedures is nuts. Security is volatile and sensitive. Put it where it’s easy to test. For that matter, always leave a way to test your business logic without security checks if you can. Security always makes code harder to unit test. By wrapping security up with business logic in a bag of stored procedure goo, you’ve essentially shot yourself in the foot for testability.
  4. Business logic. It’s easier to write and test in C#, period. Splitting logic between stored procedures and middle tier code is even worse, but I’ve seen it done many times.

Friday, July 01, 2005

Insight from Scott Bellware

Fellow Austinite and champion ranter Scott Bellware has some great insight on the impedance mismatch between application architecture and the newer SOA paradigm.

Overall, I think SOA is an advance in the construction of software solutions, especially in large IT infrastructures. What always bugs me is the seeming disdain (or ignorance) the SOA guys have for good OO or even application design. I'm familiar with the situation and perpetrator of the (very foolish) guidance of mandatory SOAP web services between the UI and backend (even if the code is all running in the same process space!) "just in case" they need a connected service later. In no way does the existence of SOA services abrogate the need to create well-factored solutions. Write a well-structured code base and your application will always be able to expose SOA endpoints later. And if you're doing SOA, keep the internals of your service or system completely independent of the messaging transport. Like Scott implies, serializing your real domain classes as part of the SOAP contract strongly couples your clients to the internals of your service.