Jared's Weblog

Jared Richardson

Sun, 14 Aug 2005

Blog technical issues

I had a group of blog source files get their date-time stamps reset tonight. I think I restored everything from a backup and reset the timestamps with the touch command, but if your blog aggregator thinks I'm spamming you, I apologize. :(

FYI, I had a Linux box behaving strangely so I bounced it. Apparently the OS hadn't flushed the disk buffers for a week! I've ~never~ had Linux do this before and it's running on older hardware, so I suspect something on the disk or motherboard is getting flaky, but a week!!? Wow! I lost a ton of Wiki entries but all the blog entries were backed up.

:(

I gotta break down and buy one of those Mac Minis....

Jared

posted at: 18:41

Rails with MySQL hint

If you've used MySQL before, this is just tribal knowledge, but if you're trying out Rails for the first time, you might run into it. I know two people who hit this situation this week, so I'm posting it here.

Some MySQL installs ship with all networking turned off by default. This is a security measure that makes your MySQL much safer from network attacks, but it also blocks your own network use. :) So, if you see this error message:

Errno::ECONNREFUSED (No connection could be made because the target machine actively refused it. - connect(2)): 
    c:/ruby/lib/ruby/gems/1.8/gems/activerecord-1.11.1/lib/active_record/vendor/mysql411.rb:47:in `initialize' 
    c:/ruby/lib/ruby/gems/1.8/gems/activerecord-1.11.1/lib/active_record/vendor/mysql411.rb:47:in `new' 
    c:/ruby/lib/ruby/gems/1.8/gems/activerecord-1.11.1/lib/active_record/vendor/mysql411.rb:47:in `real_connect' 
    c:/ruby/lib/ruby/gems/1.8/gems/activerecord-1.11.1/lib/active_record/connection_adapters/mysql_adapter.rb:39:in `mysql_connection'
and so forth and so on, then you probably need to turn on MySQL networking (assuming MySQL is running, of course).

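In /etc/mysql/my.cnf (or your platform's equivalent, such as my.ini on Windows), find skip-networking. Comment it out (by putting a # at the beginning of the line), then restart MySQL so the change takes effect.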
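If you'd rather see the edit than read about it, here is a minimal Python sketch of what "comment it out" does to the config text. The function name and the sample config are my own illustration, not anything MySQL ships; in practice you would just make the change in a text editor.

```python
def comment_out_skip_networking(config_text):
    """Return config_text with any active skip-networking line commented out."""
    fixed_lines = []
    for line in config_text.splitlines():
        # Only touch an active setting; lines that already start with # are left alone.
        if line.strip() == "skip-networking":
            fixed_lines.append("#" + line)
        else:
            fixed_lines.append(line)
    return "\n".join(fixed_lines)


# Hypothetical config contents, for illustration only.
SAMPLE_CONFIG = "[mysqld]\nport = 3306\nskip-networking\n"

print(comment_out_skip_networking(SAMPLE_CONFIG))
```

The only line that changes is the skip-networking line, which picks up a leading #.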

Rail on!

Jared

posted at: 18:36

Test Driven Refactoring

This is something that people have probably been doing for years, but I haven't seen it named and I've found it's easier to remember something with a "name", so TDR it is. :)

If you are a developer or tester trying to get an automated test suite for your product, you may find yourself in the uncomfortable position of trying to test a product that doesn't have good "hooks" for your testing framework. It's not impossible to test "untestable" code, but the effort involved in test creation and maintenance usually cancels out the benefit of the automation.

For instance, you might be trying to test an HTML page that has a lot of unnamed fields. So instead of using the "name" or "id" attributes to locate a field, you count the items on the page and check the value in the 5th item. Not only is this very difficult for someone else to understand and maintain, the test is also very fragile. Often changes to the page will break your test.
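To make the fragility concrete, here is a small Python sketch using only the standard library's html.parser. The two page snippets, the field ids and the helper names are all made up for illustration. The second page is the first one with a search box added at the top.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect the attributes of every <input> element, in page order."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.inputs.append(dict(attrs))


def collect_inputs(html):
    parser = InputCollector()
    parser.feed(html)
    return parser.inputs


def value_by_position(html, index):
    """Fragile: depends on how many fields come before the one we want."""
    return collect_inputs(html)[index].get("value")


def value_by_id(html, field_id):
    """Sturdier: finds the field by its id, wherever it sits on the page."""
    for attrs in collect_inputs(html):
        if attrs.get("id") == field_id:
            return attrs.get("value")
    return None


# Hypothetical pages: PAGE_V2 is PAGE_V1 with a search box added up front.
PAGE_V1 = '<form><input id="first-name" value="Alice"><input id="city" value="Boston"></form>'
PAGE_V2 = ('<form><input id="search" value=""><input id="first-name" value="Alice">'
           '<input id="city" value="Boston"></form>')

print(value_by_position(PAGE_V1, 1), value_by_position(PAGE_V2, 1))
print(value_by_id(PAGE_V1, "city"), value_by_id(PAGE_V2, "city"))
```

The positional lookup silently starts returning the wrong field once the page changes, while the lookup by id keeps returning the city.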

What's the solution? Tell your manager you want to start Test Driven Refactoring. You want to start adding the simple hooks to your application that make it possible (or feasible) to do good automated testing.

First, create (or borrow) a test plan for the product. Keep the plan simple at first. Don't try to hit the moon on your first attempt. Shoot for a simple pass that exercises basic functionality.

Second, write the test using some framework (JUnit, MbUnit, whatever). Make the test go as far as you can. When you get stuck, what's the least amount you can add and get the test running? Is it just a return code to an existing API or an "id" attribute for an HTML field?

Don't try to "boil the ocean" with your first pass. Don't try to add "id" attributes to every page in your product; shoot for the one page you need to get the test passing. If your test plan hits a roadblock, don't stop. Remove the part of the test plan you can't test and move on. Remember that you can also add the hard part to your next test plan.

The goal here is incremental improvement and momentum. As you make small improvements, your developers will start learning how to create testable products. As you write automated tests, you'll start learning tricks too. You'll be surprised at how much support you'll get from developers, testers and even managers once you've got a basic automation suite in place.

posted at: 18:23

Test Driven Development is faster than test after development.

Here's an interesting idea that makes a lot of sense. Keith Ray's blog entry on the subject is insightful, and it has a nice description of what TDD actually is.

Having an automatable test suite up front makes a huge difference in how quickly you can get everything working ~right~ and how fast you can make major changes later. If you've never tried TDD, try it out with a small project. It's amazing how much code works almost (but not quite) the way you thought it did.

posted at: 18:23

Wal-mart has Ship It! in stock!

Although Amazon still doesn't realize that our book is in their warehouse, Wal-Mart finally has the book listed as in stock and orderable! Since it's our first book, I guess I tend to watch things like this more closely than most authors. :)

Here's the link

Although the book has been selling on the Pragmatic Programmer site for nearly a month, it's nice to see it starting to trickle out into the traditional channels...

Now how long before I can find a copy in a local brick-n-mortar bookstore....



posted at: 18:23

Ship It stuff
I've set up an online store at Cafe Press with a few bits of Ship It! stuff. There are a few shirts, some coffee mugs, etc. There is also a poster version of the Ship It poster.

Check out the online store and let me know what you think.

posted at: 18:23

Lisp versus Erlang

I got this note from Will Gwaltney and it was too good not to pass along!

This guy wrote a poker server in Common Lisp, then switched over to Erlang because of its better performance and feature set. His server can now handle 27,000 simultaneous poker games *on his Powerbook*! Note that he doesn't have the web front end in place yet, but still! He also gets "load balancing, fault tolerance, failover, a database that lets me store objects without writing a lot of code, etc." for free in Erlang.

Several lessons here:

1. The old "Golden Hammer" trap ("If all you have is a hammer, everything looks like a nail"). Don't let your love for a certain technology (no matter how good it's been to you in the past) blind you to its inappropriateness *for a specific task*.

2. Open source *does* do the job. Erlang was developed at the Ericsson Computer Science Laboratory a number of years ago. Even though it's open source, it's got the backing of a big company. They're motivated to make it as good as it can possibly be, because they use it themselves.

This link is for the original story.



posted at: 18:23

Continuous Integration... Why Bother?

As Ship It! is starting to trickle out into the world, we are beginning to get feedback from readers around the world. As we hear from you about the topics in the book we'll try to post some of the more interesting replies to shine more light on some (hopefully) interesting topics.

M. Fridental asked some questions about our justification for using a Continuous Integration system.

One of the reasons we use a CI system is to catch compile problems quickly. M. pointed out that the nightly or weekly builds would catch the same errors, and he's correct.

(The problems) all will be detected during the next release (say, in two weeks), but CI can detect it quicker (say, in two hours). One must decide for himself if the advantage of quick bug reports covers extra costs of creating and supporting separate build environment.

True, but if you've got a lot of code changes during those two weeks you can waste a lot of time running down the problems. The CI system isolates each code change so that you don't have that pain. Most people I've worked with think that pain is just a part of development, but after a few months of running with a CI system, they can't imagine working without one. I guess that's where I'm at today. I truly can't imagine trying to work without a CI system in the background.

Problems can also be found and fixed by the developer who changed the code. Most CI systems will send mail to all the developers who've changed the code since the last good build.

Haven't we all tried to fix someone else's code at one time or another? It's not that you can't fix it. It's that the developer who wrote the code can usually fix it faster than you can. It saves time.

Last point: in a weekly build, who runs down the errors? Is it the new guy in the shop who doesn't know the code well enough to fix the problems? You might dump it on the new guy so he can learn, but what you're really doing is giving him the job you don't want to do yourself.

However, M. Fridental liked what we had to say about using the CI system to run the project's tests.

this testing argument is probably the strongest one for using CI. I must remind my colleagues every week to run test cases more often. With CI I can determine myself how often test cases will run.

If the CI system is running your tests every time the code changes, then functional breaks can be fixed very quickly. If you wait for the end of the week to run a test suite, how do you figure out which change broke the test? You waste more time code diving! Use a CI system and have the changes to your code isolated.
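Here's a toy Python sketch of why isolating changes matters. Nothing here is a real CI system's API: the change records, the author names and the fake_tests_pass stand-in are all invented for the example. One function tests after every change, the way a CI system does; the other tests the whole week's batch at once.

```python
def first_breaking_change(changes, tests_pass):
    """CI style: test after every change and return the first one that fails."""
    applied = []
    for change in changes:
        applied.append(change)
        if not tests_pass(applied):
            return change
    return None


def weekly_build_suspects(changes, tests_pass):
    """Weekly style: one test run for the whole batch, so everyone is a suspect."""
    if tests_pass(list(changes)):
        return []
    return sorted({change["author"] for change in changes})


# Hypothetical week of check-ins.
CHANGES = [
    {"id": 101, "author": "jim"},
    {"id": 102, "author": "sue"},
    {"id": 103, "author": "mike"},
]


def fake_tests_pass(applied_changes):
    # Stand-in for a real build and test run: pretend change 102 breaks the tests.
    return all(change["id"] != 102 for change in applied_changes)


culprit = first_breaking_change(CHANGES, fake_tests_pass)
suspects = weekly_build_suspects(CHANGES, fake_tests_pass)
print(culprit["author"])
print(suspects)
```

With per-change runs you know exactly which check-in to look at. With the weekly batch, all you know is that one of three people broke something.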

Build Continuously
Test Continuously
Fast Feedback Leads to Fast Fixes


posted at: 18:23

Isolating databases

Eric Starr in Charlotte, North Carolina had a few questions about database isolation...

I have a question about Chapter 1 Develop in a Sandbox (from Ship It!).

From Page 16: "That may sound easy enough, especially in terms of isolating source code (see Practice 2, Manage Assets, on page 19), but the real trick is to remember that it applies to all resources: source code, database instances, web services on which you depend, and so on."

I understand and practice isolating source code. I have never thought about isolating the database instance for each individual developer. What I am curious about is how practical this has been for you given the frequency of database schema changes during the beginning of a project? How do you manage schema changes across a development team when each user has their own instance of the database? Do you put the schema meta data in SCM?

First, I always insist on having a tool that creates the schema. Usually this means a Java program that can read in a schema definition (even if it's just straight SQL) and feeds it to the database... but it could also be Perl, Ruby, whatever. Then the tool and the schema definition can be stored in your Source Code Management system (SCM). This means we can always reproduce our schema on any machine that has the database installed on it by just running the tool. We usually just write something simple (JDBC connections with basic error handling).

Second, because the entire tool can be checked into the SCM software, we manage the schema just like code. When a change is made, let everyone know to update their schema.

Along the same lines, we try to have a script to load minimal test data as well. This way every developer can have a "real" database for coding against without worrying about messing up anyone else's data.
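As a rough idea of how small such a tool can be, here is a sketch in Python with SQLite rather than the Java and JDBC we actually used, so it runs with no database server installed. The table, the test rows and the function names are hypothetical. In a real project the two SQL strings would live in files checked into your SCM next to the tool.

```python
import sqlite3

# In a real project these would be files under source control.
SCHEMA_SQL = """
DROP TABLE IF EXISTS customer;
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""

TEST_DATA_SQL = """
INSERT INTO customer (id, name) VALUES (1, 'Test Customer One');
INSERT INTO customer (id, name) VALUES (2, 'Test Customer Two');
"""


def create_schema(connection):
    """Feed the schema definition to the database."""
    connection.executescript(SCHEMA_SQL)


def load_test_data(connection):
    """Load the minimal test data every developer codes against."""
    connection.executescript(TEST_DATA_SQL)


def rebuild_database(path=":memory:"):
    """Create a fresh database at path and return an open connection to it."""
    connection = sqlite3.connect(path)
    try:
        create_schema(connection)
        load_test_data(connection)
    except sqlite3.Error:
        # Basic error handling: don't leave a half-built connection open.
        connection.close()
        raise
    return connection


db = rebuild_database()
customer_count = db.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(customer_count)
```

Because the schema script drops and recreates its tables, running the tool again on the same database gets you back to the same starting point.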

Do you typically use a local instance of the database or a remote instance?

I try to have a local install. Most vendors will provide cheap development licenses... or you can use MySQL or PostgreSQL, which are free for everyone. Even if you can't go production with one of the free databases, if you are careful with your SQL, you can code cross-platform queries. I've found it to be worth the extra work to have a complete database available to every developer and tester.

Do you typically determine which database to hit using an environment variable and a properties file or some other way?

Yup. I wrote a JDBC connection pool a few years back and I always use that one. :) But the idea is simple... have a Java properties file with your database information in it. Connection URL, usernames, etc. It makes it trivial to switch out databases.
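The same idea, sketched in Python instead of Java. The DB_PROPERTIES environment variable, the db.properties file name and the keys are names I made up for the example, and the parser only handles plain key=value lines and # comments, not the full Java properties format.

```python
import os


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first = only, so values may contain = themselves.
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def load_db_settings(environ=os.environ, default_path="db.properties"):
    """Pick the properties file from an environment variable, then parse it."""
    path = environ.get("DB_PROPERTIES", default_path)
    with open(path, encoding="utf-8") as handle:
        return parse_properties(handle.read())


# Hypothetical file contents for one developer's local database.
SAMPLE_PROPERTIES = """
# local developer database
db.url=jdbc:mysql://localhost:3306/devdb
db.user=dev
"""

settings = parse_properties(SAMPLE_PROPERTIES)
print(settings["db.url"])
```

Switching databases then means pointing DB_PROPERTIES at a different file; no code changes.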

Do you typically populate the database with test data (and remove the test data) using scripts (possibly useful for automated testing so that your data is at a steady/known state at the beginning of every test cycle...you may answer this one later when I get to the chapter about automated testing)?

Yes. In the past I've used a Continuous Integration system with an Ant script that would wipe a database, load the test data and then run the tests. If you have the schema creation tool mentioned earlier, this becomes easy to do.
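The wipe, load, test cycle looks roughly like this in Python's unittest with SQLite. This is a stand-in for the Ant script, not a copy of it, and the account table and test names are invented. The first test changes the data on purpose; the second only passes because the database is reset before every test.

```python
import sqlite3
import unittest

# One shared database, the way a team test database would be shared between tests.
SHARED_DB = sqlite3.connect(":memory:")


def reset_database(connection):
    """Wipe the database and reload the known test data."""
    connection.executescript("""
        DROP TABLE IF EXISTS account;
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
        INSERT INTO account (id, balance) VALUES (1, 100);
    """)


def current_balance(connection):
    return connection.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


class AccountTests(unittest.TestCase):
    def setUp(self):
        # Every test starts from the same known state.
        reset_database(SHARED_DB)

    def test_a_withdrawal_changes_balance(self):
        SHARED_DB.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
        self.assertEqual(current_balance(SHARED_DB), 70)

    def test_b_starting_balance_is_known(self):
        # Passes only because setUp undid the withdrawal made by the test above.
        self.assertEqual(current_balance(SHARED_DB), 100)


def run_suite():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```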

These questions cover the "how" but not the "why" of database isolation. The biggest reason is if everyone shares a database and some of the code you write modifies the data, how does anyone ever know the source of a problem? Is it my code or did Jim corrupt the data again? The last query ran slowly. Did I code some inefficient SQL or was Sue swamping the database again? Isolated instances help you to quickly recognize and understand problems when they occur.

The basic idea is to keep every resource isolated. You will spend more time up front building some basic tools (like the SQL schema creation tool and the data load tool), but I've found these tools to be worth their weight in gold. They can be re-used across projects and make many useful automation tasks trivial.

posted at: 18:23

Continuous Integration in the enterprise environment

Have you wanted to try out Continuous Integration but were afraid it wouldn't scale to your environment? Then this story is for you!

Will Gwaltney and I helped to introduce and roll out CruiseControl (a Java CI system) at SAS (the world's largest privately owned software company). Because we have five million lines of Java code and nearly 300 projects, some people were sure that a CI system would swamp the build infrastructure.

It did take a lot of resources to start up the system, but we dodged that by staggering the project startups.

Once CC was up and running, the load was amazingly light. Even with nearly 1,000 developers involved, we never had a situation that swamped the build environment. People being people, everyone finished up their work at a slightly different time, and that staggered the load.

Read more about CruiseControl at SAS at Mike Clark's blog.

posted at: 18:23

So what is this blog about anyway?

I've had a team-wide project weblog at work for about a year (a plog) and a company-wide weblog (a blog) for about six months. This public blog is the next step. Will Gwaltney and I recently published our first book, Ship It! A Practical Guide to Successful Software Projects, so this seemed to be the right time to get a public blog rolling.

I intend to repost a number of my work blog entries (with slight retooling) in the next few weeks, but after that the volume should drop back considerably.

So who am I? I've worked as a developer, a tester and a manager at companies as large as SAS and IBM and as small as a three person startup. During that time I've been entry level, director level and on a few occasions, a consultant. I started writing software in early high school on a TI-99/4A and have enjoyed tinkering with technology ever since. I've built a few computers, water cooled a few, but these days having two daughters keeps me too busy to tinker with radiators in the computer room.

posted at: 18:23

Just get something running!

Imperfect tests, run frequently, are much better than perfect tests that are never written at all. -Martin Fowler

Kind of says it all, doesn't it? Get it out there and get it running, perfect or not... fix it, perfect it, whatever, later (if you decide it even needs fixing!).

Momentum is a powerful concept... and you'll get more benefit from the first "Hello World" automated test than you ever realized.

posted at: 18:23

Measure something to improve it

I've been telling a story for years but couldn't remember where I originally read it. I stumbled across it recently so I thought I'd post the story. It's an interesting insight into human nature.

A manufacturing facility wanted to improve production, so they put people with clipboards on the manufacturing floor to record information.

The first thing they did was turn up the lights. Sure enough, production went up!

The next thing they tried was dimming the lights. Production went up again!

Ignoring their choice of environmental factors to tinker with (the lights?), the lesson is interesting. Production increased because people realized that someone cared. The line workers realized that their work was important. "I'm not just a cog in a big machine," they thought. "Someone is watching my work!" As a result, production increased.

Apparently this is called the Hawthorne effect (Wikipedia link). I also saw it cited in Peopleware recently.

Of course, this also leads one to consider the old statement "Be careful what you measure because that's what you'll get."

posted at: 18:23

How do you track your source code?

I've had a few people ask me why we included something as basic as source code control in our book. "Doesn't everybody use CVS or Subversion?" they ask. Actually, no, a lot of teams don't use anything.

This conversation was posted on the web...(Google finds ~everything~!)... the guys involved were discussing what their manager wants them to use for source code management.

Apparently they sent the source code out in email... that way if the disk crashes, you can just poll your email from another machine. Here's the link:

http://tunes.org/~nef/logs/haskell/05.02.13

......
20:30:05 The most useful part is the extremely clear explanations of what can happen if you don't do the things it recommends.
20:30:41 "if your hard drive died right this instant, how much work would you lose? if it's more than a day of work, you need to use your SCM better."
20:31:10 SCM?
20:31:16 source code management
...
20:32:07 Anyway, I've been fighting to put everything we change into darcs so that we can instantly rebuild our projects, but I've gotten some opposition.
20:32:12 Ha! Mailinglist is a good idea.
20:32:40 on what grounds?
20:32:45 You send all your code to the mailing list. If your disk crashes you can retrieve your code back from the mailing list.
20:32:54 shapr: What do you use at the moment?
20:33:00 Opposition on the grounds that it's an extra step that just wastes time.
20:33:03 (If anything.)
20:33:09 This is also a reason why when people ask "what is a good editor" I reply "outlook express".

Wow...

First, you keep no history of your changes. A good source code management system (see this page for a list of products) will track the history of a file. Who changed this file? When? Why?

Second, your changes and Mike's changes don't get merged in together automatically. Your changes just overwrite Mike's. :) Sorry Mike! A real versioning system will either lock the file so you and Mike can't edit the file at the same time or it will merge in the changes.

Third, security... are you sending out the source code for your company through email? By the way, how much source do you have? Can you send it all or does that take too long?

Given that products like CVS and Subversion are free and they integrate with almost every editor on the planet, there really isn't a good reason to not use source code management software! Keep fighting the good fight guys!

So what can these guys do? I'd suggest trying to show the benefits of source code control.

  1. Install Subversion (or CVS) on an existing machine. Preferably not your desktop, but if that's the only choice, use a desktop.
  2. Start using it yourself everyday. Get used to the tool and learn to use it. Show that it's not going to slow you down.
  3. Show it to your coworkers. Show them how it is tracking your changes, how it can handle collisions, etc.
  4. Finally, show it to the boss. Have a week's worth of changes to the code so they can see the merges. Show them how you can pull out last Tuesday's work on demand.

It's possible that the boss still won't get on board... if so, I'd run a CVS or Subversion box in the corner and not tell anyone (assuming there is no direct order to NOT do it). :) It's not a radical departure that will hurt anything. Many an obstinate boss will allow something just because the developers want it. Don't underestimate your ability to introduce change.

posted at: 18:23

Coding to an Interface by Erich Gamma
I saw two interviews with Erich Gamma today at Artima.com, but the second interview caught my eye. (Okay, they both caught my eye, but I blogged on the first article at work, so I'm trying to do something different!)

The summary reads:

In this interview, Erich Gamma, co-author of the landmark book, Design Patterns, talks with Bill Venners about two design principles: program to an interface, not an implementation, and favor object composition over class inheritance.

What caught my attention? Coding to an interface is a major part of Tracer Bullet Development, which is a major part of Ship It.

One of the big ways we've used Tracer Bullet Development is by defining the interfaces (or APIs) between major parts of your system. That's the design by contract between the developers in each section.

As long as the interface doesn't change, the developers in a given area can go crazy with retooling or refactoring. The developers above and below shouldn't know that anything changed.

These interfaces also give you a great place to code your automated test suites. By testing at the API level you (1) ensure the entire product area still works properly and (2) validate that the code works as intended instead of "working as coded".
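A tiny Python sketch of the idea, with a made-up PriceService interface and made-up prices (none of this comes from the book). The contract check is written against the interface only, so the same test covers the original implementation and the retooled one.

```python
from abc import ABC, abstractmethod


class PriceService(ABC):
    """The agreed interface between two parts of the system."""

    @abstractmethod
    def price_in_cents(self, sku):
        """Return the price of the given item in cents."""


class TablePriceService(PriceService):
    """First implementation: a plain lookup table."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_in_cents(self, sku):
        return self._prices[sku]


class CachingPriceService(PriceService):
    """A retooled implementation; callers cannot tell the difference."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def price_in_cents(self, sku):
        if sku not in self._cache:
            self._cache[sku] = self._backend.price_in_cents(sku)
        return self._cache[sku]


def check_price_contract(service):
    """API-level test: must hold for any implementation of PriceService."""
    assert service.price_in_cents("book") == 2995
    assert service.price_in_cents("mug") == 1200
    # Asking twice must give the same answer.
    assert service.price_in_cents("book") == 2995
    return True


# Hypothetical prices, for illustration only.
PRICES = {"book": 2995, "mug": 1200}

original = TablePriceService(PRICES)
retooled = CachingPriceService(TablePriceService(PRICES))
print(check_price_contract(original), check_price_contract(retooled))
```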

There is an entire chapter on Tracer Bullets in Ship It, but this should give you a basic idea... it's nice to see Erich's article and know that we lined up with a smart guy! :)

posted at: 18:23

Automated Linux Kernel Testing At IBM
IBM is sponsoring automated tests of the Linux kernel. The Slashdot post is here.

This is a step in the right direction, but a little short of what a complete Continuous Integration system can provide. The IBM system runs within 15 minutes of a release (see here) which is ~awesome~. It would be nice to have a system that also watches the source code management software (CVS or Subversion) as well. (I don't intend this to be a criticism of the great work done by the IBM team, but rather educational about Continuous Integration in general).

First, your complete CI system only rebuilds when the code changes, and you only incur the overhead of a test run when you need a test run. Running nightly is often a waste of time for most products.
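The "only rebuild when the code changes" part boils down to a comparison. Here is a minimal Python sketch, assuming the CI system can ask the SCM for an identifier of the latest revision; the revision names and the simulate helper are invented for the example, and no real CruiseControl API is shown.

```python
def poll_once(current_revision, last_built_revision, build_and_test):
    """Run build_and_test only when the repository has a new revision."""
    if current_revision == last_built_revision:
        # Nothing changed since the last build, so skip the build and test run.
        return last_built_revision
    build_and_test(current_revision)
    return current_revision


def simulate(polls):
    """Feed a series of polled revisions through poll_once and record the builds."""
    built = []
    last_built = None
    for revision in polls:
        last_built = poll_once(revision, last_built, built.append)
    return built


# Hypothetical results of six polls of the SCM; only three of them see new code.
POLLS = ["r10", "r10", "r11", "r11", "r11", "r12"]

builds = simulate(POLLS)
print(builds)
```

Six polls, three builds. A nightly job would build every night whether or not anything was checked in.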

Second, with a Continuous Integration system, you isolate the changes to the code. If ten developers push code on Thursday and the tests break, who fixes them? Or rather, who goes exploring through the code and the tests to figure out whose changes broke the tests?

There are lots of projects, open source and commercial, that you can use for this type of work. I'm a big fan of CruiseControl, which is a Java implementation. See the Resources page for more Continuous Integration systems.



posted at: 18:23