When once is not enough

Jane Prusakova's picture

The D.R.Y. principle of software design rests on one of the most natural and intuitive ideas: once is enough. D.R.Y. stands for "Don't Repeat Yourself" and applies not so much to code as to the representation of ideas and concepts in software systems. Every piece of knowledge, decision, policy, or algorithm should be represented exactly once. The concept was introduced by Andy Hunt and Dave Thomas in "The Pragmatic Programmer", first published in 1999.

However, the D.R.Y. principle is extremely hard to follow consistently in practice. Commonly, the same algorithm is implemented independently in different parts of the system, for different objects, and the same concept is implemented differently (with the intent of producing the same results) in several places. Many bugs arise when a concept needs to change but not all of its representations are modified, or they are not all modified in the same way.

A common case is a simple list of things. It is natural to store such a list in a database table. It also makes report generation easier, since reports can work strictly from database data.

However, when writing code, it is desirable to have each item in the list available as a value of an enumerated type. Working with an enumeration simplifies the code greatly. And because the list is not going to change often, keeping it in code also saves database accesses.

But storing the list both as a database table and as an enumerated type directly violates D.R.Y.: the list is now defined in two places. More importantly, it is now a very likely source of bugs, because one of the two definitions can be updated without the other. Depending on the usage, these bugs can be extremely hard to track down.

While there is no good solution that restores the D.R.Y. principle, there are ways to automate checking that the two implementations stay synchronized. A unit test can verify that the two lists are in sync and warn about a potential mismatch.

To verify that the two lists are the same, the unit test must retrieve all items from the database and ensure that each item has a corresponding enumerated value. It should also run through all defined enumerated values and verify that each has a corresponding database entry. Here's the pseudo code:

// Every database row must map to an enumerated value.
List<DatabaseItem> dbItems = retrieveAllFromDB();
for (DatabaseItem item : dbItems) {
    EnumeratedType type = EnumeratedType.get(item.id);
    Assert.valid(type);
}

// Every enumerated value must map to a database row.
for (EnumeratedType type : EnumeratedType.getAll()) {
    DatabaseItem dbItem = retrieveFromDB(type.id);
    Assert.valid(dbItem);
}

While this is not quite as good or right as following the D.R.Y. philosophy, it does prevent many errors, and makes bug hunting a lot simpler and more predictable. One of the more substantial drawbacks is that it requires more code and more maintenance. Still, it is miles better than simply defining the same objects in multiple places.
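For readers who want something they can run, the two-way check might be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the OrderStatus enum, its ids, and the hard-coded set standing in for the database query are all hypothetical, and a real test would load the ids from the actual table and fail the build on any mismatch.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class ListSyncCheck {
    // Hypothetical enumerated type mirroring a lookup table; names and ids are illustrative.
    enum OrderStatus {
        NEW(1), SHIPPED(2), CANCELLED(3);

        final int id;

        OrderStatus(int id) {
            this.id = id;
        }
    }

    // Collects the ids defined by the enumerated type.
    static Set<Integer> enumIds() {
        Set<Integer> ids = new TreeSet<>();
        for (OrderStatus status : OrderStatus.values()) {
            ids.add(status.id);
        }
        return ids;
    }

    // Returns the ids that are present in 'left' but absent from 'right'.
    static Set<Integer> missingFrom(Set<Integer> left, Set<Integer> right) {
        Set<Integer> diff = new TreeSet<>(left);
        diff.removeAll(right);
        return diff;
    }

    public static void main(String[] args) {
        // Assumption: stand-in for the ids a real test would read from the database table.
        Set<Integer> dbIds = new TreeSet<>(Arrays.asList(1, 2, 4));

        // Check both directions, as described in the article.
        System.out.println("In DB but not in enum: " + missingFrom(dbIds, enumIds()));
        System.out.println("In enum but not in DB: " + missingFrom(enumIds(), dbIds));
    }
}
```

With the sample data above, id 4 exists only in the stand-in database set and id 3 exists only in the enum, so each direction reports one mismatch. Comparing whole sets of ids, rather than looking items up one at a time, also keeps the check to a single database read.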

Comments

NY2TX's picture

Is once ever enough? But I suspect that it relates to more than "just" code.

Jane Prusakova's picture

Repetition in human-oriented knowledge tends to be caused by uncertainty, power struggles, and strategic games of various kinds. But software platforms do not (and are not designed to) handle those things at all, so once ought to be enough as far as software is concerned.

Jane Prusakova
Software Architect & Developer
My blog

jdunham's picture

What she said.

This principle applies to many things besides software, and existed long before 1999. Back in my early days as a mechanical engineer our drawings were ink on linen and were reproduced using the real, original blueprint process that produced white lines on a blue background (and smelled really bad for the first day or two). Back then our product "databases" hung on wooden bars in large cabinets and a database "access" consisted of a clerk pulling out a sheet of linen, rolling it up, and dragging it to the blueprint room.

And we had the same problem.

As soon as a copy is made, there are two instances. If I'm out on the factory floor with a blueprint, how do I know that the original back in the vault hasn't changed? How do I know that someone else with a copy hasn't marked it up with an "improvement"?

The bottom line here is that whenever there exists more than one copy of critical information there MUST be a single master, whether in a computer's database, a drawing vault, or elsewhere, and systems must be in place to manage any second (or more) copies, particularly with regard to changes.

Poor change management has broken MANY critical processes, and you can't have good change management without recognition of the problem of multiple instances of information.

--
Jerry Dunham
Change is inevitable, except from vending machines

threew's picture

Agree with Jane.

The need for repetition in knowledge related activities comes from a wide variety of human needs; not all are negative. Once is never enough in knowledge or understanding because these things are continuously iterating and incrementally changing.

Software design and build techniques should, optimally, result in an innate ability to handle changes, modifications, and improvements in the system effectively and quickly, for whatever reason. Once is optimal; twice is OK. But with every repetition of the same sequence within the system, the probability of error with subsequent change increases, and the slow-down effect on debugging is logarithmic.

William W. (Woody) Williams
Project Management Consultant
| Blog | Twitter |
w3src Consulting

NY2TX's picture

At least twice is nice.

jdunham's picture

It is if you're designing fault-tolerant computers.

--
Jerry Dunham
Been there; done that.

NY2TX's picture

I don't think I was...LOL

kemulholland's picture

You'll solve the D.R.Y. problem when you convince people that it's more work to repeat yourself than to say it once and refer back to what you said. You can fix the process and you can provide the technology to make it easy, but the problem doesn't go away until you successfully change how people think.

The D.R.Y. issue exists for technical documentation, too. Writers repeat themselves because their readers want to find information in the place where it occurs to them to look for it, and they get cross if it's not there. Other than that, it is fundamentally the same problem as in software, with fundamentally the same solution.

There are two parts to the problem and its solution in the realm of documentation. The first part is to write the material once and use it wherever it's needed (content reuse); the second part is to ensure that it's always referenced, not copied, from the same place (single-sourcing). You could summarize this as "say it once, then play back the recording as often as necessary." This saves time, reduces errors and slashes translation costs; the last of these is usually the argument that drives its adoption. (Does any of this sound like old information in a new context?)

These are not yet universal practices in technical writing, because they require the writer to think and write in reusable "chunks" instead of whole chapters and manuals. (Gee, do you suppose something analogous is true on the software side?) As usual, fixing the people is the hard part of fixing the problem. Trust me: They're not going to throw rose petals at you when you liberate them from the drudgery they know, unless you spend time and effort on getting them happy with the idea.

jdunham's picture

kemulholland wrote:

"The D.R.Y. issue exists for technical documentation, too. Writers repeat themselves because their readers want to find information in the place where it occurs to them to look for it, and they get cross if it's not there. Other than that, it is fundamentally the same problem as in software, with fundamentally the same solution."

I think it's an even greater problem in documentation than it is in SW and HW. People expect the written word to flow and not appear boring, while nuts, bolts and SW modules don't care. The temptation is to write the same thing a different way each time it appears, and this makes document maintenance SO much more difficult and haphazard.

In the end, it becomes exactly the same problem, as you write, but it is a difficult culture to change, and that's the real "change management" problem.

--
Jerry Dunham
The more things change the more they stay the same

Jane Prusakova's picture

Writing documentation, and writing for human consumption in general, has been around longer, and has deeper habits and traditions than writing for machines (i.e. writing code of any kind). That makes it so much harder to change to use the newer D.R.Y. paradigm.

More importantly, written works are traditionally meant to be used in sequence - the same way they were created. A reader expects to find a particular piece of information several times throughout the text, rather than be referred to the same single piece of content multiple times.

However, this sequential tradition has been breaking down since HTML, with its links, was introduced and became popular in the early 90s. Being able to jump easily between chunks makes it easier to write reusable content.

Jane Prusakova
Software Architect & Developer
My blog

jdunham's picture

Too bad we can't use HTML in printed material!

--
Jerry Dunham
Hyperlinking dead trees

kemulholland's picture

Ah, but we can use HTML (or better yet, XML) to *make* the printed material, along with the on-screen version...and when you've written it in reusable chunks, the print and screen versions don't have to look the same or be organized in the same way. You set up the build parameters to optimize each output format to take advantage of its intended medium.

It's called single-sourcing, and it can drastically cut product documentation costs while improving quality. Sure, sure, anyone can make wild claims; but here's the data: I've used this approach successfully to singlehandedly manage a product documentation library of a size that would normally take two or three people to maintain - and won two international awards for the resulting material, because it was so much easier to enforce consistency across the entire collection of manuals and help.

[shameless plug] Contact me if you want to know how single-sourcing can lower your product documentation costs.

KEM

jdunham's picture

You're write, of course. (Sorry for the pun.)

The problem I see is that the weight of tradition limits the use of single-sourcing. Perhaps it's gotten better, but when I last worked with technical writers I didn't see it. They were very good at controlling the masters for their documents and maintained excellent rev control, but they did NOT take the same concept deeper into the documents and apply it to chunks of specific information.

--
Jerry Dunham
Search-and-destroy ... er ... search-and-edit