Bus timetables and storage efficiency

Sometimes storage insight comes from the most unlikely of places…

I recently attended an IBM IT Architect networking event where the topic was IBM’s Smarter Planet initiative. One of the talks, by IBM software Architect Andy Heys, was on the topic of Smarter Transportation. His talk included an example of IBM’s Traffic Planning Tool and how IBM had developed a traffic prediction model for the Singapore Land Transport Authority (LTA).

This gave me an unexpected insight into the parallels between Singapore bus timetables and storage thin provisioning. Seriously!

The bus bit

The story goes that, after developing a system to accurately predict traffic flow up to 60 minutes into the future, IBM suggested taking the next step: adjusting traffic signals to reduce congestion. It’s a natural assumption that the study and prediction of traffic flow would then be used to improve that same traffic flow…


At this point, says Andy, “we realised that they had more insight into the problem than ourselves”. The LTA was more interested in improving bus arrival time prediction than in directly improving traffic flow. That’s because they understood that any improvement in traffic flow would simply attract more traffic to the roads, which would rapidly return them to the same congested state, and probably worse.

Their plan was to attract more people onto public transport by reducing the schedule variance and unpredictability of bus arrival times. After all, who wants to wait at a bus stop if they are unsure of when the next bus will arrive? More people taking the bus would also reduce the number of cars on the road, and indirectly improve traffic flow too!

The storage bit

At first glance, traffic prediction has some obvious similarities to storage capacity planning – measuring utilisation and plotting trends. But I think the more interesting parallel is the competition between high storage utilisation (ie. road congestion) and storage efficiency tools such as thin provisioning, compression and deduplication (ie. methods for improving traffic flow).

Just like traffic planning, storage also suffers from unexpected side effects, such as how increased available capacity (eg. via thin provisioning, or deduplication) is rapidly consumed by newly attracted workloads. The end result is the same: a larger and more difficult problem to manage.

So, what is the analogue for public transport in the storage world? How do we improve capacity management, not by increasing storage density, but by reducing the storage burden?

It’s at this point that the conversation changes to be about data archiving, which serves the dual purpose of managing capacity and “getting cars off the road”.

Of course, I’m not saying that storage efficiency tools are unimportant, just that they are not enough on their own. They provide great value by decreasing (sometimes dramatically) the amount of physical storage that is required.

Back to the buses

The difficult question then is “how to encourage good behaviour (public transport / archiving) when the demand is always for more cars (increased storage capacity, greater performance, lower cost, and faster provisioning)”?

To use another Singaporean traffic analogy, this is already a solved problem: congestion charging is used to toll traffic within the CBD area (ie. Tier-1 storage). This may never be a popular tool, but it is certainly an effective one. Tolls (and archiving) are like broccoli – good for other people 🙂

I suppose that I should highlight that I’m not really advocating for charge-back, but rather for intelligent capacity management.

While it might be difficult to introduce charge-back for an existing storage environment, might it be possible to introduce new capacity, capabilities, and costings which encourage proactive storage management? If it’s easier, cheaper, and faster to catch the bus than to drive a car, then more people will choose the bus willingly.
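As a rough illustration of what tiered costings could look like, here is a minimal Python sketch. The tier names, the per-GB prices, and the `monthly_charge` helper are all hypothetical, made up purely to show the shape of the incentive; they are not taken from any real price list or product.

```python
# Hypothetical price per GB per month for each tier.
# These numbers are illustrative assumptions only, not real pricing.
TIER_PRICE_PER_GB = {
    "tier1": 0.50,    # the "CBD": fast, expensive, tolled
    "tier2": 0.20,    # the suburbs
    "archive": 0.02,  # the bus
}


def monthly_charge(usage_gb):
    """Return the total monthly charge for a mapping of tier name -> GB used."""
    return sum(TIER_PRICE_PER_GB[tier] * gb for tier, gb in usage_gb.items())


# The same 10 TB of data, before and after moving cold data off Tier-1.
before = {"tier1": 10_000, "tier2": 0, "archive": 0}
after = {"tier1": 4_000, "tier2": 2_000, "archive": 4_000}

saving = monthly_charge(before) - monthly_charge(after)
print(f"Before: {monthly_charge(before):.2f}")
print(f"After:  {monthly_charge(after):.2f}")
print(f"Saving: {saving:.2f} per month")
```

With these made-up prices, archiving the cold data roughly halves the bill. The exact figures don't matter; what matters is that the price gap between tiers is wide enough for the "bus" to be the obvious choice.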

The (unfortunately very dated) diagram below shows the impact of congestion charging on Singapore CBD traffic over a 20-year period. I like how similar this looks to the archetypal chart for storage growth (always increasing), except the doubling period is about 18 years, not about 18 months as for storage (YMMV).

[Figure: Effects of ALS. Data is scaled to 100% for base year 1975. Blue: Car Population, Yellow: AM Inbound, Red: PM Inbound]
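To put the two doubling periods side by side, here is a small Python sketch that converts a doubling period into the compound annual growth rate it implies. It assumes smooth exponential growth, which is a simplification, and it takes the 18-year and 18-month figures as the rough estimates they are; the function names are my own.

```python
def annual_growth_rate(doubling_period_years):
    """Compound annual growth rate implied by a doubling period (in years)."""
    return 2 ** (1 / doubling_period_years) - 1


def growth_factor(doubling_period_years, years):
    """How many times larger the quantity is after the given number of years."""
    return 2 ** (years / doubling_period_years)


traffic_doubling = 18.0  # years (rough estimate read off the chart)
storage_doubling = 1.5   # years, ie. 18 months (rule of thumb, YMMV)

print(f"Traffic: {annual_growth_rate(traffic_doubling):.1%} per year")
print(f"Storage: {annual_growth_rate(storage_doubling):.1%} per year")
print(f"Over 18 years: traffic x{growth_factor(traffic_doubling, 18):.0f}, "
      f"storage x{growth_factor(storage_doubling, 18):.0f}")
```

Under those assumptions traffic grows at roughly 4% per year and storage at roughly 59% per year. Over the same 18 years, traffic doubles once while storage doubles twelve times.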

Can a similar model of selective charge-back (or relative “charge-less” for lower storage tiers) work in enterprise storage? It always amazes me that it’s easier to impose tolls on public roads than on commercially managed storage. For many environments which don’t practice storage charge-back, perhaps the answer is simply to make the “public transport” option more appealing?

PS/ A sub-title for this post might be “Is public transport (ie. archiving and/or tape) dead – no way!” 🙂

Categories: Storage
  1. Paul Sorrentino
    April 20, 2011 at 10:11 am

    You would need to include performance SLAs otherwise users of cheaper storage tiers will complain about performance.
