Bus timetables and storage efficiency
Sometimes storage insight comes from the most unlikely of places…
I recently attended an IBM IT Architect networking event on IBM’s Smarter Planet initiative. One of the talks, by IBM software architect Andy Heys, covered Smarter Transportation. It included an example of IBM’s Traffic Planning Tool and described how IBM had developed a traffic prediction model for the Singapore Land Transport Authority (LTA).
This gave me an unexpected insight into the parallels between Singapore bus timetables and storage thin provisioning. Seriously!
The bus bit
The story goes that, after developing a system to accurately predict traffic flow up to 60 minutes into the future, IBM suggested taking the next step: adjusting traffic signals to reduce congestion. It’s a natural assumption that a system for studying and predicting traffic flow would then be used to improve that same traffic flow…
At this point, says Andy, “we realised that they had more insight into the problem than ourselves”. The LTA was more interested in improving bus arrival time prediction than in directly improving traffic flow. That’s because they understood that any improvement in traffic flow would simply attract more traffic to the roads, which would rapidly return them to the same congested state, and probably worse.
Their plan was to attract more people onto public transport by reducing the schedule variance and unpredictability of bus arrival times. After all, who wants to wait at a bus stop if they are unsure of when the next bus will arrive? More people taking the bus would also reduce the number of cars on the road, and indirectly improve traffic flow too!
The storage bit
At first glance, traffic prediction has some obvious similarities to storage capacity planning – measuring utilisation and plotting trends. But I think the more interesting parallel is the competition between high storage utilisation (ie. road congestion) and storage efficiency tools such as thin provisioning, compression and deduplication (ie. methods for improving traffic flow).
Just like traffic planning, storage also suffers from unexpected side effects: capacity freed up (eg. via thin provisioning or deduplication) is rapidly consumed by newly attracted workloads. The end result is the same: a larger and more difficult problem to manage.
So, what is the analogue for public transport in the storage world? How do we improve capacity management, not by increasing storage density, but by reducing the storage burden?
It’s at this point that the conversation turns to data archiving, which serves the dual purpose of managing capacity and “getting cars off the road”.
Of course, I’m not saying that storage efficiency tools are unimportant, just that they are not enough on their own. They provide great value by decreasing (sometimes dramatically) the amount of physical storage that is required.
Back to the buses
The difficult question then is “how to encourage good behaviour (public transport / archiving) when the demand is always for more cars (increased storage capacity, greater performance, lower cost, and faster provisioning)”?
To use another Singaporean traffic analogy, this is already a solved problem: congestion charging is used to toll traffic within the CBD area (ie. Tier-1 storage). This may never be a popular tool, but it is certainly an effective one. Tolls (and archiving) are like broccoli – good for other people :)
I suppose that I should highlight that I’m not really advocating for charge-back, but rather for intelligent capacity management.
While it might be difficult to introduce charge-back for an existing storage environment, might it be possible to introduce new capacity, capabilities, and costings which encourage proactive storage management? If it’s easier, cheaper, and faster to catch the bus rather than to drive a car, then more people will choose the bus willingly.
The (unfortunately very dated) diagram below shows the impact of congestion charging on Singapore CBD traffic over a 20-year period. I like how similar this looks to the archetypal chart for storage growth (always increasing), except the doubling period is roughly 18 years, not roughly 18 months as for storage (YMMV).
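To put rough numbers on that comparison, here is a minimal Python sketch. It assumes steady compound growth, which real traffic and storage will not follow exactly, and it uses the approximate doubling periods quoted above. The function names are my own, purely for illustration.

```python
def annual_growth_rate(doubling_years: float) -> float:
    """Compound annual growth rate implied by a doubling period (in years)."""
    return 2 ** (1 / doubling_years) - 1


def growth_factor(doubling_years: float, horizon_years: float) -> float:
    """How many times larger the quantity is after horizon_years."""
    return 2 ** (horizon_years / doubling_years)


# Assumed doubling periods, taken from the rough figures in the text:
# ~18 years for CBD traffic, ~18 months (1.5 years) for storage.
for label, doubling in [("CBD traffic", 18.0), ("storage", 1.5)]:
    rate = annual_growth_rate(doubling)
    factor = growth_factor(doubling, 20)
    print(f"{label}: {rate:.1%} per year, x{factor:,.1f} over 20 years")
```

Under those assumptions, traffic grows at a little under 4% a year while storage grows at close to 59% a year. Over the same 20 years that is a bit more than double for traffic, versus roughly ten-thousand-fold for storage. Same shaped curve, very different bus ride.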
Can a similar model of selective charge-back (or relative “charge-less” for lower storage tiers) work in enterprise storage? It always amazes me that it’s easier to impose tolls on public roads than on commercially managed storage. For many environments which don’t practice storage charge-back, perhaps the answer is simply to make the “public transport” option more appealing?
PS/ A sub-title for this post might be “Is public transport (ie. archiving and/or tape) dead – no way!” :)