OpenOffice.org obeys Moore's Law? - OpenOffice.org Ninja

Posted by Andrew Z at Tuesday, May 13, 2008 | Permalink

Kryder's Law, a variation of Moore's Law, describes the trend "magnetic disk areal storage density doubles annually" [1]. In other words, you don't want to know how much I paid for a 40MB Seagate MFM drive in 1989, but today 1000GB drives rule the day for much less money. This increase in capacity follows a predictable trend.

In contrast to Kryder's Law and Moore's Law (which roughly describes computers becoming predictably faster), Wirth's Law states that software becomes larger, more complex, and slower [2]: in the end, the gain on the hardware side is washed out by the loss on the software side.

Let's compare these laws against OpenOffice.org to see which one wins. We'll measure the installed disk usage of OpenOffice.org for Linux in English as built by Sun Microsystems. The size of the OpenOffice.org installation over time fits a linear equation with R² = 0.858 and an exponential curve with R² = 0.876: that means its growth is predictable, like Kryder's Law.
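To make the linear-versus-exponential comparison concrete, here is a minimal sketch of how two such fits and their R² values could be computed. It is not the method used for the chart, which the article does not describe. The function names are hypothetical, the exponential curve is fitted as a straight line through the logarithm of the size, and the series is a synthetic placeholder rather than the measured installation sizes.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def compare_fits(years, sizes_mb):
    """Return (R^2 of the linear fit, R^2 of the exponential fit), both in MB."""
    a, b = linear_fit(years, sizes_mb)
    linear_pred = [a + b * t for t in years]
    # Exponential model size = c * exp(k * t), fitted as a line through log(size).
    log_c, k = linear_fit(years, [math.log(s) for s in sizes_mb])
    exp_pred = [math.exp(log_c + k * t) for t in years]
    return r_squared(sizes_mb, linear_pred), r_squared(sizes_mb, exp_pred)

# Synthetic placeholder series (NOT the article's data): steady 15% yearly growth.
years = [0, 1, 2, 3, 4, 5, 6]
sizes_mb = [100 * 1.15 ** t for t in years]
r2_linear, r2_exponential = compare_fits(years, sizes_mb)
print(f"linear R^2 = {r2_linear:.3f}, exponential R^2 = {r2_exponential:.3f}")
```

Because the placeholder series is exactly exponential, the exponential fit should score essentially 1 while the linear fit scores slightly lower. Real data with values as close as 0.858 and 0.876 does not strongly favor either shape.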

Growth in the 1.x version series was slow, but the 2.x versions made up for it. The chart looks odd around versions 1.1.5 and 2.0.0 because they were released 36 days apart. The 3.0.0 beta currently also has an oddly sharp curve that will look more natural when its data point moves over to September 2008.

Size of OpenOffice.org as installed on the disk (English version for Linux) over time

OpenOffice.org's disk storage usage comprises several parts. The vast majority of the space is in these subdirectories:

Subdirectory name   Description
help                Documentation
preset              Configuration
program             OpenOffice.org executable code including a functional copy of Python
share               Configuration, dictionaries, icons, templates, fonts, XSLT filters, etc.
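For readers who want to reproduce a per-directory breakdown like this one, the following is a rough sketch, not the script used for this article. It sums file sizes under each of the four subdirectories. The function names are hypothetical, the demo runs against a throwaway directory with placeholder file names, and it counts apparent file sizes rather than allocated disk blocks, so results can differ somewhat from what a tool such as du reports.

```python
import os
import tempfile

def directory_size_bytes(path):
    """Sum the sizes of all files under path, skipping symbolic links."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total

def subdirectory_sizes(install_root, names=("help", "preset", "program", "share")):
    """Map each subdirectory name to its size in bytes (0 if it is missing)."""
    sizes = {}
    for name in names:
        sub = os.path.join(install_root, name)
        sizes[name] = directory_size_bytes(sub) if os.path.isdir(sub) else 0
    return sizes

# Demo on a temporary tree; the file names below are placeholders only.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "share", "dict"))
    os.makedirs(os.path.join(demo_root, "program"))
    with open(os.path.join(demo_root, "share", "dict", "en.dic"), "wb") as f:
        f.write(b"x" * 300)
    with open(os.path.join(demo_root, "program", "soffice.bin"), "wb") as f:
        f.write(b"x" * 100)
    demo_sizes = subdirectory_sizes(demo_root)

for name, size in demo_sizes.items():
    print(f"{name:8s} {size:6d} bytes")
```

To measure a real installation, pass its root directory to subdirectory_sizes instead of the temporary one.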

The proportional disk usage of these directories has changed over time. The fastest-growing directory, share, has ballooned 513% from 23MB in 2002 to 141MB in 2008. In version 2.4.0, the many preinstalled dictionaries consume 83MB (59%) of the share directory. Meanwhile, the program directory has slowly increased by 52% over the same period.
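The percentages above follow directly from the quoted megabyte figures. A quick check, using only numbers stated in this article (the variable and function names are just for illustration):

```python
def percent_increase(old, new):
    """Growth from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100

share_2002_mb = 23    # share directory in 2002
share_2008_mb = 141   # share directory in 2008
dictionaries_mb = 83  # preinstalled dictionaries in version 2.4.0

share_growth = percent_increase(share_2002_mb, share_2008_mb)
dictionary_fraction = dictionaries_mb / share_2008_mb * 100

print(f"share grew {share_growth:.0f}%")                        # share grew 513%
print(f"dictionaries are {dictionary_fraction:.0f}% of share")  # dictionaries are 59% of share
```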

One component of computing performance that has not improved as fast as others is hard disk seek time, so it's an overall win to store large indexes on disk to speed access to large files [3]. OpenOffice.org uses this trick: indexes consume 7% of the dictionary directory.

OpenOffice.org installation disk size proportions over time

Back to the Laws. Has the OpenOffice.org installation size indeed followed the general trend of growth in PC disk capacity? This chart, which compares hard drive capacity to OpenOffice.org installation size, indicates the answer is no. In 2002, OpenOffice.org version 1.0.1 consumed 0.45% of a then-common 40GB drive. In 2008, version 2.4.0 consumes 0.15% of a 250GB drive (common on new systems today) and even less of the monster 1TB drives available. OpenOffice.org has grown more slowly than Kryder's Law would predict.
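As a back-of-the-envelope illustration, the two percentages can be turned into approximate installation sizes and average yearly growth factors. This sketch assumes decimal units (1GB = 1000MB) and a six-year span from 2002 to 2008. The resulting sizes are implied by the quoted percentages, not taken from the measured data, and the function names are hypothetical.

```python
def implied_size_mb(drive_gb, percent_of_drive):
    """Installation size implied by its share of a drive (assumes 1GB = 1000MB)."""
    return drive_gb * 1000 * percent_of_drive / 100

def annual_growth_factor(start, end, years):
    """Average yearly multiplier that turns start into end over the given years."""
    return (end / start) ** (1 / years)

size_2002_mb = implied_size_mb(40, 0.45)   # about 180MB on a 40GB drive
size_2008_mb = implied_size_mb(250, 0.15)  # about 375MB on a 250GB drive

drive_growth = annual_growth_factor(40, 250, 6)
install_growth = annual_growth_factor(size_2002_mb, size_2008_mb, 6)

print(f"implied sizes: {size_2002_mb:.0f}MB (2002), {size_2008_mb:.0f}MB (2008)")
print(f"typical drive capacity grew about {(drive_growth - 1) * 100:.0f}% per year")
print(f"installation size grew about {(install_growth - 1) * 100:.0f}% per year")
```

Under these assumptions the common drive size grew several times faster per year than the installation, which is why the installation's share of the disk shrank even as its absolute size roughly doubled.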

OpenOffice.org vs. Kryder's Law

Conclusion

Though slower than Kryder's Law, OpenOffice.org's growth is still fairly predictable. The increase is positive and means a wider variety of features, more comprehensive documentation, and otherwise more of what customers want. Even so, today's larger OpenOffice.org installation occupies a lower percentage of a current hard disk drive than the 2002 version did of a drive of its day.

OpenOffice.org performance

This article begins a series on OpenOffice.org performance. Subscribe so you don't miss anything. :)

20 comments:

Anonymous said...

Would you be able to compare this to MS-Office?

Anonymous said...

Yeah, I would like to see a comparison to MS office too.

Andrew Z said...

Sander Marechal and anonymous: Good idea. I would like to include Microsoft Office. It is difficult because of the many editions (Basic, Standard, Home and Student, Professional, Professional Plus, etc.) and the fact that old versions of MSO are hard to find. Also, I assume MSO is more likely to install itself into various directories (such as c:\windows\), so it is less clear exactly how much it consumes. I'll keep this in mind for later and see what I can find...

intangible said...

If you're going to do MS Office, I'd suggest using VMWare or some other virtualization program so you can revert back to a clean copy after installing each version, and you can just count the entire hard drive (excluding swap) usage before and after installation as the proper size.

Anonymous said...

How about comparing memory footprints? How about repeating all the measurements after a feature-rich document is loaded?

Andrew Z said...

Anonymous: This series is less about disk space and more about speed and to some degree memory. For 2-3 months, I have been working on a system to gather and report this data, and I have an exciting article nearly ready.

Andrew Z said...

OK, I posted a comparison against Microsoft Office.

Anonymous said...

... but, with OOo 3, we can choose which programs to install... or not? then, if i only write texts, i can install only 'write', thus occupying less space than in version 2.4... =]

Andrew Z said...

Anonymous: OpenOffice.org versions 2 and 3 will let you choose which modules (Writer, Calc, Base, etc.) to install, but the difference may not be as dramatic as you would think. Disk space is so cheap that few people care to bother slimming OpenOffice.org.

(just) Wally said...

Something that is notably missing is any mention of the addition of little things...like OpenBase? Of course it makes OO grow.

When you talk to MS, they will tell you all about their MS Office "Family," which includes Access, Outlook, MS Publisher, Visio, and on and on...

Taking both points into consideration, I fail to see how the analysis serves as a realistic look at the situation.

Just some thoughts...

Anonymous said...

Like the others, I'd like to see a comparison to MS Office.

However, what I think we're seeing is a fairly tightly written program that isn't expanding that much itself, but has a large expansion as extra fonts and filters are added to it, the little extras to allow people to easily import files from other sources. Every time MS change MS Office, a whole new set of filters need to be incorporated into OO so we can open the new MS documents.

I like the OO percentage graph, but would also like to see it on an actual MB growth chart to see how the size of the help and the program has grown in reality.

Andrew Z said...

Anonymous: Please see the next article OpenOffice.org vs. Microsoft Office vs. Moore's Law for a comparison to Microsoft Office, and I uploaded the data tables and the chart you mentioned. There you can see the little details.

justwally: On Linux OOo 2.4.0, OpenOffice.org Base is just 8,930,631 bytes. Writer is even smaller. Most of the bulk is in core packages. All these packages (Base, Writer, Calc, Math, etc.) are included in all the charts in these two articles because they are installed by default.

(just) Wally said...

Andrew, it appears I need to read your posting a few more times then. Somehow I am missing something. :-)

Andrew Z said...

justwally: Maybe I misunderstood? Yes, OpenOffice.org Base did increase the size of OpenOffice.org going from 1.1.5 to 2.0.0. However, it's not feasible to measure the components of OpenOffice.org (Writer, Calc, Base, etc.) independently because OpenOffice.org is more integrated than Microsoft Office. OpenOffice.org has more common code, so it is not possible to claim that OpenOffice.org Base consumes X bytes for a comparison to Microsoft Access which consumes Y bytes.

Anonymous said...

You guys missed the point! Who cares if it uses less % of a 250 gb drive? This doesn't really help people who still use 40-80 gb drives. And I'm pretty sure that's still fairly high. (Take my job for example, we're locked in for another 2 years with celerons with 512mb, 80gb hard drives, and win XP.)

The hardware of the REGULAR user (ie., the majority of people) doesn't keep up with any of these laws.

This "research" is useless until it factors in the performance experience on a regular computer.

Andrew Z said...

Anonymous: I sympathize with your concern: my computer at home is several years old. However, today's OpenOffice.org 2.4.0 requires about 1% of a 40GB HDD. For a modern, major application, is 1% anything to be concerned about? I find that many users don't fill up their HDDs. Also, you are still free to use the older versions of OpenOffice.org.

Anonymous said...

@andrew z@ - the point is that this is just "spin" -- reality: OO is getting bloated, spinster: "No, see my pretty data" Compare OO to kde, for example. Believe it or not, kde4 uses less memory than kde3.

Still, this is an excellent article, can you post the data?

Andrew Z said...

Anonymous: I disagree. While there is room for improvement, OpenOffice.org overall has done a good job fighting bloat in terms of disk usage and speed.

The disk usage data is available here: Size of OpenOffice.org installed.

(just) Wally said...

Andrew,

I'm thinking that what I was missing is contained/will be contained in additional blog entries here. It appears that this was the beginning of a series, and that is what left me scratching my head.

Am I reading that correctly?

Andrew Z said...

justwally,

Yes, this is the first in a series of what may be about 10 performance articles. I recently posted "Is OpenOffice.org Getting Faster." Feel free to ask any questions anytime.

Andrew