Convert OpenXML (.docx, etc.) in Linux using command line - OpenOffice.org Ninja

Posted by Andrew Z on Monday, January 7, 2008


Microsoft Office 2007 came out about a year ago. Have you started receiving .docx, .xlsx, or .pptx files yet? Whether you use OpenOffice.org or Microsoft Office 2003, you are probably frustrated trying to find some way to convert, import, or otherwise open these documents.

These new document formats are called Office Open XML, OOXML, or OpenXML. They are significantly different from the Microsoft Office 97, 2000, 2003, or XP formats. The new formats are based on XML while the old formats are binary.

If you don't use Linux or you are afraid of the terminal, skip to the very end. Otherwise here is one easy way to convert these documents thanks to Novell. They developed a tool called the OpenOffice OpenXML Translator, and soon it will be fully integrated into the mainstream OpenOffice.org.

Because this procedure is for the command line (and does not directly integrate into the vanilla OpenOffice.org), this procedure is well suited for automated, batch conversions between OpenXML and OpenDocument Format. This procedure should work on all Linux distributions: Ubuntu, Fedora, SUSE, Mandriva, Debian, and so on.

Installation prerequisites

Install the programs rpm2cpio and cpio. If you run a system such as Fedora, run this command:

sudo yum -y install cpio rpm

If you run Ubuntu, run this command:

sudo apt-get install rpm libgif4
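
Before moving on, it may help to confirm that both tools are actually on your PATH. The following is only a minimal sketch: missing_tools is a hypothetical helper name (not part of any package) that prints each tool it cannot find.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed tool that is not on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: missing_tools rpm2cpio cpio
```

If the example call prints nothing, both rpm2cpio and cpio are installed and you can continue.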

Installation procedure

The general idea is the same no matter which Linux distribution you use. You are basically copying one file out of the RPM as if it were a tarball or a zip file. You are not installing the RPM in the traditional sense, so don't worry if you run a non-RPM-based system such as Ubuntu, Debian, or Slackware.

  • Download odf-converter-1.1-7.i586.rpm for i386 systems or odf-converter-1.1-7.x86_64.rpm for x86_64 systems.
  • Open a console.
  • Change directory to your download directory. Depending on your setup, it may be: cd ~/Desktop
  • To unpack the RPM, run this command: rpm2cpio odf-converter*rpm | cpio -ivd
  • To copy the binary, run this command: sudo cp usr/lib/ooo-2.0/program/OdfConverter /usr/bin
  • Optionally you may now delete the opt and usr directories you just unpacked as well as the .rpm file you downloaded. However, you may wish to keep the files under usr if you are interested in documentation and sample OpenXML documents.
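
The steps above can be sketched as a small script. This is an illustration under assumptions, not a tested installer: rpm_for_arch and install_converter are hypothetical helper names, the mapping from machine type to RPM file name simply follows the two downloads listed above, and the path to the binary is the one given in this article (it may differ inside the x86_64 package).

```shell
#!/bin/sh
# Hypothetical helper: map a machine type (as printed by `uname -m`) to the
# RPM file name listed in the article. Unknown types return non-zero.
rpm_for_arch() {
    case "$1" in
        x86_64)              echo "odf-converter-1.1-7.x86_64.rpm" ;;
        i386|i486|i586|i686) echo "odf-converter-1.1-7.i586.rpm" ;;
        *)                   return 1 ;;
    esac
}

# Hypothetical helper: unpack the RPM into the current directory and copy the
# binary into /usr/bin, mirroring the manual steps. The RPM must already be
# downloaded; the copy is skipped if unpacking fails.
install_converter() {
    rpm_file=$1
    rpm2cpio "$rpm_file" | cpio -ivd &&
        sudo cp usr/lib/ooo-2.0/program/OdfConverter /usr/bin
}

# Example (run from the download directory):
#   install_converter "$(rpm_for_arch "$(uname -m)")"
```

Nothing runs until you call install_converter yourself, so you can paste the functions into a shell and inspect what rpm_for_arch picks first.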

Usage

The usage is simple. To convert a .docx file (Word 2007) to a .odt (OpenDocument Format) file, just run:

OdfConverter /i example.docx

You will find the .odt file in the same directory as the .docx. Open it in OpenOffice.org or your favorite office suite.

For more help on arguments, just run OdfConverter without arguments. If you are curious, there are some OpenXML sample documents included in the RPM: check the directory usr/share/doc/packages/odf-converter/.
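
Since the converter works on one file per call, batch conversion is a matter of looping. Here is a minimal sketch, assuming only the /i argument shown above: convert_all_docx is a hypothetical function name, and the CONVERTER variable is an assumption added so the loop can be tried with a stand-in command.

```shell
#!/bin/sh
# Hypothetical helper: convert every .docx file in a directory (default: the
# current one) by calling the converter once per file.
# CONVERTER defaults to OdfConverter; set it to another command to dry-run.
convert_all_docx() {
    dir=${1:-.}
    for f in "$dir"/*.docx; do
        [ -e "$f" ] || continue   # no .docx files: skip the unexpanded pattern
        "${CONVERTER:-OdfConverter}" /i "$f" || echo "failed: $f" >&2
    done
}

# Example: convert_all_docx ~/Documents
# Dry run: CONVERTER=echo convert_all_docx ~/Documents
```

The dry run only prints the arguments each call would receive, which is a cheap way to check the file list before converting anything.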

27 comments:

Redric said...

Thank you so much!

Andrew Z said...

Redric,

You're welcome!

Tsiolkovsky said...

One thing to always keep in mind is that OOXML is a broken format, and users sending files in this format should be warned about this. See more on NoOOXML.

Anonymous said...

helped a lot!

Thanks!!

Scott said...

Trying to run it on CentOS, and I'm getting:

OdfConverter: error while loading shared libraries: libgif.so.4: cannot open shared object file: No such file or directory

Thoughts?

Andrew Z said...

Scott:

Here is the command (which you can enter in a terminal) to find which package you need:
sudo yum whatprovides '*/libgif.so.4'


Here is how to install the necessary package:
sudo yum install giflib


Andrew

jeremo said...

I'm trying to run this in ubuntu, but i get:

OdfConverter: error while loading shared libraries: libtiff.so.3: cannot open shared object file: No such file or directory

how can i resolve this?
many thanks!!

jeremo said...

ok i figured it out.

by running "locate libtiff.so" I found that i have libtiff.so.4

so i made a symbolic link with
"sudo ln -s /usr/lib/libtiff.so.4 /usr/lib/libtiff.so.3"

then a bit of "sudo ldconfig" to update the library cache and bob is now my uncle

i'd love to take credit for that bit of wisdom but i actually found it here:
http://linux.derkeiler.com/Mailing-Lists/Ubuntu/2006-04/msg02897.html

thanks for the guide, it's a real treat

jeremo said...

http://linux.derkeiler.com/Mailing-Lists/Ubuntu/2006-04/msg02897.html

Josiah said...

thanks, jeremo. That fixed my problem!

Anonymous said...

I've just installed it, but I only get

bash: /usr/bin/OdfConverter: cannot execute binary file

any help?

Andrew Z said...

@anonymous: Which Linux distribution do you use? Is it 32-bit or 64-bit? Do you have the 32-bit libraries installed? Have you tried searching for the error?

Charlie said...

thanks to everyone's help (especially jeremo's follow up) I was able to get this running on Ubuntu as well...turns out google-earth had libtiff.so.3 and i just copied it into /usr/lib ...don't get confused when you run OdfConvert and it doesn't work, it's "OdfConverter" :)

Giray said...

thanks for all

it's great to be able to find whatever you want in linux.
Linux forever

etittley said...

To unpackage, I had to filter through bunzip2:

rpm2cpio odf-converter-*.i586.rpm | bunzip2 - | cpio -ivd

Debian 4.0

Anonymous said...

Thanks!

This worked fine on Fedora 8 and default OpenOffice.

I had to 'rpm -i --nodeps odf-convertor...' because it complained about openoffice not being > 2.0 (which it is: 2.3)

Thanks, it was a big help.

Uwe said...

Mathematical equations are not translated.

Hello, I wrote an ODT file containing an enumerated list, tables, colors, and a mathematical formula. I converted it to docx format.

- I could not open that file with the Oxygen Openoffice, which contains a docx import filter: files corrupted.
- I converted the docx file back to odt, but then the math formulas were gone.

can anybody comment on this?

TJ said...

Thanks a lot, this is REALLY helpful. I was using zamzar to convert and that works but is slow with the free account.

One suggestion to make it easier. If you prefer GUI rather than CLI, you can right-click on a docx file and say "open with", then choose custom command. For the custom command type "OdfConverter /i". After that you can go to properties of any docx file, go to the "Open With" tab, and choose OdfConverter as the default program. From now on, when you click to open a docx it will not open it but will instead spit out an odt file in the same directory, which you can open. This works well in Gnome/Nautilus. I am sure the instructions are slightly different in KDE/Konqueror.

BK SimonB said...

No joy with PCLinuxOS 2007 yet. This is similar to Mandriva.

I get
error: Failed dependencies:
OpenOffice_org >= 2.0 is needed by odf-converter-1.1-7.i586
libgif.so.4 is needed by odf-converter-1.1-7.i586

OO is installed, just has a different package name. Same with libgif, since libgif.so.4 is present in the /usr/lib directory.

If I install with nodeps I get
/var/tmp/rpm-tmp.6321: line 22: SuSEconfig: command not found

When it finishes there is no file called OdfConverter installed in /usr/bin.

Andrew Z said...

BK SimonB: Did you install OdfConverter with "rpm -Uvh ...."? You aren't supposed to do that unless you run the Novell version, so please read the instructions again.

dmk said...

In order to get the conversion in the action context menu under KDE/Konqueror you need to do two things:
1. save the following as $HOME/.kde/share/apps/konqueror/servicemenus/convertmsXMLtoodt.desktop

#Word XML>Openoffice.odt
[Desktop Entry]
ServiceTypes=application/mswordXML
Actions=docx2odt

[Desktop Action docx2odt]
Name=Convert XML to Openoffice .odt
Icon=doc
Exec=OdfConverter /i %f


2. create an application/mswordXML mimetype by
Control Center -> KDE Components -> File Associations -> Add... Group = application, Type Name = mswordXML, Filename Patterns = *.docx

bhups said...

Is it necessary to have OpenOffice installed on the system before using OdfConverter?
I am able to convert pptx into odp's but the output is not good enough, i.e. in lots of slides the TEXT in the slide is getting clipped.
So is OdfConverter somehow dependent on OpenOffice, i.e. some schema files it looks for in the OO installation directory?
Thanks!

Andrew Z said...

dmk: I was thinking of something like that, and you encouraged me finally to do it. Check out odf-converter-integrator which should be usable in the next few days.

bhups: odf-converter is completely independent of OpenOffice.org, and you can use odf-converter even if OpenOffice.org was never installed. Check odf-converter on how to report a bug.

Anonymous said...

SWEET! Thanks! This worked like a champ :)

Jan said...

A cool thing this converter :) . BTW, in your guide you're mentioning the "OdfConvert" command instead of "OdfConverter". I spent a few minutes figuring out what the problem with the command was :).

Tried with SuSe 10.1

Andrew Z said...

Jan: Thanks. I fixed it and made a few updates.