Convert Text to Columns in Calc - OpenOffice.org Ninja

Convert Text to Columns in Calc

Posted by Andrew Z at Saturday, January 26, 2008 | Permalink

OpenOffice.org 2.4.0 introduces text to columns for Calc. Before it was possible to import delimited text (typically comma or tab delimited) into the spreadsheet through a file or a clipboard. Now it is possible to also do it within existing cells. Calc will parse the text and split it apart using whichever delimiter character you choose including tab, comma, space, or semicolon. A delimiter simply means a separator.

How to use

Our example spreadsheet contains city and country names delimited by a comma:

Screenshot: OpenOffice.org 2.4.0: Calc spreadsheet with comma separated values

To remove the comma and put the country names into column B, we proceed as follows:

  1. Make sure there are enough clear cells to accept the new values. Otherwise, the cells will be overwritten. In this case, column B must be empty because the data will grow one column to the right.
  2. Highlight the full range of cells (in this case, A1 through A4).
  3. Click Data > Text to Columns.
    Screenshot: OpenOffice.org 2.4 Calc: Text to Columns menu
  4. Uncheck the box Tab.
  5. Check the box Comma.
    OpenOffice.org Calc Text to Columns dialog
  6. Click the box OK.

You're done, and here are the results:

screenshot: OpenOffice.org 2.4.0 Calc: Finished results of Text to Columns

Column options

To specify the type of data in the column, in the preview area click the top of the column. Then, choose a type from the drop-down list. In many cases, this step is unnecessary.

Column options for text to columns feature

Getting rid of extra spaces

Here are three choices to get rid of extra spaces you may end up with:

  • Manually delete them when there are few.
  • Using regular expressions.
  • In the Text to Columns dialog, click Space and Merge Delimiters. This does not work if your data have embedded spaces.
  • Use the =TRIM() spreadsheet function. In the example above after splitting the cells, you could do this:
    1. Type =TRIM(B1) in cell C1.
    2. Select cell C1 (by clicking on it).
    3. There is a square handle protruding from the bottom-right-hand corner of the cell. Drag it down to cell C4. This copies the formula down.
    4. Copy cells C1 to C4 to the clipboard.
    5. Put the cursor in C1 (by clicking on it).
    6. Click Edit > Paste Special.
    7. Uncheck Paste All and Formulas.
    8. Click the OK button.
    9. Delete column B.

Applications

You can use this feature for a variety of use cases including:

  • Splitting city and states such as Los Angeles, California
  • Splitting last and first names such as Smith, John (using a comma).
  • Splitting full names such as Mr. John A. Smith (using a space).
  • Splitting fixed-width (non-delimited) data such as 20080126 (in one cell) to 2008 01 26 (across three cells).
  • Cleaning up .csv or .tsv files embedded in a spreadsheet.
  • Copying certain tables from a PDF (which may be space delimited).

Alternatives

There are other ways to convert text to columns:

  1. Put the data in any other program (such as GEdit or Windows Notepad). Then, copy it to the clipboard, and then paste it in OpenOffice.org Calc. You will see the same Text Import dialog when pasting from any external program including Acrobat Reader and Firefox.
  2. Save the data to a file with the extension .csv (even if it is delimited by a character other than commas). Then, when you open it in OpenOffice.org, you'll see the same Text Import dialog.
  3. Use Writer's text to table feature.
    1. Copy the data to the clipboard.
    2. Open a Writer document.
    3. Click Edit > Paste Special.
    4. Choose Unformatted Text.
    5. Click the OK button.
    6. Highlight the text.
    7. Click Table > Convert > Text to Table.
    8. Choose the necessary options.
    9. Click the OK button.
    10. Copy the table.
    11. Paste into Calc.
  4. Before OpenOffice.org version 2.4 was released, there were macros written for converting Text to Columns. OOo version 2.4 makes these macros obsolete.

Competition and retraining

Microsoft Excel already has a very similar Text to Columns feature. Both Excel 2003 and Calc have the feature in the same place. In Excel 2007, the feature is roughly in the same place. Excel's dialog is a wizard style, so it requires a few more clicks to accomplish the same task. Excel has a few more options, which most people may not need, such as a choice of destination.

Related articles

17 comments:

Anonymous said...

Getting rid of extra spaces is actually easier than you've noted. When I recieve (as excel or csv files) reports that have fields filled with blank spaces I clear them using Edit > Replace. Under Advanced options, there's an option to use Regular Expressions. Using "\ " (no quotes)as the search criteria and replacing with nothing will remove all spaces from each cell. using "\ \ " will remove only instances in which 2 spaces appear together - thus preserving single spaces (in full name fields, for example).

Andrew Z said...

Anonymous,

Perfect suggestion about the regex. I linked your comment from the body of the article. Thanks!


Andrew

Anonymous said...

Hello,
I would like to also define the cell format as a "number" in "Column options"

Anonymous said...

Is it also possible to split cells using regular expressions (say, splitting an address line into number, street name, street type, city and state)?

Andrew Z said...

Anonymous March 29, 2008 5:59 PM: Indirectly, yes. Use a regex to insert a unique delimiter such as a semicolon. Then, use this Text to Columns feature to split the delimiter into cells.

Anonymous said...

I just couldn't find that "Text To Columns..." menu entry after I updated to OOo 2.4 (Win platform).
Seems OOo does not append new menu entries if you have customized your menu so you have to either do it manually or reset your customizations by either using the UI or deleting the %APPDATA%\"OpenOffice.org2" directory (Win platform) before starting Calc.

--Henrik

Andrew Z said...

Henrik: If you want to save your OpenOffice.org profile (most of it, anyway), rename its directory, create a new one by starting OpenOffice.org, restore the old profile, and then copy (user configuration base path)\user\config\soffice.cfg\modules\swriter\menubar\menubar.xml which just has the menu items.

Anonymous said...

Andrew:
Funny, I actually did almost exactly that except I tried to trigger the new configuration to create those xml config files (that does not get created unless you alter the menubar/accelerator/toolbar) so I could compare the diff. with my old ones and then add the diff. to the new ones.
All this, just to convince my self I wouldn't miss out on some new fancy functions in 2.4 :)

It would be nice if the user customization was designed so that your settings just saved the appended customization in the xml files and not the whole settings.
Would make it much easier to migrate to a new version.
Wonder if there are any feature request on the issue tracker in that direction ? A quick search on the words "customize" and "soffice.cfg" at http://qa.openoffice.org/issues/query.cgi
with the options FEATURE and ENHANCMENT sat didn't give me any result though.
--Henrik

Mike Ritter said...

I want to split my cell at the first space (file_name my file description). Is there a formula I can plug in?

Mike Ritter said...

I received an answer in the forum at http://user.services.openoffice.org/en/forum/viewtopic.php?f=9&t=6174. Thanks all!

Tony Bullard said...

I'd love to know how to do the opposite. I'd like to combine two columns into one.

Anonymous said...

Thanks for this tutorial, it provided exactly the information we were looking for. Your work is much appreciated!!

George Johnson said...

If you want to merge two columns ( or more ) of text, in OO or Excel for that matter, use the ampersand in a cell formula.

Say you have Firstname in col A and Surname in col B, col C formula would be : =A1&" "&B1

Simple!

Anonymous said...

fuggypiggy, helped my issues, i did not read whole document :D

anita said...

I am trying to figure out how to convert a colum containing both number and text. For ex: 44example. I want only the text and it is ok if i can delete the number. How do i work on calc in openoffice. Pls help

Andrew Z said...

Anita: There are a few ways. The most simple is to highlight the area and perform a Find and Replace on the highlighted selection. Replace all numbers with blanks. You may do each number separately, or if you want an advanced method, use a regular expression in the search field.

chrisRoald said...

Hi. Is there a OOCalc function that will perform text to columns-like parsing so that I can combine the Match function and get an index?