MS Excel "How to"
Edit UTF-8 encoded *csv files with MS Excel and PSPad
Since UTF-8 has a key role as a global character encoding in the Internet (free according to Wikipedia), will the data from our system, when exporting to a CSV file, correspondingly likewise be UTF-8 encoded. To open such files with the appropriate character set, must be proceeded as follows:
Excel 2003
Menu Selection in Excel
Select on the menu "Data -> Import External Data" the menu item "Import Data".
Select Data Source
It opens the dialog "Select Data Source". Select the file to open and click the "Open" button.
Text Import Wizard - Step 1 of 3
Next set the file origin (character encoding) to UTF-8 (almost at the bottom of the list) and proceed with the button 'Continue'.
Text Import Wizard - Step 2 of 3
In this dialog, set the delimiter to a semicolon instead of tab stop and proceed again with the button 'Continue'.
Text Import Wizard - Step 3 of 3
To import all columns correctly and to prevent automated reformatting by Excel, the entire data area shall be formatted as text: To do this, select the first column, press and hold the Shift key now pressed until you have added the remaining columns to the selection. As a result should the entire data area - as seen in the image - now be marked from left to right and appropriately highlighted black. Afterwards you choose as the data format of the columns the 'Text' option and quit the wizard with the 'Finish' button.
Import Data
In the last step, there is the option of selecting where the data should be inserted. It is advisable here, to select a new worksheet. With the 'OK' button you terminate the procedure and the data appears UTF-8 encoded in Microsoft Excel.
Allow Calculations
The data will now be displayed as text - at first no calculations can be done. Now replace all Dots (.) by Comma (,) - or vice versa - with the 'Search and Replace' button. Then select all columns that contain numbers, and format the cells. You do this in the 'Numbers' category where you select 'Standard'.
Save File
- Select "Save As" -> "CSV (comma delimited) * .csv"
- Select directory, assign a filename
- Save
- Answer security questions with "Yes" or "OK"
Convert back in UTF-8 with PSPad
Since MS Excel saves CSV-files only in Windows ANSI format, we need to convert the file back in the UTF-8 format. Here we use the freely available software PSPad. Other text editors should but also be able to accomplish this.
- File -> Open
- Format -> Select UTF-8
- Save
Excel 2007/2010
Menu Selection in Excel
In the 'Retrieve External Data' section, on the tab 'Data', use the option 'From Text'.
Import Text File
It opens the 'Import Text File' dialog. Select the file to open and click the 'Open' button.
Text Import Wizard - Step 1 of 3
Next set the file origin (character encoding) to UTF-8 (almost at the bottom of the list) and proceed with the button 'Continue'.
Text Import Wizard - Step 2 of 3
In this dialog, set the delimiter to a semicolon instead of tab stop and proceed again with the button 'Continue'.
Text Import Wizard - Step 3 of 3
To import all columns correctly and to prevent automated reformatting by Excel, the entire data area shall be formatted as text: To do this, select the first column, press and hold the Shift key now pressed until you have added the remaining columns to the selection. As a result should the entire data area - as seen in the image - now be marked from left to right and appropriately highlighted black. Afterwards you choose as the data format of the columns the 'Text' option and quit the wizard with the 'Finish' button.
Import Data
In the last step, there is the option of selecting where the data should be inserted. It is advisable here, to select a new worksheet. With the 'OK' button you terminate the procedure and the data appears UTF-8 encoded in Microsoft Excel.
Allow Calculations
The data will now be displayed as text - at first no calculations can be done. Now replace all Dots (.) by Comma (,) - or vice versa - with the 'Search and Replace' button. Then select all columns that contain numbers, and format the cells. You do this in the 'Numbers' category where you select 'Standard'.
Save File
- Select "Save As" -> "CSV (comma delimited) * .csv"
- Select directory, assign a filename
- Save
- Answer security questions with "Yes" or "OK"
Convert back in UTF-8 with PSPad
Since MS Excel saves CSV-files only in Windows ANSI format, we need to convert the file back in the UTF-8 format. Here we use the freely available software PSPad. Other text editors should but also be able to accomplish this.
- File -> Open
- Format -> Select UTF-8
- Save