
Convert File Encoding with iconv


Working with data in text files sometimes brings headaches with file encodings. iconv is a tool that helps here: it converts text from one character encoding to another.

Check Contents

Examine the file in binary mode with the editor vim.

vim -b input.csv
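
If you prefer to stay out of an editor, a hex dump shows the same bytes. The following is a minimal sketch, assuming a POSIX shell with od available; the sample line and the workdir variable are made up for illustration.

```shell
# Hypothetical sample: one CSV line in ISO-8859-1 with a Windows line ending.
# \344 is the octal escape for byte 0xE4 ("ä" in ISO-8859-1).
workdir=$(mktemp -d)
printf '"R\344fis, Stationsstrasse",Buchs (SG)\r\n' > "$workdir/input.csv"

# Dump every byte as two hex digits; -An suppresses the offset column.
hexdump=$(od -An -tx1 "$workdir/input.csv")
printf '%s\n' "$hexdump"
```

In the dump, e4 is the umlaut byte and 0d is the carriage return that vim displays as ^M.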

If you see raw bytes in place of umlauts (such as <e4> for ä) and Windows line endings shown as ^M, like this:

"R<e4>fis, Stationsstrasse",Buchs (SG)^M

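Convert the line endings to Unix/Linux format with dos2unix, a text file format converter from DOS/Mac to Unix. It only fixes the line endings; the umlaut byte <e4> is an encoding issue that iconv handles.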

dos2unix input.csv
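
If dos2unix is not installed, tr can serve as a rough substitute. This is only a sketch on made-up sample data, and it is blunter than dos2unix: it deletes every carriage return in the file, not just those that precede a line feed.

```shell
# Hypothetical sample file with Windows (CRLF) line endings.
workdir=$(mktemp -d)
printf 'a,b\r\nc,d\r\n' > "$workdir/crlf.csv"

# Delete every carriage return (\r); the line feeds stay in place.
tr -d '\r' < "$workdir/crlf.csv" > "$workdir/lf.csv"

# Count the carriage returns left in each file.
cr_before=$(tr -cd '\r' < "$workdir/crlf.csv" | wc -c)
cr_after=$(tr -cd '\r' < "$workdir/lf.csv" | wc -c)
echo "carriage returns before: $cr_before, after: $cr_after"
```

Unlike dos2unix, this writes to a new file instead of converting in place, so the original stays available for comparison.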

Character Sets

To list all supported character sets:

iconv -l
# iconv --list
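
The list is long, so it helps to filter it. A small sketch, assuming grep is available; the exact layout of the list differs between iconv implementations, so treat the pattern as a starting point.

```shell
# Keep only the lines that mention "8859-1" (case-insensitive).
# The pattern also matches related names such as ISO-8859-10.
matches=$(iconv -l | grep -i '8859-1')
printf '%s\n' "$matches"
```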

Convert to Unicode

Convert the contents of input.csv from ISO-8859-1 to UTF-8 and write the result to output-UTF_8.csv.

iconv -f ISO-8859-1 -t UTF-8 input.csv -o output-UTF_8.csv
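
To check that a conversion did what you expect, you can compare the bytes before and after. The sketch below uses a made-up one-word sample and writes the output with shell redirection rather than -o, which should behave the same way for this purpose.

```shell
workdir=$(mktemp -d)
# Hypothetical sample: "Räfis" in ISO-8859-1 (\344 is byte 0xE4, "ä").
printf 'R\344fis\n' > "$workdir/input.csv"

# Convert to UTF-8, writing the result via redirection.
iconv -f ISO-8859-1 -t UTF-8 "$workdir/input.csv" > "$workdir/output-UTF_8.csv"

# In UTF-8, "ä" is encoded as the two-byte sequence c3 a4.
utf8_hex=$(od -An -tx1 "$workdir/output-UTF_8.csv")
printf '%s\n' "$utf8_hex"
```

The single byte e4 from the input shows up as c3 a4 in the output, so the converted file is one byte longer.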

Another example converts from ISO-8859-1 to UTF-16:

iconv -f ISO-8859-1 -t UTF-16 input.csv -o output-UTF_16.csv
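
A round trip is a simple way to sanity-check a UTF-16 conversion. The sketch below again uses a made-up sample. Byte order, and whether a byte order mark is written, depends on the iconv implementation, so the dump may look different on your system.

```shell
workdir=$(mktemp -d)
# Hypothetical sample: "Räfis" in ISO-8859-1 (\344 is byte 0xE4, "ä").
printf 'R\344fis\n' > "$workdir/input.csv"

# Convert to UTF-16 and look at the resulting bytes.
iconv -f ISO-8859-1 -t UTF-16 "$workdir/input.csv" > "$workdir/output-UTF_16.csv"
od -An -tx1 "$workdir/output-UTF_16.csv"

# Convert back and compare with the original file byte for byte.
iconv -f UTF-16 -t ISO-8859-1 "$workdir/output-UTF_16.csv" > "$workdir/roundtrip.csv"
cmp "$workdir/input.csv" "$workdir/roundtrip.csv" && echo "round trip OK"
```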