Converting text files between Windows, macOS and Linux

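Each operating system uses different special characters to mark the end of a line (EOL) or the end of a file (EOF) in plain text files: Windows ends each line with a carriage return followed by a line feed (CR LF), Unix systems such as Linux and macOS use a single LF, and classic Mac OS used a single CR. If you transfer files by FTP, your client may convert them automatically based on the file extension, or you can force the conversion by issuing the ascii command before the transfer. Otherwise, you may have to use one of the methods discussed below.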
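To find out which convention a file currently uses before converting it, a few lines of Python can count the line-ending byte sequences. This is only a minimal sketch: the function name count_line_endings and the file name in the usage comment are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs, bare LFs and bare CRs in raw file contents."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,                     # CR LF pairs (Windows)
        "lf": data.count(b"\n") - crlf,   # LF not preceded by CR (Unix)
        "cr": data.count(b"\r") - crlf,   # CR not followed by LF (classic Mac OS)
    }


# Hypothetical usage; "input.txt" is a placeholder name:
# with open("input.txt", "rb") as fh:
#     print(count_line_endings(fh.read()))
```

Opening the file in binary mode ("rb") matters here, because text mode would translate the line endings before they could be counted.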

Note that this does not apply to word processor documents or other data files, as these are almost always saved in a binary format that does not need this kind of conversion. In fact, applying the conversion to a binary file, or to anything other than a plain text file, will most likely damage it and make it unusable.
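If you are unsure whether a file is plain text, a rough check before converting may save you from damaging it. The sketch below uses a common heuristic, not a guarantee: it treats a file as binary when a NUL byte appears near the start. The name looks_like_text and the sample size are assumptions chosen for illustration, and UTF-16 text files will be reported as binary because they contain NUL bytes.

```python
def looks_like_text(data: bytes, sample_size: int = 8192) -> bool:
    """Heuristic: assume text if the leading sample contains no NUL byte."""
    return b"\x00" not in data[:sample_size]
```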

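Perhaps the most powerful method is to download and install dos2unix and unix2dos (link below), then run <command> -n input.txt output.txt, substituting the command and file names as appropriate. The -n option writes the result to a new file; without it, both tools convert every file named on the command line in place.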

Windows

If you are on Windows and do not want to install other programs, you have several options.

1. A simple method that needs only the command prompt is the “more” command (run more /? to learn more about it).

TYPE input_filename | MORE /P > output_filename

This command takes quite a while on a large file, so be patient.

The following solutions are slightly more complicated, but may be useful in certain special situations.

2. You may also use vbscript, as follows:

Do Until WScript.StdIn.AtEndOfStream
  WScript.StdOut.WriteLine WScript.StdIn.ReadLine
Loop

Put the above lines in a file unix2dos.vbs and run it like this:

cscript //NoLogo unix2dos.vbs <C:\path\to\input.txt >C:\path\to\output.txt

or like this:

type C:\path\to\input.txt | cscript //NoLogo unix2dos.vbs >C:\path\to\output.txt

3. PowerShell:

(Get-Content "C:\path\to\input.txt") -replace "`n", "`r`n" |
  Set-Content "C:\path\to\output.txt"

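Because Get-Content already splits the file into lines and Set-Content writes each line with the platform line ending (CR LF on Windows), this can be simplified to: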

(Get-Content "C:\path\to\input.txt") | Set-Content "C:\path\to\output.txt"

4. DOS CLI Ninja:

(for /f "delims=" %i in (file.unix) do @echo %i)>file.dos

To convert several files from a batch file (where loop variables are written with a doubled %%), try

for %%z in (*.txt) do (for /f "delims=" %%i in (%%z) do @echo %%i)>%%z.tmp

You may check the result in hex with xxd -g1 file.ext, if xxd is installed (it ships with Vim).
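If no hex viewer is at hand, a short Python function can show the same information. This is a rough sketch rather than a replacement for a real hex viewer; hex_dump is a hypothetical helper and its output layout only loosely imitates that of a typical hex dump tool.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return an offset / hex / printable-text dump of the given bytes."""
    lines = []
    pad = width * 3 - 1  # width of a full hex column, used for alignment
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part.ljust(pad)}  {text_part}")
    return "\n".join(lines)


# Hypothetical usage; "file.ext" is a placeholder name:
# with open("file.ext", "rb") as fh:
#     print(hex_dump(fh.read(64)))
```

In the dump, a Windows line ending shows up as 0d 0a and a Unix one as a lone 0a.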

Linux

Although we are providing this ostensibly for Linux, it will most likely work under any Unix system.

1. This tr command removes all carriage returns and Ctrl+Z (DOS end-of-file) characters from a Windows file:

tr -d '\15\32' < winfile.txt > unixfile.txt
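For comparison, the same byte-deleting approach can be sketched in Python. The function name strip_cr_and_ctrl_z is invented for this example. Like the tr command, it deletes every CR and Ctrl+Z byte wherever it occurs, not only at the end of a line or of the file.

```python
def strip_cr_and_ctrl_z(data: bytes) -> bytes:
    """Delete all CR (0x0D, octal 15) and Ctrl+Z (0x1A, octal 32) bytes."""
    return data.translate(None, b"\r\x1a")


# Hypothetical usage with placeholder file names:
# with open("winfile.txt", "rb") as src, open("unixfile.txt", "wb") as dst:
#     dst.write(strip_cr_and_ctrl_z(src.read()))
```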

2. Unlike tr, awk may be used to convert in both directions:

awk '{ sub("\r$", ""); print }' winfile.txt > unixfile.txt
awk 'sub("$", "\r")' unixfile.txt > winfile.txt

3. Perl also allows bidirectional conversions:

perl -pe 's/\r$//' < winfile.txt > unixfile.txt
perl -pe 's/\n/\r\n/' < unixfile.txt > winfile.txt
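One thing to watch with the Unix-to-Windows commands above is that they add a CR to every line, so running them on a file that already has CR LF endings yields CR CR LF. The Python sketch below shows one way to make both directions safe to repeat; to_unix and to_dos are names chosen for this example, and the functions work on the raw bytes of a file.

```python
import re


def to_unix(data: bytes) -> bytes:
    """Turn every CR LF pair into a single LF."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data: bytes) -> bytes:
    """Insert a CR before each LF that is not already preceded by one."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)
```

Because to_dos skips line feeds that already follow a carriage return, applying it twice gives the same result as applying it once.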

4. You may also remove the carriage return character in vi:

 :1,$s/^M//g

Enter ^M by pressing Ctrl+V followed by Enter.

5. In vim, use :set ff=unix to convert to Unix or :set ff=dos to convert to Windows, then save the file with :w.

6. Install and use tofrodos.

7. There’s also sed:

sed s/$/$'\r'/ < input.txt > output.txt

Here $'\r' is shell quoting (supported by bash, ksh and zsh) that expands to a carriage return.

8. Python, as the write method of a file-like class (parent is its base class):

def write(self, s):
    parent.write(self, s.replace('\n', '\r\n'))
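The one-line fragment above only makes sense inside a class, so here is a minimal, self-contained sketch of the same idea. It wraps an existing stream instead of subclassing it, which is a design choice of this example and not necessarily what the original snippet did; CRLFWriter is a hypothetical name. It assumes the text being written uses LF only, since existing CR LF pairs would end up as CR CR LF.

```python
import io


class CRLFWriter:
    """Wrap a text stream and turn every LF written to it into CR LF."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, s):
        # Plain '\n' here: a raw string r'\n' would match a backslash
        # followed by the letter n instead of a line feed.
        return self._stream.write(s.replace("\n", "\r\n"))


# Demonstration on an in-memory stream
buffer = io.StringIO()
CRLFWriter(buffer).write("first\nsecond\n")
converted = buffer.getvalue()
```

When wrapping a real file, open it with newline="" so that Python does not translate the line endings a second time on Windows.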

9. sed (via Cristian Ciupitu / ghostdog74)

# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//'               # assumes that all lines end with CR/LF
sed 's/^M$//'              # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//'            # works on ssed, gsed 3.02.80 or higher

# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/"            # command line under ksh
sed 's/$'"/`echo \\\r`/"             # command line under bash
sed "s/$/`echo \\\r`/"               # command line under zsh
sed 's/$/\r/'                        # gsed 3.02.80 or higher

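Use sed -i for in-place conversion, e.g. sed -i 's/..../' file. This form works with GNU sed; BSD and macOS sed require a backup suffix argument after -i, which may be empty: sed -i '' 's/..../' file.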
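Where sed -i is not available, in-place conversion can also be sketched in Python. The function below is an illustration under a few assumptions: the name convert_in_place and its to_dos parameter are invented here, the whole file is read into memory, and the result is written to a temporary file in the same directory before replacing the original, so the original file's permissions are not carried over.

```python
import os
import tempfile


def convert_in_place(path, to_dos=False):
    """Rewrite a text file with LF endings, or CR LF if to_dos is true."""
    with open(path, "rb") as fh:
        data = fh.read()

    data = data.replace(b"\r\n", b"\n")      # normalise to LF first
    if to_dos:
        data = data.replace(b"\n", b"\r\n")  # then expand every LF to CR LF

    # Write next to the original so the final rename stays on one filesystem
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)           # swap in the converted file
    except BaseException:
        os.unlink(tmp_path)                  # clean up the temporary file
        raise
```

Normalising to LF first means that converting an already converted file leaves it unchanged.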

Options. What we’re all about.

Sources / More info: waterlan, so-win
