Why are text file line breaks wrong after the file is transferred or edited?
After a file is transferred or edited, its line breaks may be wrong, which may manifest as:
- Line breaks are lost. The whole file content appears to be on a single line.
- Line breaks are duplicated. There appears to be an additional empty line between every two lines.
- There’s a strange symbol/character at the end of every line.
This article explains possible causes of the problem and their solutions.
- Text File Formats
- Text/ASCII Transfer Mode
- Known Issues with Transfer Mode
- Debugging Text File Conversion
Text File Formats
Different platforms (operating systems) use different text file formats. The most common are the Unix and Windows formats. The primary difference is the character or sequence of characters used to signify the end of a line. On Unix, it’s the LF character (\n, 0A in hexadecimal, or 10 in decimal). On Windows, it’s a sequence of two characters, CR and LF (\r + \n, 0D + 0A in hexadecimal, or 13 + 10 in decimal).
While many applications and systems nowadays can work with both formats, some require a specific format (notably, Windows Notepad supported only the Windows format until Windows 10 1809). When presented with a file in another format, they fail to display it correctly, as described above.
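To illustrate the difference between the two formats, here is a minimal Python sketch that counts the line-ending sequences in raw file content. The function names `count_line_endings` and `describe_format` are hypothetical helpers introduced for this example only; they are not part of WinSCP or any other tool.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR+LF pairs, lone LF bytes, and lone CR bytes in raw content."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF bytes that are not part of a CR+LF pair
    cr = data.count(b"\r") - crlf   # CR bytes that are not part of a CR+LF pair
    return {"crlf": crlf, "lf": lf, "cr": cr}


def describe_format(data: bytes) -> str:
    """Classify content as 'windows', 'unix', 'mixed', or 'no line endings'."""
    counts = count_line_endings(data)
    if not any(counts.values()):
        return "no line endings"
    if counts["crlf"] and not counts["lf"] and not counts["cr"]:
        return "windows"
    if counts["lf"] and not counts["crlf"] and not counts["cr"]:
        return "unix"
    return "mixed"


print(describe_format(b"One\r\nTwo\r\n"))  # windows
print(describe_format(b"One\nTwo\n"))      # unix
```

Content classified as "mixed" (for example, a CR followed by CR+LF) usually means a conversion was applied to a file that was already in the target format.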
Text/ASCII Transfer Mode
For this reason, file transfer clients and servers support a text/ASCII transfer mode. When transferring a file in this mode, the file gets (ideally) converted from the format native to the source system to the format native to the target system. For example, when uploading a text file in text mode from Windows to a Unix system, the line endings get converted from CR+LF to LF. The opposite of the text/ASCII transfer mode is the binary transfer mode, which transfers the file as is (binary identical).
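Conceptually, the conversion can be sketched as follows. This is only an illustration of the idea, assuming the whole file fits in memory; it is not how WinSCP or any particular server implements text mode, and the names `to_unix` and `to_windows` are hypothetical.

```python
def to_unix(data: bytes) -> bytes:
    """Convert Windows line endings (CR+LF) to Unix line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def to_windows(data: bytes) -> bytes:
    """Convert Unix line endings (LF) to Windows line endings (CR+LF).

    Existing CR+LF pairs are normalized to LF first, so a file that is
    already in Windows format does not end up with CR+CR+LF.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


windows_file = b"One\r\nTwo\r\n"
print(to_unix(windows_file))              # what a text-mode upload to Unix should produce
print(to_windows(to_unix(windows_file)))  # converting back restores the original
```

A binary transfer corresponds to applying neither function: the bytes arrive unchanged.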
WinSCP by default uses the binary transfer mode for all regular file transfers. Learn how to configure it to use the text/ASCII transfer mode. You may also need to configure the correct server-side text file format.
In contrast, WinSCP always uses the text transfer mode when editing a file in the WinSCP internal editor (or in Windows Notepad on Windows versions older than Windows 10 1809). If you want to force WinSCP to use the binary mode when editing files, you have to use an external text editor and make sure WinSCP does not force text mode for edited files. Also make sure your external text editor saves the file in the format you need.
Known Issues with Transfer Mode
- Pure-FTPd FTP server: When downloading a file with Windows line endings (CR+LF) in text/ASCII mode, the server replaces LF with CR+LF, resulting in an incorrect CR+CR+LF. When such a file is opened in the WinSCP internal editor, the editor interprets the sequence as two line endings (CR and CR+LF), resulting in a blank line after every content line. When the file is saved, the internal editor writes two Windows line endings (CR+LF and CR+LF). On upload, they get converted to two LFs. A workaround is to use an external editor and make sure WinSCP does not force text mode for edited files.
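The chain of events described for Pure-FTPd can be reproduced with a short Python sketch. The functions below are hypothetical models written for this illustration: `naive_lf_to_crlf` mimics the described server behavior, and `split_lines_like_editor` assumes an editor that treats CR+LF, a lone CR, and a lone LF each as a line ending. Neither is taken from Pure-FTPd or WinSCP source code.

```python
import re


def naive_lf_to_crlf(data: bytes) -> bytes:
    """Replace every LF with CR+LF, without checking for a preceding CR."""
    return data.replace(b"\n", b"\r\n")


def split_lines_like_editor(data: bytes) -> list:
    """Hypothetical editor model: CR+LF, lone CR, and lone LF each end a line."""
    return re.split(rb"\r\n|\r|\n", data)


original = b"One\r\nTwo\r\n"                 # file on the server, Windows format
downloaded = naive_lf_to_crlf(original)      # every line now ends with CR+CR+LF
lines = split_lines_like_editor(downloaded)  # an empty entry follows each content line
saved = b"\r\n".join(lines)                  # editor writes CR+LF for each line ending
uploaded = saved.replace(b"\r\n", b"\n")     # text-mode upload converts CR+LF to LF

print(downloaded)
print(lines)
print(uploaded)
```

The final content has two LFs after each line, which is the duplicated line break symptom described at the top of this article.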
Debugging Text File Conversion
If enabling (or disabling) the text/ASCII transfer mode does not help with the problem and your transferred/edited file is still interpreted incorrectly by the target system, you need to find out at which step the file got converted incorrectly (or was not converted at all).
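To detect the line endings used by a file on Windows, use the following command in a PowerShell console to display a hex dump of the first 100 bytes of the given file (example.txt). The command as written targets Windows PowerShell; in PowerShell 6 and later, use -AsByteStream instead of -Encoding Byte: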
Get-Content -Encoding Byte -TotalCount 100 example.txt |% {Write-Host ("{0:x2} " -f $_) -NoNewline}; Write-Host
For a file with the following contents in Windows format:
One
Two
it displays:
4f 6e 65 0d 0a 54 77 6f 0d 0a
Note the two 0d 0a sequences (CR + LF), indicating Windows format.
To detect the line endings used by a file on a Unix/Linux system, use the command:
xxd example.txt | head
For the same file as above, just in Unix format, it displays:
0000000: 4f6e 650a 5477 6f0a One.Two.
Note the 0a character (LF), indicating Unix format.
If you do not have shell access to the remote system, download the file using the binary transfer mode and use the PowerShell command on the local binary-identical copy.
Use these techniques to detect what format both the source and destination files have. When editing a file, also detect the format of the local temporary copy of the edited file, as saved by the editor. See preferences for the location of the temporary copies.
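If neither PowerShell nor xxd is available, the same kind of hex dump can be produced with a few lines of Python. This is a minimal sketch; `hex_dump` and `hex_dump_file` are hypothetical helper names, and the 100-byte default simply mirrors the PowerShell command above.

```python
def hex_dump(data: bytes, limit: int = 100) -> str:
    """Format the first `limit` bytes as space-separated two-digit hex values."""
    return " ".join(f"{byte:02x}" for byte in data[:limit])


def hex_dump_file(path: str, limit: int = 100) -> str:
    """Read a file in binary mode, so no line-ending conversion is applied."""
    with open(path, "rb") as handle:
        return hex_dump(handle.read(limit), limit)


print(hex_dump(b"One\r\nTwo\r\n"))  # 4f 6e 65 0d 0a 54 77 6f 0d 0a
print(hex_dump(b"One\nTwo\n"))      # 4f 6e 65 0a 54 77 6f 0a
```

Running `hex_dump_file` on the source file, the destination file, and the editor's temporary copy makes it possible to compare all three at the byte level.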
Requesting Support
If the above does not help you understand the problem and you decide to seek support, include all your findings, including copies of both the source and destination files. When editing a file, also include the local temporary copy as saved by the editor. Ideally, compress (ZIP) the files to prevent your browser from altering the file format when you attach them to the support request.