Copy a Long Line from a Text File and Save It to a New File
This is a follow-up to my recent post on how to store the nth line of a text file in a variable. The solution given won’t work for extremely long lines because the line requested by the user is stored in a variable and variables in Batch can only hold up to 8191 characters.
If you need to select an extremely long line from a file and save it to a new file, there is a workaround. But it ain’t pretty, or efficient. It involves more, findstr, and a whole lot of temporary files. 😈
In this example, we’re looking for the 100th line from longlines.txt. If the 100th line happens to be the last line of the file, all we have to do is:
more +99 longlines.txt > line100.txt
And we’re done! Note that more starts counting lines from zero, so +99 skips the first 99 lines and output begins at line 100. However, the last line is a special case. The rest of the time, we have to jump through the following hoops…
First, we copy all but the first 99 lines of longlines.txt to a new file called longlines-1.txt using this command:
more +99 longlines.txt > longlines-1.txt
Note that if the last line in the file doesn’t end with a newline (CR+LF), more will append one (whether we want it to or not).
Next, we add numbers to the start of each line with the help of findstr:
findstr /n $ longlines-1.txt > longlines-2.txt
$ is a special character used by findstr to match the end of a line. A loose translation of the command above would be: “Add a consecutive number (starting from 1) to the start of every line in longlines-1.txt that ends with a newline and save the output to longlines-2.txt.” Note how we don’t have to worry about a missing newline at the end of the file since more has already taken care of that for us!
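To see what the numbering looks like, here is a minimal sketch. The file sample.txt and its three lines are made up purely for illustration and are not part of the walkthrough.

```batch
@echo off
rem sample.txt is a hypothetical three-line file used only for this demo.
(echo alpha
echo beta
echo gamma) > sample.txt

rem Prefix every line with its line number followed by a colon.
findstr /n $ sample.txt

rem Clean up the demo file.
del sample.txt
```

Each line should come back with its number and a colon in front of it (1:alpha, 2:beta, and so on), which is the prefix the later steps rely on.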
At this point, we’re finished with longlines-1.txt. It could potentially be a huge file so let’s delete it:
del longlines-1.txt
Now we need to extract the first line of longlines-2.txt and save it to a file on its own. Because we’ve already numbered the lines in the file, we can be certain that the line we require uniquely begins with 1:. This is yet another job for findstr:
findstr /b "1:" longlines-2.txt > longline1.txt
The /b switch tells findstr to only match lines that have 1: at the beginning of the line. The output from findstr is redirected into the longline1.txt file. We’re finished with longlines-2.txt at this point and it can be safely deleted.
Next, we need to remove the 1: from the start of the long line. To do this, we more the file and pipe the output through a chain of two pause commands. A side-effect of pause is that it consumes the first character piped into it. We redirect pause’s output to nul to suppress the “Press any key to continue” message and then we simply redirect the resulting output to a file using a second more command. The whole thing goes a little something like this:
more longline1.txt | (pause >nul & pause >nul & more > longline2.txt)
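If you want to try the pause trick on something small first, the sketch below runs the same pipeline on a tiny file. The names sample1.txt and sample2.txt are hypothetical and chosen only for this demo.

```batch
@echo off
rem sample1.txt and sample2.txt are made-up names for this demo only.
(echo 1:hello) > sample1.txt

rem Each pause swallows one character from the pipe (the "1" and the ":"),
rem and the second more writes whatever is left to sample2.txt.
more sample1.txt | (pause >nul & pause >nul & more > sample2.txt)

rem Show the result, then clean up.
type sample2.txt
del sample1.txt sample2.txt
```

Assuming pause behaves as described above, sample2.txt should hold hello without the 1: prefix, possibly followed by extra blank lines.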
We can now delete longline1.txt. All those more commands and use of pipes will have appended one or more additional newlines to the end of the file. To get rid of them, we have to (you guessed it) rely on our old friend findstr one last time. The following filters out all blank lines:
findstr /v "^$" longline2.txt > line100.txt
And at long last we have picked out the 100th line and saved it to a new file (assuming it wasn’t a blank line in which case the file will be empty).
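For reference, the manual steps can be gathered into one short script. This is only a sketch of the walkthrough, not the general solution: it assumes longlines.txt is in the current folder, that line 100 is not the last line, and that the intermediate file names are free to use.

```batch
@echo off
rem Sketch only: hard-coded to line 100 of longlines.txt, no error checking.

rem Skip the first 99 lines.
more +99 longlines.txt > longlines-1.txt

rem Number the remaining lines, then drop the first temporary file.
findstr /n $ longlines-1.txt > longlines-2.txt
del longlines-1.txt

rem Keep only the line that starts with "1:".
findstr /b "1:" longlines-2.txt > longline1.txt
del longlines-2.txt

rem Strip the two-character "1:" prefix using two pause commands.
more longline1.txt | (pause >nul & pause >nul & more > longline2.txt)
del longline1.txt

rem Remove the blank lines that more appended.
findstr /v "^$" longline2.txt > line100.txt
del longline2.txt
```

Unlike the walkthrough, the sketch also deletes longline2.txt at the end so that no temporary files are left behind.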
The example above is easy to follow by entering one command at a time at the Command Prompt. For a more general solution, though, you’ll need to write a program, and that can be a can of worms: you have to verify filenames, check that the line number is within the permitted range, and so on.
Well, it’s a good thing I got here first and did it for you! 😉 Consider it an early Christmas present.
@echo off & setlocal enableextensions
set "a2z=abcdefghijklmnopqrstuvwxyz" & set "nos=0123456789"

rem Check the arguments: line number, input file, optional output file
if "%~1"=="" (call :usage && goto end) else set "nth=%~1"
if "%~2"=="" (call :error "specify a file name" || goto end ) else set "infile=%~2"

if "%~3" neq "" (echo("%~3"| findstr /ix^
 ^"\"[%a2z%][%a2z%%nos%\._-]*[%a2z%%nos%]\"^" >nul ^
 && (if not exist "%~3" (set "outfile=%~3") else (
 call :error ""%~3" already exists" || goto end)) ^
 || (call :error ""%~3" is invalid filename" || goto end))

rem The line number may contain digits only
for /f "tokens=1* delims=0123456789" %%a in ("A0%nth%") ^
 do if "%%b" neq "" ^
 call :error ""%nth%" is invalid line number" || goto end

rem Strip leading zeros (a line number of zero becomes 1)
for /f "tokens=* delims=0" %%z in ("%nth%") do (
 set "nth=%%~z" & if not defined nth set /a nth=1)

if exist "%infile%\" (call :error ""%infile%" is a folder" || ^
 goto end) else if not exist "%infile%" (
 call :error "file "%infile%" not found" || goto end)

echo("%infile%" | findstr "\* \?" >nul ^
 && (call :error "wildcards not permitted in filename" || ^
 goto end)

rem Count the lines in the input file
for /f "tokens=1" %%c in ('type "%infile%" ^| find /c /v ""') ^
 do set /a lines=%%c

if %lines%==0 call :error ""%infile%" is an empty file" || ^
 goto end

if %nth% gtr %lines% ^
 call :error "specify a line number between 1 and %lines%" || ^
 goto end

if "%~3"=="" set "outfile=%~n2_line%nth%.txt"
if exist "%outfile%" ^
 call :error ""%outfile%" already exists" || goto end

(type nul >"%outfile%") 2>nul || ^
 call :error "unable to create files in current folder" || ^
 goto end

set /a nth-=1,lines-=1

rem Special case: the requested line is the last line of the file
if %lines%==%nth% (
 more +%nth% "%infile%" >"%outfile%" & goto result)

rem Skip the lines before the one we want
call :rfn "%infile%-1" infile1
(type nul >"%infile1%") 2>nul || ^
 call :error "unable to create temporary files" || ^
 goto end
more +%nth% "%infile%" > "%infile1%"

rem Number the remaining lines
call :rfn "%infile%-2" infile2
findstr /n $ "%infile1%" > "%infile2%"
del /f /q "%infile1%"

rem Keep only the line that begins with 1:
call :rfn "%infile%-line1" infile-line1
findstr /b "1:" "%infile2%" > "%infile-line1%"
del /f /q "%infile2%"

rem Strip the 1: prefix with two pause commands
call :rfn "%infile%-line2" infile-line2
more "%infile-line1%" | (pause >nul & pause >nul & more >"%infile-line2%")
del /f /q "%infile-line1%"

rem Remove the blank lines appended by more
findstr /v "^$" "%infile-line2%" > "%outfile%"
del /f /q "%infile-line2%"

:result
set /a nth+=1
echo(line %nth% of file "%infile%" has been saved to "%outfile%"

:end
endlocal & goto :EOF

:rfn random filename generator
setlocal
:retry
set /a randno=%random%
set "randfn=%tmp%\%~n1-%randno%.tmp"
if exist "%randfn%" goto retry
endlocal & set "%~2=%randfn%" & exit /b 0

:usage
cls
1>&2 (echo(Copies a long line from a text file to a new file.
echo(&echo(Usage:&echo(&echo( %~n0 N InFile [OutFile]&echo(
echo(where the contents of line number N from file Infile ^
will be saved to file&echo(OutFile. If OutFile is not ^
specified, a filename based on the InFile's name&echo(will ^
be generated.)
exit /b 0

:error
for /f delims^=^ eol^= %%e in ("%*") do 1>&2 echo(%%~e
exit /b 1
One small thing to be mindful of is that more will pause and wait for a keypress after 65,535 lines regardless of whether it’s piping data, redirecting a file, or sending output to the console.