Sunday, 25 November 2007

BATCH_ input parameter with Ampersand

The Ampersand

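What is the ampersand? The ampersand (or and-sign, '&') chains commands on a single line in batch scripts: the commands run one after another, regardless of whether the first one succeeded (the truly conditional operators are && and ||). So ECHO one & ECHO two will give you the output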
one
two
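
To see the plain ampersand next to its conditional relatives, here is a small sketch. The file name in the last line is made up for the demonstration and is assumed not to exist.

```batch
@ECHO OFF
:: "&" simply chains commands: both ECHOs always run.
ECHO one & ECHO two
:: "&&" runs the second command only if the first one succeeded.
VER >NUL && ECHO VER succeeded
:: "||" runs the second command only if the first one failed.
:: The file name below is hypothetical and assumed not to exist.
DIR "%TEMP%\no_such_file_for_demo.tmp" >NUL 2>&1 || ECHO DIR failed
```

For the rest of this post only the plain ampersand matters, because that is the one that can legally show up inside a path.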

The problem with the ampersand is that it is a special character that can legally occur in paths and filenames passed to batch scripts (other special characters like >, <, : etc. are not allowed there!). Of course, such characters can also show up in user input, when reading text files etc., but that's not the point here because I want to focus on the ampersand only.
When it occurs in a batch script parameter, it will be either quoted or escaped:
:: Standard file
C:\>script.cmd file1.txt
:: File with spaces needs quotes
C:\>script.cmd "file 1.txt"
C:\>script.cmd "C:\Documents and Settings\file1.txt"
:: the same applies to an ampersand occurring in paths
C:\>script.cmd "C:\files & more\file.txt"
:: but an ampersand itself can also be escaped
C:\>script.cmd files^&more


Batch script parameters

You can refer to the parameters of your batch script with %1 to %9. You will get each parameter just as it was entered, including any quotes, except that you will not "see" the carets (^) of an escaped string. So if the first parameter entered is "C:\files & more\file.txt", an ECHO will give you exactly that phrase:
ECHO My input is %1.
results in
My input is "C:\files & more\file.txt".

You can also assign the parameter's value to an environment variable with the statement SET PARAM=%1. But what can we do about the annoying quotes around the string? Batch offers several parameter substitutions, which all look like %~[op]n. So %~d1 gives us the drive letter of the first parameter, treating it as an absolute or relative file path. All substitutions remove the optional surrounding quotes; if you just want to remove the quotes, you use %~1 without any operator.
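
Here is a small sketch of the most common substitutions. The script name and the path in the comment are made up, and every expansion is put back into quotes so that an ampersand cannot do any harm.

```batch
@ECHO OFF
:: Call as: subst_demo.cmd "C:\files & more\file.txt"   (hypothetical name and path)
:: Every expansion is re-quoted so that an ampersand stays harmless.
ECHO Unquoted value : "%~1"
ECHO Full path      : "%~f1"
ECHO Drive          : "%~d1"
ECHO Directory      : "%~p1"
ECHO File name      : "%~n1"
ECHO Extension      : "%~x1"
ECHO Drive + dir    : "%~dp1"
```

The operators can be combined, as the last line shows with d and p.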

But if the string contains an escaped ampersand, or you remove the surrounding quotes from a string with an ampersand, that character will be treated by the interpreter as the command separator, which will obviously break your script.
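
The following sketch shows which forms survive a quoted parameter with an ampersand. The script name and path are hypothetical, and the unsafe form is only described in a comment so that the script itself keeps running.

```batch
@ECHO OFF
:: Call as: quote_demo.cmd "C:\files & more\file.txt"   (hypothetical name and path)
:: Safe: the caller's quotes are still there, so the ampersand is plain text.
ECHO As passed : %1
:: Safe: the quotes were stripped but are put back immediately.
ECHO Re-quoted : "%~1"
:: Unsafe: echoing the stripped form without any quotes would be split at the
:: ampersand, and cmd.exe would then try to run "more\file.txt" as a command.
```

Note that the first ECHO is only safe as long as the caller really quoted the parameter. With an escaped ampersand instead of quotes it breaks in the same way.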

Parsing and executing a batch script

The way the batch interpreter works its way through a batch file is very special. In short, it looks like this:
  1. Read a line. Round brackets spanning lines count as a single line.
  2. Expand all environment variables and batch script parameters by replacing them with their values. Optional substitution operations can be performed.
  3. Parse the line, identify commands, strings, operators etc.
  4. Perform delayed variable expansion on environment variables if enabled. Optional substitution operations can be performed.
  5. Execute all identified commands in order.
I assume you are already familiar with round brackets and substitution operations, so I will only explain delayed variable expansion. Variables that are to be expanded late are written with exclamation marks instead of percent signs:
ECHO This %VARIABLE% expands before this !VARIABLE!.
Because the expansion takes place AFTER parsing, all special characters (including ampersand, caret, pipe etc.) are effectively escaped: the parser never gets to see them. That's why you can use this technique to handle strings with ampersands!
You enable delayed variable expansion with
SETLOCAL ENABLEDELAYEDEXPANSION
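
A minimal sketch of the difference between the two expansion steps. The variable names are made up, and the SET "NAME=value" form in the second half is just one convenient way to get an ampersand into a variable for the test.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET COUNT=0
FOR %%I IN (a b c) DO (
    SET /A COUNT += 1
    REM Percent form was fixed when the block was read, delayed form is evaluated on each pass.
    ECHO percent form: %COUNT%   delayed form: !COUNT!
)
:: The value contains an ampersand. The quoted SET form keeps the quotes
:: out of the value itself.
SET "VALUE=cats & dogs"
ECHO Delayed : !VALUE!
ECHO Quoted  : "%VALUE%"
ENDLOCAL
```

In the loop the percent form keeps showing 0, because the whole bracketed block counts as one line and was expanded before the first pass, while the delayed form counts 1, 2, 3.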


Working with Variables

As I said above, parameters with ampersands easily break your script. So you have to either quote or escape your parameters when using them. But you cannot escape parameters in a script; that only works for variables! Alright, let's quote - but what if a parameter is already quoted?
I told you above that you can remove the quotes through parameter substitution. With %~1 you are left with an unquoted string possibly containing one or more ampersands waiting to break your script. So ALWAYS use something like
ECHO "This is my first parameter: %~1."
SET PARAM="%~1"
IF "%~1" == "a" ECHO First parameter is a!

Okay, now you have a quoted string - but that sometimes sucks like in the ECHO line, because that command also echoes the quotes.
Aaaahh, yes, I wrote something about substitution operations.
SET STR="%~1"
ECHO This is my parameter: %STR:~1,-1%.
That removes the first and the last character, which are my manually added quotes. But dammit, it's unquoted now which breaks the script again! Hmmm, wait, there was something about "delayed" variable expansion after parsing? Right, that's the way to use unquoted strings WITH ampersands! As the variables are expanded AFTER parsing, special characters won't break your code because they will always appear as strings.
SET STR="%~1"
ECHO This is my parameter: !STR:~1,-1!.
Assignment works as well:
SET STR="%~1"
SET STR=!STR:~1,-1!
And now we've got it: the variable STR contains the unquoted input parameter string. So if you want to use that variable, you either have to quote it again or use delayed expansion:
ECHO The variable STR contains the value !STR!.
ECHO "The variable STR contains the value %STR%".
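
Put together, the steps of this section could look like the following sketch. The script name and path in the comment are hypothetical.

```batch
@ECHO OFF
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
:: Call as: show_param.cmd "C:\files & more\file.txt"   (hypothetical name and path)
:: Step 1: force exactly one pair of quotes around the parameter.
SET STR="%~1"
:: Step 2: strip that pair again, after parsing, so the ampersand is harmless.
SET STR=!STR:~1,-1!
:: Delayed expansion needs no quotes ...
ECHO Parameter: !STR!
:: ... percent expansion does.
IF "%STR%" == "" ECHO No parameter given.
ENDLOCAL
```

It should behave the same whether the caller quoted the parameter or escaped the ampersand with a caret.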

Hmm, but what about when one actually cannot use delayed expansion? For example when using subroutines with local variables of which one should be left after return?
:MY_SUBROUTINE
SETLOCAL
.... assign many variables here etc. ....
ENDLOCAL & SET STR="%STR:"=^"%"& SET STR=!STR:~1,-1!
GOTO :EOF
This special technique, which allows both local and global context, needs two SET statements (note the missing space between quote and ampersand!!!). That looks really bad, but is in fact the only clean solution.
However, if you are sure that the only special character contained in that string is the ampersand (e.g. it is an original file path), you can manually escape that string:
ENDLOCAL & SET STR=%STR:&=^&%
That has the same effect as above, but only for the ampersand!
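
Here is a small, self-contained sketch of the ampersand-only variant. The subroutine name MAKE_PATH and the path are made up for the example.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
CALL :MAKE_PATH
:: STR was set by the subroutine and survived its ENDLOCAL.
ECHO Returned: !STR!
ENDLOCAL
GOTO :EOF

:: Hypothetical subroutine: builds a value locally and hands it back in STR.
:MAKE_PATH
SETLOCAL
SET STR=C:\files ^& more\file.txt
:: The percent expansion happens while the local STR still exists. The
:: substitution re-escapes the ampersand for the SET that runs after ENDLOCAL.
ENDLOCAL & SET STR=%STR:&=^&%
GOTO :EOF
```

The whole line is expanded before ENDLOCAL runs, which is why the local value is still available to the SET behind the ampersand.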

Notes

  • The script itself can also live in a path containing ampersands, and %0 (which stores the script name) will then contain them. So the rules stated above also apply to this parameter.
  • Avoid using %*. It merges all parameters into one single string. That string can contain quoted parameters, so you cannot simply quote %*, and it can contain unquoted, escaped parameters, so you would have to quote it. You see, there is no way to handle it correctly, so you should use the SHIFT command instead.
  • A common IF expression is
    IF "%TEST%" == "abcd" GOTO :DO_IT
    This works unless the variable contains double quotes. You could use other characters like [] to surround your variables in case they are empty, however this would still break in case of an ampersand. So here you can use two equivalent constructions: escape the quotes or use delayed variable expansion:
    IF "%TEST:"=^"%" == "abcd" GOTO :DO_IT
    IF "!TEST!" == "abcd" GOTO :DO_IT
  • As for FOR constructs, you should be very careful and use delayed expansion and quotation whenever possible. Sometimes you have to work the right solution out by trial and error.
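
As a sketch of the SHIFT approach recommended in the notes above, the following loop walks through the parameters one at a time and re-quotes each of them. The label names are made up.

```batch
@ECHO OFF
:: Walk through all parameters one by one instead of using the catch-all parameter.
:NEXT_ARG
IF "%~1" == "" GOTO :DONE
ECHO Parameter: "%~1"
SHIFT
GOTO :NEXT_ARG
:DONE
```

One limitation of this simple test: an explicitly empty parameter ("") ends the loop early, because it cannot be told apart from "no more parameters".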

BATCH_ Strip surrounding double quotes from string

Testing a batch string for surrounding quotes is quite annoying. Here is a solution for Windows 2000 and newer:
:: %T% contains the test-string
IF "%T:~0,1%%T:~0,1%" == """" IF "%T:~-1%%T:~-1%" == """" (
  SET T=%T:~1,-1%
)
As you can see, I am using the first and the last character twice each, because a single double quote would break the IF. I'll explain it with this little example:
T holds the string "Hello World!" with the surrounding quotes. The (simplified) test

IF %T:~0,1% == " (...)
would expand to
IF " == " (...)
which will be parsed as
IF <string>
with the string being " == " (...). Cannot work, huh?
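
For completeness, a small runnable sketch around the test from the beginning of this post. The test value is made up, and the sketch assumes that T is defined and contains no other special characters.

```batch
@ECHO OFF
SETLOCAL
:: Hypothetical test value, including its surrounding quotes.
SET T="Hello World"
IF "%T:~0,1%%T:~0,1%" == """" IF "%T:~-1%%T:~-1%" == """" (
  SET T=%T:~1,-1%
)
ECHO After stripping: %T%
ENDLOCAL
```

With a value that has no surrounding quotes, both tests fail and T is left untouched.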

BATCH_ String length

This little subroutine calculates the string length in Batch (cmd.exe) using a divide-and-conquer approach. It starts with the maximum string length possible under Windows XP (8191; NT and 2000 are limited to 2047, see http://support.microsoft.com/kb/830473). It divides the string into two equal halves and tests whether the character in the middle exists. If it does not, the same is done again with the left half; if it does, the length up to and including that character is stored and the search starts over on the rest of the string.
:: Returns the string length.
:: PARAM
:: 1 String string to get length of
:: RETURN Integer string length
:STRLEN
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
SET STR="%~1"& SET STR=!STR:~1,-1!
SET LEN=0
:: Max string length is 8191, see http://support.microsoft.com/kb/830473
SET POS=8192
:STRLEN_LOOP
SET /A POS /= 2
IF NOT "!STR!" == "" (
    IF NOT "!STR:~%POS%,1!" == "" (
        SET /A LEN += %POS% + 1
        REM work on the rest, i.e. everything after index POS
        SET STR=!STR:~%POS%!
        SET STR=!STR:~1!
    )
    GOTO :STRLEN_LOOP
)
ENDLOCAL & SET RESULT=%LEN%
GOTO :EOF
I have to double check the string because substring extraction %STR:~x,y% on an empty string leaves you with ~x,y. However, extracting a substring after the end of a non-empty string works as expected and returns the empty string.
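
If you want to see that quirk for yourself, this little sketch shows both cases with plain percent expansion. The variable names are made up.

```batch
@ECHO OFF
SETLOCAL
SET EMPTY=
SET FULL=abc
:: Undefined variable: the substring syntax is not resolved.
ECHO [%EMPTY:~0,1%]
:: Defined variable, position past the end: expands to nothing.
ECHO [%FULL:~5,1%]
ENDLOCAL
```

As described above, the first line should leave the ~0,1 between the brackets, while the second one should print empty brackets.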

I did some performance testing, and though jumping to labels depends heavily on script layout and size, 100 calculations of a string, no matter what length, took about 1.3 seconds on an AMD Athlon XP 3200+.

If you are using only very short strings, you can loop through each character and test it for existence:
:: Calculates the string length of the first parameter.
:: PARAM
:: 1 String to return length of
:: RESULT Integer length of string
:STRLEN
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
SET I=-1
:: quote first, then strip the quotes after parsing, so ampersands are safe
SET T="%~1"& SET T=!T:~1,-1!
:STRLEN_LOOP
SET /A I += 1
IF NOT "!T!" == "" IF NOT "!T!" == "!T:~0,%I%!" (
    GOTO :STRLEN_LOOP
)
ENDLOCAL & SET RESULT=%I%
GOTO :EOF
However, the cost scales with the string length. Compared with the first solution, it is faster (less than 1 second for 100 tests) for strings shorter than 10 characters, and for 16 characters takes exactly the same amount of time.
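
Here is a self-contained usage sketch built around the short loop variant. It copies the parameter with the quote-then-strip pattern from the first subroutine, and the test strings are made up.

```batch
@ECHO OFF
SETLOCAL ENABLEEXTENSIONS
CALL :STRLEN "files & more"
ECHO Length: %RESULT%
CALL :STRLEN ""
ECHO Length of empty string: %RESULT%
ENDLOCAL
GOTO :EOF

:: Calculates the string length of the first parameter, result in RESULT.
:STRLEN
SETLOCAL ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION
SET I=-1
:: Quote first, then strip the quotes after parsing, so ampersands are safe.
SET T="%~1"& SET T=!T:~1,-1!
:STRLEN_LOOP
SET /A I += 1
:: Stop as soon as the first I characters are the whole string.
IF NOT "!T!" == "" IF NOT "!T!" == "!T:~0,%I%!" (
    GOTO :STRLEN_LOOP
)
ENDLOCAL & SET RESULT=%I%
GOTO :EOF
```

The first test string has 12 characters including the ampersand and the two spaces, so that is the value to expect in RESULT after the first call.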
