Recently I worked on a project that required some basic command line scripting. One of the VMs I worked on was a Windows box, and its scripts were Batch files. Although I'd seen a few Batch scripts before in my short software development career (~2.3 years), I had never actually written one myself. I figured this was the perfect opportunity to look at the basics of Batch scripting. With some knowledge of how to write a Batch script, I'll be able to compare Batch with scripts written in PowerShell and Bash.
Batch scripts are pieces of code written for the command line interface (shell) of the Windows operating system. As someone new to programming, I always thought of Batch as the precursor to PowerShell, which actually makes a pretty good one-sentence comparison between the two. Although you will often hear developers advocating the switch from Batch to PowerShell, Batch scripting is far from extinct [1].
Batch scripts first appeared in the DOS family of operating systems and are still used in Microsoft operating systems today [2]. On today's Windows operating systems, Batch scripts are executed by the cmd.exe CLI [3]. You can type commands directly at the command prompt or store a sequence of commands in a text file known as a Batch file. Batch files make command sequences reusable.
When a Batch file runs, each line is executed in order. A Batch file is run by simply entering its file path in cmd.exe, and arguments can be passed to it. For example, the following command executes the Batch file test.bat with the argument "Hello." All the program does is print the argument to standard output.
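A minimal sketch of what such a test.bat could contain is shown here. The file contents are an assumption based on the description above, not the original listing.

```batch
@echo off
rem test.bat: print the first command line argument to standard output.
rem Run it from cmd.exe as:  test.bat Hello
echo %1
```

Running `test.bat Hello` with this sketch prints `Hello`.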
Interestingly, the default behavior of Batch scripts is to print every command as it is executed. Usually this isn't desirable (except when debugging), so the echoing is suppressed with the @echo off command [4]. Arguments are accessed with the %<arg-number> syntax (for example, %1 for the first argument), and later on I will use this same syntax to access function arguments in Batch.
Batch scripts also allow you to work with strings and integers. The set command is used to assign a value to a variable. Variables can then be accessed with the %variable-name% syntax.
Unfortunately, I quickly learned that working directly with floating point arithmetic in Batch is much more difficult. I won't include any examples in this post, but it is an interesting topic to explore [5].
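To make the set command and the integer-only arithmetic concrete, here is a minimal sketch. The variable names and values are illustrative assumptions rather than the original samples.

```batch
@echo off
setlocal

rem Assign a string and an integer (no spaces around the equals sign)
set greeting=Hello
set age=23

rem Read the values back with the %variable-name% syntax
echo %greeting%, I am %age% years old

rem set /a evaluates an integer arithmetic expression
set /a nextAge=age+1
echo Next year I will be %nextAge%

rem Division truncates, since only integers are supported
set /a half=age/2
echo Half of %age% is %half%

endlocal
```

The last line prints `Half of 23 is 11`, which shows why anything involving fractions needs a workaround.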
Batch also allows you to set inner scopes for a script. Unfortunately the syntax isn't quite as concise as C-like curly brackets (Batch requires a more verbose setlocal and endlocal pair).
You can see that any variables defined in the inner scope are not accessible to the outer scope and more importantly don't modify any of the existing outer scope variables.
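The scoping behavior described above can be sketched as follows, assuming the setlocal and endlocal commands mark the inner scope. The variable names are hypothetical, and the sketch assumes onlyInner is not already defined in the environment.

```batch
@echo off
setlocal

set name=outer

rem Begin an inner scope: changes made here are discarded at endlocal
setlocal
set name=inner
set onlyInner=1
echo Inside the inner scope: %name%
endlocal

rem The outer value is restored and onlyInner no longer exists
echo Outside the inner scope: %name%
echo onlyInner is "%onlyInner%"

endlocal
```

The second echo prints `outer` again, and the last line prints an empty pair of quotes because onlyInner was defined only in the inner scope.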
Batch also has support for array data structures. This is also where I ran into the first major "gotcha" of Batch programming.
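A minimal sketch of such a script follows. It assumes the common convention of emulating an array with indexed variable names, and the specific town names are illustrative choices rather than the original list.

```batch
@echo off
setlocal enableDelayedExpansion

rem Emulate an array with indexed variable names
set towns[0]=Greenwich
set towns[1]=New Canaan
set towns[2]=Darien
set towns[3]=Wilton

rem Print the town at index 0
echo %towns[0]%

rem First loop: walk indices 0 through 3 of the array defined above
set allTowns=
for /l %%i in (0,1,3) do (
    set allTowns=!allTowns! !towns[%%i]!
)
echo %allTowns%

rem Second loop: the items to iterate over are listed inline
for %%t in (Greenwich Darien Wilton) do (
    echo %%t
)

endlocal
```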
The code above creates an array of towns in my home state of Connecticut. It first prints the town at index 0 in the array, and then goes on to loop through the array, creating a string of all the town names. The first for loop uses an array defined elsewhere to loop through. The second for loop is more concise and defines an array to loop through inside the for loop syntax itself.
The two points of interest in the code above are setlocal enableDelayedExpansion and set allTowns=!allTowns! !towns[%%i]!. The exclamation point variable access syntax and setlocal enableDelayedExpansion are related in an unexpected way.
The code samples I've shown you so far access the values of variables through the %variable-name% syntax. In Batch, %variable-name% triggers variable expansion. Variable expansion is the act of replacing a variable reference with its actual value. That means if you created a variable with set age=23 and then accessed it with %age%, the token %age% actually gets replaced with 23. This replacement occurs when a line in a Batch script is parsed, not when it's finally executed.
Variable expansion with the %variable-name% syntax happens only once, when the line it occurs on is parsed [6]. A parenthesized block, such as the body of a for loop, is parsed as a single unit, so the substituted value never changes while that block runs.
Obviously this causes issues in for loops. With the %variable-name% syntax, a variable read inside the for loop body will appear never to change! This breaks from the behavior you come to expect from for loops. Variable expansion side effects are one of the most common beginner mistakes with Batch, and I fell victim to them [7].
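The pitfall can be reproduced with a small counter loop. This is a minimal sketch with a hypothetical variable name, written to show the parse-time behavior described above.

```batch
@echo off
setlocal

set count=0

rem The whole parenthesized body is parsed once, so %count% is
rem replaced with 0 before the loop ever runs
for /l %%i in (1,1,3) do (
    set /a count+=1
    echo count is %count%
)

rem This line is parsed after the loop, so it sees the updated value
echo final count is %count%

endlocal
```

The loop prints `count is 0` three times, yet the final line prints `final count is 3`: the variable really was incremented, but the reference inside the loop body had already been replaced.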
The solution is to use delayed variable expansion by executing setlocal enableDelayedExpansion. With delayed variable expansion and the !variable-name! syntax, variables are expanded each time a line is executed. In the case of for loops, this occurs on each loop iteration [8].
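Here is the same kind of counter loop written with delayed expansion, again as a minimal sketch with a hypothetical variable name.

```batch
@echo off
setlocal enableDelayedExpansion

set count=0

rem !count! is expanded when each echo executes, not when the
rem block is parsed, so it reflects the latest value
for /l %%i in (1,1,3) do (
    set /a count+=1
    echo count is !count!
)

endlocal
```

This version prints `count is 1`, `count is 2`, and `count is 3`.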
Another interesting construct you can build in Batch scripts is the function.
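A minimal sketch of a function that prints the current time might look like this. It assumes the built-in %time% dynamic variable and is not necessarily the original listing.

```batch
@echo off

rem Invoke the labeled block below, then end the script so that
rem execution does not fall through into the function body
call :displayTime
exit /b 0

:displayTime
echo The current time is %time%
exit /b 0
```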
The above function (beginning at :displayTime and ending at exit /b 0) simply prints out the current time. The interesting thing is that while the above construct can be treated as a function, it is actually a subprogram inside a Batch script. The call command invokes one Batch program from another [9]. Using the call command is like having a bunch of goto statements in your code, jumping around to different labels. :displayTime happens to be the label for my function. This behavior in Batch reminds me of my time working with assembly or, less charitably, of very poorly designed Java code. The following more complex function drives this point home. You can follow the execution flow through the goto commands, jumping between labels.
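As a sketch of this style, the hypothetical function below takes one argument through %1 and branches with goto. The label names and the age threshold are assumptions made for illustration.

```batch
@echo off

call :checkAge 23
call :checkAge 12
exit /b 0

:checkAge
rem %1 is the first argument supplied by the call command
if %1 geq 18 goto isAdult
goto isMinor

:isAdult
echo %1 is an adult age
goto checkAgeDone

:isMinor
echo %1 is a minor age
goto checkAgeDone

:checkAgeDone
rem Return to the line after the call that invoked :checkAge
exit /b 0
```

Even in a function this small, tracing a single call means hopping across three or four labels.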
Code that relies on goto commands is notoriously difficult to follow. This behavior in Batch makes me wonder how Bash and PowerShell handle functions and script traversal.
I feel like I learned a lot about command line scripting over the past few days, but obviously I have a long way to go. I have only scratched the surface of Batch files, and I still want a deeper understanding of Bash and PowerShell. I hope to research them further and compare their features in the future.
The code is on GitHub with even more samples than the ones I covered in this post!