The theory and concept section of this document my be incomprehensible - if so, or you just don't care about it, just skip over it.
Many of the concepts employed in the following discussion are from "C": you need to understand something of pointers and structures, and such terms as "dereferencing a pointer". Some concepts best managed with assembly thinking are also used: the identity between numbers of arbitrary precision and strings is fundamental to this work. Certain concepts from mathematics are also used: sets and subsets, ordering, n-dimensional coordinate systems, and the like. These concepts are necessary to understanding how these stacks work, but are not necessary for understanding how to use them.
Pointers are indexes to data items. Normally they are thought of as numerical addresses, but in batch language, they must be handled as strings. Dereferencing a pointer is the act of obtaining the data referenced (pointed to) by the pointer.
Structures are data elements containing more than one data item. Structures are known by their pointers; the internal elements are accessed as (named) offsets from the structure's pointer - in effect, structures are two (or more) dimensional constructs: the pointer to the structure has a position in one ordered scale and the various elements are indexed in another direction.
Consider the binary number 110100000110100000110100000 - if viewed as a decimal number, it's 109265312 ... but viewed as a string, it's "111", and 110100000110100000 is "11", 110100000110100000110100000110100000110100000110100000 is "1111111", and so forth - 010000010100000101000001 is "AAA". When I use string concatenation to construct strings of various numbers of "1"s, what's happening numerically is that I'm adding a new number that is the same as the original number multiplied by some multiple of 256 to the existing number. Therefore, strings and numbers can serve similar purposes - are, in fact, indistinguishable here. Unfortunately, we can't make the code work reliably if we use numerals for the indexes: "%11.data% " is seen as "%1" followed by "1.data" followed by "% " so the expression resolves to "1.data" instead of whatever the contents of 11.data are. Non-numerical characters work, and what we are doing is just a form of bang counter anyway, so we'll use "!" instead of "1" as the symbol.
Sets are collections of related items - subsets are parts of a set that meet some additional criterion: odd numbers form a subset of all positive numbers - itself a subset of all numbers.
For our purposes here, "ordering" is understood in terms of "previous", "this", and "next" and also in terms of a series: "1", "11", "111", etc. When looking at the pointers used here, generation is by means of a series, and retrieval is through "this" and "previous" relationships (we know "this", and "this" has a pointer to "previous").
"N-dimensional coordinate systems" refers to what amounts to grids (line, plane, solid, hypersolid) with ordered indexes for each line in each dimension - locations are described by the indexes of all the lines that intersect at the location given in an agreed upon order. Such a coordinate system is a sparse matrix if most of the lines are missing when actually implemented.
A stack is a block of memory accessed through a pointer that changes in one way while adding items to the stack and the opposite way when retrieving them. The classic PC implementation involves assigning a pointer to the block (the stack segment) and another to the items within the block. The item pointer (called the stack pointer) is initially at its maximum value and is decremented as items are pushed onto the stack (much as plates are pushed onto a plate dispenser at a cafeteria), and incremented when the item is removed (much as plates are popped off the plate stack). Obviously, this is a "last in, first out" buffer.
In batch language it is somewhat inconvenient to decrement and increment the stack pointer - "nearly impossible" would be more accurate - so a different approach must be used.
We can model a stack as a subset of a large, but sparsely filled, matrix in n dimensions. We can, for example use anything we like for the segment and just those numbers that have values of 2^n -1, that is binary numbers (leading zeros suppressed) 1, 11, 111, 1111, 11111, etc. for the pointer. Obviously, we waste a lot of possible entries in the matrix doing this, but it means that incrementation can be done by simple string concatenation: set foo=%f00%1. That leaves only decrementation to be dealt with.
Since we work our way through the stack one item at a time (particularly with respect to POP, we need only to obtain one pointer based on something known about the current pointer - that something can be the address of the stored next value. That is, we make each stack pointer point to both the data pushed on the stack and also to the value of the next pointer needed. If we invert the conventional increment/decrement order, we can remember the current pointer value, then increment the pointer and use the new value as the address of the next stack item - this way we can always know the value we will need later need at a time when it's convenient to save it.
The key concept is that by making the item pushed onto the stack, not a data item, but a structure containing both data and a pointer to the previous structure, we can implement the elements in a sparse three dimensional matrix as linked lists (the stack segment is one dimension, data is another, and the pointer is the third. This requires a three element address: the segment (which stack to use), the stack pointer (which structure on that stack to use), and the data/pointer selector (whether to read the data or the pointer). These can be implemented as dotted names: segment.pointer.selector. In addition, the entire designation can be considered as a string. The later is quite fortunate since strings are the only things we have for identifying environment variables and are the only kind of data we can store in the environment ... and the environment is the only piece of memory we can access directly.
All of this leads to implementing stacks as environment strings associated with other strings. Each stack item is a pair of environment variables having all but the last part of their names the same, and the last part predictable: .data and .pointer. The first part is the always the same for the same stack, and the middle part is the all ones number discussed above.
Since we can make lists doubly linked (data, pointer to previous, pointer to next)or unlinked (data only, no pointers in the structure) as well as just linked, we can also implement other forms of buffers. Two independent pointers (the second constrained to be equal to or less than the first) allow building a FIFO (first in, first out) as just a collection of data items and the two pointers. Both pointers start low and increment as items are added or removed (used entries should be deleted). This has the limitation that the length of the string used as the pointer, and therefore the number of items ever put in the buffer, is strictly limited. In that model, every entry gets a unique index that is never reused. The buffer size can be limited while removing the limitation on total number of items ever stored by making the FIFO buffer into a circular buffer - when the read pointer catches up with the write pointer, reading stops; when either pointer reaches the last allowed, that pointer is reset to the initial value and the process continues. The keyboard buffer in your PC is a circular buffer. I have no plans at this time to address those constructions here - they should be rather obvious variations on the stack theme.
This level of abstraction allows far more flexibility that would normally be needed, but a general solution to the problem of buffers in batch language allows us to efficiently implement whatever subset of the functionality we need.
The first code to be discussed is a full blown stack manager that implements concepts of stack segment, stack pointer, and base pointer (a pointer into the stack at a point where some special subset of variables can be addressed as offsets from the base pointer), and implements the push, pop, and clear functions. Instead of a segment pointer, we will use stack names supplied by the calling program.
This stack manager is to be called by other batch programs using the syntax
STACK function stack_name variable_name
where "function" is "push", "pop", "clear", or "base", "stack_name" is the program specific string (the program's own name as obtained from %0, perhaps) that names the stack to use, and "variable_name" is the name of an environment variable used to pass the value to be pushed or to return the value popped. The later variable is also used to pass the value returned by the "base" function. In this demo, the name of the program is assumed to be STACK.BAT.
@echo off
set %2.s_error=Clear
set s_jump=
set s_mode=goto %1
if not %s_max%!==! goto cont0
set s_max=!!!!!!!!!!!!!!!!!
:cont0 //line 7
echo set s_sp=%%%2.sp%%> }{.bat
echo if %%s_sp%%!==! set s_sp=!>> }{.bat
echo set s_next=%%s_sp%%!>> }{.bat
echo set s_data=%%%3%%>> }{.bat
echo set s_this=%%%2.sp.%s_sp%.prev%%>> }{.bat
call }{ //line 13
%s_mode%
:push //line 15
if %s_next%==%s_max% goto over
set %2.sp.%s_next%.data=%s_data%
set %2.sp.%s_next%.prev=%s_sp%
set %2.sp=%s_next%
goto end
:pop //line 21
if %s_sp%==! goto under
echo set %3=%%%2.sp.%s_sp%.data%%> }{.bat
echo set %2.sp=%%%2.sp.%s_sp%.prev%%>> }{.bat
call }{ //line 25
set %2.sp.%s_sp%.prev=
set %2.sp.%s_sp%.data=
%s_jump%
goto end
:over //line 30
if %4!==y! echo BATCH STACK ERROR: stack overflow
set %2.s_error=Over
goto end
:under //line 34
%s_jump%
if %4!==y! echo BATCH STACK ERROR: stack underflow
set %2.s_error=Under
goto end
:clear //line 39
if %s_sp%==! goto cont1
set s_jump=goto clear
set s_mode=goto pop
goto cont0
:cont1 //line 44
set %2.sp=
goto end
:base
set %3=%2.s_sp
:end
del }{.bat
The base pointer function simply returns the current value of the pointer for the named stack.
In the real world, it is seldom necessary to use multiple stacks, so the stack segment can be ignored. Nor is it generally necessary to have the stack manager as a separate program. These two considerations allow shortening the code somewhat ... to eight lines. It is so unlikely that the entire package will be needed, that I see little point in a detailed explanation of the code - everything important is in the simple version - the longer version is mostly multi-stack management code and object wrapper. The simple version:
set %sp%!.ptr=%sp%
set %sp%!.data=%foo%
set sp=%sp%!
for PUSH (where the foo environment variable contains the string we wish to save), and
echo set foo=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
set %sp%!.data=
set %sp%!.ptr=
for the POP part (the foo environment variable gets the string that is popped of the stack).
The PUSH part of the code manages the SP (stack pointer) environment variable on the assumption that no errors will occur and that eventually every PUSH will have a mating POP. Note that in Windows (all flavors), each DOS box begins with SP cleared, as does the only DOS session under real DOS - after that, every program that uses the same stack must issue equal numbers of pushes and pops in order to keep the variable properly synchronized. Note also that each DOS box has a unique environment, and therefore a unique stack.
set %sp%!.ptr=%sp%
(in the PUSH code) generates the next stack pointer by concatenating the existing pointer string with a "!" string literal and appends the pointer designator to produce the name of the variable that is then used to store the pointer part of the names of the variables used for the previous entry, thereby creating a link in the current stack entry that points to the previous item, that is, another link is added to the linked list of variables that comprise the stack.
set %sp%!.data=%foo%
builds the same stack pointer as the previous line did, but this time the data designator is used instead of the pointer designator. The composite variable name (the same first part as above, but with a different second part) is then used to save the contents of the foo environment variable in the stack.
set sp=%sp%!
then sets the stack pointer to point to the current entry. This is just a matter of reconcatentating the existing string with "!" and saving the result back into the same stack pointer variable (SP). When this command finishes, the stack is in the same condition as before the first as executed, except that there is one more link in the list and one more item on the stack - one more item has been PUSHed onto the stack.
POP is a little harder because we have to, in effect, dereference a pointer one time more than the command processor allows on one pass, that is, we have to refer to the contents of a variable by means of the name of the variable containing part of the name of the variable containing the string we want - a single pass through the command interpreter will deference a pointer, but we need to dereference a pointer to a pointer.
echo set foo=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
create a batch file containing the lines (if sp contains "!!!!")
set foo=%!!!!.data%
set sp=%!!!!.ptr%
When CALLed by the next line
call }{
those lines execute and put the data from the most recent extant stack entry into the foo variable and set the stack pointer variable to point to the previous entry in the list. The last two lines
set %sp%!.data=
set %sp%!.ptr=
are clean-up code: they delete the list entry that was just read.
I mentioned base pointer (above) - in this context it could be used to access the stack entries, beginning with the element pointed to by the base pointer by incrementing the pointer the same way as the stack pointer, in the same order they were placed on the stack, or if BP points to the last entry of immediate interest, in reverse order through the linked pointers. Normally one would not delete entries when they are just read, only when they are POPped. One could also store SP values that point to items of interest so that they can be retrieved randomly.
The most obvious use for a stack is to store the current drive and directory before CALLing another batch program - if your program needs to run from a specific directory and it is not known that the secondary program does not change the directory, it is necessary to store that information before starting the secondary program. If the secondary program does the same thing with yet another program, the information would need to be saved again. The only way to do this is by saving the information in variables with unique names, and since we have the means to construct and recover unique variable names through the stack mechanism, the thing to do is to PUSH the drive and directory information (or even the complete commands) onto the stack before calling the child program and to POP the information back off when it returns - what happens to the stack in the interval is unimportant as long as the PUSH and POP operations cancel each other (same number of each). Code for obtaining the drive letter and current directory is at exmp1.htm#pushpop - this version has been modified to work under real DOS and Win3.x with short directory names, and under Win9x, and NT4 with long names containing spaces. If you want OS specific code, versions are included for real DOS, Win9x, and NT4.
:: Get drive letter and directory from directory listings.
dir | find "Volume in" > }1{.bat
echo set drive=%%3:> volume.bat
dir | find "Directory" > }2{.bat
echo set dir=%%2 > directory.bat
:: this loop written to DIRECTORY.BAT deals with spaces - if there are
:: none, it immediately exits the subprogram.
echo :loop>> directory.bat
echo if %%4!==! goto l_end>> directory.bat
echo shift>> directory.bat
echo set dir=dir %%3>> directory.bat
echo goto loop>> directory.bat
echo :l_end >> directory.bat
:: CALL the two batch programs that will actually set the DIR and DRIVE
:: environment variables to the strings extracted from the DIR listing.
call }1{
call }2{
:: Delete the transient files.
for %%a in (}1{ }2{ volume directory) do del %%a.bat
:: PUSH drive letter.
set %sp%!.ptr=%sp%
set %sp%!.data=%drive%
set sp=%sp%!
:: PUSH directory name.
set %sp%!.ptr=%sp%
set %sp%!.data=%dir%
set sp=%sp%!
:: Invoke the program that may or may not change the directory.
call foo
:: POP the directory (it was pushed last, so it pops first). Note that it
:: goes back into the same variable (DIR) that it came out of.
echo set dir=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: POP the drive letter
set %sp%!.data=
set %sp%!.ptr=
echo set drive=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: It is necessary to delete the transient file before changing away
:: from its directory.
del }{.bat
:: Delete the variables used to store the stack elements no longer needed.
set %sp%!.data=
set %sp%!.ptr=
:: Restore the original drive.
:: %drive% contains the complete command (the drive letter + a colon)
:: needed to change the drive - expanding the variable (as if a macro)
:: causes the command to execute.
%drive%
:: Change back to the original directory - this datum was not stored
:: as a complete command in order that we can deal with spaces here: if
:: the directory name contains at least one space, we must quote it.
set quote=
echo %dir% | find " " > nul
if errorlevel 1 set quote="
cd %quote%%dir%%quote%
:: That line resolves to either the directory name (d:\foobar) or the
:: directory name quoted ("d:\foo bar"), depending on whether or not
:: the name contains at least one space.
:: More clean-up code - delete unneeded variables.
set drive=
set dir=
set quote=
The real DOS version has no need to deal with spaces, so all the code dealing with spaces and quotes can be deleted:
:: Get drive letter and directory from directory listings.
dir | find "Volume in" > }1{.bat
echo set drive=%%3:> volume.bat
dir | find "Directory" > }2{.bat
echo set dir=%%2 > directory.bat
:: CALL the two batch programs that will actually set the DIR and DRIVE
:: environment variables to the strings extracted from the DIR listing.
call }1{
call }2{
:: Delete the transient files.
for %%a in (}1{ }2{ volume directory) do del %%a.bat
:: PUSH drive letter.
set %sp%!.ptr=%sp%
set %sp%!.data=%drive%
set sp=%sp%!
:: PUSH directory name.
set %sp%!.ptr=%sp%
set %sp%!.data=%dir%
set sp=%sp%!
:: Invoke the program that may or may not change the directory.
call foo
:: POP the directory (it was pushed last, so it pops first). Note that it
:: goes back into the same variable (DIR) that it came out of.
echo set dir=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: POP the drive letter
set %sp%!.data=
set %sp%!.ptr=
echo set drive=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: It is necessary to delete the transient file before changing away
:: from its directory.
del }{.bat
:: Delete the variables used to store the stack elements no longer needed.
set %sp%!.data=
set %sp%!.ptr=
:: Restore the original drive.
:: %drive% contains the complete command (the drive letter + a colon)
:: needed to change the drive - expanding the variable (as if a macro)
:: causes the command to execute.
%drive%
:: Change back to the original directory - this datum was not stored
:: as a complete command.
cd %dir%
:: More clean-up code - delete unneeded variables.
set drive=
set dir=
set quote=
The NT4 code for getting drive letters and directories can be very different and rather more efficient:
@echo off
for /d %%a in (.) do set drive=%%~da && set dir=%%~pa
:: Note that the above line sets DRIVE to the letter + a colon. The dir
:: part does not contain the drive letter, though a change of %~da to
:: %~fa would put the letter on it.
:: PUSH drive letter.
set %sp%!.ptr=%sp%
set %sp%!.data=%drive%
set sp=%sp%!
:: PUSH directory name.
set %sp%!.ptr=%sp%
:: Note that the following line puts the complete command to restore the
:: directory into the environment variable and quotes the path regardless
:: of whether it has spaces or not.
set %sp%!.data=cd "%dir%"
set sp=%sp%!
:: Invoke the program that may or may not change the directory.
call foo
:: POP the directory (it was pushed last, so it pops first). Note that it
:: goes back into the same variable (DIR) that it came out of.
echo set dir=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: POP the drive letter
set %sp%!.data=
set %sp%!.ptr=
echo set drive=%%%sp%.data%%> }{.bat
echo set sp=%%%sp%.ptr%%>> }{.bat
call }{
:: It is necessary to delete the transient file before changing away
:: from its directory.
del }{.bat
:: Delete the variables used to store the stack elements no longer needed.
set %sp%!.data=
set %sp%!.ptr=
:: Restore the original drive.
:: %drive% contains the complete command (the drive letter + a colon)
:: needed to change the drive - expanding the variable (as if a macro)
:: causes the command to execute.
%drive%
%dir%
:: More clean-up code - delete unneeded variables.
set drive=
set dir=
set quote=
Win9x requires almost all of the code in the original multi-OS example except that all directories can be quoted:
set quote=
echo %dir% | find " " > nul
if errorlevel 1 set quote="
cd %quote%%dir%%quote%
becomes
cd "%dir%"
if it's thought worth while to save those two lines. The Win9x version also works in NT4.