Amiga-OS Shell Documentation

Abstract: This document describes the command line syntax and features of 
the Amiga-OS Shell. Specifically, the shell described hereafter is the V45
shell of BB2. Not all issues apply to the V40 release of the shell.

______________________________________________________________________________

The purpose of the Amiga-OS shell is to parse its input stream for commands,
then locate these commands either on disk or in memory, and execute these
commands. An input stream is typically the stream of a console handler, such
as the CON: window or a ViNCEd window. Other input streams are command files,
or "batch files" as they are sometimes called. The most popular batch file is
of course the "Startup-Sequence" in the S: assign.

The input stream is therefore read line by line, that is, up to a terminating
line feed character (or the end of file, where the shell generates this
line feed implicitly). Should the input be read from a console window, this
line feed is of course generated by the console handler as soon as the user
presses the return key.

The input line is then scanned for the command itself and for the input/output
and error redirection of the command defining the streams the command will
print to, will read its input from and may write its diagnostic output to.
Furthermore, the shell also handles expansion of back-tick sequences, aliases
and variables. Special features of the shell are the "&" character and the
"+" sign at the end of a line, as well as the comment introducer ";". All
this is described in more detail below.

Even though the shell parses the command line, it does not prepare argument
lists for the command to be executed; the command line is scanned for
shell specific control features like input/output redirection or back-ticks,
and the remaining input is forwarded as is to the command. It is up to the
command itself to parse this input string a second time. This has the
unfortunate side effect that the shell has to second-guess the expected
syntax of the appropriate command; for the time being, the shell assumes
the following syntax, which - surprise, surprise - coincides with the syntax
of the ROM functions ReadArgs() and ReadItem() used by most commands anyhow:

This pair of functions knows only two special characters - or maybe four,
depending on how you count: Blanks = the space character and the TAB which 
are considered equivalent, the double quote " and the asterisk * as the BCPL 
style escape character. A command argument is to be understood as a sequence
of non-blank characters up to the line feed, unless the blanks and the 
remaining argument are enclosed in double quotes. Interestingly, this goes 
only for quotes following a blank space. Quotes within an argument, meaning
not following a blank, are not counted as argument separators. Hence,

list abc 
list ab"cd"ef
list abcd"

are all attempts to run "list" with one single argument in which any quote
stands for itself.
Within quoted arguments, and only there(!) the escape character * gets
its special meaning. Outside of quotes, the star stands for itself.

Four *-escaped sequences exist: ** is the star itself, *" the double
quote, *N a line feed character and *E an "escape" character, of ASCII
code 27 = hex 0x1B. Therefore,

echo "**"
echo **

print one star and two stars, respectively. Contrary to other claims, the
sequence "*E[" is not directly related to the "csi" character of code
155 = 0x9b; as far as argument parsing is concerned, this is regarded as
"escape [". It is merely the console driver which interprets the latter as
a substitute for "csi"; neither ReadItem() nor ReadArgs() does.
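
The quoting and escaping rules above can be modelled by a short Python
sketch. This is only an illustration of the rules as described here, not the
ROM code; the function name read_items is hypothetical. Two points are
assumptions of the sketch: escape letters are accepted in either case, and a
star in front of any other character simply yields that character. Text
directly following a closing quote is not modelled.

```python
# Escape sequences recognized after a star inside a quoted argument.
ESCAPES = {"*": "*", '"': '"', "N": "\n", "E": "\x1b"}


def read_items(line):
    """Split one argument line (without its line feed) into arguments."""
    items = []
    i, n = 0, len(line)
    while i < n:
        if line[i] in " \t":
            # Space and TAB are equivalent separators.
            i += 1
        elif line[i] == '"':
            # A quote at the start of an argument opens a quoted argument.
            i += 1
            buf = []
            while i < n and line[i] != '"':
                if line[i] == "*" and i + 1 < n:
                    nxt = line[i + 1]
                    # Assumption: unknown escapes yield the escaped character.
                    buf.append(ESCAPES.get(nxt.upper(), nxt))
                    i += 2
                else:
                    buf.append(line[i])
                    i += 1
            i += 1  # skip the closing quote, if there is one
            items.append("".join(buf))
        else:
            # Unquoted argument: quotes and stars stand for themselves.
            start = i
            while i < n and line[i] not in " \t":
                i += 1
            items.append(line[start:i])
    return items
```

For instance, read_items('"**" **') yields one star for the quoted argument
and two stars for the unquoted one, matching the two "echo" lines above.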

Even though these are all syntax rules as far as the ROM argument parsing 
routines are concerned, the shell itself parses its input line as well, and
performs various operations on this line before it is provided as input to
the command to be executed.

The first step is back-tick substitution; then the first argument according
to the above rules is parsed off, ignoring leading blanks. Alias substitution
follows next. As soon as the alias expansion fails to expand further, it
checks whether the command name is a shell variable to be substituted, and
then tries to locate the command to be executed. This may be either an
explicit command found in the current directory, the command search path
or the resident list, or an implicit command for script files, non-executable
files or directories. The last step parses the command arguments for shell
variables to be substituted, and for input/output redirection. All these
steps are described in more detail in the following sections.


1) Variable substitution: Local or global variables are set by means of the
"set" or "setenv" commands. Both are built into the ROM, or the Shell-Segment,
as "resident commands". Should the shell find a variable specification on the
command line, this specification is replaced by the contents of the variable
before the line is fed into the command as input, or used as the command to
be loaded directly.

A variable specification is either a $ sign, followed by alpha-numeric
characters, or a "${" sequence followed by arbitrary characters up to the line
end and a matching "}" brace to end the variable name. The braces themselves
do not count as part of the variable name and are therefore discarded. Should
the variable be undefined, no substitution takes place and the $ followed by
the supposed variable name stands for itself. An alpha-numeric character
is here any digit, any lower or upper case character or any national character
within the code range of 161..255, i.e. hex 0xa1..0xff. Hence,

echo $a
echo ${a}

print the same text if there should be a variable named "a", otherwise both
print the argument literally. Even though the above side effect can be used to
test whether a variable has been set or not, the shell offers a more
orthogonal way to check the existence of a variable, namely by $? followed by
the variable to be tested for existence. This expression returns the string 0 
if the variable is not defined, or 1 otherwise. Hence,

echo $?a
echo $?{a}

generate the same output, 0 if a is undefined and 1 if it is defined. Note
that the ? is part of the $ token, and not part of the variable name.

The second related sequence is $?? which returns 1 only if the variable is
defined as a global variable, i.e. exists as a file within the ENV: device.
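
The forms $name, ${name}, $?name and $??name described so far can be
illustrated by the following Python sketch. It is a model under assumptions,
not the shell's implementation: the function names are hypothetical, local
and global variables are passed in as two dictionaries, local variables are
assumed to take precedence over global ones, and escaping by stars is
ignored.

```python
def is_name_char(c):
    """Digit, ASCII letter, or national character with code 161..255."""
    return (c.isascii() and c.isalnum()) or 161 <= ord(c) <= 255


def read_name(text, i):
    """Return (name, index after the name), or (None, i) if there is none."""
    if text.startswith("{", i):
        end = text.find("}", i + 1)
        return (text[i + 1:end], end + 1) if end != -1 else (None, i)
    j = i
    while j < len(text) and is_name_char(text[j]):
        j += 1
    return (text[i:j], j) if j > i else (None, i)


def expand(text, local_vars, global_vars):
    """Expand variable references in text; undefined ones stay as typed."""
    out, i = [], 0
    while i < len(text):
        if text[i] != "$":
            out.append(text[i])
            i += 1
            continue
        # Count up to two question marks belonging to the $ token.
        j, marks = i + 1, 0
        while marks < 2 and text.startswith("?", j):
            marks, j = marks + 1, j + 1
        name, end = read_name(text, j)
        if name is None:
            out.append("$")  # a $ without a name stands for itself
            i += 1
            continue
        defined = name in local_vars or name in global_vars
        if marks == 2:
            out.append("1" if name in global_vars else "0")
        elif marks == 1:
            out.append("1" if defined else "0")
        elif defined:
            # Assumption: a local variable hides a global one.
            out.append(local_vars.get(name, global_vars.get(name)))
        else:
            out.append(text[i:end])
        i = end
    return "".join(out)
```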

The usage of { } allows rather weird variable names, even names containing
spaces, escape characters, question marks, quotes, ">" or "<". Due to a side
effect of the internal working, ":" does NOT work as part of any variable
name; unfortunately this cannot be fixed since variables are stored by a
device handler that reacts in a special way on ":". Similar restrictions arise
for the slash / that implies a special meaning for global variables. 
(Make a guess how!)

The $ itself can be escaped by an asterisk, hence *$ stands for a single $.
Unfortunately, THIS meaning of escaping is independent of the * concerning the
argument parsing mentioned above, and is therefore valid within and outside 
of quotes, but only in front of a $ sign (and other shell-only tokens like the
back-tick ` and the bracket [, see below). For the same reason, the parsing of
variable names, though not variable expansion, works the same way within and
outside of quotes, except that stars * that are not in front of $ have a
different meaning. Therefore, confusingly,

echo ${*}
echo "${**}"

address the same variable, namely *, whereas both

echo *$
echo "*$"

print the same string $, regardless of the quotes.

The escaping can be escaped with another star as well. Hence,

echo **$a

will print a star, followed by the contents of a. This kind of escaping the
$ sign is independent of the quotation as well and works only in front of $
and related shell-only tokens. Therefore,

echo "**$a"

does exactly the same. More precisely, $ stands for itself with an odd number
of stars in front, and is regarded as a variable expansion command with an
even number of stars. Since variable substitution happens before argument
analysis,

echo "***$a"

will be parsed as "star, star, escaped dollar, a" on expanding variables, and
the resulting string is analyzed as argument of "echo" as "**$a" with the $
meaning of a literal "$". Now, the star escapes the second star within the
quoted argument: echo prints therefore *$a onto the console. Since * is not an
escape character for itself within arguments,

echo ***$a

will print one star more instead, but the interpretation of $ does not change.
Clearly, all this is messy, and can be understood only in the historical
context of the Amiga shell where $ was added after the argument parsing syntax
had been set in stone. The very same rules for escaping apply to ` and [ as
well, for exactly the same reason.

The main idea of the above rules is that $ should not influence argument
parsing "too much", should work outside and inside of quotes, but should be
escapable by the same BCPL syntax. Looking back, a backslash-driven 
and more consistent escaping mechanism would have been more appropriate, of
course.


A second rule formulates how variable expansion interacts with quoting:
Should the $ sequence be within double quotes, and should the variable
contents be quoted as well, the outermost pair of quotes of the variable
contents gets removed. Otherwise, the result could have been a quote within
a quote, undesirably terminating the argument. The following example 
demonstrates this:

set a "*"hi*""		;defines a as "hi", note the escaping of quotes
			;due to a first parsing step before executing the
			;set command.
echo "a is $a"		;results in "a is hi"

Leaving the quotes around $a would have resulted in

echo "a is "hi""

This would be interpreted as two arguments rather than one, the first one
being "a is " and the second one being hi"". 

If this is undesirable, set the local variable "keepdoublequotes" to "on"
by the following command

set keepdoublequotes on

Thereafter, the above command

echo "a is $a"		;results in "a is *"hi*"" and prints a is "hi"
			

will get the quotes escaped properly such that the "echo" command really
prints the text including the quotes of the variable contents.

The very same idea of "double quoting" is applied to variable expansion
outside of quotes, except that the shell now either leaves the possibly
quoted variable contents alone, or adds quoting and escaping to guarantee
that the command reads the quotes as part of its argument. Therefore, the
command

echo $a			;results in "hi" with keepdoublequotes off
			;results in "*"hi*"" with keepdoublequotes on

prints either hi or "hi", depending on whether "keepdoublequotes" is "off" or
"on". In most cases, you might find "off" more attractive as it is backwards
compatible, even though it drops one level of quoting. However, if 
"keepdoublequotes" is set to "on", the shell tries to preserve as much of the
original meaning of the variable, possibly adding escape characters.
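
The four cases discussed above (inside or outside of quotes, with
"keepdoublequotes" off or on) can be summarized by the following Python
sketch. It is a model derived from the examples in this section only; the
function name splice_value is hypothetical. Two assumptions are made: only
double quotes are escaped, and with "keepdoublequotes" on, a value that
contains no quotes is inserted unchanged outside of quotes.

```python
def splice_value(value, inside_quotes, keepdoublequotes):
    """Return the text inserted in place of a variable reference."""
    quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
    if not keepdoublequotes:
        # "off": inside quotes the outermost quote pair of the value is
        # dropped; outside quotes the value is inserted unchanged.
        return value[1:-1] if inside_quotes and quoted else value
    # "on": quotes of the value are escaped so the command gets to see them.
    escaped = value.replace('"', '*"')
    if inside_quotes:
        return escaped
    # Assumption: outside quotes, quoting is only added if it is needed.
    return '"' + escaped + '"' if '"' in value else value
```

With the value "hi" (including the quotes), this reproduces the four results
listed in the comments of the examples above.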


2) Alias substitution: Before the shell is trying to locate a command, it
tries to expand this as an alias. Aliases are command renames with additional
argument relocation, to be defined by the shell-resident "alias" command. On
invocation, the alias is then replaced by the contents of the alias, "shifting
remaining arguments to the right" after inserting a blank. Hence,

alias foo echo a
foo b

will result in the command "echo a b" and the output "a b". With the aid of
the additional alias token [] one can relocate the remaining arguments to
other places, e.g. in front of additional arguments of the alias contents.
No blank is inserted in this case:

alias foo echo [] a
foo b

will move the "b" in place of the [] and therefore print "b a", the opposite
of the above. No blank spaces are inserted in front of the [], this should be
done manually if required. 

alias foo echo []a
foo b

will therefore really print "ba". Unfortunately, the brackets do not try to
analyze the argument syntax for traditional reasons and work therefore only
the first time within the alias contents, i.e.

alias foo echo [] a []

will not work as expected. The second [] is regarded as literal bracket.
Future releases might introduce further refinements of [] that allow finer
control of the arguments.
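
The placement of the remaining arguments can be sketched in Python as
follows. This is an illustration of the rule as described above, with a
hypothetical function name, and it ignores brackets escaped by stars.

```python
def apply_alias(body, rest):
    """Combine an alias body with the remaining arguments of the line."""
    pos = body.find("[]")
    if pos == -1:
        # No placement token: append the arguments after a blank.
        return body + " " + rest
    # Only the first [] is replaced, and no blank is inserted around it.
    return body[:pos] + rest + body[pos + 2:]
```

apply_alias("echo [] a []", "b") returns "echo b a []", showing that the
second pair of brackets is left as literal text.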

Similar to the $ sign, the [] can be escaped by a *, applying the very same
messy rules concerning the counting of stars. However, one should also be
aware that aliases must enter the shell database somehow: This is of course
done by the "alias" command which, by itself, parses its arguments *again*.
Hence, should it be necessary to enclose the "alias body" of the alias command
in double quotes, then all stars MUST BE DOUBLED.

alias bracket "echo **[]"
bracket

will just print [] on the console. The ** is escaped into a * by the argument
parsing of the alias command, and *[] escapes into [] upon expansion of the
alias.

Once again, the same quotation rules as for variable expansion apply: By
setting "keepdoublequotes" to "on" or "off", you may define whether the shell
shall preserve quotes or shall drop one level of quoting.

Alias expansion happens recursively until no aliases are found to expand any
more: If the expansion of an alias happens to be an alias again, then this
alias is expanded, and so on. Since this would imply the possibility of an 
infinite loop of an alias that expands into itself, or into an alias that 
expands into an alias it was expanded from, no alias will be used more than
once within a single command line. Hence, the following alias definition is
legal:

alias hi ho
alias ho echo "ho"

and defines the "hi" command to output "ho", whereas the following alias 
definition is asking for trouble:

alias gnu gnu is not unix

This will first expand "gnu" into "gnu is not unix", and will stop then since
the "gnu" alias has been "used up" already. Since the gnu command cannot be
resolved further - unless of course you provide this command as a binary - an
error message will result.

Expansion of aliases can be prevented completely by enclosing the command in
quotes. "hi", therefore, will not try to use the above hi-alias, but will try
to find a file named "hi". Otherwise, alias definitions take precedence over
ordinary files, i.e. the alias expansion will be applied first when 
applicable, and only if this fails the shell tries to locate an appropriate 
command.
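
The "each alias at most once per command line" rule and the effect of quoting
the command name can be modelled like this. The Python sketch is a
simplification with a hypothetical function name: aliases are a plain
dictionary matched by exact name, only a space separates the command from its
arguments, the [] token is not handled, and the blank is only appended when
arguments follow.

```python
def resolve_aliases(line, aliases):
    """Expand the command name of line until no unused alias matches."""
    used = set()
    while True:
        name, _, rest = line.partition(" ")
        # A quoted command name is never treated as an alias, and an alias
        # that was already used on this line is not expanded again.
        if name.startswith('"') or name not in aliases or name in used:
            return line
        used.add(name)
        line = aliases[name] + (" " + rest if rest else "")
```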


3) Back-ticks: As the very first step of command line interpretation, the shell
will expand all back-tick sequences on this line. The commands enclosed in
back-ticks will be executed, and the resulting output, after a mild
preprocessing step described below, is inserted instead of the back-tick
sequence. The reader should be aware that the resulting command line might be
very long; even though the shell itself has no problem processing long lines,
some commands may.

The back-tick itself is represented by *`, and hence escaped in the very same
way as the $ sign for variable expansion and the [ bracket of the alias
argument placement operator; this goes also for the messy multiple-star
sequences, hence **` is a literal star followed by a back-tick sequence, and
***` a literal star followed by a literal back-tick, and so on - see the syntax
description of $ for all the side effects one should expect.

The command output of the "backtick'd" command is processed before it replaces
the back-tick sequence, though. The shell removes a final terminating line
feed, should it be present, and replaces all other line feeds by blank spaces.
Finally, the shell handles double quotes in the very same way as for $ and []:
If "keepdoublequotes" is "off", quotes resulting from back-tick sequences
within quotes are removed, and no additional quotes are paired around back-tick
expansions resulting in quoted strings. If "keepdoublequotes" is "on",
however, the shell tries to preserve the meaning of the string as closely as
possible and adds escaping stars to preserve quotes within the command output.
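
The line feed handling of the command output is simple enough to be stated as
code. The following Python sketch covers only that part (the quote handling
depending on "keepdoublequotes" is left out); the function name is
hypothetical.

```python
def prepare_backtick_output(output):
    """Turn the output of a back-ticked command into a single line."""
    # A final terminating line feed is removed, should it be present.
    if output.endswith("\n"):
        output = output[:-1]
    # All other line feeds become blank spaces.
    return output.replace("\n", " ")
```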

The body of the backtick'd sequence is considered to be a separate command
line and is therefore parsed separately. Especially, quotes outside a back-tick
sequence can never match any quote within, and do not influence the meaning of
the back-tick sequence as far as escape characters are concerned. Hence, in the
following two command lines

echo `echo *.`
echo "`echo *.`"

the star stands for itself, even for the second line. It is not considered to
be "within" the surrounding double quotes and is therefore not treated as an
escape character. However, the reader should be aware of the * as an escape
sequence for the back-tick itself, as mentioned in the first paragraph! The two
commands

echo `echo *`
echo "`echo *`"

will NOT print a single star since the star in front of the second back-tick
escapes this back-tick and hence disables the expansion of the back-tick
sequence at all. One correct way to print a single star by this admittedly
complex method would be 

echo `echo **`

whereas 

echo `echo ***`

prints `echo **` with one star less, namely the star that prevented the last
back-tick from being part of a back-tick sequence. This is again the "count
the stars" rule we have been observing for $ and [].

Needless to say, back-tick expansion works also within the prompt, and for the
command itself. 

`echo type` s:Startup-Sequence

is a rather complex and weird way to display the startup sequence on the
console window, and

prompt "*`date*` %N.%S> "

prints the date in front of the typical CLI number/directory prompt indicator.
The command to be executed within the prompt had better be fast and small,
of course! The reason for the asterisks in front of the back-ticks is left as
a little exercise for the alert reader.


4) Input/Output/Error redirection: Each command to be executed inherits three
I/O streams from the shell: an input stream user input is read from, an output
channel the command prints output to, and a diagnostic channel error codes
should appear on. Unfortunately, the consistent use of the latter has never
been very popular and is not even supported to the full extent by various OS
functions; the PrintFault() dos.library call, for instance, does NOT print
to the diagnostic output. Nevertheless, all three streams can be redirected
by appropriate shell constructions:

An unquoted > followed by a file name defines the output stream, a < plus file
name the input stream and a *> the diagnostic output channel. Should these
sequences be required as part of any argument for the command, they should be
enclosed in double quotes. Should the shell find more than one similar
redirection on the command line, all additional redirections are regarded as
literal strings. Hence,

echo >t:out >NIL:

prints the string ">NIL:" into the file "t:out". Further, the shell applies
special rules to the commands "alias", "run" and "pipe" leaving the parsing of
the redirection tokens to the command itself. This has the effect that, for
example, the following alias

alias foo echo foo >t:out

defines the "foo" command to print the string "foo" into the "t:out" file,
rather than printing the (empty) result of the alias command into that file.
Similar considerations apply to "run" and "pipe" as well, of course.

Furthermore, should the variable "oldredirect" be set to "on", then the shell
will only try to interpret redirection tokens immediately following the
command name; otherwise, the position of the redirection tokens does not
matter.

echo "foo" >t:out

will either print "foo" to the file "t:out", or will print "foo >t:out" if
"oldredirect" is set to "on". This is a backwards compatibility feature for
pre V37 shell scripts.
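
The handling of repeated redirections and of "oldredirect" can be illustrated
by a Python sketch. It is deliberately limited and rests on assumptions: only
the three basic tokens >, < and *> are modelled, the file name is assumed to
follow the token without a blank, the line is split at blanks without regard
to quoted arguments containing spaces, and the function name is hypothetical.

```python
import re

# A basic redirection token directly followed by a file name.
REDIRECT = re.compile(r"(\*>|>|<)([^<>].*)")
STREAM = {">": "output", "<": "input", "*>": "error"}


def split_redirections(line, oldredirect=False):
    """Return (command, arguments, redirections) for a non-empty line."""
    tokens = line.split()
    command, rest = tokens[0], tokens[1:]
    streams, args = {}, []
    for tok in rest:
        m = REDIRECT.fullmatch(tok)
        # With oldredirect, tokens count only directly after the command.
        allowed = not (oldredirect and args)
        if m and allowed and STREAM[m.group(1)] not in streams:
            streams[STREAM[m.group(1)]] = m.group(2)
        else:
            # Duplicates and everything else stay ordinary arguments.
            args.append(tok)
    return command, args, streams
```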

Needless to say, the file name argument of >, < and *> must be quoted if it
contains spaces, and can be as well a variable specification by means of $,
or a back-tick sequence. Therefore,

set t t:out
echo foo >$t

and

echo foo >`echo t:out`

both write "foo" into the file "t:out". Besides these three tokens, the
shell also understands the following additional redirection tokens:

>> appends the output to a file, or creates it should it not yet exist.

<> redirects both input and output into the same file. However, since this
implies that the input stream must be "cloned", the "file" must be either
NIL: or any kind of interactive device, e.g. CON: or RAW:. An ordinary file
cannot be written to and read from at the same time.

*>> works much the same way as >> does, but just for the diagnostic output:
It either creates the file, or appends to an existing stream. 

*>< redirects the diagnostic output into the same file as > does. 
Unfortunately, this has to be the default anyhow for compatibility reasons
beyond logic, and is explained below in more detail. *<> is NOT equivalent
to *>< and currently a reserved token.

Last but not least, the shell also understands a very special input
redirection using the << token. This token takes a string as argument and
scans the shell input - typically a batch file containing the << redirection -
up to the specified string and pastes the result into the stdin of the
command. The following example may enlighten the reader a bit:

ask "Enter a number:" numeric to num
echo $num

asks the user for a number - to be entered on the input stream of course - 
and then prints this number. The following silly script will define this
number to be 42:

ask "Enter a number:" numeric to num <<end
42
end
echo $num

The lines between "<<end" and the "end" token form the input of the "ask"
command. This may seem rather useless here, but may become important if
a command expects rather lengthy user input on the command line and the
command shall be run as part of a shell script. Due to the way << works,
though, this construction fails for commands launched by the "run" command;
instead, usage of the "&" token described below is recommended because it
knows how to deal with this kind of input redirection.
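
How the << token consumes the following lines of a script can be sketched in
Python. This is a model with a hypothetical function name; it assumes that
the terminating line must match the given string exactly and that a missing
terminator consumes the rest of the script.

```python
def take_inline_input(lines, index, terminator):
    """Collect the input for a << redirection found on lines[index].

    Returns the text for the command's stdin and the index of the next
    script line the shell would continue with.
    """
    collected = []
    i = index + 1
    while i < len(lines) and lines[i] != terminator:
        collected.append(lines[i])
        i += 1
    stdin_text = "".join(entry + "\n" for entry in collected)
    return stdin_text, min(i + 1, len(lines))
```

For the script above, the call take_inline_input(script, 0, "end") delivers
the single input line "42" and continues at the "echo $num" line.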


Finally, some words concerning the diagnostic output redirection:
Unfortunately, this is not widely used, if it is used at all. Instead, 
commands either print error messages into their ordinary output, or into a
file opened as "*", i.e. the console. The first case is of course out of
control of the shell, but the second case is taken care of: Should the
diagnostic output be some kind of interactive stream, say a console handler,
then *> sets the terminal console task of the command as well, and hence
redirects the "*" output into the console specified by *>.

Unfortunately, it is quite typical to detach commands by means of "<>NIL:"
from the console. Without further care, this would NOT redirect stderr
anymore and would therefore break existing shell scripts. Hence, some
messy compatibility rules for *> redirection have been established:

Without *>	The diagnostic error stream goes into the output
		stream, the command console goes into the current
		console. A "run" command would clear the current console
		anyhow, and would hence detach the command from the terminal.

With *>		Diagnostic output will go into the specified file,
		the command console is the current console unless
		the specified file is interactive or NIL:. In the
		first case, the program console is set to the 
		controlling console of the stream, in the second
		no terminal console is provided.

With *><	Diagnostic output will go into the output stream,
		the command console will be set to the console
		controlling the output stream should this be
		interactive, or will be cleared in case the output goes
		to NIL:

Similar rules apply if the command is run in background by means of the "&"
token, except that there is no default command console unless one is provided
by means of *> or *><. This makes it a bit easier to detach commands from the
shell by using "&" and "<>NIL:". 
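
The compatibility rules above amount to a small decision table, which the
following Python sketch tries to capture. It is an interpretation of the
table, not the shell's code: the function name and the descriptive result
strings are hypothetical, stream kinds are given as 'interactive', 'nil' or
'file', and the case of *>< with the output going into an ordinary file is
assumed to keep the default console.

```python
def route_diagnostics(err, err_kind=None, out_kind="file", background=False):
    """Return (diagnostic stream, command console) for one command.

    err is None (no *> on the line), '*>' or '*><'.
    """
    # Commands started with "&" have no default command console.
    default_console = None if background else "current console"
    if err is None:
        return "output stream", default_console
    if err == "*>":
        kind, stream = err_kind, "specified file"
        owner = "console of the specified stream"
    elif err == "*><":
        kind, stream = out_kind, "output stream"
        owner = "console of the output stream"
    else:
        raise ValueError("unknown redirection token: " + err)
    if kind == "interactive":
        return stream, owner
    if kind == "nil":
        return stream, None  # no terminal console is provided
    return stream, default_console
```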


5) The "&" token: Should the shell find a & sign on the command line that is
surrounded by blank spaces and not quoted, then this sign is removed from
the command line and the command gets detached from the shell very similar to
what "run" would do.

By default, these commands get a new separate console, and a new console owner
should the console driver support this concept (ViNCEd does, the ROM CON: 
handler does not) and hence make use of the "job control feature" of the
console.

These commands will keep the console window locked open unless you explicitly
redirect the command input and output to NIL: - this is simply because
commands launched in this way are handled very similar to commands launched
by "run".

There is, however, one important difference between "run" and "&": Unlike
"run", the "&" sign does not detach the command to be run from the standard
input. Hence, "ask &" will still ask for input on the console, "run ask" will
not. Furthermore, "run" doesn't mix with << input redirection, "&" does. All
this is because the "&" sign is handled by the shell directly and is much
better integrated into the internal wiring than "run" could.


6) The "+" token: Should a command line end on a single + sign, the shell
waits for additional input and concatenates this input with the previous
command input, including the line feed, but excluding the + sign itself.

However, this second line is not parsed as a shell command line, but is
rather placed on the command arguments of the command of the previous line.
Since the ROM ReadArgs() and ReadItem() functions only parse up to the first
line feed character, this additional line is typically lost and is only of
interest for special commands like "run" that read their arguments in a
non-standard way.
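
The joining of lines can be illustrated by a short Python sketch. It is a
model with a hypothetical function name and two assumptions: "a single +
sign" is taken to mean that the line ends with + but not with ++, and the
appended line is never checked for a further + of its own, since it is not
parsed as a shell command line.

```python
def join_continuations(lines):
    """Join each line ending on a single + with the line following it."""
    joined = []
    it = iter(lines)
    for line in it:
        if line.endswith("+") and not line.endswith("++"):
            # Drop the + itself, keep the line feed, append the next line.
            line = line[:-1] + "\n" + next(it, "")
        joined.append(line)
    return joined
```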


7) The ";" token: If found unquoted on the command line, this token introduces
a comment. All characters following the semicolon, and the semicolon itself
are removed from the command arguments.


8) Loading of commands: Besides aliases that have been discussed above, the
shell knows also the following types of commands:

The PIPE command, which is used as soon as one of the pipe tokens is found on
the command line, resident (or so-called "built-in", though this nomenclature
is sort of incorrect) commands, implicit commands, commands on the command
search path, and commands within the current directory.

Should the shell find that the user defined either the _pchar or the _mchar
local variables, then the command line is scanned for the contents of these
variables. For historical reasons, both variables should contain strings of at
most two characters in size.

If these characters are found, then the shell assumes an implicit "PIPE"
command that is prepended to the full command line before further parsing.

Resident commands are found on the dos.library "resident command" list and
are program segments similar to those generated by the LoadSeg() function.
Unlike the latter, they do not get a "fresh" copy of their segments each
time but are assumed not to touch their code and data space; hence, not all
commands should be made resident, as not all of them are "pure". Pure commands
are usually in ROM space, or can be placed onto the list of residents by the
"resident" command, which, by itself, is resident. The "resident" command
checks the "p" bit of the command file to decide whether a certain command can
be made resident or not. Since the "p" bit can be set by hand, this is not
much of a safety feature, though. The original "arp" ("Amiga Replacement
Project") commands, to which the resident feature goes back, had an additional
check concerning the purity of commands.

If a command is neither resident nor an alias, the shell tries to find it on
the command path list, which is controlled by the "path" command. If an
executable binary load file is found, this is assumed to be the command;
should the "h" = "hold" bit of this executable be set as well, the command is
made resident as it is loaded. The same care as for the "p" bit should 
apply to the "h" bit as well.

If the file in question is not executable, but the "s" bit is set, then the
first two characters of the file are investigated. If they are "/*", hence a
Rexx-comment introducer, a rexx script is assumed and an implicit "RX" command
is prepended to the argument line. If they are "#!" or ";!", the first line
of the script is assumed to be the file name of a script interpreter. This
name must be enclosed in double quotes should it contain blanks. The script
name itself gets then inserted BETWEEN the command interpreter file name, and
its options on this line.

Therefore, the following command interpreter specification on the first line
of the file "foo"

#! type HEX

will run the following text of foo thru the interpreter "type" with the option
"HEX", as if 

type foo HEX

had been entered on the command line. Note that the HEX option is placed
behind the "script" file name, and not in front of it.
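
The construction of the command line from the first line of a script can be
sketched as follows. The Python function below is an illustration with a
hypothetical name; it only covers what is described above (an optionally
quoted interpreter name, followed by options) and does not deal with
arguments given to the script itself.

```python
def interpreter_command(first_line, script_name):
    """Build the command line for a script starting with #! or ;!."""
    if first_line[:2] not in ("#!", ";!"):
        return None
    spec = first_line[2:].strip()
    if spec.startswith('"'):
        # The interpreter name is quoted because it contains blanks.
        end = spec.index('"', 1) + 1
        interpreter, options = spec[:end], spec[end:].strip()
    else:
        interpreter, options = (spec.split(None, 1) + ["", ""])[:2]
    # The script name goes BETWEEN the interpreter and its options.
    parts = (interpreter, script_name, options)
    return " ".join(part for part in parts if part)
```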

If neither a "/*", a "#!" nor a ";!" is found, the script is assumed to be a
shell script and run thru the "Execute" command. Curiously, "Execute" expands
only the arguments and generates a temporary script file, but the actual
command interpretation of this file is left to the shell again by making use
of a not-so-well documented side effect of the shell segment.

If the found file is neither a script, nor executable, the shell checks for
the variable "VIEWER". Should it be found, then the shell further checks
whether the file is any known data type by using the datatypes library. If so,
then the contents of this variable is understood as a command that should be
run to display this file. For this, the contents of the VIEWER variable is
placed in front of the command line before further interpretation. Typically,
VIEWER should be set to the MultiView command, of course:

Setenv SAVE VIEWER MultiView


Finally, if the "file" is in fact a directory, the implicit command "CD" is
assumed and inserted in front of the command line.
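
The choice of the implicit command for a file found on the search path can be
summarized by the following Python sketch. It is a condensed reading of the
rules in this section, with hypothetical names; the caller is assumed to have
determined the file properties already, and the case where none of the rules
applies is simply reported as None.

```python
def implicit_command(is_dir, executable, script_bit, head,
                     viewer=None, known_datatype=False):
    """Return the command put in front of the line, or None if none fits.

    head holds the first two characters of the file.
    """
    if is_dir:
        return "CD"
    if executable:
        return ""  # the binary itself is the command
    if script_bit:
        if head == "/*":
            return "RX"  # Rexx script
        if head in ("#!", ";!"):
            return "INTERPRETER"  # taken from the first line of the script
        return "Execute"  # ordinary shell script
    if viewer and known_datatype:
        return viewer  # contents of the VIEWER variable
    return None
```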


The shell can be restricted to search on the current directory only by
enclosing the command within double quotes; this will remove the resident
command list and the command path from the search targets the shell will scan
for commands.


9) Miscellaneous: The shell scans command segments, except those of INTERNAL
commands, for the "magic cookie"

$STACK:

Should it be found, then a decimal string terminated by a line feed (!) is
expected. This string is then regarded as a request to enlarge the stack to
at least the specified number of bytes.

Therefore, the following string near the startup code of your program will
guarantee at least 10240 bytes of stack:

char stack[]="$STACK:10240\n";

Note that the "\n" is necessary and that "STACK" is in capitals.
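
To illustrate why the line feed matters, here is a small Python sketch of how
such a cookie could be located in the bytes of a segment. It is a model only,
with a hypothetical function name; it looks at the first occurrence of the
cookie and accepts nothing but decimal digits up to the line feed, which may
be stricter than what the shell actually does.

```python
def find_stack_request(segment):
    """Return the stack size requested by a $STACK: cookie, or None."""
    cookie = b"$STACK:"
    pos = segment.find(cookie)
    if pos == -1:
        return None
    start = pos + len(cookie)
    end = segment.find(b"\n", start)
    if end == -1:
        return None  # without the line feed the request is not recognized
    digits = segment[start:end]
    return int(digits) if digits.isdigit() else None
```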