Amiga-OS Shell Documentation

Abstract: This document describes the command line syntax and features of the Amiga-OS Shell. Specifically, the shell described hereafter is the V45 shell of BB2. Not all of it applies to the V40 release of the shell.
______________________________________________________________________________

The purpose of the Amiga-OS shell is to parse its input stream for commands, then locate these commands either on disk or in memory, and execute them. An input stream is typically the stream of a console handler, for example a CON: window or a ViNCEd window. Other input streams are command files, or "batch files" as they are sometimes called. The most popular batch file is of course the "Startup-Sequence" in the S: assign.

The input stream is read line by line, that is, up to a terminating line feed character (or the end of file, where the shell generates this line feed implicitly). Should the input be read from a console window, this line feed is of course generated by the console handler as soon as the user presses the return key.

The input line is then scanned for the command itself and for the input, output and error redirections, which define the streams the command will read its input from, print to, and write its diagnostic output to. Furthermore, the shell also handles the expansion of back-tick sequences, aliases and variables. Special features of the shell are the "&" character and the "+" sign at the end of a line, as well as the comment introducer ";". All this is described in more detail below.

Even though the shell parses the command line, it does not prepare argument lists for the command to be executed; the command line is scanned for shell specific control features like input/output redirection or back-ticks, and the remaining input is forwarded as is to the command. It is up to the command itself to parse this input string a second time.
This has the unfortunate side effect that the shell has to second-guess the syntax expected by the command in question; for the time being, the shell assumes the following syntax, which - surprise, surprise - coincides with the syntax of the ROM functions ReadArgs() and ReadItem() used by most commands anyhow:

This pair of functions knows only two special characters - or maybe four, depending on how you count: blanks, i.e. the space character and the TAB, which are considered equivalent, the double quote " and the asterisk * as the BCPL style escape character.

A command argument is to be understood as a sequence of non-blank characters up to the line feed, unless the blanks and the remaining argument are enclosed in double quotes. Interestingly, this goes only for quotes following a blank space. Quotes within an argument, meaning not following a blank, are not counted as argument separators. Hence,

    list abc
    list ab"cd"ef
    list abcd"

are all three attempts to run "list" with one single argument in which the quote stands for itself.

Within quoted arguments, and only there(!), the escape character * gets its special meaning. Outside of quotes, the star stands for itself. Four *-escaped sequences exist: ** is the star itself, *" the double quote, *N a line feed character and *E an "escape" character of ASCII code 27 = hex 0x1B. Therefore,

    echo "**"
    echo **

print one star and two stars, respectively.

Contrary to other claims, the sequence "*E[" is not directly related to the "csi" character of code 155 = 0x9b; as far as argument parsing is concerned, it is regarded as "escape [". It is merely the console driver that interprets the latter as a substitute for "csi"; neither ReadItem() nor ReadArgs() does.

Even though these are all the syntax rules as far as the ROM argument parsing routines are concerned, the shell itself parses its input line as well, and performs various operations on this line before it is provided as input to the command to be executed.
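The argument rules described above can be illustrated by a small model. The following Python sketch is not the ROM code; split_arguments is a hypothetical helper that merely mimics the rules as stated in this document. It assumes that the letters of *N and *E are accepted in either case, and that any other character following a star inside quotes stands for itself; neither detail is stated here.

```python
# Rough model of the ReadItem()-style rules described above (not the ROM code).
ESCAPES = {"N": "\n", "E": "\x1b"}   # *N = line feed, *E = escape (0x1B)

def split_arguments(line):
    """Split a command line into arguments: blanks separate, quotes group."""
    args = []
    i, n = 0, len(line)
    while i < n:
        # Space and TAB are equivalent separators.
        while i < n and line[i] in " \t":
            i += 1
        if i >= n or line[i] == "\n":
            break
        if line[i] == '"':
            # A quote at the start of an argument opens a quoted argument.
            i += 1
            buf = []
            while i < n and line[i] not in '"\n':
                if line[i] == "*" and i + 1 < n:
                    # Assumption: unknown escapes yield the next character.
                    nxt = line[i + 1]
                    buf.append(ESCAPES.get(nxt.upper(), nxt))
                    i += 2
                else:
                    buf.append(line[i])
                    i += 1
            i += 1                      # skip the closing quote
            args.append("".join(buf))
        else:
            # Unquoted: stars and embedded quotes stand for themselves.
            start = i
            while i < n and line[i] not in " \t\n":
                i += 1
            args.append(line[start:i])
    return args
```

With this model, split_arguments('list ab"cd"ef') yields the command name and one single argument that still contains both quotes, and the quoted ** collapses into one star while the unquoted ** stays as two.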
The first step is back-tick substitution; then the first argument according to the above rules is parsed off, ignoring leading blanks. Alias substitution follows next. As soon as the alias expansion fails to expand further, the shell checks whether the command name is a shell variable to be substituted, and then tries to locate the command to be executed. This may be either an explicit command found in the current directory, the command search path or the resident list, or an implicit command for script files, non-executable files or directories. The last step parses the command arguments for shell variables to be substituted, and for input/output redirection.

All these steps are described in more detail in the following sections.

1) Variable substitution:

Local or global variables are set by means of the "set" or "setenv" commands. Both are built into the ROM, or the Shell-Segment, as "resident commands". Should the shell find a variable specification on the command line, this specification is replaced by the contents of the variable before the line is fed into the command as input, or used as the command to be loaded directly.

A variable specification is either a $ sign followed by alpha-numeric characters, or a "${" sequence followed by arbitrary characters and a matching "}" brace on the same line that ends the variable name. The braces themselves do not count as part of the variable name and are therefore discarded. Should the variable be undefined, no substitution takes place, and the $ followed by the supposed variable name stands for itself.

An alpha-numeric character is here any digit, any lower or upper case character, or any national character within the code range of 161..255, i.e. hex 0xa1..0xff. Hence,

    echo $a
    echo ${a}

print the same text if there is a variable named "a"; otherwise both print the argument literally.
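The two forms of a variable specification can be modelled in a few lines. The following Python sketch is an illustration only: expand_variables and is_name_char are hypothetical names, the variables live in a plain dictionary instead of the local and ENV: stores, the lookup is done by exact name, and neither star escaping nor the quoting rules are modelled.

```python
def is_name_char(ch):
    """Digit, ASCII letter, or national character in the range 161..255."""
    return (ch.isascii() and ch.isalnum()) or 0xA1 <= ord(ch) <= 0xFF

def expand_variables(line, variables):
    """Replace $name and ${name}; undefined variables stay literal."""
    out = []
    i = 0
    while i < len(line):
        name, stop = None, i + 1
        if line.startswith("${", i):
            # Braced form: arbitrary characters up to the matching brace.
            end = line.find("}", i + 2)
            if end != -1:
                name, stop = line[i + 2:end], end + 1
        elif line[i] == "$":
            # Plain form: the name is a run of alpha-numeric characters.
            j = i + 1
            while j < len(line) and is_name_char(line[j]):
                j += 1
            if j > i + 1:
                name, stop = line[i + 1:j], j
        if name is not None and name in variables:
            out.append(variables[name])
            i = stop
        else:
            # Not a defined variable: the character stands for itself.
            out.append(line[i])
            i += 1
    return "".join(out)
```

For example, expand_variables("echo $a ${a}", {"a": "hi"}) gives "echo hi hi", whereas the same line with an empty dictionary is returned unchanged.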
Even though the fact that undefined variables are left unexpanded can be used to test whether a variable has been set, the shell offers a more orthogonal way to check for the existence of a variable, namely $? followed by the name of the variable to be tested. This expression returns the string 0 if the variable is not defined, and 1 otherwise. Hence,

    echo $?a
    echo $?{a}

generate the same output: 0 if a is undefined and 1 if a is defined. Note that the ? is part of the $, and not part of the variable name. The second related sequence is $??, which returns 1 only if the variable is defined as a global variable, i.e. exists as a file within the ENV: device.

The usage of { } allows rather weird variable names, even names containing spaces, escape characters, question marks, quotes, ">" or "<". Due to a side effect of the internal workings, ":" does NOT work as part of any variable name; unfortunately this cannot be fixed, since variables are stored by a device handler that reacts in a special way to ":". Similar restrictions arise for the slash /, which has a special meaning for global variables. (Make a guess how!)

The $ itself can be escaped by an asterisk, hence *$ stands for a single $. Unfortunately, THIS meaning of escaping is independent of the * used in the argument parsing mentioned above, and is therefore valid within and outside of quotes, but only in front of a $ sign (and other shell-only tokens like the back-tick ` and the bracket [, see below). For the same reason, the parsing of variable names, though not variable expansion, works the same way within and outside of quotes, except that stars * that are not in front of $ have a different meaning. Therefore, confusingly,

    echo ${*}
    echo "${**}"

address the same variable, namely *, whereas both

    echo *$
    echo "*$"

print the same string $, regardless of the quotes.

The escaping can be escaped with another star as well. Hence,

    echo **$a

will print a star, followed by the contents of a.
This way of escaping the $ sign is independent of quotation as well, and works only in front of $ and related shell-only tokens. Therefore,

    echo "**$a"

does exactly the same. More precisely, $ stands for itself with an odd number of stars in front of it, and is regarded as a variable expansion command with an even number of stars. Since variable substitution happens before argument analysis,

    echo "***$a"

will be parsed as "star, star, escaped dollar, a" on expanding variables, and the resulting string is analyzed as the argument of "echo" as "**$a", with the $ being a literal "$". Now the star escapes the second star within the quoted argument, and echo therefore prints *$a onto the console. Since * is not an escape character outside of quoted arguments,

    echo ***$a

will print one star more instead, but the interpretation of $ does not change.

Clearly, all this is messy, and can be understood only in the historical context of the Amiga shell, where $ was added after the argument parsing syntax had been set in stone. The very same escaping rules apply to ` and [ as well, for exactly the same reason. The main idea of the above rules is that $ should not influence argument parsing "too much", should work outside and inside of quotes, but should be escapable by the same BCPL syntax. Looking back, a backslash-driven and more consistent escaping mechanism would have been more appropriate, of course.

A second rule formulates how variable expansion interacts with quoting: Should the $ sequence be within double quotes, and should the variable contents be quoted as well, the outermost pair of quotes of the variable contents gets removed. Otherwise, the result could have been a quote within a quote, undesirably terminating the argument. The following example demonstrates this:

    set a "*"hi*""    ;defines a as "hi", note the escaping of quotes
                      ;due to a first parsing step before executing the
                      ;set command.
    echo "a is $a"    ;results in "a is hi"

Leaving the quotes around $a would have resulted in

    echo "a is "hi""

This would be interpreted as two arguments rather than one, the first one being "a is " and the second one being hi"". If dropping the quotes is undesirable, set the local variable "keepdoublequotes" to "on" by the following command:

    set keepdoublequotes on

Thereafter, the above command

    echo "a is $a"    ;results in "a is *"hi*""
                      ;and prints a is "hi"

will get the quotes escaped properly, such that the "echo" command really prints the text including the quotes of the variable contents.

The very same idea of "double quoting" is applied to variable expansion outside of quotes, except that the shell now either leaves the possibly quoted variable contents alone, or adds quoting and escaping to guarantee that the command reads the quotes as part of its argument. Therefore, the command

    echo $a    ;results in "hi" with keepdoublequotes off
               ;results in "*"hi*"" with keepdoublequotes on

prints either hi or "hi", depending on whether "keepdoublequotes" is "off" or "on". In most cases, you might find "off" more attractive as it is backwards compatible, even though it drops one level of quoting. However, if "keepdoublequotes" is set to "on", the shell tries to preserve as much of the original meaning of the variable as possible, possibly adding escape characters.

2) Alias substitution:

Before the shell tries to locate a command, it tries to expand it as an alias. Aliases are command renames with additional argument relocation, to be defined by the shell-resident "alias" command. On invocation, the alias is replaced by its contents, "shifting the remaining arguments to the right" after inserting a blank. Hence,

    alias foo echo a
    foo b

will result in the command "echo a b" and the output "a b".

With the aid of the additional alias token [] one can relocate the remaining arguments to other places, e.g. in front of additional arguments of the alias contents.
No blank is inserted in this case:

    alias foo echo [] a
    foo b

will move the "b" in place of the [] and therefore print "b a", the opposite of the above. No blank spaces are inserted in front of the []; this should be done manually if required.

    alias foo echo []a
    foo b

will therefore really print "ba".

Unfortunately, for traditional reasons the brackets do not try to analyze the argument syntax, and therefore work only the first time they appear within the alias contents, i.e.

    alias foo echo [] a []

will not work as expected. The second [] is regarded as a literal pair of brackets. Future releases might introduce further refinements of [] that allow finer control of the arguments.

Similar to the $ sign, the [] can be escaped by a *, applying the very same messy rules concerning the counting of stars. However, one should also be aware that aliases must enter the shell database somehow: This is of course done by the "alias" command which, by itself, parses its arguments *again*. Hence, should it be necessary to enclose the "alias body" of the alias command in double quotes, then all stars MUST BE DOUBLED.

    alias bracket "echo **[]"
    bracket

will just print [] on the console. The ** is escaped into a * by the argument parsing of the alias command, and *[] escapes into [] upon expansion of the alias.

Once again, the same quotation rules as for variable expansion apply: By setting "keepdoublequotes" to "on" or "off", you may define whether the shell shall preserve quotes or shall drop one level of quoting.

Alias expansion happens recursively until no aliases are found to expand any more: If the expansion of an alias happens to be an alias again, then this alias is expanded, and so on. Since this would imply the possibility of an infinite loop of an alias that expands into itself, or into an alias that expands into an alias it was expanded from, no alias will be used more than once within a single command line.
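The expansion rules of this section, including the "each alias only once" rule, can be sketched as follows. This Python model is a simplification under stated assumptions: expand_alias is a hypothetical helper, aliases are kept in a plain dictionary and looked up by exact name, the command name is separated from its arguments at the first space only, and neither star escaping of [] nor the quoting rules are modelled.

```python
def expand_alias(line, aliases):
    """Expand the leading command name through the alias table."""
    used = set()                      # aliases already used on this line
    while True:
        stripped = line.lstrip(" \t")
        name, _, rest = stripped.partition(" ")
        if name not in aliases or name in used:
            # No alias, or alias "used up": expansion stops here.
            return stripped
        used.add(name)
        body = aliases[name]
        if "[]" in body:
            # Only the first [] is replaced, and no blank is inserted.
            line = body.replace("[]", rest, 1)
        elif rest:
            # Default: append the remaining arguments after a blank.
            line = body + " " + rest
        else:
            line = body
```

Applied to the examples of this section, "foo b" becomes "echo a b", "echo b a" or "echo ba" depending on the alias body, and an alias that refers to itself is expanded exactly once.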
Since no alias is used twice within one command line, the following alias definition is legal:

    alias hi ho
    alias ho echo "ho"

and defines the "hi" command to output "ho", whereas the following alias definition is asking for trouble:

    alias gnu gnu is not unix

This will first expand "gnu" into "gnu is not unix", and will then stop since the "gnu" alias has been "used up" already. Since the gnu command cannot be resolved further - unless of course you provide this command as a binary - an error message will result.

Expansion of aliases can be prevented completely by enclosing the command in quotes. "hi", therefore, will not try to use the above hi-alias, but will try to find a file named "hi". Otherwise, alias definitions take precedence over ordinary files, i.e. the alias expansion will be applied first when applicable, and only if this fails does the shell try to locate an appropriate command.

3) Back-ticks:

As the very first step of command line interpretation, the shell expands all back-tick sequences on the line. The commands enclosed in back-ticks are executed, and the resulting output, after a mild preprocessing step described below, is inserted in place of the back-tick sequence. The reader should be aware that the resulting command line might be very long; even though the shell itself has no problem processing long lines, some commands may.

The back-tick itself is represented by *`, and hence escaped in the very same way as the $ sign for variable expansion and the [ bracket of the alias argument placement operator; this goes also for the messy multiple-star sequences, hence **` is a literal star followed by a back-tick sequence, and ***` a literal star followed by a literal back-tick, and so on - see the syntax description of $ for all the side effects one should expect.

The output of the "backtick'd" command is processed before it replaces the back-tick sequence, though. The shell removes a final terminating line feed, should it be present, and replaces all other line feeds by blank spaces.
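Two mechanical rules of this section lend themselves to a short illustration: the star-parity test that decides whether a back-tick (or a $ or [) is escaped, and the line feed clean-up of the command output. The following Python sketch uses the hypothetical helper names is_escaped and flatten_output; it only decides whether a token is escaped and deliberately does not model which stars remain in the line afterwards.

```python
def is_escaped(line, pos):
    """True if the token at line[pos] is preceded by an odd number of stars."""
    stars = 0
    while pos - stars - 1 >= 0 and line[pos - stars - 1] == "*":
        stars += 1
    return stars % 2 == 1

def flatten_output(text):
    """Drop one final line feed, turn all remaining line feeds into blanks."""
    if text.endswith("\n"):
        text = text[:-1]
    return text.replace("\n", " ")
```

For instance, the back-tick in "**`" is not escaped and therefore starts a back-tick sequence, while the one in "***`" is literal; an output of two lines ends up as two words on one line.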
The shell also handles double quotes in the command output in the very same way as for $ and []: If "keepdoublequotes" is "off", quotes resulting from back-tick sequences within quotes are removed, and no additional quotes are placed around back-tick expansions resulting in quoted strings. If "keepdoublequotes" is "on", however, the shell tries to preserve the meaning of the string as closely as possible and adds escaping stars to preserve quotes within the command output.

The body of the backtick'd sequence is considered to be a separate command line and is therefore parsed separately. In particular, quotes outside a back-tick sequence can never match any quote within, and do not influence the meaning of the back-tick sequence as far as escape characters are concerned. Hence, in the following two command lines

    echo `echo *.`
    echo "`echo *.`"

the star stands for itself, even in the second line. It is not considered to be "within" the surrounding double quotes and is therefore not treated as an escape character.

However, the reader should be aware of the * as an escape character for the back-tick itself, as mentioned above! The two commands

    echo `echo *`
    echo "`echo *`"

will NOT print a single star, since the star in front of the second back-tick escapes this back-tick and hence disables the expansion of the back-tick sequence altogether. One correct way to print a single star by this admittedly complex method would be

    echo `echo **`

whereas

    echo `echo ***`

prints `echo **` with one star less, namely the star that prevented the last back-tick from being part of a back-tick sequence. This is again the "count the stars" rule we have been observing for $ and [].

Needless to say, back-tick expansion also works within the prompt, and for the command itself.

    `echo type` s:Startup-Sequence

is a rather complex and weird way to display the startup sequence on the console window, and

    prompt "*`date*` %N.%S> "

prints the date in front of the typical CLI number/directory prompt indicator.
The command to be executed within the prompt had better be fast and small, of course! The reason for the asterisks in front of the back-ticks is left as a little exercise for the alert reader.

4) Input/Output/Error redirection:

Each command to be executed inherits three I/O streams from the shell: an input stream that user input is read from, an output channel the command prints its output to, and a diagnostic channel that error messages should appear on. Unfortunately, the consistent use of the latter has never been very popular and is not even supported to the full extent by various OS functions; the PrintFault() dos.library call, for instance, does NOT print to the diagnostic output.

Nevertheless, all three streams can be redirected by appropriate shell constructions: An unquoted > followed by a file name defines the output stream, a < followed by a file name the input stream, and a *> followed by a file name the diagnostic output channel. Should these sequences be required as part of an argument for the command, they should be enclosed in double quotes. Should the shell find more than one redirection of the same kind on the command line, all additional redirections are regarded as literal strings. Hence,

    echo >t:out >NIL:

prints the string ">NIL:" into the file "t:out".

Further, the shell applies special rules to the commands "alias", "run" and "pipe", leaving the parsing of the redirection tokens to the command itself. This has the effect that, for example, the following alias

    alias foo echo foo >t:out

defines the "foo" command to print the string "foo" into the "t:out" file, rather than printing the (empty) result of the alias command into that file. Similar considerations apply to "run" and "pipe" as well, of course.

Furthermore, should the variable "oldredirect" be set to "on", then the shell will only try to interpret redirection tokens immediately following the command name; otherwise, the position of the redirection tokens does not matter.
    echo "foo" >t:out

will either print "foo" to the file "t:out", or will print "foo >t:out" if "oldredirect" is set to "on". This is a backwards compatibility feature for pre-V37 shell scripts.

Needless to say, the file name argument of >, < and *> must be quoted if it contains spaces, and can just as well be a variable specification by means of $, or a back-tick sequence. Therefore,

    set t t:out
    echo foo >$t

and

    echo foo >`echo t:out`

both write "foo" into the file "t:out".

Besides these three tokens, the shell also understands the following additional redirection tokens:

>>      appends the output to a file, or creates the file should it not yet exist.

<>      redirects both input and output into the same file. However, since this implies that the input stream must be "cloned", the "file" must be either NIL: or some kind of interactive device, e.g. CON: or RAW:. An ordinary file cannot be written to and read from at the same time.

*>>     works much the same way as >> does, but for the diagnostic output: It either creates the file, or appends to an existing file.

*><     redirects the diagnostic output into the same file as > does. Unfortunately, this has to be the default anyhow for compatibility reasons beyond logic, and is explained below in more detail.

*<>     is NOT equivalent to *>< and is currently a reserved token.

Last but not least, the shell also understands a very special input redirection using the << token. This token takes a string as argument, scans the shell input - typically a batch file containing the << redirection - up to the specified string, and pastes the result into the stdin of the command. The following example may enlighten the reader a bit:

    ask "Enter a number:" numeric to num
    echo $num

asks the user for a number - to be entered on the input stream of course - and then prints this number.
The following silly script will define this number to be 42:

    ask "Enter a number:" numeric to num <

The *> redirection sets the terminal console task of the command as well, and hence redirects the "*" output into the console specified by *>. Unfortunately, it is quite typical to detach commands from the console by means of "<>NIL:". Without further care, this would NOT redirect stderr anymore and would therefore break existing shell scripts. Hence, some messy compatibility rules for *> redirection have been established:

Without *>   The diagnostic error stream goes into the output stream, the command console goes into the current console. A "run" command would clear the current console anyhow, and would hence detach the command from the terminal.

With *>      Diagnostic output will go into the specified file, the command console is the current console unless the specified file is interactive or NIL:. In the first case, the program console is set to the controlling console of the stream, in the second no terminal console is provided.

With *><     Diagnostic output will go into the output stream, the command console will be set to the console controlling the output stream should this be interactive, or will be cleared in case the output goes to NIL:.

Similar rules apply if the command is run in the background by means of the "&" token, except that there is no default command console unless one is provided by means of *> or *><. This makes it a bit easier to detach commands from the shell by using "&" and "<>NIL:".

5) The "&" token:

Should the shell find a & sign on the command line that is surrounded by blank spaces and not quoted, then this sign is removed from the command line and the command gets detached from the shell, very similar to what "run" would do. By default, these commands get a new separate console, and a new console owner should the console driver support this concept (ViNCEd does, the ROM CON: handler does not), and hence make use of the "job control feature" of the console.
These commands will keep the console window locked open unless you explicitly redirect the command input and output to NIL: - this is simply because commands launched in this way are handled very similarly to commands launched by "run".

There is, however, one important difference between "run" and "&": Unlike "run", the "&" sign does not detach the command to be run from the standard input. Hence, "ask &" will still ask for input on the console, "run ask" will not. Furthermore, "run" doesn't mix with << input redirection, "&" does. All this is because the "&" sign is handled by the shell directly and is much better integrated into the internal wiring than "run" could be.

6) The "+" token:

Should a command line end with a single + sign, the shell waits for additional input and concatenates this input with the previous command input, including the line feed, but excluding the + sign itself. However, this second line is not parsed as a shell command line, but is rather appended to the arguments of the command on the previous line. Since the ROM ReadArgs() and ReadItem() functions only parse up to the first line feed character, this additional line is typically lost and is only of interest for special commands like "run" that read their arguments in a non-standard way.

7) The ";" token:

If found unquoted on the command line, this token introduces a comment. All characters following the semicolon, and the semicolon itself, are removed from the command arguments.

8) Loading of commands:

Besides aliases, which have been discussed above, the shell also knows the following types of commands: the PIPE command, which is used as soon as one of the pipe tokens is found on the command line; resident (or so-called "built-in", though this nomenclature is sort of incorrect) commands; implicit commands; commands on the command search path; and commands within the current directory.
Should the shell find that the user has defined either the _pchar or the _mchar local variable, then the command line is scanned for the contents of these variables. For historical reasons, both variables should contain strings of at most two characters. If these strings are found, then the shell assumes an implicit "PIPE" command that is prepended to the full command line before further parsing.

Resident commands are found on the dos.library "resident command" list and are program segments similar to those generated by the LoadSeg() function. Unlike the latter, they do not get a "fresh" copy of their segments each time, but are assumed not to touch their code and data space; hence, not every command should be made resident, since not every command is "pure". Pure commands are usually in ROM space, or can be placed onto the list of residents by the "resident" command, which is itself resident. The "resident" command checks the "p" bit of the command file to determine whether a command can be made resident or not. Since the "p" bit can be set by hand, this is not much of a safety feature, though. The original "arp" ("Amiga Replacement Project") commands, to which the resident feature goes back, had an additional check concerning the purity of commands.

If a command is neither resident nor an alias, the shell tries to find it on the command path list, which is controlled by the "path" command. If an executable binary load file is found, this is assumed to be the command; should the "h" = "hold" bit of this executable be set as well, the command is made resident as it is loaded. The same care as for the "p" bit should apply to the "h" bit as well.

If the file in question is not executable, but the "s" bit is set, then the first two characters of the file are inspected. If they are "/*", hence a Rexx-comment introducer, a Rexx script is assumed and an implicit "RX" command is prepended to the argument line. If they are "#!"
or ";!", the first line of the script is assumed to be the file name of a script interpreter. This name must be enclosed in double quotes should it contain blanks. The script name itself then gets inserted BETWEEN the file name of the command interpreter and its options on this line. Therefore, the following command interpreter specification on the first line of the file "foo"

    #! type HEX

will run the text of foo through the interpreter "type" with the option "HEX", as if

    type foo HEX

had been entered on the command line. Note that the HEX option is placed behind the "script" file name, and not in front of it.

If neither a "/*", a "#!" nor a ";!" is found, the script is assumed to be a shell script and run through the "Execute" command. Curiously, "Execute" expands only the arguments and generates a temporary script file, but the actual command interpretation of this file is left to the shell again by making use of a not-so-well documented side effect of the shell segment.

If the file found is neither a script nor executable, the shell checks for the variable "VIEWER". Should it be found, then the shell further checks whether the file is of any known data type by using the datatypes library. If so, then the contents of this variable are understood as a command that should be run to display this file. For this, the contents of the VIEWER variable are placed in front of the command line before further interpretation. Typically, VIEWER should be set to the MultiView command, of course:

    Setenv SAVE VIEWER MultiView

Finally, if the "file" is in fact a directory, the implicit command "CD" is assumed and inserted in front of the command line.

The shell can be restricted to search the current directory only by enclosing the command within double quotes; this removes the resident command list and the command path from the search targets the shell scans for commands.
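The selection of the implicit command for a file with the "s" bit set can be summarized in a short model. The following Python sketch is an illustration, not the shell's implementation: implicit_command is a hypothetical helper that only looks at the script name and the first line of the script, and it ignores any further arguments typed by the user, since their placement is not described here. The interpreter path used in the usage note is made up for the example.

```python
def implicit_command(script_name, first_line):
    """Build the command line the shell would run for a script file."""
    head = first_line[:2]
    if head == "/*":
        # Rexx comment introducer: hand the script to RX.
        return "RX " + script_name
    if head in ("#!", ";!"):
        spec = first_line[2:].strip()
        if spec.startswith('"'):
            # Quoted interpreter name, may contain blanks.
            end = spec.find('"', 1)
            if end == -1:
                end = len(spec) - 1
            interp, options = spec[:end + 1], spec[end + 1:].strip()
        else:
            interp, _, options = spec.partition(" ")
            options = options.strip()
        # The script name goes BETWEEN the interpreter and its options.
        parts = (interp, script_name, options)
        return " ".join(part for part in parts if part)
    # Anything else is treated as a shell script.
    return "Execute " + script_name
```

For the file "foo" with the first line "#! type HEX", this model returns "type foo HEX"; a quoted interpreter name such as "SYS:My Tools/interp" is kept together as one item in front of the script name.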
9) Miscellaneous:

The shell scans the segments of all commands except INTERNAL commands for the "magic cookie" $STACK: Should it be found, then a decimal string terminated by a line feed (!) is expected. This string is then regarded as a request to enlarge the stack to at least the specified number of bytes. Therefore, the following string near the startup code of your program will guarantee at least 10240 bytes of stack:

    char stack[]="$STACK:10240\n";

Note that the "\n" is necessary and that "STACK" is in capitals.
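How such a cookie can be located in a block of program data is sketched below. This Python model is an assumption-laden illustration: find_stack_request and effective_stack are hypothetical names, the segment is represented by a plain bytes object, and the sketch accepts only decimal digits directly after the colon, since nothing else is described above.

```python
import re

# "$STACK:" in capitals, decimal digits, terminated by a line feed.
STACK_COOKIE = re.compile(rb"\$STACK:(\d+)\n")

def find_stack_request(segment_data):
    """Return the requested stack size in bytes, or None if no cookie."""
    match = STACK_COOKIE.search(segment_data)
    if match is None:
        return None
    return int(match.group(1))

def effective_stack(current_stack, segment_data):
    """The cookie requests AT LEAST this size, so never shrink the stack."""
    requested = find_stack_request(segment_data)
    if requested is None:
        return current_stack
    return max(current_stack, requested)
```

With the example string from above embedded in the data, find_stack_request returns 10240; without the terminating line feed, or with "stack" in lower case, no request is found.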