String

From MZXWiki
Revision as of 13:00, 13 June 2008 by Wervyn (talk | contribs) (touchup)

The term "string" in MegaZeux is now most commonly used to refer to string variables, introduced in a limited capacity in version 2.62. MZX also refers to any value enclosed in quotes (") in Robotic code as a string, with interpretation dependent on context. However, this meaning is so trivial and general that it would be more productively explored in specific articles on Robotic functionality.

Basic String Syntax

String variables (henceforth referred to as "strings" for brevity) work in much the same way as counters, in terms of where and how they can be used in Robotic. Any counter name prefixed with a dollar sign ($) is interpreted as a string. Like counters, strings can be assigned with set, interpolated with ampersands (&), compared with an if branch (less-than and greater-than are quirky, however), read from files, and written to files. Strings can also be concatenated using the inc command. The value assigned to a string depends on context:

set "$string" to 42            sets a string to the text "42", from a numeric literal
set "$string" to "42"          sets a string to the text "42", from a literal string value
set "$string" to "counter"     sets a string to the text "counter", a literal string value
set "$string" to "&counter&"   sets a string to the value of the counter "counter", via interpolation
set "$string" to "$string2"    sets a string to the value of another string, "$string2".
set "$string" to "fread"       sets a string to a value read from a file, terminated by an asterisk (*)
set "$string" to "fwrite"      does NOT set, but WRITES a string to a file, terminating it with an asterisk (*)
set "$string" to "board_name"  sets a string to the name of the current board

As you can see, there are three basic cases and some special considerations. First, a string set to a numeric literal becomes the string representation of that number. Second, a string set to a string literal becomes (in most cases) that literal, NOT the value of a counter named by that literal; to set a string to the value of a numeric counter, the counter must be interpolated into the string literal. Third, a string can be set to another string (i.e. when the second value also begins with $). In addition to these general cases, some special keywords are interpreted as built-in string values instead of literals. Currently, these include "fread" and "fwrite", and both of these followed by a numeric value, for file access; "robot_name", "board_name", and "mod_name", which return the name of the currently executing robot, the name of the current board, and the file name of the currently playing music module; and "INPUT", which is itself interpreted as a string and holds the value returned from an input string "Prompt" command. To set a string to one of these keywords as literal text (or to a literal string beginning with $, for that matter), the value must be specially constructed using other commands.

Much of this same syntax and behavior also applies to the "inc" command, which appends to a string. There are two notable exceptions. First, bare numeric literals do not work (the string is not altered); a number must be enclosed in quotes to be appended. Second, the special keywords recognized by set do not apply: they are appended as literal text, so their values must first be set to another string, which can then be concatenated. Also, attempting to increment a regular counter by a string (or set a regular counter to a string, for that matter) will numerically evaluate the string. That is, if the string itself is a number, like "42" or "-14", the counter will be modified by that value. If the string can't be evaluated as a number, it is treated as 0.
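The numeric-evaluation rule above can be modelled outside MegaZeux. The following Python sketch is illustrative only and is not MZX source code: numeric_value and inc_counter are hypothetical helpers, and the sketch assumes the whole string must be an optionally signed decimal integer. The text above does not say how partially numeric strings such as "42abc" are handled, so the sketch simply returns 0 for them.

```python
import re


def numeric_value(text: str) -> int:
    """Model the number a string yields when used where a counter is expected.

    Assumption: only strings that are entirely an optionally signed
    decimal integer count as numbers; everything else evaluates to 0.
    """
    if re.fullmatch(r"-?[0-9]+", text):
        return int(text)
    return 0


def inc_counter(counter: int, text: str) -> int:
    """Model incrementing a regular counter by a string."""
    return counter + numeric_value(text)


if __name__ == "__main__":
    print(numeric_value("42"))      # a numeric string
    print(numeric_value("hello"))   # a non-numeric string
    print(inc_counter(10, "-14"))   # counter modified by the string's value
```

In this model, incrementing a counter by a non-numeric string leaves the counter unchanged, because the string contributes 0.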

The "dec" command also has a meaning when applied to strings, but only a numeric one: it removes a number of characters from the end of the string, either defined by a counter or given as a literal. Decreasing a string by another string will only work if the second string represents a number. Other mathematical commands like "multiply" or "half" have no defined effects for strings, and will thus numerically evaluate the string in the first parameter, normally resulting in 0.
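As a rough model of what "dec" does to a string, the Python sketch below drops characters from the end. It is not MegaZeux code: dec_string is a hypothetical name, and the handling of counts that are zero, negative, or larger than the string is an assumption, since the text above does not cover those cases.

```python
def dec_string(text: str, count: int) -> str:
    """Model `dec` on a string: remove `count` characters from the end.

    Assumptions (not stated in the article): a count of zero or less
    leaves the string unchanged, and a count larger than the string's
    length yields an empty string.
    """
    if count <= 0:
        return text
    keep = max(len(text) - count, 0)
    return text[:keep]


if __name__ == "__main__":
    print(dec_string("MegaZeux", 4))
```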

Two strings can be compared with an if statement in the same way two other values can. The most useful comparisons here are probably "=" and "!=" (equality and inequality), but strings CAN be compared as being less-than or greater-than one another. This is slightly quirky, though; if two strings are of different lengths, the shorter string is ALWAYS less than the longer one, regardless of any alphabetical comparison. Only when the strings are the same length will an alphabetical comparison take place. That comparison uses the ASCII values of the corresponding characters in each string, so "test" is less than "text" but greater than "TEXT", because "s" is less than "x" but "t" is greater than "T".

if "$string1" = "$string2" then "EQUAL"
if "$string1" != "$string2" then "UNEQUAL"   basic syntax for string comparisons
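
The ordering rule for less-than and greater-than (length first, then character values) is easy to misread, so here is a small Python model of it. This is a sketch of the rule as described in this article, not the MegaZeux implementation; mzx_compare is a hypothetical name, and the strings are assumed to contain only ASCII characters.

```python
def mzx_compare(a: str, b: str) -> int:
    """Return -1, 0, or 1 if `a` is less than, equal to, or greater than `b`.

    Strings of different lengths are ordered by length alone; strings of
    equal length are ordered by the character values at each position.
    """
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # For ASCII text, Python's own string ordering compares character
    # codes position by position, which matches the same-length rule.
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(mzx_compare("test", "text"))   # same length: "s" is less than "x"
    print(mzx_compare("test", "TEXT"))   # same length: "t" is greater than "T"
    print(mzx_compare("zz", "aaa"))      # shorter string is always less
```

Note that under this rule "zz" is less than "aaa", which is the opposite of what a dictionary ordering would give.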

As with setting, strings can be compared to string or number literals as well. In this case, the variable string must be the first value, like so:

if "$string1" < "test" then "STRING"  performs a less-than comparison based on the value of $string1 and "test"
if "test" > "$string1" then "NUMBER"  performs a greater-than comparison based on the value of the counter "test"
                                      and the numeric value of $string1
if "$string1" = 42 then "MATCH"       compares the equality of $string1 with the string representation of 42
if 42 = "$string1" then "MATCHAGAIN"  compares the equality of the number 42 with the numeric representation of $string1

The last two comparisons actually accomplish the same thing, but in notably different ways.

Finally, strings can be interpolated using ampersands, just like any other counter. Attempting to do this with expression syntax, however, will numerically evaluate the string instead of including its string value. So for instance:

* "~F&$string&"     displays the value of $string
* "~F('$string')"   displays the numeric evaluation of $string

One interesting consequence is that wherever a string is interpolated into a value that understands ~ and @ color codes, those codes can be included in the string as well. This shouldn't be surprising, but it is worth pointing out. Also, while keywords like "robot_name" are special cases handled by the set command only, "INPUT" has been in MZX since before any other strings existed, and can be interpolated just fine.

Advanced String Syntax

Strings also support a variety of special options that can be used to slice and index them. These take the form of suffixes added to the name of the string. A plus (+) specifies an offset into the string, while a hash (#) specifies a clip length. Each of these characters is added to the end of the string name and followed by a number as a parameter, which can of course be interpolated from a counter (e.g. "$string+5#&clip&" is a substring starting at offset 5 with length "clip"). In addition, a specific character in a string can be indexed using a period (.). "$string.40" returns the byte value of the character at index 40 (indexed from 0) in the string. This is notably different from "$string+40#1", which returns the string value of that character. Finally, all of these modifiers can be assigned to, not simply read from. This works even if they represent substrings or indexes beyond the current end of the string (negative indexes do not make sense and will not work); doing so extends the length of the string and, as of version 2.82, populates any new characters with spaces. Below are some examples of these modifiers in action:

set "$string" to "$string+5"                removes the first five characters from $string
set "$string" to "$string#5"                removes all but the first five characters from $string
set "$string" to "$string+5#5"              clips out all but the substring from character 5 to character 9
set "$string" to "$string.5"                sets the string to the numeric value of the character at index 5
set "$string+5" to "blah"                   replaces the end of the string (after 5 characters) with "blah"
set "$string#5" to "lots of text"           replaces the string with the first 5 characters of "lots of text"
                                            (probably a bug)
set "$string+0#5" to "lots of text"         same as previous (almost certainly a bug)
set "$string+5#5" to "lots of text"         replaces a substring in the string with the first 5 characters of
                                            "lots of text" (intended behavior)
set "$string+&$string.length&" to "append"  appends "append" to the end of the string
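
For readers more used to conventional slicing, the offset, length, and index modifiers can be approximated in Python. This is a hedged model, not MegaZeux code: substring, char_value, and assign_at_offset are hypothetical names, only ASCII text is assumed, and the sketch deliberately covers only reading and the plain "+offset" assignment. It does not try to reproduce the "#length" assignment cases marked above as probable bugs.

```python
def substring(text: str, offset: int = 0, length=None) -> str:
    """Model reading "$string+offset#length" (both parts optional)."""
    if offset < 0:
        raise ValueError("negative offsets are not modelled")
    end = len(text) if length is None else offset + length
    return text[offset:end]


def char_value(text: str, index: int) -> int:
    """Model reading "$string.index": the numeric value of one character."""
    return ord(text[index])


def assign_at_offset(text: str, value: str, offset: int) -> str:
    """Model `set "$string+offset" to value`.

    Everything from `offset` onward is replaced by `value`. If `offset`
    lies past the end of the string, the gap is filled with spaces,
    following the version 2.82 behavior described in this article.
    """
    if offset < 0:
        raise ValueError("negative offsets are not modelled")
    padded = text.ljust(offset)  # pads with spaces only when too short
    return padded[:offset] + value


if __name__ == "__main__":
    s = "0123456789ABCDEF"
    print(substring(s, 5))          # like "$string+5"
    print(substring(s, 0, 5))       # like "$string#5"
    print(substring(s, 5, 5))       # like "$string+5#5"
    print(char_value("ABCDEF", 5))  # like "$string.5"
    print(assign_at_offset("abc", "append", len("abc")))
```

The last call mirrors the append idiom: using the string's own length as the offset places the new text immediately after the existing characters.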

The last example demonstrates one final feature of strings: the special value "$string.length" is, as one might expect, the length of the string. This value is supposed to be read-only, but currently assigning a value to it seems to be interpreted as an assignment to "$string.0", which gives a new value to the first character of the string.