
## 4. Data Types and Literal Constants

The current implementation of the S-Lang language permits up to 65535 distinct data types, including predefined data types such as integer and floating point, as well as specialized application-specific data types. It is also possible to create new data types in the language using the `typedef` mechanism.

Literal constants are objects such as the integer 3 or the string `"hello"`. The actual data type given to a literal constant depends upon the syntax of the constant. The following sections describe the syntax of literals of specific data types.

## 4.1 Predefined Data Types

The current version of S-Lang defines integer, floating point, complex, and string types. It also defines special purpose data types such as `Null_Type`, `DataType_Type`, and `Ref_Type`. These types are discussed below.

### Integers

The S-Lang language supports both signed and unsigned characters, short integer, long integer, and long long integer types. On most 32 bit systems, there is no difference between an integer and a long integer; however, they may differ on 16 and 64 bit systems. Generally speaking, on a 16 bit system, plain integers are 16 bit quantities with a range of -32768 to 32767. On a 32 bit system, plain integers range from -2147483648 to 2147483647.

A plain integer literal can be specified in one of several ways:

• As a decimal (base 10) integer consisting of the characters 0 through 9, e.g., 127. An integer specified this way cannot begin with a leading 0. That is, 0127 is not the same as 127.
• Using hexadecimal (base 16) notation consisting of the characters 0 to 9 and `A` through `F`. The hexadecimal number must be preceded by the characters `0x`. For example, `0x7F` specifies an integer using hexadecimal notation and has the same value as decimal 127.
• In Octal notation using characters 0 through 7. The Octal number must begin with a leading 0. For example, 0177 and 127 represent the same integer.
• In Binary notation using characters 0 and 1 with the `0b` prefix. For example, 21 may be expressed in binary using `0b10101`.

Short, long, long long, and unsigned types may be specified by using the proper suffixes: `L` indicates that the integer is a long integer, `LL` indicates a long long integer, `h` indicates that the integer is a short integer, and `U` indicates that it is unsigned. For example, `1UL` specifies an unsigned long integer.
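The notations and suffixes above can be checked with a short sketch. The variable names are illustrative only, and the type names printed for the suffixed literals depend on the platform (for example, where long and plain integers have the same size), so no output is shown.

```slang
% Four spellings of the same plain integer value.
variable dec = 127, hex = 0x7F, oct = 0177, bin = 0b1111111;

if ((dec == hex) && (hex == oct) && (oct == bin))
  message ("all four literals have the value 127");

% Suffixes select the integer type of the literal.
vmessage ("%S %S %S %S", typeof (1h), typeof (1U), typeof (1UL), typeof (1LL));
```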

Finally, a character literal may be specified using a notation containing a character enclosed in single quotes as `'a'`. The value of the character specified this way will lie in the range 0 to 255 and will be determined by the ASCII value of the character in quotes. For example,

``` i = '0'; ```
assigns to `i` the character 48 since the `'0'` character has an ASCII value of 48.
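Because a character literal is just a small integer, it can take part in arithmetic. The following sketch (the variable names are illustrative) converts the character `'7'` to the number 7 by subtracting the value of `'0'`:

```slang
variable i = '0';            % ASCII value 48
variable digit = '7' - i;    % 55 - 48

vmessage ("i=%d digit=%d", i, digit);   % prints: i=48 digit=7
```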

A "wide" character (Unicode) may be specified using the form `'\x{y...y}'` where `y...y` are hexadecimal digits. For example,

```
'\x{12F}'      % Latin Small Letter I With Ogonek
'\x{1D7BC}'    % Mathematical Sans-Serif Bold Italic Small Sigma
```

Any integer may be preceded by a minus sign to indicate that it is a negative integer.

### Floating Point Numbers

Single and double precision floating point literals must contain either a decimal point or an exponent (or both). Here are examples of specifying the same double precision floating point number:

``` 12. 12.0 12e0 1.2e1 120e-1 .12e2 0.12e2 ```
Note that 12 is not a floating point number since it contains neither a decimal point nor an exponent. In fact, 12 is an integer.

One may append the `f` character to the end of the number to indicate that the number is a single precision literal. The following are all single precision values:

``` 12.f 12.0f 12e0f 1.2e1f 120e-1f .12e2f 0.12e2f ```
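As a quick check of these rules, `typeof` can be applied to each kind of literal. This is a minimal sketch rather than part of the original text:

```slang
% No decimal point or exponent: integer. Decimal point: double. f suffix: float.
vmessage ("%S %S %S", typeof (12), typeof (12.), typeof (12.f));
% prints: Integer_Type Double_Type Float_Type
```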

### Complex Numbers

The language implements complex numbers as a pair of double precision floating point numbers. The first number in the pair forms the real part, while the second number forms the imaginary part. That is, a complex number may be regarded as the sum of a real number and an imaginary number.

Strictly speaking, the current implementation of S-Lang does not support generic complex literals. However, it does support imaginary literals, which permit a more generic complex number with a non-zero real part to be constructed by adding a real number to the imaginary literal.

An imaginary literal is specified in the same way as a floating point literal except that `i` or `j` is appended. For example,

``` 12i 12.0i 12e0j ```
all represent the same imaginary number.

A more generic complex number may be constructed from an imaginary literal via addition, e.g.,

``` 3.0 + 4.0i ```
produces a complex number whose real part is `3.0` and whose imaginary part is `4.0`.

The intrinsic functions `Real` and `Imag` may be used to retrieve the real and imaginary parts of a complex number, respectively.
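A small sketch tying these pieces together, with `z` as an illustrative variable name:

```slang
variable z = 3.0 + 4.0i;     % real number plus imaginary literal

vmessage ("type=%S real=%g imag=%g", typeof (z), Real (z), Imag (z));
% prints: type=Complex_Type real=3 imag=4
```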

### Strings

A string literal must be enclosed in double quotes as in:

``` "This is a string". ```
As described below, the string literal may contain a suffix that specifies how the string is to be interpreted, e.g., a string literal such as
``` "$HOME/.jedrc"$ ```
with the `$` suffix will be subject to variable name expansion.

Although there is no imposed limit on the length of a string, single-line string literals must be less than 256 characters in length. It is possible to construct strings longer than this by string concatenation, e.g.,

``` "This is the first part of a long string" + " and this is the second part" ```

S-Lang version 2.2 introduced support for multi-line string literals. Two basic variants are supported. The first makes use of the backslash at the end of a line to indicate that the string is continued onto the next line:

```
"This is a \
multi-line string. \
Note the presence of the \
backslash character at the end \
of each of the lines."
```
The second form of multi-line string is delimited by the backquote character (`) and does not require backslashes:
```
`This form does not require backslash characters.
In fact, here the backslash character \ has no
special meaning (unless given the ``Q' suffix)`
```
Note that if a backquote is to appear in such a string, then it must be doubled, as illustrated in the above example.

Any character except a newline (ASCII 10) or the null character (ASCII 0) may appear explicitly in a string literal. However, these characters may be embedded implicitly using the mechanism described below.

The backslash character is a special character and is used to include other special characters (such as a newline character) in the string. The special characters recognized are:

```
\"       -- double quote
\'       -- single quote
\\       -- backslash
\a       -- bell character (ASCII 7)
\t       -- tab character (ASCII 9)
\n       -- newline character (ASCII 10)
\e       -- escape character (ASCII 27)
\xhh     -- byte expressed in HEXADECIMAL notation
\ooo     -- byte expressed in OCTAL notation
\dnnn    -- byte expressed in DECIMAL
\u{h..h} -- the Unicode character U+h..h
\x{h..h} -- the Unicode character U+h..h [modal]
```
In the above table, `h` represents one of the HEXADECIMAL characters from the set [0-9A-Fa-f]. It is important to understand the distinction between the `\x{h..h}` and `\u{h..h}` forms. When used in a string, the `\u` form always expands to the corresponding UTF-8 sequence regardless of the UTF-8 mode. In contrast, when in non-UTF-8 mode, the `\x` form expands to a byte when given two hex characters, or to the corresponding UTF-8 sequence when used with three or more hex characters.

For example, to include the double quote character as part of the string, it must be preceded by a backslash character, e.g.,

``` "This is a \"quote\"." ```
Similarly, the next example illustrates how a newline character may be included:
``` "This is the first line\nand this is the second." ```
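With backquoted strings (S-Lang 2.2 and above), the same two strings may be written without backslashes:
```
`This is a "quote".`
`This is the first line
and this is the second.`
```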
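The numeric escape forms can be compared against ordinary characters. The sketch below assumes ASCII-compatible encoding and is illustrative rather than exhaustive. Each test should succeed, because hexadecimal 41, octal 101, and decimal 065 all denote the byte for `A`.

```slang
% Each escape names the same byte as the plain character "A".
if ("\x41" == "A")  message ("\\x41 is A");
if ("\101" == "A")  message ("\\101 is A");
if ("\d065" == "A") message ("\\d065 is A");

% An embedded \n is a single character: 3 + 1 + 3 = 7.
if (strlen ("one\ntwo") == 7)
  message ("the newline counts as one character");
```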

### Suffixes

A string literal may contain a suffix that specifies how the string is to be interpreted. The suffix may consist of one or more of the following characters:

R

Backslash substitution will not be performed on the string. This is the default when using back-quoted strings.

Q

Backslash substitution will be performed on the string. This is the default when using strings using the double-quote character.

B

If this suffix is present, the string will be interpreted as a binary string (BString_Type).

\$

Variable name substitution will be performed on the string.

Not all combinations of the above control characters are supported, nor do all of them make sense. For example, a string with the suffix `QR` will cause a parse error because `Q` and `R` have opposing meanings.
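As a brief sketch of the `B` suffix, the following compares the type of an ordinary string with that of a binary string. The variable names are illustrative, and `bstrlen` is used here on the assumption that the binary string keeps its embedded null byte, in which case it should report a length of 3.

```slang
variable s = "abc";
variable b = "a\000c"B;      % binary string containing a null byte

vmessage ("%S %S", typeof (s), typeof (b));
vmessage ("bstrlen=%d", bstrlen (b));
```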

### The Q and R suffixes

These suffixes turn on and off backslash expansion. Unless the `R` suffix is present, all double-quoted string literals will have backslash substitution performed. By default, backslash expansion is turned off for backquoted strings.

Sometimes it is desirable to turn off backslash expansion for double-quoted strings. For example, pathnames on an MSDOS or Windows system use the backslash character as a path separator. The `R` suffix turns off backslash expansion, and as a result the following statements are equivalent:

```
file = "C:\\windows\\apps\\slrn.rc";
file = "C:\\windows\\apps\\slrn.rc"Q;
file = "C:\windows\apps\slrn.rc"R;
file = `C:\windows\apps\slrn.rc`;    % slang-2.2 and above
```
The only exception is that a backslash character is not permitted as the last character of a string with the `R` suffix. That is,
``` string = "This is illegal\"R; ```
is not permitted. Without this exception, a string such as
``` string = "Some characters: \"R, S, T\""; ```
would not be parsed properly.
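The effect of the two suffixes can be observed directly. In this sketch (variable names are illustrative), the expanded and raw spellings of a path compare equal, while the same characters with and without expansion differ in length:

```slang
variable q = "C:\\windows\\apps\\slrn.rc";    % expanded: each \\ becomes \
variable r = "C:\windows\apps\slrn.rc"R;      % raw: backslashes kept as-is

if (q == r) message ("both spellings give the same string");

% "a\tb" is a, TAB, b; the raw form keeps the backslash and the t.
vmessage ("%d %d", strlen ("a\tb"), strlen ("a\tb"R));   % prints: 3 4
```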

### The \$ suffix

If the string contains the `$` suffix, then variable name expansion will be performed upon names prefixed by a `$` character occurring within the string. For example, in

``` "The value of X is $X and the value of Y is $Y"$ ```
variable name substitution is performed on the names `X` and `Y`. Such strings may be used as a convenient alternative to the `sprintf` function.
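To make the comparison with `sprintf` concrete, here is a minimal sketch that builds the same message both ways. `X` and `Y` are assumed to be global variables defined for the purpose of the example.

```slang
variable X = 3, Y = "four";

% Variable name expansion via the $ suffix.
message ("The value of X is $X and the value of Y is $Y"$);

% The same message built with sprintf.
message (sprintf ("The value of X is %d and the value of Y is %s", X, Y));
```

Both statements should produce `The value of X is 3 and the value of Y is four`.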

Name expansion is carried out according to the following rules: If the string literal occurs in a function, and the name corresponds to a variable local to the function, then the string representation of the value of that variable will be substituted. Otherwise, if the name corresponds to a variable that is local to the compilation unit (i.e., is declared as static or private), then its value's string representation will be used. Otherwise, if the name corresponds to a variable that exists as a global (public) then its value's string representation will be substituted. If the above searches fail and the name exists in the environment, then the value of the corresponding environment variable will be used. Otherwise, the variable will expand to the empty string.

Consider the following example:

```
private variable bar = "two";
putenv ("MYHOME=/home/baz");
define funct (foo)
{
   variable bar = 1;
   message ("file: $MYHOME/foo: garage=$MYGARAGE,bar=$bar"$);
}
```
When executed, this will produce the message:
``` file: /home/baz/foo: garage=,bar=1 ```
assuming that `MYGARAGE` is not defined anywhere.

A name may be enclosed in braces. For example,

``` "${MYHOME}/foo: bar=${bar}"$ ```
This is useful in cases when the name is followed immediately by other characters that may be interpreted as part of the name, e.g.,
```
variable HELLO="Hello ";
message ("${HELLO}World"$);
```
will produce the message "Hello World".

### Null_Type

Objects of type `Null_Type` can have only one value: `NULL`. About the only thing that you can do with this data type is to assign it to variables and test for equality with other objects. Nevertheless, `Null_Type` is an important and extremely useful data type. Its main use stems from the fact that since it can be compared for equality with any other data type, it is ideal to represent the value of an object which does not yet have a value, or has an illegal value.

As a trivial example of its use, consider

```
define add_numbers (a, b)
{
   if (a == NULL) a = 0;
   if (b == NULL) b = 0;
   return a + b;
}
variable c = add_numbers (1, 2);
variable d = add_numbers (1, NULL);
variable e = add_numbers (1,);
variable f = add_numbers (,);
```
It should be clear that after these statements have been executed, `c` will have a value of 3. It should also be clear that `d` will have a value of 1 because `NULL` has been passed as the second parameter. One feature of the language is that if a parameter has been omitted from a function call, the variable associated with that parameter will be set to `NULL`. Hence, `e` and `f` will be set to 1 and 0, respectively.

The `Null_Type` data type also plays an important role in the context of structures.
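Another common pattern is a function that returns `NULL` to signal "no value". The sketch below relies on the `getenv` intrinsic, which returns `NULL` when the environment variable is not set. The variable name and the fallback value `"vi"` are purely illustrative.

```slang
variable editor = getenv ("EDITOR");

if (editor == NULL)
  editor = "vi";             % hypothetical fallback for the example

message ("editor: " + editor);
```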

### Ref_Type

Objects of `Ref_Type` are created using the unary reference operator `&`. Such objects may be dereferenced using the dereference operator `@`. For example,

```
sin_ref = &sin;
y = (@sin_ref) (1.0);
```
creates a reference to the `sin` function and assigns it to `sin_ref`. The second statement uses the dereference operator to call the function that `sin_ref` references.

The `Ref_Type` is useful for passing functions as arguments to other functions, or for returning information from a function via its parameter list. The dereference operator may also be used to create an instance of a structure. Further discussion of this important type can be found in the section on Referencing Variables.
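Both uses mentioned above can be sketched in a few lines. The function names `apply_twice` and `set_to_zero` are hypothetical and exist only for this example.

```slang
% Pass a function by reference and call it through the reference.
define apply_twice (f, x)
{
   return (@f) ((@f) (x));
}

% Return information through the parameter list by assigning via a reference.
define set_to_zero (ref)
{
   @ref = 0;
}

variable y = apply_twice (&sqrt, 16.0);   % sqrt (sqrt (16.0))
variable n = 5;
set_to_zero (&n);

vmessage ("y=%g n=%d", y, n);             % prints: y=2 n=0
```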

### Array_Type, Assoc_Type, List_Type, and Struct_Type

Variables of type `Array_Type`, `Assoc_Type`, `List_Type`, and `Struct_Type` are known as container objects. They are more complicated than the simple data types discussed so far and each obeys a special syntax. For these reasons they are discussed in separate chapters.

### DataType_Type Type

S-Lang defines a type called `DataType_Type`. Objects of this type have values that are type names. For example, an integer is an object of type `Integer_Type`. The literals of `DataType_Type` include:

```
Char_Type       (signed character)
UChar_Type      (unsigned character)
Short_Type      (short integer)
UShort_Type     (unsigned short integer)
Integer_Type    (plain integer)
UInteger_Type   (plain unsigned integer)
Long_Type       (long integer)
ULong_Type      (unsigned long integer)
LLong_Type      (long long integer)
ULLong_Type     (unsigned long long integer)
Float_Type      (single precision real)
Double_Type     (double precision real)
Complex_Type    (complex numbers)
String_Type     (strings, C strings)
BString_Type    (binary strings)
Struct_Type     (structures)
Ref_Type        (references)
Null_Type       (NULL)
Array_Type      (arrays)
Assoc_Type      (associative arrays/hashes)
List_Type       (lists)
DataType_Type   (data types)
```
as well as the names of any other types that an application defines.

The built-in function `typeof` returns the data type of its argument, i.e., a `DataType_Type`. For instance `typeof(7)` returns `Integer_Type` and `typeof(Integer_Type)` returns `DataType_Type`. One can use this function as in the following example:

``` if (Integer_Type == typeof (x)) message ("x is an integer"); ```
The literals of `DataType_Type` have other uses as well. One of the most common uses of these literals is to create arrays, e.g.,
``` x = Complex_Type [100]; ```
creates an array of 100 complex numbers and assigns it to `x`.
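The result of such an array creation can be inspected with `typeof`, which reports the container type, and `_typeof`, which reports the element type. A minimal sketch:

```slang
variable x = Complex_Type [100];

vmessage ("%S of %S, length %d", typeof (x), _typeof (x), length (x));
% prints: Array_Type of Complex_Type, length 100
```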

### Boolean Type

Strictly speaking, S-Lang has no separate boolean type; rather it represents boolean values as `Char_Type` objects. In particular, boolean FALSE is equivalent to `Char_Type` 0, and TRUE is any non-zero `Char_Type` value. Since the exact value of TRUE is unspecified, it is unnecessary and even pointless to define TRUE and FALSE literals in S-Lang.
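A short sketch of how this looks in practice. Since the exact value of TRUE is unspecified, the example tests only for zero versus non-zero, and no output is shown for the comparison results.

```slang
variable t = (1 < 2);        % result of a comparison

% Report the type and value of a true and a false comparison.
vmessage ("%S %d %d", typeof (t), t, (2 < 1));

% Conditions test for non-zero, so no TRUE literal is needed.
if (5) message ("any non-zero value is treated as true");
```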

## 4.2Typecasting: Converting from one Type to Another

Occasionally, it is necessary to convert from one data type to another. For example, if you need to print an object as a string, it may be necessary to convert it to a `String_Type`. The `typecast` function may be used to perform such conversions. For example, consider

```
variable x = 10, y;
y = typecast (x, Double_Type);
```
After execution of these statements, `x` will have the integer value 10 and `y` will have the double precision floating point value `10.0`. If the object to be converted is an array, the `typecast` function will act upon all elements of the array. For example,
```
x = [1:10];        % Array of integers
y = typecast (x, Double_Type);
```
will create an array of 10 double precision values and assign it to `y`. One should also realize that it is not always possible to perform a typecast. For example, any attempt to convert an `Integer_Type` to a `Null_Type` will result in a run-time error. Typecasting works only when datatypes are similar.
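When it is not known in advance whether a conversion is possible, the run-time error can be caught. The following is a sketch only: `try_cast` is a hypothetical helper, and it catches `AnyError` rather than a specific exception because the exact exception raised is not stated here.

```slang
% Return the converted value, or NULL if the typecast fails.
define try_cast (x, type)
{
   variable result = NULL;
   try
     {
        result = typecast (x, type);
     }
   catch AnyError:
     {
        result = NULL;
     }
   return result;
}

vmessage ("%S", typeof (try_cast (10, Double_Type)));
if (NULL == try_cast (1, Null_Type))
  message ("Integer_Type could not be converted to Null_Type");
```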

Often the interpreter will perform implicit type conversions as necessary to complete calculations. For example, when multiplying an `Integer_Type` with a `Double_Type`, it will convert the `Integer_Type` to a `Double_Type` for the purpose of the calculation. Thus, the example involving the conversion of an array of integers to an array of doubles could have been performed by multiplication by `1.0`, i.e.,

```
x = [1:10];        % Array of integers
y = 1.0 * x;
```
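The implicit conversion can be confirmed by inspecting the element types before and after the multiplication, for instance:

```slang
variable x = [1:10];         % array of integers
variable y = 1.0 * x;        % elements implicitly converted to doubles

vmessage ("%S -> %S", _typeof (x), _typeof (y));
% prints: Integer_Type -> Double_Type
```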

The `string` intrinsic function should be used whenever a string representation is needed. Using the `typecast` function for this purpose will usually fail unless the object to be converted is similar to a string, and most are not. Moreover, when typecasting an array to `String_Type`, the `typecast` function acts on each element of the array to produce another array, whereas the `string` function will produce a single string.

One use of the `string` function is to print the value of an object. This use is illustrated in the following simple example:

```
define print_object (x)
{
   message (string (x));
}
```
Here, the `message` function has been used because it writes a string to the display. If the `string` function had not been used and the `message` function had been passed an integer, a type-mismatch error would have resulted.
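A brief usage sketch follows, with `print_object` repeated so that the block is self-contained. The exact text produced for each object is determined by `string`, so no output is shown.

```slang
define print_object (x)
{
   message (string (x));
}

print_object (10);             % an integer
print_object (1.5);            % a double
print_object (Integer_Type);   % a DataType_Type value
print_object ([1, 2, 3]);      % an array: string yields one string
```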
