THE DESIGN OF THE MATHEMATICA PROGRAMMING LANGUAGE

A Single paradigm provides surprising diversity

Say you have a formula you want to simplify, for example a Integral t sin² t dt. You can "look it up" in the table of integrals. You notice that your formula is of the form shown in Example 1(a), with x replaced by t. You can therefore replace that part of your formula by the right side of the formula in the integral table, making sure to replace each occurrence of x by t. The result then becomes as shown in Example 1(b). You will not find a formula that matches this simple expression in your handbook, so this is the final answer.

There are many transformation rules that we apply automatically, without looking them up in a handbook. We might want to transform logarithms of products into sums of logarithms, for example. The rules we use in our head are shown in Example 2(a).

Example 2: (a) Rules for logarithms, expressed in traditional mathematical notation; (b) rules for logarithms, as represented in Mathematica; (c) Mathematica uses rules to simplify a logarithmic expression.

  (a)

  log (x y) = log x + log y
  log (xⁿ) = n log x

  (b)

  In [1] := log [x_ y_] := log [x] + log[y]
  In [2] := log [x_^n_] := n log [x]

  (c)

  In [3] := log[a b c^2/d]
  Out [3] = log[a] + log[b] + 2 log[c] - log[d]

With these two rules we have no trouble transforming log(abc²/d) into loga+ logb+ 2 logc-logd by repeatedly applying the rules and by noting that 1/d= d^-1. We can program these rules directly into Mathematica. The result is shown in Example 2(b).

Note that we have to explicitly denote the parts of the pattern that are variables, by appending an underscore to the variable name; in this example, x, y, and n are the pattern variables. We can now try to simplify the example given above; see Example 2(c).

In general, it takes some experience to make the task work as smoothly as this illustration. Humans are much cleverer at matching patterns than computers. For example, we have no trouble transforming log 1/3 into -log 3. However, the Mathematica rules presented in Example 2(b) will not accomplish this, because the rational number 1/3 is not stored as 3^-1; therefore, the rule will not match. A different rule is needed for rational numbers. But we can easily add such a rule: log[r_Rational]:= log[Numerator[r]]-log[Denominator[r]]. (Mathematica users should also note that we chose to use the symbol log instead of the built-in logarithm function Log for this example.)

This simple example shows that designing rule sets for performing advanced computations is not trivial. Some familiarity with the many different pattern objects that Mathematica supports and some experimenting are required. Fortunately, this is easy to do in an interpreted system. Also, you can learn by looking at rule sets that others have written, such as the samples that come with the Mathematica system.

Mathematica Syntax

Mathematica has a lot of predefined operators, which are represented by a single character, for example @. While convenient for the expert, a program that uses these operators freely may be hard to understand by the novice reader. In this respect, Mathematica can resemble APL. Nevertheless, the basic structure of the language is simple, and allows all programs to be written entirely without the use of operators. For example, instead of writing a + b, you can write Plus[a, b]. Likewise, instead of a b c, you can write Times[a, b, c].

Expressions are built out of atoms (similar to Lisp atoms) and composite expressions. Atoms can be symbols (names of variables), numbers, or character strings. All composite expressions have the form: b[e₁, e₂, ..., e_n], where b and the e[i] are themselves expressions.

You can read or interpret a Mathematica expression in any of three ways:

As a function call, for example, Sin[x]

As a procedural command, such as Expand [(a + b)^ 10]

As a collection of objects, as in List [alpha, beta, gamma], (normally written as {alpha, beta, gamma})

The Mathematica system comes with hundreds of standard, built-in mathematical functions. These built-ins are compiled functions, implemented in C. Many are quite powerful -- for example, the NDSolve function, which finds numeric solutions for differential equations where no closed-form solution exists. Syntactically, there is no difference between a built-in function and one defined by the user.

The use of square brackets for function calls is somewhat nonstandard. Mathematica reserves ordinary parentheses for grouping of expressions, as in a(b+c). Thus, there is never confusion between a nested expression and a function call such as a[b + c]. This syntax makes it possible to leave out the multiplication operator (*), as is the normal custom with mathematical formulas. The third kind of brackets, curly braces, are used to denote lists of things.

Even constructs commonly identified as statements are simply expressions. For example, the "statement" a = 5, in internal form is the expression Set[a, 5]. When evaluated, this expression will assign the value 5 to the variable a. The semicolon used to separate statements is really an operator, just like + or *. The sequence stmt₁; stmt₂; stmt₃ is therefore read as CompoundExpression[stmt₁, stmt₂, stmt₃]. The effect of this compound expression is, of course, to evaluate each of the three expressions stmt₁, stmt₂, and stmt₃ in sequence.

Procedural Programming

How can we write procedural programs in a language that offers "only" pattern matching? The key idea is that a pattern in its simplest form looks just like a function or procedure declaration (or formal parameter list).

For example, a pattern such as SinSum [x0_, x1_, dx_] can serve as the declaration of a function with three arguments -- x0, x1, and dx -- inside the body of the function definition. Such a function would be used in a call such as 2*SinSum[0, 1, 0.1]. If you recall how the evaluator works, you can see that the expression SinSum[0, 1, 0.1] will match the pattern SinSum[x0_, x1_, dx_]. Mathematica will then replace this expression by the right side of the rule for SinSum, after having replaced x0, x1, and dx by their values 0, 1, and 0.1. So, the right side of the rule for SinSum [x0_, x1_, dx_] serves as the body of the function; see Example 3(a).

Example 3: (a) Summing a sine series in a procedural manner; (b) summing a sine series, as expressed in C.

  (a)

  SinSum[x0_, x1_, dx_] :=
      Module[ {s, x},
          s = 0.0;
          x = x0;
          While[ x <= x1,
              s = s + Sin[x];
              x = x + dx
          ];
          s
      ]

  (b)

  double SinSum (double x0, double x1, double dx)
  {
      double s, x;

      s = 0.0;
      x = x0;
      while (x <= x1) {
          s = s + sin (x),
          x = x + dx;
      }
      return s;
  }

Superficially, there is not much difference between this Mathematica program and the equivalent C program, shown in Example 3(b). (An expert C programmer may perhaps write this a bit differently, but my version remains a little more readable for the wider audience.)

Note that there is no need to declare the type of the formal parameters in Mathematica. As in other languages that use polymorphic data types, the same block of code can sum the sine values of real numbers, complex numbers, or some user-defined types.

Also note that the construct Module [varlist, body] introduces local variables (s and x, in our example), just like the declaration in the C program.

You can also see that there is no need for a return statement. The last value evaluated (in this case, the variable s) is returned as the value of the function.

One final difference is that in Mathematica, unlike C and Pascal, variables are not restricted to 32-bit integers, but can be arbitrary precision floating-point (hundreds of digits in length), complex numbers, or user-defined data types. The number of digits displayed is a configuration value that can be changed at any time.

Lisp-style Data Manipulation

The Mathematica program in Example 3(a) has a distinctly procedural "look," despite the fact that all constructs used in it are simply expressions. The Mathematica language, however, supports other kinds of programming. Here's how to define a basic Lisp data system with in Mathematica.

As you may recall, data in Lisp is constructed of elements known as cells, which are created with (cons x y). The two parts can be retrieved with (car 1) and (cdr 1), respectively. These functions satisfy the two equations: (car (cons x y)) = x, and (cdr (cons x y)) = y. These two equations can be entered directly into Mathematica as rules.

Every Lisp object that is not such a cons cell is an atom -- that is, a number or a symbol. A nested cell structure of the form: (cons e₁ (cons e₂ ... (cons e_n nil) ... )) is called a list and is written as (e₁ e₂ ... e_n). Nil is an atom denoting the empty list. The Lisp predicate nullQ returns true for the empty list, false for nonempty lists. All these definitions can be concisely expressed in Mathematica, as shown in Example 4(a).

Example 4: (a) Mathematica rules for Lisp-like lists; (b) building higher-level, Lisp-like forms; (c) applying the Lisp-like functions on a list.

  (a)

  car[cons[e_, l_]] := e
  cdr[cons[e_, l_]] := l
  nullQ[nil] = True
  nullQ[_cons] = False
  list[] = nil
  list[e_, r___] := cons[e, list[r]]

  (b)

  In[1] := append[l_, e_] :=
           If[ nullQ[l], list [e], cons[ car [1], append[cdr[l], e] ] ]

  In[2] := reverse[l_, e_] :=
           If[ nullQ[l], nil, append[ reverse[cdr[l]], car[l] ] ]

  (c)

  In[3] := alist = list [e1, e2, e3, e4, e5]
  Out[3] := (e1 e2 e3 e4 e5)

  In[4] := reverse[alist]
  Out[4] - (e5 e4 e3 e2 e1)

Example 4(b) shows how these basic forms can be used to build higher-level Lisp functions, such as the familiar append and reverse. append appends a new element at the end of a list, and reverse, of course, reverses the order of elements in a list. Example 4(c)shows how these functions would be used, by defining a sample list and then applying reverse to it. In addition to the given definitions, Mathematica was told to output lists in true Lisp notation. (Those definitions are not shown.)

Functional Programming

Moving further away from procedural programming, and going beyond manipulation of Lisp-style data, we arrive at the realm of truly functional programming. Functional programming is also= known as applicative programming, in that pure functions are applied to data, using a minimum of state variables, counters, and side effects.

For example, a more elegant way of summing the sines of the numbers x₀, x₀ +dx,... is to first generate a list of these numbers, then map the sine function over the list, and finally fold the result using the addition function. The transcript of an interactive Mathematica session that accomplishes these steps is shown in Example 5(a).

Example 5: (a) A nonprocedural way to sum sines; (b) nesting nonprocedural functions; (c) a functional rule for summing sines; (d) using the built-in summation function.

  (a)

  In[1] := Range[ 0, 1, 0.1 ]

  Out[1] = {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.}

  In [2]:= Map[ Sin, % ]
  Out [2]={0, 0.0998334, 0.198669, 0.29552, 0.389418,

    0.479426, 0.564642, 0.644218, 0.717356, 0.783327,

    0.841471}

  In[3] := Fold[ Plus, 0, % ]
  Out[3] = 5.01388

  (b)

  In[4] := Fold[ Plus, 0, Map[ Sin, Range[ 0, 1, 0.1 ] ] ]
  Out[4] = 5.01388

  (c)

  SinSum[x0_, x1_, dx_] :=
      Fold[ Plus, 0, Map[ Sin, Range[x0, x1, dx] ] ]

  (d)

  SinSum[x0_, x1_, dx_] := Sum[ Sin[x], {x, x0, x1, dx} ]

In the transcript, input lines entered by the user are labeled by Mathematica according to the form In[n]:=. The lines of output that correspond to this input have the form Out[n]=. The percent sign (%) is a shorthand symbol that represents the output of the previous line.

Having followed these steps, we can nest the functions inside each other to achieve the one-line solution in Example 5(b). This leads to the functional definition of SinSum, shown in Example 5(c), which is a vast improvement over the procedural program presented in Example 3(a).

As an aside, note that the summation function is one of the built-in functions supplied with Mathematica. We can therefore rewrite our definition by using the built-in summation function; see Example 5(d).

Pure Functions

One of the remarkable features of most functional programming languages is the notion of a "nameless," pure function, such as the Lisp expression (lambda(x) (*xx)). This example uses the lambda form to define an anonymous function that returns the square of its argument.

A lambda function can be applied to an argument, for example:((lambda (x) (*xx))5). Here, evaluation of this function will produce a result of 25. The lambda capability makes it unnecessary to give functions a name; functions can be defined on-the-fly, as needed. They can be passed as arguments to other functions or even returned as values from functions. In this sense they are first-class objects in the language.

In Mathematica, our example would be written as Function[x, x^2] and applied to the argument 5 like this: Function [x, x^2][5]. Together with Map, the function that applies functions to the elements of a list, we can write a short definition to square all elements of a list; see Example 6.

Example 6: Squaring elements of a list using the Map function

  In[1] := SquareList [l_] := Map[ Function[x, x^2], l ]

  In[2] := Squarelist[ {1, 5.5, a, Sqrt[2]} ]

  Out[2] = {1, 30.25, a2, 2}

Operations on Structured Data

APL's strength lies in using operators to work on data structured as arrays, matrices, and so on. Most of the APL operators can be made available in Mathematica, but not in the form of single characters. Rather, these operators are simply implemented as functions.

Here's an example of defining an APL operator, one not built into Mathematica. The reshape operator takes an arbitrary structure and rearranges it such that it has certain dimensions.

Consider the data shown in the first statement of Example 7(a), which is a structure nested three levels deep. The dimensions are shown by the second statement of Example 7(a). The reshape "operator" rearranges the data into a matrix with the dimensions 4x3, as shown in Example 7(b) .

Example 7: (a) A three-level data structure that will be reshaped; (b) the data structure after reshaping; (c) the definition for the Reshape function; (d) the steps in reshaping, as shown by the debugging trace facility.

  (a)

  In[1] := data = {{{a1, a2}, {b1, b2}, {c1, c2}},
                   {{d1, d2}, {e1, e2}, {f1, f2}}};

  In[2] := Dimensions[ data ]

  Out [2] = {2, 3, 2}

  (b)

  In[3] := Reshape[ data, {4, 3} ]

  Out[3] = {{a1, a2, b1}, {b2, c1, c2}, {d1, d2, e1},

    {e2, f1, f2}}

  (c)

  Reshape[list_, dims_] := reshape[ Flatten[list], dims ]

  reshape[list_, {n_}] := list /; Length[list] == n

  reshape[list_, {head__, n_}] := reshape[ Partition[list, n], {head} ]

  (d)

  In[4] := Trace[Reshape[ data, {4, 3} ], reshape]

  Out[4] = {reshape[Flatten[{{{a1, a2}, {b1, b2}, {c1, c2}},

        {{d1, d2}, {e1, e2}, {f1, f2}}}], {4, 3}],

  reshape[{a1, a2, b1, b2, c1, c2, d1, d2, e1, e2, f1,

    f2}, {4, 3}], reshape[Partition[{a1, a2, b1, b2, c1,

     c2, d1, d2, e1, e2, f1, f2}, 3], {4}],

  reshape[{{a1, a2, b1}, {b2, c1, c2}, {d1, d2, e1},

    {e2, f1, f2}}, {4}],

  {{a1, a2, b1}, {b2, c1, c2}, {d1, d2, e1},

    {e2, f1, f2}}}

The function Reshape[data, dims] can easily be written recursively on the list of dimensions. The last element of this list is the length of the innermost parts of the resulting structure. The built-in function Partition[list, size] groups elements of list into sublists of length size. With its recursion on a list, the resulting definitions have a distinct, Lisp-like flavor; see Example 7(c).

First, the old structure that the input might have is destroyed with Flatten, making it into a linear list. The work is then done in the auxiliary function reshape. (Note the lowercase r.) The boundary case (when the list of dimensions has just a single element) merely tests whether the length of the structure is correct. Nothing else needs to be done.

A trace of the reshape program is shown in Example 7(d), showing the steps followed as the rules for reshape are applied. The tracing facility in Mathematica can provide a list of the intermediate steps in an evaluation sequence. In this case, the arguments to reshape are first simplified (the Flatten and Partition commands are carried out), and then the rules for reshape itself take effect.

Modularization

To organize large programming projects, modern programming languages use facilities such as modularization, encapsulation, and information hiding. In Mathematica, modular program components are known as "packages." Like the facility found in Modula and Object Pascal, a package consists of two parts: an interface definition and an implementation part. The interface section describes the aspects that a client needs to know. The implementation section provides that functionality, but keeps the details hidden from users of the packages (the clients of the interface).

Even with such a small example as the function Reshape[] (from the preceding section), it makes sense to write a package. As written in package form, the reshape function is shown in Example 8. The part between the initial BeginPackage[] and Begin ["'Private'"] is the interface. It declares all functions exported from this package (in this case, only one) and documents them. The part between Begin and End is the implementation. The auxiliary function reshape is not exported. It will not be available to users of this pa kage. The command Protect[ Reshape ] prevents any modification of the exported function Reshape by the user of our package.

Example 8: The reshape function in package form

  BeginPackage ["Reshape'"]

  Reshape::usage = "Reshape[list, dims] rearranges list as a nested
          structure with dimension dims."

  Begin["'Private'"]

  reshape[list_, {n_}] := list /; Length[list] == n
  reshape[list_, {head__, n_}] := reshape[ Partition[list, n], {head} ]
  Reshape[list_, dims_] := reshape[ Flatten[list], dims ]

  End[]

  Protect[ Reshape ]

  EndPackage[]

You load a package from disk into a Mathematica session by typing: << Reshape'. The notation used is machine independent; Mathematica translates the context name Reshape' into a valid filename on any computer. The commands in that package are read in, just as if they had been typed in response to input prompts.

Conclusion

The rule-based paradigm in Mathematica's programming language provides surprising flexibility in supporting a range of programming styles and problem-solving strategies. It is especially suited for interactive experimentation, and allows code to be written in a way that matches closely the natural formulation of the problem to be solved.

References

Gray, Theodore and Jerry Glynn. Exploring Mathematics with Mathematica. Reading, Mass.: Addison-Wesley, 1991.

Gradshteyn, Izrail Solomonovich and Iosif Moiseevich Ryzhik. Table of Integrals, Series and Products. San Diego, Calif.: Academic Press, 1965.

Maeder, Roman E. Programming in Mathematica, Second Edition. Reading, Mass.: Addison-Wesley, 1991.

Skiena, Steven S. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Reading, Mass.: Addison-Wesley, 1990.

Wagon, Stan. Mathematica in Action. New York, N.Y.: W.H. Freeman, 1990.

Wolfram, Stephen. Mathematica: A System for Doing Mathematics by Computer, Second Edition. Reading, Mass.: Addison-Wesley, 1991.