The StatiC Compiler & Language

Dr. Dobb's Journal March, 2005

A dual-mode system for sequential and FSM development

By Pete Gray

Pete is a programmer who specializes in embedded systems development. He can be contacted at petegray@ieee.org.

Take one traditional sequential programming language—C, for instance—and remove the parts you don't like. Design and code a compiler based on this language, and target a popular microcontroller (making sure that the design is flexible enough to support multiple targets). Now, add Finite State Machine programming constructs as an integral part of the language. What you end up with is "StatiC," a dual-mode compiler that provides an easy migration path from the traditional sequential programming model to the inherently multitasking Finite State Machine model. Effectively, you get two languages for the price of one, along with a friendly syntax and small learning curve. Additionally, the use of command-line switches facilitates retargeting through external parsing software.

The StatiC compiler supports both traditional sequential and Finite State Machine (FSM) language methodologies, with the mode selected via a command-line switch. Dual-methodology support lets you code using an identical syntax (but different language constructs) in either the "generic" sequential mode or the inherently multitasking FSM mode. The sequential language is based on the familiar language constructs of C, Basic, and Pascal, but with a unified and simplified syntax. The FSM language is the same as the sequential language, except that it has additional FSM extensions. In this article, I don't specifically address Finite State Machines, as this topic has been covered in DDJ in the past. Rather, I deal with the implementation of the concept as it applies to the StatiC language.

The compiler has been designed from scratch, specifically for the embedded domain, and includes the features required to support both the sequential and FSM modes of operation. In addition, the languages themselves have been enhanced to remove "clutter" (such as ambiguous operators and symbols) found in other languages, as well as incorporating some features more suited for embedded software development.

The StatiC compiler can be hosted under Windows or Linux, and currently targets the Motorola DSP56F80x microcontrollers. These controllers, with their dual Harvard architecture, JTAG flash capabilities, and a wide variety of interface modules, are particularly well suited for the embedded/robotics domain. A demo version of the compiler is available at http://petegray.newmicros.com/static/ and from DDJ (see "Resource Center," page 5).

StatiC compiler operation is controlled via parameters and switches, which are invoked like this:

static sourcename [switches]

By convention, StatiC FSM-mode programs have a filetype of .fsm and non-FSM programs have a filetype of .nsm.

Table 1(a) lists the switches that control compiler operation. For example, to produce CodeWarrior-style assembly language for myprog.nsm, enter:

static myprog.nsm -a568cw

Output is placed in the file clist.asm. Table 1(b) lists additional switches for compiler development and debugging.

If the assembler output is unspecified (that is, you don't use the -a switch), the compiler generates descriptive, nonoptimized assembler, making it possible for an external program to parse the output and produce assembler for a completely different target. The default compiler mode is sequential. In addition, if the compiler is invoked without specifying a source, it begins an interactive session.

FSM-Specific Language Topics

FSM mode allows the use of Finite State Machine constructs, which are inherently multitasking. An application typically consists of multiple machines that—at any point in time—exist in a particular state. A good analogy would be that a machine is like a thread running in a process, or a machine is like a program running in a multitasking operating system.

The implementation of the FSM methodology requires that you list the allowable states of the machines in an application, define the conditions under which a machine state transition occurs, and declare the name and initial state of each machine. Each state and each machine has a unique name (they are, after all, identifiers).

State transitions are used to determine and execute a machine transition from one state to another, and achieve this through the assignment of the reserved word nextstate, optionally executing additional code during the transition.

Due to the nature of state machines, a transition may not include loops or calls (that is, anything that may cause a transition to "hang"). This apparent limitation really isn't limiting; rather, it encourages you to produce code that is more appropriate to the state machine programming paradigm. The compiler automatically generates a high-speed, minimal-overhead context-switching mechanism based on an application's machine chain. This context switch examines the state transition conditions of each machine in a round-robin fashion, performing a machine state transition only when the transition conditions have been satisfied.
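For readers who think in C, the round-robin idea can be sketched roughly as follows. This is only an illustration of the concept, not the code StatiC generates; every name in it (Machine, Transition, poll_machines, and the RUNNING/STOPPED demo states) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a machine chain; not StatiC's generated code. */
enum State { RUNNING, STOPPED, NUM_STATES };

typedef struct {
    enum State state;   /* current state of this machine */
    int remaining;      /* demo data: units of work left */
} Machine;

/* One transition per state: a condition and the code it "causes". */
typedef struct {
    int (*condition)(const Machine *m);   /* nonzero when the transition may fire */
    enum State (*causes)(Machine *m);     /* short, non-blocking; returns next state */
} Transition;

static int has_work(const Machine *m) { return m->remaining > 0; }
static int never(const Machine *m)    { (void)m; return 0; }

static enum State do_work(Machine *m)
{
    m->remaining -= 1;
    return m->remaining > 0 ? RUNNING : STOPPED;
}

static enum State stay_stopped(Machine *m) { (void)m; return STOPPED; }

static const Transition transitions[NUM_STATES] = {
    [RUNNING] = { has_work, do_work },
    [STOPPED] = { never,    stay_stopped },
};

/* Visit every machine once, in order, and fire a transition only when its
   condition holds. Returns the number of transitions fired in this pass. */
size_t poll_machines(Machine *chain, size_t count)
{
    size_t fired = 0;
    assert(chain != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        const Transition *t = &transitions[chain[i].state];
        if (t->condition(&chain[i])) {
            chain[i].state = t->causes(&chain[i]);
            fired++;
        }
    }
    return fired;
}
```

With a chain of two machines holding 1 and 3 units of work, the first pass fires both transitions, the next two passes fire only the second machine, and every later pass fires none. Because no transition body loops or blocks, each pass is bounded, which is the point of the restriction described above.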

The demo version of the StatiC compiler has limited FSM capabilities—two machines and 12 states—which is enough to compile the FSM mode example program.

Program Structure: Sequential Mode

The structure of a typical sequential program looks like this:

Comments
Directives
Global Variable and Constant Declarations
Procedure Block(s)
Program Block

All items, except the Program Block, are optional. Comments can appear anywhere. Most Directives can appear anywhere. Global Variables and Constants must appear prior to being referenced (that is, referenced from a Procedure or the Program). Procedure Blocks must appear before the Program Block.

The Procedure and Program Block Structures follow. Optional elements are shown between square brackets ([ and ]). The Procedure Block Structure defines the name, parameters, and code of a callable routine; see Example 1(a). The Program Block Structure defines the code of the main program; see Example 1(b).

Listing One, a complete sequential mode StatiC program, performs simple terminal I/O and lets users turn LEDs on/off. The target system is New Micros's PlugaPod (http://www.newmicros.com/), which is based on Motorola's DSP56F803 digital-signal processor. This program displays instructions, then turns the LEDs on/off, depending upon what users type at the keyboard.

From within the program block, you see the statement:

word ichar 1

which declares a one-word variable, ichar, which is local to the program block. Next, the statement

^SCI0BR = 260 // baud 9600

loads the Serial Communications Interface (SCI) baud rate register with the value 260, which sets the baud rate of the chip's SCI module to 9600. The statement works this way because I defined SCI0BR, near the top of the program, to be $0F00, the address (in Hex) of the baud-rate register for the PlugaPod, and I use the "^" operator. This could be thought of as meaning "load the contents of $0F00 with 260." In StatiC, the same statement could have been written like this:

^$0F00 = 260

which would have achieved the same thing. However, it's good practice to substitute definitions for register addresses because the registers do not always have the same address within the same family of chips. Using definitions means that if you port your code to another chip, which doesn't have the same register address as the original, you'll only need to change the program in one place—in the #define directive. Besides, SCI0BR is a little more meaningful than $0F00 to someone reading or maintaining the code.
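A rough C analogy may help here. The sketch below uses a simulated data memory so that it runs on a PC; the REG macro plays the role of StatiC's "^" operator, and SCI0BR carries the same address as the #define in the listing. The divisor helper assumes a divide-by-16 SCI clocking scheme and a 40-MHz peripheral clock, neither of which is stated in this article, so treat those numbers as assumptions that happen to reproduce the value 260.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated 16-bit data memory so the sketch runs on a PC. On real hardware
   the macro would dereference a volatile pointer to the address instead. */
static uint16_t data_memory[0x1000];

#define REG(addr) (data_memory[(addr)])   /* stands in for StatiC's ^ operator */

#define SCI0BR 0x0F00u   /* same address as the listing's #define */

/* Divisor for an assumed divide-by-16 SCI clocking scheme
   (an assumption for illustration, not taken from the article). */
uint16_t sci_divisor(uint32_t clock_hz, uint32_t baud)
{
    assert(baud != 0);
    return (uint16_t)(clock_hz / (16u * baud));
}

/* Equivalent in spirit to: ^SCI0BR = 260 */
void set_baud(uint32_t clock_hz, uint32_t baud)
{
    REG(SCI0BR) = sci_divisor(clock_hz, baud);
}
```

Calling set_baud(40000000, 9600) stores 260 at the simulated address $0F00. Porting to a chip with a different register map would mean changing only the SCI0BR definition, which is the maintenance benefit described above.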

The program then sets the various general-purpose input/output (GPIO) line-control registers, which are tied to the LEDs on the PlugaPod. Next, the statement:

call sci0output (@msg1) // display message

calls the SCI output routine, passing the address of msg1 as the parameter. The constant msg1 is a null-terminated string, and sci0output is coded to process the string passed to it, displaying the characters (via the SCI) to the terminal.
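The shape of sci0output translates almost line for line into C. The sketch below is an approximation that runs on a PC: the status register is simulated as always ready, and the "data register" simply records what was sent. The sim_ names are hypothetical stand-ins for hardware, and the $C000 mask is the one tested in Listing One.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_READY 0xC000u   /* status bits the listing tests before sending */

/* Simulated SCI (hypothetical): always ready, and it records what was sent. */
static volatile uint16_t sim_sci0sr = TX_READY;
static char   sim_sent[64];
static size_t sim_sent_len;

static void sim_write_data(char ch)
{
    if (sim_sent_len < sizeof sim_sent) {
        sim_sent[sim_sent_len++] = ch;
    }
}

/* Send a null-terminated string, one character at a time. */
void sci0output(const char *optr)
{
    assert(optr != NULL);
    while (*optr) {                                    /* while ^optr        */
        while ((sim_sci0sr & TX_READY) != TX_READY) {  /* wait until ready   */
        }
        sim_write_data(*optr);                         /* ^SCI0DR = ^optr    */
        optr = optr + 1;                               /* advance the pointer */
    }
}
```

Passing a string literal in C corresponds to passing @msg1 in StatiC: in both cases the routine receives an address and walks it until it reads a zero.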

The program then enters a never-ending loop, reading the keystrokes and adjusting the LEDs accordingly. The statement:

call sci0input (@ichar)

calls the SCI input routine, passing the address of the local variable ichar as the parameter. The sci0input routine is coded to wait for keyboard input and return what was typed in the parameter passed to it.

Next, the character returned from the input routine is tested, and the LEDs are adjusted. The statement:

if ( ichar = '1' ) ^PADR = ^PADR | $0004 endif

performs a bitwise OR on the contents of the GPIO data register (PADR = Port A Data Register) if the user types "1" at the keyboard. ORing the data register with $0004 sets bit 2 high, which turns the green LED on.
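The six tests in Listing One follow one pattern: OR with a single-bit mask to turn an LED on, AND with the complementary 8-bit mask to turn it off. A C sketch of that pattern, using the same masks as the listing, might look like this; apply_key and replay_keys are hypothetical helper names, and the register is modeled as a plain value rather than real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the new data-register value after one keystroke.
   Note that the AND masks are 8 bits wide, so in this C model
   they also clear bits 8 to 15. */
uint16_t apply_key(uint16_t padr, char key)
{
    switch (key) {
    case '1': return (uint16_t)(padr | 0x0004u);   /* green on:   set bit 2   */
    case '2': return (uint16_t)(padr & 0x00FBu);   /* green off:  clear bit 2 */
    case '3': return (uint16_t)(padr | 0x0002u);   /* yellow on:  set bit 1   */
    case '4': return (uint16_t)(padr & 0x00FDu);   /* yellow off: clear bit 1 */
    case '5': return (uint16_t)(padr | 0x0001u);   /* red on:     set bit 0   */
    case '6': return (uint16_t)(padr & 0x00FEu);   /* red off:    clear bit 0 */
    default:  return padr;                         /* other keys: no change   */
    }
}

/* Usage: replay a string of keystrokes against an all-off register. */
uint16_t replay_keys(const char *keys)
{
    uint16_t padr = 0;
    assert(keys != NULL);
    while (*keys) {
        padr = apply_key(padr, *keys);
        keys++;
    }
    return padr;
}
```

For example, the keystrokes "1352" turn all three LEDs on and then the green one off, leaving $0003 in the modeled register.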

Finally, sci0input waits until the SCI status register (SCI0SR) indicates that a character has been received, then stores the character at the address held in rchar:

ostat = ^SCI0SR
while ( ostat & $3000 ) <> $3000
  ostat = ^SCI0SR
endwhile
^rchar = ^SCI0DR

Recall that I passed @ichar to the routine, and the formal declaration of the routine was:

procedure sci0input (rchar)

so the statement:

^rchar = ^SCI0DR

actually stores the contents of the SCI data register (SCI0DR) into ichar.
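In C terms, "@" corresponds to the address-of operator and "^" to a pointer dereference. The following sketch mirrors sci0input under that reading; the two sim_ variables are hypothetical stand-ins for the SCI status and data registers, preset so that the polling loop finds a character waiting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_READY 0x3000u   /* status bits the listing tests before reading */

/* Simulated SCI registers (hypothetical): a character is already waiting. */
static volatile uint16_t sim_sci0sr = RX_READY;
static volatile uint16_t sim_sci0dr = 'A';

/* Wait for a character, then store it where the caller asked. */
void sci0input(uint16_t *rchar)
{
    assert(rchar != NULL);
    while ((sim_sci0sr & RX_READY) != RX_READY) {   /* poll the status register */
    }
    *rchar = sim_sci0dr;                            /* ^rchar = ^SCI0DR */
}

/* Usage mirroring the program block:
   word ichar 1  /  call sci0input (@ichar) */
uint16_t read_one(void)
{
    uint16_t ichar = 0;
    sci0input(&ichar);
    return ichar;
}
```

The routine never sees the name ichar; it only receives an address, which is why the store through rchar ends up in the caller's local variable.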

Program Structure: FSM Mode

The structure of a typical FSM program looks like this:

Comments
Directives
Global Variable and Constant Declarations
Procedure Block(s)
State List
Transition Block(s)
Machine Definitions
Program Block

Many of the components of an FSM program structure are also present in the sequential program structure. The extensions required for FSM mode are the State List, Transition Blocks, and Machine Definitions, which must appear in that order.

The State List simply lists the allowable application machine states:

statelist statename1 statename2 ...

The Transition Block Structure defines the conditions required for a state change and the actions to perform when those conditions are met.

transition statename
begin
  condition expression
  causes
    statements
  endcondition
end

Finally, Machine Definitions lists the machines in the application, and defines their stack space and initial state. Each machine has its own stack space, and the compiler automatically initializes—and keeps track of—the stack pointer for each machine.

machine machinename stacksize initialstate

Listing Two, a complete FSM mode StatiC program, performs simple terminal I/O. Characters entered on the keyboard are received by the microcontroller and echoed on a PC running a terminal emulator. Again, the target system is New Micros's PlugaPod, although this example also runs on the 805 chip (New Micros's IsoPod) and, with modification to the SCI register addresses, the 807 (ServoPod).

First, notice that this application consists of two machines—inputmachine and outputmachine. The main part of the program, the Program Block:

^SCI0BR = 260 // baud rate 9600
^SCI0CR = 12 // 8N1
call sci0output (@msg1) // display welcome message
appstate = APPSTATEINPUT // the initial app state

sets up the Serial Communications Interface (SCI), displays a message, and sets the global variable appstate to be APPSTATEINPUT. This application is designed in such a way that the two machines are cooperative, and the setting of appstate determines their transition to/from one state to another. Machines don't have to behave this way, but it's useful, for demonstration purposes.

Once the program block has been executed, all machines are activated. That is to say, they're put into their "initial state" as determined by the machine definition statements:

machine inputmachine 10 waitappinput
machine outputmachine 10 waitappoutput

inputmachine is put into waitappinput state, and outputmachine is put into waitappoutput state. Once in these states, they remain in these states until the state transition conditions have been satisfied. So, inputmachine is initially in waitappinput state, which is described in the transition block, thus:

transition waitappinput
begin
  condition appstate = APPSTATEINPUT
  causes
    nextstate = waitinput
  endcondition
end

Because appstate was set to APPSTATEINPUT in the main program block, inputmachine's transition condition is satisfied right away. This causes inputmachine to change state to waitinput.

Also, you'll notice that outputmachine's initial state is waitappoutput, which is described in the transition block, thus:

transition waitappoutput
begin
  condition appstate = APPSTATEOUTPUT
  causes
    nextstate = doappoutput
  endcondition
end

Unlike inputmachine, outputmachine's transition condition has not been satisfied, so no state change takes place, and outputmachine remains in the waitappoutput state.

At this point in time, outputmachine is waiting for its transition condition to be satisfied, and inputmachine has changed state to waitinput. So, looking at the waitinput transition block:

transition waitinput
begin
  condition ( ^SCI0SR & $3000 ) = $3000
  causes
    appchar = ^SCI0DR
    appstate = APPSTATEOUTPUT
    nextstate = waitappinput
  endcondition
end

inputmachine remains in this waitinput state until a key is pressed at the keyboard. Meanwhile, outputmachine is still waiting for its transition conditions to be satisfied.

When a key is pressed, inputmachine's transition conditions are satisfied, a character is read from the SCI data buffer into the global variable appchar, the appstate is set to APPSTATEOUTPUT, and inputmachine performs a state change back to waitappinput.

At this point, outputmachine's state transition conditions have been satisfied (because appstate was set to APPSTATEOUTPUT by inputmachine), so outputmachine experiences a state change from waitappoutput to doappoutput. Looking at the doappoutput transition block:

transition doappoutput
begin
  condition ( ^SCI0SR & $C000 ) = $C000
  causes
    ^SCI0DR = appchar
    appstate = APPSTATEINPUT
    nextstate = waitappoutput
  endcondition
end

The outputmachine waits until the SCI is ready to send, then it loads the SCI data register with the global variable appchar, sets the appstate to APPSTATEINPUT, and performs a state change back to waitappoutput. While this is all happening, inputmachine does nothing, because its state transition conditions have not been satisfied.

At this point in time, both machines are back in their initial states, and the whole cycle starts again.
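The whole cycle can be traced on a PC with a small C model. The sketch below keeps the state and variable names from Listing Two but is otherwise an assumption-laden approximation: the SCI is reduced to a few simulated fields (rx_ready, rx_data, tx_ready, tx_data, tx_count), the App structure and the app_ functions are hypothetical, and each call to app_poll stands for one round-robin pass over the two machines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { APPSTATEINPUT = 1, APPSTATEOUTPUT = 2 };
enum InState  { WAITAPPINPUT, WAITINPUT };
enum OutState { WAITAPPOUTPUT, DOAPPOUTPUT };

typedef struct {
    int appstate;             /* global handshake variable from Listing Two */
    uint16_t appchar;         /* character passed between the machines      */
    enum InState in_state;    /* inputmachine's current state               */
    enum OutState out_state;  /* outputmachine's current state              */
    int rx_ready;             /* simulated SCI: a character has arrived     */
    uint16_t rx_data;
    int tx_ready;             /* simulated SCI: transmitter is free         */
    uint16_t tx_data;
    int tx_count;             /* how many characters were echoed            */
} App;

void app_init(App *a)
{
    assert(a != NULL);
    a->appstate = APPSTATEINPUT;      /* set by the program block */
    a->appchar = 0;
    a->in_state = WAITAPPINPUT;       /* initial states from the machine lines */
    a->out_state = WAITAPPOUTPUT;
    a->rx_ready = 0;
    a->rx_data = 0;
    a->tx_ready = 1;
    a->tx_data = 0;
    a->tx_count = 0;
}

static void step_input(App *a)
{
    switch (a->in_state) {
    case WAITAPPINPUT:
        if (a->appstate == APPSTATEINPUT) a->in_state = WAITINPUT;
        break;
    case WAITINPUT:
        if (a->rx_ready) {
            a->appchar = a->rx_data;
            a->rx_ready = 0;          /* simulation: reading consumes the character */
            a->appstate = APPSTATEOUTPUT;
            a->in_state = WAITAPPINPUT;
        }
        break;
    }
}

static void step_output(App *a)
{
    switch (a->out_state) {
    case WAITAPPOUTPUT:
        if (a->appstate == APPSTATEOUTPUT) a->out_state = DOAPPOUTPUT;
        break;
    case DOAPPOUTPUT:
        if (a->tx_ready) {
            a->tx_data = a->appchar;
            a->tx_count++;
            a->appstate = APPSTATEINPUT;
            a->out_state = WAITAPPOUTPUT;
        }
        break;
    }
}

/* One round-robin pass: each machine gets one chance to change state. */
void app_poll(App *a)
{
    assert(a != NULL);
    step_input(a);
    step_output(a);
}
```

In this model, the first pass moves inputmachine to waitinput, passes with no key pressed change nothing, and once a character arrives it takes two passes for the character to be echoed and for both machines to return to their initial states.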

Why FSM?

At this point, you may be wondering why anyone would want to code like this. The answer is that it's inherently multitasking. For example, say that you've coded the previous example and want to have the application monitor the pH level of the water in a fishtank (via the ADC), then set a GPIO line high (triggering an alarm) if the reading goes above a certain point. All you have to do is add another machine. Want to send PWM signals to activate a servo that opens a feeding tray? Add another machine.

There's no difficult "where do I put this new code so that the existing code still works?"—the machines run independently from each other (unless, of course, you deliberately design them to be cooperative). You could even run multiple machines on the same chip, which perform functions for more than one application; for instance, monitor a fishtank and monitor a home-security system.

You simply create machines, as required, to perform the tasks you desire. Each machine runs and changes state when its transition conditions are satisfied. All of the machines you define are running at the same time—the same as a multitasking operating system—and performing whatever function you've designed them to do. This is the true power of FSM programming.
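To make the "add another machine" claim concrete, here is a hedged C sketch of the fishtank idea. Everything in it is hypothetical: the World structure, the threshold of 80, and the simulated ADC reading and alarm line are invented for illustration, and echo_machine is only a placeholder for whatever the existing application already does. The point is that the new machine is one more entry in the chain, and the existing entry is not edited.

```c
#include <assert.h>
#include <stddef.h>

enum PhState { PH_WATCHING, PH_ALARMED };
enum { PH_LIMIT = 80 };      /* hypothetical raw ADC threshold */

typedef struct {
    int echo_polls;          /* stand-in for the existing application's work */
    int adc_reading;         /* simulated ADC value                          */
    int alarm_line;          /* simulated GPIO line, 1 = alarm raised        */
    enum PhState ph_state;   /* state of the newly added machine             */
} World;

typedef void (*MachineStep)(World *w);

/* Existing machine: unchanged when the new one is added. */
static void echo_machine(World *w) { w->echo_polls++; }

/* New machine: one transition, fired when the reading exceeds the limit. */
static void ph_machine(World *w)
{
    if (w->ph_state == PH_WATCHING && w->adc_reading > PH_LIMIT) {
        w->alarm_line = 1;
        w->ph_state = PH_ALARMED;
    }
}

static const MachineStep machine_chain[] = {
    echo_machine,
    ph_machine,              /* the only line added to the chain */
};

/* One round-robin pass over every machine in the chain. */
void poll_chain(World *w)
{
    assert(w != NULL);
    for (size_t i = 0; i < sizeof machine_chain / sizeof machine_chain[0]; i++) {
        machine_chain[i](w);
    }
}
```

The existing machine keeps being polled at the same rate whether or not the alarm has fired, which mirrors the independence described above.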

Conclusion

My goal with StatiC was to create a dual-methodology language that is easy to learn and use, yet advanced enough to perform multitasking in embedded environments. It had to be something that made rapid application development a reality, and not just an overused marketing phrase. But most of all, it had to be a language that I, as an experienced software developer, would want to use, as a matter of preference, over any other language available in the domain. The StatiC language and compiler meet, and in some ways exceed, that goal. I'm surprised by what can be achieved using a relatively simple language, which just goes to show that sometimes the best solution to a complex problem is a simple one.

Acknowledgments

New Micros (http://www.newmicros.com/) produces inexpensive DSP56F80x microcontroller boards (IsoPod, ServoPod, MiniPod, and PlugaPod), as well as the JTAG cables. I'd like to thank Randy M. Dumse and Jack Crenshaw for their support and guidance. All compiler development was performed on a homemade P4 WXP box and an IBM 300PL running Linux RH9. I'm currently developing support for the Atmel AVR series of microcontrollers, as well as additional language features.

DDJ

Listing One

// port A definitions for GPIO (LEDs)
#define PAPUR   $0FB0
#define PADR    $0FB1
#define PADDR   $0FB2
#define PAPER   $0FB3
#define PAIAR   $0FB4
#define PAIENR  $0FB5
#define PAIPOLR $0FB6
#define PAIPR   $0FB7
#define PAIESR  $0FB8

// SCI0 definitions for terminal (RS232) interface
#define SCI0BR  $0F00
#define SCI0CR  $0F01
#define SCI0SR  $0F02
#define SCI0DR  $0F03

// constants - the welcome message
const msg1 "LEDs on/off 1/2=Green 3/4=Yellow 5/6=Red."
const msge 13,10,0

// output a null-terminated string to SCI0
procedure sci0output (optr)
begin
  word ostat 1

  while ^optr
    ostat = ^SCI0SR
    while ( ostat & $C000 ) <> $C000
      ostat = ^SCI0SR
    endwhile
    ^SCI0DR = ^optr
    optr = optr + 1
  endwhile
end

// read a character from SCI0
procedure sci0input (rchar)
begin
  word ostat 1

  ostat = ^SCI0SR
  while ( ostat & $3000 ) <> $3000
    ostat = ^SCI0SR
  endwhile
  ^rchar = ^SCI0DR
end

// the main program
program
begin
  word ichar 1

  ^SCI0BR = 260              // baud 9600
  ^SCI0CR = 12               // 8N1
  ^PAIAR = 0                 // enable LEDs
  ^PAIENR = 0
  ^PAIPOLR = 0
  ^PAIESR = 0
  ^PAPER = $00F8
  ^PADDR = $0007
  ^PAPUR = $00FF
  call sci0output (@msg1)    // display message
  ^PADR = 0                  // LEDs off
  while 1                    // loop forever
    call sci0input (@ichar)
    if ( ichar = '1' ) ^PADR = ^PADR | $0004 endif
    if ( ichar = '2' ) ^PADR = ^PADR & $00FB endif
    if ( ichar = '3' ) ^PADR = ^PADR | $0002 endif
    if ( ichar = '4' ) ^PADR = ^PADR & $00FD endif
    if ( ichar = '5' ) ^PADR = ^PADR | $0001 endif
    if ( ichar = '6' ) ^PADR = ^PADR & $00FE endif
  endwhile

end

Listing Two

// definitions for SCI (RS232)
#define SCI0BR  $0F00
#define SCI0CR  $0F01
#define SCI0SR  $0F02
#define SCI0DR  $0F03

// global variables
word appstate 1
word appchar 1

// application control definitions
#define APPSTATEINPUT 1
#define APPSTATEOUTPUT 2

// constants
const msg1 "StatiC FSM SCI Demo Ready."
const msg2 13,10,0

// the application states
statelist waitappinput waitinput waitappoutput doappoutput

// the transitions
transition waitappinput
begin
  condition appstate = APPSTATEINPUT
  causes
    nextstate = waitinput
  endcondition
end

transition waitinput
begin
  condition ( ^SCI0SR & $3000 ) = $3000
  causes
    appchar = ^SCI0DR
    appstate = APPSTATEOUTPUT
    nextstate = waitappinput
  endcondition
end

transition waitappoutput
begin
  condition appstate = APPSTATEOUTPUT
  causes
    nextstate = doappoutput
  endcondition
end

transition doappoutput
begin
  condition ( ^SCI0SR & $C000 ) = $C000
  causes
    ^SCI0DR = appchar
    appstate = APPSTATEINPUT
    nextstate = waitappoutput
  endcondition
end

// define the machines
machine inputmachine 10 waitappinput
machine outputmachine 10 waitappoutput

// a procedure used at start-up, to display welcome message
procedure sci0output (optr)
begin
  word ostat 1

  while ^optr
    ostat = ^SCI0SR
    while ( ostat & $C000 ) <> $C000
      ostat = ^SCI0SR
    endwhile
    ^SCI0DR = ^optr
    optr = optr + 1
  endwhile
end

// the main program
program
begin
  ^SCI0BR = 260                // baud rate 9600
  ^SCI0CR = 12                 // 8N1
  call sci0output (@msg1)      // display welcome message
  appstate = APPSTATEINPUT     // the initial app state
end

// at this point, all of the defined machines are 'running'
