March 1993/SSX -- Stack Swap eXecutive

Real-Time & Embedded Systems

SSX — Stack Swap eXecutive

Tom Green and Dennis Cronin

Tom Green is a UNIX Software engineer who specializes in UNIX driver development. He also writes MS-DOS, Windows and embedded 80x86 applications. He may be reached via electronic mail at tomg@cd.com.

Dennis Cronin almost completed an EE degree but got lured into the sordid world of fast computers, easy money, and loose connections. Specialties: UNIX driver development and embedded systems. He may be reached via electronic mail at denny@cd.com.

Introduction
A project we worked on recently required us to port existing disk-controller software from an 80186-based board to a new 80386-based product. The job covered not only the controller software itself but also our real-time executive and a debugger. Our existing real-time executive was written in C and assembly language, so we simply set out to port it directly to the 386. Oh yeah, we also committed to a pretty aggressive (translation: insane) schedule.
The existing executive used a software-interrupt entry point with parameters passed in registers — pretty standard stuff. Task switches were implemented in assembly language, swapping in and out registers for the new task and the running task. We used a similar interrupt interface in the protected-mode version with 80386 task gates used to swap in and out registers for tasks. After we ported the disk controller software to protected mode, we tied the whole thing together and tried to run the new software. The new software worked fine. There was one problem though — it was slower than the old version running on the 80186!
We were quite surprised (the words dismayed and panicked come to mind as well) by the results of our first test. We set out to find the problem. We discovered that our disk-controller software was doing a rather large number of task switches per second and making a pile of calls into the real-time executive.
Referring to our trusty Intel black book, we then noted that calling an 80386 task gate takes 309 clock cycles — not good. But after looking over the executive, we decided that calling 80386 task gates was not our only speed problem. The whole process of calling our executive took a lot of time. First an assembly-language routine was called from the disk-controller C-code software. The assembly-language routine would get all of the parameters passed C-style and then stuff them in registers. Next a software interrupt was executed, followed by getting the registers back onto the stack so we could call into the C portion of the executive. We also had to preserve the state of the registers so that, on return from the executive, the registers would be restored for the calling software. It should be noted that this basic approach resulted from our early experiences with a particular commercial real-time executive and its C support libraries.
After studying things, we realized we could streamline the whole process dramatically. Why not simply take advantage of the fact that a C compiler saves and restores as much context as necessary between C function calls? The compiler knows when making a call to another C routine which registers will be saved across the call. This meant that we could ditch swapping tasks with an 80386 task gate and skip the 309 cycles used to do a task switch. And switching tasks was now a simple C-language call into the linked executive — a small set of subroutines linked with the main application, instead of a remote set of services accessible only through software interrupts.
Since the basic mechanism of the task switch is now a rather trivial stack frame switch, we called the design the Stack Swap eXecutive, or SSX (Listing 1, Listing 2, and Listing 3).

Pros and Cons of SSX
There are many things to recommend SSX. The native C-language interface and frisky stack-swap task switch make it very fast and efficient. And it is very small. The small size can save precious RAM or EPROM space that a larger executive might take up. SSX is also very portable. This article uses the executive in the MS-DOS environment but it is flexible enough to use in embedded applications or on different processor families. And since SSX is a very minimal executive, it is also easy to understand, port, modify, and extend.
There are, of course, a few limitations to using SSX. SSX is written in C and works best with an application written in C or C++, although it's not too hard to call into SSX from assembly after saving a few registers. SSX must be linked with your application. If you have a number of separate programs which must share CPU time and resources, they must all be linked together. Another disadvantage of SSX is its lack of features compared with many commercially available executives.

Features of SSX
SSX is a real-time, preemptive, multitasking executive. Tasks are allowed to run until:

The task readies another task of equal or higher priority

A time slice (a clock tick) passes and there are other ready tasks of equal priority

An interrupt readies a task of equal or higher priority

The task explicitly gives up the CPU by calling the ssx_task_delay routine
Synchronization of tasks is accomplished via shared data structures of type wait_q. A task that needs to wait for an event calls ssx_wait with a wait_q as an argument. If the event has already occurred, as indicated by a message flag at the queue, the task does not get suspended but remains ready.
When another task (or interrupt) wishes to ready a sleeping task, it issues an ssx_alert to the queue. The highest priority task waiting is readied. If no task is waiting, the message flag is set.
Out of these two basic primitives you can build just about anything you could possibly need!
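As a rough illustration of the wait/alert semantics just described, here is a small stand-alone model in C. It is not the SSX source: the model_ names, the task record, the "larger number means higher priority" rule, and the choice to clear the message flag when a waiter consumes it are all assumptions made for the sketch. Only the field names mesg_flg and task_ptr are taken from the article's macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task record; not the SSX structure. */
typedef struct model_task {
    int priority;              /* assumed: larger value = higher priority */
    int ready;                 /* 1 = ready to run, 0 = suspended */
    struct model_task *next;   /* next waiter on the same queue */
} model_task;

typedef struct {
    int mesg_flg;              /* 1 = an alert arrived with nobody waiting */
    model_task *task_ptr;      /* waiters, highest priority first */
} model_wait_q;

/* Returns 1 if the caller was suspended, 0 if it consumed a pending alert. */
int model_wait(model_wait_q *q, model_task *t)
{
    model_task **pp;

    assert(q != NULL && t != NULL);
    if (q->mesg_flg) {         /* event already happened: stay ready */
        q->mesg_flg = 0;       /* assumption: the flag is consumed */
        return 0;
    }
    t->ready = 0;
    /* keep the list sorted so the head is always the best candidate */
    for (pp = &q->task_ptr;
         *pp != NULL && (*pp)->priority >= t->priority;
         pp = &(*pp)->next)
        ;
    t->next = *pp;
    *pp = t;
    return 1;
}

/* Readies the highest-priority waiter, or records the alert in the flag.
 * Returns the readied task, or NULL if nobody was waiting. */
model_task *model_alert(model_wait_q *q)
{
    model_task *t;

    assert(q != NULL);
    t = q->task_ptr;
    if (t == NULL) {
        q->mesg_flg = 1;
        return NULL;
    }
    q->task_ptr = t->next;
    t->next = NULL;
    t->ready = 1;
    return t;
}
```

An alert issued before anyone waits is remembered in the flag, so the next call to model_wait returns at once. An alert issued with two tasks queued readies the higher-priority one first.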
This model of synchronization has the advantage that it is quite efficient to implement. There is no hashing of addresses onto sleep queues or any of that kind of messiness. Moreover, it seems reasonable to assume in today's object-oriented society that if tasks are cozy enough to be synchronizing with each other, they should be well enough acquainted to share the wait_q data object.
At the risk of making our lean mean executive seem feature-laden, we also provided a primitive for waiting at a wait_q with an alarm timer set. If the event does not occur within the specified period of time, the alarm goes off and the task becomes ready again with a return value indicating the reason for the resumption.
And to round out our real-time toolkit, we have the ssx_task_delay call which simply excuses the task from execution for a specified number of clock ticks.
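The alarm-timer behavior can be pictured as a per-task tick countdown. The sketch below is a model only, under stated assumptions: the model_ names and the two reason codes are invented for illustration, and the real SSX calls and return values (see Table 1) may differ.

```c
#include <assert.h>

/* Hypothetical reason codes; the real SSX values may differ. */
enum { MODEL_WOKEN_BY_ALERT = 0, MODEL_TIMED_OUT = 1 };

typedef struct {
    int ready;             /* 1 = ready to run, 0 = suspended */
    unsigned ticks_left;   /* 0 = no alarm armed */
    int wake_reason;       /* why the task last became ready */
} model_timed_task;

/* Suspend the task and arm its alarm for the given number of clock ticks. */
void model_start_timed_wait(model_timed_task *t, unsigned ticks)
{
    assert(ticks > 0);
    t->ready = 0;
    t->ticks_left = ticks;
}

/* Called once per clock tick: expire the alarm when the count hits zero. */
void model_clock_tick(model_timed_task *t)
{
    if (!t->ready && t->ticks_left > 0 && --t->ticks_left == 0) {
        t->ready = 1;
        t->wake_reason = MODEL_TIMED_OUT;
    }
}

/* The awaited event arrived first: ready the task and cancel the alarm. */
void model_alert_task(model_timed_task *t)
{
    if (!t->ready) {
        t->ready = 1;
        t->ticks_left = 0;
        t->wake_reason = MODEL_WOKEN_BY_ALERT;
    }
}
```

A plain delay is the same countdown with no event that could cut it short.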
A crucial aspect of any real-time executive is how it is accessed from an interrupt to signal events to task-level code. SSX provides two calls which are used to frame the interrupt handler code. ssx_lock is called right after the interrupt handler saves the necessary CPU registers. ssx_lock disables task switching so that the executive doesn't try to switch out the interrupt handler before it has completed its processing. ssx_unlock is called at the end of the interrupt handler right before it restores the CPU registers. This allows task switching to take place again.
It should be noted that for extra efficiency you can skip both these calls if the interrupt handler meets the following criteria:

It only makes one ssx_alert call.

All hardware processing is done before the call to ssx_alert (e.g. the interrupt controller has been reset and any other steps necessary to clear the interrupt at the requestor have been taken).
The idea is that at that point, the interrupt thread has just become a continuation of the task that was running. The actual interrupt return can now happen at any time.
The ssx_lock and ssx_unlock calls can also be used from the task level to temporarily disable rescheduling. If a task is going to do something time-consuming enough that it doesn't want to risk masking interrupts, but can't afford to be switched out while performing the specific activities, it can call ssx_lock to lock control of the CPU. Interrupts can still be handled, but any changes they make to the task state will not be examined until the ssx_unlock call is invoked.
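One way to picture the deferred rescheduling is a lock counter plus a "switch pending" flag. This is a model under assumptions, not the SSX implementation: the model_ names are invented, and whether the real ssx_lock nests is not stated in the article.

```c
#include <assert.h>

static int model_lock_depth = 0;      /* > 0 means task switching is disabled */
static int model_switch_pending = 0;  /* a switch was requested while locked */
static int model_switch_count = 0;    /* task switches actually performed */

/* Disable task switching (assumed here to nest). */
void model_lock(void)
{
    model_lock_depth++;
}

/* Something (an alert from an interrupt, say) made a better task ready. */
void model_request_switch(void)
{
    if (model_lock_depth > 0)
        model_switch_pending = 1;     /* remember it for later */
    else
        model_switch_count++;         /* switch right away */
}

/* Re-enable switching and act on anything that was deferred. */
void model_unlock(void)
{
    assert(model_lock_depth > 0);
    if (--model_lock_depth == 0 && model_switch_pending) {
        model_switch_pending = 0;
        model_switch_count++;
    }
}
```

In this model, an alert that arrives between model_lock and model_unlock changes nothing visible until the unlock, which matches the behavior described above for interrupt handlers and locked task-level code.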
For a complete list of function calls see Table 1.

Porting SSX
SSX currently works with Borland C and C++ compilers in the small model. SSX will need porting to a different memory model or an environment other than MS-DOS. To port SSX from the MS-DOS environment to a new one you must port four functions. They are:

ssx_task_create

stack_swap

disable_ints

enable_ints
A task is created by a call to ssx_task_create before or after ssx_run is called. You must have at least one task created before calling ssx_run. When a task is created a stack space is allocated. This stack is then set up so that when the task's stack pointer is swapped in, a simple return from stack_swap will start the task off. Figure 1 shows what the newly-created task's stack looks like after being created for the MS-DOS executive.
This sets the task up to be run for the first time. The code
/* stack_swap - switch from stack of current task
 * to stack of new task
 */

LOCAL void
stack_swap(unsigned int **old_stack_ptr, unsigned int **new_stack_ptr)
{
   asm or di,di  /* fool compiler into saving */
   asm or si,si /* di and si registers */

   /* save stack ptr of old task */
   *old_stack_ptr = (unsigned int *)_SP;
   /* load stack ptr register with stack ptr */
   /* of new task */
   _SP=(unsigned int)*new_stack_ptr;
}
shows the C listing of stack_swap. On entry, the code executes two lines of in-line assembly language. These instructions make sure the compiler saves and restores the two register variables, 80X86 registers di and si. The last two instructions use the pseudo-variable _SP. With Borland C compilers _SP allows you to directly access the 80X86 sp (stack pointer) register. This part of the code stores the stack pointer of the old task and puts the stack pointer of the new task in the sp register. This is all the code has to do to switch tasks.
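For readers without a Borland compiler, the same idea can be tried on a POSIX system with the ucontext calls. This is an analogue, not a port of stack_swap: swapcontext saves and restores a full register context rather than just the stack pointer, and the sketch assumes a Linux-style system where ucontext.h is available. All names other than the ucontext API are invented for the example.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];   /* private stack for the task */
static int trace[4];                 /* order in which steps ran */
static int trace_len;

static void record(int step)
{
    assert(trace_len < 4);
    trace[trace_len++] = step;
}

/* The task: run a little, yield to main, then finish. */
static void task_body(void)
{
    record(1);
    swapcontext(&task_ctx, &main_ctx);   /* yield: save ours, resume main */
    record(3);
}

/* Returns the number of recorded steps, or -1 on failure. */
int run_swap_demo(void)
{
    trace_len = 0;
    if (getcontext(&task_ctx) != 0)
        return -1;
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;        /* resume main when the task returns */
    makecontext(&task_ctx, task_body, 0);

    if (swapcontext(&main_ctx, &task_ctx) != 0)   /* first run of the task */
        return -1;
    record(2);
    if (swapcontext(&main_ctx, &task_ctx) != 0)   /* let the task finish */
        return -1;
    record(4);
    return trace_len;
}

int trace_at(int i)
{
    return trace[i];
}
```

Control alternates between the two stacks, so the steps are recorded in the order 1, 2, 3, 4. SSX gets the same effect far more cheaply because the compiler has already saved whatever the caller needs before stack_swap runs.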
The code
stack_swap proc near

; prologue for a C function
push  bp
mov bp,sp
push  si
push  di

; this fools compiler into saving
; si and di because
; they are register variables
or   di,di
or   si,si

; save stack pointer for old task
mov bx,word ptr [bp+4]
mov word ptr [bx],sp

; load stack ptr register with stack
; ptr of new task
mov bx,word ptr [bp+6]
mov sp,word ptr [bx]

; epilogue for a C function
pop di
pop si
pop bp
ret
stack_swap endp
shows an 80X86 assembly-language listing of the C function stack_swap. In the epilogue of the function stack_swap, registers di, si, and bp are popped off the stack. We have placed these values on the task's stack during ssx_task_create. The last instruction is ret which will pop the return address off the stack and execute the function run_new_task. run_new_task enables interrupts and runs the new task by calling a function pointer. The code
/* run_new_task - starts up a new
 * task making sure interrupts are
 * enabled
 */

LOCAL void
run_new_task(void)
{
   ints_on();
   (t_current->task_ptr)();
}
shows the function run_new_task.
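The start-up frame implied by the epilogue above (three saved registers sitting beneath a return address for run_new_task) can be modeled with an ordinary array. This is a sketch inferred from the assembly listing, not the ssx_task_create source: the function names are invented, a 16-bit near return address is assumed for the small model, and the zero initial values for bp, si, and di are placeholders.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short word16;   /* one 16-bit stack cell (small model) */

/* Push one word on a downward-growing stack and return the new top. */
static word16 *push_word(word16 *sp, word16 value)
{
    *--sp = value;
    return sp;
}

/* Build the frame that the stack_swap epilogue will unwind.
 * entry_offset stands in for the near address of the start-up routine.
 * Returns the value to store as the task's saved stack pointer. */
word16 *model_build_initial_frame(word16 *stack_base, size_t words,
                                  word16 entry_offset)
{
    word16 *sp;

    assert(stack_base != NULL && words >= 4);
    sp = stack_base + words;            /* one past the highest cell */
    sp = push_word(sp, entry_offset);   /* consumed by ret */
    sp = push_word(sp, 0);              /* popped into bp */
    sp = push_word(sp, 0);              /* popped into si */
    sp = push_word(sp, 0);              /* popped into di */
    return sp;
}
```

Reading upward from the returned pointer gives di, si, bp, and then the entry address, which is exactly the order in which the epilogue pops them.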
disable_ints is a routine that gets the current state of interrupts and then disables interrupts and returns the previous state of interrupts. In the MS-DOS version of SSX this function is coded as in-line assembly language code. The pseudocode
disable_ints(void)
{
   save current state of interrupts (enabled or disabled);
   disable interrupts;
   return saved state of interrupts (positive integer if they were enabled,
      0 if disabled)
}
indicates what is necessary to port disable_ints to other environments.
enable_ints enables interrupts. In the MS-DOS version of this function we have used a Turbo C macro that places an 80X86 sti instruction in the code.
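When porting these two routines, the usual calling pattern is "save, disable, work, restore". The sketch below shows that pattern with a plain variable standing in for the CPU interrupt flag, so it can be compiled anywhere. The sim_ names and the restore helper are hypothetical; a real port would replace the flag accesses with the target's own instructions or intrinsics.

```c
#include <assert.h>

static int sim_ints_enabled = 1;   /* stand-in for the CPU interrupt flag */
static int shared_counter = 0;     /* data also touched by an interrupt */

/* Disable interrupts; return nonzero if they were enabled, 0 if not. */
int sim_disable_ints(void)
{
    int was_enabled = sim_ints_enabled;
    sim_ints_enabled = 0;
    return was_enabled;
}

void sim_enable_ints(void)
{
    sim_ints_enabled = 1;
}

/* Hypothetical helper: re-enable only if they were on before. */
void sim_restore_ints(int saved)
{
    if (saved)
        sim_enable_ints();
}

/* A critical section that is safe to call with interrupts on or off. */
void bump_counter(void)
{
    int saved = sim_disable_ints();

    assert(sim_ints_enabled == 0);
    shared_counter++;
    sim_restore_ints(saved);
}
```

Returning the previous state is what lets critical sections nest: an inner section called with interrupts already off leaves them off on exit.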

SSX Demo Code
The file demo.c (Listing 4) demonstrates many of the calls in SSX. This MS-DOS demo program sets up a timer interrupt and then creates several tasks. The timer interrupt handler uses vector 8 on the MS-DOS PC. This vector is called 18.2 times a second. Each time the interrupt routine is called ssx_clock_tick is called. This is one area of the code that is not portable. On an MS-DOS AT PC you could also use the interrupt that is called 1,024 times a second if you need more resolution than this timer supplies.
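Because the tick rate is not a whole number, delays given in seconds have to be converted with some care. The PC system timer's 18.2 Hz comes from a 1,193,182 Hz input clock divided by 65,536. The helper below is a hypothetical convenience, not part of SSX or demo.c (which may simply use a rounded constant); it rounds up so that a delay is never shorter than requested.

```c
#include <assert.h>

#define PIT_INPUT_HZ 1193182UL   /* PC timer input clock */
#define PIT_DIVISOR  65536UL     /* default divisor, giving about 18.2 Hz */

/* Convert whole seconds to timer ticks, rounding up.
 * The limit keeps the multiplication within a 32-bit unsigned long. */
unsigned long seconds_to_ticks(unsigned long seconds)
{
    assert(seconds <= 3000UL);
    return (seconds * PIT_INPUT_HZ + PIT_DIVISOR - 1) / PIT_DIVISOR;
}
```

With this rounding, one, two, and three seconds come out as 19, 37, and 55 ticks respectively.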
After the timer handler is set up, the demo program sets up several tasks. Here is a description of the tasks that are made.
Print queue tasks — These tasks increment a counter (one for each task) and print the value of the counter. These tasks use a semaphore that allows only one task at a time to call cprintf since it is not reentrant. The code
/* initialize semaphore to having
 * waiting message */
#define SET_SEMAPHORE(wqptr) (wqptr)->mesg_flg=1; (wqptr)->task_ptr=NULL

/* initialize wait_q to NULL task_ptr and no message waiting */
#define INIT_WAIT_Q(wqptr) (wqptr)->mesg_flg=0; (wqptr)->task_ptr=NULL
contains macros to set up semaphores and wait queues. Many Standard C functions are not reentrant, so keep that in mind when using C library functions. Five of these tasks are created.
Time-slice tasks — These tasks get a full time slice to increment a counter (one for each task). Since print-queue tasks have to get permission to print each time through, their counters will increment slower than these time-slice tasks. Five of these tasks are created.

Print-time-slice task — This task prints the counter values for each time-slice task once every three seconds. This task must also use our print semaphore each time it needs to print.

Keypress task — This task checks for a keypress every two seconds and calls ssx_stop if it finds one.

System-time task — This prints the system time in seconds every second. This task must also use our print semaphore each time it needs to print.
This code has been tested with Turbo C 2.0, Turbo C++ 1.0, and Borland C 3.1. To build the demo program, type:
tcc demo.c ssx.c
or
bcc demo.c ssx.c
Since this code uses in-line assembly language you will also need tasm if you are using Turbo C 2.0 or Turbo C++ 1.0.
We hope you will find this executive portable and easy to use. For many applications this will have more than enough features. If you find it is missing something you need, no big deal. Change it. It's easy. It's all in C.