Threads:
The Modula-3 Systems Journal. Issue 3. Fall 1997 |
|
Copyright © 1997 Critical Mass, Inc. All Rights Reserved. |
We are pleased to bring you the third issue of Threads, The Modula-3 Systems Journal. By publishing Threads we hope to establish a forum for discussion about Modula-3 and about what various industrial and academic organizations are doing with Modula-3. The articles are intended to be accessible to both currently active and potential Modula-3 users. We hope to invite those who now use other programming languages give Modula-3 a try, too.
Production and distribution of Threads is sponsored by Critical Mass, Inc.
We welcome your ideas and contributions in shaping the future of Threads. We
imagine that Threads will change with your input over the next few issues.
Please drop us a note at
threads@cmass.com
.
This issue of Threads is also available in PDF format at:
https://www.cmass.com/threads/3/threads3.pdf
.
Feature Article
Critical Mass JVM: Modula-3 Befriends Java by Farshad Nayeri and Blair MacIntyre
JVM, Critical Mass's implementation of the Java Virtual Machine, is written entirely in Modula-3. In this article, Farshad Nayeri and Blair MacIntyre describe how they used JVM to build an industrial, Java-extensible serial port controller in Modula-3.Advanced Topics
Low-Level Systems Programming with Modula-3 by Marc E. Fiuczynski, Wilson C. Hsieh, Emin Gün Sirer, Przemyslaw Pardyak, and Brian N. Bershad
The SPIN group at the University of Washington has been using Modula-3 since 1995 as both the kernel and extension language for the SPIN operating system. In this article, they describe the set of additions they have made to extend Modula-3's support for low-level systems programming.Continuing Thread
An Object-Oriented Implementation of Unification Closure by Allan Heydon
In this third article in a series about Juno-2, Allan Heydon describes the design of three object types used in the Juno-2 constraint solver to illustrate the use of objects, subclassing, method overriding, and partial revelation in Modula-3.Modula-3 in Academia
Aachen University of Technology, Germany by Peter Klein
Aachen University of Technology, in Aachen, Germany has been using Modula-3 successfully for a number of academic and research projects since 1993. Peter Klein, a member of the CS Department III, reports on some of their experiences.
JVM, Critical Mass's implementation of the Java Virtual Machine, is written entirely in Modula-3. In this article, Farshad Nayeri and Blair MacIntyre describe how they used JVM to build a Java-extensible serial port controller in Modula-3.The success of Java has popularized many of the concepts that Modula-3 has been promoting for years, particularly the combination of garbage collection, exception handling, objects, and threads. Given the similarities between the two languages, and the strength of Modula-3 as a systems programming language, it is possible to exploit this common run-time infrastructure and build a Java virtual machine (JavaVM) implementation in Modula-3. Such an implementation will be better-structured than the equivalent C-based implementation since much of the hardest work, such as multi-threaded garbage collection, has already been done.
This is exactly how Critical Mass JVM was conceived. Adding to the impressive list of systems built in Modula-3, such as operating systems, network object systems, windowing systems, compilers, internet distribution and commerce systems, we can now point to Critical Mass's implementation of the JavaVM as another example of the power of Modula-3 for building systems that work.
What's unique about JVM is that it's written entirely in Modula-3, and as such it carries many of the benefits of Modula-3's mature, well-designed infrastructure. In JVM, a single run-time system supports both Java and Modula-3, providing an unparalleled level of integration between the two languages.
Like most other Modula-3 packages, JVM is implemented as a "component" (or in old-fashioned terms, as a reusable library.) Therefore, you can easily integrate JVM within new or existing applications written in any language, including Modula-3, C, or C++. In addition, JVM is designed so that it can be easily reconfigured and extended. The internal interfaces to JVM are available to the integrator, allowing his application to be integrated more tightly with JVM. JVM makes for a great migration path to Java, as well as a nice testbed for extending, enhancing, and researching the JavaVM. If you like both Java and Modula-3, then JVM is a great way of mixing them together!
Integration of the runtime systems is exactly how Critical Mass JVM was built. In the process of building JVM, portions of the Critical Mass Modula-3 runtime were redesigned to add support for some useful extensions, such as mapping runtime errors into exceptions, unicode characters, and dynamic loading. These features were used in turn to implement the JVM. This article won't go into the Modula-3 runtime extensions; they are likely to be the subject of a future Threads article.
The resulting system is quite useful for building large, intricate systems that require tight integration with Java. Indeed, Critical Mass JVM uses an identical layout for the fields of its objects as Critical Mass Modula-3, allowing you to assign values to fields of Java objects from your Modula-3 code. That is, given a Java object, it is possible to write a Modula-3 object definition that allows you to access the fields of the Java object directly from Modula-3. Methods, on the other hand, are not mapped between languages automatically, because the object models of Java and Modula-3 are sufficiently different that such a mapping either would have to unify both object models, or it would have to support a subset of each language. Neither one of these solutions seem useful. However, you can bind Modula-3 procedures as native Java methods via calls to the JVM runtime, allowing Modula-3 code to be called directly from Java. Conversely, Java methods can be called from Modula-3 by name.
SerialPort
interface, which is distributed as part of the Reactor
system:
INTERFACE SerialPort; IMPORT OSError, Pathname, Terminal;
SerialPort.T
represents a serial communications device. It
is a subtype of File.T
that can be read or written. Various
configuration details of the underlying device, such as its baud rate
and parity, can be read or written.
TYPE T <: Public; Public = Terminal.T OBJECT METHODS get_config (): Config RAISES {OSError.E}; set_config (READONLY config: Config) RAISES {OSError.E}; END;
Given an open serial port s
, s.get_config()
returns the current configuration of the underlying device.
s.set_config(cfg)
configures the underlying device to
correspond to cfg
.
TYPE Config = RECORD baud_rate : BaudRate; data_bits : DataBits; stop_bits : StopBits; parity : Parity; DTR_mode : DTR; RTS_mode : RTS; END; TYPE BaudRate = {BR75, BR110, BR300, BR600, BR1200, BR2400, BR4800, BR9600, BR14400, BR19200, BR38400, BR56000, BR128000, BR256000}; DataBits = {DB8, DB7, DB6, DB5}; StopBits = {SB1, SB15, SB2}; Parity = {None, Odd, Even, Mark, Space}; DTR = {Disabled, Enabled, Handshake}; RTS = {Disabled, Enabled, Handshake, Toggle}; CONST InitialConfig = Config { BaudRate.BR9600, DataBits.DB8, StopBits.SB1, Parity.None, DTR.Disabled, RTS.Disabled }; PROCEDURE Open (p: Pathname.T): T RAISES {OSError.E};This procedure opens the serial device named
p
and returns aSerialPort.T
that can read and write the device. The initial configuration of the result is set toInitialConfig
.
END SerialPort.
Note that this Modula-3 serial port interface itself has nothing to do
with Java; it provides portable serial port access to Modula-3 programs
on Windows and Unix. A SerialPort.T
is a subtype of
File.T
; this is a nice property because you can mix and
match serial ports with other Modula-3 libraries. For example, you can
create a standard reader or writer for a serial port.
SerialPort
interface in
Java. Despite the similarities, note the number of differences between the
structure of these two "interfaces":
package cmass; // SerialPort is a JVM native file descriptor that // maps to a serial port of the system where JVM runs public class SerialPort { /* File Descriptor - handle to the open file */ private java.lang.io.FileDescriptor fd; /* Opens the specified serial port. */ private native void open(String name) throws java.io.IOException; /* Closes the serial port. */ private native void close() throws java.io.IOException; /* * Creates a SerialPort file with the specified system dependent file name. * Throws FileNotFoundException if the file is not found. */ public SerialPort(String name) throws java.io.FileNotFoundException { try { this.fd = new FileDescriptor(); this.open(name); } catch (IOException e) { throw new java.io.FileNotFoundException(name); } } /* Constants for configuring the serial port. */ /* The baud rate. */ public static final short BR75 = 0; public static final short BR110 = 1; public static final short BR300 = 2; public static final short BR600 = 3; public static final short BR1200 = 4; public static final short BR2400 = 5; public static final short BR4800 = 6; public static final short BR9600 = 7; public static final short BR14400 = 8; public static final short BR19200 = 9; public static final short BR38400 = 10; public static final short BR56000 = 11; public static final short BR128000 = 12; public static final short BR256000 = 13; /* the number of data bits */ public static final short DB8 = 0; public static final short DB7 = 1; public static final short DB6 = 2; public static final short DB5 = 3; /* the number of stop bits */ public static final short SB1 = 0; public static final short SB15 = 1; public static final short SB2 = 2; /* the parity */ public static final short PARITY_NONE = 0; public static final short PARITY_EVEN = 1; public static final short PARTIY_ODD = 2; public static final short PARITY_MARK = 3; public static final short PARITY_SPACE = 4; /* DTR line control */ public static final short DTR_DISABLED = 0; public static final short DTR_ENABLED = 1; public static final short DTR_HANDSHAKE = 2; /* RTS line control */ public static final short RTS_DISABLED = 0; public static final short RTS_ENABELD = 1; public static final short RTS_HANDSHAKE = 2; public static final short RTS_TOGGLE = 3; /* * Sets the configuration of the serial port. The * array parameter contains the following elements: * [baud_rate, data_bits, stop_bits, parity, * DTR_mode, RTS_mode, timeout readInterval, * timeout readMultiplier, timeout readConstant]. */ public native void setConfig(short[] cfg); /* * Gets the current configuration of the serial port. * The array parameter will contain the following * elements: [baud_rate, data_bits, stop_bits, * parity, DTR_mode, RTS_mode, timeout readInterval, * timeout readMultiplier, timeout readConstant]. */ public native void getConfig(short[] cfg); /* * Returns the opaque file descriptor object * associated with this stream. */ public final FileDescriptor getFD() throws IOException { if (this.fd == null) throw new IOException(); return this.fd; } /* Cleans up if the user forgets to close it. */ protected synchronized void finalize() throws IOException { if (this.fd != null) this.close(); } } // class SerialPort
This class simply exposes the serial port features to Java. You can compile
this file with a standard Java compiler (such as Sun's javac
.) Of course, the
resulting class file expects the Java runtime to include native implementations
for the methods defined in this class. If you were to run this with an ordinary
JavaVM without the serial port extensions, it would not work. So, the next step
is to extend the JVM to include support for objects of this class.
Note the private "fd
" field of this object, which is of type FileDescriptor
.
Ideally, we would like to do the Java equivalent of subtyping File.T
, namely
having our SerialPort
class extend java.lang.io.FileDescriptor
, but this is not
possible because the Java Language Specification declares FileDescriptor
to be
final
(meaning that this class cannot be extended.) Even if JVM allowed
FileDescriptor
to be extended, javac
, the Java source-to-bytecode compiler
would not allow the code for cmass.SerialPort
to compile. So, we end up with
the more cumbersome (and less efficient) approach of including a FileDescriptor
field in our object. (There is precedence for this approach in the Java
libraries that implement TCP/IP sockets, which should also be subtypes of
FileDescriptor
.) The contrast of the two serial port "interfaces" also
illustrates some useful constructs missing from Java, e.g., an enumeration
type.
JVM_SerialPort
module in
Modula-3:
INTERFACE JVM_SerialPort; IMPORT java, JVM_File; CONST ClassName = "cmass.SerialPort"; TYPE T = java.object BRANDED ClassName OBJECT fd: JVM_File.FileDescriptor := NIL; END; END JVM_SerialPort.
JVM_SerialPort.T
is a mirror image of
cmass.SerialPort
. Note how its fd
field's type
is the mirror image of java.lang.io.FileDescriptor
,
JVM_File.FileDescriptor
.
Finally, we implement the JVM_SerialPort
module.
MODULE JVM_SerialPort; IMPORT SerialPort, JVM, JVM_Interp, java, JVM_Error, AtomList, OSError, JVM_String, JVM_ArrayClass;
First, create an array of method descriptions. For each method, declare the
class name ("cmass.SerialPort
"), the method name, its Java type, and the name
of the Modula-3 procedure implementing it. The somewhat cryptic Java types are
a legacy of Sun's JavaVM implementation, and are documented in the Java Virtual
Machine Specification. (We leave as an exercise to the reader to decrypt them!
Hint: see the method signatures of the cmass.SerialPort
class.)
CONST NativeMethods = ARRAY OF JVM.MDesc { JVM.MDesc { ClassName, "open", "(Ljava.lang.String;)V", Open}, JVM.MDesc { ClassName, "setConfig", "([S)V", SetConfig}, JVM.MDesc { ClassName, "getConfig", "([S)V", GetConfig}, JVM.MDesc { ClassName, "close", "()V", Close} };
The next procedure, Open
implements the open call, which is
declared to take a pathname as its argument. To access its parameters,
Open
pops them off the JVM stack. Here, there is only one
argument to be popped off, plus the implicit self
argument.
We take advantage of Modula-3's automatic narrow by declaring
self
to be of type T
, thereby allowing us to
refer to self.fd
.
PROCEDURE Open (VAR env: JVM.Env; m: JVM.Method) RAISES {java.failure} = VAR path : java.string := JVM_Interp.PopObj (env); self : T := JVM_Interp.PopObj (env); name := JVM_String.ToText (path); BEGIN TRY self.fd.file := SerialPort.Open (name); EXCEPT OSError.E (err) => IOError (env, err) END; END Open;
(For simplicity, Open
doesn't check the validity of its arguments. We skip over
the implementation of Close
, since it is similar to Open
.)
You may wonder what happens if you want to access fd
concurrently from Modula-3
and Java. The answer is that, just as if you were building two components in
pure Java or Modula-3, the two sides will have to synchronize with each other. In
particular, you can use the usual per-object locks available in Java. To access
the object from Modula-3, you will have to lock the object via the call
JVM_Thread.Lock
.
Next, consider the native implementation of the setConfig
method, which takes a configuration array. Note how we can mix and match
Java data structures with Modula-3 ones. (We skip
getConfig
's implementation, since it is similar to
setConfig
.)
CONST ConfigArraySize = 9; PROCEDURE SetConfig (VAR env: JVM.Env; <*UNUSED*> m: JVM.Method) RAISES {java.failure} = VAR x : java.short_array := JVM_Interp.PopObj (env); self: T := JVM_Interp.PopObj (env); cfg : SerialPort.Config; BEGIN IF NUMBER(x.elts^) # ConfigArraySize THEN ArgError(env, "illegal array size"); END; cfg.baud_rate := VAL(x.elts[0], SerialPort.BaudRate); cfg.data_bits := VAL(x.elts[1], SerialPort.DataBits); cfg.stop_bits := VAL(x.elts[2], SerialPort.StopBits); cfg.parity := VAL(x.elts[3], SerialPort.Parity); cfg.DTR_mode := VAL(x.elts[4], SerialPort.DTR); cfg.RTS_mode := VAL(x.elts[5], SerialPort.RTS); cfg.timeouts.readInterval := x.elts[6]; cfg.timeouts.readMultiplier := x.elts[7]; cfg.timeouts.readConstant := x.elts[8]; TRY TYPECASE self.fd.file OF | SerialPort.T (port) => port.set_config(cfg); ELSE ArgError(env, "non-serial port file"); END; EXCEPT OSError.E (err) => IOError (env, err) END; END SetConfig;
The procedure IOError
raises a java.failure
exception from Modula-3, passing enough information so that meaningful
backtrace information can be produced in Java (including Modula-3 line
numbers in Java's backtraces!) The procedure ArgError
is
similar to IOError
, except that it raises an exception of
a particular type, namely, a standard
IllegalArgument
exception.
Finally, the main body of the module registers the native methods using
JVM.RegisterNativeMethods
, which conveniently takes an array, and the
class name "cmass.SerialPort
" is registered to map to JVM_SerialPort.T
.
BEGIN JVM.RegisterNativeMethods (NativeMethods); JVM.RegisterNativeType (ClassName, TYPECODE (T)); END JVM_SerialPort.
As you can see, it is quite simple to extend JVM with Modula-3 code. The nice part is that in your extension, you can take full advantage of Modula-3 features, such as exceptions, garbage collection and threads. These features are unavailable inside other JavaVM implementations; there you drop to ground zero when you start to write native methods, foregoing all access to these features.
Note that, as with Modula-3's <*EXTERNAL*>
pragma, it is the programmer's
responsibility to ensure that the mapping from Java to Modula-3 is correct. The
programmer is still aided by the strong typing of both languages. This
responsibility is a small price to pay for the tight degree of integration
provided by JVM.
SLI
class, which is not described here):
import java.io.*; import cmass.*; class app { public static void main (String argv[]) { try { SLI sli = new SLI("/dev/cua1"); PrintStream ps = sli.ps; ps.print("\fHello\r World!"); ps.flush(); Thread.sleep(1000000); } catch (InterruptedException e) { System.err.println("sleep failed"); } catch (java.io.FileNotFoundException n) { System.err.println("serial port not found"); } catch (java.io.IOException e) { System.err.println("I/O error"); } } }
And here is the result:
https://www.cmass.com/jvm
.
The SPIN group at the University of Washington has been using Modula-3 since 1995 as both the kernel and extension language for their extensible operating system. In this article, they describe the set of additions they have made to extend Modula-3's support for low-level systems programming.
Since 1995 we have used Modula-3 to develop operating system services for a kernel called SPIN at the University of Washington [1]. SPIN is an extensible kernel that allows untrusted applications to extend system services by dynamically linking in extensions [4]. The SPIN kernel provides in-kernel threads, virtual memory, and device management as its core set of services, and higher-level operating system services are implemented by extensions. To date we have implemented a variety of extensions in Modula-3, including user-level threads, a TCP/IP protocol stack, transaction services for databases, a log-structured file system (LFS), NFS and HTTP servers, a Unix emulation library that is binary compatible with Digital UNIX and FreeBSD, and many other services. The kernel itself is also implemented in Modula-3.
We have found Modula-3's type safety, as well as its data hiding properties, threads, generic interfaces and object support, to be effective in developing a high-performance, modular system [5]. While building SPIN, we have learned that low-level system programming requires explicit language constructs to interact with the outside world. Some examples of interactions with entities outside the domain of a Modula-3 program are interfacing with other languages, manipulation of the underlying hardware, and interpreting data from devices. Modula-3 greatly facilitated the rapid construction of SPIN and its extensions, but there were a few additions and modifications we needed to make to the language and run-time to support low-level programming.
In this article we describe our changes to support low-level systems programming with Modula-3 as follows:
The rest of this article addresses each of the above points.
To pass reference types from a foreign language (typically C, although occasionally assembly language) to Modula-3 we follow one of two strategies on the Modula-3 side. We either declare the arguments as untraced references to records, or we declare them as call-by-reference arguments using Modula-3's VAR parameter-passing mode (for which the compiler generates code that automatically dereferences the parameter).
Most Modula-3 programs define data structures in the traced heap, which simplifies interface design and eliminates storage leaks and dangling pointers. However, passing a traced reference outside the domain of a Modula-3 program (such as a foreign language or a device) requires that we register the referent with the collector. Otherwise, the collector may relocate or prematurely deallocate it. For instance, a network buffer that is allocated in the traced heap cannot be passed to a network device, since the pointer to the buffer that is kept in the device is invisible to the collector and hence will not be updated if the buffer is relocated. For this reason, we introduced the notion of a strong reference, which registers an object allocated from the traced heap with the collector as temporarily uncollectible and immovable. In this way, an object can be "strong ref'd" before it is passed out and "un-strong ref'd" when and if it is safe to do so.
Finally, we added the CVAR
parameter-passing mode to allow
Modula-3 procedures to call C procedures more efficiently. In C,
call-by-reference is implemented by passing a pointer to the appropriate
parameter. Passing a value of NIL
as the address of
an argument is often used to denote a special case by C programs. It is not
possible to preserve such semantics using VAR
, the Modula-3 mechanism
for call-by-reference, because an argument passed as a VAR
parameter
must be a designator (and has its address taken automatically).
UNSAFE INTERFACE Ccalls (* The C declaration for the CallByReference procedures is void CBR(int *out); *) <* EXTERNAL "CBR" *> PROCEDURE CallByReference1(CVAR out: INTEGER); (* CallByReference1 can be called with NIL as an argument: CallByReference1(NIL); *) <* EXTERNAL "CBR" *> PROCEDURE CallByReference2(out: UNTRACED REF INTEGER); END Ccalls. |
Figure 1: Declaration using the
CVAR call-by-reference mode.
|
The code fragment in Figure 1
illustrates how CVAR
can be used. CVAR
is restricted to
EXTERNAL
declarations. The CVAR
parameter out
is
similar to a VAR
parameter; it differs in that NIL
can be
explicitly passed as the value for out
. If NIL
is passed to
CallByReference1
, the C formal parameter out
is bound to
NIL
. If an INTEGER
designator is passed to
CallByReference1
, CVAR
is equivalent to VAR
, and
out
is bound to the address of that designator.
Without CVAR
, we would have to declare an external function as
CallByReference2
. However, using a REF
INTEGER
does
not capture the semantics of the parameter-passing mode for out
,
and would require the
caller to use the unsafe ADDRESS
operator. For these reasons, we decided to add the
CVAR
parameter mode.
EXTERNAL
pragma, which allows an interface to describe a function written in a
foreign language such as C. However, this pragma is legal in both safe and
unsafe interfaces, implying that a safe module could import a safe
interface that provided direct access to a C function or data structure of
arbitrary type. Since the type of the function or data structure may in
fact be specified by the C implementation, Modula-3 cannot enforce type
safety of safe modules that use EXTERNAL
.
Consequently, we modified the compiler so that the EXTERNAL
pragma
can only be used within unsafe interfaces. The programmer is thus forced
to assert safety by wrapping a safe interface
around the unsafe one containing uses of EXTERNAL
.
Calling Modula-3 procedures from foreign languages is more complex, as the DEC SRC implementation does not directly support this direction. Rather, the conventional approach is for the programmer to initialize procedure variables that are visible to the foreign language. We use this technique, but also modified the front-end of the compiler to generate C header files for Modula-3 interface files. This way, procedures exported via a Modula-3 interface can be called directly from C using "module dot method" syntax.
ALIGNED
FOR
construct (analogous to BITS FOR
) to Modula-3. The type ALIGNED
n FOR T
, where n
is an INTEGER
, specifies a subtype
of T
that has an alignment of n
bits, subject to the
following constraints:
n
must be a multiple of the type's inherent alignment.
T
to n
bits must not require any padding.
The first restriction preserves natural alignment. The second
restriction is a design decision that requires the programmer to explicitly
include extra padding when necessary. Together, these restrictions mean that
ALIGNED FOR
can be used only to boost the alignment of a type.
(* The corresponding C declaration: struct UdpHeaderC { int sport: 16; int dport: 16; int len: 16; int check: 16 }; *) TYPE UdpHeaderM3 = RECORD sport: BITS 16 FOR [0..16_ffff]; dport: BITS 16 FOR [0..16_ffff]; len : BITS 16 FOR [0..16_ffff]; check: BITS 16 FOR [0..16_ffff]; END; TYPE UdpHeaderM3C = ALIGNED 32 FOR RECORD sport: BITS 16 FOR [0..16_ffff]; dport: BITS 16 FOR [0..16_ffff]; len : BITS 16 FOR [0..16_ffff]; check: BITS 16 FOR [0..16_ffff]; END; |
Figure 2: Declarations of Modula-3
records using different alignment constraints. Using the ALIGNED FOR
constructor enables the programmer to specify the alignment requirements of the
data structure.
|
Figure 2 illustrates the
alignment problems that can arise when interacting with C. One might expect
(as we initially did) that a UdpHeaderM3
could be aligned on any
16-bit boundary. In fact, the DEC SRC M3 compiler boosts the alignment of
every UdpHeaderM3
to the natural word size of the architecture for
efficiency (which is 32-bit for x86 and 64-bit for the Alpha). If the
Modula-3 compiler chooses an alignment for UdpHeaderM3
that is
more strict than the alignment chosen by the C compiler for
UdpHeaderC
, then an alignment exception can occur if C code passes
a reference to a UdpHeaderC
to Modula-3 code. Similarly, if the
Modula-3 compiler chooses an alignment for UdpHeaderM3
that is
less strict than the alignment chosen by the C compiler for
UdpHeaderC
, then an alignment exception can occur if Modula-3 code
passes a REF
UdpHeaderM3
to C code. To avoid these
problems, we use ALIGNED FOR
as shown in the declaration of
the UdpHeaderM3C
type of
Figure 2 above.
In addition to adding ALIGNED FOR
, we have turned off the
automatic alignment boosting that occurs in our M3 compiler. This design
decision may seem counter-intuitive, because the programmer must deal with
alignment when necessary. However, such alignment issues only arise when
dealing with data structures that contain only sub-word-sized types. A
programmer who uses such types is already taking performance issues into
consideration, and dealing with alignment is not much of an additional
burden.
The exception model of Modula-3 requires that each procedure explicitly state what exceptions it can raise, although a note on page 29 [3] does imply the existence of implicit exceptions. In SPIN we require that our Modula-3 implementation reflect all checked run-time errors back to the offending code as implicitly declared exceptions. Such exceptions are raised by every procedure. Unhandled exceptions are subsumed by our model of implicit exceptions, in that the "unhandled exception" exception is another implicit exception, and thus can be caught. If an implicit exception is not caught, the offending thread is stopped at its failure point, and the unhandled exception may be caught by another thread.
IMPORT SpinException; PROCEDURE NilDeref(pointer: REF INTEGER) = VAR value : INTEGER; BEGIN TRY (* dereference a pointer, which could be NIL *) value := Pointer^; EXCEPT | SpinException.Exception(info) => IF info.code = SpinException.ExceptionCode.AttemptToDereferenceNIL THEN (* handle the error *) ... END; END; END NilDeref; |
Figure 3: An example use of catching a checked run-time error as a language exception. |
The code fragment in Figure 3 illustrates how
implicit exceptions are used. In the code, the exception
SpinException.Exception represents all of the implicit exceptions.
In this simple example, the procedure NilDeref dereferences a
NIL
pointer; our implicit exception model allows this error to be
caught. Although this example is overly simplified (since the test for a
NIL
pointer could be done explicitly), it illustrates how implicit
exceptions allow programs to explicitly catch checked run-time errors.
SPIN also reflects other system faults as implicit
exceptions to the offending code. The programmer may catch arithmetic
faults (e.g., divide by zero), virtual memory related faults (e.g.,
protection faults or accessing an invalid address), and unaligned accesses
(for processors such as the Alpha). The ability to catch system faults as
exceptions enables the construction of more robust systems by guarding
against errors. In practice, we have found this capability very useful. For
example, TCP services in our HTTP server continued operating even when
unrelated UDP code on the packet input path erroneously dereferenced
NIL
.
VIEW
.
VIEW
allows a programmer to reinterpret a piece of memory
as a different type. The code in Figure 4
illustrates how we use VIEW
to
interpret packets in SPIN [2]. Inside the body of the WITH
statement, the variable ipHeader
is a typed alias for the
packet's header. Both the ByteFilter
and ViewFilter
procedures have the same function, but the former is more obtuse,
difficult to maintain, and likely to contain errors. VIEW
enables programmers to write code that is simpler to understand and
that executes more efficiently than using explicit byte
manipulations.
IMPORT Ip, Word; CONST SourceAddr = 16_805f02DE; (* IP address of www-spin.cs.washington.edu *) TYPE Packet = ARRAY [0..255] OF Byte; PROCEDURE ByteFilter(READONLY m: Packet) : BOOLEAN = (* type-safe, but readable; inefficient on machines without byte operations *) BEGIN RETURN m.packet[9] = Ip.protocol.UDP AND Word.And (m.packet[0], 16_F) = Ip.version4 AND Word.Or(Word.LeftShift(m.packet[15], 24), Word.Or(Word.LeftShift(m.packet[14], 16), Word.Or(Word.LeftShift(m.packet[13], 8), m.packet[12]))) = SourceAddr; END ByteFilter; PROCEDURE ViewFilter(READONLY m: Packet) : BOOLEAN = (* type-safe, readable, and efficient *) BEGIN WITH ipHeader = VIEW(m, Ip.T) DO (* no copying *) RETURN ipHeader.protocol = Ip.protocol.UDP AND ipHeader.version = Ip.version4 AND ipHeader.sourceAddr = SourceAddr; END; END ViewFilter; |
Figure 4: Implementation of packet
filters using byte operations and VIEW . The filters accept
unfragmented UDP/IP packets with a particular source address. The
WITH operator creates an alias to the ipHeader using a
VIEW expression.
|
Our casting operator is correct for two reasons. First, VIEW
does not create any illegal representations. Second, for any visible type
T, a client can allocate its own instances of T and set
T's fields to any legal value. Therefore, there can be no harm in
allowing a client to "create" new instances of T via casting
instead of allocation. More precisely, VIEW
allows a programmer to
cast a
designator
(an expression that denotes a memory location)
to another type. We develop a notion of equivalence between
designators of different types, such that casting between equivalent
designators is type-safe. We can cast a designator from a type
T1 to another type T2 so long as the
following conditions are met:
We define two types to be representation-equivalent if they satisfy the first two conditions, which can be determined at compile-time. Trivially, any type T is representation-equivalent to itself. For other pairs of types, we must compare the sets of legal values for the types, where legality is defined as follows:
[1] | B.N. Bershad, S. Savage, P. Pardyak, E.G. Sirer, M. Fiuczynski, D.
Becker, S. Eggers, and C. Chambers.
"Extensibility, Safety and Performance in the SPIN Operating System,"
in Proceedings of the Fifteenth ACM Symposium on Operating Systems
Principles, Copper Mountain, CO, December 1995.
https://www.cs.washington.edu/research/projects/spin/www/papers/SOSP95/sosp95.ps |
[2] | M.E. Fiuczynski and B.N. Bershad.
"An Extensible Protocol Architecture for Application-Specific Networking,"
in Proceedings of the 1996 Winter USENIX Conference, San Diego,
CA, January 1996.
https://www.cs.washington.edu/research/projects/spin/www/papers/Usenix96/extprotarch.ps |
[3] | G. Nelson, editor. System Programming in Modula-3, Prentice Hall, 1991. |
[4] | E.G. Sirer, M.E. Fiuczynski, P. Pardyak, and B.N. Bershad.
"Safe Dynamic Linking in an Extensible Operating System,"
in The First Workshop on Compiler Support for Systems Software,
February 1996.
https://www.cs.washington.edu/research/projects/spin/www/papers/WCS/domain.ps |
[5] |
E.G. Sirer, S. Savage, P. Pardyak, and B.N. Bershad.
"Writing an Operating System in Modula-3,"
in The First Workshop on Compiler Support for Systems Software,
February 1996.
https://www.cs.washington.edu/research/projects/spin/www/papers/WCS/m3os.ps |
[6] |
W.C. Hsieh, M.E. Fiuczynski, C. Garrett, S. Savage, D. Becker, and B.N.
Bershad.
"Language Support for Extensible Systems,"
in The First Workshop on Compiler Support for Systems Software,
February 1996. https://www.cs.washington.edu/research/projects/spin/www/papers/WCS/language.ps |
This is the third in a series of articles about Juno-2, a constraint-based drawing editor. The first article described the implementation of the Juno-2 user interface, and the second article described the design and implementation of a reusable software double-buffer object.This article describes the design of three object types used in the Juno-2 constraint solver. It illustrates the use of objects, subclassing, method overriding, and partial revelation in Modula-3.
The programming language used in Juno-2's textual view
[2] is based on Dijkstra's guarded command
language [3], but with provisions for
solving constraints. Its value space is the smallest set closed under
the formation of ordered pairs and containing real numbers, texts, and
the special value NIL
. In other words, the data structures
of Juno-2 programs are similar to those of LISP.
A constraint in Juno-2 is a predicate on program variables. Solving a constraint is equivalent to finding values for the variables that make the predicate true.
The inclusion of ordered pairs in the value space means that the
Juno-2 solver must support constraints on the structure of
values. For example, if v
is a fixed list of elements,
the following command can be used to initialize the local variables
a0
and a1
to the first two elements of
v
:
VAR a0, a1, tail IN v = (a0, (a1, tail)) -> (* code using a0 and a1 goes here *) END
Ordered pairs have the property that if the pairs (a0,
b0)
and (a1, b1)
are asserted to be equal, then the
equalities a0
= a1
and b0
= b1
must also hold. (The converse is also true, but there is no need for
the constraint solver to infer equalities of pairs.) The Juno-2 solver
works by maintaining an equivalence relation on known values and the
unknowns being solved for. When two values are asserted to be equal,
their equivalence classes are merged. The equivalence relation is said
to be unification closed if all equalities of pairs have been
propagated to equalities on their children.
Juno-2 uses unification [4] to solve
constraints on pairs. For example, imagine that the variable
v
in the above constraint has the following fixed value:
v = (0, (1, (2, (3, NIL))))
By one unification, the constraint on v
would engender
the following two equivalences:
a0 = 0 (a1, tail) = (1, (2, (3, NIL)))
The latter of these two equivalences would then engender the following equivalences:
a1 = 1 tail = (2, (3, NIL))
Together, these equivalences solve the constraint.
There is an extra wrinkle in the unification process due to value
types. At run-time, each equivalence class may have a known type. If
two equivalence classes being merged have different known types, then
the constraint solver can immediately report failure. If only one
class has a known type, that type can be propagated to the other
class as a side-effect of unifying the two values. Another failure case
that must be detected is a cycle in the closure, such as would be
caused by the constraint x = (x, 1)
.
Juno-2's constraint solver uses three abstract classes to
propagate equalities and to perform unification: Equiv.Elt
,
Egraph.Node
, and JunoSolve.Var
. An Equiv.Elt
represents an element of an equivalence class; the constraint solver
uses the Equiv interface to maintain the equivalence relation
on Juno-2 values. The solver uses instances of Egraph.Node
to
represent the functional expressions appearing in constraints.
Finally, the solver uses instances of JunoSolve.Var
to
represent knowns (i.e, constants) and unknowns in constraints. As we
will see, the three classes are related by subtyping, although the
subtyping in one case is revealed in an implementation module rather
than an interface.
The design of these classes illustrates the use of subclassing, partial revelation, and method overriding in Modula-3. The rest of this article describes the three classes and their interfaces. There is also a short section on the implementation.
CAR(x) = 2 * y
. In this
graph, the names x0
and x1
have been introduced
for the first (CAR) and second (CDR) elements of the pair x
, respectively.
Figure 1. The Egraph for the constraint CAR(x) = 2 * y .
In this graph, the dashed lines denote nodes in the same equivalence class.
|
The basis for an Egraph is an equivalence relation on its nodes. Juno-2 uses the following Equiv interface to maintain the equivalence relation:
INTERFACE Equiv; EXCEPTION Forbidden; TYPE Elt <: Public; Public = OBJECT METHODS init(): Elt; find(): Elt; union(y: Elt): Elt RAISES {Forbidden}; END;
For those readers unfamiliar with Modula-3, these declarations
introduce the types Elt
and Public
. The syntax
Elt <: Public
declares the type Elt
to be a
partially
opaque subtype of the type Public
.
Public
is declared to have no data fields and three methods.
Here is the rest of the interface.
AnEquiv.Elt
is an element of an equivalence relation. The statementNEW(Equiv.Elt).init()
produces a new element in an equivalence class by itself.The call
x.find()
returns the distinguished representative, or root, ofx
's equivalence class.The call
x.union(y)
combines the equivalence classes represented byx
andy
, and returns the representative of the new class. After the callx.union(y)
,x.find() = y.find()
. It is a checked run-time error for eitherx
ory
not to be the root of its equivalence class. The result ofx.union(y)
is guaranteed to be eitherx
ory
.The default
union
method never raises theForbidden
exception, but it may be useful for subtypes overriding that method to raiseForbidden
in the event that the operation is deemed illegal by the subtype.
END Equiv.
The expressions in an Egraph are function applications. For
simplicity, Juno-2 uses a more concrete representation of function
applications than the abstract one depicted in Figure 1:
the function application f(a1, ..., an)
is represented by a linked list of length n+1 in
which the function name f is the first element of the list, and
the arguments are the remaining elements. For example, Figure 2 shows the representation of the pair
expression (x, 2+y)
.
Figure 2. The Egraph representation of the expression (x, 2+y) .
|
The Egraph interface reveals that an Egraph node is
actually a subtype of an Equiv.Elt
:
INTERFACE Egraph; IMPORT Equiv; TYPE Node <: Public; Public = Equiv.Elt OBJECT head, tail: Node := NIL; METHODS init(): Node; END;An
Egraph.Node
is a node of an oriented, directed graph in which every node has out-degree at most 2 and on which there is an equivalence relation.The statement
NEW(Egraph.Node, head := a, tail := b).init()
evaluates to a new node in its own equivalence class with children initialized to the valuesa
andb
. Thehead
andtail
fields should not be written after initialization.
END Egraph.
The third class of interest is used to represent constants and
variables in constraints. These are both represented by the partially
opaque Var
type in the JunoSolve interface. In this
interface, RTVal.T
is the type of a Juno-2 run-time value.
INTERFACE JunoSolve; IMPORT RTVal; TYPE Var <: Public; Public = Private OBJECT known: BOOLEAN; val: RTVal.T; END; Private <: ROOT;If
x
is of typeVar
, thenx.known
indicates if the variable denoted byx
is known (i.e., fixed) or unknown. Ifx.known
, thenx.val
is a legal Juno value. IfNOT x.known
butx.val # NIL
, thenx.val
is a hint for the initial value ofx
.
PROCEDURE New(known := FALSE; val: RTVal.T := NIL): Var;
Return a new, validVar
with the given field values.
The JunoSolve interface also includes procedures for
creating new constraints and a procedure P
for solving a
constraint system, but these procedures are irrelevant to the current
discussion.
Notice in the above declarations that both the prefix and the
suffix of JunoSolve.Var
are opaque. The prefix is opaque
because the type Private
is declared to be an opaque subtype
of the root object type ROOT
. The suffix is opaque because
Var
is declared to be an opaque subtype of the
type Public
. The JunoSolve implementation reveals that
a JunoSolve.Var
is actually a subtype of an
Egraph.Node
.
MODULE JunoSolve; IMPORT Equiv, Egraph, RTVal, ...; REVEAL Private = Egraph.Node BRANDED "JunoSolve.Private" OBJECT pair: EC; OVERRIDES union := Union; END; Var = Public BRANDED "JunoSolve.Var" OBJECT type: Type; (* other fields private to the implementation declared here... *) END;
TYPE EC = Private;
The typeEC
represents an equivalence class with an extra field to record if the class contains a pair expression. We declare the nameEC
simply to serve as a more meaningful synonym for the typePrivate
in the implementation.
TYPE Type = { Any, Pair, Num, Text, Null };
There is a flat partial order on types, withType.Any
as bottom.
The head
and tail
fields of a JunoSolve.Var
are always NIL
, but by virtue of being an Equiv.Elt
, a
JunoSolve.Var
is also part of the equivalence relation.
When a node is the representative of its equivalence class and
denotes an ordered pair, its pair
field is non-NIL
and points to an Egraph list of length 3 in which the first element of
the list is the node denoting the Pair function and the next
two elements are the pair's CAR and CDR.
Notice that JunoSolve.Private
is revealed to override the
Equiv.Elt.union
method with the local procedure
JunoSolve.Union
. Its implementation is sketched below.
Space limits do not permit a complete description of the unification closure implementation. Here, we summarize three of its aspects.
JunoSolve.Union
procedure first
calls the Egraph.Node.union
method. It then performs extra
work required by the Juno-2 solver. For example, it compares the
type
fields of the two nodes being merged. If they have
incompatible types, it raises the exception Equiv.Forbidden
.
If one is of type JunoSolve.Type.Any
and the other is not,
the known type is propagated to the representative of the new
equivalence class.
union
method. The implementation
of the JunoSolve.Union
procedure has the side-effect of adding
elements to the queue whenever two pairs in different equivalence classes
are unified. The time complexity of our implementation is O(n),
where n is the size of the Egraph.
We therefore chose to use the QuickFind implementation, in which each equivalence class is represented by a tree of depth at most 1. Each element has a pointer to the representative element of its equivalence class. The find operation thus takes constant time. The union operation works by making all of the elements of the smaller class point to the root of the larger class. The implementation is straightforward.
[1] | Allan Heydon and Greg Nelson. The Juno-2 Constraint-Based Drawing Editor, SRC Research Report 131a, December, 1994. |
[2] | Greg Nelson and Allan Heydon. Juno-2 Language Definition, SRC Technical Note 1997-009, June, 1997. |
[3] | Edsger W. Dijkstra. A Discipline of Programming, Prentice-Hall, Inc., 1976. |
[4] | J. A. Robinson. "A Machine-Oriented Logic Based on the Resolution Principle," in Journal of the ACM, Vol. 12, No. 1, pgs 23-41, January, 1965. |
[5] | Techniques for Program Verification, Greg Nelson, Technical Report CSL-81-10, Xerox Palo Alto Research Center, June, 1981. |
Aachen University of Technology, in Aachen, Germany has been using Modula-3 successfully for a number of academic and research projects since 1993. Peter Klein, a Staff Member of Computer Science, reports on some of their experiences.The Department of Computer Science III is one of the thirteen departments comprising the computer science group at the Aachen University of Technology. Started by Professor M. Nagl, the main research topics of the department are methods, languages, and tools for software engineering.
Like many other computer science departments, we favored Modula-2 in the eighties as the main programming language for teaching and project implementation. In the early nineties, our dissatisfaction with Modula-2 grew because of its missing support for modern programming concepts like objects, genericity, garbage collection, and exception handling. On the practical side, it was also becoming increasingly difficult to find a Modula-2 environment that was suitable for our implementations (approaching 500k lines of code at that time.)
It took us some time to reach agreement on a replacement for Modula-2. Although Modula-3 seemed a natural choice, our experience with Modula-2 suggested using a more well-known programming language. As such, different subprojects started out in C, C++, and Modula-3, which allowed us to recognize that the compactness and clarity of Modula-3 allowed us to produce more robust and reusable code. This factor proved to be an essential point in our academic setting: with a high fluctuation of rather inexperienced programmers, a major part of the implementation of our systems are done by student workers or in the course of diploma theses.
Some of the project courses resulted in usable software, including m2tom3 and readthis.
m2tom3 is freely available at ftp://ftp-i3.informatik.rwth-aachen.de/pub/Modula-3-Contrib/m2tom3/
For more information on readthis visit https://www-i3.informatik.rwth-aachen.de/research/readthis/.
readthis can be freely download from ftp://ftp-i3.informatik.rwth-aachen.de/pub/Modula-3-Contrib/readthis.
We have also been engaging in a number of longer-term research projects, which are described below.
GRAS is a graph-oriented database system with a multi-client/multi-server architecture. From the first prototype in 1985, gradually improving versions have been used in different software engineering projects. Currently, a version automatically derived from a Modula-2 implementation using m2tom3 is shipped in binary form with the PROGRES release. A new implementation called GRAS-3 written directly in Modula-3 will be released around the end of 1997.
For more information about GRAS, visit: https://www-i3.informatik.rwth-aachen.de/research/gras/.
For more information about PROGRES, visit: https://www-i3.informatik.rwth-aachen.de/research/progres/.
TXL-3 is freely available at: ftp://ftp-i3.informatik.rwth-aachen.de/pub/Modula-3-Contrib/txl-3/.
The core of the adt prototype is an editor for a software architecture design language. Existing Modula-3 source code can be analyzed to visualize its architectural structure, and code skeletons can be generated from an architecture description.
A major goal of adt is to assist the programmer by providing him with an architectural view of his system without requiring the use of special tools. So, we made it quite easy to adapt adt not only to use it with any editor or compiler, but also to integrate it with other tools for revision control, browsing, e-mailing/conferencing, problem reporting, or any other tools needed in everyday work. In order to keep the participating documents consistent while allowing different tools to modify them, adt supports incremental updates between an architecture description and source code, so that changes in the architecture can be propagated into existing Modula-3 sources and the architectural plan can be mapped to source modifications.
An early demo of adt was built for a project course in 1994. A complete new
implementation making use of the GRAS-3 database and TXL-3 is currently under
construction and will, apart from the architecture language mentioned above,
also feature extensions for concurrent and distributed systems and interaction
diagrams comparable to the collaboration diagrams in
Unified Modeling Language.
A release is planned for early next year.
| Home
| Products
| Services
| Publications
| How to Contact Us
| About Critical Mass | | |
Please feel free to send us your comments Copyright © 1997 Critical Mass. |