A Programming Language Perspective to Compositional Software Architectures

Amr Sabry
Department of Computer and Information Science
University of Oregon
sabry ... cs uoregon edu

1. Programming Language Research

Programming language research spans many areas, several of which are immediately relevant to the field of compositional software architectures:

Type Systems are key components of compositional software. Not only do they provide a language in which to specify the interfaces of individual components, but they also provide an algorithm for checking that an implementation matches an interface, and even better that an application assembled from a collection of components is globally consistent. Recent research in type systems uses a richer notion of type that includes information about the dynamic behavior of the components. Such type systems can be used to guarantee properties of software architectures like freedom of deadlock, locality of dynamically allocated memory, and system-wide bounds on quality of service (QoS).
Compiler Optimizations provide a powerful technology for producing efficient applications from a collection of reusable components. The tension between re-usability and efficiency is clear. One one hand, in order to maximize the re-usability of components, one is tempted to provide the most generic interface possible. On the other hand, the performance of an application assembled from completely generic components is likely to be poor. The solution is to use optimizations techniques, like partial evaluation, to specialize the generic components at link time.
Structuring Programs has always been an active area of research. A simple, effective, and established technique has been to recognize common programming patterns in a specific domain and abstract them using a higher-order combinator or meta-construct. More recently, there has been significant work in ways of structuring programs that not only combine the ability to assemble applications from reusable blocks, but also separate the code that implements the logical aspects of the system from the code that implements the QoS requirements [4].

The remainder of this paper expands on each of the above ideas in some detail.

2. Types

There is a wealth of proposals that use types to verify and hence guarantee properties of systems assembled from reusable components. We briefly describe two such ideas:

Detecting Deadlocks. Typically, reusable components make implicit assumptions about their behavioral interaction with other components. Such implicit assumptions cannot be formally checked at link time, and may cause errors such as deadlocks. To avoid such errors, the types can be enriched to include information that can be used to guarantee the absence of deadlocks [2, 3]. In our approach, we have extended the types with size information that gives bounds on the sizes of the data structures. These sizes can be used to guarantee the absence of deadlocks by verifying that every recursive procedure makes progress, i.e., increases the size of the data structure it is processing by one each time around the loop.
Locality of Memory References. Many applications require that memory references from one component do not attempt to access the local memory of another component. For example, operating systems segmentation checks guarantee this property for security reasons. Mobile code on the Internet would have to obey similar restrictions. Also, concurrent systems are easier to implement if property program fragments do not interfere via shared locations. Again such properties cannot usually be checked when assembling applications from reusable components, but we have developed an extended type system that maintains information about memory regions, and uses this information to prevent programs from exporting local pointers, and from reading or writing pointers allocated from another region [5]. To give some ideas of the accuracy achieved by this mechanism it is worth noting that it is quite feasible to have one component manipulate locations belonging to another quite component, as long as no attempt is made to dereference these other locations. Under these conditions, the host could build and traverse a graph containing the foreign locations, perhaps duplicate or discard the locations, or build them into data structures, eventually to be returned, presumably, to the owning component for it to dereference at will. Through all this the type system is able to track that the components do not interfere with each other.

3. Compiler Optimizations

Partial evaluation is a source-to-source program transformation technique for specializing programs with respect to parts of their input. It can be used effectively in the domain of compositional software architectures to specialize generic reusable components to their particular use cases. There has been several such experiments for optimizing protocol stacks [1, 7], operating system kernels [8], etc. The basic idea of partial evaluation is rather simple. Consider the following example from the XDR protocol [7]:

bool_t xdr_int (XDR *xdrs, int *ip)
{
  if (sizeof(int) == sizeof(long))
    return xdr_long(xdrs, (long *)ip);
  else 
    return xdr_short(xdrs, (short *)ip);
}

The program encodes an integer for transmission on the network. The encoding naturally depends on the machine integer size. Since the type of machine is typically known early in the process, such code is amenable to partial evaluation. If the size of an integer is short, the residual optimized code would be:

bool_t xdr_int (XDR *xdrs, int *ip)
{
  return xdr_short(xdrs, (short *)ip);
}

4. Structuring Programs

We present two ideas about patterns that appear most relevant.

Monads. Consider a generic program component for traversing a tree. Larger contexts that use this component might need to perform several computational effects while traversing the tree. For example, we might want to raise exceptions in some situations, or collect some information in a global variable, or backtrack and explore a different branch, or remember milestones and restart the traversal at an earlier milestone, etc. The variations are endless and writing one component that can be effectively reused is a challenging task.
Fortunately, monadic style is a programming pattern, originating from category theory [6], that uses a higher-order combinator to compose computations. Thus instead of writing:
f(); g()
one would write:
f() -> g()
where the arrow -> is an operator that glues the two computations together using special code. In the simplest cases, the glue code does not perform anything interesting and is essentially like the ;. However in more interesting cases, the glue code might catch any exception raised during the first computation and perform a number of actions. Thus in our example, all we need is to parameterize the tree traversal component by the monad, substituting different monads in different contexts of reuse.
Meta-Programming. An even more general technique is to provide meta or reflective facilities. For example, consider a library of reusable filters to be used in the context of image processing. Each filter takes an image, performs some elementary operation, and produces a resulting image. By appropriately composing these filters, one can express sophisticated image processing operations. The problem is that such a program will have unacceptably high memory use, producing a number of intermediate images at each stage. Although automated techniques might help, a general solution to this problem requires some programmer intervention. The challenge is to express the additional code in a disciplined way without tangling the original code.
One elegant solution [4] is to provide an additional meta-programming language in which the programmer can express algorithms to manipulate the original program. If written at an appropriate level of abstraction, these algorithms are concise and clear. For example, in order to express that two filters should be fused into one to avoid the generation of an intermediate image, the programmer might write:
if ((isFilter p1) && (isFilter p2)) { p = makeFilter (p1, p2); }
The above code is compiled and then executed to manipulate the original program. The net effect is to produce an efficient image processing operation with clean composable generic filters.

5. Conclusion

We have covered three ideas from programming language research that are useful in the context of compositional software architectures. At the lowest level, type systems might be able to verify some properties but are generally too weak to verify all properties of interest. Program analysis techniques can help in this latter case by automatically rewriting the code to customize the reusable components to their current use. Finally when automatic techniques fail, programmer intervention is supported in a disciplined way using monads and meta-programming. The examples we have chosen are necessarily small but the ideas scale to the large scale integration of components.

References

[1] Edoardo Biagioni, Robert Harper, Peter Lee, and Brian G. Milnes. Signatures for a Network Protocol Stack: A Systems Application for Standard ML. In the ACM Conference on Lisp and Functional Programming, 55-64. ACM Press, New York, 1994.

[2] John Hughes, Lars Pareto, and Amr Sabry. Proving the Correctness of Reactive Systems Using Sized Types. In the ACM SIGPLAN-SIGACT Symposium on the Principles of Programming Languages. ACM Press, New York, 1996.

[3] Paola Inverardi, Alexander L. Wolf, and Daniel Yankelevich. Checking Assumptions in Component Dynamics at the Architectural Level. In the Proceedings of the International Conference on Coordination Models and Languages, 46-73. Lecture Notes in Computer Science 1282, Springer-Verlag, Berlin, 1997.

[4] Gregor Kiczales, John Lamping, Anurag Mendhekar, Chris Maeda, Cristina Lopes, Jean-Marc Loingtier, and John Irwin. Aspect-Oriented Programming. Xerox Palo Alto Research Center Technical Report, February 97, SPL97-008 P9710042.

[5] John Launchbury and Amr Sabry. Monadic State: Axiomatization and Type Safety. In the ACM SIGPLAN International Conference on Functional Programming, 227-238. ACM Press, New York, 1997.

[6] Eugenio Moggi. Computational Lambda-Calculus and Monads. In the IEEE Symposium on Logic in Computer Science, 14-23. IEEE Computer Society Press, Los Alamitos, 1989.

[7] Gilles Muller, Eugen-Nicolae Volanschi, and Renaud Marlet. Scaling up Partial Evaluation for Optimizing the Sun Commercial RPC Protocol. In the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, ACM Press, New York, 1997.

[8] C. Pu et al.. Optimistic Incremental Specialization: Streamlining a Commercial Operating System. In the ACM Symposium on Operating Systems Principles. ACM Press, New York, 1995.