Implementing Statically Typed Object-Oriented Programming Languages
Résumé
Object-oriented programming languages represent an original implementation issue due to the mechanism known as late binding, aka message sending. The underlying principle is that the address of the actually called procedure is not statically determined, at compile-time, but depends on the dynamic type of a distinguished parameter known as the receiver. In statically typed languages, the point is that the receiver's dynamic type may be a subtype of its static type. A similar issue arises with attributes, because their position in the object layout may depends on the object's dynamic type. Furthermore, subtyping introduces another original feature, i.e. subtype checks. All three mechanisms need specific implementations, data structures and algorithms. In statically typed languages, late binding is generally implemented with tables, called virtual function tables in C++ jargon. These tables reduce method calls to pointers to functions, through a small fixed number of extra indirections. It follows that object-oriented programming yields some overhead, as compared to usual procedural languages. The different techniques and their resulting overhead depend on several parameters. Firstly, inheritance and subtyping may be single or multiple and a mixing is even possible, as in JAVA, which presents single inheritance for classes and multiple subtyping for interfaces. Multiple inheritance is a well known complication. Secondly, the production of executable programs may involve various schemes, from global compilation frameworks, where the whole program is known at compile time, to separate compilation and dynamic loading, where each program unit---usually a class in an object-oriented context---is compiled and loaded independently of any usage. Global compilation is well known to facilitate optimization. In this paper, we review the various implementation schemes available in the context of static typing and in the three cases of single inheritance, multiple inheritance, and single inheritance but with multiple subtyping, e.g. JAVA. The survey focuses on separate compilation and dynamic loading, as it is the most commonly used framework and the most demanding. However, many works have been recently undertaken in the global compilation framework, mostly for dynamically typed languages but also applied to the EIFFEL language in the SMARTEIFFEL compiler. Hence, we examine global techniques and how they can improve implementation efficiency. Finally, a mixed framework is considered, where separate compilation is followed by a global step, similar to linking, which uses global techniques, as well for implementation, with coloring, as for optimization, with type analysis. An application to dynamic loading is sketched.