UNIVERSITY OF MANCHESTER INSTITUTE OF SCIENCE AND TECHNOLOGY


                          DEPARTMENT OF COMPUTATION


Object Oriented Specification, Design and Implementation


Lecture 1: Types and Type Constructors


Introduction

"[Programming]... is easier to learn than piano playing but more
difficult than tic-tac-toe"

Shneiderman


The lecture has been written such that it assumes only  a  knowledge
of a block-structured high-level programming language[1]. The  style
of presentation seeks to identify fundamental and  general  concepts
and principles, and to  explain  these  by  simple  examples.  Where
necessary, footnotes, appendices  and  references  are  included  in
order that terms used, and concepts and principles  introduced,  are
clarified and may subsequently be further studied.


This lecture is concerned, first, with a notion fundamental  to  the
development of descriptions of  computations,  i.e.  the  notion  of
type. Secondly, in the context of program language design,  we  will
examine the notion of  abstraction,  in  particular,  how  a  module
construct provides a basis for  developing  realisations  of  types.
Finally, the notion of abstract data types will be examined and used
in later lectures as a basis for furthering our understanding of the
notions of class and class instance (or object).


Background


The notion of object orientation is now widely used as a  basis  for
developing a variety of different kinds  of  description,  including
specifications, designs and also implementations of software systems
written in programming languages. The notion of  object  orientation
can be placed in a variety of different contexts[2]; however,  one
particular context enables us to see how object orientation seeks to
address limitations of  existing  programming  languages,  i.e.  the
context of programming language design.


Programming languages reflect or embody a paradigm, i.e.  a  "model"
against which they can be compared and contrasted.  Fundamental  and
general paradigms include the imperative  paradigm,  the  functional
paradigm,  the  declarative  paradigm  and   also   the   concurrent
paradigm[3].


A particular programming language will embody notions drawn from  at
least  one  paradigm;  for  example,  it  may   be   essentially   a
"functional" programming language yet  also  embody  the  notion  of
assignment drawn from the imperative paradigm.


Object-oriented programming languages may be extensions or revisions
of existing languages, or they  may  have  been  expressly  designed
"from scratch" to embody  the  object-oriented  paradigm,  i.e.  the
notions of class, class instance  (object),  inclusion  polymorphism
(inheritance), parametric polymorphism (type parameterisation),  ad-
hoc or intersection polymorphism,  etc.


In addition to object oriented programming languages, a  variety  of
different means of developing software designs have  become  widely
used. These design techniques seek to embody notions drawn from  the
object  oriented  paradigm.   There   are   also   object   oriented
specification notation techniques which enable  software  components
to be given a description about  which  certain  properties  can  be
proved using mathematics and logic. Finally, object orientation  has
been employed as a means of organising the description, storage  and
manipulation of  persistent  data,  e.g.  object  oriented  database
systems.


We will examine, next, the notion of type and how types are provided
by programming languages and may be  constructed  using  programming
languages.


Types and Programming Languages


When we say that v is a value of type T we imply that v  ∈  T,  i.e.
that a type T is a set of values. We also imply constraints upon the
operations which may be applied to  values  of  the  type,  i.e.  we
insist that all the values of the  type  exhibit  uniform  behaviour
under operations associated with the type. Thus, {0..9}  is  a  type
because its values exhibit uniform behaviour under the operations +,
-, DIV and MOD, but {true, maybe, 42, fred} is not a  type  in  this
context.
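
The two conditions above (membership of a set, and uniform behaviour
under the operations of the type) can be illustrated by a minimal
Python sketch. The names behaves_uniformly, DIGITS, MIXED and
OPERATIONS are illustrative assumptions, the bare names maybe and
fred are represented as strings, and "uniform behaviour" is
approximated as "each operation can be applied to every pair of
values without a type error". The sketch does not check closure:
9 + 9 is defined but lies outside {0..9}.

```python
import operator

# Assumed stand-ins for the operations +, -, DIV and MOD.
OPERATIONS = [operator.add, operator.sub, operator.floordiv, operator.mod]

DIGITS = set(range(10))              # the set {0..9}
MIXED = {True, "maybe", 42, "fred"}  # maybe and fred modelled as strings


def behaves_uniformly(values, operations):
    """Return True if every operation applies to every pair of values."""
    for operation in operations:
        for left in values:
            for right in values:
                try:
                    operation(left, right)
                except ZeroDivisionError:
                    # DIV and MOD are partial, but still apply to the type.
                    continue
                except TypeError:
                    # The operation is meaningless for this pair of values.
                    return False
    return True
```

Under these assumptions DIGITS passes the check and MIXED fails it,
because no single group of operations treats 42 and fred alike.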


Typically, programming languages  separate  types  into  two  kinds:
primitive  types,  i.e.  types  which  cannot  be  decomposed,   and
composite or  structured  types.  Thus,  a  typical  general-purpose
programming language will provide type constructors for
well-understood mathematical constructs including Cartesian
products, disjoint unions, mappings, powersets and recursive types.
For example, the programming language Pascal supports record, variant
record,  array,  set  and  pointer  types,  together  with  'special
purpose' types, e.g. a file type.
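
The constructors listed above have close counterparts in most
languages. The following Python sketch shows one hypothetical example
of each; every name in it (Point, Circle, Square, Shape, Ages,
DigitSet, IntList, describe, length) is an assumption made for
illustration and none is taken from Pascal.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Union


@dataclass(frozen=True)
class Point:
    """Cartesian product: integer x integer (compare a Pascal record)."""
    x: int
    y: int


@dataclass(frozen=True)
class Circle:
    radius: int


@dataclass(frozen=True)
class Square:
    side: int


# Disjoint union: a Shape is either a Circle or a Square, and the
# class of the value acts as the tag (compare a variant record).
Shape = Union[Circle, Square]

# Mapping: from names to ages (compare an array indexed by a type).
Ages = Dict[str, int]

# Powerset: each value is a set of integers (compare a Pascal set).
DigitSet = FrozenSet[int]


@dataclass(frozen=True)
class IntList:
    """Recursive type: a head followed by another list or by nothing."""
    head: int
    tail: Optional["IntList"] = None


def describe(shape: Shape) -> str:
    """Inspect the tag of a disjoint union to decide how to treat it."""
    if isinstance(shape, Circle):
        return "circle of radius %d" % shape.radius
    return "square of side %d" % shape.side


def length(items: Optional[IntList]) -> int:
    """Count the elements of a recursive list by following its tails."""
    count = 0
    while items is not None:
        count += 1
        items = items.tail
    return count
```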


We will examine, next, the notion of abstraction because it is the
basis for one particular kind of abstraction which is of concern to
programming language designers, i.e. data abstraction.


Abstraction and Data Abstraction


Programming languages  provide  control  abstractions,  e.g.  skips,
assignments,  procedure  calls,  sequential   commands,   collateral
commands,  conditional  commands,  iterative  commands   and   block
commands, and data abstractions, e.g. modules, objects and classes.


A module  construct is a feature of many imperative languages  which
seek  to  provide  support  for  the   systematic   development   or
"engineering"   of   large-scale   modular   software   systems   as
compositions of named, separately compilable components. A module in
such  languages  encapsulates  its  components  (constants,   types,
variables,  procedures  and  functions)   enabling   a   (syntactic)
decomposition  of  a  procedure  hierarchy  into  sets  of   related
components with a well-defined (module) interface.


Two limitations of a module  construct  in  such  languages  can  be
identified:-


-     The inability of such a construct to provide a means of ensuring
      that the decomposition satisfies certain formal criteria, for
      example, that the operations defined in the interface are
      sufficient to define a type


-     The need to support a concrete realisation of a type such that
      instances of the type are created in a suitably initialised
      "state" and with "protection" for this encapsulated "state"


Types and Abstract Types


The notion of type as a set of values is suitable for most purposes;
however, problems can arise when a "new" type is to  be  defined  in
terms of existing types since we must choose  a  representation  for
values of the new type. In some cases, the representation  type  may
have values that do not correspond to  any  values  of  the  desired
type, or the  representation  type  may  have  several  values  that
correspond to the same value of  the  desired  type.  Consider,  for
example, the definition of the type  speed (written  in  a  language
with a Pascal-like syntax) shown below:-


TYPE speed = RECORD
               distance: integer;
               time    : integer
             END;


This definition might usefully embody some constraint, for example,
one which excludes those values of distance and time which have a
common factor, so that each speed has exactly one representation. We
might represent such a constraint as:-


{speed(m, n) | m, n ∈ integer AND n > 0 AND m, n have no common
factor}
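
One way to see the effect of the constraint is to map every
admissible pair to its unique representative. The following Python
sketch does this with the greatest common divisor; the function name
normalise is an assumption, as is the decision to reject pairs with
n <= 0 by raising ValueError.

```python
from math import gcd


def normalise(m, n):
    """Return the pair equivalent to (m, n) that has no common factor."""
    if n <= 0:
        raise ValueError("time must be greater than zero")
    divisor = gcd(m, n)  # gcd(0, n) is n, so (0, n) normalises to (0, 1)
    return (m // divisor, n // divisor)
```

Under these assumptions (10, 4) and (5, 2) are two values of the
representation type, but both normalise to the single pair (5, 2).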


In the example above a representation type  for  speed  was  defined
directly in terms of a type, i.e. a set of values.  Consider,  next,
how a type can be described indirectly  by  a  group  of  operations
rather than directly as a set of values, i.e. as an abstract type. The
definition of a "new" type as an abstract type  enables  undesirable
properties of a  representation  type  to  be  excluded.  Given  the
realisation of the abstract type speed shown below, a "user" of this
type can generate values  of  the  type  by  evaluating  expressions
involving the constants zero and one and  the  functions  cons_speed
and add_speed, and two values of the type speed can be compared by a
call to the function speed_eq. The implementation  of  the  function
cons_speed is used to constrain the value  of  its  second  argument
(time) to be non-zero:-


MODULE speed_def;
  INTERFACE
    TYPE speed = HIDDEN;
    FUNCTION zero: speed;
    FUNCTION one : speed;
    FUNCTION cons_speed(m: integer; n: integer): speed;
    FUNCTION add_speed (r: speed; s: speed    ): speed;
    FUNCTION speed_eq  (r: speed; s: speed    ): Boolean;
  IMPLEMENTATION
    TYPE speed = RECORD
                   distance: integer;
                   time    : integer
                 END;

    FUNCTION zero: speed;
    BEGIN
      .
      .
    END;

    FUNCTION one: speed;
    BEGIN
      .
      .
    END;
    .
    .
    FUNCTION cons_speed(m: integer; n: integer): speed;
        VAR  r: speed;
    BEGIN
      {n must be non-zero for (m, n) to denote a speed}
      IF n <> 0 THEN
      BEGIN
        WITH r DO
        BEGIN distance:=m; time:=n END;
        cons_speed:=r  {return the constructed speed}
      END
      ELSE ...{not a speed}
    END;
    .
    .
END.
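
A comparable realisation can be sketched in Python, where a leading
underscore is the (conventional, not enforced) means of hiding the
representation. The bodies elided above are not reproduced: the
definitions of zero, one, add_speed and speed_eq below assume that a
speed behaves like the fraction distance/time, and raising ValueError
for a zero time is likewise an assumption about what "not a speed"
should mean.

```python
class Speed:
    """Abstract type: users should rely only on the functions below."""

    def __init__(self, distance, time):
        # Representation: intended to be hidden from users of the module.
        self._distance = distance
        self._time = time


def cons_speed(m, n):
    """Construct a speed, refusing a time of zero."""
    if n == 0:
        raise ValueError("not a speed: time must be non-zero")
    return Speed(m, n)


def zero():
    """Assumed meaning: the speed 0/1."""
    return cons_speed(0, 1)


def one():
    """Assumed meaning: the speed 1/1."""
    return cons_speed(1, 1)


def add_speed(r, s):
    """Add two speeds as fractions over a common denominator."""
    return cons_speed(
        r._distance * s._time + s._distance * r._time,
        r._time * s._time,
    )


def speed_eq(r, s):
    """Compare by cross-multiplication, so 10/4 and 5/2 are equal."""
    return r._distance * s._time == s._distance * r._time
```

Because speed_eq compares by cross-multiplication, the several
representation values that correspond to one speed are
indistinguishable to a user of the type.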


It is important to note that the module construct used above to
realise the type speed provides a means of specifying types and not
type constructors[4]. If we wish, for example, to support a type
whose elements may be of different types, a separate module must be
provided for each element type. Where a whole class of similar
objects is to be created, a "generic module" is needed, i.e. a module
construct which permits an object class to be defined such that it
captures a whole class of behaviourally related types.


In the example below, a language whose syntax "resembles" the syntax
of a modular Pascal-like language is  used  to  describe  the  class
stack[5]:-


CLASS_MODULE stack(item_type            : TYPE;
                   CONST max_cardinality: natural
                  );
  INTERFACE
    {each operation applies implicitly to an instance of the class}
    PROCEDURE empty;
    FUNCTION  is_empty: Boolean;
    PROCEDURE push(item: item_type);
    FUNCTION  top: item_type;
    PROCEDURE pop;
  IMPLEMENTATION
    .
    .
END.
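
The same parameterisation can be sketched with Python's
typing.Generic. The class name Stack and the type variable ItemT are
assumptions, as are the behaviours chosen for the cases the interface
leaves open: pushing onto a full stack raises OverflowError, and top
or pop on an empty stack raises IndexError.

```python
from typing import Generic, List, TypeVar

ItemT = TypeVar("ItemT")  # plays the role of the parameter item_type


class Stack(Generic[ItemT]):
    """A bounded stack whose element type is a parameter of the class."""

    def __init__(self, max_cardinality: int) -> None:
        if max_cardinality < 0:
            raise ValueError("max_cardinality must be a natural number")
        self._max_cardinality = max_cardinality
        self._items: List[ItemT] = []  # hidden representation

    def empty(self) -> None:
        """Discard every item, leaving the stack empty."""
        self._items.clear()

    def is_empty(self) -> bool:
        return not self._items

    def push(self, item: ItemT) -> None:
        if len(self._items) >= self._max_cardinality:
            raise OverflowError("stack is full")
        self._items.append(item)

    def top(self) -> ItemT:
        if not self._items:
            raise IndexError("top of an empty stack")
        return self._items[-1]

    def pop(self) -> None:
        if not self._items:
            raise IndexError("pop of an empty stack")
        self._items.pop()
```

A single definition now serves every element type, for example
Stack[int](max_cardinality=10) and Stack[bool](max_cardinality=10),
where a non-generic module would need one copy per element type.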


Given the above definition of a stack, an implicit subtyping
relation[6] enables a stack "user" to assign the value of a
"subtype" where an instance of the "supertype" is expected, i.e.
s1:=s2 is "type safe" in the example below, but s2:=s1 is not.


PROGRAM stack_user;
    .
    USES stack;
    TYPE natural = 0..maxint;
    VAR  s1      : stack[integer];
         s2      : stack[natural];
         n       : natural;

    PROCEDURE local_typed_procedure;
       VAR s: stack[boolean];
    BEGIN
      s.empty;
      s.push(true);
      .
      .
    END;

BEGIN
  s1.push(3);
  n:=3;
  s2.push(n);
  .
  .
  s1:=s2  {type safe: every natural is also an integer}
END.
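
The asymmetry between the two assignments can be illustrated by a
Python sketch that checks element types at run time, since Python has
no subrange types. The names is_integer, is_natural and assign_items
are assumptions, stacks are modelled as plain lists, and assignment
is taken to copy the elements, as it would for values in a
Pascal-like language.

```python
def is_integer(value):
    """Membership test for the type integer (bool is excluded)."""
    return isinstance(value, int) and not isinstance(value, bool)


def is_natural(value):
    """Membership test for natural = 0..maxint (the upper bound is ignored)."""
    return is_integer(value) and value >= 0


def assign_items(source, belongs_to_target):
    """Copy source if every element belongs to the target element type."""
    for item in source:
        if not belongs_to_target(item):
            raise TypeError("%r is not a value of the target type" % (item,))
    return list(source)
```

Copying [3, 0] (naturals) into a stack of integers always succeeds,
whereas copying [3, -1] (integers) into a stack of naturals fails at
the element -1, which is why only s1:=s2 can be accepted without
inspecting the values.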


Although the concepts of abstract type and object class  are  subtly
different[7], both enable the creation of values of a  type  whose
representation is hidden and whose state  can  be  changed  only  by
operations with exclusive access.


Abstract Types and Abstract Data Types


Thus far, a type has been characterised in terms of a set of  values
and also as an abstract type realised by a  module  whose  interface
component defines the signature of the operations over the type  and
whose  implementation  component  "hides"  the   definition   of   a
representation type for the abstract type.


Where a type is defined solely in terms of operations over the  type
and not in terms of a representation  type  then  it  is  termed  an
abstract data type. Such a  type  can  be  defined  using  a  module
construct  whose  implementation  component  is  "replaced"   by   a
specification component, for example, the type stack shown below:-


MODULE stack_adt;
  INTERFACE
    USES item_type;
    TYPE stack;
    FUNCTION  empty: stack;
    FUNCTION  is_empty(s: stack                 ): Boolean;
    FUNCTION  push    (s: stack; item: item_type): stack;
    FUNCTION  top     (s: stack                 ): item_type;
    FUNCTION  pop     (s: stack                 ): stack;
  SPECIFICATION
    VAR i: item_type;
        s: stack;
    EQUATIONS
      is_empty(empty)       = true;
      is_empty(push(s, i))  = false;
      pop(push(s, i))       = s;
      top(push(s, i))       = i;
END.


This style of description, termed an algebraic specification, has
the advantage that it is possible to generate an executable
representation "automatically", for example by reading each equation
as a left-to-right rewrite rule.
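
One hedged illustration of such an executable representation, written
by hand rather than generated, is the following Python sketch. It
represents each stack as the term that built it and reads the four
equations as left-to-right rewrite rules; the class names Empty and
Push are assumptions, and raising ValueError for top and pop of the
empty stack is a choice the equations themselves leave undefined.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Empty:
    """The term for the empty stack."""


@dataclass(frozen=True)
class Push:
    """The term push(s, i): a stack s with the item i placed on top."""
    stack: "Stack"
    item: Any


Stack = Union[Empty, Push]


def empty() -> Stack:
    return Empty()


def push(s: Stack, i: Any) -> Stack:
    return Push(s, i)


def is_empty(s: Stack) -> bool:
    # is_empty(empty) = true; is_empty(push(s, i)) = false
    return isinstance(s, Empty)


def pop(s: Stack) -> Stack:
    # pop(push(s, i)) = s
    if isinstance(s, Push):
        return s.stack
    raise ValueError("pop(empty) is not defined by the equations")


def top(s: Stack) -> Any:
    # top(push(s, i)) = i
    if isinstance(s, Push):
        return s.item
    raise ValueError("top(empty) is not defined by the equations")
```

Each equation can then be checked directly, for example by asserting
that pop(push(s, i)) == s for a sample stack s and item i.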


Conclusions


This lecture has considered, first, how a type can be regarded
solely as a means of collecting together a set of values, secondly
how it can be realised as an abstract type using a module construct,
and thirdly how it can be defined as an abstract data type by
specifying the meaning (semantics) of the operations over the type
algebraically.


It has been shown how  a  module  construct   enables  types  to  be
defined but not type constructors and how, if a "class"  of  related
types is  to  be  defined,  support  for  parameterisation  must  be
provided by the module construct.


The next lecture will examine how  a  module  construct  supports  a
variety of different "styles" of description. One  particular  style
of description identified in this lecture, i.e. a  module  which  is
parameterised by a type, is then shown to be the  basis  for  object
oriented programming when this style of description is combined with
the notion of inheritance.


Chris Harrison, January 1996.


-----------------------
[1] A not unreasonable assumption since such languages have been used as a
means of teaching introductory programming for many years and for good
reasons.


[2] The object oriented paradigm has its conceptual basis in  record
structures (called objects) intended  to  be  named  collections  of
values (attributes) and functions (methods). Collections of  objects
form classes and a subclass  relation  defined  on  classes  enables
methods to work "appropriately" on  all  members  belonging  to  the
subclass of a given class. The first object  oriented  language  was
arguably Simula67. Many more recent languages are typed using simple
extensions of  the  type  rules  for  Pascal-like  languages.  These
extensions involve principally a notion of subtyping and  also  more
powerful  type  systems  that   elegantly   incorporate   parametric
polymorphism.


[3]  As  stated  in  lecture  -1,  the  functional  and  declarative
paradigms seek to overcome a  problem  which  is  at  the  heart  of
Software Engineering, i.e.


"As the size  of  a  piece  of  software  increases  the  number  of
potential interactions between  components  increases  exponentially
until  they  cannot  be   understood,   maintained   or   documented
effectively. This effect dominates both the  cost  of  software  and
also limits its applications."


Functional languages seek to overcome the problem by  "doing  away"
with assignment (the basis of the imperative paradigm where commands
update variables which may be shared) by requiring  programs  to  be
stepwise refined into hierarchies of function definitions which  are
ultimately used to define the expression which represents the result
of the program, e.g. an integer,  a  file  of  records  or  a  value
representing an image to be displayed.  Declarative  languages  also
avoid explicit  updating  of  variables  albeit  using  a  different
technique - see Communications of ACM Vol. 28, No 12, December  1985
"Describing  Prolog  by   its   Interpretation   and   Compilation".
Unfortunately even these "radical" attempts at changing the way  the
majority  of  programs  are  written  have  not  been   particularly
successful in  reducing  the  complexity  of  large  scale  software
systems.


An approach which has been more successful and which  is  applicable
to a variety of different programming language styles is to  provide
direct support for defining sets of constants, types, procedures and
variables with well defined interfaces, i.e. modules.


[4] The use of the reserved word HIDDEN in the  interface  component
of the type speed enables users of the module to  declare  variables
of type speed and  to  define  other  types  which  have  speeds  as
components,  without  being  able  to  access   the   implementation
structure of speeds.


[5] The type stack is parameterised  by  a  TYPE  (and  also  by  an
attribute CONST max_cardinality: natural), and the  "methods",  i.e.
type generators and observer functions, push, is_empty, top and  pop
apply implicitly to an instance of the type (denoted by self).


[6] It is assumed that  the  type  natural  is  a  subrange  of  the
discrete primitive type integer, i.e. that a subtype can be  defined
as a subrange.


[7] Consider, for example, how a module does not "create" any
variable of the abstract type it defines; instead, it binds an
abstract type definition to an encapsulated set of bindings such
that the abstract type may be used to declare several distinct
variables. Conversely, an object class supports several objects, each
of which defines several distinct methods which access a distinct
object.