Send message to: martin_king@uk.ibm.com, (Martin King), or nell@nist.gov, (Jim Nell) Workshop secretary. Return to: JSW Home Page.

On Modelling in the Context of Information Systems


Date: 31 May 1996
Source: Martin King
Project: JTC1 Workshop on Standards for the use of Models that Define the Data and Processes of Information Systems
Status: Draft Contribution from industry expert for consideration for inclusion at the Workshop, supported by UK national body.

Abstract

The objective of this paper is to propose some principles and guidelines for those interested in standardising any form of modelling facility or language. It examines aspects of various activities commonly described as modelling in the context of information systems. It attempts to clarify the senses in which something can model something else. These provide a basis for the proposed guidelines.

The basic concept of modelling is simple: one thing, the model, represents in some sense something else, the original. While this is simple to state, it appears to be open to very different interpretations. The paper sets out some basic principles of modelling and the relationship between the model and what is modelled. The range of uses of models is discussed. A model is generally built with a particular purpose in mind; it may or may not be suitable for different uses. A distinction is drawn between a "direct" and an "indirect" model. The latter involves in some sense a model which primarily relates to one thing, and that in turn is seen as modelling another. While a model in some sense represents something else, a thing which represents something else is not necessarily a model, for example a name. This distinction is clarified and identified as a potential source of complexity and confusion.

Any information system can be considered to serve a purpose within a context. Typical contexts include businesses, government or other non-profit organizations, or may be some mechanical or technical process or other more limited context within an organization. The word enterprise is chosen to refer to any such wider or more limited context. The nature of these contexts or enterprises is explored. Some sources of complexity and general principles relevant to modelling are identified. The need to be able to model any relevant aspects of an enterprise is highlighted.

The essentially linguistic nature of information systems is discussed. The presence of many levels of representation and the potential of this for causing confusion is explored. The motivation and evolution of modelling and the concept of meta-models are reviewed.

Against the foregoing, the challenge of modelling in the context of information systems is compared with that of modelling in other contexts. The conclusion is that modelling in the context of information systems is particularly complex. Additional complexity is identified in the inherent nature of large enterprises, information technology, and environmental change. The risk is of confusion caused by complexity. It is argued that it is vital to be clear in any modelling what is being modelled and that the model is constructed in a way that preserves this clarity.

A list of principles or criteria for evaluating the objectives and positioning of any modelling facility or language is proposed. These will include the following:

Various IS projects are reviewed against the identified criteria. These include:

The paper concludes with a proposal for the wider use of the identified criteria for evaluating standardisation activity within the scope of the workshop.

Objective

The objective of this paper is to examine aspects of various activities commonly described as modelling in the context of information systems. It attempts to clarify the senses in which something can model something else and propose some principles and guidelines for those interested in standardizing any form of modelling facility or language.

Preliminaries

The basic concept of modelling is simple: one thing, the model, represents in some sense something else, the original. The model can be many different things, for example a solid object, a diagram, a set of language statements, or may even be abstract. The original can be almost as varied in nature as the model.

A model is considered to be different from a replica. That is, a replica may be virtually indistinguishable from the original, whereas the model will normally represent the original only partially. Typically a model will deliberately represent certain features of interest of the original while equally deliberately not representing other features. The choice of what to represent or not represent is the choice of the modeller, guided by the purpose of the model rather than anything inherent in the original.

Architects will often have a model made of a major new building project. Such a model will be much smaller than the original and made of quite different materials. (In this and many other cases, the model may exist before the original, which might therefore be more naturally called the target.) It will typically represent much detail only approximately or not at all. It will enable people to imagine what the building would look like.

A map is considered to be a model of the part of the earth's surface it represents. A piece of descriptive language is considered to be a model of whatever it describes. These uses of the word model, particularly the latter, may not be so common. However, in the context of information systems, such usage is accepted, and is included for this discussion.

Uses of Models

A model will normally be constructed with a particular use in mind. Therefore the method used for modelling will be chosen to be appropriate for that use. This does not prevent the model being usable for other purposes, though it may not be so suitable. For example, an architect will create drawings of a house to guide the builder. He will follow widely observed conventions that facilitate the builder understanding what is required and constructing it appropriately. One such drawing may be very useful in planning and estimating how to purchase and lay carpet for some part of the house. It so happens that the conventions that help the builder are also appropriate for the carpet planner, but this is fortuitous rather than intentional. A physical model of a building to illustrate its visual effect is unlikely to be convenient for carpet planning.

Direct and Indirect Modelling

The simple and widespread case is where the direct and natural interpretation of a model is the intended target or original. The architect's floor plan of a house is unlikely to be interpreted in any other way than as representing the building. Any use of it for planning carpets is unlikely to be seen as changing this.

In contrast to this, a model whose intended or natural interpretation is one thing, and that thing itself models a third in some sense, can be interpreted as modelling the third. This is different from the use of the floor plan of a house to plan carpets. For example, an SQL schema, which defines the set of tables and the relationships between them in a relational database, can be interpreted as in some sense modelling the enterprise the database serves. While this may come relatively easily and naturally to those with extensive experience of modelling at various levels, it may not be so easy for those without such experience. Depending on the nature of the modelling formalism or language, there may be a degree of judgement or arbitrariness in deciding what is the natural and direct interpretation of any given model. However, it is probably safe to assume that any departure from the simplest and most direct interpretation is likely to add some difficulty to quick and unambiguous understanding.
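The SQL schema case can be illustrated with a minimal sketch. The table and column names, and the wording of the enterprise reading, are hypothetical and chosen only for illustration. The direct interpretation is recovered mechanically from the database itself; the indirect interpretation is a mapping that a reader has to supply.

```python
import sqlite3

# Hypothetical schema; the names are illustrative and not taken from any real system.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    placed_on   TEXT NOT NULL
);
"""


def direct_interpretation(schema_sql):
    """Return what the schema directly describes: tables and their columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)
    table_names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    tables = {}
    for table in table_names:
        # Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
        tables[table] = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    conn.close()
    return tables


# Indirect interpretation: an assumed reading of each table as something in the
# enterprise. Nothing in the schema itself states this; the reader supplies it.
INDIRECT_READING = {
    "customer": "a party that buys from the enterprise",
    "customer_order": "a request by a customer for goods",
}

tables = direct_interpretation(SCHEMA)
```

The schema states only that tables and columns exist. That a row of `customer_order` stands for a request made in the enterprise is a second step of interpretation, and it is this step that may be difficult for readers without experience of modelling at several levels.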

It may be useful to draw a parallel with different uses of normal human languages in this respect. If I wished to comment on the application of Marxist principles in practice, I could write explicitly of them and interpret historical characters and events in the light of them. If my research, my analysis, and my writing are good, I should produce a book that the interested reader should be able to follow and learn from. George Orwell, in contrast, wrote the book "Animal Farm" in a totally different allegorical style. Much of what he writes is clearly fiction and admittedly implausible (animals using language for example), but the serious message is there, based on his observations of the historical events and characters. This approach may be extremely effective as a medium for conveying views and changing attitudes of a potentially sceptical audience. It is unlikely to be so effective for communicating precise information about a complex subject to a favourably motivated audience.

Models and Representation

It has already been stated that a model, in an important sense, is considered to represent the original. In contrast, not everything that represents something else seems to be a model. For example, a name is commonly taken to represent the thing having the name and is unlikely to be considered to be a model in any sense. This distinction is important even though there may be cases where it is arguable whether or not the representing thing can be called a model.

The reason for its importance is its potential for confusion. This arises particularly where the thing being modelled contains both things and names of things, so that the model may need to represent both the things and their names. Often the model will represent a thing by the same name that the thing has in the original.

Enterprises

Any information system can be considered to serve a purpose within a context. Typical contexts include businesses, government or other non-profit organizations, or may be some mechanical or technical process or other more limited context within an organization. The word enterprise is chosen to refer to any such wider or more limited context.

In general, an enterprise will consist of many different tangible and abstract entities. There will be many complex relationships between them. In particular some of the entities within an enterprise will represent, or even model, other entities in it. It may be important to recognize a name independently of the entity it identifies. A delivery is an observable happening in a typical enterprise, and a delivery note is an important item of paperwork as evidence of that happening, but the two things are importantly different; whether or not it is desirable to model either or both of these in a particular context, there is certainly scope for confusion between the two. Another example is the concept of a contract. Typically the word may be interpreted as referring to a piece of paper, but it is a fundamental point of law that the contract is an abstract state of obligations between the parties involved. This exists independently of any piece of paper evidencing it. There are certain cases in English law where a written contract is required; however, the contract is still distinct from the piece of paper.
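The distinction between a delivery and a delivery note can be kept explicit in a model by giving each its own entity type. The following is a minimal sketch; the class names, attributes, and sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delivery:
    """The observable happening: goods arriving (hypothetical attributes)."""
    delivery_id: str
    occurred_on: date


@dataclass(frozen=True)
class DeliveryNote:
    """The paperwork evidencing a delivery; a separate entity in its own right."""
    note_number: str
    evidences: Delivery
    issued_on: date


def notes_for(delivery, notes):
    """Return the notes that evidence the given delivery."""
    return [note for note in notes if note.evidences == delivery]


delivery = Delivery("D-1", date(1996, 5, 31))
original = DeliveryNote("N-100", delivery, date(1996, 5, 31))
duplicate = DeliveryNote("N-101", delivery, date(1996, 6, 3))

# A delivery that happened but for which no paperwork has been recorded.
unrecorded = Delivery("D-2", date(1996, 6, 1))

all_notes = [original, duplicate]
```

Because the two are modelled separately, one delivery can be evidenced by several notes, and a delivery can be represented even when no note exists. A model that merged them into a single entity could express neither situation.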

Depending on the purpose of any model, it is likely to be desirable to be able to model any of these aspects of an enterprise.

Information Systems

Any information system can be seen as essentially linguistic. That is to say, it is capable of accepting some form of linguistic object as input and providing some form of linguistic output. The input and output may be separated to a greater or lesser extent in content, space, or time. Some information systems may also handle some "physical" input and output; for example, a baggage handling system may measure the force exerted by an item and express it as a weight; it may then activate mechanisms to route different items to different physical locations. However, at some point very close to the physical input or output, there will be a linguistic representation corresponding closely with it. Therefore information systems essentially represent aspects of the enterprise they serve in the same way as any other linguistic object does.

Information systems contain many levels of representation. In physical terms, the user sees output on screen or printer as readable characters; in each case the characters are likely to be formed as a matrix of dots, and these in turn will be sequences of electronic signals recognized as a bit pattern. Within the memory of the system, the characters will exist as different patterns of eight bits, each unique pattern representing a letter of the alphabet, a numeral, or some other symbol; these same characters will be stored on magnetic disk or CD-ROM in yet other patterns. The programmer will normally not be concerned with these different representations; he will be concerned with the name given to represent the area of memory for holding a string of characters with particular meaning. The database administrator will be concerned with the structure of the data in the data dictionary that determines the content and usage of the data. Mercifully, most of these levels of representation are invisible most of the time in using, and to a lesser extent in constructing, an information system. However, some of them, particularly the dictionary levels, can impinge on modelling activity.
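A few of these levels can be made visible with a short sketch. It assumes ASCII as the eight-bit character encoding, which the text does not specify, and the function and variable names are hypothetical.

```python
def representation_levels(text, encoding="ascii"):
    """Show one string at three levels of representation.

    The encoding is an assumption; other encodings give other bit patterns.
    """
    encoded = text.encode(encoding)
    return {
        "characters": list(text),                                # what the user reads
        "byte_values": list(encoded),                            # numeric codes in memory
        "bit_patterns": [format(b, "08b") for b in encoded],     # eight-bit patterns
    }


# The programmer's level: a name standing for an area of memory holding a string.
greeting = "Hi"

levels = representation_levels(greeting)
# levels["bit_patterns"] is ["01001000", "01101001"] under ASCII
```

The name `greeting`, the characters, the byte values, and the bit patterns all represent the same thing at different levels, and only the first two are normally visible to the programmer.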

Perhaps because of this complexity, and certainly influenced by the generality of database structures, there has been widespread use of models within the development of information systems. In particular, it has been commonplace to construct a model of the data of a system, and this in turn has been done in a way that is itself modelled; this second model is often referred to as a meta-model.
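The relationship between a data model and a meta-model can be sketched as follows. The structures and names are hypothetical and greatly simplified: the meta-model states what any data model may contain, and the data model is checked for conformity to it.

```python
# Meta-model (assumed, simplified): the kinds of element a data model may
# contain and the properties each kind must have.
META_MODEL = {
    "EntityType": {"name": str, "attributes": list},
    "Attribute": {"name": str, "type": str},
}

# Data model (hypothetical): describes the data of one particular system.
DATA_MODEL = [
    {
        "name": "Customer",
        "attributes": [
            {"name": "customer_id", "type": "integer"},
            {"name": "name", "type": "text"},
        ],
    },
]


def conforms(element, kind, meta_model=META_MODEL):
    """Check one model element against the meta-model entry for its kind."""
    spec = meta_model[kind]
    if set(element) != set(spec):
        return False
    return all(isinstance(element[key], expected) for key, expected in spec.items())


def model_conforms(data_model):
    """Check every entity type, and every attribute within it, against the meta-model."""
    return all(
        conforms(entity, "EntityType")
        and all(conforms(attr, "Attribute") for attr in entity["attributes"])
        for entity in data_model
    )


valid = model_conforms(DATA_MODEL)
invalid = model_conforms([{"name": "Orphan"}])  # lacks the required "attributes"
```

Here `DATA_MODEL` models the data of a system, while `META_MODEL` models the way data models are written. Neither says anything directly about the enterprise, which is one reason it is important to be clear which level is under discussion.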

Modelling Challenges

We can now compare the challenge of modelling in the context of information systems with that of modelling in many other contexts. An architect's floor plan of a house is a relatively simple object compared with an information system; the most pedantic analysis would be required to find multiple levels of representation within it. The house that it represents does not have parts of itself representing other parts; they are themselves with their particular functions, inherent or assigned. Thus, in this and in many other cases we have a simple picture of one thing modelling another with little or no question of other representations in the context under consideration.

Large enterprises are often inherently highly complex. The pressures of modern economic life and the advances of technology in all fields lead to ever greater specialization of functions within every organization. The field of information technology has a faster rate of advance than many, so information systems are similarly becoming ever more complex. Thus, in the context of information systems, we find complex enterprises with levels of representation within them being modelled in order to build information systems which themselves are complex and have multiple levels of representation. This is a situation ripe for confusion, and confusion is rife in practice. A further important source of difficulty is change. In many important cases of things to be modelled, change is endemic. Sources of change include legal, social, economic, and business factors.

Thus it is vital to be clear in any modelling what is being modelled and that the model is constructed in a way that preserves this clarity.

Evaluation Principles and Criteria

Modelling and the ISO CSMF Work

The ISO CSMF project is concerned with a standard for a Conceptual Schema Modelling Facility. The approved project proposal (ISO/IEC JTC1/SC21/N8060) states "The CSMF permits specification [in a conceptual schema] of concepts and terms about one or more subject areas (referred to as Universe of Discourse in ISO TR9007)." The terms "subject area", "UoD", and "enterprise" are in this context considered synonymous. TR9007 devotes considerable content to defining the meaning of the UoD and the relationship between it and the conceptual schema. It consistently holds that the conceptual schema describes (otherwise models) the UoD. However else it may be used, including controlling any aspect of an information system, it does not describe (represent, or model) anything other than the UoD. The current work of the CSMF project is fully in line with this position; that is, it sees the conceptual schema as modelling the enterprise (or UoD).

TR9007 proposes, and the current CSMF work explicitly accepts, two important principles governing how the conceptual schema models the UoD. The 100% principle covers the requirement for the ability to completely describe any relevant aspect of a UoD. The Conceptualization principle is the requirement for the ability to describe what is relevant without having to describe what is not relevant. This can be interpreted to require that the conceptual schema should model the UoD directly, rather than in the indirect mode which is all that an SQL schema can offer.

OMG BOMSIG

The OMG Business Object Management Special Interest Group springs from the mainstream of the Object Management Group. It is interested in looking more to the business aspects of applications than the mainstream, whose main work (e.g. CORBA) is aimed at the technical infrastructure of IT systems. A baseline document is the OMG Reference Model for Business Applications, Draft 2, June 22, 1995. This document pleads for application interoperability in addition to the infrastructure interoperability offered by OMG technology. It argues that the problem is closely parallel to the problem faced in the DBMS world of interoperation of different schemas. This was an important motivation for the work leading to TR9007 and an important motivation for the current CSMF project. Section 4.1 of the OMG document states "... a business application model is intended to deal exclusively with business concepts and terminology ...". This seems close to the conceptualization principle of ISO TR9007.

The ISO ODP Work

A review of the parts of the Basic Reference Model of Open Distributed Processing (ISO/IEC 10746) finds various references to modelling or similar concerns. The Scope clause of Part 2 speaks of "... the concepts which are needed to perform the modelling of ODP systems ...". Part 3 refers to "... the specification of an ODP system ..." and "... defines the purpose, scope and policies of an ODP system ..."; while this does not use the term modelling, it seems closely parallel. In Part 3 clause 5.2, (part of the definition of its "Enterprise Language") the RM-ODP does recognize the equivalent of the enterprise as the environment within which an ODP system operates. It appears to consider this can be represented as an object in a community, along with other objects that are parts of the ODP system. On this evidence, as far as the RM-ODP is concerned with modelling, the emphasis is on modelling the ODP system rather than on modelling the enterprise in its own right independently from any ODP system. It also appears that it is not concerned with making the clear distinction between the enterprise and the information system or model and the thing modelled. This is in contrast to TR9007 and the current CSMF work and the position taken by this paper.

The recent proposal for a new work area for an RM-ODP Enterprise Viewpoint Component Standard (ISO/IEC JTC1/SC21 N9773) includes the following words "... Standard for modelling and specifying ODP systems ...". This is consistent with the analysis above of the RM-ODP; the target of the model is the ODP system, not the enterprise the ODP system serves. The following text includes "... to define enterprise viewpoint concepts at a level of detail required to model and specify a Business Object Architecture for enterprise applications." and "Extend the RM-ODP object model to a more concrete level sufficient to specify the design of components which can be physically implemented ...". On the face of it these two statements could be advocating two quite different things. The first quotation, in mentioning "Business Object Architecture", might be interpreted as referring to a model of the business (enterprise). The second, in talking about the design of components, would seem to be moving in a different direction from modelling an enterprise.

The new work area proposal refers to interest from the OMG Business Object Management Special Interest Group. It is relevant therefore to compare the new work area proposal with the direction of BOMSIG. As stated above, BOMSIG seems close to the conceptualization principle of TR9007, which is very different from the ODP flavour of modelling an ODP system. Neither the RM-ODP work nor the OMG BOMSIG seems to be totally clear on the thing being modelled, but in the terms analysed in this paper, they appear to be significantly different. This lack of clarity seems to show in the new work area proposal. A key motivator of the OMG BOMSIG is adherence to the concept of "object". The origin of this concept is in programming technology and, while it is highly successful in that area, and to some extent in the human-machine interface area, it is arguable that it is inappropriate in enterprise modelling; the recent meeting of the ISO CSMF RG concluded there was no apparent reason to prefer the term object and adopt any of its associated concepts in preference to the established term of entity and its associated concepts. This motivator of the OMG may however explain their association with the RM-ODP work rather than the CSMF project, with which some of their motivation might correspond more closely.

Other Related Modelling Work

To be completed.

Conclusions

In the context of information systems, modelling plays, and is likely to continue to play, an important role. Because this context is so complex, it is highly desirable that any modelling be quite clear and explicit about what is being modelled. It follows that any design or standards work about modelling facilities should be equally clear and should ensure that the resulting modelling facilities enable this clarity to be maintained in use of the facility.

One particular area of modelling, which is being increasingly recognized as important, is that of modelling an enterprise totally independently of any implementation of any information system. This is fundamental to the ISO work leading up to TR9007 and the current CSMF project. For this work to succeed, the resulting modelling facility must provide an interface to the enterprise expert that communicates the complex factual content in an effective way without unacceptable levels of aptitude or training being required.

However, in this as in other areas, there is increasing risk of re-inventing the wheel. Application of the criteria proposed in this paper is suggested as a useful step towards reducing this risk.

