is physically maintained. The user can choose between virtual and derived data as a result of considering trade-offs based on: estimated cost of calculation; frequency of update; estimated cost of storage; and frequency of access. For example, suppose a file contains a list of budgets for various projects in a department. The departmental budget can be calculated as a function of the individual project budgets. This information might be defined as derived data since it is expected to be updated infrequently (e.g., once a year), while it is expected to be accessed relatively often. Options will be provided which give the user control with regard to when the calculation of derived data is to be done. These options will be similar to those provided for control of data validity operations. The data validation and derived data concepts are similar in that some operation must be performed on related data. In the case of data validation, the information derived is the condition of data. 3.9 Internal Representation To this point, we have discussed only the high level, logical, aspects of data. Since data, at any given time, must reside on some physical device a representation of the data must be chosen. In some cases it is appropriate to leave this choice to the datacomputer system. For example, the representation of information which is used in the process of transmitting other data, but which itself resides solely at the datacomputer may not be of any concern to the user. However, it is important that the user be capable of controlling the choice of representation. In any application which requires mostly transmission of data rather than interpretation of the data by the datacomputer, the data should be maintained in a form consistent with the system which communicates with the datacomputer. With respect to basic types of data, datalanguage will provide most representations commonly used in systems with which it interacts. For some types (e.g., fixed point) this will be accomplished by providing for parametric (e.g., sign convention, size) description of the representation. In other cases (e.g., floating point) specific representations will be offered (e.g., system 360 short floating point, system 360 long floating point, pdp-10 floating point, etc.). Another aspect of the internal representation problem regards aggregate structures. The method chosen to represent aggregate structures may largely affect the cost of manipulating the data. The user must have control over this representation since only he has any idea of how the data is to be used. Datalanguage will provide a variety of representational options which will allow for efficient implementation of data structures. This includes the availability of auxiliary
structures, automatically maintained by the data computer system. These structures can be used to effect efficient retrieval of subsets of data collections based on the contents of the members (i.e. the common concept of indices), efficient maintenance of orderings on a collection of data, maintenance of redundant information for the purpose of data integrity, and efficient handling of shared data whose behavioral characteristics are dependent on the path of access. It should be noted here that, the datalanguage design effort, will attempt to provide methods whereby the data user can describe the expected use of his data, so that details of internal representation can be left to the datacomputer. 3.10 Data Attributes and Data Classes The type of an item determines the operations which are valid on that item and what they mean. _Data_attributes_ are refinements on the type of data. The data attributes affect the meaning of operations. For example, we would like to provide for the option of defining fixed point items to be scaled. The scale factor, in this case, would be an attribute of fixed point data. It effects the meaning of operations on that data. The attribute concept is useful in that it allows information concerning the manipulation of an item to be associated with the item rather than with the invocation of all operations on that item. The attribute concept can be applied to aggregate as well as basic data. For example, one attribute of a list could define where a new member is to be inserted. Options might be: insert at the beginning of the list; insert at the end of the list; or insert in some order based on the contents of the member. Adding a new member to a list with one of the above attributes could be done by issuing a simple insert request without having to specify where the new member is to be inserted. The _data_class_ concept is actually the inverse of the data attribute concept. A data class is a collection of data types. The data class concept allows for definition of operations, independent of specific type of an item. For example, by defining the data class arithmetic to be composed of fixed point and floating point types of data, the comparison operators (_equal_, _less_than_, etc.) can be defined to operate on arithmetic data, independent of whether it is fixed or floating point. Also the concept of data aggregate can be seen as a class encompassing directories, lists, etc. As there are operations defined on arithmetic data, there are also operations defined on arbitrary aggregates. The inverse relationship between data classes and data attributes is very strong. For example, the concept of list can be seen as a data class, encompassing all types of lists (e.g., lists of integers, lists
of character strings, etc.), independent of the types of their members. The type of a list's members (e.g., integer, character string, etc.) are then seen as attributes. Data attributes and classes are also relative concepts. While the concept of list can be viewed as a data class, it can also be seen as an attribute, relative to the concept of data aggregate. 3.11 Data Description A _data_description_ is a statement of the properties (see discussion of attributes) of a data item. Examples of properties which are recorded in a description are: the name of an item; its size; its data type; its internal representation; privacy information; etc. Datalanguage will contain mechanisms for specifying data descriptions. These descriptions will be processed by the data computer, and used whenever the data item is referenced. The user will be able to physically create data only by first specifying their descriptions. The properties of a description can be divided into groups according to their function. Some have the function of specifying details of representation, which will not be of interest to most users, while others, such as the name are of almost universal interest. All user data is a part of some larger (user or system) data structure. The structures containing data establish a path of access to the data. In the process of following this path the datacomputer system must accrue a complete description of the data item. For example, the description of a data item of a directory may be found associated with that node of the directory. Members of a list or array are described as part of the description of the list or array. We must dispose of two seeming exceptions. First, while aspects of data may (on user request) be left to the system, those aspects are still described, they are described by the system. As discussed above, some data will be, to some degree, self describing (e.g. members of mixed lists). However, it is fully described in some encompassing structure, in that a method of determining the full description is described. It is worth noting here that the sooner a complete description is found in the path of access, the more effective the datacomputer is likely to be in processing requests which manipulate a data item. However, the ability to have data whose complete description does not exist at high levels of the access path provides greater flexibility in the definition of data structures.
3.12 Data Reference Data cannot be manipulated unless it can be referenced. In the same way that data cannot exist without its being described, it cannot exist unless there is a path of access to the data. The method of data reference is to define the path of access to the data. As mentioned above, there is a method of referencing any item relative to the data aggregate which contains it. Nodes of directories and components of structs are referenced via the name associated with the node or component. Members of arrays are referenced via the index associated with the member. Members of lists are referenced via some method of specifying the position of the member or by uniquely identifying the member by content. To reference any arbitrary data item the path of access must be fully defined by either explicit or implicit definition of each link in the chain. In the case of virtual data there is an extra implicit link in the chain, that being the method employed to obtain the data from other data items. It should be noted also that if pointers are provided (see discussion on general relational capabilities) they can also serve as a link in the chain of access to an item. The design of datalanguage will ease the problem (and reduce the cost) of referencing data items by providing methods whereby part of the access path can be implicitly defined. For example, datalanguage will provide a concept of "context". During the course of interacting with the datacomputer, levels of context can be set up so that data can be referenced directly, in context. For example, on initiating a session the user may (in fact will probably be required to) define a directory which will be the context of that session. All items subordinate to this directory can be referenced directly in this context. Another feature will be partial qualification. Each level of struct need not be mentioned in order to reference an item embedded in a deep nest of structs. Only those intermediate levels which are sufficient to uniquely identify the item need be specified. 3.13 Operations In this section we discuss the builtin functions of datalanguage which are of central importance in manipulating data. Functions which operate on items, functions which operate on aggregates, primitive functions and high-level functions are discussed. Of the primitives which operate on items, those of most interest are assignment, comparisons, logicals, arithmetics and conversion functions. Primitive assignment transfers a value from one item to another; these items must be of the same type. When they are of different types,
either conversion must be performed, or some non-primitive form of assignment is involved. The comparison operators accept a pair of items of the same type, and return a boolean object which indicates whether or not a given condition obtains. The type determines how many different conditions can be compared for. A pair of numeric items can be compared to see which is greater, while a pair of uninterpreted items can be compared only for equality. In general, a concept of "greater than" is builtin for a datatype only if it is a very widely applied concept. The comparison operators are used in the construction of inclusion conditions when defining subsets of aggregate data. The result of a comparison operation is a boolean item: one whose value is either TRUE or FALSE. Logical primitives are provided and generalized boolean functions can be constructed from them. With logical and comparison operators, complex conditions for inclusion of objects in sets can be specified. Arithmetic operators will be available for the manipulation of numeric data. Here, we are not interested in generalized computation, but in applications of arithmetic in data selection, space allocation, subscript calculation, iteration control, etc. Conversion is an important part of generalized data translation, and we are interested in providing a substantial builtin conversion facility. In particular, we will want to provide an efficient system routine for each "standard" or widely-used conversion function. Of particular importance are conversions to and from character string data; in character string representation of, for example, numeric items, there are many possible formats corresponding to a single data type. Conversion between character sets and dealing with padding and truncation are viewed as conversion problems. There are two principal classes of primitive operators defined on aggregates: those related to data reference (see previous section) and those which add and delete components. Changing an existing component is accomplished through assignment, and is an operation on the component, not the aggregate. Addition and deletion of components is defined only for aggregates which are not inherently static in composition. Thus one can add a component to a LIST, but not to an ARRAY. To specify deletion it is necessary to specify which component is to be deleted, and from which aggregate (in the case that it is shared). Addition requires specification of new component, aggregate, and sometimes auxiliary information. For example, some aggregate types would permit addition of new components anywhere in the structure; in these a position must be indicated, relative to any
existing components. Often it is desirable to operate on some of the members of a list, or to treat a group of members as a list in its own right. For example, it might be common to transmit to a remote program for analysis, the medical history of patients developing heart disease before the age of 30. These may be just a few of the members of a large list of patients. In this case, the operation to be performed is transmission to the remote system; this operation is performed on several members of the list of patients. The ones to be transmitted are thought of as a _set_; the set is specified as containing all the members of a given list satisfying two conditions: (1) age less than 30, and (2) has heart disease. Sets can be defined explicitly, or implicitly simply with appropriate reference mechanisms. _Definition_ of a set is distinct from _identification_of_membership_, which is distinct from _access_to_membership_. Definition involves specifying the candidates for set membership and specifying a rule by which members of the set can be distinguished from non-members; for example, an inclusion condition such as "under 30 with heart disease". Identification involves effective application of the rule to all candidates for membership. When the membership has been identified, it can be counted, but the data itself has not necessarily been accessed. When a member is accessed, its contents can be operated on. Primitives to accomplish each of these operations on a set will be provided; however, it will ordinarily be optimal for the datacomputer to determine when each step should be performed. To enable users to operate at a level at which the datacomputer can optimize effectively, higher-level operators on sets will be provided. Some of these are logical operators, such as union and intersection. These input and output sets. Also available is an operator which complements a set (since the definition establishes all possible candidates, a set always has a well-defined complement). These higher level operators can be applied to any defined set; the set members need not be identified or accessed. The system will perform such operations without actually accessing members if it can. Some of the other operators on sets are counting membership, partitioning a set into a set of sets, uniting a set of sets into a set. A set can be used to reference another set, providing there is a well- defined way to identify members of the second set given the first set. For example, a set C may contain all the children doing poorly in school. A set F may be defined, where the members of F are the records about families having a child in set C.
Some other useful operations on sets are: adding all the members of a set to an aggregate, deleting all the members of a set (frequently such a massive change can be performed far more efficiently than the same set of changes individually requested), changing all the members of a set in a given way. A set can be made into a list, by actually accessing each member and physically collecting them. Some of the operations on lists are: concatenation of lists into larger lists, division of a list into smaller lists, sorting a list, merging a pair of ordered lists (preserving order). This is not intended to be a full enumeration of high-level operations, but to be suggestive. We are planning to build in high-level functions for operations which are used very commonly, and can be implemented within the system significantly better than they can be implemented by users in the language. For most of the functions mentioned here, considerable knowledge is accumulated on good implementations. In particular, the techniques used for inverted file access provide many set operations to be performed without actual access to the data. 3.14 Control The control features of datalanguage are to the basic operations as data aggregates are to the basic data items. Control features are used to create complex requests out of the basic requests provided by datalanguage. Conditional requests allow the user to alter the normal request flow by specifying that certain requests are to be executed under certain conditions. In general datalanguage will provide the ability to chose at most one of a number of requests to be made based on some set of conditions or the value of some item. In its simplest form the conditional allows for optional execution of a given request. Iterative requests cause a request (called the body) to be executed a fixed or variable number of times or until a given condition is met. Datalanguage will provide iterative requests that will allow for similar manipulations to be performed on all members of some aggregate structure as well as the standard type of iterative request based on counters. By providing a capability of directly expressing manipulations on aggregates which require processing all of the items subordinate to the aggregate, the datacomputer can be more efficient in processing user requests. For example, a user defined conversion process which operates on character strings, can be implemented far more efficiently if the datacomputer is explicitly informed that the process requires sequential
processing of the characters. Datalanguage will also provide for parallel iteration. For example, the user will be able to specify operations which require sequencing through two or more lists in parallel. This would be done if the contents of one file were to be updated based on a file of correction information. Compound requests are collections of requests which act as one. They are primarily provided to allow for the conditional performance of or iteration on more than one statement. Compound requests also provide request reference points which can be used to control the request processing flow. That is, compound requests can be "named". The datalanguage user will be able to specify control information which will conditionally cause a compound request to be exited. By providing naming, the user may cause any number of previously entered compound requests to be exited. We do not intend to provide the traditional _goto_ capability. By not including a goto request, the chances for efficient operation (via optimization) of the datacomputer are increased. We also hope, in this way, to force the datalanguage user to specify his data manipulations in a clear sty1e. Two forms of the compound request will be provided, ordered and unordered. In the unordered case the user is informing the datacomputer that the requests can be performed in any order. This should allow the datacomputer to perform more efficiently and might even allow for parallel processing. During a session with the datacomputer it is likely that a user will find a need for temporary data. That is, data which is used to remember, for a short term, information which is needed for the processing of requests. This short term might be a session or a small part of a session. Datalanguage will provide a temporary data facility. Temporary data will be easy to create, use and dispose of. This will be accomplished by allowing the system to (optionally) make many decisions regarding the data. For example the representation of a temporary integer item will often be of no concern to the user. Some features which are provided for permanent data will be deemed irrelevant with regard to temporary data. Temporary data will be associated with a collection of requests in what will be called a block. A block will be no different than a compound request with the exception that data is defined with the requests which compose it and is automatically created on entrance to the block and destroyed on exiting the block.
3.15 Extensibility The goals of datalanguage are to provide facilities of data structure at two levels. At one level the user may take advantage of high level data capabilities which will do much of his data management work automatically and which allows for the data computer to operate more effectively in some cases since it has been given control of the data. At another level, however, features are provided which allow the user to describe his application in terms of primitive concepts. In this way the datacomputer user may compose a large variety of data constructs and has great flexibility with respect to the manipulations he can perform on his data. Also by interacting with the datacomputer at the primitive level, the user can exercise a good deal of control over the methods employed by the datacomputer which may result in more effective usage of resources for non-standard applications. Datalanguage will provide features which allow the user to create an environment whereby the datacomputer system appears to provide features especially tailored to his application. The control features discussed above allow the user to extend the operations available on data by appropriate composition of the operations. Datalanguage will provide a method of defining a composite request to be a new request (called a _function_). In this way a new operation on specific data can be defined once and then used repeatedly. In order that the user may define general operations, datalanguage will provide functions which can be parameterized. That is, functions will not only be able to operate on specific data but may be defined to work on any data of a specific type. This capability will not be limited to basic data types (e.g. integers) or even specific aggregate types (e.g. array of integers) but will also include the ability to define functions which operate on classes of data. For example, functions can be defined which operate on lists independent of the type of the list members. Also provided, will be the ability to expand and modify existing functions as well as creating new functions. This includes expanding the types of data for which a function is defined or modifying the behavior of a function for certain types of data. As with operations, the data aggregates discussed above allow the user to extend the primitive data types by appropriate composition. For example, a two dimensional array of integers can be created by creating an array of arrays of integers. The situation for data types is analogous to that of operations. Datalanguage will provide the ability to define a composition of data to be a new data type. Also the capability of defining general data structures will be provided by essentially parameterizing the new data definition. This would allow the general concept of two dimensional array to be defined as an array of arrays. Once defined, one could create two dimensional arrays of integers, two dimensional arrays of booleans, etc. As with functions
there is also a need to expand or modify existing data types. One might want to expand the attributes which apply to a given data type, in that he might want to add new attributes, or add new choices for the existing attributes. The control features can be extended also. Special control features might be needed to process a data structure in a special way or to process a user defined data structure. For example, if a tree type data structure has been defined in terms of lists of lists, the user might like to define a control function which causes a specified operation to be performed on each item of a specified tree. As with data types and functions, there is a need to be able to modify and extend existing control features as well as the ability to create new ones. Datalanguage will provide the ability to treat data descriptions and operations in much the same way that data is treated. One can describe and manipulate descriptions and operations in the same way that he can describe and manipulate data. It is impossible to talk about data types without consideration of operations and equally as impossible to talk about operations without an understanding of the data types they operate on. In order for the user to be able to effect the behavior of the datacomputer system, the design of datalanguage will include a definition of the operational cycle of the datacomputer. Precise definitions of all aspects of data (data attributes, data classes, relationship of aggregates to their subordinate items, etc.) in terms of their interaction with datalanguage operations will be made. In this way the datacomputer can offer tools which will give the datacomputer user the ability to be an active participant in the design of the datalanguage which he uses.
4. A Model for Datalanguage Semantics For the purpose of defining and experimenting with language semantics and with language processing techniques, we are developing a model datacomputer. The principal elements of the model are the following: (1) A set of primitive functions (2) An environment in which data objects can be created, manipulated and deleted, using the primitives (3) A structure for the representation of collections of data values, their descriptions, their relationships, and their names. (4) An interpreter which executes the primitives (5) A compiler which inputs requests in a very simple language, performs binding and macro expansion operations, and generates calls to the internal semantic primitives. If our modeling efforts are successful, the model will evolve until it accepts a language like the datalanguage whose properties we have described in sections 2 and 3 of this paper. Then the process of writing the final specification will simply require reconciliation of details not modeled with structure that has been modeled. One rather large detail which we may never handle within the model is syntax; in this case reconciliation will be more involved; however, we firmly believe that the semantic structure should determine the syntax rather than the opposite, so we will be in the proper position to handle the problem. By constructing a model for each of the elements listed above, we are "implementing" the language as we design it, in a very loose sense. In effect, we work in a laboratory, rather than working strictly on paper. Since we aren't concerned with the performance or usability of the datacomputer we are building in the laboratory, we are able to build without becoming involved with some of the most time-consuming concerns of an implementor. However, because we are building and tinkering, rather than simply working on paper, we do get some of the advantages that normally come with the experience of implementing one's ideas. The model datacomputer is a program, developed in ECL, using the EL1 language. Presently we are interested in the process of developing the program, not running it. Our primary requirement is to have, in advance of the existence of datalanguage, a well-defined and flexible notation in which to specify data structures, function definitions and examples. EL1 is convenient for this. Having a program which actually works and acts like a simple datacomputer is really a by-product of specifying semantics in a programming language. It is not necessary for the program to work, but it does provide some nice features. It enhances the "laboratory" effect, by doing such things as automatically compiling
strings of primitives, displaying the state of the environment in complicated examples, automatically discovering inconsistencies (in the form of bugs), and so on. There are two major reasons that EL1 is a convenient notation for specifying datalanguage semantics. One is that the languages have a certain amount in common, in both concepts and in goals in data description. (In part, this is because EL1 itself has been a good source of ideas in attacking the datalanguage problem). Both languages emphasize operations on data, independent of underlying representation. A second reason that EL1 is a convenient way to specify datalanguage, is that EL1 is extensible; in fact, many primitive functions could be embedded directly into EL1 by using the extension facilities. At times, we have chosen to embed less than we could, to expose problems of interest to us. So far, the model has been useful primarily in exposing design issues and relationships between design decisions. Also, because it includes so many of the elements of the full system (compiler, interpreter, environment, etc.), it encourages a fairly complete analysis of any proposal. In presenting the model in this section, we have chosen to emphasize ideas and examples, rather than formal definitions in EL1. This is because the ideas are more permanent and relevant at this point (the formalisms are changing rather frequently) and because we imagine people reading the formal definitions only to get at the ideas. The formal definitions may be interesting in themselves when the language is complete; at this point they are probably of interest only to us. The section is organized into a large number of sub-sections. The first few are concerned with the basic concepts of data objects, descriptions, and relationships between objects. We then discuss primitive semantic functions and present informal definitions and examples in sections 4.7 and 4.8. Section 4.9 is a brief discussion of compilation, interpretation and the execution cycle. Section 4.10 provides a fairly elaborate example of how primitive functions can be combined to do something of interest: a selective retrieval by content. The last two sections wrap up with discussions of high-level functions and some conclusions. 4.1 Objects An _object_ has a name, a description, and a value. It can be related to other objects. The _name_ is a symbol, which can be used to access the object from
language functions. The _description_ is a specification of properties of the object, many of which relate to the meaning or the representation of the value. The _value_ is the information of ultimate interest in the object. The relationships between objects are hierarchical. Each object can be related directly to at most four other objects, designated as its _parent_, its _child_, its _left_sibling_, and its _right_sibling_. This specific concept of relationship is all that has been built in to the model to date. One of our primary objectives in the future is to experiment with more general relationships among objects. 4.2 Descriptions A description has the components _name_, _type_ and _type- dependent_parameters_. It can be related hierarchically to other descriptions, according to a scheme similar to the one described for objects in 4.1. The _name_ has a role in referencing, as in the case of objects. _Type_ is an undefined, intuitive idea for which we expect to develop a precise meaning within datalanguage(see section 3.10 for some of the ideas about this). In terms of the present model, it simply means one of the following: LIST, STRUCT, STRING, BOOL, DESC, DIR, FUNC, 0PD. Each of these refers to a sort of value corresponding to common ideas in programming (with the exception of OPD, which is explained in section 4.7), and on which certain operations are defined. Examples of _type-dependent_parameters are the two items needed to define a STRING: size option and size. A STRING is a sequence of characters; the size of the STRING is the number of characters in it. If a STRING has a fixed size, then size option is FIXED and size is the number of characters it always contains. If a STRING has a varying size, then size option is VARYING, and size is its maximum (clearly, it might also have a minimum in a more refined scheme). When the description of an object has a type of STRING, it is commonly said that the object is a STRING. 4.3 Values The value is the data itself.
An object of type BOOL can have only either the value TRUE or the value FALSE. An object of type STRING has values such as 'ABC', 'JOHN', or 'BOSTON'. Each value has a representation, in bits. Thus a BOOL is represented by a single bit, which will be a 'one' to represent TRUE and a 'zero' to represent FALSE. 4.4 Some examples Here are some examples of structures involving objects, descriptions, and values. In these explanations and drawings, the objective is to convey some ideas about these primitive structures; considerable detail is omitted in the drawings in the interest of clarity. Figure 4-1 shows two objects. X is of type string and has value 'ABC'. Y is of type bool and has value TRUE.
_________________ | | | _____________ | | | X | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | STRING | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | | | DESCRIPTION | |_________|___| | | VALUE | | ____________ |___________|_____| | | OBJECT |____________\| "ABC" | /|____________| VALUE _________________ | | | _____________ | | | Y | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | BOOL | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | | | DESCRIPTION | |_________|___| | | VALUE | | ____________ |___________|_____| | | OBJECT |____________\| TRUE | /|____________| VALUE Figure 4-1 Two elementary objects Figure 4-2 illustrates an object of type dir (a _directory_) and related objects. The directory has name SMITH. There are two objects entered in this directory, named X and Y.
_________________ | _____________ | | | SMITH | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | DIR | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | | | DESCRIPTION | |_________|___| | | CHILD | | |___________|_____| OBJECT | ___________V_____ | _____________ | | | X | | | |_____________| | | NAME | _________________ | _____________ | | _____________ | _____|_|____ | | | | Y | | | | |_____________| | | |_____________| | | | DESCRIPTION | | NAME | | | _____________ | | _____________ | | __|_|____ | | | | ____|_|_____ | | | |_____________| | | |_____________| | | | | | VALUE | | DESCRIPTION | | | | | _____________ | | _____________ | | | | | | ____|_|_____\| | ____|_|__ | | | | |_____________| | /| |_____________| | | | | | | SIBLING | | VALUE | | | | | |_________________| |_________________| | | | | OBJECT OBJECT | | | | _________________ _________________ | | | |_\| "ABC" | | FALSE |/_| | | /|_________________| |_________________|\ | | VALUE VALUE | | _________________ _________________ | | | _____________ | | _____________ | | | | | STRING | | | | BOOL | | | |____\| |_____________| | | |_____________| |/____| /| TYPE | | TYPE |\ |_________________| |_________________| DESCRIPTION DESCRIPTION Figure 4-2: A directory with two members
The idea of a dir is similar to the idea of a file directory in most systems. A directory is a place where one can store named objects, freely adding and deleting them. The entries in the directory are all objects whose parent is that directory. Figure 4-3 shows a more rigidly structured group of objects. Here we have R, a struct, and A and B, a pair of strings. Note that the boxes labeled 'object' in figure 4-3 bear precisely the same relationships to one another as those labeled 'object' in 4-2. However, there are two conditions which hold for 4-3 but do not hold for 4-2: (1) the value of R contains the values of A and B, and (2) the descriptions of R, A and B are all related. Structs have the following properties: (1) name and description of each component in the struct is established when the struct is created, and (2) in a value of the struct, the order of occurrence of component values is fixed.
_________________ _________________ | _____________ | | _____________ | | | R | | | | STRUCT | | | |_____________| | | |_____________| | | NAME | | TYPE | | _____________ | | _____________ | | | ____|_|______\| | | | | |_____________| | /| |__________|__| | | DESCRIPTION | | CHILD | | | _____________ | |____________|____| _____|_|____ | | DESCRIPTION | | | |_____________| | ____________V____ | | VALUE | | _____________ | | | _____________ | | | STRING | | | | | | | | |_____________| | | | |_________|___| | ___\| TYPE | _____________ | | CHILD | | | /| _____________ | | _________ | | |___________|_____| | | | ____|_|______\| | STRING | | | OBJECT | | | |_____________| | /| |_________| | | | | | SIBLING | | TYPE | | ___________V_____ | |_________________| |_____________| | | _____________ | | DESCRIPTION DESCRIPTION A | | | A | | | | | | |_____________| | | _________________ | | | NAME | | | _____________ | | | | _____________ | | | | B | | | | | | ____|_|__| | |_____________| | | | | |_____________| | | NAME | | | | DESCRIPTION | | _____________ | | | | _____________ | | | ____|_|___________________| | __|_|____ | | | |_____________| | | | | |_____________| | | DESCRIPTION | | | | VALUE | | _____________ | | | | _____________ | | | ____|_|____ | | | | ____|_|______\| |_____________| | | | | | |_____________| | /| VALUE | | | | | SIBLING | | _____________ | | | | |_________________| | | | | | | | OBJECT | |_____________| | | | | | SIBLING | | | | |_________________| | | |__________ OBJECT _____________| | _______|__________________________|_______ |____\| _____V_______ _______V_____ | /| | "ABC" | | FALSE | | Figure 4-3 | |_____________| |_____________| | A STRUCT with |__________________________________________| two members
Figure 4-4 shows a list named L. Here a similar structure of objects is implied, but because of the regularity of the structure, not all the boxes labeled 'object' are actually present. _________________ | _____________ | | | L | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | LIST | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | | ________ | | | | | | | | | | |_______|_____| | | |______|_| | | VALUE | | | CHILD | | |_________|_______| |________|___| OBJECT | DESCRIPTION | | | _________V_______ ________V___ | | | ________ | | _____________ | | | STRING | | | | "ABC" | | | |________| | | |_____________| | | TYPE | | _____________ | |____________| | | "XY" | | DESCRIPTION | |_____________| | | _____________ | | | "ZLM" | | | |_____________| | | : | | : | | _____________ | | | "BBBF" | | | |_____________| | |_________________| VALUE Figure 4-4 A LIST L has a variable number of components, all satisfying the description subordinate to L's description.
We could imagine an 'object' box for each string in L. Each of these boxes would point to its respective string and to the common description of these strings. Instead, we think in terms of creating such boxes as we need them. 4.5 Definitions of types Following are some more precise definitions of types, in terms of the present model. These serve the purpose of establishing more firmly the semantics of our structure of objects, descriptions and values; however, they should not be thought of as providing a definition for the completed language specification. An object of type STRING has a value which is a sequence of characters (figure 4-1). An object of type BOOL has a value which is a truth value (TRUE or FALSE -- figure 4-1). An object of type DIR has subordinate objects, each having its own description and value. Subordinate objects can be added and deleted at will (figure 4-2). An object of type STRUCT has subordinate objects, each of which has a description which is subordinate to the STRUCT's description, and a value contained in the STRUCT's value. The number, order and description of components is fixed when the STRUCT is created (figure 4-3). An object of type LIST may be thought of as having imaginary subordinate objects, whose existence is simulated by the use of appropriate techniques in processing the LIST. Each of these has the same description, which is subordinate to the description of the LIST. Each has a distinct value, contained in the value of the LIST. In fact, only the LIST object, the LIST and component descriptions, and the values exist (figure 4-4). An object of type DESC has a description as its value. This value is the same sort of entity which serves as the description of other objects. An object of type FUNC has a function call as its value. We will be able to say more about this after functions have been discussed. An object of type OPD has an operation descriptor as its value. (see 4.7 for details).
4.6 Object environment There are three categories of objects in the model datacomputer. These are p/objects, t/objects, and i/objects. P/objects are permanent objects created explicitly with language functions. They correspond to the idea of stored data in the real datacomputer. There are three special objects. These are special only in that they are created as part of initializing the environment, rather than as the result of executing a language function. These are named STAR, BLOCK and TOP/LEVEL. All three are of type DIR. An object is a p/object if it is subordinate to STAR; it is a t/object if it is subordinate to BLOCK. TOP/LEVEL is subordinate to BLOCK. (see figures 4-5 and 4-6). _________________ | | | _____________ | | | STAR | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | DIR | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | | | DESCRIPTION | |_________|___| | | CHILD | | |___________|_____| OBJECT | | | | V ALL P/OBJECTS Figure 4-5 STAR and p/objects T/objects are temporary objects, also created explicitly with language functions. However, these correspond to user-defined temporaries, both local to requests and "top-level" (i.e. not local to any request, but existing until deletion or logout.)
_________________ | | | _____________ | | | BLOCK | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | DIR | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | | | DESCRIPTION | |_________|___| | | VALUE | | |___________|_____| OBJECT | | | ___________V_____ | | | _____________ | | | TOP/LEVEL | | | |_____________| | | NAME | ____________ | _____________ | | ________ | | | ____|_|______\| | DIR | | | |_____________| | /| |________| | | DESCRIPTION | | TYPE | | _____________ | |____________| | | ____|_|___ DESCRIPTION | |_____________| | | | SIBLING | | | _____________ | |___\ ALL BLOCKS AND | | | | / LOCAL T/OBJECTS | |_________|___| | | CHILD | | |___________|_____| | | V ALL GLOBAL T/OBJECTS Figure 4-6 BLOCK, TOP/LEVEL and t/objects
I/objects are internal, system-defined objects whose creation and deletion is implicit in the execution of some language function. I/objects are hung directly off of function calls (objects of type FUNC), and are always local to the execution of such function calls. They correspond to the notions of (1) literal, and (2) compiler- or interpreter-generated temporary. 4.7 Primitive Language Functions Here we discuss the primitive language functions presently implemented in the model and likely to be of most interest. In this section, the emphasis is on relating functions to one another. Section 4.8 contains more detail and examples. _Assign_ operates on a pair of objects, called the target and the source. The value of the source is copied into the value of the target. Figure 4-7 shows a pair of objects, X and Y, before and after execution of an assignment having X as target and Y as source. Presently, assignment is defined only for objects of type BOOL and objects of type STRING. The objects involved must have identical descriptions.
_________________ _________________ | | | | | _____________ | | _____________ | | | | | | | | | | | X | | | | Y | | | |_____________| | | |_____________| | | NAME | | NAME | | _____________ | | _____________ | | | | | | | | | | |_______|_____| | | |_______|_____| | | VALUE | | | VALUE | | |_________|_______| |_________|_______| OBJECT | OBJECT | | | _________V_______ _________V_______ | | | | | "ABC" | | "DEF" | |_________________| |_________________| VALUE VALUE BEFORE ASSIGNMENT _________________ _________________ | | | | | _____________ | | _____________ | | | | | | | | | | | X | | | | Y | | | |_____________| | | |_____________| | | NAME | | NAME | | _____________ | | _____________ | | | | | | | | | | |_______|_____| | | |_______|_____| | | VALUE | | | VALUE | | |_________|_______| |_________|_______| OBJECT | OBJECT | | | _________V_______ _________V_____ | | | | | "DEF" | | "DEF" | |_________________| |_________________| VALUE VALUE AFTER ASSIGNMENT Figure 4-7 Effect of assignment
A class of primitive functions for manipulating LISTs is defined. These are called _listops_. All listops input a special object called an _operation_descriptor_ or OPD. To accomplish a complete operation on a LIST, a sequence of listops must be executed. There are semantic restrictions on the composition of such sequences, and it is best to think of the entire sequence as one large operation. The state of such an operation is maintained in the OPD. Refer back to figure 4-4. There is one box labeled "object" in this picture; this box represents the list as a whole. To operate on any given member we need an object box to represent that member. Figure 4-8 shows the structure with an additional object box; the new box represents one member at any given moment. Its value is one of the components of the LIST value; its description is subordinate to the LIST description. In 4-8, the name of this object is M. In 4-8 we have enough structure to provide a description and value for M, and this is sufficient to permit the execution of operations on M as an item. However, there is no direct link between object M and object L. The structure is completed by the addition of an OPD, shown in figure 4-9.
_________________ _________________ | | | _____________ | | _____________ | | | | | | | L | | | |_____________| | | |_____________| | | TYPE | | NAME | | _____________ | | _____________ | | | | | | | ____|_|______\| |__________|__| | | |_____________| | /| CHILD | | | DESCRIPTION | |____________|____| | _____________ | DESCRIPTION | | | | | | | |_________|___| | ____________V____ | VALUE | | | _____________ | |___________|_____| | | STRING | |/___ OBJECT | | |_____________| |\ | | | TYPE | | ___________V_____ |_________________| | | | DESCRIPTION | | _____________ | | | | "ABC" | | _________________ | | |_____________| | | | | | _____________ | | _____________ | | | | "XY" | | | | M | | | | |_____________| | | |_____________| | | | _____________ | | NAME | | | | "ZLM" |/|___ | _____________ | | | |_____________|\| | | | ____|_|____| | : | | | |_____________| | | : | | | DESCRIPTION | | _____________ | | | _____________ | | | "BBBF" | | |___|_|____ | | | |_____________| | | |_____________| | |_________________| | VALUE | VALUE |_________________| OBJECT Figure 4-8 LIST and member objects
_________________ _________________ | _____________ | | _____________ | | | L | | | | | | | |_____________| | | |_____________| | | NAME | | TYPE | | _____________ | | _____________ | | | ____|_|______\| | | | | |_____________| | /| |__________|__| | | DESCRIPTION | | CHILD | | | _____________ | |____________|____| | | | |/__ DESCRIPTION | | |_________|___| |\ | ____________V____ | VALUE | | | | _____________ | |___________|_____| | | | STRING | |/___ OBJECT | | | |_____________| |\ | | | | TYPE | | ___________V_____ | |_________________| | | _____________ | | DESCRIPTION | | | "ABC" | | | _________________ | | |_____________| | | | | | | _____________ | | | _____________ | | | | "XY" | | |___|_|____ | | | | |_____________| | | |_____________| | | | _____________ | | LIST | | | | "ZLM" | | | _____________ | | | |_____________| | | | | | | | : | | |_________|___| | | | : | | MEMBER | | | | _____________ | | : | | | | | "BBBF" |/|___ | : | | | | |_____________|\| | |___________|_____| | |_________________| | OPD | | VALUE | ___________V_____ | | | _____________ | | | | | M | | | | | |_____________| | | | | NAME | | | | _____________ | | | | | ____|_|____| | | |_____________| | | | DESCRIPTION | | | _____________ | |___|_|____ | | | |_____________| | Figure 4-9 | VALUE | OPD, LIST and member |_________________| OBJECT
The OPD establishes the object relationship, and contains information about the sequence of primitive listops in progress. When sufficient information is maintained in the OPD, we have in 4-9 a structure which is adequate for the maintenance of the integrity of the LIST and of the global list operation. In addition to LIST and member pointers, the OPD contains information indicating: (1) which suboperations are enabled for the sequence, (2) the current suboperation, (3) the instance number of the current LIST member, (4) an end-of-list indicator. The suboperations are add/member, delete/member, change/member and get/member. All apply to the current member. Only suboperations which have been enabled at the beginning of a sequence may be executed during that sequence; eventually, the advance knowledge of intentions that is implied by this will provide important information for concurrency control and optimization. Presently, an OPD relates a single member object to a single LIST object. This imposes an important restriction on the class of operation sequences which can be expressed. Any LIST transformation requiring simultaneous access to more than one member must be represented as more than one sequence. (And we do not yet solve the problems implied in concurrent execution of such sequences, even when both are controlled by one process.) Any transformation of a LIST can still be achieved by storing intermediate results in temporary objects; however, it is certainly more desirable to incorporate the idea of multiple current members into the semantics of the language, than it is to use such temporaries. An important future extension of the listops will deal with this problem. There are six listops: listop/begin, listop/end, which/member, end/of/list, open/member and close/member. Listop/begin and listop/end perform the obvious functions of beginning and terminating a sequence of listops. Listop/begin inputs LIST and member objects, an OPD, and a specification of suboperations to enable. It initializes the OPD, including establishment of the links to LIST and MEMBER objects. After the OPD-LIST-member relationship has been established, it is only necessary to supply the OPD and auxiliary parameters as input to a listop in the sequence. From the OPD everything else can be derived. Listop/end clears the OPD and frees any resources acquired by listop/begin. Which/member establishes the current member for any suboperations. This is either the first LIST member, the last LIST member, or the next LIST member. This listop merely identifies which member is to be operated on; it does not make the contents of the member accessible.
Open/member and close/member bracket a suboperation. The suboperation is indicated as an argument to open/member. Open/member always establishes a pointer from the member object to the member value; close/member always clears this pointer. In addition, each of these listops may take some action, depending on the suboperation. The details of the action would be dependent on the representation of the LIST in storage, the size of a LIST member, and choices made in implementation. Between execution of the open/member and the close/member, the data is accessible. It can always be read; in the case of the add/member and change/member suboperations, it can also be written into. End/of/list tests a flag in the OPD and returns an object of type BOOL. The value of the object is the same as the value of the flag; it is TRUE if a get/member, change/member or delete/member would be unsuccessful due to a which/member having moved "beyond the end". T his listop is provided so that it is possible to write procedures which terminate conditionally when all members have been processed. Get/struct/member provides the ability to handle STRUCTs. Given a STRUCT object which points to the STRUCT value, it will establish a pointer from a given member object to the member value. (The pointer it establishes is represented by a dashed line in figure 4-10).
_________________ _________________ | _____________ | | _____________ | | | F | | | | STRUCT | | | |_____________| | | |_____________| | | NAME | | TYPE | | _____________ | | _____________ | | | ____|_|______\| | | | | |_____________| | /| |__________|__| | | DESCRIPTION | | CHILD | | | _____________ | |____________|____| | | | | DESCRIPTION | | |___________|_| | ____________V____ _________________ | VALUE | | | _____________ | | _____________ | | ___________|_ | | | STRING | | | | STRING | | | | | | | | |_____________| | | |_____________| | | |_________|_|_| | | TYPE | | TYPE | | CHILD | | | | _____________ | | _____________ | |___________|_|___| ____\| | | | | | | | OBJECT | | | /| |_____________| | | |_____________| | | | | | SIBLING | | SIBLING | | | | |_________________| |_________________| | | | DESCRIPTION DESCRIPTION A | | | ______________________________________ | | | | | ____________ ____________ | | | | | | | "ABC" | | FALSE | | | | |_____|_____| |____________| |____________| | | | | |________A_____________________________| | | | ............: VALUE | ___________V_____ | : _________________ | | _____________ | | : | _____________ | | | | A | | | : | | B | | | | |_____________| | | : | |_____________| | | | NAME | | : | NAME | | | _____________ | | : | _____________ | | | | ____|_|_| : | | ____|_|_______________________| | |_____________| | : | |_____________| | | DESCRIPTION | : | DESCRIPTION | | _____________ | : | _____________ | | | ....|.|....: | | | | | |_____________| | | |_____________| | | VALUE | | VALUE | | _____________ | | _____________ | | | ____|_|______\| | | | | |_____________| | /| |_____________| | | SIBLING | | SIBLING | |_________________| |_________________| Figure 4-10 OBJECT OBJECT Effect of GET/STRUCT/MEMBER