DATA AND THEIR APPLICATIONS (1974)


Chapter 1 of Concise Survey of Computer Methods, Studentlitteratur, Lund, Sweden, and Petrocelli/Charter, New York, 1974.


The present chapter is concerned with certain fundamental ideas related to data and their applications. An attempt will be made to convince the reader that these concepts, although introduced in their present form only recently, are closely related to ideas long familiar from daily life.


Summary of the Chapter

The starting point is the concept of data, as defined in [Gould, 1971]: DATA: A representation of facts or ideas in a formalized manner capable of being communicated or manipulated by some process. Data science is the science of dealing with data, once they have been established, while the relation of data to what they represent is delegated to other fields and sciences.

The usefulness of data and data processes derives from their application in building and handling models of reality.

Data representations may be chosen freely, and data used in practice differ along several dimensions: they may be static or dynamic, digital or analog, and may use any one of a number of different media.

Numbers and their representation illustrate the concept of data, besides being of central importance in formulating data processes of any kind.
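As a small illustration of this point, the sketch below writes one and the same number in four different representations. The function names and the choice of representations (decimal, binary, Roman numerals, tally marks) are assumptions made for this example and are not taken from the chapter.

```python
def to_binary(n: int) -> str:
    """Positional base-2 representation of a non-negative integer."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # least significant digit first
        n //= 2
    return "".join(reversed(digits))


def to_roman(n: int) -> str:
    """Roman numeral for an integer in the range 1..3999."""
    if not 1 <= n <= 3999:
        raise ValueError("Roman numerals are defined here only for 1..3999")
    table = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in table:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def to_tally(n: int) -> str:
    """Unary representation: one stroke per unit."""
    return "|" * n


def representations(n: int) -> dict:
    """Collect several representations of the same number (illustrative only)."""
    return {
        "decimal": str(n),
        "binary": to_binary(n),
        "roman": to_roman(n),
        "tally": to_tally(n),
    }
```

Calling `representations(12)` yields four quite different strings of symbols, all of which stand for the same number. The number is what is represented; each string is merely one possible datum for it.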

Data conversions are the simplest kind of data processes, but they may illustrate a wide range of data representations, particularly the interplay of static and dynamic representations.
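The interplay of static and dynamic representations may be sketched with a conversion from written letters to a timed signal. The example assumes Morse code as the target, which the chapter summary does not mention; the table `MORSE` is deliberately partial, and the names `to_morse` and `to_signal` are hypothetical. Durations are expressed in units of one dot length.

```python
# Partial Morse table, enough for the illustration (an assumption of this sketch).
MORSE = {"E": ".", "T": "-", "S": "...", "O": "---"}


def to_morse(text: str) -> str:
    """Static-to-static conversion: letters to dots and dashes, letters separated by spaces."""
    return " ".join(MORSE[ch] for ch in text.upper())


def to_signal(morse: str) -> list:
    """Static-to-dynamic conversion: expand dots and dashes into timed key events.

    Each event is (key_down, duration). A dot lasts 1 unit, a dash 3 units,
    the gap between marks within a letter 1 unit, and the gap between letters 3 units.
    """
    events = []
    for letter_index, letter in enumerate(morse.split(" ")):
        if letter_index > 0:
            events.append((False, 3))  # silence between letters
        for mark_index, mark in enumerate(letter):
            if mark_index > 0:
                events.append((False, 1))  # silence between marks
            events.append((True, 1 if mark == "." else 3))
    return events
```

Here `to_morse("SOS")` is still a static datum that may be written on paper, whereas the list returned by `to_signal` describes something that exists only as it unfolds in time.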

In general data processing, data representing some meaning are processed in accordance with some intent to form new, previously unknown data. In good data processing the new data may be used directly by humans to guide their actions.

A basic principle of data science is this: The data representation must be chosen with due regard to the transformation to be achieved and the data processing tools available. This stresses the importance of concern for the characteristics of the data processing tools.
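One way to see this principle at work is to carry out the same transformation, addition, on two different representations of numbers. The sketch below is an illustration under assumed conventions (tally strings of `|` and decimal digit strings); the function names are hypothetical and nothing here is drawn from the chapter itself.

```python
def add_tally(a: str, b: str) -> str:
    """Addition on tally marks: simply place the two groups of strokes together."""
    return a + b


def add_decimal(a: str, b: str) -> str:
    """Addition on decimal digit strings: schoolbook method with carry."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0  # missing digits count as zero
        db = int(b[j]) if j >= 0 else 0
        carry, digit = divmod(da + db + carry, 10)
        result.append(str(digit))
        i -= 1
        j -= 1
    return "".join(reversed(result))
```

With tally marks the process of addition is trivial, but the data grow in proportion to the number represented. With decimal digits the data stay compact, but the process needs a digit table and a carry rule. Which representation is preferable depends on the transformation wanted and on the tools at hand.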

Limits on what may be achieved by data processing may arise both from the difficulty of establishing data that represent a field of interest in a relevant manner, and from the difficulty of formulating the data processing needed. Some of the difficulty of understanding these limits is caused by the ease with which certain data processing tasks are performed by humans.