Markus Stocker

Between information technology and environmental science with a flair for economics, the clarinet, and the world of soups and salads.

Having discussed the RDF (meta-) data model, the RDFS and OWL languages used to describe what RDF data mean, the SSN ontology as a vocabulary for describing sensor observations and metadata about sensing devices in RDF, and the QB vocabulary for describing dataset observations and metadata about datasets in RDF, this post goes beyond data and discusses the representation of information and knowledge acquired from data.

Environmental monitoring systems that build on environmental sensor networks acquire sensor observations and process dataset observations. As argued earlier, the data that result from processing are often structured into multidimensional datasets. Such datasets serve in data interpretation, i.e. in processes that result in information.

Beyond data interpretation, systems may want to compose acquired information objects into structured knowledge objects. This is a bit like assembling a puzzle from its pieces. Each piece is an information object of a larger knowledge picture and has to go in the right place.

The information one may acquire from data is, of course, diverse. The purpose of the environmental sensor network, e.g. to monitor gas fluxes, and the defined goals, e.g. to assess periods during which an ecosystem is a source or a sink of carbon, largely determine how data is interpreted and what information is acquired. The knowledge types into which information can be composed are equally diverse.

This post focuses on a particular knowledge type, namely situational knowledge. The environmental monitoring system thus acquires knowledge about situations observed by the system.

As in data processing, computational models are generally involved in data interpretation as well. Models may be data-driven, e.g. a machine learning classification model, or physically based, e.g. an ecological model. Data interpretation, and thus information acquisition, can occur manually or (semi-)automatically.

Situation

Theories have been proposed which formalize what situations are, how information about situations is obtained, how information entities are structured, and what object types form information entities. One such theory is situation theory, discussed at length in Keith Devlin’s book Logic and Information [1], of which you can find a summary online.

In situation theory, a situation is a structured part of reality. Take for instance a reed canary grass field and the volume of air with which the grass exchanges CO2. The field and the volume of air form a part of reality. Furthermore, this part is structured by the relations among its objects, e.g. the uptake relation between grass and atmospheric CO2. The field and the volume of air are objects in situations. Other objects include spatial and temporal locations, e.g. a particular point or polygon in space and an instant or period in time. After all, situations are generally located in space-time.

The reed canary grass field and volume of air the grass exchanges CO2 with are always in some situation. The objects are obviously not static. Thus, along the arrow of time, the situations change as the structured part of reality changes and the attributes of objects change (the CO2 concentration in volumes of ambient air is currently increasing).

Situation theory provides a mathematical object that structures information about situations: the infon. Situations s are said to support infons σ, formally s ⊧ σ, whereby an infon σ is the tuple ≪R, a1, …, an, 0/1≫ consisting of a relation R, objects a1, …, an, and a polarity with value 0 or 1, for false and true, respectively. If the polarity is 1, the objects are said to stand in relation R and the infon is a fact. If the polarity is 0, the objects do not stand in relation R.
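To make the structure of infons and the support relation more concrete, here is a minimal Python sketch. The class names `Infon` and `Situation`, the method `supports`, and the timestamps are assumptions made for illustration only; they are not part of situation theory or of any particular library, and support is modelled here simply as set membership.

```python
from dataclasses import dataclass, field
from typing import Any, Set, Tuple


@dataclass(frozen=True)
class Infon:
    """An infon <<R, a1, ..., an, polarity>> (illustrative model)."""
    relation: str
    objects: Tuple[Any, ...]
    polarity: int = 1  # 1: the objects stand in the relation; 0: they do not

    def __post_init__(self):
        if self.polarity not in (0, 1):
            raise ValueError("polarity must be 0 or 1")


@dataclass
class Situation:
    """A situation, reduced here to the set of infons it supports."""
    name: str
    infons: Set[Infon] = field(default_factory=set)

    def supports(self, infon: Infon) -> bool:
        # s |= sigma is modelled as plain set membership (a simplification).
        return infon in self.infons


# Hypothetical interval; the timestamps are made up for the example.
interval = ("2011-06-01T10:00", "2011-06-01T10:30")
sigma = Infon("sink-of-carbon", ("rcgf", interval), 1)
tau = Infon("source-of-carbon", ("rcgf", interval), 1)

s = Situation("s1")
s.infons.add(sigma)
```

Here `s.supports(sigma)` holds while `s.supports(tau)` does not. Making `Infon` immutable and hashable lets a situation hold its infons in a set, which keeps the support check trivial.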

Example

The environmental monitoring system that builds on our example environmental sensor network for gas flux monitoring acquires data, processes data, and acquires information about the observed ecosystem from data. The ecosystem consists of a reed canary grass field and volume of air, and is a structured part of reality. We thus acquire information about situations in space-time.

Of interest in this example are situations in which the reed canary grass field is a sink of carbon and situations in which the field is a source of carbon. Relations of interest are thus sink-of-carbon and source-of-carbon. The field rcgf is an object.

For varying time intervals [ti, tj] with start time instant ti and end time instant tj, information acquired from data may determine that a situation supports the infon ≪source-of-carbon, rcgf, [ti, tj], 1≫ or the infon ≪sink-of-carbon, rcgf, [ti, tj], 1≫.

How information is acquired from data very much depends on the problem and the data. In this example, information for situations during which the field is a sink or a source of carbon can be computed from NEE datasets.
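As a rough illustration of how such infons could be derived from an NEE dataset, consider the following sketch. It assumes the sign convention in which negative NEE means net carbon uptake (sink) and positive NEE means net release (source); it also assumes that records arrive as (start, end, NEE) tuples sorted by time. The function name `infons_from_nee` and the record layout are hypothetical. A real system would likely also handle measurement uncertainty and data gaps.

```python
from typing import List, Tuple

Interval = Tuple[str, str]
InfonTuple = Tuple[str, str, Interval, int]


def infons_from_nee(records: List[Tuple[str, str, float]],
                    obj: str = "rcgf") -> List[InfonTuple]:
    """Turn (start, end, nee) records into sink/source infons.

    Assumed convention: nee < 0 is a sink, nee > 0 is a source.
    Adjacent intervals with the same relation are merged into one [ti, tj].
    """
    infons: List[InfonTuple] = []
    for start, end, nee in records:
        if nee == 0:
            continue  # zero net flux: neither sink nor source
        relation = "sink-of-carbon" if nee < 0 else "source-of-carbon"
        if infons:
            prev_rel, _, (prev_start, prev_end), _ = infons[-1]
            if prev_rel == relation and prev_end == start:
                # Extend the previous interval instead of adding a new infon.
                infons[-1] = (relation, obj, (prev_start, end), 1)
                continue
        infons.append((relation, obj, (start, end), 1))
    return infons


# Made-up half-hourly NEE values, for illustration only.
records = [
    ("10:00", "10:30", -2.1),
    ("10:30", "11:00", -1.4),
    ("11:00", "11:30", 0.6),
]
result = infons_from_nee(records)
```

With these made-up values, the first two records merge into a single sink-of-carbon infon for [10:00, 11:00], followed by a source-of-carbon infon for [11:00, 11:30].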

Representation

Modern environmental monitoring systems can obtain such situational knowledge largely automatically. Data is typically acquired automatically from sensing devices via wireless communication channels, and software largely automates data processing. In previous posts, we have discussed the representation of sensor data and datasets using RDF and OWL vocabularies, specifically using the SSN ontology and the QB vocabulary. The representation of sensor observations and dataset observations, and the curation of these data objects, is also automated.

In addition to automating data acquisition and processing, an environmental monitoring system may also automate data interpretation, i.e. automate information acquisition and the composition of information objects into knowledge objects. If derived knowledge is situational and the system builds on RDF and adopts situation theory, then an interesting OWL vocabulary for the representation of knowledge about situations observed by the system is, as you may have guessed, the Situation Theory Ontology (STO) [2].

With the SSN ontology, the QB vocabulary, and the STO, our environmental monitoring system is equipped with core vocabulary for the representation of acquired sensor data, processed data of datasets, information acquired from datasets, and situational knowledge composed of information. The resulting data, information, and knowledge objects are instances of ontological concepts and can be curated by your favorite RDF database.
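To give a feel for what an RDF representation of a supported infon might look like, the sketch below serializes one infon as N-Triples using plain string formatting. Note that the namespaces and term names (`Situation`, `supports`, `Infon`, `relation`, `object`, `start`, `end`, `polarity`) are placeholders invented for this example. They are not the actual STO terms or IRIs, which should be taken from the STO itself [2].

```python
# Hypothetical namespaces: NOT the real STO namespace.
EX = "http://example.org/monitoring#"        # instance data (assumed)
VOC = "http://example.org/sto-placeholder#"  # placeholder vocabulary (assumed)

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
XSD = "http://www.w3.org/2001/XMLSchema#"


def infon_to_ntriples(situation_id: str, infon_id: str, relation: str,
                      obj: str, start: str, end: str, polarity: int) -> str:
    """Serialize one situation and the infon it supports as N-Triples."""
    s = f"<{EX}{situation_id}>"
    i = f"<{EX}{infon_id}>"
    lines = [
        f"{s} {RDF_TYPE} <{VOC}Situation> .",
        f"{s} <{VOC}supports> {i} .",
        f"{i} {RDF_TYPE} <{VOC}Infon> .",
        f"{i} <{VOC}relation> <{EX}{relation}> .",
        f"{i} <{VOC}object> <{EX}{obj}> .",
        f'{i} <{VOC}start> "{start}"^^<{XSD}dateTime> .',
        f'{i} <{VOC}end> "{end}"^^<{XSD}dateTime> .',
        f'{i} <{VOC}polarity> "{polarity}"^^<{XSD}integer> .',
    ]
    return "\n".join(lines)


# Made-up identifiers and timestamps, for illustration only.
doc = infon_to_ntriples(
    "s1", "i1", "sink-of-carbon", "rcgf",
    "2011-06-01T10:00:00", "2011-06-01T11:00:00", 1,
)
```

The resulting triples could be loaded into an RDF database alongside the SSN and QB data. In practice one would use an RDF library and the real STO terms rather than hand-built strings.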

Awareness

The environmental monitoring system automates the implementation of some of the fundamental tasks in situation-aware systems. Endsley [3] defined situation awareness as “the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future.”

For volumes of time and space, the environmental monitoring system uses sensors to perceive the fluxes of gases between the reed canary grass and the atmosphere. It comprehends the meaning of perceived fluxes by processing and interpreting data and by composing the obtained information into knowledge about situations in which the grass field is a source or a sink of carbon.

With suitable computational models, the system may also implement functionality for situation projection, e.g. forecast gas fluxes in the near future and thus situations for the grass field.

Being “aware of situations,” the system may thus arguably be called a situation-aware environmental monitoring system.

References

[1] Devlin, Keith (1991). Logic and Information. Cambridge University Press.

[2] Kokar, Mieczyslaw M. and Matheus, Christopher J. and Baclawski, Kenneth (2009). Ontology-based situation awareness. Information Fusion, 10(1):83-98. doi:10.1016/j.inffus.2007.01.004

[3] Endsley, Mica R. (1995). Toward a theory of situation awareness in dynamic systems. Human Factors, 37(1):32-64. doi:10.1518/001872095779049543

This post is part of a series. Previous posts discussed RDF, RDFS and OWL, the extraction of metadata about sensing devices from various documents, the representation of sensor data using the SSN ontology, and the representation of datasets using the QB vocabulary.