See also: I-Know'02, Graz, 11-12 July 2002
Webs, Grids and Knowledge Spaces
| | Grids | Webs | Comments |
|---|---|---|---|
| main drivers | (big) eScience, eEngineering | scientific communication (initially, now:) eCommerce, eContent (multimedia) | there is some overlap and there may be more in the future |
| main functions | high performance computing, sharing of computing resources | information, communication, transactions | |
| applications | computationally hard and data intensive problems in science and engineering (e.g. realistic simulations) | I&C services, education & training, eBusiness, eCommerce (B2B, B2C, B2A, etc.), etc. | Webs are mainly interfaces to `behind the scene' applications |
| data volumes | XXL (and bigger) | S - XL | future Grids may also work on smaller volumes |
| resources | storage (incl. caches), bandwidth, processor time, data files, ... | digital content and related services | containers, conveyors & processors vs. content & applications |
| users | special user groups (scientists, engineers) | general public, businesses, public administrations, etc. | these are only the main target groups |
| standards | middleware standards need to be agreed | many standards and recommendations exist | Grid and Web communities are still fairly separate |
Table 1: A ``comparison'' of GRIDs and WEBs
To take `nothing' for an answer to the question introducing this section may indeed be a bit too little. And it is certainly not necessary. The clue to a possibly correct understanding of the relationship between Webs and Grids lies in the statement ``Webs are mainly interfaces to `behind the scenes' applications'' (cf. Section 3). We noted that these applications can be arbitrarily complex. And we do not usually care who or what is working `behind the scenes'. So it may be Grids (or isolated high performance computers or just an ordinary PC, or whatever). Indeed, Grid applications could render invaluable services even to the general public, via specialised professionals such as medical practitioners. These applications would be accessed via a Web and their output (e.g. visual representations of complex objects or simulations) translated into standard Web formats.
We shall argue that Grids could indeed provide the Web (or Webs ...) with `knowledge', assuming the pragmatic interpretation of that notion put forward in Section 3.
While the Web community, led by the World Wide Web Consortium (W3C), and with substantial contributions from researchers in the fields of Artificial Intelligence, agent and database technologies, has been developing and refining the concept of a Semantic Web, Grid proponents - notably in the United Kingdom ([15]) - extended the basic architecture of the ``Computational Grid'' by adding two layers, called ``Information Grid'' and ``Knowledge Grid'' respectively.
Roughly, the first two layers of this model make up the technology explained in Section 4. By contrast, a major role ascribed to processes running within the ``knowledge layer'' of a Grid is to assist in making sense of the huge amounts of data generated by, say, scientific instruments such as particle accelerators, gene sequencers, telescopes, satellites and a gamut of sensors. And they should do so by making use of the computational power and the services of the underlying layers.
As pointed out in Section 4, current Grid development is mainly driven by the needs of ``data intensive'' science. Knowledge Grid processes can therefore be understood best as special applications of the computational Grid, supposed to enhance scientific and other ``problem solving environments''. They make use of a variety of techniques, such as those that can be described broadly as algorithmic content analysis and algorithmic learning.
From the foregoing it should be clear that the relationship between Knowledge Grids and Semantic Webs may be characterized as complementarity. Knowledge Grid techniques are indeed among those the Semantic Web calls for in order to render its contents meaningful to software agents (e.g. by creating semantic annotations or by mining data - representing text or other forms of content - for the purpose of establishing and maintaining ontologies). On the other hand the formal framework and (possibly) the organisational underpinning of a Semantic Web would be needed to make full use of Knowledge Grid resources and services.
While many of the basic ideas underlying both the Semantic Web and the Knowledge Grid initiatives are not new, the sheer size, capacity and dynamics of today's global networks (notably the Internet and whatever it will develop into) provide a strong incentive to turn these ideas into large scale reality. This in itself may require a major research effort. In a manner of speaking the evolution of the Internet and the Web brings some of the ``good old fashioned Artificial Intelligence'' research, results and approaches (some of which do require powerful computing resources) down to earth, begging for new approaches to solving new problems.
However, as yet neither Semantic Webs nor Knowledge Grids exist as envisioned, let alone co-operate. How could they come about? ``Growth'' may be an appropriate metaphor to describe the emergence and evolution of networks (for whatever kind of traffic). But growth does not happen out of the blue. It needs seeds in the first place, then possibly fertilizers and irrigation if nature does not make the necessary provisions herself.
In our case nature's role would be that of the business world, and commercial interest would be the main driving force. However, commercial interest in starting and sustaining pioneering research may be weak if no substantial benefits can be made out in the short or mid term. And indeed there are examples of technologies that would probably never have succeeded the way they did if their development had depended solely on ``commercial interest'' in the first place.
The Internet itself is a case in point. Its initial development depended largely on public funding. And while commercial interest had been strong enough to make ``the Web'' grow exponentially for a number of years, this may not be as obvious for the Semantic Web, the Knowledge Grid and their possible ``marriage'' (regardless of what the new family name may be: Semantic Grid or Knowledge Web or whatever). The underlying concepts are after all not so easy to grasp, and their potential benefits (e.g. in terms of creating mass markets) are not so easy to sell given the current perceived slump in online business.
Moreover, a critical mass problem has to be solved, for instance, for the Semantic Web: adding semantics to content (and services) does not pay off if no tools are available to make good use of it; developing tools, on the other hand, does not pay off if there is little semantically-enriched content to work on.
These may be some of the reasons (apart from the obvious research challenges) why both initiatives, the Semantic Web and the (Knowledge) Grid, have been given firm places on the agenda of the European Commission's IST programme 1998-2002 ([16]): as Semantic Web Technologies under Key Action III (Multimedia Content and Tools, Action Line III.4.1) in the IST work programme 2001 and as Grid Technologies and their applications, a Cross Programme Action (CPA9) in work programme 2002.
Action Line III.4.1 offered four broad inter-related R&D areas as an orientation for submitting project proposals:
Creating a usable formal framework in terms of formal methods, models, languages and corresponding tools for semantically sound machine-processable resource description (e.g. content characteristics, properties of repositories, capabilities of devices, service features, ...);
fleshing out the formal skeletons by developing and applying techniques for knowledge discovery (in databases and text repositories), ontology learning, multimedia content analysis, content-based indexing, ...;
acting in a semantically rich environment, performing resource and service discovery, complex transactions, semantic search and retrieval, filtering and profiling, supporting collaborative filtering and knowledge sharing, ...;
making it understandable to people through device dependent information visualisation, semantics-based and context-sensitive navigation and browsing, semantics-based dialogue management, ... .
While the second track of Action Line III.4.1 did not stipulate in any way the underlying computational platform, CPA9 (Grid Technologies and their applications) was very specific about it. It invited proposals to apply Grid technology to ``knowledge discovery in (multidimensional and multimedial) large distributed datasets, using cognitive techniques, data mining, machine learning, ontology engineering, information visualisation, intelligent agents, ...''
Neither of these action lines prescribed a particular application domain. The very title of Action Line III.4.1, ``Semantic Web Technologies'', made this quite explicit. And as far as Grids were concerned, the term applications referred to the implementation, on top of the basic architecture of computational Grids, of solutions pertaining to fairly general classes of problems.
Yet, clearly, technologies must not be developed for the sake of developing technologies. They should respond to real needs and they will be successful (commercially and otherwise) only if they do so. Therefore proposers were advised to make sure the solutions proposed would not benefit a limited constituency only, or solve just one isolated problem. Rather, projects submitted under a generic action line such as ``Semantic Web Technologies'' should, in a final analysis, yield more widely applicable results.
Calls for submitting proposals to these Action Lines were published in July (AL III.4.1) and November (CPA9) 2001, respectively. Both Calls met with considerable interest in relevant R&D communities across Europe and drew altogether nearly one hundred submissions involving several hundred participating organisations. The ``success rate'' (in terms of acceptance for funding) has been close to 25%.
It must be noted that the modules of the IST programme we have described so far are not the only ones designed to deal with the objectives and problems related to knowledge discovery, acquisition, management and use in the context of large scale distributed systems. Other IST Key Actions, in particular IV (Essential Technologies and Infrastructures) and the IST FET domain (Future and Emerging Technologies), but also II (New Methods of Work and Electronic Commerce), invited and are hosting relevant projects.
While Key Action IV and FET are also ``neutral'' as far as applications are concerned, projects under Key Action II are indeed required to focus on particular application domains which could be broadly described as corporate knowledge management and ``e-business''.
As the current IST programme (1998 - 2002) is coming to an end, this is the time to make a first assessment of the extent to which retained projects are contributing, or are expected to contribute, technically to creating, managing and using the ``knowledge spaces'' that could be spanned by Webs and Grids. Summaries for a selection of these projects (in alphabetical order) are provided in Appendix A to this note.
Talking about space suggests dimensions. Obviously, we cannot discuss the dimensions of ``knowledge space'' (there may be infinitely many) but only some of the ``problem and solution spaces'' at issue. Here we can at least identify more or less orthogonal subspaces accommodating the various aspects of relevant IST projects. These subspaces correspond roughly to the areas outlined in Section 8.
The provision and usability of a formal framework for dealing with the semantics of distributed digital content is of general concern and the main focus of a number of projects, in particular ON-TO-KNOWLEDGE and WONDERWEB. Not surprisingly (in view of our remarks in Section 3), ontologies take centre stage in these projects, whose workplans include ontology language definition and an analysis of the requirements to be met by ``ontology servers''. While both projects contribute (directly and indirectly) to the W3C ``recommendations process'', that organisation actually takes the lead in SWAD-EUROPE, a ``bottom-up'' experimentation and implementation project designed to showcase the viability of Semantic Web model and language recommendations.
There are two large subspaces that can be labeled ``making content semantics explicit'' and ``acting on explicit semantics'', respectively. Usually, explicit semantics means metadata grounded in a firm semantic domain, such as a formal ontology. But it also refers to the ontologies themselves.
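To illustrate what ``metadata grounded in a formal ontology'' could mean in practice, here is a minimal sketch. It is not taken from any of the projects discussed; the toy ontology, the property names and the helper `grounded` are all hypothetical. Metadata are written as (subject, property, object) triples, and a triple counts as grounded only if its property is defined in the ontology and is applied to resources of the declared classes.

```python
# Hypothetical toy ontology: each property maps to (domain class, range class).
ONTOLOGY_PROPERTIES = {
    "depicts": ("Image", "Person"),
    "creator": ("Document", "Person"),
}


def grounded(triple, types):
    """Return True if the triple uses a property defined in the ontology
    and its subject and object have the classes that property expects.

    `types` maps resource identifiers to their (single) class name.
    """
    subject, prop, obj = triple
    if prop not in ONTOLOGY_PROPERTIES:
        return False  # the property has no agreed meaning
    domain, rng = ONTOLOGY_PROPERTIES[prop]
    return types.get(subject) == domain and types.get(obj) == rng


# Illustrative resources (assumed identifiers, not real data).
types = {"img42": "Image", "ada": "Person"}

ok = grounded(("img42", "depicts", "ada"), types)
unknown_property = grounded(("img42", "colour", "blue"), types)
wrong_direction = grounded(("ada", "depicts", "img42"), types)
```

Real ontology languages are of course far richer (class hierarchies, cardinalities, axioms); the sketch only shows that explicit metadata become checkable once their terms are tied to a shared definition.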
``Making semantics (i.e. metadata and ontologies) explicit'' can happen in many ways, depending largely on content types and usage environments. There are, however, two main categories of approaches: either through (automated) content analysis or by interactive capture, at content production time. These categories represent extremes. A middle way would be to provide means of interactive ``knowledge capture'' on the fly, directly from workflow processes for instance, where users would not be required to make the extra effort of entering semantic metadata. This approach appears to be particularly appropriate in the context of ``corporate knowledge management''. But there is in fact an entire spectrum of ``semi-automation''.
The second subspace (``acting on explicit semantics'') can also be partitioned depending on who is acting, software agents or human beings. Software agents are certainly main characters in a Semantic Web scenario. They always act, in a final analysis, on behalf of humans.
They appear in several roles: as service providers, discoverers, mediators or composers. Hence, they also need ``service semantics'' (i.e. ontology-based descriptions of prerequisites and effects) in order to do what they are supposed to do.
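The role of ``service semantics'' for a composing agent can be sketched as follows. This is an illustrative toy, assuming that prerequisites and effects are plain sets of facts; the `Service` type, the `compose` function and the service names are hypothetical and do not reproduce any project's framework. The agent chains services greedily: it applies any service whose prerequisites hold and whose effects add something new, until the goal is reached.

```python
from typing import NamedTuple


class Service(NamedTuple):
    name: str
    prerequisites: frozenset  # facts that must hold before the service runs
    effects: frozenset        # facts that hold after the service has run


def compose(services, initial_facts, goal):
    """Greedy forward chaining over service descriptions.

    Returns the list of service names applied, or None if the goal
    cannot be reached. The plan is not guaranteed to be minimal.
    """
    facts = set(initial_facts)
    plan = []
    progress = True
    while not goal <= facts and progress:
        progress = False
        for service in services:
            applicable = service.prerequisites <= facts
            adds_something = not service.effects <= facts
            if applicable and adds_something:
                facts |= service.effects
                plan.append(service.name)
                progress = True
                break
    return plan if goal <= facts else None


# Hypothetical services: a Grid simulation whose output is rendered for the Web.
services = [
    Service("render", frozenset({"simulation_result"}), frozenset({"web_image"})),
    Service("simulate", frozenset({"model"}), frozenset({"simulation_result"})),
]

plan = compose(services, {"model"}, {"web_image"})
no_plan = compose(services, set(), {"web_image"})
```

Without descriptions of prerequisites and effects the agent would have no basis for deciding that ``simulate'' must precede ``render''; with them, discovery and composition reduce to matching descriptions against the current state and the stated goal.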
Agents and processes handling queries can build on reasoning and inferencing capabilities made possible by ontologies. However, a particularly serious problem in this context is scalability: of ontologies, ontological reasoning and ontology (change) management. This problem ranks high on the agenda of several projects, including MOSES and the aforementioned WONDERWEB and ON-TO-KNOWLEDGE.
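A small sketch may show both the benefit of ontological inference for query handling and why scalability is a concern. Everything here is an assumption made for illustration: the class names, the resource identifiers and the functions `superclasses` and `instances_of`. A query for all instances of a class also returns instances of its subclasses, which requires following subclass links transitively.

```python
# Hypothetical subclass links (subclass, superclass) and typed resources.
SUBCLASS_OF = {
    ("Painting", "Image"),
    ("Image", "Document"),
    ("Report", "Document"),
}
TYPE_OF = {"doc1": "Painting", "doc2": "Report", "doc3": "Person"}


def superclasses(cls):
    """All classes reachable from cls via subclass links, including cls."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sub == current and sup not in seen:
                seen.add(sup)
                frontier.append(sup)
    return seen


def instances_of(cls):
    """Resources whose declared class is cls or one of its subclasses."""
    return sorted(r for r, t in TYPE_OF.items() if cls in superclasses(t))


documents = instances_of("Document")
images = instances_of("Image")
```

Note that this naive version scans every subclass link for every resource on every query. That is harmless for three classes but not for ontologies and metadata collections of Web scale, which is one way of seeing why scalable reasoning and change management appear on the project agendas.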
Explicit content (and service) semantics may greatly enhance the quality, effectiveness and efficiency of man-system interaction and of system-mediated communication among people. Topics of particular interest include: ``Navigation and browsing'', ``query construction support'', ``dialogue management'', ``personalization, profiling and customization'', ``information visualisation'', ``semantic Web portals'' and ``collaboration support''.
Table 2 presents a classification of the projects listed in Appendix A, based on the scheme outlined above. This classification must be fuzzy as a given project may well address problem areas and propose solutions that belong to different subspaces: ``making content semantics explicit'', for instance, is never an end in itself, while ``acting on explicit semantics'' presupposes the existence of explicit semantics. Hence, our classification merely reflects the perceived gist of a project, its main thrust.
| | making | acting upon |
|---|---|---|
| automatic tools | (I) | (II) |
| interactive tools | (III) | (IV) |
| general framework | (V) | |
Table 2: Semantics projects in problem&solution space
The number of projects allocated to the groups (I)-(IV) respectively seems to indicate the relative urgency of the issues involved. ``Making content semantics explicit'' (groups (I) and (III)) appears to be a dominant objective, and probably rightly so. It comprises both ontologies (e.g. through ontology learning and emergent semantics in peer-to-peer networks in SWAP, GRACE and MOSES) and ontology-based metadata (e.g. through semantic annotations in ESPERONTO, through image analysis in SCULPTEUR, and through the extraction of domain-specific metadata in SPIRIT and WISPER). The projects in group (I) aim to achieve this objective through more automation than interaction whereas group (III) projects emphasize interaction, including system-mediated human-to-human interaction (cf. above).
Group (II) projects focus on services, their description and discovery, and on other service related operations (cf. above). SWWS, for instance, will be about the implementation of a fully fledged Web service modeling framework (WSMF), put forward in [17]. MONET will offer mathematical solvers and SEWASIE concentrates on semantic search and inferencing. IBROW is a brokering service, configuring ``knowledge components'' (ontologies and generic algorithms) according to stated specifications of user needs.
The main objectives of the three projects in group (IV) are related to what may be called ``interfacing with knowledge'', aspects of which we have already mentioned. In the projects in question these are mainly browsing (INDICO), context visualisation (VICODI) and cooperative work (WIDE) support.
Given the crucial role of ontologies within ``semantic systems'', these constructs appear in one way or another in virtually all projects. Similarly, the agent paradigm, which has been quite popular in the field of distributed computing for some time, is gaining new ground in ``semantically-enriched environments''. Agents also appear almost everywhere: as constructors of ontologies, extractors of metadata, as service composers and as assistants at the user interface.
Several projects (e.g. MOSES, FF-POIROT and SEWASIE) also address multilinguality issues that bear on the creation and use of ontologies. And despite the fact that non-textual content poses much harder ``semantics problems'' than even unstructured text, some projects (e.g. SCULPTEUR, SPACEMANTIX and ESPERONTO) have taken on that challenge. We do note, however, that there still seems to be a fairly wide gap between, say, the Semantic Web communities proper and those who do research on multimedia semantics (e.g. image understanding).
It would be presumptuous to claim absolute novelty (or uniqueness) for any of the approaches taken by the projects discussed in this section. (One may argue that science and technology proceed by piecemeal research and engineering; and that ``radical breakthroughs'' are never so radical when seen against the backdrop of their birthing grounds. The Web itself, as explained in Section 2, corroborates this statement.) However, all of these projects do provide an opportunity for researchers in Europe to explore new territory, to prove or disprove the viability of existing approaches, to establish the need for new ones, and to contribute to making worldwide distributed systems more usable.
The evolution of basic digital technologies has been going on for more than half a century, characterized by ever increasing values of parameters such as processor speed, storage and memory capacity, bandwidth and connectivity. Given its current momentum (occasional ups and downs notwithstanding) it is likely to continue for quite some time. It has brought about many novelties relative to the pre-digital era. At least three classes of applications are relevant within the context of this note:
Digital technologies allow us to create, maintain and use content of all types and media in hitherto unreachable dimensions, thanks to tools that are many orders of magnitude more powerful than pen, paper, the printing press or library catalogues.
Digital technologies have drastically enhanced our ability to analyse what is going on in the world (in both nature and society), to peruse vast amounts of data, searching for structure, thus refining our models of the world (cf. footnote 5).
Digital technologies allow us to build machines that can learn and - to a certain extent - act autonomously in limited environments. (In a way this may be considered a special case of the second class of novelties.)
Developments corresponding to the first and second of these classes have led directly to the more or less recent phenomena discussed in this note, to Webs and Grids. And they will perhaps lead on to Semantic Webs and Knowledge Grids. The third novelty refers to autonomous ``intelligent'' agents and robotics, a field of applications of basic digital technologies that may not yet be as fully visible as other application domains are. All three are about creating and using representations of knowledge (in the sense of footnote 4). They can be subsumed under the heading Knowledge Technologies.
Their impact is steadily growing: they are transforming industrial production processes, the way we create and distribute content for human consumption, the way we do science, the way businesses are managed, the way public administrations work, ... .
We note, however, that in the past attention has always been focused on technologies that would bring about changes in precisely these areas. We remember: Management Information Systems (MIS), Office Automation Systems, Decision Support Systems, Expert Systems, Computer Supported Cooperative Work (CSCW) systems, Corporate Information Systems, Computer Integrated Manufacturing (CIM) systems, ... (not to mention the multitude of isolated or linked business application systems and tools for building such applications).
So the obvious question to ask is: what is going to happen next? Will there be a next Big Thing and if so, what will it be?
The evolution of technology appears to be driven by at least two interacting processes: the emergence of needs and the sophistication of tools. Usually, there is ``positive feedback'' in the sense that the increase in sophistication of tools goes hand in hand with an extension and more detailed specification of needs.
What then are the problems to be solved by Knowledge Technologies? And what are the problems created by these technologies? Some answers to such questions have been outlined in previous sections. But they do require much greater attention.
Will the visions of a Semantic Web where ontologies would be as crucial as plain documents are for the current Web, and of a vast virtual computer called Grid, hold solutions in stock also for big multinational companies? For the multitude of small and medium-sized enterprises? For the general public? For scientists? For engineers? For professionals from all walks of life? The knowledge workers? What exactly are their needs, how do these needs change as technologies change?
Forecasting the future has never been a very gratifying undertaking. Not to predict the future but to create it is perhaps a more rewarding task. Yet there have been few large scale joint research and development activities that were truly vision-led or taking as (seemingly) straightforward a path as for instance the man-on-the-moon project (not to mention military objectives, of course).
As pointed out in Section 8, publicly funded research programmes have a key role to play here. Ideally, they would provide some guidance and focus, based on a sufficiently broad consensus among relevant R&D communities. The design of the multi-annual research programmes of the European Union reflects this objective.
One of the Priority Research Areas of the European Commission's forthcoming 6th Framework Programme will again be Information Society Technologies (IST), which in turn will offer two main research foci: applied IST for major societal and economic challenges, and generic IST research and technology development. The latter will also cover knowledge and interface technologies.
The Council Decision concerning the specific programmes implementing the Sixth Framework Programme of the European Community for research, technological development and demonstration activities (2002-2006) addresses Knowledge Technologies as follows ([18]):
The objective is to provide automated solutions for creating and organising virtual knowledge spaces (e.g. collective memories) so as to stimulate ... new content and media services and applications. Work will focus on technologies to support the process of modelling and representing, acquiring and retrieving, navigating and visualising, interpreting and sharing knowledge. These functions will be integrated in new semantics-based and context-aware systems including cognitive and agent-based tools. Work will address extensible knowledge resources and ontologies so as to facilitate service interoperability and enable next-generation Semantic-web applications. Research will also address technologies to support the design, creation, management and publishing of multimedia content, across fixed and mobile networks and devices, with the ability to self-adapt to user expectations. The aim is to stimulate the creation of rich interactive content for personalised broadcasting and advanced trusted media and entertainment applications.
This includes the technologies and research directions addressed in this note, but goes clearly beyond. A number of action lines pertaining to several components of the IST programme under FP5 have already set the scene (cf. Appendix B).
The funding instruments foreseen at the time of this writing have been designed with a view to giving research communities a real opportunity to formulate and pursue visions of their own and to build strong bridges across disciplines if necessary or desirable. These instruments have also been designed with a view to supporting the emergence of a true European Research Area (ERA). Applying them to Knowledge Technologies may indeed complement this ERA with a vast European Knowledge Space.
KEY ACTION II - NEW METHODS OF WORK AND ELECTRONIC COMMERCE
Corporate knowledge management
Knowledge Management for eCommerce and eWork
Technology Building Blocks for Trust and Security
KEY ACTION III - MULTIMEDIA CONTENT AND TOOLS
Authoring and design systems
Content management and personalisation
Media representation and access: new models and standards
Access to digital collections of cultural and scientific content
Content-processing for domestic and mobile multimedia platforms
Information visualisation
Semantic Web Technologies
KEY ACTION IV - ESSENTIAL TECHNOLOGIES AND INFRASTRUCTURES
Engineering of intelligent services
Methods and tools for intelligence and knowledge sharing
Information management methods
FUTURE AND EMERGING TECHNOLOGIES
Open domain (“FET OPEN”)
Universal information ecosystems
The disappearing computer
Global computing: co-operation of autonomous and mobile entities in dynamic environments
CROSS PROGRAMME ACTIONS
CPA9: GRID Technologies and their applications
1. The views expressed in this note are those of the author and do not necessarily engage his employer.
2. One may also argue that it has been the World Wide Web that nourished the growth of the Internet.
3. Classical examples of collections of metadata are library catalogues consisting, for example, of MARC records describing books and other items belonging to the 'Gutenberg Galaxy'. By contrast, the metadata we have in mind when talking about the Semantic Web pertain to all kinds of digitally representable objects.
4. For obvious reasons we do not engage in any discussion of the elusive notion of knowledge beyond this rough description; we do, however, maintain that there is a set of operations applicable to whatever knowledge may be. This set includes: acquisition, elicitation, discovery, representation, communication, inference and access. We also contend that knowledge can be more or less precise, more or less pertinent and hence more or less usable in a given environment and for a given purpose ("ex falso quodlibet" is a well known worst case).
5. These data are, by the way, mainly being collected through devices that owe their existence, effectiveness and efficiency to digital technologies.