Present position: Post-doc researcher at University of Bergamo
Thesis title: Managing Web Data Semantics within the WebDL framework
Research area: Semantic Web
Traditional Web applications aim to offer content to their users through existing, well-understood interaction paradigms based on navigational links. The meaning of this data is known to the users, thanks to their knowledge of language and their perception of everyday things.
The situation is different, however, when the data is accessed by Web applications rather than by people. In this setting, the applications themselves need to know the semantics of the shared data. Recent applications also try to offer users more successful interactions by enabling navigations that take into account the users’ previous actions on the Web data. Such personalized navigations override the predefined ones. They rely on content associations that are built ad hoc after evaluating the semantics of the previously retrieved data. Unfortunately, this semantics is usually conveyed in design documents that explain the Web data. Such documents are not available along with the data, and even if they were, they would not be understandable by a software application.
Traditional Web applications can only partially cope with Web data semantics, for two main reasons. The first limitation derives from the limited expressiveness of the design documents that describe Web data. Often, they are not rich enough to cover commonly required aspects of data semantics representation, such as generalization hierarchies, overlapping classifications, transitivity of relations, and static and multi-valued attributes. The second limitation derives from the available data retrieval mechanisms: they fail to deliver objects when the relations connecting them to other objects are not known a priori. In the current work, we address the first limitation by representing Web data semantics explicitly as UML class diagrams. We address the second by implementing a framework that combines data retrieval mechanisms with a reasoning component, which can check the consistency of a UML class diagram knowledge base and can make inferences upon the depicted UML objects, deriving new associations between Web data. Because UML is a widely used design language, it favors wide acceptance and applicability of our method. The proposed framework is modeled with WebML, a visual modeling language for Web application design, and it is applied using a model-driven technique that extends the WebML methodology to the design of what we call “knowledge-intensive” hypertexts. It supports the graphical representation of queries upon Web data semantics, evaluated against a knowledge base with the support of an inference engine.
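The kind of inference mentioned above can be illustrated with a minimal sketch. It is not the framework's implementation and does not use its inference engine; it is a toy Python model, with hypothetical class and object names loosely inspired by a wine domain, showing how a generalization hierarchy lets a retrieval mechanism derive class memberships that were never stated explicitly.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (a, b) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical generalization hierarchy as (subclass, superclass) pairs.
generalizations = {("RedWine", "Wine"), ("Wine", "Beverage")}

# Hypothetical asserted facts: object -> the class it was declared with.
asserted = {"barolo_2004": "RedWine"}


def inferred_classes(obj):
    """All classes an object belongs to, whether asserted or derived."""
    direct = asserted[obj]
    closure = transitive_closure(generalizations)
    supers = {sup for sub, sup in closure if sub == direct}
    return {direct} | supers


classes = inferred_classes("barolo_2004")
print(sorted(classes))  # ['Beverage', 'RedWine', 'Wine']
```

Only the membership in `RedWine` is asserted; the memberships in `Wine` and `Beverage` are derived. A real reasoner handles far richer constructs, but the principle of deriving unstated associations is the same.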
The methodology that we have followed has two parts. First, we identify the elements that compose the UML class diagram knowledge base. We determine which UML components are needed for representing Web data semantics, and we evaluate the properties of existing UML formalizations with respect to sound and complete reasoning upon UML class diagrams. The reasoning mechanisms are attached to the knowledge base through Pellet, an inference engine that exposes its capabilities by means of Java interfaces. We then turn to our major objective, namely the modeling and implementation of a component-based Web framework for knowledge extraction upon Web data semantics. The framework modeling proceeds in two steps. First, we define new WebML primitives for the extraction and management of Web data semantics. Each extraction primitive retrieves content from the underlying knowledge base by means of predefined queries upon UML components. The queries may be parametric, receiving values at run time as a result of user navigation, and are translated to queries offered by the Pellet interface. The primitives for Web data semantics management enable the creation of the knowledge base from UML data models, and are likewise translated to actions offered by Pellet, communicated to it through its Java interface. Finally, we produce the framework hypertexts as WebML models enriched with the above primitives.
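The idea of a primitive that wraps a predefined, parametric query can be sketched as follows. This is an illustration under stated assumptions, not the framework's code: `QueryPrimitive`, the primitive name `InstancesOfClass`, and the textual query form `instances(...)` are all hypothetical, and no actual Pellet call is made.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class QueryPrimitive:
    """Hypothetical stand-in for a primitive wrapping a predefined query."""

    name: str
    template: Template  # predefined query text with $-placeholders

    def bind(self, **params):
        """Fill the placeholders with values supplied at run time.

        Raises KeyError if a required parameter is missing.
        """
        return self.template.substitute(**params)


# Hypothetical predefined query: "all instances of a given UML class".
instances_of = QueryPrimitive(
    name="InstancesOfClass",
    template=Template("instances($uml_class)"),
)

# Value that would arrive from the user's navigation, e.g. a followed link.
query = instances_of.bind(uml_class="RedWine")
print(query)  # instances(RedWine)
```

In the framework described here, the bound query would then be handed to the reasoning engine through its Java interface; that step is deliberately left out of the sketch.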
The approach produces a high-level, visual model of the framework, capable of depicting interactions with reasoning engines and the composition of knowledge-based queries. It hides the implementation details of the interactions with Pellet. The queries that correspond to the framework components are translated to concrete queries in description logic, so when the framework is reused in more complex systems, the developer is relieved of the burden of knowing the syntax of the underlying logical representation and the query interface exposed by the reasoning engine. We demonstrate the effectiveness of the framework with two case studies. In the first case study, we invoke the framework hypertexts during the development of UML class diagrams. The invocations explore the structure of the UML model during its design phase, resolve possible inconsistencies in it, and deduce new objects not represented in the original model. The product of such processing is a semantically verified UML class diagram. In the second case study, we explore the use of framework components in the design of the Wines portal. We create design patterns based upon the framework components, and we incorporate them within WebML hypertexts to make pages aware of the semantics of their content. In the hypertexts, we implement knowledge-based functionalities such as loading data semantics from data repositories into the knowledge base, generating logical conclusions upon them, and delivering those conclusions in the page context.
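To make the notion of detecting an inconsistency in a class diagram concrete, here is a minimal sketch assuming one simple kind of constraint, namely that two classes are declared disjoint. The function, the class names, and the object names are hypothetical; the check performed by an actual description logic reasoner is considerably more general.

```python
def find_inconsistencies(memberships, disjoint_pairs):
    """Return (object, class_a, class_b) triples violating a disjointness constraint."""
    violations = []
    for obj, classes in sorted(memberships.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                violations.append((obj, a, b))
    return violations


# Hypothetical constraint: no object may be both a RedWine and a WhiteWine.
disjoint_pairs = [("RedWine", "WhiteWine")]

# Hypothetical knowledge base content: object -> set of classes it belongs to.
memberships = {
    "barolo_2004": {"RedWine"},
    "mystery_bottle": {"RedWine", "WhiteWine"},
}

violations = find_inconsistencies(memberships, disjoint_pairs)
print(violations)  # [('mystery_bottle', 'RedWine', 'WhiteWine')]
```

An empty result means the model satisfies the declared constraints; a non-empty one points the designer at the objects and classes that need to be revised.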
The framework was designed for tractability and usability in the process of managing Web data semantics programmatically. On the tractability front, the framework is designed to control the semantics of Web data, covering both its conceptual representation and its logical counterpart. The use of UML and Pellet establishes the tractability of the represented data semantics. The UML components have been chosen carefully with respect to their expressive power and the computational overhead they impose on the reasoning operations that Pellet executes upon them.
On the usability front, the use of the framework components is based on a visual modeling technique. The components are visually depicted as WebML primitives, and they result in a graphical representation of queries defined upon UML components and evaluated against Pellet.
The components hide the implementation details of query evaluation. The developer does not need to know the logical formalism supported by Pellet, nor the mapping between UML components and Pellet programmatic elements. Moreover, UML is a widely accepted language for modeling software components, with a number of available editing tools and a large body of accumulated experience, and its role in query formulation facilitates the widespread acceptance and use of the framework. Further, representing Web data semantics as UML class diagrams does not require the developer to learn a new language, which reduces development time. Finally, the framework enables the integration of Web data semantics into the logic of semantic Web applications, and it allows the conceptual model of such integration to be depicted without affecting the abstraction level of the Web application design.