dntt - Publications List

Journal articles

2006

Nicolas Travers, Tuyêt-Trâm Dang-Ngoc, Tianxiao Liu (2006) TGV: an Efficient Model for XQuery Evaluation within an Interoperable System Interoperability In Business Information Systems (IBIS) 3: 59-72 December

Abstract: This paper presents a generic model called TGV for efficient evaluation of XQuery in a heterogeneous distributed system. XQuery is a rich and therefore complex language that allows users to express a wide range of queries over XML documents. This expressiveness makes it difficult to obtain a unique internal representation within a system. To this end, models based on Tree Patterns have been proposed: TPQ [1], generalized by the GTP [2]. However, they do not capture the full expressiveness of XQuery, cannot handle mediation problems, and do not support extensible optimisation or source information. We present a tree pattern-based model called TGV that (a) integrates all the functionalities of XQuery, (b) uses an intuitive representation that provides a global visualization of the request in a mediation context, (c) provides a framework for extensible optimization using a rule definition model, and (d) takes into account all knowledge useful for query evaluation (cost model, accuracy, etc.). This work has been implemented in the XLive system [dangngoc2005], based on a full-XML architecture.

Notes:

2005

Tuyêt-Trâm Dang-Ngoc, Virginie Sans, Dominique Laurent (2005) Classifying XML Materialized View for their Maintenance on Distributed Web Sources Revue des Nouvelles Technologies et de l'Information (RNTI)

Abstract: Recent years have highlighted the growth and great diversity of electronic information accessible on the web. Data integration systems such as mediators have thus been designed to integrate these distributed, heterogeneous data into a uniform view. To ease data integration across different systems, XML has been adopted as the standard format for exchanging information. XQuery is a query language for XML that has established itself in XML-based systems. XQuery is therefore used in mediation systems to design views defined over several sources. To optimize query evaluation, the views are materialized. The difficulty is to maintain materialized views incrementally when the sources are updated because, in the context of web sources, the sources provide very little information. The methods usually proposed cannot be applied. This article studies how to update materialized XML views over web sources within a mediation architecture.

Notes:

2003

Tuyêt-Trâm Dang-Ngoc, Georges Gardarin (2003) Conception et Evaluation de XQuery dans une architecture de médiation «Tout-XML» Revue ISI (Integration de systèmes d'information) : Numéro spécial sur les Bases de Données Semi-structurées

Abstract: XML has emerged as the leading language for representing and exchanging data, not only on the Web but also in the enterprise in general. XQuery is emerging as the standard query language for XML. Thus, tools are required to mediate between XML queries and heterogeneous data sources to integrate data in XML. This paper presents the XMedia mediator, a unique tool for integrating and querying disparate heterogeneous information as unified XML views. It describes the mediator architecture and focuses on the unique distributed query processing technology implemented in this component. Query evaluation is based on an original XML algebra that simply extends classical operators to process tuples of tree elements. Further, we present a set of performance evaluations on a relational benchmark, which leads to a discussion of possible performance enhancements.

Notes:

Conference papers

2007

Tuyêt-Trâm Dang-Ngoc, Nicolas Travers (2007) Tree Graph Views for a Distributed Pervasive Environment In: International Conference on Network-Based Information Systems (NBIS) Regensburg, Germany:

Abstract: The pervasive Internet and the massive deployment of sensor devices have led to a huge heterogeneous distributed system connecting millions of data sources and customers together [1]. On the one hand, mediation systems [2] using XML as an exchange language have been proposed to federate data across distributed heterogeneous data sources. On the other hand, work [3,4,5,6] has been done to integrate data from sensors. The challenge is now to integrate both "classical" data (DBMS, Web sites, XML files) and "dynamic" data (sensors) in the context of an ad-hoc network, and finally to adapt queries and results to match the client profile. We propose to use the TGV model [7,8] as a mobile agent to query sources across devices (sources and terminals) in the context of a rescue coordination system. This work is integrated in the PADAWAN project.

Notes:

Nicolas Travers, Tuyêt-Trâm Dang-Ngoc, Tianxiao Liu (2007) TGV: a Tree Graph View for Modelling untyped XQuery In: International Conference on Database Systems for Advanced Applications (DASFAA) Bangkok, Thailand:

Abstract: Tree Pattern Queries [1,2] are now well accepted for modeling parts of XML queries. Work such as GTP [2] uses the Tree Pattern Query as a basis to model part of the XQuery specification. Current work focuses only on a small subset of the XQuery specification and is not well adapted to evaluation in a distributed heterogeneous environment. In this paper, we propose the TGV (Tree Graph View) model for XQuery processing. The TGV model extends the Tree Pattern representation to make it intuitive, supports full untyped XQuery queries, and supports optimization and evaluation. Several types of Tree Pattern are manipulated to handle all XQuery requirements. Links between Tree Patterns are called hyperlinks and are used to apply transformations to results. The TGV, TGV annotations and cost models have been implemented in a mediator system called XLive.

Notes:

Nicolas Travers, Tuyêt-Trâm Dang-Ngoc, Tianxiao Liu (2007) An Efficient Evaluation of XQuery with TGV In: International Conference of WEB Information Systems and Technologies (Web-IST) Barcelona, Spain:

Abstract: This paper presents an efficient evaluation of XQuery in a heterogeneous distributed system. XQuery (W3C, 2005) is a rich and therefore complex language. Its syntax allows us to express a wide range of queries over XML documents. We have extended the proposal of (Chen et al., 2003) to rewrite XQuery expressions in "canonical XQuery" in order to support the full XQuery specification. The expressiveness of XQuery makes it difficult to obtain a unique internal representation within a system. Models based on Tree Patterns have been proposed, and we have extended the tree pattern model to a model called TGV that (a) integrates all the functionalities of XQuery, (b) uses an intuitive representation that provides a global visualization of the request in a mediation context, and (c) provides support for optimization and for cost information. Our paper is based on the XLive mediation system. XLive integrates sources in a uniform view. It is a running research vehicle designed at PRiSM Laboratory for assessing the integration system at every stage of the process, from source extraction to the user interface, and is already used in several projects.

Notes:

Nicolas Travers, Tuyêt-Trâm Dang-Ngoc (2007) An Extensible Rule Transformation Model for XQuery Optimization In: International Conference on Enterprise Information Systems (ICEIS) Madeira, Portugal:

Abstract: Efficient evaluation of XML query languages has become a crucial issue for XML exchange and integration. Tree Patterns (Sihem et al., 2002; Jagadish et al., 2001; Chen et al., 2003) are now well accepted for representing XML queries, and a model called TGV (Travers, 2006; Travers et al., 2006; Travers et al., 2007c) has extended the Tree Pattern representation to make it more intuitive, to respect the full XQuery specification, and to support manipulation, optimization, and then evaluation. Optimization requires a search strategy, which consists in generating equivalent execution plans using extensible rules and estimating the cost of each plan to find the best one. We propose a specification of extensible rules that can be used in a heterogeneous environment, supporting XML and manipulating Tree Patterns.

Notes:

2005

Tuyêt-Trâm Dang-Ngoc, Dominique Laurent, Virginie Sans (2005) On the Maintenance of XML Materialized Views. In: Franco-Japanese Workshop on Information Search Integration and Personalization (ISIP) Lyon, France:

Abstract: Providing services by integrating information available in web resources is one of the main goals of a mediation architecture. In this paper, we consider the standard wrapper-mediator architecture under the following hypotheses: (i) the information exchanged between wrappers and the mediator consists of XML documents, (ii) wrappers have limited resources, and (iii) to answer queries even if sources are not available, materialized XML views are stored at the mediator level. In this setting, we focus on the problem of maintaining materialized XML views when the sources change. In our context, wrappers send the updated document without providing any information about the type and location of the update in the document. The problems we address are therefore, first, identifying the updates and, second, updating the view in such a way that accesses to the sources are restricted. Our approach is based on the XAlgebra, which allows XQuery requests on XML documents to be considered as relational tables. Moreover, our solution uses identifier annotations for XAlgebra and a diff function.

Notes:

Tuyêt-Trâm Dang-Ngoc, Clément Jamard, Nicolas Travers (2005) XLive: An XML Light Integration Virtual Engine In: Bases de Données Avancées (BDA) Saint-Malo, France:

Abstract: On the Internet, data are distributed across heterogeneous sources. To integrate them in a uniform view, many systems based on the well-known mediator/wrapper architecture defined by [14] have been designed [3][12]. The data model now accepted for representing data is semi-structured data in the standard XML format. Thus, in industry as well as in research, integration systems using XML-based standards have emerged [5][4]. XLive is such an integration system based on XML standards. It is the sequel to our experience in mediation design in research projects (MIROWEB [11], XML-KM) and in industry (XMLMedia). The XLive prototype is designed to be a light mediation system with high modularity and extension capabilities. It is a running research vehicle designed for assessing the integration system at every stage of the process, from source extraction to the user interface, including query parsing and modeling, optimization and evaluation, and also benchmarking.

Notes:

2004

Georges Gardarin, Tuyêt-Trâm Dang-Ngoc (2004) Mediating the Semantic Web In: journées d'Extraction et de Gestion des Connaissances (EGC) Clermont-Ferrand, France:

Abstract: This article develops an extension of a mediation architecture to integrate the Semantic Web. More precisely, XLive is a full-XML mediator developed at PRiSM. It executes XQuery queries over heterogeneous data sources. After a brief presentation of XLive and the Semantic Web, a three-level architecture of ontologies and schemas is introduced to connect adapters for the Semantic Web. This architecture aims to integrate information Web service sources in accordance with a global reference ontology. It leads to extending XLive with view support, a tool for designing views and mappings, and adapters for Web services.

Notes:

Georges Gardarin, Tuyêt-Trâm Dang-Ngoc (2004) Mediating the Semantic Web with XML/XQuery In: International Conference on Information & Communication Technologies: from Theory to Applications (ICTTA 2004) Damascus, Syria:

Abstract:

Notes:

Tuyêt-Trâm Dang-Ngoc, Georges Gardarin, Nicolas Travers (2004) Tree Graph View: On Efficient Evaluation of XQuery in an XML Mediator In: 20ème conférence Bases de Données Avancées (BDA)

Abstract: XQuery is the emerging standard for querying XML data sources. XLive is a light XML/XQuery mediator developed at University of Versailles whose engine processes an XML algebra derived from the relational one and extended to process XML trees in dataflow. The query optimizer translates a subset of XQuery into this algebra. To extend the optimizer's coverage of XQuery and better optimize query plans, we propose a representation of queries as graphs of trees, more precisely as tree pattern graphs interconnected by hyperlinks. Our structure, called Tree Graph View (TGV), is an extension of the Generalized Tree Pattern graph proposed in [6] as a concise and practical representation of an XQuery request. It is designed to be a more intuitive model of queries and to allow direct optimization before generating the physical execution plan. TGV lends itself to simple algorithms to generate efficient algebraic execution plans. Moreover, it is effective for view translation and query simplification, and for taking into account source capabilities. We are currently implementing it to support the new XLive optimizer.

Notes:

2003

Tuyêt-Trâm Dang-Ngoc, Georges Gardarin (2003) Evaluating XQuery in a full-XML Mediation architecture In: conférence Bases de Données Avancées (BDA)

Abstract: XML has emerged as the leading language for representing and exchanging data, not only on the Web but also in the enterprise in general. XQuery is emerging as the standard query language for XML. Thus, tools are required to mediate between XML queries and heterogeneous data sources to integrate data in XML. This paper presents the e-XMLMedia mediator, a unique tool for integrating and querying disparate heterogeneous information as unified XML views. It describes the mediator architecture and focuses on the unique distributed query processing technology implemented in this component. Further, we outline the various applications currently being tested with the e-XMLMedia Mediator.

Notes:

Tuyêt-Trâm Dang-Ngoc, Georges Gardarin (2003) Federating Heterogeneous Data Sources In: International Conference on Information and Knowledge Sharing (IKS) 193-198 Scottsdale, USA:

Abstract: XML has emerged as the leading language for representing and exchanging data either on the Web or in the enterprise for general purposes. Although XQuery is emerging as the standard for XML query languages, tools are still needed to mediate between XML queries and heterogeneous data sources for the integration of data in XML. This paper presents the XLive mediator, a unique tool for integrating and querying disparate heterogeneous information as unified XML views. It describes the mediator architecture and focuses on the unique distributed query processing technology implemented in this component. Query evaluation is based on an original XML algebra that simply extends classical operators to process tuples of tree elements. Further, we present a set of performance evaluations on a relational benchmark, which leads to a discussion of possible performance enhancements.

Notes:

Tuyêt-Trâm Dang-Ngoc, Huaizhong Kou, Georges Gardarin (2003) Mediating the Web through XML concrete Views In: International Conference on DataBase and Applications (DBA) 268-273

Abstract: To cope with the difficulties of Web information search, many technologies related to Web search engines have been proposed and have seen very successful applications. Rather than proposing yet another general-purpose Web search engine, this paper couples text mining and XML view caching techniques within a Web mediation architecture and presents a prototype framework for topic-centric Web information search. Given a topic domain, domain-specific information is extracted from the Web documents belonging to the domain, then text-mining technologies are applied to discover the semantics contained in the Web information. Next, we integrate the extracted information into a domain-specific common concept model defined using semantic Web languages. Finally, an XML-based mediator allows users to query the integrated Web information using XQuery. Once Web information is represented in the concept model with an explicit semantic hierarchy understandable to programs, users' queries against specific fragments of Web documents can be carried out. One important part of our work aims at integrating XML view and cache techniques to manage Web information. Checksum technology is used to monitor updates to Web pages. A prototype centered on popular French sites in the finance domain is under construction.

Notes:

1999

Georges Gardarin, Fei Sha, Tuyêt-Trâm Dang-Ngoc (1999) XML-based Components for Federating Multiple Heterogeneous Data Sources In: International Conference on Conceptual Modeling / the Entity Relationship Approach (ER) 506-519 Paris, France:

Abstract: Several federated database systems have been built in the past using the relational or the object model as federating model. This paper gives an overview of the XMLMedia system, a federated database system mediator using XML as federating model, built in the Esprit Project MIRO-Web. The system is composed of four main components: a wrapper generator using rule-based scripting to produce XML data from various source formats, a mediator querying and integrating relational and XML sources, an XML DBMS extender supporting XML on top of relational DBMSs, and client tools including a Java API and an XML query browser. The results demonstrate the ability of XML with an associated query language (we use XML-QL) to federate various data sources on the Internet or on Intranets.

Notes:

Local workshop

1999

Technical reports

2007

Nicolas Travers, Tuyêt-Trâm Dang-Ngoc (2007) Canonization for Full Untyped-XQuery, Decreasing complexity of XQuery manipulation PRiSM Laboratory

Abstract:

Notes:

2002

Tuyêt-Trâm Dang-Ngoc, Georges Gardarin (2002) The XML Mediator e-XMLMedia SA / PRiSM Lab.

Abstract:

Notes:

2001

Tsin-Shu Yeh, Tuyêt-Trâm Dang-Ngoc (2001) Repository de méta-données (RNTL Specification SP-3) RNTL MUSE project

Abstract:

Notes:

1999

Tuyêt-Trâm Dang-Ngoc, Daniel Artal, Claude Campanaro, Peter Kirkham, Henri Laude, Alban Vuillier (1999) Integration Plan (ESPRIT-25208 Deliverable D3-1-2) OSIS/PRiSM ESPRIT MIROWEB Project

Abstract:

Notes:

1998

Tatiana Chan-Sine-Ying, Tuyêt-Trâm Dang-Ngoc, Danièla Florescu, Claude Campanaro, Peter Kirkham (1998) Message Manager Specification (ESPRIT-25208 Deliverable D5-1-1) OSIS/PRiSM/INRIA ESPRIT MIROWEB Project

Abstract:

Notes:

Tuyêt-Trâm Dang-Ngoc, Tatiana Chan-Sine-Ying, François Chéron, Georges Gardarin, Peter Kirkham, Henri Laude (1998) Browser Interface Specification (ESPRIT-25208 Deliverable D6-2-1) OSIS/PRiSM ESPRIT MIROWEB Project

Abstract:

Notes:

PhD theses

2003

Tuyêt-Trâm Dang-Ngoc (2003) Fédération de données semi-structurées avec XML Université de Versailles Saint-Quentin-en-Yvelines 45 avenue des Etats-Unis. 78000 Versailles. France:

Abstract: Unlike traditional data, semi-structured data are irregular: data may be missing, similar concepts may be represented by different data types, and the structures themselves may be poorly known. This absence of a predefined schema, which makes it possible to take into account all data from the outside world, has the drawback of complicating the algorithms for integrating data from different sources. We propose a mediation architecture based entirely on XML. The goal of this mediation architecture is to federate distributed data sources of different types. It relies on XQuery, a functional language designed for formulating queries over XML documents. The mediator analyzes queries expressed in XQuery and distributes the execution of the query across the different sources before recomposing the results. Query evaluation must exploit the specific features of the data as much as possible and allow efficient optimization. We describe the XAlgebre algebra, built on operators designed for XML. This algebra is intended to build execution plans for evaluating XQuery queries and to process tuples of XML trees. These execution plans must be amenable to modeling by a cost model, and the plan with minimum cost is selected for execution. In this thesis, we define a cost model for semi-structured data adapted to our algebra. Data sources (DBMSs, Web servers, search engines) can be very heterogeneous: they may have very different data processing capabilities, and also more or less well-defined cost models. To integrate this information into the mediation architecture, we must determine how to communicate it between the mediator and the sources, and how to integrate it. To this end, we use XML-based languages such as XML-Schema and MathML to export metadata, cost formulas, and source capabilities. The exported information is communicated through an application interface named XML/DBC. Finally, various optimizations specific to the mediation architecture must be considered. For this purpose, we introduce a semantic cache based on a prototype DBMS that efficiently stores XML data natively.

Notes: