Abstract: People taking part in argumentative debates through collective annotations face a cognitively demanding task when trying to estimate the group's global opinion. In order to reduce this effort, we propose in this paper to model such debates prior to evaluating their social validation. Computing the degree of global confirmation (or refutation) enables the identification of consensual (or controversial) debates. Readers as well as prominent information systems may thus benefit from this information. The accuracy of the social validation measure was tested through an online study conducted with 121 participants. We compared their human perception of consensus in argumentative debates with the results of the three proposed social validation algorithms. Their efficiency in synthesizing opinions was demonstrated by the fact that they achieved an accuracy of up to 84%.
Abstract: This chapter deals with an annotation-based decisional system. The decisional system we present is based on multidimensional databases, which are composed of facts and dimensions. The expertise of decision-makers is modelled, shared and stored through annotations. These annotations allow decision-makers to carry on active analysis and to collaborate with other decision-makers on a common analysis.
Abstract: Nowadays, the Web has become the most queried information source. To satisfy their information needs, individuals can use different types of tools or services, such as search engines. Due to the large amount of information and the diversity of human factors, searching for information requires patience, perseverance and sometimes luck. To help individuals during this task, search assistants feature adaptive techniques aiming at personalizing retrieved information. Moreover, thanks to the "new Web" (the Web 2.0), personal search assistants are evolving using social techniques (social networks, sharing-based methods). Let us enter the Social Web, where everyone collaborates with others by sharing their experience and expertise. This chapter introduces search assistants and underlines their evolution toward Social Information Search Assistants.
Abstract: In this work, we evaluate the importance of passages in blogs,
especially for the task of Opinion Detection. We argue that
passages are the basic building blocks of blogs. Therefore, we use
a Passage-Based Language Modeling approach for Opinion
Finding in Blogs. Our decision to use Language Modeling (LM) is
based on the performance LM has shown in various Opinion
Detection approaches. In addition, we propose a novel
method for bi-dimensional Query Expansion with relevant and
opinionated terms, using Wikipedia and a Relevance-Feedback
mechanism respectively. We also compare the impact of two
different query term weighting (and ranking) approaches on the
final results, as well as the performance of three
Passage-based document ranking functions (Linear, Avg, Max).
For evaluation purposes, we use the TREC Blog06 data collection
with the 50 topics of TREC 2006, over the best TREC-provided
baseline (baseline4), which has an opinion finding MAP of 0.3022.
Our approach gives a MAP improvement of almost 9.29% over this
baseline.
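The three passage-based ranking functions named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the functions, so the sketch assumes that Avg is the mean of a document's passage scores, Max is its highest passage score, and Linear interpolates a document-level score with the best passage score. The function names, the weight `alpha`, and the sample scores are all hypothetical.

```python
from statistics import mean


def score_avg(passage_scores):
    """Assumed 'Avg' function: mean of the passage scores of a document."""
    return mean(passage_scores)


def score_max(passage_scores):
    """Assumed 'Max' function: score of the best passage of a document."""
    return max(passage_scores)


def score_linear(doc_score, passage_scores, alpha=0.5):
    """Assumed 'Linear' function: interpolation between a document-level
    score and the best passage score. `alpha` is a hypothetical weight."""
    return alpha * doc_score + (1 - alpha) * max(passage_scores)


# Hypothetical opinion scores for the passages of one blog post.
passages = [0.2, 0.8, 0.5]
print(score_avg(passages), score_max(passages), score_linear(0.4, passages))
```

Under these assumptions, Max rewards a single strongly opinionated passage, whereas Avg penalizes documents whose opinionated content is diluted among neutral passages.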
Abstract: We consider Information Retrieval evaluation in the Trec framework with the trec_eval program. It appears that the scores IR systems obtain depend not only on the relevance of retrieved documents, but also on document names in case of ties, i.e., documents retrieved with the same score. We consider this tie-breaking strategy as an uncontrolled parameter influencing measure scores, and argue the case for fairer tie-breaking strategies. A study of 22 Trec editions reveals a significant difference between the conventional unfair trec_eval strategy and the fairer strategies that we propose. This experimental result advocates integrating these fairer strategies into trec_eval for conducting fairer experiments.
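The effect of tie-breaking on a measure score can be sketched as follows. This is an illustration under stated assumptions, not the trec_eval code nor necessarily the strategies proposed by the authors: the "conventional" ordering is assumed to break ties by descending document name, and the "optimistic" and "pessimistic" orderings (hypothetical names) place relevant documents first or last among tied documents. The run and relevance judgments are made up.

```python
def average_precision(ranking, relevant):
    """Average precision of a ranked list of document names."""
    hits, total = 0, 0.0
    for rank, docno in enumerate(ranking, start=1):
        if docno in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def rank_run(run, relevant, strategy):
    """Order (docno, score) pairs by descending score, breaking ties
    according to `strategy` (strategy names are hypothetical)."""
    # Baseline order among ties: descending document name (assumption).
    by_name = sorted(run, key=lambda d: d[0], reverse=True)
    if strategy == "conventional":
        tie_key = lambda d: 0
    elif strategy == "optimistic":      # relevant documents first among ties
        tie_key = lambda d: 0 if d[0] in relevant else 1
    elif strategy == "pessimistic":     # relevant documents last among ties
        tie_key = lambda d: 1 if d[0] in relevant else 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # sorted() is stable, so remaining ties keep the name-based order.
    ordered = sorted(by_name, key=lambda d: (-d[1], tie_key(d)))
    return [docno for docno, _ in ordered]


# Made-up run: three documents share the score 0.5, only docC is relevant.
run = [("docA", 0.9), ("docB", 0.5), ("docC", 0.5), ("docD", 0.5)]
relevant = {"docC"}
for strategy in ("conventional", "optimistic", "pessimistic"):
    ranking = rank_run(run, relevant, strategy)
    print(strategy, ranking, round(average_precision(ranking, relevant), 3))
```

The same retrieved documents and scores yield three different average precision values here, which is the kind of uncontrolled variation the abstract describes.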
Abstract: In this work, we propose a Passage-Based Language Modeling
(LM) approach for Opinion Finding in Blogs. Our decision to use
Language Modeling in this work is based on the
importance of passages in blog posts and the performance LM has
shown in various Opinion Detection approaches. In addition,
we propose a novel method for bi-dimensional Query
Expansion with relevant and opinionated terms, using Wikipedia
and a Relevance-Feedback mechanism respectively. We also
compare the performance of three Passage-based
document ranking functions (Linear, Avg, Max). For evaluation
purposes, we use the TREC Blog06 data collection with the 50
topics of TREC 2006, over the best TREC-provided baseline
(baseline4), which has an opinion finding MAP of 0.3022. Our
approach gives a MAP improvement of almost 9.29% over this
baseline.
Abstract: Common search engines process users' queries (i.e., information needs) by retrieving documents from pre-built term-based indexes. For digital libraries, such approaches are limited regarding particular contexts, such as specialized collections (e.g., cultural heritage collections) or specific retrieval criteria (e.g., multidimensional criteria). In this paper, we consider Information Retrieval systems exploiting geographic dimensions: spatial, temporal, and topical dimensions. Our contribution is twofold as we propose a Geographic Information Retrieval system evaluation framework and test the following hypothesis: combining spatial and temporal dimensions along with the topical dimension improves the effectiveness of Information Retrieval systems.
Abstract: Opinion detection from blogs has always been
a challenge for researchers. However, since the introduction of
the Blog track in TREC 2006, considerable improvement has
been seen in this field at the document level. Researchers are
now considering a shift from
opinion finding at the document level to opinion finding at the
sentence or passage level. In this paper, we investigate the
challenges researchers might face with sentence-level
opinion detection and illustrate them with a few
examples. Our work also includes the annotation of a small set of
opinionated sentences by two annotators, who
label each sentence as Positive or Negative. The
annotation results show that opinion detection at the
sentence level is more challenging than opinion detection
at the document level. In addition, we discuss the importance
of sentence-level opinion detection. Our work may suggest a new
direction for researchers to explore.
Abstract: Opinion Detection is one of the most interesting and challenging tasks in the field of Information Retrieval. A large body of research already exists in this area, including some distinctive work. A review of the literature reveals that researchers have been working at different levels of granularity, such as documents, passages, sentences and words, for the task of opinion detection. In this work, we revise our previous approach that combines document-level heuristics with a semantic similarity based method. We evaluate this semantic similarity approach on a large data collection using three different setups involving both sentences and passages, and then compare the performance of our approach across these setups. For evaluation purposes, we use the TREC Blog 2006 collection (148 GB) with the 50 topics of TREC Blog 2006, over a baseline obtained through the Terrier Information System Platform. Results show that our approach improves the baseline opinion MAP by 28.89%, 30.13% and 32.26% using setups one, two and three respectively.
Abstract: Information is nowadays a form of capital for any organization intending to be reactive and aware of its environment. Unfortunately, most modern organizations overdose on information, as almost every member daily accesses, extracts and stores a growing number of documents, i.e. vehicles for information. This situation is deteriorating further, as individual efforts to organize and search for information yield little from the organization's standpoint since diffusion mechanisms are limited. We propose a personal and collective information management architecture in order to take advantage of individual efforts, and to manage documents in a collective and sustainable way. This is based on the integration of the document lifecycle activities depicting the way people manage information and documents. The proposed architecture exploits individual efforts through interdependent processes designed on a mutual benefit scheme. These processes rely on the annotation practice, considered as representative evidence of the way individuals interact with information.
Abstract: We are entering the Web 2.0 era, where people's participation is a key principle. In this context, collective annotations enable readers to share and discuss their feedback on digital documents. The results of this activity are intended to be used in the Information Retrieval context, which already tends to harness similar collective contributions. In this paper, we propose a collective annotation model supporting feedback exchange through discussion threads. Based on this model, we associate annotations with a measure of the degree of consensus they spark (social validation), which provides a synthesized view of the associated discussions. Finally, we investigate how Information Retrieval systems may benefit from the proposed model, thus taking advantage of high-value human contributions, namely collective annotations.
Abstract: This paper deals with an annotation-based decisional system. The decisional system we present is based on multidimensional databases, which are composed of facts and dimensions. The expertise of decision-makers is modelled, shared and stored through annotations. These annotations allow decision-makers to make an active reading and to collaborate with other decision-makers about a common analysis project.
Abstract: Nowadays, organizational members manage the huge amount of digital documents that they use at work. To do so, they organize documents into individual hierarchies. These documents are truly part of a company's capital, as they reflect past experiences, present competences and impending expertise. Unfortunately, even though corporate documents represent high value-added material, they mostly remain unknown to the organization as a whole. This is why this paper proposes to build a unified view of corporate documents. Our approach is complementary to current content-based ones because it relies on an original metric related to document usage within an organization.
Abstract: In this paper, we present the purposes of annotation for personal as well as collective use. In the context of digital documents, annotation systems allow readers not only to annotate passages but also to comment on or reply to annotations within discussion threads (a feature similar to that of forums). Thus, readers can express their points of view, debate and even reach a consensus about annotated passages in the context of the documents. Moreover, thanks to this reply feature, people can respond to annotations in order to criticize them. Therefore, a reader can explore the thread to evaluate the reliability of the main annotation according to the reactions of previous readers. As this mental work can be painful in the presence of numerous replies and reply levels, we describe in this paper an algorithm that computes annotation reliability. Annotation reliability relies on a synthesis of the comments given for the annotation. It can be considered as a social validation of the annotation content. By using this reliability calculation, annotation systems can, for instance, highlight the most reliable annotations. This visual adaptation can relieve readers from mentally identifying trustworthy annotations, thereby reducing their cognitive overload.
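The idea of synthesizing a discussion thread into a single reliability value can be sketched as follows. This is not the algorithm described in the paper, whose details the abstract does not give. It is a hypothetical scheme that assumes each reply either confirms (+1) or refutes (-1) its parent, and that a reply's influence shrinks when its own replies refute it. The class `Annotation`, the function `validation`, and the weighting formula are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    # Stance toward the parent: +1 confirms, -1 refutes, 0 for the root.
    stance: int = 0
    replies: list = field(default_factory=list)


def validation(annotation):
    """Hypothetical social validation in [-1, 1]:
    -1 means fully refuted, +1 fully confirmed, 0 no or balanced feedback."""
    if not annotation.replies:
        return 0.0
    total = 0.0
    for reply in annotation.replies:
        # A reply without feedback counts fully; otherwise its weight in
        # [0, 1] grows with its own validation (assumed weighting).
        weight = (1 + validation(reply)) / 2 if reply.replies else 1.0
        total += reply.stance * weight
    return total / len(annotation.replies)


# One confirming reply, and one refuting reply that was itself refuted.
root = Annotation(replies=[
    Annotation(stance=+1),
    Annotation(stance=-1, replies=[Annotation(stance=-1)]),
])
print(validation(root))
```

In this example the refuting reply loses its influence because it was itself refuted, so the root annotation ends up with a positive value. An annotation system could use such a value to decide which annotations to highlight.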