
Norberto Díaz-Díaz

Escuela Politécnica Superior
Universidad Pablo de Olavide
Ctra. Utrera Km 1
41013 Sevilla
SPAIN
ndiaz@upo.es

Journal articles

2011
Jesus S Aguilar-Ruiz, Domingo S Rodriguez-Baena, Norberto Diaz-Diaz, Isabel A Nepomuceno-Chamorro (2011)  CarGene: characterisation of sets of genes based on metabolic pathways analysis.   Int J Data Min Bioinform 5: 5. 558-573  
Abstract: The great amount of biological information provides scientists with an incomparable framework for testing the results of new algorithms. Several tools have been developed for analysing gene enrichment, and most of them are Gene Ontology-based. We developed a Kyoto Encyclopedia of Genes and Genomes (KEGG)-based tool that provides a friendly graphical environment for analysing gene enrichment. The tool integrates two statistical corrections and simultaneously analyses the information about many groups of genes in both visual and textual form. We tested the usefulness of our approach on a previous analysis (Huttenhower et al.). Furthermore, our tool is freely available (http://www.upo.es/eps/bigs/cargene.html).
Norberto Díaz-Díaz, Jesús S Aguilar-Ruiz (2011)  GO-based functional dissimilarity of gene sets.   BMC Bioinformatics 12: 09  
Abstract: The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, identifying the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and a quantification of functional coherence can help to clarify the role of a group of genes working together.

Conference papers

2011
Norberto Diaz-Diaz, Francisco Gomez-Vela, Domingo S Rodriguez-Baena, Jesus Aguilar-Ruiz (2011)  Gene Regulatory Networks Validation Framework Based in KEGG   In: HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, PART II Edited by: E Corchado, M Kurzynski, M Wozniak. 279-286 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: In the last few years, DNA microarray technology has attained a very important role in biological and biomedical research. It enables the relations among thousands of genes to be analysed simultaneously, generating huge amounts of data. Gene regulatory networks represent, in a graph data structure, genes or gene products and the functional relationships between them. These models have been widely used in bioinformatics because they provide an easy way to understand gene expression regulation. Many gene regulatory network algorithms have now been developed as knowledge extraction techniques. A very important task in all these studies is to ensure the reliability of the network models in order to prove that the methods used are precise. This validation process can be carried out by using the inherent information of the input data or by using external biological knowledge. In the latter case, these sources of information provide a great opportunity to verify the biological soundness of the generated networks. In this work, the authors present a gene regulatory network validation framework. The proposed approach consists of identifying the biological knowledge included in the input network. To do so, the biochemical pathway information stored in the KEGG database is used.
Notes: 6th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2011), Wroclaw Univ Technol, Wroclaw, POLAND, MAY 23-25, 2011
2007
Isabel Nepomuceno-Chamorro, Jesus S Aguilar-Ruiz, Norberto Diaz-Diaz, Domingo S Rodriguez-Baena, Jorge Garcia (2007)  A deterministic model to infer gene networks from microarray data   In: INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2007 Edited by:H Yin, P Tino, E Corchado, W Byrne, X Yao. 850-859 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: Microarray experiments help researchers to construct the structure of gene regulatory networks, i.e., networks representing relationships among different genes. Filter and knowledge extraction processes are necessary in order to handle the huge amount of data produced by microarray technologies. We propose regression tree techniques as a method to identify gene networks. Regression trees are a very useful technique for estimating the numerical values of the target outputs. They are very often more precise than linear regression models because they can fit different linear regressions to separate areas of the search space. In our approach, we generate a single regression tree for each gene in a set of genes, taking the remaining genes as input, and finally build a graph from all the relationships among output and input genes. In this paper, we simplify the approach by using a single seed, the gene ARN1, and building the graph around it. The final model might give some clues for understanding the dynamics, the regulation or the topology of the gene network from one (or several) seeds, since it gathers relevant genes with accurate connections. The performance of our approach is experimentally tested on the yeast Saccharomyces cerevisiae dataset (Rosetta compendium).
Notes: 8th International Conference on Intelligent Data Engineering and Automated Learning, Birmingham, ENGLAND, DEC 16-19, 2007
2006
Jesus S Aguilar-Ruiz, Juan A Nepomuceno, Norberto Diaz-Diaz, Isabel Nepomuceno (2006)  A measure for data set editing by ordered projections   In: ADVANCES IN APPLIED ARTIFICIAL INTELLIGENCE, PROCEEDINGS Edited by: M Ali, R Dapoigny. 1339-1348 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: In this paper we study a measure, named the weakness of an example, which allows us to establish the importance of an example in finding representative patterns for the data set editing problem. Our approach consists of reducing the database size without losing information, using the algorithm patterns by ordered projections. The idea is to relax the reduction factor with a new parameter, A, removing all examples of the database whose weakness satisfies a condition on A. We study how to establish this new parameter. Our experiments have been carried out using all databases from the UCI Repository, and they show that a size reduction is possible in complex databases without a notable increase in the error rate.
Notes: 19th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, Annecy, FRANCE, JUN 27-30, 2006
Norberto Diaz-Diaz, Domingo S Rodriguez-Baena, Isabel Nepomuceno, Jesus S Aguilar-Ruiz (2006)  Neighborhood-based clustering of gene-gene interactions   In: INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2006, PROCEEDINGS Edited by:E Corchado, H Yin, V Botti, C Fyfe. 1111-1120 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: In this work, we propose a new greedy clustering algorithm to identify groups of related genes. Clustering algorithms analyze genes in order to group those with similar behavior. Instead, our approach groups pairs of genes that present similar positive and/or negative interactions. Our approach has some interesting properties. For instance, the user can specify how the range of each gene is going to be segmented (labels). Some of these will mean expressed or inhibited (depending on the gradation). From all the label combinations, a function transforms each pair of labels into another one that identifies the type of interaction. From these pairs of genes and their interactions we build clusters in a greedy, iterative fashion, as two pairs of genes will be similar if they have the same number of relevant interactions. Initial two-gene clusters grow iteratively based on their neighborhood until the set of clusters does not change. The algorithm allows the researcher to modify all the criteria (discretization mapping function, gene-gene mapping function and filtering function) and provides considerable flexibility to obtain clusters based on the level of precision needed. The performance of our approach is experimentally tested on the yeast dataset. The final number of clusters is low and the genes within them show a significant level of cohesion, as shown graphically in the experiments.
Notes: 7th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2006), Univ Burgos, Burgos, SPAIN, SEP 20-23, 2006
2005
R Giraldez, N Diaz-Diaz, I Nepomuceno, J S Aguilar-Ruiz (2005)  An approach to reduce the cost of evaluation in evolutionary learning   In: COMPUTATIONAL INTELLIGENCE AND BIOINSPIRED SYSTEMS, PROCEEDINGS Edited by:J Cabestany, A Prieto, F Sandoval. 804-811 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: Supervised learning methods that apply evolutionary algorithms to generate knowledge models are extremely costly in time and space. This high computational cost is fundamentally due to the evaluation process, which needs to go through the whole dataset to assess the goodness of the genetic individuals. Often, this process carries out some redundant operations which can be avoided. In this paper, we present an example reduction method to reduce the computational cost of evolutionary learning algorithms by extracting, storing and processing only the useful information in the evaluation process.
Notes: 8th International Work-Conference on Artificial Neural Networks, Barcelona, SPAIN, JUN 08-10, 2005
Norberto Diaz-Diaz, Jesus S Aguilar-Ruiz, Juan A Nepomuceno, Jorge Garcia (2005)  Feature selection based on bootstrapping   In: 2005 ICSC Congress on Computational Intelligence Methods and Applications (CIMA 2005) 217-222 345 E 47TH ST, NEW YORK, NY 10017 USA: IEEE  
Abstract: The results of feature selection methods have a great influence on the success of data mining processes, especially when the data sets have high dimensionality. In order to find the optimal result from feature selection methods, we should check each possible subset of features to obtain the precision on classification, i.e., an exhaustive search through the search space. However, this is infeasible due to its computational complexity. In this paper we propose a novel method of feature selection based on bootstrapping techniques. Our approach shows that it is not necessary to try every subset of features, but only a very small subset of combinations, to achieve the same performance as the exhaustive approach. The experiments have been carried out using very high-dimensional datasets (thousands of features) and they show that it is possible to maintain precision while substantially reducing complexity.
Notes: ICSC Congress on Computational Intelligence Methods and Applications, Istanbul, TURKEY, DEC 15-17, 2005
R Ruiz, J S Aguilar-Ruiz, J C Riquelme, N Diaz-Diaz (2005)  Analysis of feature rankings for classification   In: ADVANCES IN INTELLIGENT DATA ANALYSIS VI, PROCEEDINGS Edited by:A F Famili, J N Kok, J M Pena, A Siebes, A Feelders. 362-372 HEIDELBERGER PLATZ 3, D-14197 BERLIN, GERMANY: SPRINGER-VERLAG BERLIN  
Abstract: Different ways of contrasting the rankings generated by feature selection algorithms are presented in this paper, showing several possible interpretations, depending on the approach taken in each study. We start from the premise that no single ideal subset exists for all cases. The purpose of these kinds of algorithms is to reduce the data set to its first attributes without losing predictive power relative to the original data set. In this paper we propose a method, feature-ranking performance, to compare different feature-ranking methods, based on the Area Under Feature Ranking Classification Performance Curve (AURC). The conclusions and trends drawn from this paper provide support for the performance of learning tasks in which some of the ranking algorithms studied here operate.
Notes: 6th International Symposium on Intelligent Data Analysis, Madrid, SPAIN, SEP 08-10, 2005