hosted by
publicationslist.org
    
Y-h. Taguchi

tag@granular.com

Journal articles

2007
 
Y-h Taguchi, M Michael Gromiha (2007) Application of amino acid occurrence for discriminating different folding types of globular proteins. BMC Bioinformatics 8: 10
Abstract: BACKGROUND: Predicting the three-dimensional structure of a protein from its amino acid sequence is a long-standing goal in computational/molecular biology. Discriminating different structural classes and folding types is an intermediate step in protein structure prediction. RESULTS: In this work, we propose a method based on linear discriminant analysis (LDA) for discriminating 30 different folding types of globular proteins using amino acid occurrence. Our method was tested with a non-redundant set of 1612 proteins and discriminated them with an accuracy of 38%, which is comparable to or better than other methods in the literature. A web server has been developed for discriminating the folding type of a query protein from its amino acid sequence; it is available at http://granular.com/PROLDA/. CONCLUSION: Amino acid occurrence has been successfully used to discriminate different folding types of globular proteins. The discrimination accuracy obtained with amino acid occurrence is better than that obtained with amino acid composition and/or amino acid properties. In addition, the method produces results very quickly.
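The abstract contrasts amino acid occurrence with amino acid composition as input features. The following minimal sketch illustrates one plausible reading of that distinction, assuming that occurrence means the raw count of each of the 20 standard residues and composition means those counts divided by sequence length. The function names and the residue ordering are hypothetical and are not taken from the paper or the PROLDA server.

```python
from collections import Counter

# Assumed ordering of the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def occurrence(sequence):
    """Return the raw count of each standard residue (assumed meaning of 'occurrence')."""
    counts = Counter(sequence.upper())
    # Non-standard letters are counted by Counter but ignored here.
    return [counts[aa] for aa in AMINO_ACIDS]


def composition(sequence):
    """Return counts normalized by the number of standard residues."""
    occ = occurrence(sequence)
    total = sum(occ)
    if total == 0:
        return [0.0] * len(occ)
    return [n / total for n in occ]


if __name__ == "__main__":
    seq = "ACCA"  # toy sequence, not real protein data
    print(occurrence(seq))
    print(composition(seq))
```

Under this reading, the occurrence vector keeps information about sequence length that the composition vector discards. Either 20-dimensional vector could then be passed to a standard LDA classifier.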
2005
 
Y-h Taguchi, Y Oono (2005) Relational patterns of gene expression via non-metric multidimensional scaling analysis. Bioinformatics 21(6): 730-740, Mar
Abstract: MOTIVATION: Microarray experiments result in large-scale data sets that require extensive mining and refining to extract useful information. We demonstrate the usefulness of the (non-metric) multidimensional scaling (MDS) method in analyzing a large number of genes. Applying MDS to microarray data is certainly not new, but the existing works all analyze small numbers (< 100) of points. We have been developing an efficient novel algorithm for non-metric MDS (nMDS) analysis for very large data sets as a maximally unsupervised data mining device. We wish to demonstrate its usefulness in the context of bioinformatics (unraveling relational patterns among genes from time series data in this paper). RESULTS: The Pearson correlation coefficient with its sign flipped is used to measure the dissimilarity of the gene activities in the transcriptional response of cell-cycle-synchronized human fibroblasts to serum. These dissimilarity data have been analyzed with our nMDS algorithm to produce an almost circular relational pattern of the genes. The obtained pattern expresses a temporal order in the data in this example; the temporal expression pattern of the genes rotates along this circular arrangement and is related to the cell cycle. For the data we analyze in this paper we observe the following. If an appropriate preparation procedure is applied to the original data set, linear methods such as principal component analysis (PCA) can achieve reasonable results, but without data preprocessing linear methods such as PCA cannot achieve a useful picture. Furthermore, even with appropriate data preprocessing, the outcomes of linear procedures are not as clear-cut as those of nMDS without preprocessing.
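The dissimilarity measure named in the abstract, the Pearson correlation coefficient with its sign flipped, can be sketched as follows. This is only an illustration of that one step, not the authors' nMDS algorithm. The function names and the toy expression profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarity d(i, j) = -r(i, j), as described in the abstract."""
    return [[-pearson(p, q) for q in profiles] for p in profiles]


if __name__ == "__main__":
    # Toy time series for three hypothetical genes (not real microarray data).
    genes = [
        [1.0, 2.0, 3.0],
        [2.0, 4.0, 6.0],  # perfectly correlated with the first gene
        [3.0, 2.0, 1.0],  # perfectly anti-correlated with the first gene
    ]
    for row in dissimilarity_matrix(genes):
        print(row)
```

The resulting values range from -1 (most similar) to 1 (most dissimilar). Negative dissimilarities are not a problem for a non-metric method, which uses only the rank order of the dissimilarities.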
1993
1992
1990