
Albin Sandelin


albin@binf.ku.dk

Journal articles

2011
S F Schmidt, M Jorgensen, Y Chen, R Nielsen, A Sandelin*, S Mandrup* (2011)  Cross species comparison of C/EBPalpha and PPARgamma profiles in mouse and human adipocytes reveals interdependent retention of binding sites   BMC Genomics 12: 152
Abstract: BACKGROUND: The transcription factors peroxisome proliferator activated receptor gamma (PPARgamma) and CCAAT/enhancer binding protein alpha (C/EBPalpha) are key transcriptional regulators of adipocyte differentiation and function. We and others have previously shown that binding sites of these two transcription factors show a high degree of overlap and are associated with the majority of genes upregulated during differentiation of murine 3T3-L1 adipocytes. RESULTS: Here we have mapped all binding sites of C/EBPalpha and PPARgamma in human SGBS adipocytes and compared these with the genome-wide profiles from mouse adipocytes to systematically investigate what biological features correlate with retention of sites in orthologous regions between mouse and human. Despite a limited interspecies retention of binding sites, several biological features make sites more likely to be retained. First, co-binding of PPARgamma and C/EBPalpha in mouse is the most powerful predictor of retention of the corresponding binding sites in human. Second, vicinity to genes highly upregulated during adipogenesis significantly increases retention. Third, the presence of C/EBPalpha consensus sites correlates with retention of both factors, indicating that C/EBPalpha facilitates recruitment of PPARgamma. Fourth, retention correlates with overall sequence conservation within the binding regions independent of C/EBPalpha and PPARgamma sequence patterns, indicating that other transcription factors work cooperatively with these two key transcription factors. CONCLUSIONS: This study provides a comprehensive and systematic analysis of what biological features impact on retention of binding sites between human and mouse. Specifically, we show that the binding of C/EBPalpha and PPARgamma in adipocytes has evolved in a highly interdependent manner, indicating a significant cooperativity between these two transcription factors.
Notes: Schmidt, Soren F; Jorgensen, Mette; Chen, Yun; Nielsen, Ronni; Sandelin, Albin; Mandrup, Susanne; England; BMC genomics; BMC Genomics. 2011 Mar 16;12:152.
2010
L Carstensen, A Sandelin, O Winther, N R Hansen (2010)  Multivariate Hawkes process models of the occurrence of regulatory elements   BMC Bioinformatics 11: 1. 456
Abstract: BACKGROUND: A central question in molecular biology is how transcriptional regulatory elements (TREs) act in combination. Recent high-throughput data provide us with the location of multiple regulatory regions for multiple regulators, and thus with the possibility of analyzing the multivariate distribution of the occurrences of these TREs along the genome. RESULTS: We present a model of TRE occurrences known as the Hawkes process. We illustrate the use of this model by analyzing two different publicly available data sets. We are able to model, in detail, how the occurrence of one TRE is affected by the occurrences of others, and we can test a range of natural hypotheses about the dependencies among the TRE occurrences. In contrast to earlier efforts, pre-processing steps such as clustering or binning are not needed, and we thus retain information about the dependencies among the TREs that is otherwise lost. For each of the two data sets we provide two results: first, a qualitative description of the dependencies among the occurrences of the TREs, and second, quantitative results on the favored or avoided distances between the different TREs. CONCLUSIONS: The Hawkes process is a novel way of modeling the joint occurrences of multiple TREs along the genome that is capable of providing new insights into dependencies among elements involved in transcriptional regulation. The method is available as an R package from http://www.math.ku.dk/~richard/ppstat/.
Notes: Journal article; BMC bioinformatics; BMC Bioinformatics. 2010 Sep 9;11(1):456.
T T Marstrand, R Borup, A Willer, N Borregaard, A Sandelin, B T Porse, K Theilgaard-Monch (2010)  A conceptual framework for the identification of candidate drugs and drug targets in acute promyelocytic leukemia   Leukemia 24: 7. 1265-75  
Abstract: Chromosomal translocations of transcription factors generating fusion proteins with aberrant transcriptional activity are common in acute leukemia. In acute promyelocytic leukemia (APL), the promyelocytic leukemia-retinoic-acid receptor alpha (PML-RARA) fusion protein, which emerges as a consequence of the t(15;17) translocation, acts as a transcriptional repressor that blocks neutrophil differentiation at the promyelocyte (PM) stage. In this study, we used publicly available microarray data sets and identified signatures of genes dysregulated in APL by comparison of gene expression profiles of APL cells and normal PMs representing the same stage of differentiation. We next subjected our identified APL signatures of dysregulated genes to a series of computational analyses leading to (i) the finding that APL cells show stem cell properties with respect to gene expression and transcriptional regulation, and (ii) the identification of candidate drugs and drug targets for therapeutic interventions. Significantly, our study provides a conceptual framework that can be applied to any subtype of AML and cancer in general to uncover novel information from published microarray data sets at low cost. In a broader perspective, our study provides strong evidence that genomic strategies might be used in a clinical setting to prospectively identify candidate drugs that subsequently are validated in vitro to define the most effective drug combination for individual cancer patients on a rational basis.
Notes: Marstrand, T T; Borup, R; Willer, A; Borregaard, N; Sandelin, A; Porse, B T; Theilgaard-Monch, K; Research Support, Non-U.S. Gov't; England; Leukemia : official journal of the Leukemia Society of America, Leukemia Research Fund, U.K; Leukemia. 2010 Jul;24(7):1265-75. Epub 2010 May 27.
D Motti, C Le Duigou, E Eugene, N Chemaly, L Wittner, D Lazarevic, H Krmac, T Marstrand, E Valen, R Sanges, E Stupka, A Sandelin, E Cherubini, S Gustincich, R Miles (2010)  Gene expression analysis of the emergence of epileptiform activity after focal injection of kainic acid into mouse hippocampus   Eur J Neurosci 32: 8. 1364-79  
Abstract: We report gene profiling data on genomic processes underlying the progression towards recurrent seizures after injection of kainic acid (KA) into the mouse hippocampus. Focal injection enabled us to separate the effects of proepileptic stimuli initiated by KA injection. Both the injected and contralateral hippocampus participated in the status epilepticus. However, neuronal death induced by KA treatment was restricted to the injected hippocampus, although there was some contralateral axonal degeneration. We profiled gene expression changes in dorsal and ventral regions of both the injected and contralateral hippocampus. Changes were detected in the expression of 1526 transcripts in samples from three time-points: (i) during the KA-induced status epilepticus, (ii) at 2 weeks, before recurrent seizures emerged, and (iii) at 6 months after seizures emerged. Grouping genes with similar spatio-temporal changes revealed an early transcriptional response, strong immune, cell death and growth responses at 2 weeks and an activation of immune and extracellular matrix genes persisting at 6 months. Immunostaining for proteins coded by genes identified from array studies provided evidence for gliogenesis and suggested that the proteoglycan biglycan is synthesized by astrocytes and contributes to a glial scar. Gene changes at 6 months after KA injection were largely restricted to tissue from the injection site. This suggests that either recurrent seizures might depend on maintained processes including immune responses and changes in extracellular matrix proteins near the injection site or alternatively might result from processes, such as growth, distant from the injection site and terminated while seizures are maintained.
Notes: Motti, Dario; Le Duigou, Caroline; Eugene, Emmanuel; Chemaly, Nicole; Wittner, Lucia; Lazarevic, Dejan; Krmac, Helena; Marstrand, Troels; Valen, Eivind; Sanges, Remo; Stupka, Elia; Sandelin, Albin; Cherubini, Enrico; Gustincich, Stefano; Miles, Richard; France; The European journal of neuroscience; Eur J Neurosci. 2010 Oct;32(8):1364-79. doi: 10.1111/j.1460-9568.2010.07403.x.
J Ryge, O Winther, J Wienecke, A Sandelin, A C Westerdahl, H Hultborn, O Kiehn (2010)  Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury   BMC Genomics 11: 365
Abstract: BACKGROUND: Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. RESULTS: Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. CONCLUSIONS: This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. 
Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be used to alter the transcriptional response to prevent the motor neurons from entering a state of hyper-excitability.
Notes: Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole; England; BMC genomics; BMC Genomics. 2010 Jun 9;11:365.
E Portales-Casamar, S Thongjuea, A T Kwon, D Arenillas, X Zhao, E Valen, D Yusuf, B Lenhard*, W W Wasserman*, A Sandelin* (2010)  JASPAR 2010 : the greatly expanded open-access database of transcription factor binding profiles   Nucleic Acids Res 38: Database issue. D105-10  
Abstract: JASPAR (http://jaspar.genereg.net) is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors (TFs) and other proteins interacting with DNA in a sequence-specific manner. Its fourth major release is the largest expansion of the core database to date: the database now holds 457 non-redundant, curated profiles. The new entries include the first batch of profiles derived from ChIP-seq and ChIP-chip whole-genome binding experiments, and 177 yeast TF binding profiles. The introduction of a yeast division brings the convenience of JASPAR to an active research community. As binding models are refined by newer data, the JASPAR database now uses versioning of matrices: in this release, 12% of the older models were updated to improved versions. Classification of TF families has been improved by adopting a new DNA-binding domain nomenclature. A curated catalog of mammalian TFs is provided, extending the use of the JASPAR profiles to additional TFs belonging to the same structural family. The changes in the database set the system ready for more rapid acquisition of new high-throughput data sources. Additionally, three new special collections provide matrix profile data produced by recent alternative high-throughput approaches.
Notes: Portales-Casamar, Elodie; Thongjuea, Supat; Kwon, Andrew T; Arenillas, David; Zhao, Xiaobei; Valen, Eivind; Yusuf, Dimas; Lenhard, Boris; Wasserman, Wyeth W; Sandelin, Albin; Research Support, Non-U.S. Gov't; England; Nucleic acids research; Nucleic Acids Res. 2010 Jan;38(Database issue):D105-10. Epub 2009 Nov 11.
A Pansoy, S Ahmed, E Valen, A Sandelin, J Matthews (2010)  3-methylcholanthrene induces differential recruitment of aryl hydrocarbon receptor to human promoters   Toxicol Sci  
Abstract: The aryl hydrocarbon receptor (AHR) is a ligand-activated protein that mediates the toxic actions of polycyclic aromatic and halogenated compounds. Identifying genes directly regulated by AHR is important in understanding the pathways regulated by this receptor. Here we used chromatin immunoprecipitation and promoter focused microarrays (ChIP-chip) to detect AHR bound genomic regions after 3-methylcholanthrene (3MC) treatment of T-47D human breast cancer cells. We identified 241 AHR-3MC bound regions and transcription factor binding site analysis revealed a strong over-representation of the AHR responsive element. Conventional ChIP confirmed recruitment of AHR to 26 regions with target gene responses to 3MC varying from activation to inhibition to having no effect. A comparison of identified AHR-3MC bound regions with AHR-TCDD bound regions from our previous study (Ahmed, S., Valen, E., Sandelin, A. & Matthews, J. 2009 Toxicol Sci, 111, 254-266) revealed that 127 regions were common between the data sets. Time course ChIPs for six of the regions showed that 3MC induced gene-specific changes in histone H3 acetylation and methylation, and induced differential oscillatory binding of AHR, with a periodicity of 1.5 to 2 h. Re-treatment of cells with 3MC failed to alter the oscillatory binding profiles of AHR or ARNT. Cells became responsive to 3MC but not TCDD after 24 h of exposure to 3MC, highlighting important differences in AHR responsiveness between the two ligands. Our results reveal a number of novel AHR-bound promoter regions and target genes that exhibit differential kinetic binding profiles and regulation by AHR.
Notes: Journal article; Toxicological sciences : an official journal of the Society of Toxicology; Toxicol Sci. 2010 Apr 8.
2009
S Ahmed, E Valen, A Sandelin, J Matthews (2009)  Dioxin increases the interaction between aryl hydrocarbon receptor and estrogen receptor alpha at human promoters   Toxicol Sci 111: 2. 254-66  
Abstract: Recent studies have shown that activated aryl hydrocarbon receptor (AHR) induced the recruitment of estrogen receptor-alpha (ERalpha) to AHR-regulated genes and that AHR is recruited to ERalpha-regulated genes. However, these findings were limited to a small number of well-characterized AHR- or ERalpha-responsive genes with little knowledge of what was occurring at other genomic regions. In this study, we showed using chromatin immunoprecipitation followed by hybridization to promoter focused microarrays (ChIP-chip) that 2,3,7,8-tetrachlorodibenzo-p-dioxin treatment significantly increased the overlap of genomic regions bound by both AHR and ERalpha. Conventional and sequential ChIPs confirmed the recruitment of AHR and ERalpha to many of the identified regions. Transcription factor binding site analysis revealed an overrepresentation of aryl hydrocarbon receptor response elements in regions bound by both AHR and ERalpha, suggesting that AHR was the important factor determining the recruitment of ERalpha to these regions. RNA interference-mediated knockdown of AHR confirmed its requirement for the recruitment of ERalpha to some, but not all, of the shared regions. Our findings demonstrate not only that dioxin induces the recruitment of ERalpha to AHR target genes but also that AHR is recruited to estrogen-responsive regions in a gene-specific manner, suggesting that AHR utilizes both of these mechanisms to modulate estrogen-dependent signaling.
Notes: Ahmed, Shaimaa; Valen, Eivind; Sandelin, Albin; Matthews, Jason; Research Support, Non-U.S. Gov't; United States; Toxicological sciences : an official journal of the Society of Toxicology; Toxicol Sci. 2009 Oct;111(2):254-66. Epub 2009 Jul 2.
H Suzuki, A R Forrest, E van Nimwegen, C O Daub, P J Balwierz, K M Irvine, T Lassmann, T Ravasi, Y Hasegawa, M J de Hoon, S Katayama, K Schroder, P Carninci, Y Tomaru, M Kanamori-Katayama, A Kubosaki, A Akalin, Y Ando, E Arner, M Asada, H Asahara, T Bailey, V B Bajic, D Bauer, A G Beckhouse, N Bertin, J Bjorkegren, F Brombacher, E Bulger, A M Chalk, J Chiba, N Cloonan, A Dawe, J Dostie, P G Engstrom, M Essack, G J Faulkner, J L Fink, D Fredman, K Fujimori, M Furuno, T Gojobori, J Gough, S M Grimmond, M Gustafsson, M Hashimoto, T Hashimoto, M Hatakeyama, S Heinzel, W Hide, O Hofmann, M Hornquist, L Huminiecki, K Ikeo, N Imamoto, S Inoue, Y Inoue, R Ishihara, T Iwayanagi, A Jacobsen, M Kaur, H Kawaji, M C Kerr, R Kimura, S Kimura, Y Kimura, H Kitano, H Koga, T Kojima, S Kondo, T Konno, A Krogh, A Kruger, A Kumar, B Lenhard, A Lennartsson, M Lindow, M Lizio, C Macpherson, N Maeda, C A Maher, M Maqungo, J Mar, N A Matigian, H Matsuda, J S Mattick, S Meier, S Miyamoto, E Miyamoto-Sato, K Nakabayashi, Y Nakachi, M Nakano, S Nygaard, T Okayama, Y Okazaki, H Okuda-Yabukami, V Orlando, J Otomo, M Pachkov, N Petrovsky, C Plessy, J Quackenbush, A Radovanovic, M Rehli, R Saito, A Sandelin, S Schmeier, C Schonbach, A S Schwartz, C A Semple, M Sera, J Severin, K Shirahige, C Simons, G St Laurent, M Suzuki, T Suzuki, M J Sweet, R J Taft, S Takeda, Y Takenaka, K Tan, M S Taylor, R D Teasdale, J Tegner, S Teichmann, E Valen, C Wahlestedt, K Waki, A Waterhouse, C A Wells, O Winther, L Wu, K Yamaguchi, H Yanagawa, J Yasuda, M Zavolan, D A Hume, T Arakawa, S Fukuda, K Imamura, C Kai, A Kaiho, T Kawashima, C Kawazu, Y Kitazume, M Kojima, H Miura, K Murakami, M Murata, N Ninomiya, H Nishiyori, S Noma, C Ogawa, T Sano, C Simon, M Tagami, Y Takahashi, J Kawai, Y Hayashizaki (2009)  The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line   Nat Genet 41: 5. 553-62  
Abstract: Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.
Notes: FANTOM Consortium; Riken Omics Science Center; Journal Article; Research Support, Non-U.S. Gov't; United States
E Valen, G Pascarella, A Chalk, N Maeda, M Kojima, C Kawazu, M Murata, H Nishiyori, D Lazarevic, D Motti, T T Marstrand, M H Tang, X Zhao, A Krogh, O Winther, T Arakawa, J Kawai, C Wells, C Daub, M Harbers, Y Hayashizaki, S Gustincich, A Sandelin*, P Carninci* (2009)  Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE   Genome Res 19: 2. 255-65  
Abstract: Finding and characterizing mRNAs, their transcription start sites (TSS), and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5-10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present a new method for high-throughput sequencing of 5' cDNA tags, DeepCAGE: merging the Cap Analysis of Gene Expression method with ultra-high-throughput sequence technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters that are preferentially used in hippocampus; this is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.
Notes: Journal Article; Research Support, Non-U.S. Gov't; Validation Studies; United States
E Valen, A Sandelin, O Winther, A Krogh (2009)  Discovery of regulatory elements is improved by a discriminatory approach   PLoS Comput Biol 5: 11. e1000562
Abstract: A major goal in post-genome biology is the complete mapping of the gene regulatory networks for every organism. Identification of regulatory elements is a prerequisite for realizing this ambitious goal. A common problem is finding regulatory patterns in promoters of a group of co-expressed genes, but contemporary methods are challenged by the size and diversity of regulatory regions in higher metazoans. Two key issues are the small amount of information contained in a pattern compared to the large promoter regions and the repetitive characteristics of genomic DNA, which both lead to "pattern drowning". We present a new computational method for identifying transcription factor binding sites in promoters using a discriminatory approach with a large negative set encompassing a significant sample of the promoters from the relevant genome. The sequences are described by a probabilistic model and the most discriminatory motifs are identified by maximizing the probability of the sets given the motif model and prior probabilities of motif occurrences in both sets. Due to the large number of promoters in the negative set, an enhanced suffix array is used to improve speed and performance. Using our method, we demonstrate higher accuracy than the best of contemporary methods, high robustness when extending the length of the input sequences and a strong correlation between our objective function and the correct solution. Using a large background set of real promoters instead of a simplified model leads to higher discriminatory power and markedly reduces the need for repeat masking; a common pre-processing step for other pattern finders.
Notes: Valen, Eivind; Sandelin, Albin; Winther, Ole; Krogh, Anders; Research Support, Non-U.S. Gov't; United States; PLoS computational biology; PLoS Comput Biol. 2009 Nov;5(11):e1000562. Epub 2009 Nov 13.
2008
H Gao, S Falt, A Sandelin, J A Gustafsson, K Dahlman-Wright (2008)  Genome-Wide Identification of Estrogen Receptor alpha-Binding Sites in Mouse Liver   Mol Endocrinol 22: 1. 10-22  
Abstract: We report the genome-wide identification of estrogen receptor alpha (ERalpha)-binding regions in mouse liver using a combination of chromatin immunoprecipitation and tiled microarrays that cover all nonrepetitive sequences in the mouse genome. This analysis identified 5568 ERalpha-binding regions. In agreement with what has previously been reported for human cell lines, many ERalpha-binding regions are located far away from transcription start sites; approximately 40% of ERalpha-binding regions are located within 10 kb of annotated transcription start sites. Almost 50% of ERalpha-binding regions overlap genes. The majority of ERalpha-binding regions lie in regions that are evolutionarily conserved between human and mouse. Motif-finding algorithms identified the estrogen response element, and variants thereof, together with binding sites for activator protein 1, basic-helix-loop-helix proteins, ETS proteins, and Forkhead proteins as the most common motifs present in identified ERalpha-binding regions. To correlate ERalpha binding to the promoter of specific genes, with changes in expression levels of the corresponding mRNAs, expression levels of selected mRNAs were assayed in livers 2, 4, and 6 h after treatment with ERalpha-selective agonist propyl pyrazole triol. Five of these eight selected genes, Shp, Stat3, Pdgds, Pck1, and Pdk4, all responded to propyl pyrazole triol after 4 h treatment. These results extend our previous studies using gene expression profiling to characterize estrogen signaling in mouse liver, by characterizing the first step in this signaling cascade, the binding of ERalpha to DNA in intact chromatin.
Notes: Journal Article; United States
M C Frith, E Valen, A Krogh, Y Hayashizaki, P Carninci, A Sandelin* (2008)  A code for transcription initiation in mammalian genomes   Genome Res 18: 1. 1-12  
Abstract: Genome-wide detection of transcription start sites (TSSs) has revealed that RNA Polymerase II transcription initiates at millions of positions in mammalian genomes. Most core promoters do not have a single TSS, but an array of closely located TSSs with different rates of initiation. As a rule, genes have more than one such core promoter; however, defining the boundaries between core promoters is not trivial. These discoveries prompt a re-evaluation of our models for transcription initiation. We describe a new framework for understanding the organization of transcription initiation. We show that initiation events are clustered on the chromosomes at multiple scales (clusters within clusters), indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each nucleotide with a remarkable 91% accuracy, implying the existence of a DNA code that determines TSS selection. Conversely, the total expression strength of such clusters is only partially determined by the local DNA sequence. Thus, the overall control of transcription can be understood as a combination of large- and small-scale effects; the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling.
Notes: Journal Article; United States
J C Bryne, E Valen, M H Tang, T Marstrand, O Winther, I da Piedade, A Krogh, B Lenhard*, A Sandelin* (2008)  JASPAR, the open access database of transcription factor-binding profiles : new content and tools in the 2008 update   Nucleic Acids Res 36: Database issue. D102-6  
Abstract: JASPAR is a popular open-access database for matrix models describing DNA-binding preferences for transcription factors and other DNA patterns. With its third major release, JASPAR has been expanded and equipped with additional functions aimed at both casual and power users. The heart of the JASPAR database, the JASPAR CORE sub-database, has increased by 12% in size, and three new specialized sub-databases have been added. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models and a language-independent Web Service applications programming interface for matrix retrieval. JASPAR is available at http://jaspar.genereg.net.
Notes: Journal Article; Research Support, Non-U.S. Gov't; England
Yawen Liu, Hui Gao, Troels Torben Marstrand, Anders Ström, Eivind Valen, Albin Sandelin*, Jan-Ake Gustafsson*, Karin Dahlman-Wright (2008)  The genome landscape of ERalpha- and ERbeta-binding DNA regions.   Proceedings of the National Academy of Sciences of the United States of America 105: 7. 2604-2609  
Abstract: In this article, we have applied the ChIP-on-chip approach to pursue a large scale identification of ERalpha- and ERbeta-binding DNA regions in intact chromatin. We show that there is a high degree of overlap between the regions identified as bound by ERalpha and ERbeta, respectively, but there are also regions that are bound by ERalpha only in the presence of ERbeta, as well as regions that are selectively bound by either receptor. Analysis of bound regions shows that regions bound by ERalpha have distinct properties in terms of genome landscape, sequence features, and conservation compared with regions that are bound by ERbeta. ERbeta-bound regions are, as a group, located more closely to transcription start sites. ERalpha- and ERbeta-bound regions differ in sequence properties, with ERalpha-bound regions having an overrepresentation of TA-rich motifs including forkhead binding sites and ERbeta-bound regions having a predominance of classical estrogen response elements (EREs) and GC-rich motifs. Differences in the properties of ER bound regions might explain some of the differences in gene expression programs and physiological effects shown by the respective estrogen receptors.
K J Won, A Sandelin, T T Marstrand, A Krogh (2008)  Modeling promoter grammars with evolving hidden Markov models   Bioinformatics 24: 15. 1669-75  
Abstract: MOTIVATION: Describing and modeling biological features of eukaryotic promoters remains an important and challenging problem within computational biology. The promoters of higher eukaryotes in particular display a wide variation in regulatory features, which are difficult to model. Often several factors are involved in the regulation of a set of co-regulated genes. If so, promoters can be modeled with connected regulatory features, where the network of connections is characteristic for a particular mode of regulation. RESULTS: With the goal of automatically deciphering such regulatory structures, we present a method that iteratively evolves an ensemble of regulatory grammars using a hidden Markov Model (HMM) architecture composed of interconnected blocks representing transcription factor binding sites (TFBSs) and background regions of promoter sequences. The ensemble approach reduces the risk of overfitting and generally improves performance. We apply this method to identify TFBSs and to classify promoters preferentially expressed in macrophages, where it outperforms other methods due to the increased predictive power given by the grammar. AVAILABILITY: The software and the datasets are available from http://modem.ucsd.edu/won/eHMM.tar.gz
Notes: Journal Article; Research Support, Non-U.S. Gov't; England
A Sandelin* (2008)  Prediction of regulatory elements   Methods Mol Biol 453: 233-44  
Abstract: Finding the regulatory mechanisms responsible for gene expression remains one of the most important challenges for biomedical research. A major focus in cellular biology is to find functional transcription factor binding sites (TFBS) responsible for the regulation of a downstream gene. As wet-lab methods are time consuming and expensive, it is not realistic to identify TFBS for all uncharacterized genes in the genome by purely experimental means. Computational methods aimed at predicting potential regulatory regions can increase the efficiency of wet-lab experiments significantly. Here, methods for building quantitative models describing the binding preferences of transcription factors based on literature-derived data are presented, as well as a general protocol for scanning promoters using cross-species comparison as a filter (phylogenetic footprinting).
Notes: Sandelin, Albin; United States; Methods in molecular biology (Clifton, N.J.); Methods Mol Biol. 2008;453:233-44.
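The abstract above mentions quantitative models of binding preferences and promoter scanning. As a rough illustration only, the sketch below builds a log-odds position weight matrix from a handful of aligned sites and scores every window of a sequence. The function names, the pseudocount, the uniform background and the toy sites are all assumptions made for this example; they are not taken from the chapter, and the cross-species (phylogenetic footprinting) filter is not shown.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Hypothetical helper: pseudocount and uniform background are assumptions.
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Yield (start, score) for every full-width window of the sequence."""
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield start, sum(pwm[i][base] for i, base in enumerate(window))


def best_hit(sequence, pwm):
    """Return the (start, score) pair with the highest score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])
```

In practice a score threshold would be applied to the output of `scan`, and only hits falling in regions conserved between species would be kept.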
2007
E Birney, J A Stamatoyannopoulos, A Dutta, R Guigo, T R Gingeras, E H Margulies, Z Weng, M Snyder, E T Dermitzakis, R E Thurman, M S Kuehn, C M Taylor, S Neph, C M Koch, S Asthana, A Malhotra, I Adzhubei, J A Greenbaum, R M Andrews, P Flicek, P J Boyle, H Cao, N P Carter, G K Clelland, S Davis, N Day, P Dhami, S C Dillon, M O Dorschner, H Fiegler, P G Giresi, J Goldy, M Hawrylycz, A Haydock, R Humbert, K D James, B E Johnson, E M Johnson, T T Frum, E R Rosenzweig, N Karnani, K Lee, G C Lefebvre, P A Navas, F Neri, S C Parker, P J Sabo, R Sandstrom, A Shafer, D Vetrie, M Weaver, S Wilcox, M Yu, F S Collins, J Dekker, J D Lieb, T D Tullius, G E Crawford, S Sunyaev, W S Noble, I Dunham, F Denoeud, A Reymond, P Kapranov, J Rozowsky, D Zheng, R Castelo, A Frankish, J Harrow, S Ghosh, A Sandelin, I L Hofacker, R Baertsch, D Keefe, S Dike, J Cheng, H A Hirsch, E A Sekinger, J Lagarde, J F Abril, A Shahab, C Flamm, C Fried, J Hackermuller, J Hertel, M Lindemeyer, K Missal, A Tanzer, S Washietl, J Korbel, O Emanuelsson, J S Pedersen, N Holroyd, R Taylor, D Swarbreck, N Matthews, M C Dickson, D J Thomas, M T Weirauch, J Gilbert, J Drenkow, I Bell, X Zhao, K G Srinivasan, W K Sung, H S Ooi, K P Chiu, S Foissac, T Alioto, M Brent, L Pachter, M L Tress, A Valencia, S W Choo, C Y Choo, C Ucla, C Manzano, C Wyss, E Cheung, T G Clark, J B Brown, M Ganesh, S Patel, H Tammana, J Chrast, C N Henrichsen, C Kai, J Kawai, U Nagalakshmi, J Wu, Z Lian, J Lian, P Newburger, X Zhang, P Bickel, J S Mattick, P Carninci, Y Hayashizaki, S Weissman, T Hubbard, R M Myers, J Rogers, P F Stadler, T M Lowe, C L Wei, Y Ruan, K Struhl, M Gerstein, S E Antonarakis, Y Fu, E D Green, U Karaoz, A Siepel, J Taylor, L A Liefer, K A Wetterstrand, P J Good, E A Feingold, M S Guyer, G M Cooper, G Asimenos, C N Dewey, M Hou, S Nikolaev, J I Montoya-Burgos, A Loytynoja, S Whelan, F Pardi, T Massingham, H Huang, N R Zhang, I Holmes, J C Mullikin, A Ureta-Vidal, B Paten, M Seringhaus, D Church, K Rosenbloom, W J 
Kent, E A Stone, S Batzoglou, N Goldman, R C Hardison, D Haussler, W Miller, A Sidow, N D Trinklein, Z D Zhang, L Barrera, R Stuart, D C King, A Ameur, S Enroth, M C Bieda, J Kim, A A Bhinge, N Jiang, J Liu, F Yao, V B Vega, C W Lee, P Ng, A Yang, Z Moqtaderi, Z Zhu, X Xu, S Squazzo, M J Oberley, D Inman, M A Singer, T A Richmond, K J Munn, A Rada-Iglesias, O Wallerman, J Komorowski, J C Fowler, P Couttet, A W Bruce, O M Dovey, P D Ellis, C F Langford, D A Nix, G Euskirchen, S Hartman, A E Urban, P Kraus, S Van Calcar, N Heintzman, T H Kim, K Wang, C Qu, G Hon, R Luna, C K Glass, M G Rosenfeld, S F Aldred, S J Cooper, A Halees, J M Lin, H P Shulha, M Xu, J N Haidar, Y Yu, V R Iyer, R D Green, C Wadelius, P J Farnham, B Ren, R A Harte, A S Hinrichs, H Trumbower, H Clawson, J Hillman-Jackson, A S Zweig, K Smith, A Thakkapallayil, G Barber, R M Kuhn, D Karolchik, L Armengol, C P Bird, P I de Bakker, A D Kern, N Lopez-Bigas, J D Martin, B E Stranger, A Woodroffe, E Davydov, A Dimas, E Eyras, I B Hallgrimsdottir, J Huppert, M C Zody, G R Abecasis, X Estivill, G G Bouffard, X Guan, N F Hansen, J R Idol, V V Maduro, B Maskeri, J C McDowell, M Park, P J Thomas, A C Young, R W Blakesley, D M Muzny, E Sodergren, D A Wheeler, K C Worley, H Jiang, G M Weinstock, R A Gibbs, T Graves, R Fulton, E R Mardis, R K Wilson, M Clamp, J Cuff, S Gnerre, D B Jaffe, J L Chang, K Lindblad-Toh, E S Lander, M Koriabine, M Nefedov, K Osoegawa, Y Yoshinaga, B Zhu, P J de Jong (2007)  Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project   Nature 447: 7146. 799-816  
Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
Notes: ENCODE Project Consortium; NISC Comparative Sequencing Program; Baylor College of Medicine Human Genome Sequencing Center; Washington University Genome Sequencing Center; Broad Institute; Children's Hospital Oakland Research Institute; 062023/Wellcome Trust/United Kingdom; 077198/Wellcome Trust/United Kingdom; K22 HG003169-01A1/HG/NHGRI NIH HHS/United States; P41 HG002371-03S1/HG/NHGRI NIH HHS/United States; R01 HG002238-15/HG/NHGRI NIH HHS/United States; R01 HG003110-03/HG/NHGRI NIH HHS/United States; R01 HG003129-03/HG/NHGRI NIH HHS/United States; R01 HG003143-04/HG/NHGRI NIH HHS/United States; R01 HG003521-01/HG/NHGRI NIH HHS/United States; R01 HG003532-01/HG/NHGRI NIH HHS/United States; R01 HG003541-03/HG/NHGRI NIH HHS/United States; U01 HG002523-01/HG/NHGRI NIH HHS/United States; U01 HG003147-02/HG/NHGRI NIH HHS/United States; U01 HG003150-03/HG/NHGRI NIH HHS/United States; U01 HG003151-03/HG/NHGRI NIH HHS/United States; U01 HG003156-03/HG/NHGRI NIH HHS/United States; U01 HG003157-03/HG/NHGRI NIH HHS/United States; U01 HG003161-03/HG/NHGRI NIH HHS/United States; U01 HG003162-03/HG/NHGRI NIH HHS/United States; U01 HG003168-02/HG/NHGRI NIH HHS/United States; U54 HG003067-01/HG/NHGRI NIH HHS/United States; U54 HG003079-01/HG/NHGRI NIH HHS/United States; U54 HG003273-01/HG/NHGRI NIH HHS/United States; Wellcome Trust/United Kingdom; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.; England; Nature; Nature. 2007 Jun 14;447(7146):799-816.
A Sandelin, P Carninci, B Lenhard, J Ponjavic, Y Hayashizaki, D A Hume (2007)  Mammalian RNA polymerase II core promoters : insights from genome-wide studies   Nat Rev Genet 8: 6. 424-36  
Abstract: The identification and characterization of mammalian core promoters and transcription start sites is a prerequisite to understanding how RNA polymerase II transcription is controlled. New experimental technologies have enabled genome-wide discovery and characterization of core promoters, revealing that most mammalian genes do not conform to the simple model in which a TATA box directs transcription from a single defined nucleotide position. In fact, most genes have multiple promoters, within which there are multiple start sites, and alternative promoter usage generates diversity and complexity in the mammalian transcriptome and proteome. Promoters can be described by their start site usage distribution, which is coupled to the occurrence of cis-regulatory elements, gene function and evolutionary constraints. A comprehensive survey of mammalian promoters is a major step towards describing and understanding transcriptional control networks.
Notes: Journal Article; England
2006
M C Frith, J Ponjavic, D Fredman, C Kai, J Kawai, P Carninci, Y Hayashizaki, A Sandelin* (2006)  Evolutionary turnover of mammalian transcription start sites   Genome Res 16: 6. 713-22  
Abstract: Alignments of homologous genomic sequences are widely used to identify functional genetic elements and study their evolution. Most studies tacitly equate homology of functional elements with sequence homology. This assumption is violated by the phenomenon of turnover, in which functionally equivalent elements reside at locations that are nonorthologous at the sequence level. Turnover has been demonstrated previously for transcription-factor-binding sites. Here, we show that transcription start sites of equivalent genes do not always reside at equivalent locations in the human and mouse genomes. We also identify two types of partial turnover, illustrating evolutionary pathways that could lead to complete turnover. These findings suggest that the signals encoding transcription start sites are highly flexible and evolvable, and have cautionary implications for the use of sequence-level conservation to detect gene regulatory elements.
Notes: Frith, Martin C; Ponjavic, Jasmina; Fredman, David; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Sandelin, Albin; Comparative Study; Research Support, Non-U.S. Gov't; United States; Genome research; Genome Res. 2006 Jun;16(6):713-22. Epub 2006 May 10.
P Carninci#, A Sandelin#, B Lenhard#, S Katayama, K Shimokawa, J Ponjavic, C A Semple, M S Taylor, P G Engstrom, M C Frith, A R Forrest, W B Alkema, S L Tan, C Plessy, R Kodzius, T Ravasi, T Kasukawa, S Fukuda, M Kanamori-Katayama, Y Kitazume, H Kawaji, C Kai, M Nakamura, H Konno, K Nakano, S Mottagui-Tabar, P Arner, A Chesi, S Gustincich, F Persichetti, H Suzuki, S M Grimmond, C A Wells, V Orlando, C Wahlestedt, E T Liu, M Harbers, J Kawai, V B Bajic, D A Hume, Y Hayashizaki (2006)  Genome-wide analysis of mammalian promoter architecture and evolution   Nat Genet 38: 6. 626-35  
Abstract: Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.
Notes: Journal Article; Research Support, Non-U.S. Gov't; United States
S Gustincich, A Sandelin, C Plessy, S Katayama, R Simone, D Lazarevic, Y Hayashizaki, P Carninci (2006)  The complexity of the mammalian transcriptome   J Physiol 575: Pt 2. 321-32  
Abstract: A comprehensive understanding of protein and regulatory networks is strictly dependent on the complete description of the transcriptome of cells. After the determination of the genome sequence of several mammalian species, gene identification is based on in silico predictions followed by evidence of transcription. Conservative estimates suggest that there are about 20,000 protein-encoding genes in the mammalian genome. In the last few years the combination of full-length cDNA cloning, cap-analysis gene expression (CAGE) tag sequencing and tiling array experiments has unveiled unexpected additional complexities in the transcriptome. Here we describe the current view of the mammalian transcriptome, focusing on transcript diversity, the growing non-coding RNA world, the organization of transcriptional units in the genome and promoter structures. In-depth analysis of the brain transcriptome has been challenging due to the cellular complexity of this organ. Here we present a computational analysis of CAGE data from different regions of the central nervous system, suggesting distinctive mechanisms of brain-specific transcription.
Notes: Journal Article; Research Support, Non-U.S. Gov't; Review; England
P J Bailey, J M Klos, E Andersson, M Karlen, M Kallstrom, J Ponjavic, J Muhr, B Lenhard*, A Sandelin*, J Ericson* (2006)  A global genomic transcriptional code associated with CNS-expressed genes   Exp Cell Res  
Abstract: Highly conserved non-coding DNA regions (HCNR) occur frequently in vertebrate genomes, but their functional roles remain unclear. Here, we provide evidence that a large portion of HCNRs are enriched for binding sites for Sox, POU and Homeodomain transcription factors, and such HCNRs can act as cis-regulatory regions active in neural stem cells. Strikingly, these HCNRs are linked to several hundreds of genes expressed in the developing CNS and they may exert locus-wide regulatory effects on multiple genes flanking their genomic location. Moreover, these data imply a unifying transcriptional logic for a large set of CNS-expressed genes in which Sox and POU proteins act as generic promoters of transcription while Homeodomain proteins control the spatial expression of genes through active repression.
Notes: 0014-4827 (Print); Journal article
H Kawaji, M C Frith, S Katayama, A Sandelin, C Kai, J Kawai, P Carninci, Y Hayashizaki (2006)  Dynamic usage of transcription start sites within core promoters   Genome Biol 7: 12.  
Abstract: BACKGROUND: Mammalian promoters do not initiate transcription at single, well defined base pairs, but rather at multiple, alternative start sites spread across a region. We previously characterized the static structures of transcription start site usage within promoters at the base pair level, based on large-scale sequencing of transcript 5' ends. RESULTS: In the present study we begin to explore the internal dynamics of mammalian promoters, and demonstrate that start site selection within many mouse core promoters varies among tissues. We also show that this dynamic usage of start sites is associated with CpG islands, broad and multimodal promoter structures, and imprinting. CONCLUSION: Our results reveal a new level of biologic complexity within promoters--fine-scale regulation of transcription starting events at the base pair level. These events are likely to be related to epigenetic transcriptional regulation.
Notes: Journal Article; Research Support, Non-U.S. Gov't; England
D Vlieghe, A Sandelin, P J De Bleser, K Vleminckx, W W Wasserman, F van Roy, B Lenhard (2006)  A new generation of JASPAR, the open-access repository for transcription factor binding site profiles   Nucleic Acids Res 34: Database issue. D95-7  
Abstract: JASPAR is the most complete open-access collection of transcription factor binding site (TFBS) matrices. In this new release, JASPAR grows into a meta-database of collections of TFBS models derived by diverse approaches. We present JASPAR CORE--an expanded version of the original, non-redundant collection of annotated, high-quality matrix-based transcription factor binding profiles, JASPAR FAM--a collection of familial TFBS models and JASPAR phyloFACTS--a set of matrices computationally derived from statistically overrepresented, evolutionarily conserved regulatory region motifs from mammalian genomes. JASPAR phyloFACTS serves as a non-redundant extension to JASPAR CORE, enhancing the overall breadth of JASPAR for promoter sequence analysis. The new release of JASPAR is available at http://jaspar.genereg.net.
Notes: 1362-4962 (Electronic); Journal Article
J Ponjavic, B Lenhard, C Kai, J Kawai, P Carninci, Y Hayashizaki, A Sandelin* (2006)  Transcriptional and structural impact of TATA-initiation site spacing in mammalian core promoters   Genome Biol 7: 8.  
Abstract: BACKGROUND: The TATA-box, one of the most well-studied core promoter elements, is associated with induced, context-specific expression. The lack of precise transcription start site (TSS) locations linked with expression information has impeded genome-wide characterization of the interaction between TATA and the preinitiation complex. RESULTS: Using a comprehensive set of 5.66 x 10^6 sequenced 5' cDNA ends from diverse tissues mapped to the mouse genome, we show that the TATA-TSS distance is correlated with the tissue specificity of the downstream transcript. To achieve tissue-specific regulation, the TATA-box position relative to the TSS is constrained to a narrow window (-32 to -29), where positions -31 and -30 constitute the optimal positions for achieving high tissue specificity. Slightly larger spacings can be accommodated only when there is no optimally spaced initiation signal; in contrast, the TATA-box-like motifs found downstream of position -28 are generally nonfunctional. The strength of the TATA binding protein-DNA interaction plays a subordinate role to spacing in terms of tissue specificity. Furthermore, promoters with different TATA-TSS spacings have distinct features in terms of consensus sequence around the initiation site and distribution of alternative TSSs. Unexpectedly, promoters which have two dominant, consecutive TSSs are TATA-depleted and have a novel GGG initiation site consensus. CONCLUSION: In this study, we present the most comprehensive characterization of TATA-TSS spacing and functionality to date. The coupling of spacing to tissue specificity on the transcriptome level provides important clues on the function of core promoters and the choice of TSS by the preinitiation complex.
Notes: 1465-6914 (Electronic); Journal article
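The spacing constraint described in the abstract above (a TATA-box positioned at -32 to -29 relative to the TSS) can be illustrated with a small check. This is a minimal sketch under stated assumptions, not the paper's method: it uses an exact match to the string TATAAA as a stand-in for a proper TATA-box model, takes the first base of the motif as its position, and numbers the TSS as +1 so that the base immediately upstream is -1. All function names are hypothetical.

```python
def upstream_tata_positions(promoter, tss_index, motif="TATAAA"):
    """Return motif start positions relative to the TSS for upstream matches.

    tss_index is the 0-based index of the TSS base in `promoter`; a motif
    starting k bases upstream of it is reported as position -k.
    """
    positions = []
    start = promoter.find(motif)
    while start != -1 and start < tss_index:
        positions.append(start - tss_index)
        start = promoter.find(motif, start + 1)
    return positions


def has_constrained_spacing(promoter, tss_index, window=(-32, -29)):
    """True if any upstream motif start falls inside the given window."""
    low, high = window
    return any(low <= p <= high
               for p in upstream_tata_positions(promoter, tss_index))
```

A real analysis would replace the exact string match with a weight-matrix score and would work from experimentally mapped TSS positions.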
2005
P Carninci, T Kasukawa, S Katayama, J Gough, M C Frith, N Maeda, R Oyama, T Ravasi, B Lenhard, C Wells, R Kodzius, K Shimokawa, V B Bajic, S E Brenner, S Batalov, A R Forrest, M Zavolan, M J Davis, L G Wilming, V Aidinis, J E Allen, A Ambesi-Impiombato, R Apweiler, R N Aturaliya, T L Bailey, M Bansal, L Baxter, K W Beisel, T Bersano, H Bono, A M Chalk, K P Chiu, V Choudhary, A Christoffels, D R Clutterbuck, M L Crowe, E Dalla, B P Dalrymple, B de Bono, G Della Gatta, D di Bernardo, T Down, P Engstrom, M Fagiolini, G Faulkner, C F Fletcher, T Fukushima, M Furuno, S Futaki, M Gariboldi, P Georgii-Hemming, T R Gingeras, T Gojobori, R E Green, S Gustincich, M Harbers, Y Hayashi, T K Hensch, N Hirokawa, D Hill, L Huminiecki, M Iacono, K Ikeo, A Iwama, T Ishikawa, M Jakt, A Kanapin, M Katoh, Y Kawasawa, J Kelso, H Kitamura, H Kitano, G Kollias, S P Krishnan, A Kruger, S K Kummerfeld, I V Kurochkin, L F Lareau, D Lazarevic, L Lipovich, J Liu, S Liuni, S McWilliam, M Madan Babu, M Madera, L Marchionni, H Matsuda, S Matsuzawa, H Miki, F Mignone, S Miyake, K Morris, S Mottagui-Tabar, N Mulder, N Nakano, H Nakauchi, P Ng, R Nilsson, S Nishiguchi, S Nishikawa, F Nori, O Ohara, Y Okazaki, V Orlando, K C Pang, W J Pavan, G Pavesi, G Pesole, N Petrovsky, S Piazza, J Reed, J F Reid, B Z Ring, M Ringwald, B Rost, Y Ruan, S L Salzberg, A Sandelin, C Schneider, C Schonbach, K Sekiguchi, C A Semple, S Seno, L Sessa, Y Sheng, Y Shibata, H Shimada, K Shimada, D Silva, B Sinclair, S Sperling, E Stupka, K Sugiura, R Sultana, Y Takenaka, K Taki, K Tammoja, S L Tan, S Tang, M S Taylor, J Tegner, S A Teichmann, H R Ueda, E van Nimwegen, R Verardo, C L Wei, K Yagi, H Yamanishi, E Zabarovsky, S Zhu, A Zimmer, W Hide, C Bult, S M Grimmond, R D Teasdale, E T Liu, V Brusic, J Quackenbush, C Wahlestedt, J S Mattick, D A Hume, C Kai, D Sasaki, Y Tomaru, S Fukuda, M Kanamori-Katayama, M Suzuki, J Aoki, T Arakawa, J Iida, K Imamura, M Itoh, T Kato, H Kawaji, N Kawagashira, T Kawashima, M Kojima, S 
Kondo, H Konno, K Nakano, N Ninomiya, T Nishio, M Okada, C Plessy, K Shibata, T Shiraki, S Suzuki, M Tagami, K Waki, A Watahiki, Y Okamura-Oho, H Suzuki, J Kawai, Y Hayashizaki (2005)  The transcriptional landscape of the mammalian genome   Science 309: 5740. 1559-63  
Abstract: This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
Notes: 1095-9203; Journal Article
S Katayama, Y Tomaru, T Kasukawa, K Waki, M Nakanishi, M Nakamura, H Nishida, C C Yap, M Suzuki, J Kawai, H Suzuki, P Carninci, Y Hayashizaki, C Wells, M Frith, T Ravasi, K C Pang, J Hallinan, J Mattick, D A Hume, L Lipovich, S Batalov, P G Engstrom, Y Mizuno, M A Faghihi, A Sandelin, A M Chalk, S Mottagui-Tabar, Z Liang, B Lenhard, C Wahlestedt (2005)  Antisense transcription in the mammalian transcriptome   Science 309: 5740. 1564-6  
Abstract: Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring "genes" in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
Notes: 1095-9203; Journal Article
A Sandelin, W W Wasserman (2005)  Prediction of nuclear hormone receptor response elements   Mol Endocrinol 19: 3. 595-606  
Abstract: The nuclear receptor (NR) class of transcription factors controls critical regulatory events in key developmental processes, homeostasis maintenance, and medically important diseases and conditions. Identification of the members of a regulon controlled by an NR could provide an accelerated understanding of development and disease. New bioinformatics methods for the analysis of regulatory sequences are required to address the complex properties associated with known regulatory elements targeted by the receptors because the standard methods for binding site prediction fail to reflect the diverse target site configurations. We have constructed a flexible Hidden Markov Model framework capable of predicting NHR binding sites. The model allows for variable spacing and orientation of half-sites. In a genome-scale analysis enabled by the model, we show that NRs in Fugu rubripes have a significant cross-regulatory potential. The model is implemented in a web interface, freely available to academic researchers at http://mordor.cgb.ki.se/NHR-scan.
Notes: 0888-8809; Journal Article
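To make "variable spacing and orientation of half-sites" concrete, the sketch below enumerates pairs of half-sites and labels them as direct (DR), inverted (IR) or everted (ER) repeats with their spacer length. It is a deliberately simplified stand-in for the Hidden Markov Model described above: it uses exact matches to the widely cited AGGTCA half-site consensus, which is an assumption of this example and is not stated in the abstract, and the function names are hypothetical.

```python
HALF_SITE = "AGGTCA"  # assumed consensus half-site, not taken from the abstract


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def half_site_hits(sequence, half_site=HALF_SITE):
    """Return (start, strand) for exact half-site matches on either strand."""
    rc = reverse_complement(half_site)
    width = len(half_site)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if window == half_site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def paired_elements(sequence, max_spacing=5, half_site=HALF_SITE):
    """Label neighbouring half-site pairs by orientation and spacer length."""
    names = {("+", "+"): "DR", ("-", "-"): "DR",
             ("+", "-"): "IR", ("-", "+"): "ER"}
    hits = half_site_hits(sequence, half_site)
    width = len(half_site)
    elements = []
    for (a, strand_a), (b, strand_b) in zip(hits, hits[1:]):
        spacing = b - a - width
        if 0 <= spacing <= max_spacing:
            elements.append((f"{names[(strand_a, strand_b)]}{spacing}", a))
    return elements
```

An HMM replaces the exact matching used here with probabilistic emission states for each half-site position and explicit states for the spacer, which lets it tolerate degenerate sites.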
N Stahlberg, R Merino, L H Hernandez, L Fernandez-Perez, A Sandelin, P Engstrom, P Tollet-Egnell, B Lenhard, A Flores-Morales (2005)  Exploring hepatic hormone actions using a compilation of gene expression profiles   BMC Physiol 5: 1.  
Abstract: BACKGROUND: Microarray analysis is attractive within the field of endocrine research because regulation of gene expression is a key mechanism whereby hormones exert their actions. Knowledge discovery and testing of hypotheses based on information-rich expression profiles promise to accelerate discovery of physiologically relevant hormonal mechanisms of action. However, most studies so far concentrate on the analysis of actions of single hormones, and few examples exist that attempt to use compilations of different hormone-regulated expression profiles to gain insight into how hormones act to regulate tissue physiology. This report illustrates how a meta-analysis of multiple transcript profiles obtained from a single tissue, the liver, can be used to evaluate relevant hypotheses and discover novel mechanisms of hormonal action. We have evaluated the differential effects of Growth Hormone (GH) and estrogen in the regulation of hepatic gender differentiated gene expression as well as the involvement of sterol regulatory element-binding proteins (SREBPs) in the hepatic actions of GH and thyroid hormone. RESULTS: Little similarity exists between liver transcript profiles regulated by 17-alpha-ethinylestradiol and those induced by the continuous infusion of bGH. On the other hand, strong correlations were found between both profiles and the female enriched transcript profile. Therefore, estrogens have feminizing effects in male rat liver which are different from those induced by GH. The similarity between bGH and T3 was limited to a small group of genes, most of which are involved in lipogenesis. An in silico promoter analysis of genes rapidly regulated by thyroid hormone predicted the activation of SREBPs by short-term treatment in vivo. It was further demonstrated that proteolytic processing of SREBP1 in the endoplasmic reticulum might contribute to the rapid actions of T3 on these genes. CONCLUSION: This report illustrates how a meta-analysis of multiple transcript profiles can be used to link knowledge concerning endocrine physiology to hormonally induced changes in gene expression. We conclude that both GH and estrogen are important determinants of gender-related differences in hepatic gene expression. Rapid hepatic thyroid hormone effects affect genes involved in lipogenesis possibly through the induction of SREBP1 proteolytic processing.
Notes: 1472-6793; Journal Article
2004
A Sandelin, W Alkema, P Engstrom, W W Wasserman, B Lenhard (2004)  JASPAR : an open-access database for eukaryotic transcription factor binding profiles   Nucleic Acids Res 32: 1. D91-4  
Abstract: The analysis of regulatory regions in genome sequences is strongly based on the detection of potential transcription factor binding sites. The preferred models for representation of transcription factor binding specificity have been termed position-specific scoring matrices. JASPAR is an open-access database of annotated, high-quality, matrix-based transcription factor binding site profiles for multicellular eukaryotes. The profiles were derived exclusively from sets of nucleotide sequences experimentally demonstrated to bind transcription factors. The database is complemented by a web interface for browsing, searching and subset selection, an online sequence analysis utility and a suite of programming tools for genome-wide and comparative genomic analysis of regulatory regions. JASPAR is available at http://jaspar.cgb.ki.se.
Notes: 1362-4962; Journal Article
A Sandelin, P Bailey, S Bruce, P G Engstrom, J M Klos, W W Wasserman, J Ericson, B Lenhard (2004)  Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes   BMC Genomics 5: 1.  
Abstract: BACKGROUND: Evolutionarily conserved sequences within or adjoining orthologous genes often serve as critical cis-regulatory regions. Recent studies have identified long, non-coding genomic regions that are perfectly conserved between human and mouse, termed ultra-conserved regions (UCRs). Here, we focus on UCRs that cluster around genes involved in early vertebrate development; genes conserved over 450 million years of vertebrate evolution. RESULTS: Based on a high resolution detection procedure, our UCR set enables novel insights into vertebrate genome organization and regulation of developmentally important genes. We find that the genomic positions of deeply conserved UCRs are strongly associated with the locations of genes encoding key regulators of development, with particularly strong positional correlation to transcription factor-encoding genes. Of particular importance is the observation that most UCRs are clustered into arrays that span hundreds of kilobases around their presumptive target genes. Such a hallmark signature is present around several uncharacterized human genes predicted to encode developmentally important DNA-binding proteins. CONCLUSION: The genomic organization of UCRs, combined with previous findings, suggests that UCRs act as essential long-range modulators of gene expression. The exceptional sequence conservation and clustered structure suggests that UCR-mediated molecular events involve greater complexity than traditional DNA binding by transcription factors. The high-resolution UCR collection presented here provides a wealth of target sequences for future experimental studies to determine the nature of the biochemical mechanisms involved in the preservation of arrays of nearly identical non-coding sequences over the course of vertebrate evolution.
Notes: 1471-2164; Journal Article
A Sandelin, W W Wasserman (2004)  Constrained Binding Site Diversity within Families of Transcription Factors Enhances Pattern Discovery Bioinformatics   J Mol Biol 338: 2. 207-15  
Abstract: Diverse computational and experimental efforts are required to elucidate the control circuitry regulating the transcription of human genes. The fusion of gene-specific promoter analyses with large microarray studies and bioinformatics advances has produced optimism that significant progress can be made in unravelling this complex network. Within bioinformatics, past emphasis for improved pattern discovery has been placed upon "phylogenetic footprinting", the identification of sequences conserved over moderate periods of evolution (e.g. human and mouse comparisons). We introduce a new direction in bioinformatics based on the constraints imposed by the structures of DNA-binding proteins. For most structurally related families of transcription factors, there are clear similarities in the sequences of the sites to which they bind. On the basis of this observation, we construct familial binding profiles for well-characterized transcription factor families. The profiles are shown to classify correctly the structural class of mediating transcription factors for novel motifs in 88% of cases. By incorporating the familial profiles into pattern discovery procedures, we demonstrate that functional binding sites can be found in genomic sequences of dramatically greater length than is possible otherwise. Thus, incorporating familial models can overcome the signal-to-noise challenge that has hindered the transition from microarray data to regulatory control sequences for human genes. Biochemically motivated constraints upon sequence diversity of binding sites will complement the genetically motivated constraints imposed in "phylogenetic footprinting" algorithms.
Notes: 0022-2836; Journal Article
A Sandelin, W W Wasserman, B Lenhard (2004)  ConSite : web-based prediction of regulatory elements using cross-species comparison   Nucleic Acids Res 32: Web Server issue. W249-52  
Abstract: ConSite is a user-friendly, web-based tool for finding cis-regulatory elements in genomic sequences. Predictions are based on the integration of binding site predictions generated with high-quality transcription factor models and cross-species comparison filtering (phylogenetic footprinting). By incorporating evolutionary constraints, selectivity is increased by an order of magnitude as compared to single-sequence analysis. ConSite offers several unique features, including an interactive expert system for retrieving orthologous regulatory sequences. Programming modules and biological databases that form the foundation of the ConSite service are freely available to the research community. ConSite is available at http://www.phylofoot.org/consite.
Notes: 1362-4962; Journal Article
2003
B Lenhard#, A Sandelin#, L Mendoza, P Engstrom, N Jareborg, W W Wasserman (2003)  Identification of conserved regulatory elements by comparative genome analysis   J Biol 2: 2.  
Abstract: BACKGROUND: For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. RESULTS: We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. CONCLUSIONS: Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.
Notes: 0; 1475-4924; Journal article
A Sandelin, A Hoglund, B Lenhard, W W Wasserman (2003)  Integrated analysis of yeast regulatory sequences for biologically linked clusters of genes   Funct Integr Genomics 3: 3. 125-34  
Abstract: Dramatic progress in deciphering the regulatory controls in Saccharomyces cerevisiae has been enabled by the fusion of high-throughput genomics technologies with advanced sequence analysis algorithms. Sets of genes likely to function together and with similar expression profiles have been identified in diverse studies. By fusing an advanced pattern recognition algorithm for identification of transcription factor binding sites with a new method for the quantitative comparison of binding properties of transcription factors, we provide an integrated means to move from expression data to biological insights. The Yeast Regulatory Sequence Analysis system, YRSA, combines standard functions with a novel pattern characterization procedure in an intuitive interface designed for use by a broad range of scientists. The features of the system include automated retrieval of user-defined promoter sequences, binding site discovery by pattern recognition, graphical displays of the observed pattern and positions of similar sequences in the specified genes, and comparison of the new pattern against a collection of binding patterns for characterized transcription factors. The comprehensive YRSA system was used to study the regulatory mechanisms of yeast regulons. Analysis of the regulatory controls of a battery of genes induced by DNA damaging agents supports a putative mediating role for the cell-cycle checkpoint regulatory element MCB. YRSA is available at http://yrsa.cgb.ki.se. [YRSA: ancient Scandinavian name meaning old she-bear (Latin Ursus arctos = brown bear/grizzly).]
Notes: 22777822; 1438-793x; Journal Article
2002
Y Okazaki, M Furuno, T Kasukawa, J Adachi, H Bono, S Kondo, I Nikaido, N Osato, R Saito, H Suzuki, I Yamanaka, H Kiyosawa, K Yagi, Y Tomaru, Y Hasegawa, A Nogami, C Schonbach, T Gojobori, R Baldarelli, D P Hill, C Bult, D A Hume, J Quackenbush, L M Schriml, A Kanapin, H Matsuda, S Batalov, K W Beisel, J A Blake, D Bradt, V Brusic, C Chothia, L E Corbani, S Cousins, E Dalla, T A Dragani, C F Fletcher, A Forrest, K S Frazer, T Gaasterland, M Gariboldi, C Gissi, A Godzik, J Gough, S Grimmond, S Gustincich, N Hirokawa, I J Jackson, E D Jarvis, A Kanai, H Kawaji, Y Kawasawa, R M Kedzierski, B L King, A Konagaya, I V Kurochkin, Y Lee, B Lenhard, P A Lyons, D R Maglott, L Maltais, L Marchionni, L McKenzie, H Miki, T Nagashima, K Numata, T Okido, W J Pavan, G Pertea, G Pesole, N Petrovsky, R Pillai, J U Pontius, D Qi, S Ramachandran, T Ravasi, J C Reed, D J Reed, J Reid, B Z Ring, M Ringwald, A Sandelin, C Schneider, C A Semple, M Setou, K Shimada, R Sultana, Y Takenaka, M S Taylor, R D Teasdale, M Tomita, R Verardo, L Wagner, C Wahlestedt, Y Wang, Y Watanabe, C Wells, L G Wilming, A Wynshaw-Boris, M Yanagisawa, I Yang, L Yang, Z Yuan, M Zavolan, Y Zhu, A Zimmer, P Carninci, N Hayatsu, T Hirozane-Kishikawa, H Konno, M Nakamura, N Sakazume, K Sato, T Shiraki, K Waki, J Kawai, K Aizawa, T Arakawa, S Fukuda, A Hara, W Hashizume, K Imotani, Y Ishii, M Itoh, I Kagawa, A Miyazaki, K Sakai, D Sasaki, K Shibata, A Shinagawa, A Yasunishi, M Yoshino, R Waterston, E S Lander, J Rogers, E Birney, Y Hayashizaki (2002)  Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs   Nature 420: 6915. 563-73  
Abstract: Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Notes: 22354683; 0028-0836; Journal Article

Book chapters

2009

PhD theses

2004
A Sandelin* (2004)  In silico prediction of cis-regulatory elements    
Abstract: As one of the most fundamental processes for all life forms, transcriptional regulation remains an intriguing and challenging subject for biomedical research. Experimental efforts towards understanding the regulation of genes are laborious and expensive, but can be substantially accelerated with the use of computational predictions. The growing number of fully sequenced metazoan genomes, in combination with the increasing use of high-throughput methods such as microarrays, has increased the necessity of combining computational methods with laboratory work. Computational 'in silico' methods for the prediction of transcription factor binding sites are mature, yet critical problems remain unsolved. In particular, the rate of falsely predicted sites is unacceptably high with current methods, due to the small and degenerate binding sites targeted by transcription factors. In addition to the false prediction rate, this restriction limits the ability of pattern discovery algorithms to find mediating binding sites in promoters of co-expressed genes. The latter problem constitutes a bottleneck when analyzing regulatory sequences in complex eukaryotes, as regulatory sequences generally are spread over extended genomic regions. This thesis describes the development of algorithms and resources for transcription factor binding site analysis, addressing (i) site prediction, where a model describing the binding properties of a transcription factor is applied to a sequence to find functional binding sites, and (ii) pattern discovery, where over-represented patterns are sought in sets of promoters. Initially, an open-access database (JASPAR) was created, holding high quality models for transcription factor sites. The database formed part of the foundation for the subsequent project (ConSite), where a set of methods was developed for utilizing cross-species comparison in binding site prediction ('phylogenetic footprinting') to enhance predictive selectivity. In this study, we could show that ~85% of false predictions were removed when only analyzing promoter regions conserved between human and mouse. The current statistical framework for modeling binding properties of transcription factors is inadequate for some regulatory proteins, most notably the medically important nuclear hormone receptors. A Hidden Markov Model framework capable of both predicting and classifying nuclear hormone receptor response elements was developed. In a case study, we showed that nuclear receptor genes have a high potential for cross- or auto-regulation, using the pufferfish genome as a predictive platform. Pattern discovery in promoters of multi-cellular eukaryotes is limited by the low strength of patterns buried in extended genomic sequence. Methods for improving both sensitivity and evaluation of resulting patterns were developed. We showed that comparison of newly found patterns to databases of experimentally verified profiles is a meaningful complement to other means of evaluating patterns. Furthermore, we showed that structural constraints that are shared by families of transcription factors can be integrated as prior expectations in pattern finder algorithms for a significant increase in sensitivity.