Short Curriculum

BIOCRATES Life Sciences AG, Innsbruck, Austria
since Dec 2010 Biostatistics Scientist

Centro de Investigación en Matemáticas, A.C., Guanajuato, Mexico
May 2010 - Jul 2010 Visiting scientist

Countries of Latin America
Nov 2009 - May 2010 Language and cultural stay

Julius-Maximilians-University, Würzburg
Nov 2009 PhD in statistical biology
Dec 2005 - May/Oct 2009 Research fellow, Department of Bioinformatics and Institute of Molecular Infection Biology
Oct 1999 - Apr 2005 Diploma in biology

Laboratoire de Biochimie et Biophysique des Systemes Intégrés, CEA-Grenoble
Apr 2003 - Jun 2003 Project on Cadmium toxicity in Dictyostelium discoideum

University of Applied Science, Hamburg
Aug 1997 - Sep 1999 Undergraduate studies in biotechnology
Abstract: Two-component systems (TCS) are short signalling pathways generally occurring in prokaryotes. They frequently regulate prokaryotic stimulus responses and thus are also of interest for engineering in biotechnology and synthetic biology. The aim of this study is to better understand and describe rewiring of TCS while investigating different evolutionary scenarios.
Based on large-scale screens of TCS in different organisms, this study gives detailed data, concrete alignments, and structure analysis on three general modification scenarios, where TCS were rewired for new responses and functions: (i) exchanges in the sequence within single TCS domains; (ii) exchange of whole TCS domains; (iii) addition of new components modulating TCS function.
As a result, the replacement of stimulus and promoter cassettes to rewire TCS can be well defined by exploiting the alignments given here. The diverged TCS examples are non-trivial and the design is challenging. Designed connector proteins may also be useful to modify TCS in selected cases.
Abstract: Background
The Enterobacteriaceae comprise a large number of clinically relevant species with several individual subspecies. Overlapping virulence-associated gene pools and the high overall genome plasticity often interfere with correct enterobacterial strain typing and risk assessment. Array technology offers a fast, reproducible and standardisable means for bacterial typing and thus provides many advantages for bacterial diagnostics, risk assessment and surveillance. The development of highly discriminative broad-range microbial diagnostic microarrays remains a challenge because of the marked genome plasticity of many bacterial pathogens.
Results
We developed a DNA microarray for strain typing and detection of major antimicrobial resistance genes of clinically relevant enterobacteria. For this purpose, we applied a global genome-wide probe selection strategy on 32 available complete enterobacterial genomes combined with a regression model for pathogen classification. The discriminative power of the probe set was further tested in silico on 15 additional complete enterobacterial genome sequences. DNA microarrays based on the selected probes were used to type 92 clinical enterobacterial isolates. Phenotypic tests confirmed the array-based typing results and corroborated that the selected probes allowed correct typing and prediction of major antibiotic resistances of clinically relevant Enterobacteriaceae, including the subspecies level, e.g. the reliable distinction of different E. coli pathotypes.
Conclusions
Our results demonstrate that the global probe selection approach based on longest common factor statistics, together with the design of a DNA microarray with a restricted set of discriminative probes, enables robust discrimination of different enterobacterial variants and represents a proof of concept that can be adopted for diagnostics of a wide range of microbial pathogens. Our approach circumvents misclassifications arising from the application of virulence markers, which are highly affected by horizontal gene transfer. Moreover, a broad range of pathogens has been covered by an efficient probe set size, enabling the design of high-throughput diagnostics.
Abstract: Hidden Markov models (HMMs) play a major role in applications to unravel biomolecular functionality. Though HMMs are technically mature and widely applied in computational biology, there is potential for methodological optimisation concerning their modelling of biological data sources with varying sequence lengths.
Single building blocks of these models, the states, are associated with a certain holding time, being the link to the length distribution of represented sequence motifs. An adaptation of regular HMM topologies to bell-shaped sequence lengths is achieved by a serial chain-linking of hidden states, while residing in the class of conventional hidden Markov models. The factor of the repetition of states (r) and the parameter for state-specific duration of stay (p) are determined by fitting the distribution of sequence lengths with the method of moments (MM) and maximum likelihood (ML). Performance evaluations of differently adjusted HMM topologies underline the impact of an optimisation for HMMs based on sequence lengths. Secondary structure prediction on internal transcribed spacer 2 sequences demonstrates exemplarily the general impact of topological optimisations. In summary, we propose a general methodology to improve the modelling behaviour of HMMs by topological optimisation with ML and a fast and easily implementable moment estimator.
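The moment estimator described above can be illustrated with a minimal sketch. It assumes, which the abstract does not state, that p is the self-transition probability of each state in the chain, so that a chain of r identical states has a total holding time with mean r/(1-p) and variance rp/(1-p)^2. The function name and the rounding rule for r are hypothetical choices made for illustration; the maximum likelihood fit is not shown.

```python
from math import floor
from statistics import fmean, pvariance


def fit_chain_by_moments(lengths):
    """Estimate (r, p) for a chain of r identical states from sequence lengths.

    Assumption (not from the abstract): p is the self-transition probability,
    so each state is held for a geometric number of steps with mean 1/(1-p)
    and variance p/(1-p)^2. Lengths are expected to be integers >= 1.
    """
    m = fmean(lengths)      # sample mean of the lengths
    v = pvariance(lengths)  # population variance of the lengths

    # Matching mean and variance gives r = m^2 / (m + v) and p = v / (m + v).
    r_real = m * m / (m + v)

    # r must be a whole number of states and cannot exceed the mean length.
    r = max(1, min(round(r_real), floor(m)))

    # Re-derive p for the integer r so that the model mean still equals m.
    p = 1.0 - r / m
    return r, p


# Toy data: a narrow, bell-shaped length distribution around 10.
lengths = [8, 10, 12, 10, 9, 11]
r, p = fit_chain_by_moments(lengths)
print(r, p)
```

A narrow length distribution yields many states with a small p, whereas a wide distribution pushes r towards 1, which recovers the single-state geometric holding time of a regular HMM.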
Abstract: Over the past years, microarray databases have increased rapidly in size. While they offer a wealth of data, it remains challenging to integrate data arising from different studies. Here we propose an unsupervised approach to large-scale meta-analysis of Arabidopsis thaliana whole genome expression datasets to gain additional insights into the function and regulation of genes. Applying kernel principal component analysis and hierarchical clustering, we found three major groups of experimental contrasts sharing a common biological trait. Genes associated with two of these clusters are known to play an important role in indole-3-acetic acid (IAA) mediated plant growth and development or pathogen defense. Novel functions could be assigned to genes including a cluster of serine/threonine kinases that carry two uncharacterized domains (DUF26) in their receptor part implicated in host defense. With the approach shown here, hidden interrelations between genes regulated under different conditions can be unraveled.
Abstract: Neisseria meningitidis is a leading cause of infectious childhood mortality worldwide. Most research efforts have hitherto focused on disease isolates belonging to only a few hypervirulent clonal lineages. However, up to 10% of the healthy human population is temporarily colonized by genetically diverse strains mostly with little or no pathogenic potential. Currently, little is known about the biology of carriage strains and their evolutionary relationship with disease isolates. The expression of a polysaccharide capsule is the only trait that has been convincingly linked to the pathogenic potential of N. meningitidis. To gain insight into the evolution of virulence traits in this species, whole-genome sequences of three meningococcal carriage isolates were obtained. Gene content comparisons with the available genome sequences from three disease isolates indicate that there is no core pathogenome in N. meningitidis. A comparison of the chromosome structure suggests that a filamentous prophage has mediated large chromosomal rearrangements and the translocation of some candidate virulence genes. Interspecific comparison of the available Neisseria genome sequences and dot blot hybridizations further indicate that the insertion sequence IS1655 is restricted to N. meningitidis; its low sequence diversity is an indicator of an evolutionarily recent population bottleneck. A genome-based phylogenetic reconstruction provides evidence that N. meningitidis has emerged as an unencapsulated human commensal from a common ancestor with Neisseria gonorrhoeae and Neisseria lactamica and subsequently acquired the genes responsible for capsule synthesis via horizontal gene transfer.
Abstract: MOTIVATION: Due to the growing number of completely sequenced genomes, functional annotation of proteins is becoming increasingly important. Here, we describe a method for the prediction of sites within protein domains that take part in protein-ligand interactions. As recently demonstrated, these sites are not trivial to detect because of a varying degree of conservation of their location and type within a domain family. RESULTS: The developed method for the prediction of protein-ligand interaction sites is based on a newly defined interaction profile hidden Markov model (ipHMM) topology that takes structural and sequence data into account. It is based on a homology search via a posterior decoding algorithm that yields probabilities for interacting sequence positions and inherits the efficiency and the power of the profile hidden Markov model (pHMM) methodology. The algorithm enhances the quality of interaction site predictions and is a suitable tool for large-scale studies, as has already been demonstrated for pHMMs. AVAILABILITY: The MATLAB-files are available on request from the first author.
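To make the idea of posterior decoding concrete, the following is a generic forward-backward sketch on a toy two-state HMM. It is not the ipHMM topology and not the authors' MATLAB code: the state labels "N" (non-interacting) and "I" (interacting), the two-letter alphabet and all probabilities are hypothetical values chosen for illustration only.

```python
def posterior_decode(obs, states, start, trans, emit):
    """Return, for each position, the posterior probability of every state.

    Plain forward-backward without scaling, which is adequate for short toy
    sequences but would underflow on long ones.
    """
    n = len(obs)

    # Forward pass: fwd[t][s] = P(obs[0..t], state at t is s).
    fwd = [{s: start[s] * emit[s][obs[0]] for s in states}]
    for t in range(1, n):
        fwd.append({
            s: emit[s][obs[t]] * sum(fwd[t - 1][q] * trans[q][s] for q in states)
            for s in states
        })

    # Backward pass: bwd[t][s] = P(obs[t+1..] | state at t is s).
    bwd = [dict.fromkeys(states, 1.0) for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for s in states:
            bwd[t][s] = sum(
                trans[s][q] * emit[q][obs[t + 1]] * bwd[t + 1][q] for q in states
            )

    total = sum(fwd[-1][s] for s in states)  # likelihood of the whole sequence
    return [{s: fwd[t][s] * bwd[t][s] / total for s in states} for t in range(n)]


# Hypothetical toy model: symbol "b" is typical for interacting positions.
states = ("N", "I")
start = {"N": 0.8, "I": 0.2}
trans = {"N": {"N": 0.9, "I": 0.1}, "I": {"N": 0.3, "I": 0.7}}
emit = {"N": {"a": 0.9, "b": 0.1}, "I": {"a": 0.2, "b": 0.8}}

posteriors = posterior_decode("aabba", states, start, trans, emit)
for position, post in enumerate(posteriors):
    print(position, round(post["I"], 3))
```

Unlike a single best path, the per-position posteriors can be thresholded or ranked, which is what makes this kind of decoding convenient for reporting a probability for each candidate site.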
Abstract: Bioprocesses like the cell-based production of biologicals, i.e. mainly recombinant proteins and monoclonal antibodies, require optimal culture conditions to obtain a high yield of quality products. The performance of a bioreactor depends strongly on the cell characteristics as well as on the composition of the cell culture medium and the process conditions. As the metabolic activity of the cells is very high during fermentation, the external and internal metabolite compositions vary tremendously throughout the process. The quantification of a wide range of metabolic substrates and products is a prerequisite for understanding and optimizing the underlying cell-based activities. Furthermore, metabolite quantification reveals the composition of biologically derived cell culture supplements, thus serving as a tool to monitor supplement quality or providing the basis for the formulation of a chemically defined medium supplement.