Biology report: "A combinatorial optimization approach for diverse motif finding applications"

A collection of biology research reports published in the journal Molecular Biology, offering readers background knowledge in biology. Topic: A combinatorial optimization approach for diverse motif finding applications.

Algorithms for Molecular Biology (BioMed Central)
Research, Open Access

A combinatorial optimization approach for diverse motif finding applications
Elena Zaslavsky and Mona Singh

Address: Department of Computer Science and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
Email: Elena Zaslavsky - elenaz@; Mona Singh - msingh@
Corresponding authors

Published: 17 August 2006
Received: 02 July 2006
Accepted: 17 August 2006
Algorithms for Molecular Biology 2006, 1:13, doi 1748-7188-1-13

This article is available from: http content 1 1 13
© 2006 Zaslavsky and Singh; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http licenses by), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied.

Results: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances.
Additionally, we give an approach for determining the statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically .