This document discusses various methods for predicting genes and analyzing unknown DNA sequences, including:
- Using profiles, patterns, and hidden Markov models (HMMs) to find conserved sequences and predict protein function
- Ontologies such as the Gene Ontology (GO), which organize genes and gene products in a structured network to facilitate annotation and analysis
- Computational tools such as Genefinder and Glimmer that predict gene structures from signals including coding potential, open reading frames, start/stop codons, and sequence similarity to known genes
- Integration of multiple lines of evidence, such as HMMs, expressed sequence tag (EST) alignments, repeats, and CpG islands, to improve gene prediction over any single method
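
To make the open reading frame and start/stop codon signals mentioned above concrete, here is a minimal Python sketch of a forward-strand ORF scan. It is an illustration only and not how Genefinder or Glimmer work internally; the function name `find_orfs`, the `min_codons` threshold, and the use of the standard stop codons TAA, TAG, and TGA with ATG as the only start codon are assumptions made for this example.

```python
# Hypothetical helper for illustration; not part of any gene-finding tool.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumed: standard genetic code
START_CODON = "ATG"                  # assumed: ATG is the only start codon


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    start is inclusive, end is exclusive and includes the stop codon.
    min_codons counts every codon in the ORF, start and stop included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume searching after this stop codon
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATTTTAGGG"
    for start, end, frame in find_orfs(dna):
        print(frame, start, end, dna[start:end])
```

A real gene finder would also scan the reverse complement and score each candidate with coding potential and similarity evidence; this sketch stops at the raw ORF signal.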