From Surf Wiki (app.surf) — the open knowledge base
Generalized suffix tree
In computer science, a generalized suffix tree is a suffix tree for a set of strings. Given the set of strings D=S_1,S_2,\dots,S_d of total length n, it is a Patricia tree containing all n suffixes of the strings. It is mostly used in bioinformatics.
Functionality
It can be built in \Theta(n) time and space, and can be used to find all z occurrences of a string P of length m in O(m + z) time, which is asymptotically optimal (assuming the size of the alphabet is constant).
When constructing such a tree, each string should be padded with a unique out-of-alphabet marker symbol (or string) to ensure no suffix is a substring of another, guaranteeing each suffix is represented by a unique leaf node.
Algorithms for constructing a GST include Ukkonen's algorithm (1995) and McCreight's algorithm (1976).
Example
A suffix tree for the strings ABAB and BABA is shown in a figure above. They are padded with the unique terminator strings $0 and $1. The numbers in the leaf nodes are string number and starting position. Notice how a left to right traversal of the leaf nodes corresponds to the sorted order of the suffixes. The terminators might be strings or unique single symbols. Edges on $ from the root are left out in this example.
Alternatives
An alternative to building a generalized suffix tree is to concatenate the strings, and build a regular suffix tree or suffix array for the resulting string. When hits are evaluated after a search, global positions are mapped into documents and local positions with some algorithm and/or data structure, such as a binary search in the starting/ending positions of the documents.
References
| book-title=Combinatorial Pattern Matching, Lecture Notes in Computer Science, 644. | book-title=Biotechnology Computing, Proceedings of the Twenty-Seventh Hawaii International Conference on. | orig-year = 1997
This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page.
Ask Mako anything about Generalized suffix tree — get instant answers, deeper analysis, and related topics.
Research with MakoFree with your Surf account
Create a free account to save articles, ask Mako questions, and organize your research.
Sign up freeThis content may have been generated or modified by AI. CloudSurf Software LLC is not responsible for the accuracy, completeness, or reliability of AI-generated content. Always verify important information from primary sources.
Report