Advanced Search

Complex queries can be made for more specific and customised searches, by entering terms and operators according to the simple query language understood by the search application. The language supports the use of wildcards, regular expressions, logical and fuzzy operators, proximity searches and grouping.

Field-specific queries

Searches can be confined to specific fields, i.e. those visible on the individual hub pages. The indexed fields are defined by the TrackHub specification; useful examples include species.scientific_name, assembly.accession, hub.shortLabel and hub.longLabel. To search a field, give its name in the query string, followed by a colon and the search term. If the search term consists of more than one word, the words must be grouped together with brackets, e.g.

species.scientific_name:(Zea mays)
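When query strings are assembled programmatically, the bracket rule above is easy to overlook. The following is a minimal Python sketch, assuming a hypothetical helper named field_query (it is not part of the search application) that wraps multi-word terms in brackets:

```python
def field_query(field: str, term: str) -> str:
    """Build a field-specific clause for the query string.

    Hypothetical helper for illustration only: multi-word terms are
    wrapped in brackets, single-word terms are left as they are.
    """
    term = term.strip()
    if len(term.split()) > 1:
        return f"{field}:({term})"
    return f"{field}:{term}"


print(field_query("species.scientific_name", "Zea mays"))
# species.scientific_name:(Zea mays)
print(field_query("hub.shortLabel", "rnaseq"))
# hub.shortLabel:rnaseq
```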

Use the Logical Operators described below to add further search terms as needed.

Wildcards

Sometimes it may be useful to match records based on a query pattern. Wildcard searches can be run on individual terms, using ? to replace a single character, and * to replace zero or more characters:

GRCh3? rna*

Be aware that wildcard queries, especially those with many terms, can use an enormous amount of memory and perform very badly.
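To make the meaning of ? and * concrete, the sketch below translates a wildcard pattern into a Python regular expression and tests it against single terms. It only illustrates the matching rule described above; it is not how the search application evaluates wildcards, and the function names are assumptions. Unlike the search application, which may normalise case, this sketch is case-sensitive.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate ? (exactly one character) and * (zero or more
    characters) into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            # Every other character must match literally.
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_matches(pattern: str, term: str) -> bool:
    """Return True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_matches("GRCh3?", "GRCh38"))  # True
print(wildcard_matches("GRCh3?", "GRCh3"))   # False: ? needs one character
print(wildcard_matches("rna*", "rnaseq"))    # True
print(wildcard_matches("rna*", "rna"))       # True: * may match nothing
```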

Regular Expressions

Regular expression patterns can be embedded in the query string by wrapping them in forward-slashes ("/"):

species.scientific_name:/dan?io (re[ri]o)/

The supported regex syntax is described on the Elasticsearch website.

Logical Operators

By default, all terms are optional, as long as one term matches. A search for foo bar baz will find any document that contains one or more of foo or bar or baz. Alternatively, all the familiar AND, OR and NOT operators (also written &&, || and !) can be used. Using AND instead of the default OR operator in the previous query would force all terms to be required.

Other boolean operators can be used in the query string itself to provide more control. The preferred operators are + (this term must be present) and - (this term must not be present). All other terms are optional. For example, this query:

homo sapiens +rnaseq -srna

states that:

  • rnaseq must be present
  • srna must not be present
  • homo and sapiens are optional — their presence increases the relevance
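The three rules above can be sketched in Python. This is a simplified model, not the search application's implementation: the document is reduced to a set of terms, the function name evaluate is hypothetical, and the count of matching optional terms is only a crude stand-in for the real relevance score.

```python
def evaluate(query: str, document_terms: set) -> tuple:
    """Apply + (must be present) and - (must not be present) to a set
    of document terms.

    Returns (matches, optional_hits); optional_hits is a rough proxy
    for relevance.
    """
    required, prohibited, optional = [], [], []
    for token in query.split():
        if token.startswith("+"):
            required.append(token[1:])
        elif token.startswith("-"):
            prohibited.append(token[1:])
        else:
            optional.append(token)

    if any(term not in document_terms for term in required):
        return (False, 0)
    if any(term in document_terms for term in prohibited):
        return (False, 0)

    hits = sum(term in document_terms for term in optional)
    # Without any required term, at least one optional term must match.
    if not required and hits == 0:
        return (False, 0)
    return (True, hits)


query = "homo sapiens +rnaseq -srna"
print(evaluate(query, {"homo", "sapiens", "rnaseq"}))  # (True, 2)
print(evaluate(query, {"rnaseq"}))                     # (True, 0)
print(evaluate(query, {"homo", "rnaseq", "srna"}))     # (False, 0)
print(evaluate(query, {"homo", "sapiens"}))            # (False, 0)
```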

When mixing operators, it is important to take their precedence into account: NOT takes precedence over AND, which takes precedence over OR. While + and - only affect the term to the right of the operator, AND and OR can affect the terms to both the left and the right.
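Python's own boolean operators follow the same order (not, then and, then or), so the precedence rule as stated here can be illustrated with plain booleans. This is only an analogy under that assumption; the function and argument names are hypothetical and the query homo OR rnaseq AND NOT srna is an invented example.

```python
def query_matches(has_homo: bool, has_rnaseq: bool, has_srna: bool) -> tuple:
    """Model the query: homo OR rnaseq AND NOT srna."""
    # Written without brackets, Python applies not > and > or.
    implicit = has_homo or has_rnaseq and not has_srna
    # The same expression with the precedence spelled out.
    explicit = has_homo or (has_rnaseq and (not has_srna))
    # A naive left-to-right reading, which gives a different query.
    left_to_right = (has_homo or has_rnaseq) and not has_srna
    return (implicit, explicit, left_to_right)


# A document containing homo and srna, but not rnaseq:
print(query_matches(True, False, True))  # (True, True, False)
```

The last value shows why brackets are worth adding whenever operators are mixed: the naive reading rejects a document that the precedence rule accepts.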

Grouping

Multiple terms or clauses can be grouped together with parentheses, to form sub-queries:

(rnaseq OR srna) AND homo

Fuzzy Operator

We can search for terms that are similar to, but not exactly like, our search terms, using the fuzzy operator:

hoom~ rnseq~ srmas~

This uses the Damerau-Levenshtein distance to find all terms with a maximum of two changes, where a change is the insertion, deletion or substitution of a single character, or transposition of two adjacent characters.

The default edit distance is 2, but an edit distance of 1 should be sufficient to catch 80% of all human misspellings. The edit distance can be specified after the ~ character:

grhc38~1
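The edit distance described above can be computed with a short dynamic-programming routine. The sketch below implements the optimal string alignment variant of the Damerau-Levenshtein distance, which counts insertions, deletions, substitutions and transpositions of adjacent characters. It is an illustration of the metric, not the search application's code, and the intended target terms (homo, rnaseq, grch38) are assumptions about what the misspelt examples were meant to be.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("hoom", "homo"))      # 1 (transposition)
print(osa_distance("rnseq", "rnaseq"))   # 1 (insertion)
print(osa_distance("grhc38", "grch38"))  # 1 (transposition)
```

Under these assumptions, grhc38~1 is within reach of grch38 because a single transposition separates the two terms.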

Proximity Searches

While a phrase query (e.g. "john smith") expects all of the terms in exactly the same order, a proximity query allows the specified words to be further apart or in a different order. In the same way that fuzzy queries can specify a maximum edit distance for characters in a word, a proximity search allows us to specify a maximum edit distance of words in a phrase:

"sapiens rnaseq"~5

The closer the text in a field is to the original order specified in the query string, the more relevant that document is considered to be. For the example query above, a field containing the exact phrase "sapiens rnaseq" would be considered more relevant than one in which other words separate sapiens and rnaseq.



Copyright © EMBL-EBI 2025.
