Managing Track Hubs with the Registry - An Overview
A track data hub consists of a set of files (typically hub.txt and genomes.txt) that describe the hub's structure and content, plus one or more directories holding genome-assembly-specific data (bigBed, bigWig and similar files) together with their configuration, referred to as track databases. The configuration of a track database is described in a file, usually named trackDb.txt, which defines the location of the binary indexed track data for its assembly and the directives controlling how these data are displayed in a genome browser.
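To make the file layout above more concrete, the following minimal Python sketch reads a genomes.txt-style file and maps each assembly to the location of its trackDb file. It assumes the usual stanza layout of hub files (blank-line-separated blocks of "key value" lines); the sample content, the parse_stanzas helper and the paths are illustrative assumptions, and the sketch does not cover every detail of the real format.

```python
def parse_stanzas(text):
    """Split hub-style text into stanzas.

    A stanza is a block of 'key value' lines; stanzas are separated by
    blank lines. Lines starting with '#' are treated as comments.
    """
    stanzas, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the stanza collected so far, if any.
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue
        # The first space separates the setting name from its value.
        key, _, value = line.partition(" ")
        current[key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


# Hypothetical genomes.txt content for a hub covering two assemblies.
GENOMES_TXT = """\
genome hg38
trackDb hg38/trackDb.txt

genome mm10
trackDb mm10/trackDb.txt
"""

genomes = parse_stanzas(GENOMES_TXT)

# One trackDb file per assembly: this is the unit a browser actually needs.
trackdb_paths = {g["genome"]: g["trackDb"] for g in genomes}
```

Each entry of trackdb_paths points at the configuration for a single assembly, which is why a client interested in one assembly does not need to read the rest of the hub.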
The Registry is designed to let users discover track hubs of interest and load them easily into a genome browser. Its purpose is therefore not to store a complete representation of a track hub, but to provide a compact representation, enriched with metadata to support search, that is small enough to be sent to a genome browser easily and simple for the browser to parse, thus overcoming some of the challenges of parsing track data hubs directly.
A user is generally interested in displaying data for a particular assembly of a particular species, and a browser needs to do only that, without dealing with every assembly managed by a given hub. From this perspective, the unit of information modelled by the Registry is the set of track database (trackDb) settings in a track hub's trackDb.txt file, which specify display and configuration options for the data pertaining to a given genome assembly. This is enough for the browser to organise the display, since the data referred to in the trackDb settings reside at the original hub URL or at another remote location. Therefore, this is the only information stored in the Registry.
The Registry stores the track database settings for an assembly in a JSON document, with metadata attributes and a simple tree-based structure that keeps parsing straightforward even for complex track organisations. Some of the document attributes identify the hub the trackDb belongs to, so that a track data hub is implicitly represented by the set of trackDb documents referring to it.
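As a rough illustration of this model, the Python sketch below builds two trackDb-like documents, walks the track tree of one of them, and reconstructs a hub from the documents that refer to it. All attribute names used here ("hub", "assembly", "configuration", "members") and all values are hypothetical placeholders chosen for the example; the authoritative structure is the one defined by the trackDb JSON schema.

```python
from collections import defaultdict

# Two hypothetical trackDb documents belonging to the same hub,
# one per genome assembly. Field names are illustrative only.
docs = [
    {
        "hub": {"name": "example_hub"},
        "assembly": {"name": "GRCh38"},
        "configuration": {
            "rnaseq": {
                "shortLabel": "RNA-seq",
                "members": {
                    "rnaseq_liver": {"bigDataUrl": "http://example.org/liver.bw"},
                    "rnaseq_brain": {"bigDataUrl": "http://example.org/brain.bw"},
                },
            },
            "genes": {"bigDataUrl": "http://example.org/genes.bb"},
        },
    },
    {
        "hub": {"name": "example_hub"},
        "assembly": {"name": "GRCm38"},
        "configuration": {
            "genes": {"bigDataUrl": "http://example.org/mouse_genes.bb"},
        },
    },
]


def leaf_tracks(nodes):
    """Yield the names of leaf tracks by walking the tree depth-first."""
    for name, node in nodes.items():
        members = node.get("members")
        if members:
            # A grouping node: descend into its children.
            yield from leaf_tracks(members)
        else:
            yield name


# Tracks available for the first assembly, regardless of nesting depth.
human_tracks = list(leaf_tracks(docs[0]["configuration"]))

# A hub is never stored as such: it is recovered by grouping the
# documents whose hub attributes match.
assemblies_by_hub = defaultdict(list)
for doc in docs:
    assemblies_by_hub[doc["hub"]["name"]].append(doc["assembly"]["name"])
```

Because every document carries its own hub attributes, a client can work with a single assembly in isolation and still recover the hub-level view by grouping when it needs to.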
The following section presents an interactive diagram of the trackDb JSON schema against which all trackDbs submitted to the Registry are validated (see the description of the submission process for details). To access the original schema document, click here.