TALENT (Template bAsed Layout of molEcular NeTworks)
TALENT implements a template-based approach for laying out molecular networks. A molecular network is a hypergraph in which a set of biological entities/species (reactants) is transformed through a reaction into another set of biological entities/species (products). A reaction can also be modified (stimulated or inhibited) by another set of entities/species (modifiers). These biological entities can be of any type, such as RNA, DNA, small molecule, or protein, but they can also represent more general concepts such as a receptor or even a phenotype. The underlying representation of the hypergraph is typically a bipartite graph with reactions and species being the two sets of nodes. Although this means we are dealing with standard graphs, automatic layout of molecular networks brings several challenges that make the numerous existing graph drawing approaches unsuitable. TALENT approaches the problem in a novel way called template-based layout.
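As a minimal illustration of the bipartite representation described above, the sketch below expands one reaction hyperedge into edges between a reaction node and its species nodes. It is not TALENT's internal data model: the dictionary layout, the function name `to_bipartite_edges` and the example species are assumptions made for illustration only.

```python
def to_bipartite_edges(reactions):
    """Expand each reaction into (source, target, role) edges.

    Every edge joins one species node and one reaction node, so the
    result is a bipartite graph over those two node sets.
    """
    edges = []
    for reaction_id, parts in reactions.items():
        edges += [(s, reaction_id, "reactant") for s in parts.get("reactants", [])]
        edges += [(reaction_id, s, "product") for s in parts.get("products", [])]
        edges += [(s, reaction_id, "modifier") for s in parts.get("modifiers", [])]
    return edges


# Hypothetical example: a single reaction with two reactants, two products
# and one modifier.
reactions = {
    "r1": {
        "reactants": ["glucose", "ATP"],
        "products": ["G6P", "ADP"],
        "modifiers": ["hexokinase"],
    },
}

bipartite_edges = to_bipartite_edges(reactions)
for edge in bipartite_edges:
    print(edge)
```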
The aim of template-based layout is to find a suitable layout for a target network, for which only the topology is known, using a template network for which both topology and layout are available. In the first step, the target and template networks are converted into so-called reaction graphs (RGs), where reactions correspond to nodes that are connected if the respective reactions share a species. Moreover, in the case of the template graph, we work with a deduplicated version, which enables efficient handling of duplication. Next, a mapping between the target and template RGs is identified, resulting in a list of mapped, inserted and deleted reactions. The mapping is then used to transfer the layout from the template to the target. Specifically, the positions of mapped nodes can be used in the target layout, while positions for the inserted nodes need to be identified.
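The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy networks and the positions are hypothetical, and the mapping is simply handed in, whereas TALENT computes it from the two reaction graphs.

```python
from itertools import combinations


def reaction_graph(reactions):
    """Build an RG: nodes are reaction ids, and two reactions are
    connected whenever their species sets intersect."""
    edges = set()
    for (a, species_a), (b, species_b) in combinations(sorted(reactions.items()), 2):
        if set(species_a) & set(species_b):
            edges.add((a, b))
    return edges


def classify(target_nodes, template_nodes, mapping):
    """Split reactions into mapped, inserted and deleted, given a
    target -> template mapping."""
    mapped = {t: mapping[t] for t in target_nodes if t in mapping}
    inserted = sorted(set(target_nodes) - set(mapped))
    deleted = sorted(set(template_nodes) - set(mapped.values()))
    return mapped, inserted, deleted


# Hypothetical toy networks: reaction id -> species taking part in it.
target = {"t1": ["glucose", "G6P"], "t2": ["G6P", "F6P"], "t3": ["F6P", "F16BP"]}
template = {"m1": ["glucose", "G6P"], "m2": ["G6P", "F6P"], "m9": ["pyruvate", "lactate"]}

rg_target = reaction_graph(target)

# Assumed to be given here; in TALENT this is the result of the mapping step.
mapping = {"t1": "m1", "t2": "m2"}
mapped, inserted, deleted = classify(target, template, mapping)

# Mapped reactions inherit the template position; inserted ones still need one.
template_positions = {"m1": (0, 0), "m2": (100, 0), "m9": (300, 200)}
target_positions = {t: template_positions[m] for t, m in mapped.items()}
print(rg_target, inserted, deleted, target_positions)
```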
Below is an example of using TALENT to lay out the regulation of glycolysis pathway from Metabolism Regulation Maps (MRM).
(a) is the resulting layout, (b) is the target network laid out as seen in MRM and (c) is the template.
Note that the existing layout of the target (b) was not used when laying out the target network.
Getting Started
To get TALENT up and running, follow the install instructions below. The only dependency that is not installed is the MINERVA conversion API, which is used whenever a conversion between formats is needed (that is, every time TALENT is run). By default, TALENT uses MINERVA's public API at https://minerva-dev.lcsb.uni.lu/minerva/api/.
Prerequisites
- Python 3.6
- libspatialindex (required by Rtree)
- PyQt5 (in case you want to run TALENT GUI)
- virtualenv (in case you want to run TALENT in virtual environment)
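A quick way to check some of these prerequisites from Python is sketched below. The helper names are hypothetical, and the check assumes that Python 3.6 is a minimum rather than an exact requirement. Note that finding the `rtree` module does not prove that libspatialindex itself is installed, since the module is located but not imported.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 6)):
    """True if the running interpreter is at least `minimum` (assumed lower bound)."""
    return sys.version_info[:2] >= minimum


print("Python >= 3.6:", python_ok())
# PyQt5 is only needed for the GUI.
print("Missing modules:", missing_modules(["rtree", "PyQt5"]))
```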
Installing
- Install the libspatialindex library by running the following shell commands (Linux):
  wget http://download.osgeo.org/libspatialindex/spatialindex-src-1.8.5.tar.gz
  tar -xzf spatialindex-src-1.8.5.tar.gz
  cd spatialindex-src-1.8.5/
  ./configure; sudo make; sudo make install
  sudo ldconfig
- Download TALENT and enter the directory:
  git clone https://git-r3lab.uni.lu/david.hoksza/talent
  cd talent
- Create and activate the virtual environment (this step can be skipped if you want to install TALENT's dependencies globally):
  python3 -m venv env
  source env/bin/activate
- Install TALENT:
  pip3 install -r requirements.txt
- Deactivate the virtual environment, if you are using one, once you are done using TALENT:
  deactivate
Running a visualization
Arguments for each of the following utilities can be listed by running the utility with the -h option.
Generating layout for the MRM's glycolysis regulation pathway:
mkdir out
mkdir cache
python3 talent/utils/transfer.py -tgt data/example/F001-glycolysis-alt-SBGNv02.sbgn.sbml -tmp data/example/F001-glycolysis-alt-SBGNv02_p1_6.sbgn.sbml -o out/ -ddup-tmp true -ddup-tgt true -s settings.json -cache cache
The previous command reads in the target and template networks, deduplicates both of them, computes the semi-global edit distance, stores the result in the cache directory (so it does not have to be recomputed the next time the same graphs are mapped) and lays out the target.
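If the transfer is driven from another Python script, the same call can be assembled as an argument list. The sketch below only reuses the flags from the command above; the function names `transfer_command` and `run_transfer` are hypothetical wrappers and not part of TALENT.

```python
import subprocess
from pathlib import Path


def transfer_command(target, template, out_dir="out", cache_dir="cache",
                     settings="settings.json"):
    """Assemble the transfer.py argument list used in the example command."""
    return [
        "python3", "talent/utils/transfer.py",
        "-tgt", str(target),
        "-tmp", str(template),
        "-o", f"{out_dir}/",
        "-ddup-tmp", "true",
        "-ddup-tgt", "true",
        "-s", str(settings),
        "-cache", str(cache_dir),
    ]


def run_transfer(target, template, out_dir="out", cache_dir="cache"):
    """Create the output and cache directories, then run the transfer."""
    Path(out_dir).mkdir(exist_ok=True)
    Path(cache_dir).mkdir(exist_ok=True)
    subprocess.run(transfer_command(target, template, out_dir, cache_dir), check=True)


cmd = transfer_command(
    "data/example/F001-glycolysis-alt-SBGNv02.sbgn.sbml",
    "data/example/F001-glycolysis-alt-SBGNv02_p1_6.sbgn.sbml",
)
print(" ".join(cmd))
```

Calling `run_transfer` from the TALENT root directory would then execute the command; `check=True` makes a non-zero exit status raise an exception instead of passing silently.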
After running the command, you should see the predicted layout in the SBML, SBGN, CellDesigner SBML and SVG formats in the out directory, together with a .pkl file that contains the target layout with an empty beautification chain. The .pkl file uses Python's pickle serialization format, and it is unlikely that you will be able to load it with another instance of TALENT. In the future, we plan to implement a proper serialization which will be instance and OS independent.
The layout can be "beautified" by applying beautification operations (read the TALENT publication for further details).
You can apply a chain of beautification operations to the .pkl file by running:
pkl_file="F001-glycolysis-alt-SBGNv02.sbgn--from--F001-glycolysis-alt-SBGNv02_p1_6.sbgn-tgt-ddup_True_tmp-ddup_True.pkl"
python3 talent/utils/beautify.py -i out/${pkl_file} -o out/${pkl_file/.pkl/-b.pkl} -d "[{\"type\":\"ALIGN_TO_GRID\", \"params\":{\"spacing\":[40,40]}}]"
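The chain descriptor passed to -d is JSON, so the hand-escaped quotes can be avoided by building it with `json.dumps`. The following sketch assumes only the flags and the ALIGN_TO_GRID descriptor shown above; the function name `beautify_command` and the file name `out/layout.pkl` are hypothetical.

```python
import json


def beautify_command(pkl_path, operations):
    """Build the beautify.py argument list for a chain of operations."""
    # Same naming as the shell substitution above: first ".pkl" becomes "-b.pkl".
    out_path = pkl_path.replace(".pkl", "-b.pkl", 1)
    return [
        "python3", "talent/utils/beautify.py",
        "-i", pkl_path,
        "-o", out_path,
        "-d", json.dumps(operations),
    ]


operations = [{"type": "ALIGN_TO_GRID", "params": {"spacing": [40, 40]}}]
cmd = beautify_command("out/layout.pkl", operations)
print(cmd)
```

Passing the list to `subprocess.run` bypasses the shell, so no additional quoting of the JSON string is needed.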
To list all beautifications in a chain, run:
python3 talent/utils/beautify.py -l -i out/F001-glycolysis-alt-SBGNv02.sbgn--from--F001-glycolysis-alt-SBGNv02_p1_6.sbgn-tgt-ddup_True_tmp-ddup_True-b.pkl
To list the available beautification operations together with their description and available parameters run the following command:
TBD
Application of beautification operations results in a chain of layouts, each of which can be inspected with the export utility:
python3 talent/utils/export.py -i out/F001-glycolysis-alt-SBGNv02.sbgn--from--F001-glycolysis-alt-SBGNv02_p1_6.sbgn-tgt-ddup_True_tmp-ddup_True-b.pkl -f pdf -o out/out.pdf
The previous command will output the last layout in the chain to a PDF file. You can check any other layout in the chain by setting the -s option (see the export help for details).
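To make the idea of a beautification chain concrete, the sketch below models a chain as a list of layouts in which every operation appends a new layout and leaves the earlier ones inspectable. It is a conceptual illustration only: the data structures and function names are hypothetical, and the grid-snapping rule is an assumed reading of ALIGN_TO_GRID rather than TALENT's actual implementation.

```python
def align_to_grid(layout, spacing):
    """Snap every (x, y) position to the nearest grid point (assumed semantics)."""
    sx, sy = spacing
    return {node: (round(x / sx) * sx, round(y / sy) * sy)
            for node, (x, y) in layout.items()}


def apply_operation(chain, operation, **params):
    """Append the result of `operation` applied to the last layout.

    Earlier layouts are kept, so any step of the chain can still be inspected.
    """
    return chain + [operation(chain[-1], **params)]


# Chain with only the initial layout (an "empty beautification chain").
chain = [{"r1": (13.0, 58.0), "r2": (101.0, 79.0)}]
chain = apply_operation(chain, align_to_grid, spacing=(40, 40))

print(chain[-1])  # last layout in the chain
print(chain[0])   # original layout, still available
```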
GUI
TALENT also comes with a graphical user interface which can be used to lay out a target given a template, carry out beautification and inspect the beautification chain. However, please note that the GUI is very simplistic and was developed for testing the individual beautification operations and inspecting the beautification chains.
The GUI lets you specify the target and template networks and the output directory. The input file types are guessed; if the guess is wrong, the types can be specified manually. After the .pkl file is generated, it can be opened with File|Open and beautification operations can be applied.
The GUI needs to be started from the command line using:
python3 talent_gui.py
The GUI uses the same set of functions as the utilities above and thus all the messages describing the progress of the tasks are shown in the shell from which the GUI was started.
Documentation
TBD
Notes
- Handling modifications in SBML is probably MINERVA-specific. It expects the modifications to be in the layout:listOfAdditionalGraphicalObjects node. This node contains a layout:generalGlyph, which contains the bounding box of the object and a layout:listOfReferenceGlyphs with individual layout:referenceGlyph elements, each of which has the id of a glyph stored in layout:glyph. However, the specific value which is then displayed in the layout is stored via the multi SBML package and is referenced from the layout:generalGlyph by its layout:reference property. This links to multi:listOfSpeciesFeatures\multi:speciesFeature\multi:listOfSpeciesFeatureValues\multi:speciesFeatureValue\@id. Finally, multi:speciesFeatureValue has a multi:value property, which is a reference to multi:listOfSpeciesTypes\multi:speciesType\multi:listOfSpeciesFeatureTypes\multi:speciesFeatureType\multi:listOfPossibleSpeciesFeatureValues\multi:possibleSpeciesFeatureValue\@multi:id. This node has a multi:name property, which holds the actual value of the modification.
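The reference chain described in this note can be followed with Python's standard XML parser, as in the sketch below. The XML fragment is a hypothetical, heavily reduced stand-in for a real file (bounding boxes and reference glyphs are omitted), the ids and the value "phosphorylated" are invented, and the namespace URIs are assumed to match those declared in the actual document.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; check them against the file being processed.
NS = {
    "layout": "http://www.sbml.org/sbml/level3/version1/layout/version1",
    "multi": "http://www.sbml.org/sbml/level3/version1/multi/version1",
}

# Hypothetical fragment containing only the nodes named in the note.
SAMPLE = """
<root xmlns:layout="http://www.sbml.org/sbml/level3/version1/layout/version1"
      xmlns:multi="http://www.sbml.org/sbml/level3/version1/multi/version1">
  <multi:listOfSpeciesTypes><multi:speciesType>
    <multi:listOfSpeciesFeatureTypes><multi:speciesFeatureType>
      <multi:listOfPossibleSpeciesFeatureValues>
        <multi:possibleSpeciesFeatureValue multi:id="pv1" multi:name="phosphorylated"/>
      </multi:listOfPossibleSpeciesFeatureValues>
    </multi:speciesFeatureType></multi:listOfSpeciesFeatureTypes>
  </multi:speciesType></multi:listOfSpeciesTypes>
  <multi:listOfSpeciesFeatures><multi:speciesFeature>
    <multi:listOfSpeciesFeatureValues>
      <multi:speciesFeatureValue id="sfv1" multi:value="pv1"/>
    </multi:listOfSpeciesFeatureValues>
  </multi:speciesFeature></multi:listOfSpeciesFeatures>
  <layout:listOfAdditionalGraphicalObjects>
    <layout:generalGlyph layout:id="gg1" layout:reference="sfv1"/>
  </layout:listOfAdditionalGraphicalObjects>
</root>
"""


def qualified(prefix, name):
    """Attribute name in ElementTree's {uri}name form."""
    return "{%s}%s" % (NS[prefix], name)


def modification_values(root):
    """Map each generalGlyph reference to the modification name it resolves to."""
    # speciesFeatureValue id -> id of the possible value it points to
    feature_values = {e.get("id"): e.get(qualified("multi", "value"))
                      for e in root.iterfind(".//multi:speciesFeatureValue", NS)}
    # possibleSpeciesFeatureValue id -> displayed name
    possible = {e.get(qualified("multi", "id")): e.get(qualified("multi", "name"))
                for e in root.iterfind(".//multi:possibleSpeciesFeatureValue", NS)}
    resolved = {}
    for glyph in root.iterfind(".//layout:generalGlyph", NS):
        ref = glyph.get(qualified("layout", "reference"))
        resolved[ref] = possible.get(feature_values.get(ref))
    return resolved


resolved = modification_values(ET.fromstring(SAMPLE.strip()))
print(resolved)
```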
License
This project is licensed under the Apache 2.0 license, quoted below.
Copyright (c) 2019 David Hoksza
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.