browser
Contains the GenomeBrowser class
GenomeBrowser
GenomeBrowser (gff_path:str, genome_path:str=None, seq_id:str=None, init_pos:int=None, init_win:int=10000, bounds:tuple=None, max_interval:int=100000, show_seq:bool=True, search:bool=True, attributes:list=['gene', 'locus_tag', 'product'], feature_name:str='gene', feature_types:list=['CDS', 'repeat_region', 'ncRNA', 'rRNA', 'tRNA'], glyphs:dict=None, height:int=150, width:int=600, label_angle:int=45, label_font_size:str='10pt', feature_height:float=0.15, output_backend:str='webgl', **kwargs)
Initialize a GenomeBrowser object.
Parameter | Type | Default | Details
---|---|---|---
gff_path | str | | path to the GFF3 file of the annotations (also accepts gzip files)
genome_path | str | None | path to the FASTA file of the genome sequence
seq_id | str | None | id of the sequence to show for genomes with multiple contigs
init_pos | int | None | initial position to display
init_win | int | 10000 | initial window size (max=20000)
bounds | tuple | None | optional bounds of the region to load. This saves memory by not loading the whole genome when it is not needed.
max_interval | int | 100000 | maximum size of the field of view in bp
show_seq | bool | True | shows the sequence when zooming in
search | bool | True | enables a search bar
attributes | list | ['gene', 'locus_tag', 'product'] | list of attribute names from the GFF attributes column to be extracted
feature_name | str | gene | attribute to be displayed as the feature name
feature_types | list | ['CDS', 'repeat_region', 'ncRNA', 'rRNA', 'tRNA'] | list of feature types to display
glyphs | dict | None | dictionary defining the type and color of glyphs to display for each feature type
height | int | 150 | height of the annotation track
width | int | 600 | width of the inner frame of the browser
label_angle | int | 45 | angle of the feature names displayed on top of the features
label_font_size | str | 10pt | font size of the feature names
feature_height | float | 0.15 | fraction of the annotation track height occupied by the features
output_backend | str | webgl | can be "webgl" or "svg". webgl is more efficient, but svg is a vector format that can be conveniently modified using other software
kwargs | | |
Additional keyword arguments are passed as is to bokeh.plotting.figure
from genomenotebook.data import get_example_data_dir
import os
data_path = get_example_data_dir()
genome_path = os.path.join(data_path, "MG1655_U00096.fasta")
gff_path = os.path.join(data_path, "MG1655_U00096.gff3")
g = GenomeBrowser(genome_path=genome_path, gff_path=gff_path, bounds=(0,50000), width=600)
g.show()
#Providing GFF file as only input
g = GenomeBrowser(gff_path)
g.show()
#List available attributes
from genomenotebook.utils import available_attributes, available_feature_types
available_attributes(gff_path)
Index(['seq_id', 'source', 'type', 'start', 'end', 'score', 'strand', 'phase',
'attributes', 'Name', 'mobile_element_type', 'Is_circular',
'recombination_class', 'gbkey', 'protein_id', 'exception', 'pseudo',
'gene_synonym', 'orig_transcript_id', 'strain', 'part', 'gene',
'mol_type', 'transl_except', 'ID', 'substrain', 'genome', 'rpt_type',
'Note', 'Dbxref', 'product', 'transl_table', 'orig_protein_id',
'locus_tag', 'Parent', 'gene_biotype', 'left', 'right', 'middle'],
dtype='object')
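The attribute names listed above originate in the ninth column of the GFF3 file, where each feature carries `key=value` pairs separated by semicolons, and the feature types come from the third column. As a rough illustration, the sketch below collects both using only the standard library. `open_gff` and `summarize_gff` are hypothetical helper names, not part of genomenotebook, and this is not how `available_attributes` is implemented; it only assumes a standard tab-separated GFF3 layout, optionally gzip-compressed.

```python
import gzip
from collections import Counter
from urllib.parse import unquote


def open_gff(path):
    """Open a GFF3 file as text, handling gzip compression by file extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def summarize_gff(path):
    """Return (feature type counts, sorted attribute keys) for a GFF3 file.

    Hypothetical helper for illustration only.
    """
    type_counts = Counter()
    attribute_keys = set()
    with open_gff(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("##FASTA"):
                break  # an embedded sequence section ends the annotations
            if not line or line.startswith("#"):
                continue  # skip blank lines, directives and comments
            columns = line.split("\t")
            if len(columns) != 9:
                continue  # not a well-formed feature line
            type_counts[columns[2]] += 1  # column 3 holds the feature type
            for pair in columns[8].split(";"):  # column 9 holds the attributes
                key, sep, _value = pair.partition("=")
                if sep:
                    attribute_keys.add(unquote(key.strip()))
    return type_counts, sorted(attribute_keys)
```

Calling `summarize_gff(gff_path)` gives a quick way to decide what to pass as `attributes` and `feature_types`. The result should overlap with the list above, except for the fixed GFF column names (`seq_id`, `source`, `type`, and so on) and entries such as `left`, `right` and `middle`, which do not appear to be attribute keys.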
#Showing different attributes from the GFF file
g = GenomeBrowser(gff_path, attributes=["locus_tag","protein_id",'gene','product'], feature_name="protein_id")
g.show()
GenomeBrowser.add_track
GenomeBrowser.add_track (height:int=200, tools:str='xwheel_zoom, ywheel_zoom, pan, box_zoom, save, reset', **kwargs)
Adds a track to the GenomeBrowser. Ensures that the x_range is shared between tracks and that figure widths are identical.
Parameter | Type | Default | Details
---|---|---|---
height | int | 200 | size of the track
tools | str | xwheel_zoom, ywheel_zoom, pan, box_zoom, save, reset | comma-separated list of Bokeh tools that can be used to navigate the plot
kwargs | | |
Returns | Track | |
import numpy as np
import pandas as pd
data_path = get_example_data_dir()
genome_path = os.path.join(data_path, "MG1655_U00096.fasta")
gff_path = os.path.join(data_path, "MG1655_U00096.gff3")
data = pd.DataFrame(dict(x=np.arange(0,50000,100),
                         y=np.sin(np.arange(0,50000,100))))
g = GenomeBrowser(genome_path=genome_path, gff_path=gff_path, bounds=(0,5000), search=False, show_seq=False)
track = g.add_track(height=100)
track.scatter(data=data, pos="x", y="y")
g.show()
GenomeBrowser.highlight
GenomeBrowser.highlight (data:pandas.core.frame.DataFrame, left:str='left', right:str='right', color:str='color', alpha:str=0.2, hover_data:list=[], highlight_tracks:bool=False, **kwargs)
Parameter | Type | Default | Details
---|---|---|---
data | DataFrame | | pandas DataFrame containing the data
left | str | left | name of the column containing the start positions of the regions
right | str | right | name of the column containing the end positions of the regions
color | str | color | color of the regions
alpha | str | 0.2 | transparency
hover_data | list | [] | list of additional column names to be shown when hovering over the data
highlight_tracks | bool | False | whether to highlight just the annotation track or also the other tracks
kwargs | | |
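`highlight` expects one row per region, with columns for the start position, the end position and the color. When regions come from another analysis, it can help to normalise them before building the DataFrame. The sketch below is one possible way to do that with plain Python; `build_highlight_columns` is a hypothetical helper, not part of genomenotebook, and clipping to `bounds` is an assumption made here for tidiness rather than a documented requirement of `highlight`.

```python
def build_highlight_columns(regions, bounds=None):
    """Turn (start, stop, color) tuples into column lists for a DataFrame.

    regions: iterable of (start, stop, color) tuples, positions in bp.
    bounds:  optional (lower, upper) tuple; regions are clipped to it and
             regions lying entirely outside are dropped (an assumption of
             this sketch, not documented library behaviour).
    """
    columns = {"start": [], "stop": [], "color": []}
    for start, stop, color in regions:
        if start > stop:
            start, stop = stop, start  # tolerate reversed coordinates
        if bounds is not None:
            lower, upper = bounds
            if stop < lower or start > upper:
                continue  # no part of this region falls within the bounds
            start, stop = max(start, lower), min(stop, upper)
        columns["start"].append(start)
        columns["stop"].append(stop)
        columns["color"].append(color)
    return columns


columns = build_highlight_columns(
    [(5000, 6000, "red"), (8500, 8000, "green"), (20000, 21000, "blue")],
    bounds=(0, 10000),
)
print(columns)
# {'start': [5000, 8000], 'stop': [6000, 8500], 'color': ['red', 'green']}
```

The resulting dictionary can be wrapped with `pd.DataFrame(columns)` and passed as `data`, with `left="start"` and `right="stop"` naming the position columns.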
import pandas as pd
g = GenomeBrowser(gff_path=gff_path, genome_path=genome_path, bounds=(0,10000))
highlight_regions = pd.DataFrame({"start": [5000, 8000], "stop": [6000, 8500], "color": ["red","green"], "y":[23, 45]})
g.highlight(data=highlight_regions, left="start", right="stop", hover_data=["y"])
g.show()
data = pd.DataFrame(dict(x=np.arange(0,50000,100),
                         y=np.sin(np.arange(0,50000,100))))
g = GenomeBrowser(genome_path=genome_path, gff_path=gff_path, bounds=(0,5000), search=False, show_seq=False)
track = g.add_track(height=100)
track.scatter(data=data, pos="x", y="y")
highlight_regions = pd.DataFrame({"start": [2000, 4000], "stop": [3000, 4500], "color": ["red","green"], "y":[23, 45]})
g.highlight(data=highlight_regions, left="start", right="stop", hover_data=["y"], highlight_tracks=True)
g.show()
GenomeBrowser.save
GenomeBrowser.save (fname:str)
This function saves the plot as initially generated, not the current view of the browser. To save in svg format, you must initialise your GenomeBrowser using output_backend="svg".
Parameter | Type | Details
---|---|---
fname | str | path to file or a simple name (extensions are automatically added)
Saving to svg
Plots can only be saved to svg if you initialise your GenomeBrowser using output_backend="svg"
g = GenomeBrowser(gff_path=gff_path,
                  bounds=(0,5000),
                  output_backend="svg",
                  search=False)
track = g.add_track(height=100)
track.fig.scatter(x=np.arange(0,5000,100), y=np.sin(np.arange(0,5000,100)))
g.show()
g.save("test.svg")
Saving to png
g = GenomeBrowser(genome_path=genome_path,
                  gff_path=gff_path,
                  bounds=(0,5000),
                  search=False,
                  height=200,
                  width=2000,
                  label_font_size="20pt")
g.save("test.png")