computeMatrix using deepblue doesn't get data for beforeRegionStartLength/afterRegionStartLength #700

Closed
jayoung opened this issue May 4, 2018 · 5 comments

jayoung commented May 4, 2018

Hi there,

I've recently started using deepTools - I love it. Thanks for all your work on this. I also started playing with using ENCODE data via deepBlue - also a really great feature.

I think I found a bug, and it's probably an easy one to fix. I should say upfront that I'm using version 3.0.1 and have not tested 3.0.2. Apologies for that - I'm Python-naive and am reliant on my sysadmins to install updates - but I don't see anything related in the changelog, so I thought I'd report it.

It seems that the bigWig file computeMatrix obtains via deepBlue when I give it an ENCODE ID contains coverage data only for the regions themselves, not for the flanking regions I specified. It looks like the beforeRegionStartLength/afterRegionStartLength options aren't applied before the bigWig file is requested, only later when the matrix is computed.
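
For illustration only, here is a minimal Python sketch of the behaviour described above as expected: each region is widened around its center by the before/after lengths before any coverage is requested. The helper name `expand_around_center` and the sample regions are hypothetical; this is not deepTools' or deepBlue's actual code, and it assumes 0-based, half-open BED coordinates, ignores strand, and does not clip at chromosome ends.

```python
def expand_around_center(regions, before, after):
    """Yield (chrom, start, end) windows covering `before` bp upstream and
    `after` bp downstream of each region's center.

    Hypothetical helper for illustration: coordinates are assumed to be
    0-based and half-open (BED style); strand is ignored and windows are
    clipped at 0 but not at the chromosome end.
    """
    for chrom, start, end in regions:
        center = (start + end) // 2
        yield chrom, max(0, center - before), center + after


# Made-up regions, not taken from Whyte_TypicalEnhancers_ESC.bed.
regions = [("chr1", 10_000, 10_600), ("chr2", 500, 900)]

# Mirrors "--referencePoint center -b 2000 -a 2000".
windows = list(expand_around_center(regions, before=2000, after=2000))
for window in windows:
    print(window)
```

If coverage were requested for these widened windows rather than for the original regions, the flanks would have data available when the matrix is built.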

I started with an example like your mouse enhancers plot from the gallery (http://deeptools.readthedocs.io/en/latest/content/example_gallery.html#dnase-accessibility-at-enhancers-in-murine-es-cells). I wasn't sure of the right ENCODE ID to use, so I'm probably using a different (but very similar) set of raw data.

First, a sanity check: I get output very similar to yours if I manually download the ENCODE bigWig file.

  1. I download a bigWig file for ENCODE data ENCFF001OIK
    https://www.encodeproject.org/files/ENCFF001OIK/@@download/ENCFF001OIK.bigWig
    and I call it ENCFF001OIK.downloaded.bigWig

  2. I download your example bed file Whyte_TypicalEnhancers_ESC.bed

  3. I run computeMatrix+plotHeatmap and I get a very similar plot to yours:

computeMatrix reference-point -S ENCFF001OIK.downloaded.bigWig -R Whyte_TypicalEnhancers_ESC.bed --referencePoint center -a 2000 -b 2000 -out matrix_Enhancers_DNase_ESC.tab.gz --outFileNameMatrix matrix_Enhancers_DNase_ESC.tab.txt
plotHeatmap -m matrix_Enhancers_DNase_ESC.tab.gz -out matrix_Enhancers_DNase_ESC.heatmap1.png 

[Image: matrix_enhancers_dnase_esc heatmap1]

Next, I try using deepBlue to get the same thing. I keep the temp bigWig file so I can take a look at it, and I also write the matrix out in a format I can inspect:

computeMatrix reference-point -S ENCFF001OIK -R Whyte_TypicalEnhancers_ESC.bed --referencePoint center -a 2000 -b 2000 -out matrix_Enhancers_DNase_ESC_deepBlue.tab.gz --deepBlueTempDir /fh/scratch/delete30/malik_h --deepBlueKeepTemp --outFileNameMatrix matrix_Enhancers_DNase_ESC_deepBlue.tab.txt
plotHeatmap -m matrix_Enhancers_DNase_ESC_deepBlue.tab.gz -out matrix_Enhancers_DNase_ESC_deepBlue.heatmap1.png 

[Image: matrix_enhancers_dnase_esc_deepblue heatmap1]

The plot shows coverage signal within the enhancer regions themselves, but the flanking 2000 bp now show missing data (swathes of black). Examining the bigWig file and the text-format matrix, I see that both contain data only within the regions, not in the flanks.
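
One way to quantify the missing flanks, sketched in Python under stated assumptions: count the fraction of NaN values in each column (bin) of the text matrix. The function name `nan_fraction_per_column` is hypothetical, and the sketch assumes the file holds tab-separated numeric rows, with any `#` comment lines or non-numeric label lines skipped; the exact layout written by --outFileNameMatrix may differ.

```python
import math


def nan_fraction_per_column(lines):
    """Return the fraction of NaN entries in each column.

    Assumption: rows are tab-separated numbers of equal length. Blank
    lines, lines starting with '#', and lines that do not parse as
    numbers (e.g. a label row) are skipped.
    """
    counts = None
    n_rows = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        try:
            values = [float(token) for token in line.split("\t")]
        except ValueError:
            continue  # non-numeric row, treated as a header
        if counts is None:
            counts = [0] * len(values)
        for i, value in enumerate(values):
            if math.isnan(value):
                counts[i] += 1
        n_rows += 1
    return [c / n_rows for c in counts] if n_rows else []


# Made-up rows for demonstration; real input would come from something like
# open("matrix_Enhancers_DNase_ESC_deepBlue.tab.txt").
sample = [
    "#hypothetical header line",
    "nan\t1.0\t2.0\tnan",
    "nan\t0.5\tnan\tnan",
]
fractions = nan_fraction_per_column(sample)
print(fractions)
```

Columns far from the reference point with a NaN fraction near 1 would match the black swathes in the heatmap, while the matrix built from the downloaded bigWig should show far fewer missing values in the same columns.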

Does this reproduce with v 3.0.2? I'm guessing this will be a simple fix? Hope so!

thanks again,

Janet Young


Dr. Janet Young

Malik lab
http://research.fhcrc.org/malik/en.html

Division of Basic Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., A2-025,
P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 4512
email: jayoung ...at... fredhutch.org


Collaborator

dpryan79 commented May 4, 2018 via email

dpryan79 self-assigned this May 4, 2018
dpryan79 added this to the 3.1.0 milestone May 4, 2018
Author

jayoung commented May 9, 2018

Thanks Devon. Yes, I've been downloading files: that works fine. Hope it's easy to fix!

dpryan79 added a commit that referenced this issue Jun 8, 2018
@dpryan79
Collaborator

I made an embarrassingly simple mistake in the DeepBlue code that seems to be causing this. I'm testing now if there are any other issues and, if not, I'll merge the fix in and release version 3.1.0.

@dpryan79
Collaborator

This is now finally fixed in the develop branch and will work properly in the 3.1.0 release. In addition to the silly typos that caused the main issue, I also had an off-by-one error in the code, so all of the values were shifted by one nucleotide.

Author

jayoung commented Jul 17, 2018

Thanks for fixing this - appreciate it.
