docker run issue and pm module missing #1666

Open
vyomesh4bcOnco opened this issue Apr 25, 2024 · 9 comments

@vyomesh4bcOnco

Describe the issue

VEP (Variant Effect Predictor) is unable to locate and load the requested plugin, dbNSFP4.1. The warning says the plugin failed to compile because the Perl module dbNSFP4.pm could not be found in any of the expected locations.

The dbNSFP4.1 plugin appears to rely on the dbNSFP4.pm Perl module, which is missing or not installed in the expected directory. VEP looks for this module in the directories listed in Perl's @INC array, which include /usr/local/lib/x86_64-linux-gnu/perl/5.34.0 and /usr/share/perl/5.34.

To resolve this, the dbNSFP4.pm module needs to be correctly installed and accessible to VEP, either by installing the missing Perl module or by making it available in one of the directories listed in @INC.

Additional information

time sudo docker run -v $HOME/vep_data:/data ensemblorg/ensembl-vep vep --cache --offline --format vcf --vcf --force_overwrite --input_file test_samples.vcf --output_file test_samples_annot.txt --plugin dbNSFP4.1,dbNSFP4.1a.txt.gz --pick
WARNING: Failed to compile plugin dbNSFP4.1: Can't locate dbNSFP4.pm in @INC (you may need to install the dbNSFP4 module) (@INC contains: /plugins /opt/vep/src/ensembl-vep/modules /opt/vep/src/ensembl-vep /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.34.0 /usr/local/share/perl/5.34.0 /usr/lib/x86_64-linux-gnu/perl5/5.34 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.34 /usr/share/perl/5.34 /usr/local/lib/site_perl) at (eval 42) line 2.

Attempts to resolve:
We placed the dbNSFP4.pm module in the cache folder, and it is now working. Also, the directories listed in the warning (/plugins /opt/vep/src/ensembl-vep/modules /opt/vep/src/ensembl-vep /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.34.0 /usr/local/share/perl/5.34.0 /usr/lib/x86_64-linux-gnu/perl5/5.34 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.34 /usr/share/perl/5.34 /usr/local/lib/site_perl) cannot be found when we inspect the Docker image.

@likhitha-surapaneni likhitha-surapaneni self-assigned this Apr 26, 2024
@likhitha-surapaneni
Contributor

Hi @vyomesh4bcOnco,

The plugin name does not have versioning in it. The data can be downloaded according to the preferred version by following the docs here
You may need to download the data to a mounted directory in order to use it with docker by following the docs here
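
As a rough pre-flight check for the two points above, the sketch below extracts the plugin name from a `--plugin` argument and looks for a matching `.pm` file in a list of directories. `plugin_name` and `find_plugin_module` are hypothetical helper names, not part of VEP, and the directories searched are whatever you pass in.

```shell
# Hypothetical helpers, not part of VEP.

# Print the plugin name, i.e. everything before the first comma
# of a --plugin argument such as "dbNSFP,dbNSFP4.4a.txt.gz".
plugin_name() {
  printf '%s\n' "${1%%,*}"
}

# Print the path of <name>.pm from the first directory that contains it.
# Returns non-zero if none of the given directories has the file.
find_plugin_module() {
  local name="$1" dir
  shift
  for dir in "$@"; do
    if [ -f "$dir/$name.pm" ]; then
      printf '%s\n' "$dir/$name.pm"
      return 0
    fi
  done
  return 1
}
```

For example, `find_plugin_module "$(plugin_name 'dbNSFP,dbNSFP4.4a.txt.gz')" "$HOME/vep_data/Plugins"` prints the module path if `dbNSFP.pm` is present. The `Plugins` subdirectory is an assumption about where the plugin files were placed on the host.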

Kind regards,
Likhitha

@vyomesh4bcOnco
Author

vyomesh4bcOnco commented Apr 26, 2024

Hi @likhitha-surapaneni ,

I tried it without the version too. It does not run and still shows the .pm module error.
Everything is kept inside the mounted folder. However, the paths shown in the error cannot be found inside the Docker container.

@likhitha-surapaneni
Contributor

Hi @vyomesh4bcOnco,

Can you please let us know the error encountered when the plugin name has no version?
Can you please check if the dbNSFP data has been downloaded properly? The usual download link seems to be down; the file can also be downloaded from https://dbnsfp.s3.amazonaws.com/dbNSFP4.4a.zip.

Thanks and regards,
Likhitha

@vyomesh4bcOnco
Author

vyomesh4bcOnco commented Apr 27, 2024

Hi @likhitha-surapaneni ,

Can you please let us know the error encountered when the plugin name has no version?
-->
time sudo docker run -v $HOME/vep_data:/data ensemblorg/ensembl-vep vep --cache --offline --format vcf --vcf --force_overwrite --input_file test_samples.vcf --output_file test_samples_annot.txt --plugin dbNSFP,dbNSFP4.4a.txt.gz
[sudo] password for bioinfo-b:
WARNING: Failed to instantiate plugin dbNSFP: ERROR: No headers found before data

Same error for the Homo sapiens test VCF:
time sudo docker run -v $HOME/vep_data:/data ensemblorg/ensembl-vep vep --cache --offline --format vcf --vcf --force_overwrite --input_file homo_sapiens_GRCh37.vcf --output_file test_samples_annot.txt --plugin dbNSFP,dbNSFP4.4a.txt.gz
WARNING: Failed to instantiate plugin dbNSFP: ERROR: No headers found before data

Can you please check if the dbNSFP data has been downloaded properly?
Yes, there is no truncation or unexpected error. I tried with both versions, 4.1 and 4.4.

Attaching the Docker shell command used up until now to set up the annotation pipeline for your reference.
vep_cmd_vyomesh.txt

@likhitha-surapaneni
Contributor

likhitha-surapaneni commented Apr 29, 2024

Hi @vyomesh4bcOnco ,

Thank you for providing the details. It seems that the header of the dbNSFP file is not as expected. Can you please try these preprocessing steps instead and let us know if that fixes the issue?
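
A quick way to inspect the header before rerunning VEP is sketched below. It assumes the expected header is a first line starting with `#` (as in the raw dbNSFP files, whose header begins with `#chr`) and only checks the decompressed file itself, not the tabix index. `has_header_line` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the first line of a gzip/bgzip
# compressed file starts with '#', i.e. looks like a header line.
has_header_line() {
  local first
  # bgzip output is gzip-compatible, so gzip -dc can read it.
  first=$(gzip -dc -- "$1" 2>/dev/null | head -n 1)
  case "$first" in
    '#'*) return 0 ;;
    *)    return 1 ;;
  esac
}
```

Usage, with the file name taken from the commands above: `has_header_line "$HOME/vep_data/dbNSFP4.4a.txt.gz" || echo "no header line found"`.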

@vyomesh4bcOnco
Author

vyomesh4bcOnco commented May 1, 2024

Hi @likhitha-surapaneni,

I followed the preprocessing steps and it is running now. Thank you.

However, the INFO column is very densely packed, and when I expand it (text to columns) the values do not line up with the header. Is there any way to resolve this, i.e. how can we customize the INFO field to our requirements?

We do not want to include all of the fields, only the following:
Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|SYMBOL_SOURCE|HGNC_ID|CADD_phred|clinvar_clnsig
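
For context, in `--vcf` output VEP packs each annotation into the CSQ key of the INFO column as a `|`-separated string, and the field order is given by the `Format:` part of the CSQ header line. The sketch below pairs one such entry with its field names so the two can be compared side by side. `pair_csq` is a hypothetical helper; it expects a single CSQ entry, so a value with several comma-separated entries would need to be split first.

```shell
# Hypothetical helper: print NAME=VALUE lines for one CSQ entry.
#   $1: pipe-separated field names (from the CSQ "Format:" header)
#   $2: one pipe-separated CSQ entry
# Note: awk -v interprets backslash escapes in the passed strings.
pair_csq() {
  awk -v names="$1" -v values="$2" 'BEGIN {
    n = split(names, k, "[|]")
    split(values, v, "[|]")
    for (i = 1; i <= n; i++) printf "%s=%s\n", k[i], v[i]
  }'
}
```

For instance, `pair_csq 'Allele|Consequence|IMPACT' 'C|missense_variant|MODERATE'` prints one `name=value` line per field, with empty values left blank.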

@likhitha-surapaneni
Contributor

Hi @vyomesh4bcOnco ,

Thank you for your patience.

You can optionally use --fields [list](link) to configure the output format with a comma-separated list of fields.
For example: --vcf --fields "Allele,Consequence,Feature_type,Feature"
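
If the wanted fields are already written as a `|`-separated list, a small conversion gives the comma-separated form that `--fields` takes. This is only a sketch; `csq_to_fields` is a hypothetical helper name, not a VEP command.

```shell
# Hypothetical helper: turn "A|B|C" into "A,B,C" for use with --fields.
csq_to_fields() {
  printf '%s\n' "$1" | tr '|' ','
}
```

It could then be used as `--vcf --fields "$(csq_to_fields 'Allele|Consequence|IMPACT|SYMBOL')"` in the `vep` command line.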

Kind regards,
Likhitha

@vyomesh4bcOnco
Author

Hi @likhitha-surapaneni,
I appreciate your help in decoding the VEP process.

I have one more query:
CLIN_SIG and Clinvar_clnSig have different values. For example, a missense variant in ALK (chr2 29416572 T>C, c.4381A>G, p.Ile1461Val) is reported as likely_pathogenic&benign under CLIN_SIG but as benign under Clinvar_clnSig.

It would be helpful if you could clarify this discrepancy.

Regards,
Vyomesh

@dglemos
Contributor

dglemos commented Jun 7, 2024

Hi @vyomesh4bcOnco,
CLIN_SIG is the ClinVar clinical significance annotated by VEP; Clinvar_clnSig, however, is not provided by VEP. Did you run another variant annotation using a different tool?

The VEP clinical significance likely_pathogenic&benign matches the following submissions:
RCV000119976.11: benign
RCV000590065.1: benign
RCV000573143.2: benign
RCV000608829.21: benign
RCV001250949.1: likely pathogenic -> this submission has no criteria provided

You can check the phenotype data in this table: https://www.ensembl.org/Homo_sapiens/Variation/Phenotype?db=core;r=2:29193206-29194206;v=rs1670283;vdb=variation;vf=89497988
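
Since a combined value such as likely_pathogenic&benign lists several terms joined by `&`, it can help to split it into one term per line before comparing it with another annotation source. A minimal sketch, with `split_clin_sig` as a hypothetical helper name:

```shell
# Hypothetical helper: print each '&'-separated term on its own line.
split_clin_sig() {
  printf '%s\n' "$1" | tr '&' '\n'
}
```

For example, `split_clin_sig 'likely_pathogenic&benign' | grep -qx benign` succeeds when benign is one of the reported terms.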
