Passing individuals=[] to ts.write_vcf gives malformed haploid output #2446

Closed
jeromekelleher opened this issue Jul 28, 2022 · 0 comments
Labels
bug Something isn't working
@jeromekelleher
import msprime

ts = msprime.sim_ancestry(3, sequence_length=1e2, random_seed=1234)
ts = msprime.sim_mutations(ts, rate=0.01, random_seed=1234)
print(ts.as_vcf())

gives

##fileformat=VCFv4.2
##source=tskit 0.5.2.dev0
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1,length=100>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  tsk_0   tsk_1   tsk_2
1       2       0       C       T       .       PASS    .       GT      1|0     1|0     1|0
1       15      1       G       T       .       PASS    .       GT      1|0     0|0     0|0
1       19      2       A       T       .       PASS    .       GT      1|0     0|0     0|0
1       29      3       C       A       .       PASS    .       GT      1|0     1|0     1|0
1       35      4       T       C       .       PASS    .       GT      0|0     0|0     0|1
1       37      5       G       C       .       PASS    .       GT      0|1     0|1     0|0
1       56      6       C       G       .       PASS    .       GT      0|1     0|1     0|0
1       61      7       T       G       .       PASS    .       GT      0|0     0|0     1|0
1       71      8       C       G       .       PASS    .       GT      0|1     0|1     0|0
1       81      9       G       C       .       PASS    .       GT      1|0     0|0     0|0

But,

import msprime

ts = msprime.sim_ancestry(3, sequence_length=1e2, random_seed=1234)
ts = msprime.sim_mutations(ts, rate=0.01, random_seed=1234)
print(ts.as_vcf(individuals=[]))

gives

##fileformat=VCFv4.2
##source=tskit 0.5.2.dev0
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1,length=100>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  tsk_0   tsk_1   tsk_2tsk_3    tsk_4   tsk_5
1       2       0       C       T       .       PASS    .       GT      1       0       1    01       0
1       15      1       G       T       .       PASS    .       GT      1       0       0    00       0
1       19      2       A       T       .       PASS    .       GT      1       0       0    00       0
1       29      3       C       A       .       PASS    .       GT      1       0       1    01       0
1       35      4       T       C       .       PASS    .       GT      0       0       0    00       1
1       37      5       G       C       .       PASS    .       GT      0       1       0    10       0
1       56      6       C       G       .       PASS    .       GT      0       1       0    10       0
1       61      7       T       G       .       PASS    .       GT      0       0       0    01       0
1       71      8       C       G       .       PASS    .       GT      0       1       0    10       0
1       81      9       G       C       .       PASS    .       GT      1       0       0    00       0

Note that tsk_2 and tsk_3 are joined in the header without a separator, and the genotype rows show the same kind of fused columns.
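A quick way to catch this kind of malformed output is to count the sample columns on the header and on each record. The sketch below is only an illustration: `check_vcf_columns` is a hypothetical helper (not part of tskit), and it assumes the fields are tab-separated as the VCF format requires.

```python
def check_vcf_columns(vcf_text, expected_samples):
    """Return a list of problems found in the column layout of a VCF string.

    Hypothetical helper for illustration; assumes tab-separated fields.
    """
    fixed = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT
    problems = []
    header_seen = False
    for lineno, line in enumerate(vcf_text.splitlines(), start=1):
        # Skip blank lines and "##" meta-information lines.
        if not line or line.startswith("##"):
            continue
        fields = line.split("\t")
        n_samples = len(fields) - fixed
        if line.startswith("#CHROM"):
            header_seen = True
            kind = "header"
        else:
            kind = "record"
        if n_samples != expected_samples:
            problems.append(
                f"line {lineno}: {kind} has {n_samples} sample columns, "
                f"expected {expected_samples}"
            )
    if not header_seen:
        problems.append("no #CHROM header line found")
    return problems
```

With a real tree sequence, something like `check_vcf_columns(ts.as_vcf(individuals=[]), ts.num_samples)` would be the natural call, on the assumption that one column per sample node is the expected haploid layout.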

Ideally we'd probably output a "sites only" VCF in this case, but for now we can just raise an error for simplicity.
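Until that is handled inside tskit, a caller-side guard along these lines could reject the empty list up front. This is a minimal sketch, not the fix that was actually committed: `as_vcf_checked` is a hypothetical wrapper, and the choice of `ValueError` and the message text are assumptions.

```python
def as_vcf_checked(ts, individuals=None, **kwargs):
    """Hypothetical wrapper: refuse an empty individuals list before writing VCF."""
    # individuals=None keeps the default behaviour; only an explicitly
    # empty list (or empty array) is rejected.
    if individuals is not None and len(individuals) == 0:
        raise ValueError(
            "individuals must not be empty; sites-only VCF output is not supported"
        )
    return ts.as_vcf(individuals=individuals, **kwargs)
```

Calling `as_vcf_checked(ts, individuals=[])` then fails loudly instead of returning the malformed haploid output shown above.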

@jeromekelleher jeromekelleher added the bug Something isn't working label Jul 28, 2022
@jeromekelleher jeromekelleher added this to the Python 0.5.2 milestone Jul 28, 2022
jeromekelleher added a commit to jeromekelleher/tskit that referenced this issue Jul 28, 2022
jeromekelleher added a commit to jeromekelleher/tskit that referenced this issue Jul 28, 2022
@mergify mergify bot closed this as completed in 2d0d33e Jul 29, 2022