trig export of multiple graphs assigns wrong prefixes to prefixedNames #679

Closed
engsterhold opened this issue Jan 5, 2017 · 2 comments
Labels: bug, serialization
Milestone: rdflib 4.2.2


@engsterhold

It seems that the trig export does not handle prefixes and named graphs well (or at all). Exporting a dataset with multiple graphs overwrites already assigned prefixes and, from one run to the next, produces different prefix assignments, wrong graph prefixes, or wrong prefixed names.

Test case:

from rdflib import URIRef, Literal, Graph, Dataset

def trig_export_test():
    graphs = [(URIRef("urn:tg1"),"A"), (URIRef("urn:tg2"), "B")]
    ds = Dataset()
    for i, n in graphs:
        g = Graph(identifier=i)
        a = URIRef("urn:{}#S".format(n))
        b = URIRef("urn:{}#p".format(n))
        c = Literal(chr(0xf23f1))
        d = Literal(chr(0x66))
        e = Literal(chr(0x23f2))
        g.add((a,b,c))
        g.add((a,b,d))
        g.add((a,b,e))
        ds.graph(g)

    for n,k in [ ("json-ld","jsonld"),
                 ("nquads", "nq"), ("trix", "trix"), ("trig", "trig")]:
        ds.serialize("ds.{}".format(k), format=n)

trig_export_test()

This can produce the following:

@prefix ns1: <urn:> .
@prefix ns2: <urn:B#> .
@prefix pns1: <urn:x-rdflib:> .
@prefix pns2: <urn:A#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

pns1:tg2 {
    pns2:S pns2:p "f",
            "⏲",
            "󲏱" .
}

pns1:tg1 {
    pns2:S pns2:p "f",
            "⏲",
            "󲏱" .
}

{}

or

@prefix ns1: <urn:x-rdflib:> .
@prefix ns2: <urn:A#> .
@prefix pns1: <urn:> .
@prefix pns2: <urn:B#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

{}

pns1:tg1 {
    pns2:S pns2:p "f",
            "⏲",
            "󲏱" .
}

pns1:tg2 {
    pns2:S pns2:p "f",
            "⏲",
            "󲏱" .
}

Both outputs were produced by the same function.

@jbaublitz

I wanted to add that I see this behavior even when parsing a dataset with a single graph in nquads format and serializing it as trig, with no special characters involved. This seems to be a problem with serializing named graphs as trig even when the dataset contains only one named graph. I can also verify that the same parsed nquads data serializes back to nquads correctly, so it does not appear to be a data corruption issue in the Graph data structure.

gromgull added the bug and serialization labels on Jan 12, 2017
gromgull added this to the rdflib 4.2.2 milestone on Jan 12, 2017
gromgull added a commit that referenced this issue Jan 24, 2017
raise exception when trying to rebind a prefix to another ns.
fix broken rebinding when generating prefixes

This fixes #679 - but actually it's more like a work-around. The
underlying problem is confusion about context and graph objects (#167)
@gromgull
Member

I've fixed the broken trig output, but it remains ugly.

The underlying issue is #698 - since you do:

g = Graph(identifier=i)
[ ... add data ... ]
ds.graph(g)

you end up with a Graph instance with its own namespace manager. So each subgraph generates its own independent space of prefix mappings during trig serialisation.
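
One way to picture why independent prefix spaces clash is the toy model below. It is not rdflib's code: `ToyNamespaceManager` and its `prefix` method are hypothetical names invented for this sketch, and the `ns1`, `ns2`, ... naming scheme is only assumed to resemble the generated prefixes in the output above.

```python
class ToyNamespaceManager:
    """Hypothetical stand-in: hands out ns1, ns2, ... for unseen namespaces."""

    def __init__(self):
        self.prefix_for = {}

    def prefix(self, namespace):
        # Generate the next free prefix the first time a namespace is seen.
        if namespace not in self.prefix_for:
            self.prefix_for[namespace] = "ns{}".format(len(self.prefix_for) + 1)
        return self.prefix_for[namespace]


# One manager per graph: each starts counting from ns1 on its own.
per_graph = {"urn:tg1": ToyNamespaceManager(), "urn:tg2": ToyNamespaceManager()}
prefix_a = per_graph["urn:tg1"].prefix("urn:A#")
prefix_b = per_graph["urn:tg2"].prefix("urn:B#")

# One manager shared by all graphs: every namespace gets a distinct prefix.
shared = ToyNamespaceManager()
shared_a = shared.prefix("urn:A#")
shared_b = shared.prefix("urn:B#")
```

In this model the per-graph managers give both `urn:A#` and `urn:B#` the same prefix, so a single `@prefix` header for the whole file cannot be right for both graphs, while the shared manager keeps them apart.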

It's actually much worse than this - these graphs do not share a store. So if you tried to persist the dataset, nothing would work.

It's our fault: passing a Graph instance to Dataset.graph should probably raise an exception.

Change the code to:

g = ds.graph(i)
[... add data ...]

and all is good.
