Old and new process graph examples (reduce, band math, labeled data) #43
It's an example and was not meant to refer to any real processes; I could just as well have named them a, b, c and d. I wanted to show how things work in theory, in some parts also with two callback parameters. The official source of truth is always the process specification files, not the examples. There the callback parameter names are different (usually Edit: I'll go through the API and specify less confusing examples. |
Perhaps it would also help to provide a less-trivial example. For instance, this EVI index:
|
Yes, we can do that. But indeed it seems like we should have a shortcut process to access the bands. filter_bands + array_element is quite cumbersome. Would that make things better? Back to your example, the formula for reference:
{
"getcol1": {
"process_id": "load_collection",
"arguments": {
"id": "Sentinel-2"
}
},
"filter1": {
"process_id": "filter_bands",
"arguments": {
"data": {
"from_node": "getcol1"
},
"bands": [
"NIR",
"RED",
"BLUE"
]
}
},
"reduce1": {
"process_id": "reduce",
"arguments": {
"data": {
"from_node": "filter1"
},
"dimension": "spectral",
"reducer": {
"callback": {
"nirband": {
"process_id": "array_element",
"arguments": {
"data": {
"from_argument": "data"
},
"index": 0
}
},
"redband": {
"process_id": "array_element",
"arguments": {
"data": {
"from_argument": "data"
},
"index": 1
}
},
"blueband": {
"process_id": "array_element",
"arguments": {
"data": {
"from_argument": "data"
},
"index": 2
}
},
"produc1": {
"process_id": "product",
"arguments": {
"data": [
6,
{
"from_node": "redband"
}
]
}
},
"produc2": {
"process_id": "product",
"arguments": {
"data": [
-7.5,
{
"from_node": "blueband"
}
]
}
},
"sum1": {
"process_id": "sum",
"arguments": {
"data": [
1,
{
"from_node": "nirband"
},
{
"from_node": "produc1"
},
{
"from_node": "produc2"
}
]
}
},
"substr1": {
"process_id": "subtract",
"arguments": {
"data": [
{
"from_node": "nirband"
},
{
"from_node": "redband"
}
]
}
},
"divide1": {
"process_id": "divide",
"arguments": {
"data": [
{
"from_node": "substr1"
},
{
"from_node": "sum1"
}
]
}
},
"produc3": {
"process_id": "product",
"arguments": {
"data": [
2.5,
{
"from_node": "divide1"
}
]
},
"result": true
}
}
}
}
},
"export1": {
"process_id": "export",
"arguments": {
"data": {
"from_node": "reduce1"
},
"format": "GTiff"
},
"result": true
}
}
Corresponding JS client code (not a very nice implementation yet):
var b = new ProcessGraphBuilder();
var collection = b.process("load_collection", {id: "Sentinel-2"});
// filter_bands fixes the band order, so the array indices below are well defined
var filteredBands = b.process("filter_bands", {data: collection, bands: ["NIR", "RED", "BLUE"]});
var evi = b.process("reduce", {
  data: filteredBands,
  dimension: "spectral",
  reducer: (builder, params) => {
    var nir = builder.process("array_element", {data: params.data, index: 0});
    var red = builder.process("array_element", {data: params.data, index: 1});
    var blue = builder.process("array_element", {data: params.data, index: 2});
    // 2.5 * (nir - red) / (1 + nir + 6 * red - 7.5 * blue)
    var result = builder.process("product", {
      data: [
        2.5,
        builder.process("divide", {
          data: [
            builder.process("subtract", {
              data: [nir, red]
            }),
            builder.process("sum", {
              data: [
                1,
                nir,
                builder.process("product", {
                  data: [6, red]
                }),
                builder.process("product", {
                  data: [-7.5, blue]
                })
              ]
            })
          ]
        })
      ]
    });
    return result;
  }
});
var result = b.process("export", {data: evi, format: "GTiff"});
var createdProcessGraph = b.generate(result); |
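To make the callback semantics concrete: read node by node, the reducer above computes 2.5 * (NIR - RED) / (1 + NIR + 6 * RED - 7.5 * BLUE) for each pixel. The following is a minimal sketch of how a backend might walk such a callback, resolving `from_node` and `from_argument` references recursively. The evaluator, its function names and the compact `eviCallback` object are all hypothetical and only illustrate the idea; they are not part of any openEO client or backend, and the process name `subtract` is assumed here.

```javascript
// Hypothetical evaluator for a reducer callback; illustration only.
// Each process receives its already-resolved arguments.
const processes = {
  array_element: (args) => args.data[args.index],
  product: (args) => args.data.reduce((a, b) => a * b, 1),
  sum: (args) => args.data.reduce((a, b) => a + b, 0),
  subtract: (args) => args.data.reduce((a, b) => a - b),
  divide: (args) => args.data.reduce((a, b) => a / b),
};

// Replace references by values; primitives pass through unchanged.
function resolve(value, callback, params, cache) {
  if (Array.isArray(value)) {
    return value.map((v) => resolve(v, callback, params, cache));
  }
  if (value !== null && typeof value === "object") {
    if ("from_node" in value) return evaluateNode(value.from_node, callback, params, cache);
    if ("from_argument" in value) return params[value.from_argument];
  }
  return value;
}

// Evaluate one node, caching the result so shared nodes run only once.
function evaluateNode(id, callback, params, cache) {
  if (!(id in cache)) {
    const node = callback[id];
    const args = {};
    for (const [name, value] of Object.entries(node.arguments)) {
      args[name] = resolve(value, callback, params, cache);
    }
    cache[id] = processes[node.process_id](args);
  }
  return cache[id];
}

// Start at the node flagged with "result": true.
function runCallback(callback, params) {
  const resultId = Object.keys(callback).find((id) => callback[id].result === true);
  return evaluateNode(resultId, callback, params, {});
}

// Compact copy of the EVI callback, with shortened (illustrative) node names.
const eviCallback = {
  nir: { process_id: "array_element", arguments: { data: { from_argument: "data" }, index: 0 } },
  red: { process_id: "array_element", arguments: { data: { from_argument: "data" }, index: 1 } },
  blue: { process_id: "array_element", arguments: { data: { from_argument: "data" }, index: 2 } },
  p1: { process_id: "product", arguments: { data: [6, { from_node: "red" }] } },
  p2: { process_id: "product", arguments: { data: [-7.5, { from_node: "blue" }] } },
  sum1: { process_id: "sum", arguments: { data: [1, { from_node: "nir" }, { from_node: "p1" }, { from_node: "p2" }] } },
  diff: { process_id: "subtract", arguments: { data: [{ from_node: "nir" }, { from_node: "red" }] } },
  div: { process_id: "divide", arguments: { data: [{ from_node: "diff" }, { from_node: "sum1" }] } },
  evi: { process_id: "product", arguments: { data: [2.5, { from_node: "div" }] }, result: true },
};

// One pixel with NIR = 3, RED = 1, BLUE = 0, i.e. 2.5 * 2 / 10
console.log(runCallback(eviCallback, { data: [3, 1, 0] }));
```

Note that the `data` arrays mix plain numbers with `from_node` references; the sketch treats both uniformly by resolving every array entry before calling the process.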
Thanks, this allows me to try this out in our client and backend implementation! |
I added the JS client code above. Does this "solution" actually make sense to you, @jdries? I'm sometimes still a bit confused between my old and the "new" understanding of everything... |
I guess it's basically the only possible solution with the current set of processes, which is better than having multiple options for the same problem. The fact that the data array can contain both primitive and complex objects is probably also not that explicitly described in the docs? |
This example doesn't contain complex objects (it's an array with three primitive values), and the data array is currently not meant to contain complex objects either. How did you come to the conclusion that it is possible?
I fully agree! What would be a good solution for it? We just added dimension types (one is |
Still, I'd like to point out that the JS client code above is just a very naive proof-of-concept implementation. The final product will be less verbose, so that users will be able to write code like this:
function eviReducer(b, data) {
  var nir = data.at(0);
  var red = data.at(1);
  var blue = data.at(2);
  // 2.5 * (nir - red) / (1 + nir + 6 * red - 7.5 * blue)
  return b.product([
    2.5,
    b.divide([
      b.subtract([nir, red]),
      b.sum([
        1,
        nir,
        b.product([6, red]),
        b.product([-7.5, blue])
      ])
    ])
  ]);
}
var builder = new ProcessGraphBuilder();
var collection = builder.load_collection("Sentinel-2");
var filteredBands = builder.filter_bands(collection, ["NIR", "RED", "BLUE"]);
var evi = builder.reduce(filteredBands, "spectral", eviReducer);
var result = builder.export(evi, "GTiff");
var createdProcessGraph = builder.generate(result);
That is possible with the 0.4 API and looks much better than my original example, I think. |
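One way such a terse syntax could be mapped onto the flat node structure of a callback is sketched below. This is only a sketch under assumptions: `buildCallback`, the generated node ids and the way `data.at()` is wired up are hypothetical and do not describe the actual JS client. The idea is that every call registers a node and returns a `from_node` reference, so nested calls serialize naturally.

```javascript
// Hypothetical sketch; not the openEO JS client API.
// Runs a reducer function once and records every process call as a node.
function buildCallback(reducer, processIds) {
  const nodes = {};
  let counter = 0;

  // Register a node and hand back a reference to it.
  const add = (processId, args) => {
    const id = `${processId}${++counter}`;
    nodes[id] = { process_id: processId, arguments: args };
    return { from_node: id };
  };

  // One method per process name, each taking a single `data` array.
  const b = {};
  for (const processId of processIds) {
    b[processId] = (data) => add(processId, { data });
  }

  // data.at(i) becomes an array_element node on the callback parameter.
  const data = {
    at: (index) => add("array_element", { data: { from_argument: "data" }, index }),
  };

  // The node returned by the reducer is flagged as the callback result.
  const result = reducer(b, data);
  nodes[result.from_node].result = true;
  return nodes;
}

// Same shape as the reducer in the thread; `subtract` is the name assumed here.
function eviReducer(b, data) {
  const nir = data.at(0);
  const red = data.at(1);
  const blue = data.at(2);
  return b.product([
    2.5,
    b.divide([
      b.subtract([nir, red]),
      b.sum([1, nir, b.product([6, red]), b.product([-7.5, blue])]),
    ]),
  ]);
}

const callback = buildCallback(eviReducer, ["product", "divide", "subtract", "sum"]);
console.log(JSON.stringify(callback, null, 2));
```

Because JavaScript evaluates arguments before the enclosing call, inner nodes receive lower counters than the nodes that reference them; the ids themselves carry no meaning beyond being unique.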
Also, I changed the examples in the documentation to a full and up-to-date EVI example, see https://open-eo.github.io/openeo-api/v/0.4.0/processgraphs/#example and also the OpenAPI spec. |
Moved to a later release; I think we are fine with this for a first version. Otherwise we would potentially need to change quite a lot, which we don't have the time for. So let's gain experience with this and evolve it in the next versions. |
I now support the EVI example in the Python client: On the Python side, I was able to make it look very simple. Of course, this relies on the assumption that the backend supports this style of process graph. |
Oh, that looks awesome in the Python client, but I'm concerned that it is based on too many assumptions and doesn't work in general. I'm interested to learn how that works. The process graphs are not meant to be consumed by users, so I don't see a general problem here. That said, we have the process-graph-to-model converter from the web editor, which we could release as a standalone app so that you could simply convert a process graph to a human-readable model (and vice versa). I think this could help with debugging. Which optimizations? |
@jdries I just had a look at the process graph: it doesn't have a filter_bands to order the bands, so array_element won't work as expected. You would probably work with the wrong bands. It would work without filter_bands if we specify that the bands are by default ordered as specified in the STAC metadata. Then you could simply pass the corresponding array index to array_element. Nevertheless, I'm currently not sure where to define such global definitions. By the way: You don't need to generate
Thanks! As you know, I also care a lot about developer-friendliness, as an open source project always relies on a healthy and sustainable community of developers. The whole 'my unreadable format is not meant for humans' argument is an OGC classic, and would mean we don't need something like STAC, because we already have Inspire Metadata. |
I'd assume the same, but we have not specified it anywhere and I'm not quite sure where to specify it. Maybe load_collection (+ load_results and ...?) would be the appropriate place?
It's not only the bands, but generally labeled data such as temporal or vertical dimensions, e.g. grouped levels or whatever "nominal" data. But in this version we don't have it yet, so I guess we should aim for that after the first version in May.
Fair enough to expect the processes to be available. The band order is an assumption and maybe more, but I still need to find the actual implementation.
I'd say: me, too. But maybe I have other priorities here (client users) ;-)
Primarily, STAC is not meant to be human-readable either, so I don't buy that argument. STAC is much simpler than INSPIRE in many regards, and who wants to mix XML and JSON? The problem with our process graphs is that we are working with JSON here, and that always gets quite verbose. If we really wanted it to be human-readable, we should remove that constraint and make something like the WCPS language. Having a band math process would make this use case simpler, but not all of the others. It would mean we need convenience processes for everything. So why not specify an EVI function similar to the NDVI function? That would make it simpler, too. (I'm not actually insisting that we do so.)
Could you give an example? Not quite sure yet what inline declarations are for you. |
We can indeed clarify in load_collection that the band order corresponds to the metadata, seems like a good place. As for the rest of the discussion, it's probably bringing us too far for 0.4.0, so let's get that out of the door first, and hope that other client/backend developers will join in with their experience. |
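If the band order is defined to follow the collection metadata, a client could derive the `array_element` index from a band name instead of hard-coding it. The following is a minimal sketch of that lookup; the function names and the plain list of band names are assumptions for illustration and do not reflect a specified metadata structure.

```javascript
// Hypothetical helper: `bandNames` is assumed to list the band names
// in the order in which the backend delivers them.
function bandIndex(bandNames, name) {
  const index = bandNames.indexOf(name);
  if (index === -1) {
    throw new Error(`Unknown band: ${name}`);
  }
  return index;
}

// Build the array_element node for a named band inside a reducer callback.
function arrayElementForBand(bandNames, name) {
  return {
    process_id: "array_element",
    arguments: {
      data: { from_argument: "data" },
      index: bandIndex(bandNames, name),
    },
  };
}

// Illustrative band list, not taken from a real collection.
const exampleBands = ["BLUE", "GREEN", "RED", "NIR"];
console.log(JSON.stringify(arrayElementForBand(exampleBands, "NIR")));
```

With such a convention the index only stays valid as long as no earlier process reorders or drops bands, which is why an explicit filter_bands makes the order unambiguous.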
Thanks. Some Python magic going on with operator overloading and so on, it seems?!
I'll do that:
I don't think that is allowed by the processes anyway, you'd need to resample and merge beforehand and then you can use all bands (in case the names don't collide - should we have a process to rename dimension values? See #50).
Would like to, waiting for #44 to be solved ... |
Is this solved? I don't have a clear to-do for now... |
In the new 0.4.0 process graph example, 'dimension_data' is used in some places, but this is not yet clearly documented (afaik).
I'm also a bit confused about the name, could it be that this name was chosen in the context of the initial confusion around data cubes?