Filtering terms in model: Confusing `/entryType/filteringTerms.json` files #93

mbaudis · 2023-06-19T15:01:18Z

FilteringTerms and filters are still confusing. One source of confusion is the filteringTerms.json files (e.g. in biosamples), which are supposedly placeholders for information about the filtering terms available for the entry type but do not constitute an endpoint. They therefore seem to be for internal use only (and would most probably be kept in a database or generated on the fly anyway).

Proposals:

Delete filteringTerms.yaml / .json.
Optionally add a /filtering_terms endpoint for each entity where filters apply, e.g. /biosamples/filtering_terms/, in addition to /filtering_terms, to list all terms applying to the given scope.

redmitry · 2023-06-19T16:39:56Z

I would rather add

"scope": {
    "type": "array",
    "items": {
        "type": "string"
    }
}

property to beaconFilteringTermsResults.json.

That way we would define the entryTypes for each filter and would not need many endpoints.
The /filtering_terms endpoint would then return entries like:

{
    "id": "LOINC:3141-9",
    "label": "Weight",
    "type": "alphanumeric",
    "scope": ["individual", "biosamples", ...]
}

I think this would simplify both implementation and usage.
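To illustrate how a single /filtering_terms listing with a per-term "scope" could stand in for per-entity endpoints, here is a minimal sketch. The `FILTERING_TERMS` list, the `terms_for_scope` helper, the second term and the entry type names are all hypothetical and only serve the illustration; this is not taken from any Beacon implementation.

```python
# Hypothetical in-memory list of filtering terms. The first entry follows the
# example above; the second is invented purely for illustration.
FILTERING_TERMS = [
    {"id": "LOINC:3141-9", "label": "Weight", "type": "alphanumeric",
     "scope": ["individuals", "biosamples"]},
    {"id": "EXAMPLE:0001", "label": "Example term", "type": "ontology",
     "scope": ["cohorts"]},
]


def terms_for_scope(terms, entry_type=None):
    """Return all terms, or only those whose "scope" lists entry_type."""
    if entry_type is None:
        return list(terms)
    # Terms without a "scope" are skipped here; treating them as applying
    # everywhere would be an equally plausible choice.
    return [t for t in terms if entry_type in t.get("scope", [])]


if __name__ == "__main__":
    for term in terms_for_scope(FILTERING_TERMS, "biosamples"):
        print(term["id"], term["scope"])
```

A client could apply the same filter locally to the response of the one endpoint, which is the simplification suggested here.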

Best,

Dmitry

mbaudis · 2023-06-20T10:33:03Z

@redmitry Yes, IMO also a good option (actually my preferred one). But the form in which to do this is not clear because there are some open questions:

Can there be several scopes (and why would there)?
Does the scope apply to the object in the model (i.e. the filter matches a parameter there) or to the entry type requested? Not the same; the model definition would just be a hint for where this is processed; the entry type use could mean that the filter could be applied somewhere else to filter the entry type indirectly (e.g. variants for a biosample parameter).
Do we want to hint at the precise scope, i.e. the parameter in the entry type's schema in the default model? This seems like a good option but this could be confusing for implementers and also would be problematic for alternative models etc.
Also: Do we need/want a scope in the filter's query part at all? One instance comes to my mind (DUO codes) but this may be a bit constructed...

For me in summary I'd rather see a flexible use where the terms in the beaconFilteringTermsResults have some (optional) information about the scope they apply to, potentially w/ parameter (e.g. biosamples.histologicalDiagnosis) but more for informational purposes. So a solution like above would be one option.

There had also been some discussion at #79 (comment) (w/o final resolution).

costero-e · 2023-06-20T12:26:29Z

I also like the approach that @redmitry has taken with filtering terms. I think it fits the idea of filters we are applying in Beacon better, and it would help users find more quickly what they can filter and where. Just one observation: wouldn't the scope be better defined as an object than as an array, like this:

"scope": {
    "type": "object",
    "items": {
        "type": "string"
    }
}

On the other hand, regarding @mbaudis's observations, I agree that giving a scope hint can be problematic if we think of alternative models, but as you said we can make it optional, not required. I can also think of some examples (in fact we have them in our reference implementation) where an ontology applies to two different scopes, like the ethnicity/disease/sex ontologies in individuals and cohorts.

mbaudis · 2023-06-20T12:45:11Z

@costero-e The

type: array
items: string

...is correct if you reference the entity names. If you reference ontologies for the entities you'd use items: object or better

items:
  $ref: "../common..."

mbaudis · 2023-06-20T12:49:03Z

Good note about the examples (sex, ethnicities...). Another argument for having entities in the filteringTerms - but not for queries since you'd either query cohorts or individuals. I think the use of query aggregation here across entry types (variants for a certain diagnosis, biosamples from male subjects ...) does not interfere for these examples since there is a difference between collection schemas and data records.

redmitry · 2023-06-20T12:50:04Z

Can there be several scopes (and why would there)?

The filters are to be applied to particular entry types, so IMO it's reasonable to enumerate (limit) their scopes.

Does the scope apply to the object in the model (i.e. the filter matches a parameter there) or to the entry type requested?

IMO, the scope defines the entry type the filter applies to, not the parameter. The parameter in a particular entry type may differ between implementations (mongo, sql, omop, etc.).
The reference implementation uses a "scope" attribute in filtering_terms to provide the "mongodb" parameter (see issue #79).
Again, at the moment, there is no "scope" parameter in filtering terms.
Alphanumeric filters may require more complex solutions than just a path to the property. For instance, here is a filter from the Java implementation:

{
    "id": "LOINC:3141-9",
    "label": "Weight",
    "type": "alphanumeric",
    "query": "{'measures': {'$elemMatch': {'assayCode.id': '$$id', 'measurementValue.value': {$$operator: $$value}}}}"
}

Note that "query" is filtered out when the term is exposed by "filtering_terms" (we just use the same format for configuration).
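A rough sketch of how such a configuration entry might be used: the placeholders `$$id`, `$$operator` and `$$value` are filled in by plain text substitution, and the "query" key is dropped before the term is exposed. The thread does not show how the Java implementation actually performs the substitution, so `render_query`, `public_view` and the `OPERATORS` mapping are assumptions made for illustration only.

```python
# Query template from the example above, written with balanced braces.
TEMPLATE = ("{'measures': {'$elemMatch': {'assayCode.id': '$$id', "
            "'measurementValue.value': {$$operator: $$value}}}}")

# Hypothetical mapping from request operators to MongoDB operators.
OPERATORS = {"=": "'$eq'", "<": "'$lt'", ">": "'$gt'",
             "<=": "'$lte'", ">=": "'$gte'", "!": "'$ne'"}


def render_query(template, term_id, operator, value):
    """Fill the $$ placeholders by textual replacement (no escaping!)."""
    return (template
            .replace("$$id", term_id)
            .replace("$$operator", OPERATORS[operator])
            .replace("$$value", str(value)))


def public_view(term):
    """Copy of a configured term without the internal "query" key."""
    return {k: v for k, v in term.items() if k != "query"}


if __name__ == "__main__":
    term = {"id": "LOINC:3141-9", "label": "Weight",
            "type": "alphanumeric", "query": TEMPLATE}
    print(render_query(term["query"], term["id"], ">", 70))
    print(public_view(term))
```

Plain string replacement is only safe for trusted, validated input; a real implementation would presumably build the query structurally or escape the values.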

Not the same; the model definition would just be a hint for where this is processed; the entry type use could mean that the filter could be applied somewhere else to filter the entry type indirectly (e.g. variants for a biosample parameter).

Well, the idea is that the "scope" defined in the query corresponds to one of the "scope"s defined in filtering_terms.

Do we want to hint at the precise scope, i.e. the parameter in the entry type's schema in the default model? This seems like a good option but this could be confusing for implementers and also would be problematic for alternative models etc.

Parameters in the entry should be out of scope, IMO. A Beacon implementation may generate beacons dynamically and use other query languages or whatever.

Also: Do we need/want a scope in the filter's query part at all? One instance comes to my mind (DUO codes) but this may be a bit constructed...

This could provide a choice in the case of a complex query. Imagine we have, say, an "age" filter which may be applied to two entities (e.g. biosamples and individuals).
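One possible reading of this, sketched under stated assumptions: a filter in the query may name a "scope", and the server checks it against the scopes declared for the term. The fallback rules below (use the requested entry type, else a single declared scope, else reject as ambiguous) are an assumption for illustration, not something agreed in this thread; `resolve_scope` and `AGE_TERM` are hypothetical names.

```python
# Hypothetical "age" term that is declared for two entry types.
AGE_TERM = {"id": "EXAMPLE:age", "label": "Age", "type": "alphanumeric",
            "scope": ["individuals", "biosamples"]}


def resolve_scope(term, requested_entry_type, query_filter):
    """Pick the entry type a query filter should be evaluated against."""
    declared = term.get("scope", [])
    wanted = query_filter.get("scope")
    if wanted is not None:
        # An explicit scope in the query must be one the term declares.
        if wanted not in declared:
            raise ValueError(f"{term['id']} is not declared for {wanted!r}")
        return wanted
    if requested_entry_type in declared:
        return requested_entry_type
    if len(declared) == 1:
        return declared[0]
    raise ValueError(f"ambiguous scope for {term['id']}: {declared}")


if __name__ == "__main__":
    # Variants filtered indirectly by the age recorded on biosamples.
    print(resolve_scope(AGE_TERM, "g_variants",
                        {"id": "EXAMPLE:age", "scope": "biosamples"}))
```

Without an explicit scope, a variant query using this term would be rejected as ambiguous in this sketch, which is the kind of case where the choice matters.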

costero-e · 2023-06-20T14:52:57Z

@costero-e The
type: array
items: string
...is correct if you reference the entity names. If you reference ontologies for the entities you'd use items: object or better
items:
  $ref: "../common..."

Sorry, I was thinking of the whole "individuals" response as the scope and not of the type of the scope inside the filtering terms. So yes, it is an array and the items are strings:

"scope": ["individuals", "biosamples"]
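The shape agreed on here (an optional array of entry type names) can be checked without any schema library. This is a minimal sketch; `scope_is_valid` is a hypothetical helper, and treating a missing "scope" as valid reflects the suggestion above that the property be optional.

```python
def scope_is_valid(term):
    """True if "scope" is absent or is a list made up only of strings."""
    if "scope" not in term:
        return True  # optional property
    scope = term["scope"]
    return isinstance(scope, list) and all(isinstance(s, str) for s in scope)


if __name__ == "__main__":
    print(scope_is_valid({"id": "LOINC:3141-9",
                          "scope": ["individuals", "biosamples"]}))
    print(scope_is_valid({"id": "LOINC:3141-9",
                          "scope": {"individuals": "x"}}))
```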

jrambla · 2023-06-20T18:26:35Z

We are having several parallel discussions here, which usually ends with some threads never being closed.
I would suggest splitting the discussion into three separate ones:

Keeping or not the filteringTerms.yaml inside every entry type folder
Making listing the filtering terms vs using filtering terms more homogeneous
Having multiscope in the filterTerm definition

Makes sense?

mbaudis assigned redmitry, jrambla, tb143 and costero-e Jun 19, 2023

This was referenced Jun 21, 2023

Keeping or not the filteringTerms.yaml inside every entry type folder #94

Open

Making listing the filtering terms vs using filtering terms more homogeneous #96

Closed

costero-e closed this as completed Jun 21, 2023

redmitry mentioned this issue Sep 5, 2023

add scopes[] to the FilteringTerm #109

Merged
