This repository has been archived by the owner on Apr 10, 2018. It is now read-only.

Allow developers to specify a default/fallback value for a token when a feature property is undefined #104

Closed
kkaefer opened this issue Jul 16, 2014 · 32 comments

Comments

@kkaefer
Contributor

kkaefer commented Jul 16, 2014

For certain tokens, a default string might be useful, e.g. when pulling the icon name from a feature, there may be features with that property not set. In those cases, it'd be cool to specify a default value for a token, for example:

"{maki:generic}-12" would generate "cafe-12" when the feature has the property maki set to "cafe", and "generic-12" when that property isn't set.

@ansis
Contributor

ansis commented Jul 16, 2014

Maybe the default values should be in a separate object so that the string parsing doesn't become more complicated.

@yhahn
Member

yhahn commented Jul 16, 2014

cc @ajashton trying to remember all the things we've run into with this

@ajashton
Member

The cases where I've most wanted fallbacks are not usually to a static string but to an alternate column or multiple alternate columns.

  • Label translations without duplicated data in the vector tiles: "text-field": "{name_en:name}"
  • POI icons with optional custom rail network icons in a single layer: "icon-image": "{network:maki}-12"
  • Maybe also make string fallbacks possible: "{network:maki:'generic'}-12"
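
A minimal Python sketch of how the colon-separated alternatives in these three examples might be resolved. It assumes that unquoted alternatives are property names, that single-quoted alternatives are literal strings, and that a token with no usable alternative expands to an empty string. The function names are hypothetical and nothing here reflects an existing implementation:

```python
import re

# A token is anything between braces; alternatives inside it are separated by ":".
TOKEN = re.compile(r"\{([^{}]+)\}")

def resolve_alternative(alternative, properties):
    """Return a literal for 'quoted' alternatives, else look up the property."""
    if len(alternative) >= 2 and alternative[0] == alternative[-1] == "'":
        return alternative[1:-1]
    return properties.get(alternative)

def resolve_tokens(template, properties):
    """Expand each token to its first alternative that yields a value."""
    def replace(match):
        for alternative in match.group(1).split(":"):
            value = resolve_alternative(alternative, properties)
            if value is not None:
                return str(value)
        return ""  # assumption: nothing matched, so the token vanishes
    return TOKEN.sub(replace, template)

print(resolve_tokens("{name_en:name}", {"name": "Wien"}))               # Wien
print(resolve_tokens("{network:maki:'generic'}-12", {"maki": "cafe"}))  # cafe-12
print(resolve_tokens("{network:maki:'generic'}-12", {}))                # generic-12
```

One limitation of this form is that a literal fallback cannot itself contain a colon without extra escaping rules.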

@kkaefer
Contributor Author

kkaefer commented Jul 16, 2014

Maybe we can do recursive replacements, like {network:{maki}}?

@kkaefer
Contributor Author

kkaefer commented Jul 17, 2014

@ajashton how do you like that suggestion? "{network:maki:'generic'}-12" could be "{network:{maki:generic}}-12"
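
For illustration, a rough Python sketch of the recursive form, resolving the innermost token first and repeating until the string stops changing. It assumes the text after the colon is a literal default unless it is itself wrapped in braces, and that an unset property without a default expands to an empty string; `resolve_nested` is a hypothetical name:

```python
import re

# Matches only an innermost token: {property} or {property:literal-default}.
INNERMOST = re.compile(r"\{([^{}:]+)(?::([^{}]*))?\}")

def resolve_nested(template, properties):
    """Expand tokens from the inside out until no token is left."""
    def replace(match):
        key, default = match.group(1), match.group(2)
        value = properties.get(key)
        if value is not None:
            return str(value)
        return default if default is not None else ""

    previous = None
    while template != previous:
        previous = template
        template = INNERMOST.sub(replace, template)
    return template

print(resolve_nested("{maki:generic}-12", {"maki": "cafe"}))          # cafe-12
print(resolve_nested("{network:{maki:generic}}-12", {"maki": "cafe"}))  # cafe-12
print(resolve_nested("{network:{maki:generic}}-12", {}))               # generic-12
```

A caveat of this naive loop is that a property value that itself contains braces or a colon would be re-interpreted on the next pass, so a real implementation would likely parse the template once instead.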

@ajashton
Member

Seems good to me

@ajashton
Member

Re-reading this discussion it sounds like we're mostly thinking about fallbacks for situations where the token field is null/unset in the data. But another important situation when working with sprites is being able to fall back when the token field contains an unexpected value.

Using the {maki:'generic'}-12 example, could it be possible to fall back to 'generic' if the maki value is 'cafe' but the current sprite does not contain an icon for cafe? This would also be extremely useful.

@jfirebaugh
Contributor

See #362 (comment) for my preferred syntax.

@1ec5
Contributor

1ec5 commented Dec 4, 2015

Using the {maki:'generic'}-12 example, could it be possible to fall back to 'generic' if the maki value is 'cafe' but the current sprite does not contain an icon for cafe?

This would be a useful part of #249.

@1ec5
Contributor

1ec5 commented Jul 11, 2016

I’ve pushed two proofs of concepts implemented in GL JS, each with a different design:

1ec5-token-default-104 implements exactly the syntax described in #362 (comment). This approach has the more straightforward implementation of the two, and there’s less repetition if you intend to surround a particular token with the same text no matter what the token expands to.

1ec5-token-selection-104 implements a different syntax that alternates on the entire string rather than individual tokens but still allows for an arbitrary number of alternative values. This approach sacrifices compactness for expressiveness. Now you can append a parenthetical gloss to a label but omit the parentheses if the gloss is empty (that is, if any of the constituent tokens is missing):

{
    "type": "selection",
    "cases": [
        [{"ref": "name"}, " (", {"ref": "name_en"}, ")"],
        [{"ref": "name"}, " (", {"ref": "name_fr"}, ")"],
        [{"ref": "name"}]
    ]
}
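
To make the matching rule concrete, here is a small Python sketch of how such a "selection" object could be evaluated. It is not the code on the branch; it only assumes what the description above says, namely that a case is skipped as soon as one of its `ref` tokens is missing or empty. The name `evaluate_selection` is hypothetical:

```python
def evaluate_selection(selection, properties):
    """Return the first case whose referenced properties are all present."""
    for case in selection["cases"]:
        parts = []
        for component in case:
            if isinstance(component, dict):
                value = properties.get(component["ref"])
                if value is None or value == "":
                    break  # a token is missing: abandon this case
                parts.append(str(value))
            else:
                parts.append(component)  # literal text such as " (" or ")"
        else:
            return "".join(parts)  # reached only if no token was missing
    return ""  # assumption: no case matched, so the label is empty

selection = {
    "type": "selection",
    "cases": [
        [{"ref": "name"}, " (", {"ref": "name_en"}, ")"],
        [{"ref": "name"}, " (", {"ref": "name_fr"}, ")"],
        [{"ref": "name"}],
    ],
}
print(evaluate_selection(selection, {"name": "Wien", "name_en": "Vienna"}))  # Wien (Vienna)
print(evaluate_selection(selection, {"name": "Wien"}))                       # Wien
```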

On this second branch, I’ve also implemented the image fallback described in #104 (comment), so you can specify a series of fallbacks in the event that the icon-image string specifies a nonexistent icon:

{
    "type": "selection",
    "cases": [
        [{"ref": "bespoke-icon"}],
        [{"ref": "maki"}, "-12"],
        ["generic-12"]
    ]
}
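
A hedged Python sketch of the image fallback idea: each case is expanded as a whole string, and the result is accepted only if the sprite contains an image with that name. The `sprite_names` set stands in for whatever lookup the renderer actually uses, and `first_available_image` is a hypothetical name:

```python
def first_available_image(cases, properties, sprite_names):
    """Return the first fully expanded case that names an existing sprite image."""
    for case in cases:
        parts = []
        for component in case:
            if isinstance(component, dict):
                value = properties.get(component["ref"])
                if value is None:
                    break  # missing token: this case cannot be expanded
                parts.append(str(value))
            else:
                parts.append(component)
        else:
            image_name = "".join(parts)
            if image_name in sprite_names:  # test the whole string, not one token
                return image_name
    return None  # assumption: no icon is drawn when every case fails

cases = [
    [{"ref": "bespoke-icon"}],
    [{"ref": "maki"}, "-12"],
    ["generic-12"],
]
sprite = {"cafe-12", "generic-12"}
print(first_available_image(cases, {"maki": "cafe"}, sprite))  # cafe-12
print(first_available_image(cases, {"maki": "zoo"}, sprite))   # generic-12
```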

Both proposals reserve the ability to add additional information beside ref in the future, for example a transformation option:

{
    "type": "selection",
    "cases": [
        [{"ref": "name", "transform": "uppercase"}, " (", {"ref": "name_en"}, ")"],
        [{"ref": "name", "transform": "uppercase"}]
    ]
}

Finally, I’m totally open to naming suggestions. With selection and cases, I wanted to emphasize the relationship between the various cases rather than what happens to the individual components in each case. I figure that information is already adequately conveyed by ref, which could just as well be called source-property or token.

@kkaefer
Contributor Author

kkaefer commented Jul 11, 2016

@1ec5 that looks like a good first start, but I'm wondering if we need the type: selection as the root element. We could start out with any element, e.g. [] for concatenation, then similar to {"ref":...}, introduce other operators, like if/else, sort of like @tmcw's wax approach.

@1ec5
Contributor

1ec5 commented Jul 11, 2016

The type property isn’t particularly important in my opinion. It’s a vestige of the syntax @jfirebaugh proposed in #362 (comment), meant to align with the property function syntax. But I agree that the presence of an array already indicates concatenation pretty clearly, and a nested array can reasonably be interpreted to always mean a “choose” construct like the one I’m proposing here.

@jfirebaugh
Contributor

Let's keep the object syntax with type property. Arrays are already in use for *-translate, *-offset, line-dasharray, text-font, etc. Using arrays for substitutions as well is likely to lead to ambiguities that would have to be resolved with more complex heuristics, especially if we support substitutions for these properties -- which we should.

@jfirebaugh
Contributor

I also prefer the syntax from 1ec5-token-default-104 / #362 (comment). With that syntax, the condition for when to use the fallback is clearer, and the implementation is straightforward. Compactness is lower down on the list of design goals for the style specification.

@1ec5
Contributor

1ec5 commented Jul 12, 2016

Unfortunately, the syntax in 1ec5-token-default-104 / #362 (comment) also means we’d be giving up some flexibility: for example, it would be impossible to have a label be {name} ({name_en}) if name_en is defined, but just {name} otherwise.

I think it would also be inadequate as the basis for image fallback beyond what’s described in #104 (comment). The string as a whole is what’s being tested for validity, not the individual token that’s being substituted.

Compactness is lower down on the list of design goals for the style specification.

I consider the 1ec5-token-default-104 design’s only real advantage to be its compactness. My implementation of 1ec5-token-selection-104 can certainly afford some polishing, but it doesn’t seem impractical. Once property functions make their way to mapbox-gl-native, 1ec5-token-selection-104 should be straightforward to implement there too. As for clarity over the fallback criteria, we could express that alongside cases: for example, "match": "has-all-tokens" versus "match": "exists".

@1ec5
Contributor

1ec5 commented Jul 12, 2016

Now that we’ve abandoned the mini-language in favor of a structured syntax, software like Mapbox Studio will have to implement a WYSIWYG UI around token defaults if the feature is to be viable. It’s pretty straightforward to implement a text field containing placeholder tokens – one example is the search bar at the top of this page, with its “This repository” token.

With the 1ec5-token-selection-104 syntax, one possible UI would be a series of these tokenized text fields, styled in a way that makes subsequent fields look like alternatives. By contrast, the 1ec5-token-default-104 syntax would require one text field with arbitrarily nested tokens, each of which could contain text in addition to tokens. I’m concerned that such a UI would be confusing to work with.

@lucaswoj

lucaswoj commented Sep 20, 2016

Token defaults could be implemented as a use of the proposed conditional primitive and the existing has operator.

["if", ["has", "maki"], "{maki}-15", "generic-15"]
["if", condition, trueValue, falseValue]

The conditional primitive would be useful in other cases. It has been proposed previously at #402 (comment).
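
As a sketch of how little machinery the two operators need, the following Python evaluator handles only "if" and "has" plus plain {token} substitution in string results. It is an illustration under those assumptions, not the proposed implementation, and `evaluate` is a hypothetical name:

```python
import re

TOKEN = re.compile(r"\{([^{}]+)\}")

def evaluate(expression, properties):
    """Evaluate a tiny subset of the proposed syntax: "if", "has", and strings."""
    if isinstance(expression, str):
        # Strings are templates; an unset token expands to "" (assumption).
        return TOKEN.sub(lambda m: str(properties.get(m.group(1), "")), expression)
    if isinstance(expression, list):
        operator, *args = expression
        if operator == "has":
            return args[0] in properties
        if operator == "if":
            condition, true_value, false_value = args
            branch = true_value if evaluate(condition, properties) else false_value
            return evaluate(branch, properties)
        raise ValueError(f"unsupported operator: {operator!r}")
    return expression  # numbers and booleans pass through unchanged

icon = ["if", ["has", "maki"], "{maki}-15", "generic-15"]
print(evaluate(icon, {"maki": "cafe"}))  # cafe-15
print(evaluate(icon, {}))                # generic-15
```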

@1ec5
Contributor

1ec5 commented Sep 20, 2016

That would be an elegant solution. In #402 (comment), I’ve extended your proposal with an exists operator for missing sprite fallbacks for icon-image.

@kkaefer
Contributor Author

kkaefer commented Nov 28, 2016

I've spent some time thinking through potential syntax in the stylesheet and dug out an old prototype from this spring. Ultimately, a syntax like @lucaswoj proposed in #104 (comment) makes most sense and would fit into the existing filter/expression model that we already use for feature filtering.

The main caveat I'm seeing is that expressions like ["==", key, value] currently take two string parameters with vastly different interpretation: key is interpreted as the index name for looking up properties from the associated feature object, while value is a verbatim string. We could keep this model to maintain compatibility, but that would mean that we'll have an awkward syntax where ["==", "foo", "foo"] would return false (unless of course a property foo with value "foo" exists on the feature). However, we can easily extend this syntax to something like ["==", ["string", "foo"], "foo"]. Given that most people will interact with styles via a GUI, this awkward syntax may be okay. Thoughts?
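
The asymmetry can be shown with a few lines of Python. This sketch assumes the rule described above (the first operand is a property name, the second a verbatim value) and treats ["string", ...] as the proposed escape hatch for a literal first operand; `left_operand` and `equals` are hypothetical names:

```python
def left_operand(operand, properties):
    """A bare string is a property lookup; ["string", x] is the literal x."""
    if isinstance(operand, list) and operand and operand[0] == "string":
        return operand[1]
    return properties.get(operand)

def equals(expression, properties):
    """Evaluate ["==", key_or_literal, value] under the rule sketched above."""
    _, left, right = expression
    return left_operand(left, properties) == right

print(equals(["==", "foo", "foo"], {}))                 # False: no property "foo"
print(equals(["==", "foo", "foo"], {"foo": "foo"}))     # True: property equals value
print(equals(["==", ["string", "foo"], "foo"], {}))     # True: literal comparison
```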

@davidtheclark
Contributor

I'm in favor of the 1ec5-token-selection-104 syntax: I think that using a cases array for specifying fallbacks is more clear and more flexible than adding "default" or "fallback" properties to individual token objects, or creating complex arrays blending keywords and string values.

My immediate thought about adding additional properties to token objects (like "transform") is that it would mutate a straightforward UI in Studio into a baroque nightmare ... and if "transform" is the only idea we have in mind for its usage, I'm not sure it's worth the suffering. A special syntax like name_en:uppercase might be preferable because of its simplicity. At least, I can imagine a Studio user reading some instructions and then using that syntax, but have a much harder time imagining a Studio user somehow creating a token abstraction, designating its field as name_en, designating that it's uppercase, and then somehow inserting it into a string of intermixed tokens and non-tokens.

Also, if we don't have token objects, the array fallback syntax could be very straightforward indeed:

{
    "type": "selection",
    "cases": [
        "{name} ({name_en})",
        "{name} ({name_fr})",
        "{name}",
        "unnamed"
    ]
}

We'd parse the string for tokens, and if any of the fields they point to are empty, we move on to the next item.
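
A rough Python sketch of that rule, assuming "empty" means a missing property or an empty string and that a case without tokens (such as "unnamed") always matches. `select_case` is a hypothetical name:

```python
import re

TOKEN = re.compile(r"\{([^{}]+)\}")

def select_case(cases, properties):
    """Return the first template whose tokens all point at non-empty fields."""
    for template in cases:
        keys = TOKEN.findall(template)
        if all(properties.get(key) not in (None, "") for key in keys):
            return TOKEN.sub(lambda m: str(properties[m.group(1)]), template)
    return ""  # assumption: nothing matched

cases = ["{name} ({name_en})", "{name} ({name_fr})", "{name}", "unnamed"]
print(select_case(cases, {"name": "Wien", "name_fr": "Vienne"}))  # Wien (Vienne)
print(select_case(cases, {}))                                     # unnamed
```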

@kkaefer
Contributor Author

kkaefer commented Nov 28, 2016

While we're expanding the syntax to support fallbacks, it'd be great to add a slightly more complex syntax to allow for conversions like meters to feet, and basic mathematical operations like rounding, and number formatting (think thousands formatting, or limiting float precision).

The syntax in 1ec5-token-selection-104 suffers from combinatorial explosion when combining multiple fields. A syntax akin to what @jfirebaugh proposed in #362 (comment) addresses that by allowing dynamic concatenation of strings.

As for integrating this into Studio, we could still build a "fallback" UI, and translate them to the respective JSON syntax. In addition, we could also build a simple expression parser that translates a simple expression grammar

has("name_en") ? "{name} ({name_en})" : (has("name_fr") ? "{name} ({name_fr})" : "{name}")

into the JSON representation of its AST:

["if", ["has", "name_en"],
    "{name} ({name_en})",
    ["if", ["has", "name_fr"],
        "{name} ({name_fr})",
        "{name}"
    ]]
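
A possible shape for such a parser, sketched in Python. The grammar is an assumption inferred from the single example above: double-quoted strings, has("key"), parentheses, and a right-associative ternary. All function names are hypothetical:

```python
import re

# Tokens: the keyword has, a double-quoted string, or one of ( ) ? :
TOKEN = re.compile(r'\s*(has|"[^"]*"|[()?:])')

def tokenize(source):
    source, position, tokens = source.rstrip(), 0, []
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens

def expect(tokens, symbol):
    if not tokens or tokens[0] != symbol:
        raise SyntaxError(f"expected {symbol!r}")

def parse_primary(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "has":  # has ( "key" )
        if len(rest) < 3 or rest[0] != "(" or not rest[1].startswith('"') or rest[2] != ")":
            raise SyntaxError('expected has("key")')
        return ["has", rest[1][1:-1]], rest[3:]
    if head == "(":
        inner, rest = parse_ternary(rest)
        expect(rest, ")")
        return inner, rest[1:]
    if head.startswith('"'):
        return head[1:-1], rest  # string literal without its quotes
    raise SyntaxError(f"unexpected token {head!r}")

def parse_ternary(tokens):
    condition, tokens = parse_primary(tokens)
    if tokens and tokens[0] == "?":
        when_true, tokens = parse_ternary(tokens[1:])
        expect(tokens, ":")
        when_false, tokens = parse_ternary(tokens[1:])
        return ["if", condition, when_true, when_false], tokens
    return condition, tokens

def parse(source):
    """Translate the expression grammar into the JSON-style AST."""
    ast, rest = parse_ternary(tokenize(source))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return ast
```

Calling `parse` on the expression above yields the same nested "if" array, which could then be serialized with `json.dumps`.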

@kkaefer
Contributor Author

kkaefer commented Nov 28, 2016

I looked at various possible use cases of expressions and identified these:

  • Mathematical operations

  • Unit conversions

    • e.g. meters => feet, degrees Celsius => degrees Fahrenheit
    • requires mathematical operations and rounding
  • Number formatting

    • specify floating precision, thousands formatting, scientific notation
    • locale-awareness?
  • Token defaults

    • format string is based on presence of values in a feature
  • Layout-time values

    • e.g. select highway shield based on text length
    • mathematical operations, string operations, essentially all of the above

Please post here if you think your use case isn't covered by any of these.
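
To illustrate why unit conversion pulls in arithmetic, rounding, and number formatting at once, here is a short Python sketch. The function names and the choice of output format are assumptions made for the example only:

```python
FEET_PER_METER = 1 / 0.3048  # one international foot is exactly 0.3048 m

def format_elevation_feet(meters):
    """Convert meters to feet, round to a whole number, add thousands separators."""
    feet = round(meters * FEET_PER_METER)
    return f"{feet:,} ft"

def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

print(format_elevation_feet(8848))  # 29,029 ft
print(celsius_to_fahrenheit(100))   # 212.0
```

Locale-aware formatting (for example "29.029" in German) would need more than the fixed comma separator used here.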

@tmcw
Contributor

tmcw commented Nov 28, 2016

To frame how we think about this kind of feature for Studio:

Having a simple representation of a value and a complex, powerful representation of the same value usually means, for Studio, that we need to build a nice, robust UI for both. The thing is, base styles are editable in Studio: you can open a bright, dark, emerald, etc style - a complex style that should exploit every advantage we have - and you should be able to edit it. So the idea that 'novice users will write and edit novice-level styles' doesn't work out: novice users are often editing styles written by pro users.

Thus far, we've avoided any kind of JSON editing in the Style UI. There are a few raw text editing areas, but JSON is a level above the complexity of, say, a color string or a text-field value. JSON is very picky about " and , and [, things that us programmers know unconsciously but are very hard for anyone else to pick up. We try to promise that we don't let you screw up, whereas the majority of guessed-inputs to a JSON textfield will be invalid.

This issue has had two years to grow, and it looks like it started as token defaults and is currently building a new grammar and small programming language. Not that that's a bad thing: the problems it now aims to solve are valid. But I do question whether we want to throw this much complexity into a feature in order to move processing from the tile generation to the tile rendering step, and how we plan to deal with the eventuality that people will use the expressiveness of the language to write multi-hundred-line functions in text-fields.

@ajashton
Member

  • Token defaults
    • format string is based on presence of values in a feature

Would this include checking presence of icons in the sprite?

@1ec5
Contributor

1ec5 commented Nov 28, 2016

A syntax akin to what @jfirebaugh proposed in #362 (comment) addresses that by allowing dynamic concatenation of strings.

Note that #362 (comment) is more or less implemented in 1ec5-token-default-104. I implemented the alternative syntax in 1ec5-token-selection-104 for some use cases that weren't addressed by the token-default syntax. I'd encourage everyone to check out the unit tests on both branches to get a sense of what's possible with each approach.

The "selection" syntax implemented in 1ec5-token-selection-104 may suffer from combinatorial explosion, as described in #104 (comment), but I think it's a good tradeoff compared to 1ec5-token-default-104. In exchange for making it harder to permute a large number of fields and to a small extent hiding what "fails" a case, we make it easier to build an intuitive UI, discourage users from stuffing deeply nested logic or "multi-hundred-line functions" into a single field, and make it possible to vary fallbacks beyond simple replacements. The key insight is that the user rarely wants to only substitute one field for another inside text-field; usually adjustments to surrounding punctuation are needed as well. The token-selection syntax also provides an easy (already implemented) fallback for missing style icons.

A generalized conditional syntax does address the superset of use cases and jibes with the trend towards more prosaic syntax, but at the expense of having to build a generalized string template editor in Studio.

@1ec5
Contributor

1ec5 commented Nov 30, 2016

Up to this point, we’ve discussed two primary use cases for token fallbacks (formatting and conversion options aside):

  • Image fallbacks. A style designer shouldn’t have to provide an image for every {shield}/{reflen} combination or every {maki}/size combination that the Mapbox Streets source makes possible.
  • Bilingual labels. The Mapbox Streets source currently duplicates the name of every feature into every {name-*} field, even when the name isn’t localized in OpenStreetMap, because there’s no way for the style designer to set text-field to {name} ({name_en}) if {name_en} is defined but {name} otherwise, except to create yet another layer.

Reading #104 (comment) made me reconsider whether the two use cases really need to be served by the same feature. So I’ve filed #597 to support the image fallback use case specifically with a separate syntax reminiscent of font stacks.

Meanwhile, I still think the bilingual label use case is best served by the token-selection syntax described in #104 (comment), not least because the Studio UI for it would be just as straightforward to implement and intuitive to use as the UI envisioned for image fallbacks in #597.

A generalized conditional syntax would also serve this use case well: in the example above, the designer might want to use {name} rather than {name} ({name_en}) if both are set to the same value. However, does anyone have a good idea of what the UI for this syntax would look like? I think an effective UI would need to be complete at a glance: the designer should be able to discern the property’s value without opening a myriad of panels and popups.

One idea that has come up is to present the designer with a syntax-highlighted JSON editor if the value exceeds a certain level of complexity. However, I fear that the designer will never see the “simple mode” in practice, given that the bilingual label use case above would already require some alternation beyond simple token replacement.

@ajashton
Member

the designer might want to use {name} rather than {name} ({name_en}) if both are set to the same value.

I've thought about this and have no good suggestions, only more complications. Mainly that it might be undesirable to show name_en even if it is just similar to the untranslated name. Eg "Québec (Quebec)" does not seem like a label I would want to add to a map.

@1ec5
Contributor

1ec5 commented Nov 30, 2016

"Québec (Quebec)" does not seem like a label I would want to add to a map.

The case and diacritic folding functionality requested in #548 would fit in with the generalized conditional syntaxes that have been proposed so far. For anything beyond case and diacritic folding, we’d probably have to implement a Levenshtein distance function for use inside conditionals, or the source would have to provide more fields to indicate the desired behavior.
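
For reference, case and diacritic folding can be approximated with the Python standard library. This is only a sketch of the comparison such a conditional would need, not the behavior specified in #548; `fold` and `bilingual_label` are hypothetical names:

```python
import unicodedata

def fold(text):
    """Strip combining marks after NFD decomposition, then fold case."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def bilingual_label(name, translated):
    """Append the translation only if it differs beyond case and diacritics."""
    if translated and fold(translated) != fold(name):
        return f"{name} ({translated})"
    return name

print(bilingual_label("Québec", "Quebec"))  # Québec
print(bilingual_label("Wien", "Vienna"))    # Wien (Vienna)
```

Names that differ by more than accents, such as transliteration variants, would still need a distance function or extra source fields, as noted above.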

@kkaefer
Contributor Author

kkaefer commented Dec 2, 2016

While I think I'm warming up to the selection syntax, #548 won't solve the issue of "Québec (Quebec)", since it'll take the first one that is available? It seems like most of the discussion here revolves around creating a UI for a formatting/control flow. A flow-chart like UI could capture this, but is hard to develop and adds a lot of additional complexity to Studio. Creating a mini-language that we parse to the JSON AST could also work, and we could show errors.

Something that could also work is a combination of the selection syntax with conditionals:

[
    [ condition, value ],
    [ condition, value ],
    [ condition, value ],
    ...
]

So the example we've been using could look like this:

[
    [["all", ["has", "name_en"], ["!=", "name", ["key", "name_en"]] ], "{name} ({name_en})"],
    [["all", ["has", "name_fr"], ["!=", "name", ["key", "name_fr"]] ], "{name} ({name_fr})"],
    [["all", ["has", "name_fr"]], "{name}"],
    [true, "unnamed"]
]
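
A hedged Python sketch of how such condition/value pairs might be evaluated. It supports only the operators used in the example ("all", "has", "!=", and "key" as a property reference on the right-hand side), and it assumes the third case was meant to test for "name"; both the function names and that reading are assumptions:

```python
import re

TOKEN = re.compile(r"\{([^{}]+)\}")

def resolve_operand(operand, properties):
    """["key", x] reads property x; anything else is a verbatim value."""
    if isinstance(operand, list) and operand and operand[0] == "key":
        return properties.get(operand[1])
    return operand

def matches(condition, properties):
    if isinstance(condition, bool):
        return condition
    operator, *args = condition
    if operator == "all":
        return all(matches(arg, properties) for arg in args)
    if operator == "has":
        return args[0] in properties
    if operator == "!=":  # first operand is always a property name
        return properties.get(args[0]) != resolve_operand(args[1], properties)
    raise ValueError(f"unsupported operator: {operator!r}")

def select_value(pairs, properties):
    """Return the expanded value of the first pair whose condition holds."""
    for condition, template in pairs:
        if matches(condition, properties):
            return TOKEN.sub(lambda m: str(properties.get(m.group(1), "")), template)
    return ""

LABEL = [
    [["all", ["has", "name_en"], ["!=", "name", ["key", "name_en"]]], "{name} ({name_en})"],
    [["all", ["has", "name_fr"], ["!=", "name", ["key", "name_fr"]]], "{name} ({name_fr})"],
    [["has", "name"], "{name}"],  # assumption: fall back whenever name exists
    [True, "unnamed"],
]
print(select_value(LABEL, {"name": "Paris", "name_en": "Paris"}))  # Paris
print(select_value(LABEL, {"name": "Wien", "name_en": "Vienna"}))  # Wien (Vienna)
```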

But at this point, we're seeing the same issues I pointed out above: The current UI and syntax only allows you to specify a property name (key), and only a verbatim value for comparison operations.

@1ec5
Contributor

1ec5 commented Dec 2, 2016

While I think I'm warming up to the selection syntax, #548 won't solve the issue of "Québec (Quebec)", since it'll take the first one that is available?

That’s correct. #548 is only relevant to this discussion if we implement a generalized conditional syntax that looks uncannily similar to the filter syntax.

But at this point, we're seeing the same issues I pointed out above: The current UI and syntax only allows you to specify a property name (key), and only a verbatim value for comparison operations.

This is going to get into the weeds real fast, but one solution would be to introduce a complementary set of operators, such as equals-property and equals-property-case-insensitive, for use in filters and this “selection syntax with conditionals” (property filters?). Then in Studio, the “Text field” property would have a fallback list UI similar to the one for the “Font” property:

[Screenshot: font]

Except that there would be a button to the left of each input box, and clicking it would open a flyout containing the usual filter UI:

[Screenshot: filter]

And that UI would have twice as many operators to choose from, with checkboxes for case and diacritic folding. I think this illustrates that the more use cases we try to address within the scope of this issue, the more bloated any UI would be and the more places a surprising value can lurk unnoticed. (Or the more the UI resembles a JSON editor.)

@lucaswoj changed the title from "Token defaults" to "Allow developers to specify a default value for a token whose feature property is undefined" on Dec 22, 2016

@1ec5 changed the title from "Allow developers to specify a default value for a token whose feature property is undefined" to "Allow developers to specify a default/fallback value for a token when a feature property is undefined" on Dec 22, 2016

@ajashton
Member

ajashton commented Jan 3, 2017

This issue has had two years to grow, and it looks like it started as token defaults and is currently building a new grammar and small programming language.

A lot of different use cases and edge cases have come up in this issue, and they would all be nice to solve. But the initial basic idea of falling back from one field to another if the first one is undefined would still unlock a number of important data/cartography needs. Eg being able to greatly expand the number of label languages in vector tile sources without bloating them with mostly-duplicate values in every tile. (Not thinking about multilingual labels - just single names at a time pulling from one of multiple possible fields.)

If coming up with a full expressions syntax won't be doable soon due to complexity or UX concerns, solving just the basic fallback case first would still be extremely helpful.

@lucaswoj

lucaswoj commented Feb 1, 2017

This issue was moved to mapbox/mapbox-gl-js#4079
