Define valid index, type, field, id, routing values #6736

clintongormley · 2014-07-04T13:23:32Z

Currently we have no specification of allowed values for index names, type names, IDs, field names or routing values.

This issue is an attempt to document and improve the existing specs to prevent inconsistencies.

Index names

Index names are limited by the file system. They may only be lower case, and may not start with an underscore. While we don't prevent index names starting with a ., we reserve those for internal use. Clearly, . and .. cannot be used.

These characters are already illegal: \, /, *, ?, ", <, >, |, space, and the comma (,). We should also add the null byte.

There are other filenames which are illegal in Windows, but we probably don't need to check for those.
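As a rough illustration, the index name rules above could be checked like this. It is a sketch of the proposal, not Elasticsearch's own validation code; the function and constant names are made up, and the character set assumes the illegal list includes the space and the comma, plus the proposed null byte.

```python
# Hypothetical sketch of the index name rules described above.
# Character set: \ / * ? " < > | space comma, plus the proposed null byte.
ILLEGAL_INDEX_CHARS = set('\\/*?"<>| ,\0')


def index_name_problems(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if name in (".", ".."):
        problems.append("must not be '.' or '..'")
    if name != name.lower():
        problems.append("must be lower case")
    if name.startswith("_"):
        problems.append("must not start with an underscore")
    bad = sorted(ILLEGAL_INDEX_CHARS.intersection(name))
    if bad:
        problems.append(f"contains illegal characters: {bad!r}")
    return problems


if __name__ == "__main__":
    for candidate in ("logs-2014", "_internal", "Logs", "a,b", ".."):
        print(repr(candidate), index_name_problems(candidate))
```

Names starting with a . are deliberately not rejected here, since the text only reserves them rather than forbidding them.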

Type names

Type names can contain any character (except null bytes, which currently we don't check) but may not start with an underscore.

IDs

IDs can contain any character (except null bytes, which currently we don't check). IDs should not begin with an underscore.

Currently IDs are not checked for a leading underscore, so IDs that begin with an underscore may already exist. These can clash with endpoints such as _mapping and so should be prevented. This is a backwards-incompatible change.
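A minimal sketch of the proposed rule for type names and IDs, and of why a leading underscore is a problem. The helper names are hypothetical and this is not how Elasticsearch implements it.

```python
# Hypothetical sketch: the same two rules are proposed for type names and IDs.
def name_problems(value: str) -> list[str]:
    """Return rule violations for a type name or document ID."""
    problems = []
    if "\0" in value:
        problems.append("must not contain a null byte")
    if value.startswith("_"):
        problems.append("must not start with an underscore")
    return problems


def doc_path(index: str, type_name: str, doc_id: str) -> str:
    """Build the REST path used to fetch a document by ID."""
    return f"{index}/{type_name}/{doc_id}"


if __name__ == "__main__":
    # A document whose ID is "_mapping" gets the same path as the mapping
    # endpoint, so a GET on that path cannot reach the document.
    print(doc_path("index", "type", "_mapping"))
    print(name_problems("_mapping"))
```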

Routing & Parent

Routing and parent values should be the same as IDs, ie any chars except for the null byte. The problem is that multiple routing values are passed in the query string as comma-separated values, eg ?routing=foo,bar.

If a single routing value contains a comma, it will be misinterpreted as two routing values. One idea is to pass multiple routing values as eg ?routing=foo&routing=bar,baz. Unfortunately, this is not backwards compatible and isn't supported by a number of client libraries.

The only solution I can think of is to support some escaping of commas, eg foo\,bar. This would mean that \ would need to be escaped as well, ie: foo\bar -> foo\\bar. Support for this escaping would need to be added to Elasticsearch and to the client libraries.
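To make the escaping idea concrete, here is a sketch of how a client might join routing values and how the server side might split them again. Neither function exists in Elasticsearch or its clients; the names and the handling of a trailing lone backslash are assumptions.

```python
# Hypothetical sketch of the proposed backslash escaping for routing values.
def escape_routing(value: str) -> str:
    """Escape backslashes first, then commas, so the result can be split safely."""
    return value.replace("\\", "\\\\").replace(",", "\\,")


def join_routing(values: list[str]) -> str:
    """Build the comma-separated ?routing= parameter from raw values."""
    return ",".join(escape_routing(v) for v in values)


def split_routing(param: str) -> list[str]:
    """Split on unescaped commas and undo the escaping."""
    values, current = [], []
    chars = iter(param)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a trailing lone
            # backslash is kept as-is (an assumption of this sketch).
            current.append(next(chars, "\\"))
        elif ch == ",":
            values.append("".join(current))
            current = []
        else:
            current.append(ch)
    values.append("".join(current))
    return values


if __name__ == "__main__":
    raw = ["foo,bar", "baz", "a\\b"]
    encoded = join_routing(raw)
    print(encoded)
    print(split_routing(encoded) == raw)
```

Escaping the backslash before the comma matters: doing it the other way round would double the backslashes that were just added for the commas.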

colings86 · 2014-07-04T15:08:47Z

We should ensure that if any global decisions are made regarding naming, aggregation names are also included.

bleskes · 2014-07-04T19:47:07Z

one more relevant point is that some of our endpoints mask what is a valid get doc by id REST request (according to the above spec). For example: GET index/type/_mapping (which masks a document where id is _mapping). IMHO this is not a problem, but we should mention it for completeness.

clintongormley · 2014-07-07T08:10:29Z

@bleskes i think it is a problem as there is no workaround. I've added this sentence to the original issue: "IDs should not begin with an underscore."

clintongormley · 2014-07-09T10:07:13Z

Adding field names to the specs (see #5972). Field names should not begin with an underscore, and should not contain . or null bytes.

If a fieldname contains . when creating a mapping, we have two choices:

throw an error
convert it to an object, eg foo.bar: 5 -> `{ foo: { bar: 5 } }`

Throwing an error seems a more transparent way of dealing with this.
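The two choices can be sketched side by side. Both functions are illustrative only and look at top-level keys of a plain dict; they are not the mapping code in Elasticsearch, and the names are made up.

```python
# Hypothetical sketch of the two options for field names containing a dot.
def reject_dotted(doc: dict) -> None:
    """Option 1: throw an error if any field name contains a dot."""
    for key in doc:
        if "." in key:
            raise ValueError(f"field name {key!r} must not contain '.'")


def expand_dotted(doc: dict) -> dict:
    """Option 2: turn 'foo.bar': 5 into {'foo': {'bar': 5}}."""
    out: dict = {}
    for key, value in doc.items():
        *parents, leaf = key.split(".")
        target = out
        for part in parents:
            target = target.setdefault(part, {})
            if not isinstance(target, dict):
                # e.g. both "foo": 1 and "foo.bar": 2 are present
                raise ValueError(f"{key!r} conflicts with an existing field")
        if leaf in target:
            raise ValueError(f"{key!r} conflicts with an existing field")
        target[leaf] = value
    return out


if __name__ == "__main__":
    print(expand_dotted({"foo.bar": 5}))
```

The conflict checks hint at why the error option is more transparent: expansion has to decide what happens when a dotted name and an object path collide.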

danielcweeks · 2014-07-31T19:06:44Z

Having a dot in the field name is actually very useful. Would it be possible to use an escape for referencing a field name instead of path?

clintongormley · 2014-08-01T07:42:10Z

@dcw-netflix an escape? do you mean foo\.bar? Yes we can probably support that.

danielcweeks · 2014-08-01T15:47:14Z

Yes, that would be perfect. The reason is that there are a lot of use cases where property/config files get indexed, which results in many dot separated keys.
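For illustration, a path reference with escaped dots such as foo\.bar could be split along these lines. This is only a sketch of the idea being discussed; the function name is hypothetical and Elasticsearch does not provide this.

```python
# Hypothetical sketch: split a field path on unescaped dots only.
def split_field_path(path: str) -> list[str]:
    """'config.db\\.host' -> ['config', 'db.host']"""
    parts, current = [], []
    chars = iter(path)
    for ch in chars:
        if ch == "\\":
            # The character after a backslash is taken literally;
            # a trailing lone backslash is kept (an assumption).
            current.append(next(chars, "\\"))
        elif ch == ".":
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


if __name__ == "__main__":
    print(split_field_path(r"server\.port"))     # a single field name with a dot
    print(split_field_path(r"config.db\.host"))  # a path of two segments
```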

clintongormley · 2014-08-04T15:20:39Z

Colons in index names are also invalid. See #7148

uboness · 2014-08-13T10:54:09Z

This is a great start for having formal input validation rules in elasticsearch. I believe we need to centralize all these rules in one place. I also think we should have validation rules for every input in es (not just those listed above)... for example: field names, repository names, snapshot names, etc... basically everything that in one way or another can compromise the consistent state of the cluster.

We currently have a lot of this logic (probably incomplete) scattered in different places, it's definitely time to formalize them (both in docs & code)

dadoonet · 2014-08-13T11:32:34Z

Note that for repository names, we also need to delegate the validation to plugins as there can be other rules with some cloud providers (azure for example). See also #7096

ron-totango · 2014-10-30T06:04:47Z

We have an issue with a routing value that contains a comma. Is there a workaround we can use? Thanks

rpedela · 2014-10-30T23:55:22Z

A common use case for ES (and my use case) is to index a DB table which may have column names that start with an underscore. Renaming the columns is not an option in my use case either. Currently this requires storing a mapping between DB column names and ES field names, which adds complexity.

Is it possible to escape an underscore in a field name? Or more generally is it possible to escape any special character in a field name? A more general escaping solution would be optimal in my opinion because then a field name could have any arbitrary characters just like a quoted SQL identifier.

clintongormley · 2014-12-24T13:12:38Z

Closing in favour of #9059

mcayland · 2015-12-07T15:22:15Z

I've just come across this problem in the past week with ES 2.1 whilst trying to create documents with "."s in the field name. Am I correct in that even the field name escaping parts aren't included in ES 2.1? This is sadly a showstopper for our application as the field names we use are equipment serial codes, and we've recently added a supplier that includes "."s in their serial codes.

clintongormley · 2015-12-15T11:32:41Z

@mcayland using serial numbers for field names is a bad design choice as you will end up with sparse fields, and much more disk usage than you actually need.

roisin-jin · 2016-06-09T15:41:25Z

Hi, I'm wondering whether any wildcard characters other than the star symbol are allowed in template names. We have several indexes named by the same pattern, i.e. ap-YYYY-MM, bg-YYYY-MM, cm-YYYY-MM, etc. They all have the same mapping; we just want to separate the data into different indexes. Is there any way to create a single template with an index name pattern like '??-*'?

clintongormley added discuss labels Jul 4, 2014

clintongormley mentioned this issue Jul 4, 2014

Problem with parent id containing commas when talking to an index alias #4202

Closed

This was referenced Jul 4, 2014

Aggregations: Aggregation names can now include dot #6708

Closed

Extend allowed characters in aggregation name #6702

Closed

clintongormley added the breaking label Jul 7, 2014

clintongormley mentioned this issue Jul 8, 2014

disable ability to make a doc named /test/test/_query #1780

Closed

clintongormley self-assigned this Jul 9, 2014

This was referenced Jul 9, 2014

Inconsistent results when mixing fieldname: { fieldname: { and fieldname.fieldname for mapping and indexing #5972

Closed

Suggest REST endpoint fix for POST/PUT #5443

Closed

clintongormley mentioned this issue Jul 31, 2014

Incorrect field values returned when '.' (dot) used in property name #7112

Closed

This was referenced Aug 4, 2014

colon sign (:) not excluded from filenames #7148

Closed

Reject index requests that contain metadata fields in the request body. #3917

Closed

Elasticsearch accepts requests to write indices with bad characters that cannot be written to disk by java #6589

Closed

dakrone mentioned this issue Aug 13, 2014

Resiliency: Forbid index names over 100 characters in length #7252

Merged

clintongormley mentioned this issue Aug 18, 2014

INDEX DELETE with wildcard doesn't delete all matching indexes #7295

Closed

clintongormley mentioned this issue Sep 29, 2014

Failed to write index state due to incorrect syntax in directory name #7800

Closed

clintongormley mentioned this issue Oct 15, 2014

Aggregations order : Escaping the user defined AGG_NAME #8042

Closed

clintongormley mentioned this issue Oct 28, 2014

Unicode index names causing DELETE index to hang #8254

Closed

LosD mentioned this issue Nov 6, 2014

Capitalized names makes illegal index name uberVU/elasticsearch-river-github#12

Open

gibrown mentioned this issue Nov 6, 2014

Change the index setting so it can handle site in a directory alleyinteractive/searchpress#10

Open

sushanthku mentioned this issue Nov 12, 2014

Problem while indexing a document with routing after upgrade #8459

Closed

gmarz mentioned this issue Nov 24, 2014

Client shouldn't silently fail if index name is wrong elastic/elasticsearch-net#1017

Closed

lukasolson mentioned this issue Dec 16, 2014

Creating an index with the name "." causes strange behavior elastic/kibana#2326

Closed

clintongormley mentioned this issue Dec 24, 2014

Input validation #9059

Closed

clintongormley closed this as completed Dec 24, 2014

Mpdreamz mentioned this issue May 25, 2015

Mapping should throw when field names with dots are specified #11337

Closed

MarkusMayer mentioned this issue May 29, 2015

elasticsearch output - failed with response of 400 logstash-plugins/logstash-output-elasticsearch#144

Closed

Artimi mentioned this issue Sep 9, 2016

Elasticsearch report backend ionelmc/pytest-benchmark#58

Merged

rwynn mentioned this issue Nov 18, 2016

ElasticSearch index and type rules rwynn/monstache#8

Closed

adarsh mentioned this issue Dec 13, 2016

Reduce logical complexity for search index jobs greencommons/commons#78

Merged

rwynn mentioned this issue Feb 15, 2018

Does it/Will it support parent joins? rwynn/monstache#43

Closed

jpparis-orange mentioned this issue Feb 26, 2018

the default index name should be lowercase Orange-OpenSource/kibana-xlsx-import#3

Closed
