Strange terms on field with same name on different types #8614

Closed
rore opened this issue Nov 23, 2014 · 2 comments

rore commented Nov 23, 2014

Using ES 1.4.0.

Problem description (details to follow): A field with the same name is defined under two separate types in the same index, once as a string and once as a long. The values in this field are always numbers. Running a terms aggregation on this field returns strange values.

Steps to reproduce:

Create a type with a string field:

PUT rotem-test/_mapping/t1
{
  "t1": {
    "_all": {
      "enabled": false
    },
    "properties": {
      "time.total": {
        "type": "string",
        "index": "not_analyzed"
      }
    }
  }
}

Index some objects with numeric values.

POST rotem-test/t1/2
{
  "time.total": "26"
}

POST rotem-test/t1/3
{
  "time.total": 10
}

Create another type with the same field mapped as long:

PUT rotem-test/_mapping/t2
{
  "t2": {
    "_all": {
      "enabled": false
    },
    "properties": {
      "time.total": {
        "type": "long"
      }
    }
  }
}

Create objects in the second type:

POST rotem-test/t2/2
{
  "time.total": 10
}

POST rotem-test/t2/3
{
  "time.total": "5"
}

POST rotem-test/t2/4
{
  "time.total": 5
}

Do a terms agg on this field in the second type.

POST rotem-test/t2/_search?search_type=count
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "terms": {
      "terms": {
        "field": "time.total"
      }
    }
  }
}

This is what you get:

{
 "took" : 38,
 "timed_out" : false,
 "_shards" : {
  "total" : 4,
  "successful" : 4,
  "failed" : 0
 },
 "hits" : {
  "total" : 6,
  "max_score" : 0.0,
  "hits" : []
 },
 "aggregations" : {
  "terms" : {
   "doc_count_error_upper_bound" : 0,
   "sum_other_doc_count" : 0,
   "buckets" : [{
     "key" : "0 \u0000\u0000\u0000\u0000\u0000\u0000",
     "doc_count" : 6
    }, {
     "key" : "@\b\u0000\u0000\u0000\u0000",
     "doc_count" : 6
    }, {
     "key" : "P\u0002\u0000\u0000",
     "doc_count" : 6
    }, {
     "key" : " \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0005",
     "doc_count" : 4
    }, {
     "key" : " \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
     "doc_count" : 2
    }
   ]
  }
 }
}
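
The bucket keys above are not random. They appear to match the prefix-coded byte layout that Lucene 4.x uses for indexed long values (a marker byte of 0x20 plus the shift, followed by 7-bit groups of the sign-flipped value). Assuming that encoding, a minimal Python sketch can decode them; the function name `decode_prefix_coded_long` is hypothetical and not part of Elasticsearch or Lucene:

```python
# Sketch only: assumes the keys follow Lucene 4.x prefix coding for longs.
SHIFT_START_LONG = 0x20  # marker byte for prefix-coded long terms


def decode_prefix_coded_long(term):
    """Return (shift, value) for a term assumed to be a prefix-coded long."""
    data = term.encode("latin-1")
    shift = data[0] - SHIFT_START_LONG
    if not 0 <= shift <= 63:
        raise ValueError("not a prefix-coded long term")
    sortable = 0
    for b in data[1:]:
        if b > 0x7F:
            raise ValueError("payload bytes must carry 7 bits each")
        sortable = (sortable << 7) | b  # append the next 7-bit group
    # Restore the low bits dropped by the shift, then undo the sign-bit flip.
    sortable = (sortable << shift) & 0xFFFFFFFFFFFFFFFF
    sortable ^= 0x8000000000000000
    if sortable >= 1 << 63:  # reinterpret as a signed 64-bit value
        sortable -= 1 << 64
    return shift, sortable


# Keys copied from the aggregation response above.
keys = [
    "0 \u0000\u0000\u0000\u0000\u0000\u0000",
    "@\b\u0000\u0000\u0000\u0000",
    "P\u0002\u0000\u0000",
    " \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0005",
    " \u0001\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
]
for key in keys:
    print(decode_prefix_coded_long(key))
```

Under that assumption, the last two keys decode to the full-precision values 5 and 10, and the first three are lower-precision terms (shifts 16, 32 and 48) that numeric fields index in addition to the exact value. One plausible reading is that the aggregation treats the field as a string and so returns the raw numeric terms as bucket keys.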

@clintongormley

Hi @rore

You are running into a known problem: fields with the same name in different types must have the same mapping. We are planning to enforce this in 2.0 with #4081.
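
Until that rule is enforced by the server, a conflict like the one in this issue can be spotted on the client side. The following is a minimal sketch, assuming the per-type mappings have already been fetched into a plain dict keyed by type name; `find_type_conflicts` is a hypothetical helper, and it only inspects top-level properties:

```python
from collections import defaultdict


def find_type_conflicts(type_mappings):
    """Map each field name to {type_name: field_type} where the types disagree.

    `type_mappings` is assumed to look like {"t1": {"properties": {...}}, ...}.
    """
    seen = defaultdict(dict)
    for type_name, mapping in type_mappings.items():
        for field, spec in mapping.get("properties", {}).items():
            # Fields without an explicit "type" are treated as objects here.
            seen[field][type_name] = spec.get("type", "object")
    return {
        field: by_type
        for field, by_type in seen.items()
        if len(set(by_type.values())) > 1
    }


# The two mappings from the reproduction steps in this issue.
mappings = {
    "t1": {"properties": {"time.total": {"type": "string", "index": "not_analyzed"}}},
    "t2": {"properties": {"time.total": {"type": "long"}}},
}
print(find_type_conflicts(mappings))
```

For the mappings in this issue, the helper reports `time.total` as mapped to `string` in `t1` and `long` in `t2`.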

rore commented Nov 25, 2014

Hi @clintongormley

This is quite a worrying remark. Enforcing the same mapping on fields with the same name under different types breaks the understanding that a mapping is encapsulated under its type. We are counting on the type separation to allow us to have different mappings for fields with the same name. What you are saying destroys one of the main reasons for having different types under an index. This is going to have serious implications.
