Ingest: foreach processor not accepting dot-notation for nested fields #51037

Closed
fkelbert opened this issue Jan 15, 2020 · 5 comments
Labels
>bug, :Data Management/Ingest Node, Team:Data Management

Comments

@fkelbert
Contributor

Elasticsearch version (bin/elasticsearch --version): 7.5 (Elastic Cloud)

The following does not work as expected:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "foreach": {
          "field": "mail.domains",
          "processor": {
            "gsub": {
              "field": "_ingest._value",
              "pattern": "^[^@]*@(.*)$",
              "replacement": "$1"
            }
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "mail.domains": [
          "foo@bar.com"
        ]
      }
    }
  ]
}

The above results in:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "field [mail] not present as part of path [mail.domains]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "field [mail] not present as part of path [mail.domains]"
  },
  "status": 400
}

The same pipeline does work when the field is supplied as a nested object instead of a dotted key:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "foreach": {
          "field": "mail.domains",
          "processor": {
            "gsub": {
              "field": "_ingest._value",
              "pattern": "^[^@]*@(.*)$",
              "replacement": "$1"
            }
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "mail": {
          "domains": [
            "foo@bar.com"
          ]
        }
      }
    }
  ]
}

Both forms of the document should work.

fkelbert added the >bug and :Data Management/Ingest Node labels on Jan 15, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@fkelbert
Contributor Author

The same issue applies to the enrich processor, and possibly others.

@gaobinlong
Contributor

It may be that none of the processors accept dot notation for nested fields, but we can use the Dot Expander processor to work around the problem.
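
As a sketch of that workaround, a `dot_expander` processor can be placed ahead of `foreach` so that the flat `mail.domains` key is expanded into a nested `mail` object before the loop runs. The request below reuses the failing example from the original report and changes only the processor list. It is an untested illustration, and behaviour may differ between Elasticsearch versions.

```
# Assumed workaround: expand the dotted key first, then iterate over the nested array.
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dot_expander": {
          "field": "mail.domains"
        }
      },
      {
        "foreach": {
          "field": "mail.domains",
          "processor": {
            "gsub": {
              "field": "_ingest._value",
              "pattern": "^[^@]*@(.*)$",
              "replacement": "$1"
            }
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "mail.domains": [
          "foo@bar.com"
        ]
      }
    }
  ]
}
```

Once the key has been expanded, `foreach` should resolve `mail.domains` the same way it does for the nested document in the report.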

@fkelbert
Contributor Author

Sure, the Dot Expander can resolve the issue.

I would very much welcome an initiative to make this easier for users, though. Elasticsearch seamlessly accepts documents using dot notation, so it is somewhat unexpected that Ingest processors do not accept the same notation.

I suppose there are solid technical reasons for this behaviour. Still, I wonder whether we can make things easier for users, very much in line with #15951.

rjernst added the Team:Data Management label on May 4, 2020
@dakrone
Member

dakrone commented May 17, 2024

This has been open for quite a while, and we haven't made much progress on this due to focus in other areas. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed.

dakrone closed this as not planned on May 17, 2024