[FEATURE] neural query explain not showing details for nested field #698

Open
yuye-aws opened this issue Apr 19, 2024 · 3 comments
Labels: Features (Introduces a new unit of functionality that satisfies a requirement)

@yuye-aws (Member) commented Apr 19, 2024

What is the bug?

Searching for a nested field works well. However, we cannot obtain a detailed explanation for the search results with GET {indexname}/_search?explain=true.

How can one reproduce the bug?

First, create an index with a nested embedding field (a mapping sketch follows the sample document below). A sample document may look like:

{
    "text": "A Hybrid EP and SQP for Dynamic Economic Dispatch with Nonsmooth Fuel Cost Function Dynamic economic dispatch (DED) is one of the main functions of power generation operation and control. It determines the optimal settings of generator units with predicted load demand over a certain period of time. The objective is to operate an electric power system most economically while the system is operating within its security limits. This paper proposes a new hybrid methodology for solving DED. The proposed method is developed in such a way that a simple evolutionary programming (EP) is applied as a based level search, which can give a good direction to the optimal global region, and a local search sequential quadratic programming (SQP) is used as a fine tuning to determine the optimal solution at the final. Ten units test system with nonsmooth fuel cost function is used to illustrate the effectiveness of the proposed method compared with those obtained from EP and SQP alone.",
    "text_chunk_embedding": [
      {
        "knn": [...]
      },
      {
        "knn": [...]
      }
    ],
    "text_chunk": [
      "[CLS] a hybrid ep and sqp for dynamic economic dispatch with nonsmooth fuel cost function dynamic economic dispatch ( ded ) is one of the main functions of power generation operation and control. it determines the optimal settings of generator units with predicted load demand over a certain period of time. the objective is to operate an electric power system most economically while the system is operating within its security limits. this paper proposes a new hybrid methodology for solving ded. the proposed method is developed in such a way that a simple evolutionary programming ( ep ) is applied as a based level search, which can give a good direction to the optimal global region, and",
      "a local search sequential quadratic programming ( sqp ) is used as a fine tuning to determine the optimal solution at the final. ten units test system with nonsmooth fuel cost function is used to illustrate the effectiveness of the proposed method compared with those obtained from ep and sqp alone. [SEP]"
    ]
}
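
For reference, a minimal mapping sketch for such an index could look like the following. The index.knn setting and the dimension value of 768 are assumptions for illustration; adjust them to match the embedding model in use.

PUT {indexname}
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "text": { "type": "text" },
      "text_chunk": { "type": "text" },
      "text_chunk_embedding": {
        "type": "nested",
        "properties": {
          "knn": {
            "type": "knn_vector",
            "dimension": 768
          }
        }
      }
    }
  }
}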

Then, use the explain query to search the document:

GET {indexname}/_search?explain=true
{
  "size": 1,
  "_source": {
    "excludes": "text_chunk_embedding"
  },
  "query": {
    "nested": {
      "score_mode": "avg",
      "path": "text_chunk_embedding",
      "query": {
        "neural": {
          "text_chunk_embedding.knn": {
            "model_id": "PDx55Y4BxByNDM4P0mdQ",
            "query_text": "Global-Locally Self-Attentive Dialogue State Tracker"
          }
        }
      }
    }
  }
}

Currently, the explanation for the search results is:

"_explanation": {
  "value": 0.021672908,
  "description": "Score based on 3 child docs in range from 6364 to 6366, using score mode Avg",
  "details": [
    {
      "value": 0.021672908,
      "description": "sum of:",
      "details": [
        {
          "value": 1,
          "description": "No Explanation",
          "details": []
        },
        {
          "value": 0,
          "description": "match on required clause, product of:",
          "details": [
            {
              "value": 0,
              "description": "# clause",
              "details": []
            },
            {
              "value": 1,
              "description": "_nested_path:text_chunk_embedding",
              "details": []
            }
          ]
        }
      ]
    }
  ]
}

What is the expected behavior?

The explain query should at least show the score for each nested document, as the BM25 query does; a purely hypothetical sketch of such an output is shown below.
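
For illustration only, a desirable output might break the parent score down per child document, roughly along the lines of the sketch below. The values and descriptions here are hypothetical and only indicate the expected level of detail.

"_explanation": {
  "value": 0.021672908,
  "description": "Score based on 3 child docs in range from 6364 to 6366, using score mode Avg",
  "details": [
    {
      "value": 0.025,
      "description": "knn score of child doc 6364 on field text_chunk_embedding.knn (hypothetical)",
      "details": []
    },
    {
      "value": 0.018,
      "description": "knn score of child doc 6365 on field text_chunk_embedding.knn (hypothetical)",
      "details": []
    }
  ]
}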

What is your host/environment?

Operating system, version.

Do you have any screenshots?

If applicable, add screenshots to help explain your problem.

Do you have any additional context?

Add any other context about the problem.

yuye-aws added the bug (Something isn't working) and untriaged labels on Apr 19, 2024
@yuye-aws (Member, Author) commented Apr 19, 2024

If I search with a BM25 query:

GET {indexname}/_search?explain=true
{
  "size": 1,
  "query": {
    "match": {
      "text_chunk": "Global-Locally Self-Attentive Dialogue State Tracker"
    }
  }
}

The explanation is very detailed, for example:

{
    "value": 18.182425,
    "description": "sum of:",
    "details": [
      {
        "value": 4.982006,
        "description": "weight(text_chunk:self in 20446) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 4.982006,
            "description": "score(freq=2.0), computed as boost * idf * tf from:",
            "details": [
              {
                "value": 2.2,
                "description": "boost",
                "details": []
              },
              {
                "value": 3.1877272,
                "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                "details": [
                  {
                    "value": 351,
                    "description": "n, number of documents containing term",
                    "details": []
                  },
                  {
                    "value": 8517,
                    "description": "N, total number of documents with field",
                    "details": []
                  }
                ]
              },
              {
                "value": 0.7103958,
                "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                "details": [
                  {
                    "value": 2,
                    "description": "freq, occurrences of term within document",
                    "details": []
                  },
                  {
                    "value": 1.2,
                    "description": "k1, term saturation parameter",
                    "details": []
                  },
                  {
                    "value": 0.75,
                    "description": "b, length normalization parameter",
                    "details": []
                  },
                  {
                    "value": 104,
                    "description": "dl, length of field (approximate)",
                    "details": []
                  },
                  {
                    "value": 181.63051,
                    "description": "avgdl, average length of field",
                    "details": []
                  }
                ]
              }
            ]
          }
        ]
      },
      {
        "value": 10.799234,
        "description": "weight(text_chunk:attentive in 20446) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 10.799234,
            "description": "score(freq=2.0), computed as boost * idf * tf from:",
            "details": [
              {
                "value": 2.2,
                "description": "boost",
                "details": []
              },
              {
                "value": 6.9098706,
                "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                "details": [
                  {
                    "value": 8,
                    "description": "n, number of documents containing term",
                    "details": []
                  },
                  {
                    "value": 8517,
                    "description": "N, total number of documents with field",
                    "details": []
                  }
                ]
              },
              {
                "value": 0.7103958,
                "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                "details": [
                  {
                    "value": 2,
                    "description": "freq, occurrences of term within document",
                    "details": []
                  },
                  {
                    "value": 1.2,
                    "description": "k1, term saturation parameter",
                    "details": []
                  },
                  {
                    "value": 0.75,
                    "description": "b, length normalization parameter",
                    "details": []
                  },
                  {
                    "value": 104,
                    "description": "dl, length of field (approximate)",
                    "details": []
                  },
                  {
                    "value": 181.63051,
                    "description": "avgdl, average length of field",
                    "details": []
                  }
                ]
              }
            ]
          }
        ]
      },
      {
        "value": 2.401184,
        "description": "weight(text_chunk:state in 20446) [PerFieldSimilarity], result of:",
        "details": [
          {
            "value": 2.401184,
            "description": "score(freq=1.0), computed as boost * idf * tf from:",
            "details": [
              {
                "value": 2.2,
                "description": "boost",
                "details": []
              },
              {
                "value": 1.9813391,
                "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                "details": [
                  {
                    "value": 1174,
                    "description": "n, number of documents containing term",
                    "details": []
                  },
                  {
                    "value": 8517,
                    "description": "N, total number of documents with field",
                    "details": []
                  }
                ]
              },
              {
                "value": 0.55086344,
                "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                "details": [
                  {
                    "value": 1,
                    "description": "freq, occurrences of term within document",
                    "details": []
                  },
                  {
                    "value": 1.2,
                    "description": "k1, term saturation parameter",
                    "details": []
                  },
                  {
                    "value": 0.75,
                    "description": "b, length normalization parameter",
                    "details": []
                  },
                  {
                    "value": 104,
                    "description": "dl, length of field (approximate)",
                    "details": []
                  },
                  {
                    "value": 181.63051,
                    "description": "avgdl, average length of field",
                    "details": []
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
}

@martin-gaievski (Member) commented:
@yuye-aws neural search will not have a detailed response for explain because it uses the knn query under the hood, and the knn query doesn't support explain. Here is the corresponding GH issue for this matter: opensearch-project/k-NN#875
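
For context, the neural clause is rewritten at search time into a knn query against text_chunk_embedding.knn, roughly like the sketch below (the vector values are omitted and the k value is an arbitrary placeholder); because that knn query produces no explanation details, the neural query cannot either.

GET {indexname}/_search?explain=true
{
  "query": {
    "nested": {
      "path": "text_chunk_embedding",
      "query": {
        "knn": {
          "text_chunk_embedding.knn": {
            "vector": [...],
            "k": 10
          }
        }
      }
    }
  }
}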

@yuye-aws (Member, Author) commented:
> @yuye-aws neural search will not have a detailed response for explain because it uses the knn query under the hood, and the knn query doesn't support explain. Here is the corresponding GH issue for this matter: opensearch-project/k-NN#875

Sorry for taking so long to respond. It seems quite likely that this issue will automatically get resolved after opensearch-project/k-NN#875. Just out of curiosity, do we have an ongoing plan to resolve the k-NN issue?

naveentatikonda added the Features (Introduces a new unit of functionality that satisfies a requirement) label and removed the bug (Something isn't working) label on Sep 18, 2024
naveentatikonda changed the title from [BUG] neural query explain not showing details for nested field to [FEATURE] neural query explain not showing details for nested field on Sep 18, 2024
Status: Backlog · No branches or pull requests · 3 participants