Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[text analytics] opinion mining support #12542

Merged
merged 22 commits into from
Jul 31, 2020

Conversation

iscai-msft
Copy link
Contributor

fixes #12476
fixes #12477

Will be updating the samples and readme in a subsequent PR

@@ -408,11 +413,26 @@ def analyze_sentiment( # type: ignore
docs = _validate_batch_input(documents, "language", language)
model_version = kwargs.pop("model_version", None)
show_stats = kwargs.pop("show_stats", False)
show_aspects = kwargs.pop("show_aspects", None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should default to False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with None to get the service default, @johanste is the policy to default to None in this case, or False to keep it static

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to distinguish between "the application explicitly passed in this value" vs. "the application didn't provide a value, so we'll pick an appropriate default for them" for positional arguments, then we use a sentinel value as the default value. This is often None when None is not a valid value.

Since we are dealing with kwargs here, it would be easier/clearer to check if 'show_aspects' in kwargs: - no need for sentinel values since kwargs inherently let's you determine if the user passed the parameter or not.

Copy link
Contributor Author

@iscai-msft iscai-msft Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johanste I do need to still pop kwargs in this case since the name of the parameter has changed (we have it as show_aspects, the service has it as opinion_mining). I think I'll stick with the sentinel value of None, holler if this is wrong

@iscai-msft iscai-msft added the Client This issue points to a problem in the data-plane of the library. label Jul 15, 2020
@iscai-msft iscai-msft self-assigned this Jul 15, 2020
@iscai-msft iscai-msft closed this Jul 15, 2020
@iscai-msft iscai-msft reopened this Jul 15, 2020
@iscai-msft
Copy link
Contributor Author

/azp run python - textanalytics - ci

@@ -378,6 +378,12 @@ def analyze_sentiment( # type: ignore
:type documents:
list[str] or list[~azure.ai.textanalytics.TextDocumentInput] or
list[dict[str, str]]
:keyword bool show_aspects: Whether to conduct more granular analysis around the aspects of
Copy link
Member

@johanste johanste Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a .. versionadded (or possibly a .. note) directive that indicates in which API version(s) this parameter is available. We should see what the generated docs look like for each since our docs pipelines are customized and I don't think we've used either up until now...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this code to the PR. While this is not generating great right now (@johanste you mentioned the indentation is off etc), @scbedd is looking into better rendering (thanks Scott!!!!)

…into opinion_mining

* 'master' of https://github.com/Azure/azure-sdk-for-python: (69 commits)
  [formrecognizer] renames from consistency check (Azure#12752)
  [text analytics] add back PII endpoint (Azure#12673)
  [text analytics] generate multiapi client, made v3.1-preview.1 default version (Azure#12720)
  Prevent AttributeError during get_certificate_version (Azure#12747)
  re-apply updates after resetting master (Azure#12771)
  Sync eng/common directory with azure-sdk-tools repository (Azure#12745)
  Enable bandit for servicebus (Azure#12763)
  Skip pypy3 for service bus CI pipeline (Azure#12732)
  Update CODEOWNERS (Azure#12765)
  updated tox file (Azure#11087)
  Updating to match Microsoft recommended template (Azure#11045)
  [text analytics] update version in master (Azure#12749)
  Explicitly stringify message prints in samples per manual doc pass recommendation. (Azure#12709)
  Run only analyse dependency step for aggregate report (Azure#12728)
  VSCodeCredential supports Python 2.7 on Windows (Azure#12731)
  Enable bandit (Azure#12722)
  Move InteractiveCredential to a new module (Azure#12706)
  Add foundations for eng/common restructure (Azure#12725)
  Sync eng/common directory with azure-sdk-tools repository (Azure#12730)
  fix flaky test (Azure#12721)
  ...
@@ -6,6 +6,8 @@
- We are now targeting the service's v3.1-preview.1 API as the default. If you would like to still use version v3.0 of the service,
pass in `v3.0` to the kwarg `api_version` when creating your TextAnalyticsClient
- We have added an API `recognize_pii_entities` which returns entities containing personal information for a batch of documents. Only available for API version v3.1-preview.1 and up.
- We now have added support for opinion mining. To use this feature, you need to make sure you are using the service's
v3.1-preview.1 API. To get this support pass `show_opinion_mining` as True when calling the `analyze_sentiment` endpoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you gloating

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no I'm smiling - like the emoji is

@@ -459,6 +465,8 @@ async def analyze_sentiment( # type: ignore
be used for scoring, e.g. "latest", "2019-10-01". If a model-version
is not specified, the API will default to the latest, non-preview version.
:keyword bool show_stats: If set to true, response will contain document level statistics.
.. versionadded:: v3.1-preview.1
The *show_opinion_mining* parameter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious if this is correct placement or if it should be under the keyword for show_opinion_mining?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i put it under the definition for show_opinion_mining, it's splitting up the keyword definition now in a very weird way. Speaking of.. @scbedd do you have any updates for this? no rush if not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we move show_opinion_mining as the last keyword in docstring? does that help at all?

kristapratico
kristapratico previously approved these changes Jul 30, 2020
@@ -702,19 +706,30 @@ class SentenceSentiment(DictMixin):
and 1 for the sentence for all labels.
:vartype confidence_scores:
~azure.ai.textanalytics.SentimentConfidenceScores
:ivar mined_opinions: The list of opinions mined from this sentence.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be good to add a ask.ms link to service documentation here about Opinion Mining?

@@ -451,6 +451,12 @@ def analyze_sentiment( # type: ignore
:type documents:
list[str] or list[~azure.ai.textanalytics.TextDocumentInput] or
list[dict[str, str]]
:keyword bool show_opinion_mining: Whether to mine the opinions of a sentence and conduct more
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we link here service docs too?

granular analysis around the aspects of a product or service (also known as
aspect-based sentiment analysis). If set to true, the returned
:class:`~azure.ai.textanalytics.SentenceSentiment` objects
will have property `mined_opinions` containing the result of this analysis. Only available for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should say this here, as we will have to update it once service GA 3.1
We could just use the link to service docs and let them manage the versions of where the functionality is available

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that it is an extra step of maintenance, but I think we should still have this documentation is here, so it's easier for users to see off the bat that this is only available in certain versions.

Will follow your guidance and open an issue to track this for the future, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue here: #12832

except TypeError as error:
if "opinion_mining" in str(error):
raise NotImplementedError(
"'show_opinion_mining' is only available for API version v3.1-preview.1 and up"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

early, but we should create an issue to update this value once service GA`s 3.1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue here: #12832

Comment on lines +592 to +594
self.assertIsNotNone(aspect.confidence_scores.positive)
self.assertEqual(0.0, aspect.confidence_scores.neutral)
self.assertIsNotNone(aspect.confidence_scores.negative)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add an extra check where all the confidence scores need to add up to 1. That way we know for sure that not all of them are 0

Copy link
Member

@maririos maririos Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more context, in .NET I was doing the exact checks and I missed a bug in the deserialization, so all scores always ended up in 0.0.
The adds up to check has caught those failures

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's fair. i've created a separate issue to add this, since there are more instances in our test suite where i'd like this added: #12833


@GlobalTextAnalyticsAccountPreparer()
@TextAnalyticsClientPreparer()
def test_aspect_based_sentiment_analysis(self, client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another test we could add is to turn on opinion mining with a sentence that gives us no results.
For example for the sentence: "today is a hot day"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great suggestion, adding that to this PR. thanks!

maririos
maririos previously approved these changes Jul 30, 2020
Copy link
Member

@maririos maririos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor doc and test comments, otherwise LGTM

@iscai-msft iscai-msft merged commit 2c4182c into Azure:master Jul 31, 2020
@iscai-msft iscai-msft deleted the opinion_mining branch July 31, 2020 14:56
iscai-msft added a commit to iscai-msft/azure-sdk-for-python that referenced this pull request Jul 31, 2020
…into ta_opinion_mining_sample

* 'master' of https://github.com/Azure/azure-sdk-for-python: (47 commits)
  [text analytics] opinion mining support (Azure#12542)
  [key vault] Regenerate keys (Azure#12101)
  [Tables] Azure Data Tables SDK, sync and async code (Azure#12766)
  fix doc, sample, docstring (Azure#12784)
  clean up links for PII samples (Azure#12826)
  [formrecognizer] renames from consistency check (Azure#12752)
  [text analytics] add back PII endpoint (Azure#12673)
  [text analytics] generate multiapi client, made v3.1-preview.1 default version (Azure#12720)
  Prevent AttributeError during get_certificate_version (Azure#12747)
  re-apply updates after resetting master (Azure#12771)
  Sync eng/common directory with azure-sdk-tools repository (Azure#12745)
  Enable bandit for servicebus (Azure#12763)
  Skip pypy3 for service bus CI pipeline (Azure#12732)
  Update CODEOWNERS (Azure#12765)
  updated tox file (Azure#11087)
  Updating to match Microsoft recommended template (Azure#11045)
  [text analytics] update version in master (Azure#12749)
  Explicitly stringify message prints in samples per manual doc pass recommendation. (Azure#12709)
  Run only analyse dependency step for aggregate report (Azure#12728)
  VSCodeCredential supports Python 2.7 on Windows (Azure#12731)
  ...
iscai-msft added a commit to iscai-msft/azure-sdk-for-python that referenced this pull request Jul 31, 2020
…into regenerate_certs

* 'master' of https://github.com/Azure/azure-sdk-for-python:
  [key vault] Regenerate secrets (Azure#12103)
  [text analytics] opinion mining support (Azure#12542)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Client This issue points to a problem in the data-plane of the library. Cognitive - Text Analytics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[text analytics] tests for opinion mining [text analytics] implement opinion mining design
6 participants