Skip to content

Commit

Permalink
update the prompt to include needs-info (#296)
Browse files Browse the repository at this point in the history
- update the prompt to include triaging issues w/ `needs-info` labels
- `[breaking change]` issues should have `breaking-change-request`
labels but don't necessarily need area labels
- add 5 'few-shot' examples in the classification prompt
- update benchmark expectations
- classification performance has generally improved from 56% to 65%

---

- [x] I’ve reviewed the contributor guide and applied the relevant
portions to this PR.

<details>
  <summary>Contribution guidelines:</summary><br>

- See our [contributor
guide](https://github.com/dart-lang/.github/blob/main/CONTRIBUTING.md)
for general expectations for PRs.
- Larger or significant changes should be discussed in an issue before
creating a PR.
- Contributions to our repos should follow the [Dart style
guide](https://dart.dev/guides/language/effective-dart) and use `dart
format`.
- Most changes should add an entry to the changelog and may need to [rev
the pubspec package
version](https://github.com/dart-lang/sdk/blob/main/docs/External-Package-Maintenance.md#making-a-change).
- Changes to packages require [corresponding
tests](https://github.com/dart-lang/.github/blob/main/CONTRIBUTING.md#Testing).

Note that many Dart repos have a weekly cadence for reviewing PRs -
please allow for some latency before initial review feedback.
</details>
  • Loading branch information
devoncarew committed Sep 4, 2024
1 parent 3496f54 commit 4ba6f15
Show file tree
Hide file tree
Showing 6 changed files with 188 additions and 52 deletions.
11 changes: 7 additions & 4 deletions pkgs/sdk_triage_bot/lib/src/common.dart
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,14 @@ String get geminiKey {
return token;
}

/// Don't return more than 5k of text for an issue body.
String trimmedBody(String body) {
const textLimit = 5 * 1024;
/// Maximal length of body used for querying.
const bodyLengthLimit = 10 * 1024;

return body.length > textLimit ? body = body.substring(0, textLimit) : body;
/// The [body], truncated if larger than [bodyLengthLimit].
String trimmedBody(String body) {
return body.length > bodyLengthLimit
? body = body.substring(0, bodyLengthLimit)
: body;
}

class Logger {
Expand Down
9 changes: 5 additions & 4 deletions pkgs/sdk_triage_bot/lib/src/gemini.dart
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@ import 'package:google_generative_ai/google_generative_ai.dart';
import 'package:http/http.dart' as http;

class GeminiService {
// gemini-1.5-pro-latest, gemini-1.5-flash-latest, gemini-1.0-pro-latest
// Possible values for models: gemini-1.5-pro-latest, gemini-1.5-flash-latest,
// gemini-1.0-pro-latest, gemini-1.5-flash-exp-0827.
static const String classificationModel = 'models/gemini-1.5-flash-latest';
static const String summarizationModel = 'models/gemini-1.5-flash-latest';

Expand All @@ -33,17 +34,17 @@ class GeminiService {

/// Call the summarize model with the given prompt.
///
/// On failures, this will throw a `GenerativeAIException`.
/// On failures, this will throw a [GenerativeAIException].
Future<String> summarize(String prompt) {
return _query(_summarizeModel, prompt);
}

/// Call the classify model with the given prompt.
///
/// On failures, this will throw a `GenerativeAIException`.
/// On failures, this will throw a [GenerativeAIException].
Future<List<String>> classify(String prompt) async {
final result = await _query(_classifyModel, prompt);
final labels = result.split(',').map((l) => l.trim()).toList();
final labels = result.split(',').map((l) => l.trim()).toList()..sort();
return labels;
}

Expand Down
128 changes: 119 additions & 9 deletions pkgs/sdk_triage_bot/lib/src/prompts.dart
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.

// TODO(devoncarew): Add additional prompt instructions for `area-pkg` issues.

String assignAreaPrompt({
required String title,
required String body,
Expand Down Expand Up @@ -45,13 +47,110 @@ If the issue is clearly a bug report, then also apply the label 'type-bug'.
If the issue is mostly a question, then also apply the label 'type-question'.
Otherwise don't apply a 'type-' label.
If the issue title starts with "[breaking change]" it was likely created using
existing issue template; do not assign an area label. IMPORTANT: only do this if
the issue title starts with "[breaking change]".
If the issue was largely unchanged from our default issue template, then apply
the 'needs-info' label and don't assign an area label. These issues will
generally have a title of "Create an issue" and the body will start with "Thank
you for taking the time to file an issue!".
If the issue title is "Analyzer Feedback from IntelliJ", these are generally not
well qualified. For these issues, apply the 'needs-info' label but don't assign
an area label.
Return the labels as comma separated text.
Issue follows:
Here are a series of few-shot examples:
<EXAMPLE>
INPUT: title: Create an issue
body: Thank you for taking the time to file an issue!
This tracker is for issues related to:
Dart analyzer and linter
Dart core libraries (dart:async, dart:io, etc.)
Dart native and web compilers
Dart VM
OUTPUT: needs-info
</EXAMPLE>
<EXAMPLE>
INPUT: title: Analyzer Feedback from IntelliJ
body: ## Version information
- `IDEA AI-202.7660.26.42.7351085`
- `3.4.4`
- `AI-202.7660.26.42.7351085, JRE 11.0.8+10-b944.6842174x64 JetBrains s.r.o, OS Windows 10(amd64) v10.0 , screens 1600x900`
OUTPUT: needs-info
</EXAMPLE>
<EXAMPLE>
INPUT: title: Support likely() and unlikely() hints for AOT code optimization
body: ```dart
// Tell the compiler which branches are going to be taken most of the time.
if (unlikely(n == 0)) {
// This branch is known to be taken rarely.
} else {
// This branch is expected to be in the hot path.
}
final result = likely(s == null) ? commonPath() : notTakenOften();
```
Please add support for the `likely()` and `unlikely()` optimization hints within branching conditions. The AOT compiler can use these hints to generate faster code in a hot path that contains multiple branches.
$title
OUTPUT: area-vm, type-enhancement, type-performance
</EXAMPLE>
$body
<EXAMPLE>
INPUT: title: Analyzer doesn't notice incorrect return type of generic method
body: dart analyze gives no errors on the follow code:
```dart
void main() {
method(getB());
}
void method(String b) => print(b);
B getB<B extends A>() {
return A() as B;
}
class A {}
```
I would have suspected it to say something along the line of **The argument type 'A' can't be assigned to the parameter type 'String'.**
OUTPUT: area-analyzer, type-enhancement
</EXAMPLE>
<EXAMPLE>
INPUT: title: DDC async function stepping improvements
body: Tracking issue to monitor progress on improving debugger stepping through async function bodies.
The new DDC async semantics expand async function bodies into complex state machines. The normal JS stepping semantics don't map cleanly to steps through Dart code given this lowering. There are a couple potential approaches to fix this:
1) Add more logic to the Dart debugger to perform custom stepping behavior when stepping through async code.
2) Modify the async lowering in such a way that stepping more closely resembles stepping through Dart. For example, rather than returning multiple times, the state machine function might be able to yield. Stepping over a yield might allow the debugger to stay within the function body.
OUTPUT: area-web
</EXAMPLE>
The issue to triage follows:
title: $title
body: $body
${lastComment ?? ''}'''
.trim();
Expand All @@ -60,15 +159,26 @@ ${lastComment ?? ''}'''
String summarizeIssuePrompt({
required String title,
required String body,
required bool needsInfo,
}) {
const needsMoreInfo = '''
Our classification model determined that we'll need more information to triage
this issue. Thank them for their contribution and gently prompt them to provide
more information.
''';

final responseLimit = needsInfo ? '' : ' (1-2 sentences, 24 words or less)';

return '''
You are a software engineer on the Dart team at Google. You are responsible for
triaging incoming issues from users. For each issue, briefly summarize the issue
(1-2 sentences, 24 words or less).
You are a software engineer on the Dart team at Google.
You are responsible for triaging incoming issues from users.
For each issue, briefly summarize the issue $responseLimit.
${needsInfo ? needsMoreInfo : ''}
Issue follows:
The issue to triage follows:
$title
title: $title
$body''';
body: $body''';
}
54 changes: 34 additions & 20 deletions pkgs/sdk_triage_bot/lib/triage.dart
Original file line number Diff line number Diff line change
Expand Up @@ -70,37 +70,45 @@ ${trimmedBody(comment.body ?? '')}
}
}

// ask for the summary
var bodyTrimmed = trimmedBody(issue.body);
String summary;

// ask for the 'area-' classification
List<String> newLabels;
try {
summary = await geminiService.summarize(
summarizeIssuePrompt(title: issue.title, body: bodyTrimmed),
newLabels = await geminiService.classify(
assignAreaPrompt(
title: issue.title,
body: bodyTrimmed,
lastComment: lastComment,
),
);
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
stderr.writeln('gemini: $e');
exit(1);
}

logger.log('## gemini summary');
logger.log('');
logger.log(summary);
logger.log('');

// ask for the 'area-' classification
List<String> newLabels;
// ask for the summary
String summary;
try {
newLabels = await geminiService.classify(
assignAreaPrompt(
title: issue.title, body: bodyTrimmed, lastComment: lastComment),
summary = await geminiService.summarize(
summarizeIssuePrompt(
title: issue.title,
body: bodyTrimmed,
needsInfo: newLabels.contains('needs-info'),
),
);
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
stderr.writeln('gemini: $e');
exit(1);
}

logger.log('## gemini summary');
logger.log('');
logger.log(summary);
logger.log('');

logger.log('## gemini classification');
logger.log('');
logger.log(newLabels.toString());
Expand All @@ -123,9 +131,9 @@ ${trimmedBody(comment.body ?? '')}
// create github comment
await githubService.createComment(sdkSlug, issueNumber, comment);

final allRepoLabels = (await githubService.getAllLabels(sdkSlug)).toSet();
final labelAdditions = newLabels.toSet().intersection(allRepoLabels).toList()
..sort();
final allRepoLabels = await githubService.getAllLabels(sdkSlug);
final labelAdditions =
filterLegalLabels(newLabels, allRepoLabels: allRepoLabels);
if (labelAdditions.isNotEmpty) {
labelAdditions.add('triage-automation');
}
Expand All @@ -141,7 +149,13 @@ ${trimmedBody(comment.body ?? '')}
logger.log('Triaged ${issue.htmlUrl}');
}

List<String> filterExistingLabels(
List<String> allLabels, List<String> newLabels) {
return newLabels.toSet().intersection(allLabels.toSet()).toList();
List<String> filterLegalLabels(
List<String> labels, {
required List<String> allRepoLabels,
}) {
final validLabels = allRepoLabels.toSet();
return [
for (var label in labels)
if (validLabels.contains(label)) label,
]..sort();
}
29 changes: 19 additions & 10 deletions pkgs/sdk_triage_bot/tool/bench.dart
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ void main(List<String> args) async {
await githubService.fetchIssue(sdkSlug, expectation.issueNumber);
final bodyTrimmed = trimmedBody(issue.body);

print('#${issue.number}');
print('#${issue.number}: ${expectation.expectedLabels.join(', ')}');

try {
final labels = await geminiService.classify(
Expand All @@ -60,12 +60,7 @@ void main(List<String> args) async {
if (expectation.satisfiedBy(labels)) {
predicted++;
} else {
var title = issue.title.length > 100
? '${issue.title.substring(0, 100)}...'
: issue.title;
print(' "$title"');
print(' labeled: ${expectation.expectedLabels.join(', ')}');
print(' prediction: ${labels.join(', ')}');
stderr.writeln(' bot: ${labels.join(', ')}');
}
} on GenerativeAIException catch (e) {
// Failures here can include things like gemini safety issues, ...
Expand Down Expand Up @@ -114,8 +109,22 @@ class ClassificationResults {
}

bool satisfiedBy(List<String> labels) {
final filtered = labels.where((l) => !l.startsWith('type-')).toSet();
final expected = expectedLabels.where((l) => !l.startsWith('type-'));
return expected.every(filtered.contains);
// Handle a `needs-info` label.
if (expectedLabels.contains('needs-info')) {
return labels.contains('needs-info');
}

// Handle a `breaking-change-request` label.
if (expectedLabels.contains('breaking-change-request')) {
return labels.contains('breaking-change-request');
}

for (final label in expectedLabels.where((l) => l.startsWith('area-'))) {
if (!labels.contains(label)) {
return false;
}
}

return true;
}
}
9 changes: 4 additions & 5 deletions pkgs/sdk_triage_bot/tool/bench.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@ materially improve the classification performance.
| #56354 | `area-web`, `type-bug` |
| #56353 | `area-dart2wasm` |
| #56350 | `area-analyzer`, `type-enhancement` |
| #56348 | `area-intellij` |
| #56347 | `area-dart-cli`, `type-bug` |
| #56346 | `area-pkg`, `pkg-json`, `type-enhancement` |
| #56345 | `area-analyzer`, `type-enhancement` |
Expand All @@ -44,7 +43,7 @@ materially improve the classification performance.
| #56316 | `area-web` |
| #56315 | `area-web` |
| #56314 | `area-web`, `type-bug` |
| #56308 | `area-vm` |
| #56308 | `area-vm` `breaking-change-request` |
| #56306 | `area-vm`, `type-bug` |
| #56305 | `area-front-end`, `type-bug`, `type-question` |
| #56304 | `area-core-library`, `type-enhancement` |
Expand All @@ -55,13 +54,12 @@ materially improve the classification performance.
| #56283 | `area-dart2wasm` |
| #56256 | `area-front-end`, `type-bug` |
| #56254 | `area-pkg`, `pkg-vm-service`, `type-bug` |
| #56246 | `area-intellij` |
| #56240 | `area-intellij` |
| #56240 | `needs-info` |
| #56229 | `area-infrastructure` |
| #56227 | `area-native-interop` |
| #56220 | `area-infrastructure`, `type-code-health` |
| #56217 | `area-meta` |
| #56216 | `area-intellij` |
| #56216 | `needs-info` |
| #56214 | `area-native-interop` |
| #56208 | `area-google3`, `type-enhancement` |
| #56207 | `area-google3` |
Expand Down Expand Up @@ -108,3 +106,4 @@ We need more information from the user before we can triage these issues.

## Results
2024-08-27: 55.6% using gemini-1.5-flash-latest
2024-08-30: 64.8% using gemini-1.5-flash-latest

0 comments on commit 4ba6f15

Please sign in to comment.