Draft Proposal for Diagnostics Client Library (a.k.a. "Runtime Client Library") #574

Merged Nov 13, 2019 (34 commits)
c8d1da2
API design doc
sywhang Oct 14, 2019
5aa58a5
more doc
sywhang Oct 15, 2019
b14c3dd
More docs
sywhang Oct 17, 2019
724ca08
some sample code, more detailed description of each API
sywhang Oct 18, 2019
e03a50a
cleanups
sywhang Oct 18, 2019
3ac65ca
Fix markup
sywhang Oct 18, 2019
09edb81
more cleanup
sywhang Oct 18, 2019
6db8055
more cleanup
sywhang Oct 18, 2019
8ffcb28
add DiagnosticsIPCHelper
sywhang Oct 18, 2019
1ab8edb
Add default IPC socket path related helpers
sywhang Oct 18, 2019
4fe23b3
Fix some typos, cases for names
sywhang Oct 21, 2019
8c249ab
Make EventPipeProvider.FilterData an IEnumerable
sywhang Oct 22, 2019
8938d93
Add AddFilterData to EventPipeProvider class
sywhang Oct 22, 2019
00ca2be
Trying to write sample code first
sywhang Oct 23, 2019
98f5509
Some cleanup on unused structs/enums
sywhang Oct 23, 2019
100cec0
Remove EventPipe namespace
sywhang Oct 23, 2019
8c8281b
Adding more sample code, EventPipeSession definition
sywhang Oct 24, 2019
cf5b24d
Some more code in the sample
sywhang Oct 24, 2019
1859bf8
Fix API description
sywhang Oct 24, 2019
b121b5f
Simplify Exceptions, modify some formatting
sywhang Oct 25, 2019
9f671e8
Fix typo and add more description to the intro
sywhang Oct 25, 2019
db05fb7
Remove EventPipeProvider.ToDisplayString() and make Provider IReadOnl…
sywhang Oct 25, 2019
8148b9f
Fix some error in the sample code
sywhang Oct 25, 2019
32b5a64
Address PR comments
sywhang Oct 30, 2019
738a947
State the intended assembly name explicitly
sywhang Oct 30, 2019
2d6a498
Fix some typos
sywhang Nov 1, 2019
9d7bac2
Fix sample code description
sywhang Nov 1, 2019
7eb973a
More fixes
sywhang Nov 5, 2019
9bc8658
Add example about live-parsing events for a period of time
sywhang Nov 5, 2019
57d4f0b
var
sywhang Nov 6, 2019
aa4169d
Fix sample code, add profiler attach sample
sywhang Nov 6, 2019
74d9e12
More fixes
sywhang Nov 6, 2019
1038447
Remove unused sampleprofiler from sample #4
sywhang Nov 7, 2019
c05e35c
More descriptions about each sample, fix few nits
sywhang Nov 7, 2019
383 changes: 383 additions & 0 deletions documentation/design-docs/diagnostics-client-library.md
@@ -0,0 +1,383 @@
# Diagnostics Client Library API Design

## Intro
The Diagnostics Client Library (currently named "Runtime Client Library"), `Microsoft.Diagnostics.NETCore.Client.dll`, is a managed library for interacting with the .NET runtime via the diagnostics IPC protocol (documented at https://github.com/dotnet/diagnostics/blob/master/documentation/design-docs/ipc-protocol.md). It provides managed classes for invoking the diagnostics IPC commands programmatically and can be extended to write various diagnostics tools. It also comes with helper classes that facilitate working with those commands.

The name "Diagnostics Client Library" comes from the fact that the runtime (CoreCLR) component responsible for accepting and handling diagnostics IPC commands is called the "diagnostics server" (https://github.com/dotnet/coreclr/blob/master/src/vm/diagnosticserver.h). Because this managed library sits on the other side of the IPC protocol and communicates with the runtime's diagnostics server, "Diagnostics Client Library" is a natural name for it.

## Goals

The goals of this library are as follows:

* Serve as an implementation of the IPC protocol to communicate with CoreCLR's diagnostics server.
* Provide an easy-to-use API for library and tool authors to utilize the IPC protocol.

## Non-Goals

* Provide tool-specific functionality that is too high-level (e.g. dumping the GC heap or parsing counter payloads). This would broaden the scope of the library too far and add complexity.
* Parse event payloads. This is also command-specific and can be done by other libraries.

## Sample Code

Here is some sample code showing how to use this library.

#### 1. Attaching to a process and dumping out all the runtime GC events in real time to the console
This sample shows an example where we trigger an EventPipe session with the .NET runtime provider with the GC keyword at informational level, and use `EventPipeEventSource` (provided by the TraceEvent library) to parse the events coming in and print the name of each event to the console in real time.

```cs
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;

public void PrintRuntimeGCEvents(int processId)
{
    var providers = new List<EventPipeProvider>()
    {
        new EventPipeProvider("Microsoft-Windows-DotNETRuntime",
            EventLevel.Informational, (long)ClrTraceEventParser.Keywords.GC)
    };

    var client = new DiagnosticsClient(processId);
    using (var session = client.StartEventPipeSession(providers, false))
    {
        var source = new EventPipeEventSource(session.EventStream);

        // Print the name of every event as it arrives.
        source.Dynamic.All += (TraceEvent obj) =>
        {
            Console.WriteLine(obj.EventName);
        };
        try
        {
            // Blocks until the event stream ends.
            source.Process();
        }
        // NOTE: This exception does not currently exist. It is something that needs to be added to TraceEvent.
        catch (EventStreamException e)
        {
            Console.WriteLine("Error encountered while processing events");
            Console.WriteLine(e.ToString());
        }
    }
}
```

Review discussion on `catch (EventStreamException e)`:

Contributor: Do we need to create a new exception or could we just catch the exception that caused the stream reading error? I don't want to hide the underlying issue by having the EventSource simply say "something went wrong".

Contributor Author: By EventSource do you mean EventPipeEventSource?

Contributor: whoops! Yeah, I do.

Contributor Author (sywhang, Nov 7, 2019): On "could we just catch the exception that caused the stream reading error": this part is a little ambiguous. I think of this EventStreamException as an exception that gets thrown by EventPipeEventSource when there is an exception while reading the stream because the stream was shut down by something. There could be other exceptions, such as an incorrect or corrupted payload, which can probably be encapsulated by a different exception class.

On "I don't want to hide the underlying issue": the sample is meant to capture the intention to add more exception classes to EventPipeEventSource, so that it does not throw a general Exception and exactly this kind of thing is prevented.

#### 2. Write a core dump.
This sample shows how to trigger a dump using `DiagnosticsClient`.
```cs
using Microsoft.Diagnostics.NETCore.Client;

public void TriggerCoreDump(int processId)
{
    var client = new DiagnosticsClient(processId);
    // No path is given, so the dump goes to the default location.
    client.WriteDump(DumpType.Normal);
}
```

#### 3. Trigger a core dump when CPU usage goes above a certain threshold
This sample shows an example where we monitor the `cpu-usage` counter published by the .NET runtime and use the `WriteDump` API to write out a dump when the CPU usage grows beyond a certain threshold.
```cs

using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;

public void TriggerDumpOnCpuUsage(int processId, int threshold)
{
    var providers = new List<EventPipeProvider>()
    {
        new EventPipeProvider(
            "System.Runtime",
            EventLevel.Informational,
            (long)ClrTraceEventParser.Keywords.None,
            // Ask the runtime to publish counter values once per second.
            new Dictionary<string, string>() {
                { "EventCounterIntervalSec", "1" }
            }
        )
    };
    var client = new DiagnosticsClient(processId);
    using (var session = client.StartEventPipeSession(providers))
    {
        var source = new EventPipeEventSource(session.EventStream);
        source.Dynamic.All += (TraceEvent obj) =>
        {
            if (obj.EventName.Equals("EventCounters"))
            {
                // I know this part is ugly. But this is all TraceEvent.
                var payloadFields = (IDictionary<string, object>)(obj.GetPayloadValueByName("Payload"));
                if (payloadFields["Name"].ToString().Equals("cpu-usage"))
                {
                    // The payload value is an object, so convert it to a string before parsing.
                    double cpuUsage = Double.Parse(payloadFields["Mean"].ToString());
                    if (cpuUsage > (double)threshold)
                    {
                        client.WriteDump(DumpType.Normal, "/tmp/minidump.dmp");
                    }
                }
            }
        };
        try
        {
            source.Process();
        }
        // NOTE: EventStreamException does not currently exist (see the note in sample 1).
        catch (EventStreamException) {}
    }
}
```

#### 4. Trigger a CPU trace for a given number of seconds
This sample shows an example where we trigger an EventPipe session for a certain period of time, with the default CLR trace keyword as well as the sample profiler, read from the resulting stream, and write the bytes out to a file. Essentially this is what `dotnet-trace` uses internally to write a trace file.

```cs

using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing.Parsers;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Tracing;
using System.IO;
using System.Threading.Tasks;

// duration is given in seconds.
public void TraceProcessForDuration(int processId, int duration, string traceName)
{
    var cpuProviders = new List<EventPipeProvider>()
    {
        new EventPipeProvider("Microsoft-Windows-DotNETRuntime", EventLevel.Informational, (long)ClrTraceEventParser.Keywords.Default),
        new EventPipeProvider("Microsoft-DotNETCore-SampleProfiler", EventLevel.Informational, (long)ClrTraceEventParser.Keywords.None)
    };
    var client = new DiagnosticsClient(processId);
    using (var traceSession = client.StartEventPipeSession(cpuProviders))
    {
        // Copy the event stream to the trace file in the background.
        Task copyTask = Task.Run(async () =>
        {
            using (FileStream fs = new FileStream(traceName, FileMode.Create, FileAccess.Write))
            {
                await traceSession.EventStream.CopyToAsync(fs);
            }
        });

        // Wait for at most `duration` seconds, then stop the session.
        copyTask.Wait(duration * 1000);
        traceSession.Stop();
    }
}
```

Review comment on `TraceProcessForDuration(int processId, int duration, string traceName)`:

Member: Make duration a TimeSpan
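The review comment on this sample suggests taking the duration as a `TimeSpan`. A minimal sketch of that idea, using only standard .NET APIs: `Task.Wait(TimeSpan)` blocks for at most the given duration and reports whether the task finished in time. `BoundedCopy` and `CopyFor` are hypothetical names for illustration and are not part of the proposed library.

```cs
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper, not part of the proposed library.
public static class BoundedCopy
{
    // Copies `source` into `destination` on a background task and waits
    // for at most `duration`. Returns true if the copy finished in time.
    public static bool CopyFor(Stream source, Stream destination, TimeSpan duration)
    {
        Task copyTask = Task.Run(() => source.CopyToAsync(destination));
        return copyTask.Wait(duration);
    }
}
```

In the sample, `source` would be `traceSession.EventStream`, and the caller would still call `traceSession.Stop()` after `CopyFor` returns.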

#### 5. Print the names of all .NET processes that have published a diagnostics server to connect to

This sample shows how to use the `DiagnosticsClient.GetPublishedProcesses` API to print the names of the .NET processes that have published a diagnostics IPC channel.

```cs
using Microsoft.Diagnostics.NETCore.Client;
using System;
using System.Diagnostics;
using System.Linq;

public static void PrintProcessStatus()
{
    // Map each published process ID to its Process object.
    var processes = DiagnosticsClient.GetPublishedProcesses()
        .Select(Process.GetProcessById)
        .Where(process => process != null);

    foreach (var process in processes)
    {
        Console.WriteLine($"{process.ProcessName}");
    }
}
```


#### 6. Live-parsing events for a specified period of time.

This sample creates two tasks: one that parses incoming events live with `EventPipeEventSource`, and one that waits for console input signaling the program to end. If the target app exits before the user presses Enter, the program exits gracefully. Otherwise, `inputTask` sends the Stop command to the pipe and the program exits gracefully.

```cs
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;

public static void PrintEventsLive(int processId)
{
    var providers = new List<EventPipeProvider>()
    {
        new EventPipeProvider("Microsoft-Windows-DotNETRuntime",
            EventLevel.Informational, (long)ClrTraceEventParser.Keywords.Default)
    };
    var client = new DiagnosticsClient(processId);
    using (var session = client.StartEventPipeSession(providers, false))
    {
        // Parses and prints events until the stream ends.
        Task streamTask = Task.Run(() =>
        {
            var source = new EventPipeEventSource(session.EventStream);
            source.Dynamic.All += (TraceEvent obj) =>
            {
                Console.WriteLine(obj.EventName);
            };
            try
            {
                source.Process();
            }
            // NOTE: A dedicated exception type for stream errors does not exist yet, so this catches the general Exception.
            catch (Exception e)
            {
                Console.WriteLine("Error encountered while processing events");
                Console.WriteLine(e.ToString());
            }
        });

        // Stops the session once the user presses Enter.
        Task inputTask = Task.Run(() =>
        {
            Console.WriteLine("Press Enter to exit");
            while (Console.ReadKey().Key != ConsoleKey.Enter)
            {
                Thread.Sleep(100);
            }
            session.Stop();
        });

        // Return as soon as either the stream ends or the user asks to stop.
        Task.WaitAny(streamTask, inputTask);
    }
}
```

#### 7. Attach an ICorProfiler profiler

This sample shows how to attach an ICorProfiler to a process (profiler attach).
```cs
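using System;
using Microsoft.Diagnostics.NETCore.Client;

// AttachProfiler is declared as void in the API description below, so nothing is returned here.
public static void AttachProfiler(int processId, Guid profilerGuid, string profilerPath)
{
    var client = new DiagnosticsClient(processId);
    client.AttachProfiler(TimeSpan.FromSeconds(10), profilerGuid, profilerPath);
}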
```

## API Descriptions

At a high level, the `DiagnosticsClient` class provides methods that the user may call to invoke diagnostics IPC commands (e.g. start an EventPipe session or request a core dump). The library also provides several classes that may be helpful for invoking these commands. The commands are described in more detail in the diagnostics IPC protocol documentation: https://github.com/dotnet/diagnostics/blob/master/documentation/design-docs/ipc-protocol.md#commands


### DiagnosticsClient
This is a top-level class that contains methods to send various diagnostics commands to the runtime.
```cs
namespace Microsoft.Diagnostics.NETCore.Client
{
    public class DiagnosticsClient
    {
        public DiagnosticsClient(int processId);

        /// <summary>
        /// Start tracing the application via CollectTracing2 command.
        /// </summary>
        /// <param name="providers">An IEnumerable containing the list of Providers to turn on.</param>
        /// <param name="requestRundown">If true, request rundown events from the runtime</param>
        /// <param name="circularBufferMB">The size of the runtime's buffer for collecting events in MB</param>
        /// <returns>
        /// An EventPipeSession object representing the EventPipe session that just started.
        /// </returns>
        public EventPipeSession StartEventPipeSession(IEnumerable<EventPipeProvider> providers, bool requestRundown=true, int circularBufferMB=256);

        /// <summary>
        /// Trigger a core dump generation.
        /// </summary>
        /// <param name="dumpType">Type of the dump to be generated</param>
        /// <param name="dumpPath">Full path to the dump to be generated. By default it is /tmp/coredump.{pid}</param>
        /// <param name="logDumpGeneration">When set to true, display the dump generation debug log to the console.</param>
        public void WriteDump(DumpType dumpType, string dumpPath=null, bool logDumpGeneration=false);

        /// <summary>
        /// Attach a profiler.
        /// </summary>
        /// <param name="attachTimeout">Timeout for attaching the profiler</param>
        /// <param name="profilerGuid">Guid for the profiler to be attached</param>
        /// <param name="profilerPath">Path to the profiler to be attached</param>
        /// <param name="additionalData">Additional data to be passed to the profiler</param>
        public void AttachProfiler(TimeSpan attachTimeout, Guid profilerGuid, string profilerPath, byte[] additionalData=null);

        /// <summary>
        /// Get all the active processes that can be attached to.
        /// </summary>
        /// <returns>
        /// IEnumerable of all the active process IDs.
        /// </returns>
        public static IEnumerable<int> GetPublishedProcesses();
    }
}
```


### Exceptions that can be thrown

```cs
namespace Microsoft.Diagnostics.NETCore.Client
{
    // Generic wrapper for exceptions thrown by this library
    public class DiagnosticsClientException : Exception {}

    // When a certain command is not supported by either the library or the target process' runtime
    public class UnsupportedProtocolException : DiagnosticsClientException {}

    // When the runtime is no longer available for attaching.
    public class ServerNotAvailableException : DiagnosticsClientException {}

    // When the runtime responded with an error
    public class ServerErrorException : DiagnosticsClientException {}
}
```
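As a rough illustration of how a caller might handle this hierarchy, the sketch below catches the derived types before the base type. The four exception classes are local stand-ins that mirror the declarations above so the snippet compiles on its own (drop them when referencing the real library); `DiagnosticsErrors.Describe` and the message strings are hypothetical and not part of the proposed API.

```cs
using System;

// Stand-ins mirroring the hierarchy listed above, so the snippet compiles on its own.
public class DiagnosticsClientException : Exception {}
public class UnsupportedProtocolException : DiagnosticsClientException {}
public class ServerNotAvailableException : DiagnosticsClientException {}
public class ServerErrorException : DiagnosticsClientException {}

// Hypothetical helper, not part of the proposed library.
public static class DiagnosticsErrors
{
    // Runs `command` and maps each failure mode to a short message.
    // Derived exception types must be caught before the base type.
    public static string Describe(Action command)
    {
        try
        {
            command();
            return "ok";
        }
        catch (ServerNotAvailableException)
        {
            return "runtime is no longer available for attaching";
        }
        catch (UnsupportedProtocolException)
        {
            return "command not supported by the library or the target runtime";
        }
        catch (ServerErrorException)
        {
            return "runtime responded with an error";
        }
        catch (DiagnosticsClientException)
        {
            return "other diagnostics client failure";
        }
    }
}
```

A caller would wrap a command such as `client.WriteDump(DumpType.Normal)` in the `command` delegate. Exceptions outside this hierarchy are deliberately left to propagate.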

### EventPipeProvider
A class that describes an EventPipe provider.
```cs
namespace Microsoft.Diagnostics.NETCore.Client
{
    public class EventPipeProvider
    {
        public EventPipeProvider(
            string name,
            EventLevel eventLevel,
            long keywords = 0,
            IDictionary<string, string> arguments = null);

        public long Keywords { get; }

        public EventLevel EventLevel { get; }

        public string Name { get; }

        public IDictionary<string, string> Arguments { get; }

        public override string ToString();

        public override bool Equals(object obj);

        public override int GetHashCode();

        public static bool operator ==(EventPipeProvider left, EventPipeProvider right);

        public static bool operator !=(EventPipeProvider left, EventPipeProvider right);
    }
}
```

### EventPipeSession
This class represents an EventPipe session. It is meant to be immutable and acts as a handle to a session that has been started.

```cs
namespace Microsoft.Diagnostics.NETCore.Client
{
    public class EventPipeSession : IDisposable
    {
        public Stream EventStream { get; }

        ///<summary>
        /// Stops the given session
        ///</summary>
        public void Stop();
    }
}
```

Review discussion on `public void Stop();`:

Contributor: Without a constructor or initialize method, how will Stop() know what session id to send?

Contributor Author (sywhang, Nov 7, 2019): The constructor is internal. _sessionId is a private member - the goal here is to provide as simple an API as possible for the end user. Internal state like sessionId is either internal or private and won't be exposed unless we see a need for it.
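The review thread on `Stop()` explains that the session ID lives in a private field set by an internal constructor. One way that "handle" shape could look is sketched below. This is an illustration under stated assumptions, not the library's implementation: `SessionHandle`, the `sendStop` delegate (a stand-in for sending the IPC stop command), the idempotent `Stop()`, and `Dispose()` calling `Stop()` are all hypothetical.

```cs
using System;
using System.IO;

// Hypothetical stand-in for EventPipeSession; not the library's implementation.
public sealed class SessionHandle : IDisposable
{
    private readonly ulong _sessionId;         // never exposed to callers
    private readonly Action<ulong> _sendStop;  // stand-in for the IPC stop command
    private bool _stopped;

    public Stream EventStream { get; }

    // Internal: only the library itself can create a handle,
    // so callers never need to know the session ID.
    internal SessionHandle(ulong sessionId, Stream eventStream, Action<ulong> sendStop)
    {
        _sessionId = sessionId;
        EventStream = eventStream;
        _sendStop = sendStop;
    }

    // Assumption: stopping twice is harmless and sends the command only once.
    public void Stop()
    {
        if (_stopped) return;
        _stopped = true;
        _sendStop(_sessionId);
    }

    // Assumption: disposing an active session stops it first.
    public void Dispose()
    {
        Stop();
        EventStream.Dispose();
    }
}

public static class SessionHandleDemo
{
    // Returns how many stop commands were sent after Stop, Stop, Dispose.
    public static int CountStopCommands()
    {
        int sent = 0;
        var handle = new SessionHandle(42, new MemoryStream(), id => sent++);
        handle.Stop();
        handle.Stop();
        handle.Dispose();
        return sent;
    }
}
```

Keeping the ID private means the public surface stays at `EventStream`, `Stop()`, and `Dispose()`, which matches the stated goal of a simple API.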

### DumpType (enum)
This is an enum for the dump type

```cs
namespace Microsoft.Diagnostics.NETCore.Client
{
    public enum DumpType
    {
        Normal = 1,
        WithHeap = 2,
        Triage = 3,
        Full = 4
    }
}
```
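A tool built on this library will probably need to turn user input into a `DumpType`. The sketch below shows one way to do that with the standard `Enum.TryParse` and `Enum.IsDefined` calls. The enum is copied from the listing above so the snippet compiles on its own; `DumpTypeParser` is a hypothetical helper and not part of the proposed API.

```cs
using System;

// Copy of the enum above, so the snippet compiles on its own.
public enum DumpType
{
    Normal = 1,
    WithHeap = 2,
    Triage = 3,
    Full = 4
}

// Hypothetical helper, not part of the proposed library.
public static class DumpTypeParser
{
    // Parses a name such as "withheap" (case-insensitive) or a number such as "4".
    // `dumpType` is only meaningful when the method returns true.
    public static bool TryParse(string text, out DumpType dumpType)
    {
        // Enum.TryParse alone accepts any numeric string (e.g. "7"),
        // so also check that the result is a declared member.
        return Enum.TryParse(text, ignoreCase: true, out dumpType)
            && Enum.IsDefined(typeof(DumpType), dumpType);
    }
}
```

The `Enum.IsDefined` check matters because the numeric values are presumably what gets sent to the runtime, so an undeclared value such as 7 should be rejected before a command is issued.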