Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TGroup library to manage the priority of xapi execution threads. #6020

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

GabrielBuica
Copy link
Contributor

Currently xapi threads are running with the same priority causing slowness when a lot of threads are waiting for locks.

I've created this draft to discuss the classification of groups of different xapi execution threads. The initial plan is to classify the server threads, smapi requests, and external requests while having the default running as normal.

The Group module is the general abstraction for xapi threads, and the Cgroup module is the specialized in the management of groups as control groups. In the end it will probably have separate modules for managing the priority and nice levels.


let ( // ) = Filename.concat

let requests = "/sys/fs/cgroup/cpu/control.slice/xapi.service/request"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this value should be hardcoded, the interface doesn't make it seem like it should just work for the xapi service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed init to accept a directory and plan to add this in Xapi_globs

@@ -119,4 +134,37 @@ module Cgroup = struct
Group.all
|> List.map dir_of
|> List.iter (fun dir -> Xapi_stdext_unix.Unixext.mkdir_rec dir 0o755)

let write_line_to_file filename s =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of adding it in the unixext. Or a version of it. It's basically pidfile_write but the pid (in this case I want to pass the tid as a parameter) is passed as a parameter. I will also need something similar to write the cpu shares for cgroups.

@GabrielBuica GabrielBuica force-pushed the private/dbuica/CP-50537-tgroups branch 2 times, most recently from 95b8f52 to 5913a55 Compare September 27, 2024 13:33
Creates a new library `Tgroup`, that abstracts and manages groups
of execution threads in xapi.

When xapi is under load, all the threads need to share a single cpu in
dom0 because of  ocaml runtime single-cpu restrictions. This library
is meant to orchestrate the threads in different priority groups.

Signed-off-by: Gabriel Buica <danutgabriel.buica@cloud.com>
`set_cgroup` adds the functionality of adding the current thread in a
cgroup based on its creator.

Signed-off-by: Gabriel Buica <danutgabriel.buica@cloud.com>
Signed-off-by: Gabriel Buica <danutgabriel.buica@cloud.com>
@GabrielBuica GabrielBuica force-pushed the private/dbuica/CP-50537-tgroups branch from 5913a55 to a2fdfdd Compare October 2, 2024 10:41
in
Xapi_stdext_pervasives.Pervasiveext.finally
(fun () ->
let buf = "0\n" in
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably deserves a comment in the code, but writing 0 to the task file will automatically transform in writing the current caller tid to the file. (writing 0 to the processes file will automatically write the caller's pid to it)

Signed-off-by: Gabriel Buica <danutgabriel.buica@cloud.com>
Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is GADT magic really required?

let buf = "0\n" in
let len = String.length buf in
if Unix.write fd (Bytes.unsafe_of_string buf) 0 len <> len then
debug "writing current tid to %s failed" filename
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a warning or error

(** Type that represents different originators of xapi threads. *)
type t = Internal_Host_SM | EXTERNAL | Internal_Server

val of_string : string -> t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whats strings are expected here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention is to have a new http header, originator, based on which I set the priority of requests. Initially SM would send this header and (the CLI would be added next). So this function would be called on the values of that specific header.

Not having a mapping to Internal_server was intentional, as I didn't want the possibility of an http request to mimic a Server priority. Classifying the server threads would use Originator.Internal_Server directly.

@GabrielBuica
Copy link
Contributor Author

As for why GADT? The classification is not yet set in stone and most likely will have some changes. I wanted to have exhaustive pattern matching without using the wildcard so when a new group is added, or a change is being made, all the necessary changes will be highlighted by dune as non-exhaustive.

Right now, the Creator type has only one field, originator, but the classification will most likely end by also taking into account types of users and endpoints as well. All these field would have a finite set of values that we are interested in.

Initially I was using strings for all of them but it required quite a lot of attention to make the changes everywhere correctly. Afterwards, I tried to define variants, which helped, but this time I had a lot of variants defined everywhere that would depend on each other. Ended up with a similar problem, having a lot of variant types and every time I would tweak something I had to add a new variant tag in a lot of places. Finally, I ended up with the current Group module that uses gadts, I tried to limit their use (and hide it from outside of the library implementation) as much as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants