Module gproc_pool

Load balancing functions based on Gproc.

Authors: Ulf Wiger (ulf@wiger.net).

Description

This module implements support for load-balancing server pools. It was originally intended mainly as an example of how to use various Gproc resources (e.g. counters and shared properties), but is fully integrated into Gproc, and fully functional.

Concepts

Each pool has a list of 'named' workers (defined using add_worker/2) and a load-balancing strategy. Processes can then 'connect' to the pool (with connect_worker/2), using one of the defined names.

Users then 'pick' one of the currently connected processes in the pool. Which process is picked depends on the load-balancing strategy.

The whole representation of the pool and its connected workers is in gproc. The server gproc_pool is used to serialize pool management updates, but worker selection is performed entirely in the calling process, and can be performed by several processes concurrently.
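The life cycle described above can be sketched as follows. This is a minimal illustration, not part of gproc: the module name pool_lifecycle, the pool name lifecycle_pool, the worker names and the helper functions start_worker/2 and stop_worker/1 are all hypothetical. It assumes the gproc application is available and starts the gproc_pool server.

```erlang
-module(pool_lifecycle).
-export([run/0]).

%% Hypothetical demo module; pool and worker names are made up.
run() ->
    {ok, _} = application:ensure_all_started(gproc),
    Pool = lifecycle_pool,
    Names = [w1, w2, w3],
    gproc_pool:new(Pool),                          % round_robin by default
    lists:foreach(fun(N) -> gproc_pool:add_worker(Pool, N) end, Names),
    Pids = [start_worker(Pool, N) || N <- Names],
    %% Selection runs in the calling process, not in the gproc_pool server.
    Picked = [gproc_pool:pick_worker(Pool) || _ <- lists:seq(1, 6)],
    lists:foreach(fun stop_worker/1, Pids),
    lists:foreach(fun(N) -> gproc_pool:remove_worker(Pool, N) end, Names),
    gproc_pool:delete(Pool),
    {Pids, Picked}.

%% connect_worker/2 acts on the calling process, so each worker
%% connects itself and then reports back to its parent.
start_worker(Pool, Name) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    gproc_pool:connect_worker(Pool, Name),
                    Parent ! {connected, self()},
                    receive stop -> ok end,
                    gproc_pool:disconnect_worker(Pool, Name),
                    Parent ! {stopped, self()}
                end),
    receive {connected, Pid} -> Pid after 5000 -> error(connect_timeout) end.

stop_worker(Pid) ->
    Pid ! stop,
    receive {stopped, Pid} -> ok after 5000 -> error(stop_timeout) end.
```

Calling pool_lifecycle:run() returns the worker pids together with the pids that were picked, which makes it easy to inspect how the round_robin strategy cycles through the connected workers.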

Load-balancing strategies

  • round_robin is the default. A wrapping gproc counter keeps track of the latest worker picked, and gproc:next() is used to find the next worker.
  • random picks a random worker from the pool.
  • hash requires a value (pick/2), and picks a worker based on the hash of that value.
  • direct takes an integer as an argument, and picks the next worker (modulo the size of the pool). This is mainly for implementations that implement a load-balancing strategy on top of gproc_pool.
  • claim picks the first available worker and 'claims' it while executing a user-provided fun. This means that the number of concurrently executing jobs will not exceed the size of the pool.

Function Index

active_workers/1: Return a list of currently connected workers in the pool.
add_worker/2: Assign a worker name to the pool, returning the worker's position.
add_worker/3: Assign a worker name to a given slot in the pool, returning the slot.
claim/2: Equivalent to claim(Pool, F, nowait).
claim/3: Picks the first available worker in the pool and applies Fun.
connect_worker/2: Connect the current process to Name in Pool.
defined_workers/1: Return a list of added workers in the pool.
delete/1: Delete an existing pool.
disconnect_worker/2: Disconnect the current process from Name in Pool.
force_delete/1: Forcibly remove a pool, terminating all active workers.
log/1: Update a counter associated with a worker name.
new/1: Equivalent to new(Pool, round_robin, []).
new/3: Create a new pool.
pick/1: Pick a worker from the pool given the pool's load-balancing algorithm.
pick/2: Pick a worker from the pool based on Value.
pick_worker/1: Pick a worker pid from the pool given the pool's load-balancing algorithm.
pick_worker/2: Pick a worker pid from the pool given the pool's load-balancing algorithm.
ptest/4
randomize/1: Randomizes the "next" pointer for the pool.
remove_worker/2: Remove a previously added worker.
setup_test_pool/4
test_run0/2
whereis_worker/2: Look up the pid of a connected worker.
worker_id/2: Return the unique gproc name corresponding to a name in the pool.
worker_pool/1: Return a list of slots and/or named workers in the pool.

Function Details

active_workers/1


active_workers(Pool::any()) -> [{Name, Pid}]

Return a list of currently connected workers in the pool.

add_worker/2


add_worker(Pool::any(), Name::any()) -> integer()

Assign a worker name to the pool, returning the worker's position.

Before a worker can connect to the pool, its name must be added. If no explicit position is given (see add_worker/3), the most suitable position, depending on load-balancing algorithm, is selected: for round_robin and direct pools, names are packed tightly from the beginning; for hash and random pools, slots are filled as sparsely as possible, in order to maintain an even likelihood of hitting each worker.

An exception is raised if the pool is full (and auto_size is false), or if Name already exists in the pool.

Before a worker can be used, a process must connect to it (see connect_worker/2).

add_worker/3


add_worker(Pool::any(), Name::any(), Slot::integer()) -> integer()

Assign a worker name to a given slot in the pool, returning the slot.

This function allows the pool maintainer to exactly position each worker inside the pool. An exception is raised if the position is already taken, or if Name already exists in the pool. If Slot is larger than the current size of the pool, an exception is raised iff auto_size is false; otherwise the pool is expanded to accommodate the new position.
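As a rough sketch of explicit placement, the following hypothetical module (pool_slots, with made-up pool and worker names) adds one worker at slot 3 of a fixed-size pool and then tries to place a second worker in the same slot. It assumes the gproc application can be started, and it catches any exception class rather than assuming a particular error reason.

```erlang
-module(pool_slots).
-export([run/0]).

%% Hypothetical demo module; pool and worker names are made up.
run() ->
    {ok, _} = application:ensure_all_started(gproc),
    Pool = slots_pool,
    gproc_pool:new(Pool, direct, [{size, 4}]),
    %% Place worker 'a' at an explicit position.
    Slot = gproc_pool:add_worker(Pool, a, 3),
    %% A second worker in the same slot should be rejected with an exception.
    Taken = try gproc_pool:add_worker(Pool, b, 3) of
                _ -> added
            catch
                _:_ -> rejected
            end,
    %% worker_pool/1 lists unfilled slots and named workers.
    Layout = gproc_pool:worker_pool(Pool),
    %% remove_worker/2 returns true even if the name was never added.
    gproc_pool:remove_worker(Pool, a),
    gproc_pool:remove_worker(Pool, b),
    gproc_pool:delete(Pool),
    {Slot, Taken, Layout}.
```

The returned Layout can be used to check where the named worker ended up relative to the unfilled slots.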

claim/2

claim(Pool, F) -> any()

Equivalent to claim(Pool, F, nowait).

claim/3


claim(Pool, F::Fun, Wait) -> {true, Res} | false
  • Pool = any()
  • Fun = function()
  • Wait = nowait | {busy_wait, integer()}

Picks the first available worker in the pool and applies Fun.

A claim pool allows the caller to "claim" a worker during a short span (essentially, a lock is set and released as soon as Fun returns). Once a worker is selected, Fun(Name, Pid) is called, where Name is a unique gproc name of the worker, and Pid is its process identifier. The gproc name of the worker serves as a mutex, where its value is 0 (zero) if the worker is free, and 1 (one) if it is busy. The mutex operation is implemented using gproc:update_counter/2.

Wait == nowait means that the call will return false immediately if there is no available worker.

Wait == {busy_wait, Timeout} will keep repeating the claim attempt for Timeout milliseconds. If still no worker is available, it will return false.
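The two Wait modes can be illustrated with a small sketch. The module claim_demo, the pool name, the worker name and the helper functions are hypothetical, and the gproc application is assumed to be available. The first claim is made before any worker has connected, so according to the description above it should return false; the second is made after a worker has connected.

```erlang
-module(claim_demo).
-export([run/0]).

%% Hypothetical demo module; pool and worker names are made up.
run() ->
    {ok, _} = application:ensure_all_started(gproc),
    Pool = claim_pool,
    gproc_pool:new(Pool, claim, [{size, 1}]),
    gproc_pool:add_worker(Pool, w1),
    %% The fun receives the worker's gproc name and its pid.
    Job = fun(_GprocName, Pid) -> {handled_by, Pid} end,
    %% No worker is connected yet, so nothing can be claimed.
    NoWorker = gproc_pool:claim(Pool, Job, nowait),
    Worker = start_worker(Pool, w1),
    %% Retry for up to one second before giving up.
    Claimed = gproc_pool:claim(Pool, Job, {busy_wait, 1000}),
    stop_worker(Worker),
    gproc_pool:remove_worker(Pool, w1),
    gproc_pool:delete(Pool),
    {NoWorker, Claimed, Worker}.

%% The worker connects itself, since connect_worker/2 acts on the caller.
start_worker(Pool, Name) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    gproc_pool:connect_worker(Pool, Name),
                    Parent ! {connected, self()},
                    receive stop -> ok end,
                    gproc_pool:disconnect_worker(Pool, Name),
                    Parent ! {stopped, self()}
                end),
    receive {connected, Pid} -> Pid after 5000 -> error(connect_timeout) end.

stop_worker(Pid) ->
    Pid ! stop,
    receive {stopped, Pid} -> ok after 5000 -> error(stop_timeout) end.
```

Because the worker is held only while Job runs, the fun should be kept short; long-running work inside it keeps other callers from claiming that worker.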

connect_worker/2


connect_worker(Pool::any(), Name::any()) -> true

Connect the current process to Name in Pool.

Typically, a server will call this function as it starts, similarly to when it registers itself. In fact, calling connect_worker/2 leads to the process being registered as {n,l,[gproc_pool,N,Name]}, where N is the position of Name in the pool. This means (a) that gproc monitors the worker, and removes the connection automatically if it dies, and (b) that the registered names can be listed in order of their positions in the pool.

This function raises an exception if Name does not exist in Pool (or there is no such pool), or if another worker is already connected to Name.
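A server that connects as it starts might look like the sketch below. The module pool_worker, its ping request and its state layout are hypothetical; only the gproc_pool calls come from this module's API. It assumes that Name has already been added to Pool with add_worker/2 or add_worker/3 before start_link/2 is called.

```erlang
-module(pool_worker).
-behaviour(gen_server).

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

%% Hypothetical worker; Name must already exist in Pool.
start_link(Pool, Name) ->
    gen_server:start_link(?MODULE, {Pool, Name}, []).

init({Pool, Name}) ->
    %% Runs in the server process, so the server itself becomes the
    %% connected worker. An exception here makes start_link/2 fail.
    gproc_pool:connect_worker(Pool, Name),
    {ok, #{pool => Pool, name => Name}}.

handle_call(ping, _From, State) ->
    {reply, pong, State};
handle_call(_Other, _From, State) ->
    {reply, {error, unknown_call}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Msg, State) ->
    {noreply, State}.

terminate(_Reason, #{pool := Pool, name := Name}) ->
    %% Optional: gproc drops the connection when the process dies anyway.
    try gproc_pool:disconnect_worker(Pool, Name) catch _:_ -> ok end,
    ok.
```

A client could then reach the server with gen_server:call(gproc_pool:pick_worker(Pool), ping), after checking that pick_worker/1 did not return false.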

defined_workers/1


defined_workers(Pool::any()) -> [{Name, Pos, Count}]

Return a list of added workers in the pool.

The added workers are slots in the pool that have been given names, and thus can be connected to. This function doesn't detect whether or not there are any connected (active) workers.

The list contains {Name, Pos, Count}, where Name is the name of the added worker, Pos is its position in the pool, and Count represents the number of times the worker has been picked (assuming callers keep count by explicitly calling log/1).

delete/1


delete(Pool::any()) -> true

Delete an existing pool.

This function deletes a pool only if there are no connected workers. Ensure that workers have been disconnected before deleting the pool.

disconnect_worker/2


disconnect_worker(Pool, Name) -> true

Disconnect the current process from Name in Pool.

This function is similar to a gproc:unreg() call. It removes the connection between Pool, Name and pid, and makes it possible for another process to connect to Name.

An exception is raised if there is no prior connection between Pool, Name and the current process.

force_delete/1


force_delete(Pool::any()) -> true

Forcibly remove a pool, terminating all active workers.

This function is primarily intended for cleanup of any pools that might have become inconsistent (for whatever reason). It will clear out all resources belonging to the pool and send exit(Pid, kill) signals to all connected workers (except the calling process).

log/1


log(X1::GprocKey) -> integer()

Update a counter associated with a worker name.

Each added worker has a gproc counter that can be used, for example, to keep track of the number of times the worker has been picked. Since the counter is associated with the named 'slot', and not with the connected worker, its value persists even if the currently connected worker dies.
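One way to keep such a count is to call log/1 each time a worker is picked. The sketch below is hypothetical (module log_demo, made-up pool and worker names, helper functions of its own) and assumes that the name returned by pick/1 is the key that log/1 expects, and that the gproc application is available.

```erlang
-module(log_demo).
-export([run/0]).

%% Hypothetical demo module; pool and worker names are made up.
run() ->
    {ok, _} = application:ensure_all_started(gproc),
    Pool = log_pool,
    gproc_pool:new(Pool),
    gproc_pool:add_worker(Pool, w1),
    Worker = start_worker(Pool, w1),
    %% Pick three times and record each pick explicitly.
    lists:foreach(fun(_) -> pick_and_log(Pool) end, lists:seq(1, 3)),
    %% Each entry is {Name, Pos, Count}.
    Stats = gproc_pool:defined_workers(Pool),
    stop_worker(Worker),
    gproc_pool:remove_worker(Pool, w1),
    gproc_pool:delete(Pool),
    Stats.

%% Assumption: the name returned by pick/1 can be passed to log/1 as is.
pick_and_log(Pool) ->
    case gproc_pool:pick(Pool) of
        false -> false;
        Key -> gproc_pool:log(Key)
    end.

start_worker(Pool, Name) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    gproc_pool:connect_worker(Pool, Name),
                    Parent ! {connected, self()},
                    receive stop -> ok end,
                    gproc_pool:disconnect_worker(Pool, Name),
                    Parent ! {stopped, self()}
                end),
    receive {connected, Pid} -> Pid after 5000 -> error(connect_timeout) end.

stop_worker(Pid) ->
    Pid ! stop,
    receive {stopped, Pid} -> ok after 5000 -> error(stop_timeout) end.
```

The Count element of the returned tuples reflects only the picks that were logged; picks made without a log/1 call are not counted.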

new/1


new(Pool::any()) -> ok

Equivalent to new(Pool, round_robin, []).

new/3


new(Pool::any(), Type, Opts) -> true
  • Type = round_robin | random | hash | direct | claim
  • Opts = [{size, integer()} | {auto_size, boolean()}]

Create a new pool.

The pool starts out empty. If a size is not given, the pool size is set to 0 initially. auto_size is true by default if size is not specified, but false by default otherwise. If auto_size == true, the pool will be enlarged to accommodate new workers, when necessary. Otherwise, trying to add a worker when the pool is full will raise an exception, as will trying to add a worker on a specific position beyond the current size of the pool.

If the given pool already exists, this function will raise an exception.
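The interplay between size and auto_size can be sketched as below. The module pool_sizes and the pool and worker names are hypothetical, and the gproc application is assumed to be available. The sketch catches any exception class when the fixed pool is full, rather than assuming a specific error reason.

```erlang
-module(pool_sizes).
-export([run/0]).

%% Hypothetical demo module; pool and worker names are made up.
run() ->
    {ok, _} = application:ensure_all_started(gproc),
    Names = [a, b, c],
    %% Fixed pool: giving a size makes auto_size default to false.
    Fixed = fixed_pool,
    gproc_pool:new(Fixed, round_robin, [{size, 2}]),
    gproc_pool:add_worker(Fixed, a),
    gproc_pool:add_worker(Fixed, b),
    Third = try gproc_pool:add_worker(Fixed, c) of
                _ -> added
            catch
                _:_ -> pool_full
            end,
    %% Growing pool: same initial size, but allowed to expand.
    Growing = growing_pool,
    gproc_pool:new(Growing, round_robin, [{size, 2}, {auto_size, true}]),
    lists:foreach(fun(N) -> gproc_pool:add_worker(Growing, N) end, Names),
    Slots = gproc_pool:worker_pool(Growing),
    %% Clean up; remove_worker/2 tolerates names that were never added.
    lists:foreach(fun(P) ->
                      lists:foreach(fun(N) -> gproc_pool:remove_worker(P, N) end,
                                    Names),
                      gproc_pool:delete(P)
                  end, [Fixed, Growing]),
    {Third, Slots}.
```

The first element of the result shows whether the third add_worker/2 call on the fixed pool was rejected, and the second shows the layout of the pool that was allowed to grow.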

pick/1


pick(Pool::any()) -> GprocName | false

Pick a worker from the pool given the pool's load-balancing algorithm.

The pool types that allow picking without an extra argument are round_robin and random. This function returns false if there is no available worker, or if Pool is not a valid pool.

pick/2


pick(Pool::any(), Value::any()) -> GprocName | false

Pick a worker from the pool based on Value.

The pool types that allow picking based on an extra argument are hash and direct. This function returns false if there is no available worker, or if Pool is not a valid pool.

If the pool is of type direct, Value must be an integer corresponding to a position in the pool (modulo the size of the pool). If the type is hash, Value may be any term, and its hash value will serve as a guide for selecting a worker.
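A small sketch can show what value-based picking means in practice: as long as the set of connected workers does not change, the same hash value should lead to the same worker, and two integers that differ by the pool size should lead to the same worker in a direct pool. The module pool_pick_value, its same_worker/3 function, the pool name and the worker names are hypothetical, and the gproc application is assumed to be available.

```erlang
-module(pool_pick_value).
-export([same_worker/3]).

%% Hypothetical helper: builds a three-slot pool of the given Type
%% (hash or direct), connects three workers, and reports whether
%% V1 and V2 select the same connected worker.
same_worker(Type, V1, V2) ->
    {ok, _} = application:ensure_all_started(gproc),
    Pool = pick_value_pool,
    Names = [w1, w2, w3],
    gproc_pool:new(Pool, Type, [{size, length(Names)}]),
    lists:foreach(fun(N) -> gproc_pool:add_worker(Pool, N) end, Names),
    Pids = [start_worker(Pool, N) || N <- Names],
    P1 = gproc_pool:pick_worker(Pool, V1),
    P2 = gproc_pool:pick_worker(Pool, V2),
    lists:foreach(fun stop_worker/1, Pids),
    lists:foreach(fun(N) -> gproc_pool:remove_worker(Pool, N) end, Names),
    gproc_pool:delete(Pool),
    %% Guard against both picks returning false.
    is_pid(P1) andalso P1 =:= P2.

start_worker(Pool, Name) ->
    Parent = self(),
    Pid = spawn(fun() ->
                    gproc_pool:connect_worker(Pool, Name),
                    Parent ! {connected, self()},
                    receive stop -> ok end,
                    gproc_pool:disconnect_worker(Pool, Name),
                    Parent ! {stopped, self()}
                end),
    receive {connected, Pid} -> Pid after 5000 -> error(connect_timeout) end.

stop_worker(Pid) ->
    Pid ! stop,
    receive {stopped, Pid} -> ok after 5000 -> error(stop_timeout) end.
```

For instance, pool_pick_value:same_worker(hash, {user, 42}, {user, 42}) compares two picks for one hash key, and pool_pick_value:same_worker(direct, 1, 4) compares two positions that are equal modulo a pool size of 3.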

pick_worker/1


pick_worker(Pool::any()) -> pid() | false

Pick a worker pid from the pool given the pool's load-balancing algorithm.

Like pick/1, but returns the worker pid instead of the name.

pick_worker/2


pick_worker(Pool::any(), Value::any()) -> pid() | false

Pick a worker pid from the pool given the pool's load-balancing algorithm.

Like pick/2, but returns the worker pid instead of the name.

ptest/4

ptest(N, I, Type, Opts) -> any()

randomize/1


randomize(Pool::any()) -> integer()

Randomizes the "next" pointer for the pool.

This function only has an effect for round_robin pools, which have a reference to the next worker to be picked. Without randomizing, the load balancing will always start with the first worker in the pool.

remove_worker/2


remove_worker(Pool::any(), Name::any()) -> true

Remove a previously added worker.

This function assumes that any connected worker has been disconnected first. It fails if there is no such pool, but returns true if Name did not exist in the pool in the first place.

setup_test_pool/4

setup_test_pool(P, Type0, Opts, Workers) -> any()

test_run0/2

test_run0(N, X) -> any()

whereis_worker/2


whereis_worker(Pool::any(), Name::any()) -> pid() | undefined

Look up the pid of a connected worker.

This function works similarly to gproc:where/1: it will return the pid of the worker connected as Pool / Name, if there is such a worker; otherwise it will return undefined. It will raise an exception if Name has not been added to the pool.

worker_id/2


worker_id(Pool, Name) -> GprocName

Return the unique gproc name corresponding to a name in the pool.

This function assumes that Name has been added to Pool. It returns the unique name that a connected worker will be registered as. This doesn't mean that there is, in fact, such a connected worker.

worker_pool/1


worker_pool(Pool::any()) -> [integer() | {Name, Pos}]

Return a list of slots and/or named workers in the pool.

This function is mainly for testing, but can also be useful when implementing your own worker placement algorithm on top of gproc_pool.

A plain integer represents an unfilled slot, and {Name, Pos} represents an added worker. The pool is always filled to the current size.