Skip to content

SO 5.6 InDepth Runtime Monitoring

Yauheni Akhotnikau edited this page Jan 10, 2020 · 3 revisions

Intro

SObjectizer has a very simple mechanism for collecting and distributing of run-time monitoring information. This mechanism allows collecting some basic information about SObjectizer's internals like count of registered cooperations, count of delayed/periodic messages, length of event queues and so on.

Working Principles

There are several new concepts related to run-time monitoring in SObjectizer-5.6:

  • data-source. This is an object which holds or can obtain some information about some part of SObjectizer's internals. Data-source can distribute that information in the form of an ordinary message which is sent to the special mbox;
  • stats_controller. This is an object which knows about all data-sources and tells them to distribute the information they hold.

The stats_controller and data-sources are created by SObjectizer automatically, there is no need to do that manually.

The stats_controller can be turned on or off. By default, it is turned off. In that mode, no run-time information collected nor distributed. It means there is no performance penalty when stats_controller doesn't work.

If the stats_controller is turned on it works on a separate thread. This thread is created when the stats_controller is turned on and terminated when the stats_controller is turned off.

The stats_controller sleeps for some time. Then wakes up and tells every data-source object to distribute the information they currently have. The distribution is performed on the thread of stats_controller. When distribution finishes the stats_controller is going to sleep again.

The time the stats_controller sleeps can be adjusted. By default stats_controller distributes run-time statistics every 2 seconds. But it can be changed by so_5::stats::controller_t::set_distribution_period() method.

Every instance of SObjectizer Environment has its own stats_controller instance. This instance can be accessed via so_5::environment_t::stats_controller() method. For example:

void run_time_stats_listener::so_evt_start() override {
  // Run-time information must be updated every 0.5 second.
  so_environment().stats_controller().set_distribution_period(
    std::chrono::milliseconds( 500 ) );
  // Turning the run-time monitoring on.
  so_environment().stats_controller().turn_on();
  ...
}

Data-sources distribute their values by ordinary messages. Types of that messages are defined in so_5::rt::stats::messages namespace. Those messages are sent to mbox which can be obtained by so_5::stats::controller_t::mbox() method.

One of the most important messages in so_5::stats::messages namespace is a template so_5::stats::messages::quantity<T>. This message is used to spread information about quantities of something (like the quantity of registered cooperations/agents, the quantity of single-shot or periodic timer messages, the number of working threads for a dispatcher, size of an event queue and so on).

Data-Source Naming Scheme

Data-sources have unique names. Every name is built from two parts: prefix and suffix. Concatenation of prefix and suffix makes a unique name for data-source.

A prefix is a size-limited null-terminated string. Prefix defines a group of related data sources. For example, all data-sources related to a particular dispatcher instance will have the same prefix.

A suffix is a constant pointer to an immutable null-terminated string. Suffix identified data-source inside a group of related data sources.

For example here are some full names of data-sources:

mbox_repository/named_mbox.count: 0
coop_repository/coop.reg.count: 2
coop_repository/coop.dereg.count: 0
coop_repository/agent.count: 41
timer_thread/single_shot.count: 1
timer_thread/periodic.count: 0
disp/ot/DEFAULT/agent.count: 6
disp/ot/DEFAULT/wt-0/demands.count: 23
disp/ot/0x3be520/agent.count: 5
disp/ot/0x3be520/wt-0/demands.count: 19
disp/ao/0x24911b0/agent.count: 5

Substrings like mbox_repository, disp/ot/DEFAULT and disp/ao/0x24911b0 are prefixes. Substring like /named_mbox.count, /agent.count and /demands.count are suffixes.

A name of data-source is included in every message with actual data from that data-source. For example the so_5::stats::messages::quantity is defined as:

template< typename T >
struct quantity : public message_t
   {
      prefix_t m_prefix;
      suffix_t m_suffix;

      T m_value;

      quantity(
         const prefix_t & prefix,
         const suffix_t & suffix,
         T value );
   };

Where quantity::m_prefix and quantity::m_suffix are parts of data source name.

There are some predefined prefixes. They could be obtained by functions defined in so_5::stats::prefixes namespace. But all the prefixes for dispatcher-related data-sources are generated at run-time.

All suffixes are predefined. They could be obtained by functions defined in so_5::stats::suffixes namespace.

How to Use

To receive run-time monitoring information it is necessary to do the two things:

  1. Create a subscription to the appropriate message from the appropriate mbox.
  2. Turn the stats_controller on.

Those can be done like this:

class my_run_time_stats_listener : public so_5::agent_t
{
public :
    my_run_time_stats_listener( context_t ctx )
        :    so_5::agent_t( ctx )
    {}

    void so_define_agent() override
    {
        so_default_state().event(
            so_environment().stats_controller().mbox(),
            &my_run_time_stats_listener::evt_quantity );
    }

    void so_evt_start() override
    {
        so_environment().stats_controller().turn_on();
    }

private :
    void evt_quantity( const so_5::stats::messages::quantity< std::size_t > & evt )
    {
        // Just show to standard output.
        std::cout << evt.m_prefix << evt.m_suffix << ": " << evt.m_value << std::endl;
    }
};

Please note that SObjectizer doesn't provide any ready-to-use tools for working with run-time statistics. If you want to store it anywhere you must write the appropriated code by yourself.

Please note also that there is no possibility to narrow a stream of monitoring information. If you create subscription to quantity message you will receive messages of that type from all data sources. And if you want to narrow the stream you must filter those messages in your event:

void timer_thread_stats_listener::evt_timer_quantities(
    const so_5::stats::messages::quantity< std::size_t > & evt )
{
    // Ignore messages unrelated to timer thread.
    if( so_5::stats::prefixes::timer_thread() != evt.m_prefix )
        return;
    ... // Processing of related to timer thread messages.
}

distribution_started and distribution_finished Messages

The mechanism described above works on clock cycle: a user is able to set the rate of receiving the monitoring information. For example, once in 250ms. This means that every 250ms SObjectizer will send message pack with current values to special mbox.

There are distribution_started and distribution_finished messages which allows detecting the start and the finish of monitoring data distribution.

The message distribution_started is sent in the beginning of each cycle of monitoring information distribution. Actually it’s the first message being sent by stats_controller in the cycle begin. Then follow messages from data sources inside SObjectizer Environment. The cycle is closed by distribution_finished message. This is the last message and there will not be any more messages related to the closed cycle.

These two messages simplify the scenarios of run-time monitoring data processing which require information about the moments of the cycle begin/end. For example to update information on the charts or to fixate the transaction with new data in DB.

Below is the example of an agent that initiates the distribution of monitoring information 3 times per second and then processes only messages with information about request queues length on worker threads running in SObjectizer Environment dispatchers (to filter the required messages the delivery_filters mechanism is used).

// Agent for receiving run-time monitoring information.
class a_stats_listener_t : public so_5::agent_t
{
public :
   a_stats_listener_t(
      // Environment to work in.
      context_t ctx,
      // Address of logger.
      so_5::mbox_t logger )
      :  so_5::agent_t( ctx )
      ,  m_logger( std::move( logger ) )
   {}

   void so_define_agent() override
   {
      using namespace so_5::stats;

      auto & controller = so_environment().stats_controller();

      // Set up a filter for messages with run-time monitoring information.
      so_set_delivery_filter(
         // Message box to which delivery filter must be set.
         controller.mbox(),
         // Delivery predicate.
         []( const messages::quantity< std::size_t > & msg ) {
            // Process only messages related to dispatcher's queue sizes.
            return suffixes::work_thread_queue_size() == msg.m_suffix;
         } );

      // We must receive messages from run-time monitor.
      so_default_state()
         .event(
            // This is mbox to that run-time statistic will be sent.
            controller.mbox(),
            &a_stats_listener_t::evt_quantity )
         .event( controller.mbox(),
            [this]( const messages::distribution_started & ) {
               so_5::send< log_message >( m_logger, "--- DISTRIBUTION STARTED ---" );
            } )
         .event( controller.mbox(),
            [this]( const messages::distribution_finished & ) {
               so_5::send< log_message >( m_logger, "--- DISTRIBUTION FINISHED ---" );
            } );
   }

   void so_evt_start() override
   {
      // Change the speed of run-time monitor updates.
      so_environment().stats_controller().set_distribution_period(
            std::chrono::milliseconds( 330 ) );
      // Turn the run-timer monitoring on.
      so_environment().stats_controller().turn_on();
   }

private :
   const so_5::mbox_t m_logger;

   void evt_quantity(
      const so_5::stats::messages::quantity< std::size_t > & evt )
   {
      std::ostringstream ss;

      ss << "stats: '" << evt.m_prefix << evt.m_suffix << "': " << evt.m_value;

      so_5::send< log_message >( m_logger, ss.str() );
   }
}; 

Run-Time Monitoring in SObjectizer-5.6

There are several predefined data-sources in SObjectizer Environment. And there are some data-sources for every dispatcher and every dispatcher's working thread. This section briefly describes those data-sources.

Count of Agents and Cooperations

There are three data-sources related to agents and cooperations repository. They all have prefix defined by so_5::stats::prefixes::coop_repository() function but the suffixes for data-source are different:

  • data-source with suffix so_5::stats::suffixes::coop_count() informs about quantity of all cooperations (in registered and deregistering stages);
  • data-source with suffix so_5::stats::suffixes::agent_count() informs about quantity of all agents (this is a total quantity of agents in registered cooperations and in the cooperations which are in deregisteration stage).

Count of Named Mboxes

There is one data-source that informs about the current quantity of named mboxes. This data-source has prefix so_5::stats::prefixes::mbox_repository() and suffix so_5::stats::suffixes::named_mbox_count().

Count of Delayed and Periodic Message

There are two data-sources related to timer thread. They have prefix so_5::stats::prefixes::timer_thread() and one of the following suffixes:

  • so_5::stats::suffixes::timer_single_shot_count(). This data-source informs about quantity of delayed messages;
  • so_5::stats::suffixes::timer_periodic_count(). This data-source informs about quantity of periodic messages.

Information from Dispatchers

Every dispatcher creates several data-sources. Those data sources can be grouped into several groups. Every group will have its own prefix.

There is common rules for creation of prefixes for dispatchers' data-sources:

  • every prefix starts with disp;
  • then will be abbreviated type of dispatcher:
    • /ot -- for one_thread dispatcher;
    • /ao -- for active_obj dispatcher;
    • /ag -- for active_group dispatcher;
    • /tp -- for thread_pool dispatcher;
    • /atp -- for adv_thread_pool dispatcher;
    • /pdt-opp -- for prio_dedicated_threads::one_per_prio dispatcher;
    • /pot-so -- for prio_one_thread::strictly_ordered dispatcher;
    • /pot-qrr -- for prio_one_thread::quoted_round_robin dispatcher;
  • then will be identification for dispatcher instance. It could be part of dispatcher name (for the case when name if specified for the dispatcher) or hex representation of address of dispatcher object.

For example:

disp/ot/DEFAULT/agent.count
disp/ot/DEFAULT/wt-0/demands.count
disp/ot/0x3be520/agent.count
disp/ot/0x3be520/wt-0/demands.count
disp/ao/0x24911b0/agent.count
disp/ao/0x24911b0/wt-0x2491260/demands.count
disp/ag/0x2495930/group.count
disp/ag/0x2495930/wt-group#1/agent.count
disp/ot/stats_listener/wt-0/demands.count
disp/tp/workers/aq/0x53acf0/demands.count

Note. A name for a dispatcher can be specified by using the appropriate make_dispatcher function.

one_thread Dispatcher Run-Time Statistics

one_thread dispatcher creates two data-sources:

  • one with prefix like disp/ot/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that dispatcher;
  • another with prefix like disp/ot/<DISP-ID>/wt-0 and suffix so_5::stats::suffixes::work_thread_queue_size(). It informs event queue length.

Example:

disp/ot/DEFAULT/agent.count: 6
disp/ot/DEFAULT/wt-0/demands.count: 23
disp/ot/0x3be520/agent.count: 5
disp/ot/0x3be520/wt-0/demands.count: 19

active_obj Dispatcher Run-Time Statistics

active_obj dispatcher creates two types of data-sources:

  • with prefix like disp/ao/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that dispatcher. This data-source is created always, even if there is no agents bound to the dispatcher;
  • with prefix like disp/ao/<DISP-ID>/wt-<THREAD> and suffix so_5::stats::suffixes::work_thread_queue_size(). It informs event queue length for a working thread (e.g. for one agent bound to the dispatcher).

Example:

disp/ao/0x24911b0/agent.count: 5
disp/ao/0x24911b0/wt-0x2491260/demands.count: 2
disp/ao/0x24911b0/wt-0x2491d30/demands.count: 4
disp/ao/0x24911b0/wt-0x24931f0/demands.count: 2
disp/ao/0x24911b0/wt-0x24934b0/demands.count: 2
disp/ao/0x24911b0/wt-0x24955c0/demands.count: 0

active_group Dispatcher Run-Time Statistics

active_group dispatcher creates several data-sources:

  • with prefix like disp/ag/<DISP-ID> and suffix so_5::stats::suffixes::disp_active_group_count(). It informs quantity of active groups handled by that dispatcher. This data-source is created always, even if there is no agents bound to the dispatcher;
  • two types of data-sources are created for every active group. Each of them has prefix like disp/ag/<DISP-ID>/wt-<GROUP> and one of the suffixes:
    • so_5::stats::suffixes::agent_count(). It informs quantity of agents in that active group;
    • so_5::stats::suffixes::work_thread_queue_size(). It informs event queue length for a working thread (e.g. for that active group).

Example:

disp/ag/0x2495930/group.count: 5
disp/ag/0x2495930/wt-db_handler/agent.count: 3
disp/ag/0x2495930/wt-db_handler/demands.count: 4
disp/ag/0x2495930/wt-logger/agent.count: 1
disp/ag/0x2495930/wt-logger/demands.count: 0

thread_pool/adv_thread_pool Dispatchers Run-Time Statistics

thread_pool/adv_thread_pool dispatchers create several data-sources:

  • with prefix like disp/tp/<DISP-ID> and suffix so_5::stats::suffixes::disp_thread_count(). It informs about working thread count for the dispatcher;
  • with prefix like disp/tp/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs about the quantity of agents bound to that dispatcher;
  • two types of data-sources are created for every FIFO queue. If FIFO queue is created for a cooperation then the corresponding data-sources will have prefix like disp/tp/<DISP-ID>/cq/<COOP>. If FIFO queue is created for an individual agent then the corresponding data-sources will have prefix like disp/tp/<DISP-ID>/aq/<AGENT-PTR>. The suffix for those data-sources can be one of the following:
    • so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that event queue;
    • so_5::stats::suffixes::work_thread_queue_size(). It informs event queue length.

Note. For adv_thred_pool dispatcher, all prefixes will start from disp/atp.

Example:

disp/tp/0x24974b0/threads.count: 4
disp/tp/0x24974b0/agent.count: 5
disp/tp/0x24974b0/cq/db_ops/agent.count: 5
disp/tp/0x24974b0/cq/db_ops/demands.count: 9
disp/tp/0x2499ab0/threads.count: 4
disp/tp/0x2499ab0/agent.count: 5
disp/tp/0x2499ab0/aq/0x2495670/agent.count: 1
disp/tp/0x2499ab0/aq/0x2495670/demands.count: 4
disp/tp/0x2499ab0/aq/0x2495880/agent.count: 1
disp/tp/0x2499ab0/aq/0x2495880/demands.count: 1
disp/tp/0x2499ab0/aq/0x2495ca0/agent.count: 1
disp/tp/0x2499ab0/aq/0x2495ca0/demands.count: 0
disp/tp/0x2499ab0/aq/0x2495d50/agent.count: 1
disp/tp/0x2499ab0/aq/0x2495d50/demands.count: 1
disp/tp/0x2499ab0/aq/0x24962d0/agent.count: 1
disp/tp/0x2499ab0/aq/0x24962d0/demands.count: 2
disp/atp/0x24a3c10/threads.count: 4
disp/atp/0x24a3c10/agent.count: 5
disp/atp/0x24a3c10/cq/crypto_ops/agent.count: 5
disp/atp/0x24a3c10/cq/crypto_ops/demands.count: 15
disp/atp/0x24971a0/threads.count: 4
disp/atp/0x24971a0/agent.count: 2
disp/atp/0x24971a0/aq/0x24a7e10/agent.count: 1
disp/atp/0x24971a0/aq/0x24a7e10/demands.count: 4
disp/atp/0x24971a0/aq/0x24a8a70/agent.count: 1
disp/atp/0x24971a0/aq/0x24a8a70/demands.count: 4

prio_dedicated_threads::one_per_prio Dispatcher Run-Time Statistics

prop_dedicated_threads::one_per_prio dispatcher creates the following types of data-sources:

  • with prefix like disp/pdt-opp/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that dispatcher. This data-source is created always, even if there is no agents bound to the dispatcher;
  • with prefix like disp/pdt-opp/<DISP-ID>/wt-p<PRIORITY> and suffix so_5::stats::suffixes::work_thread_queue_size() for every working threads. It informs event queue length for a working thread;
  • with prefix like disp/pdt-opp/<DISP-ID>/wt-p<PRIORITY> and suffix so_5::stats::suffixes::agent_count() for every working threads. It informs a number of agents with priority <PRIORITY>.

Example:

disp/pdt-opp/0x50d7e0/wt-p0/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p0/agent.count: 5
disp/pdt-opp/0x50d7e0/wt-p1/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p1/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p2/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p2/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p3/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p3/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p4/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p4/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p5/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p5/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p6/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p6/agent.count: 0
disp/pdt-opp/0x50d7e0/wt-p7/demands.count: 0
disp/pdt-opp/0x50d7e0/wt-p7/agent.count: 0
disp/pdt-opp/0x50d7e0/agent.count: 5

prio_one_thread::strictly_ordered Dispatcher Run-Time Statistics

prop_one_thread::strictly_ordered dispatcher creates the following types of data-sources:

  • with prefix like disp/pot-so/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that dispatcher. This data-source is created always, even if there is no agents bound to the dispatcher;
  • with prefix like disp/pot-so/p<PRIORITY> and suffix so_5::stats::suffixes::work_thread_queue_size() for every priority. It informs event queue length for a priority;
  • with prefix like disp/pot-so/p<PRIORITY> and suffix so_5::stats::suffixes::agent_count() for every priority. It informs a number of agents with priority <PRIORITY>.

Example:

disp/pot-so/0x500850/p0/agent.count: 5
disp/pot-so/0x500850/p0/demands.count: 0
disp/pot-so/0x500850/p1/agent.count: 0
disp/pot-so/0x500850/p1/demands.count: 0
disp/pot-so/0x500850/p2/agent.count: 0
disp/pot-so/0x500850/p2/demands.count: 0
disp/pot-so/0x500850/p3/agent.count: 0
disp/pot-so/0x500850/p3/demands.count: 0
disp/pot-so/0x500850/p4/agent.count: 0
disp/pot-so/0x500850/p4/demands.count: 0
disp/pot-so/0x500850/p5/agent.count: 0
disp/pot-so/0x500850/p5/demands.count: 0
disp/pot-so/0x500850/p6/agent.count: 0
disp/pot-so/0x500850/p6/demands.count: 0
disp/pot-so/0x500850/p7/agent.count: 0
disp/pot-so/0x500850/p7/demands.count: 0
disp/pot-so/0x500850/agent.count: 5

prio_one_thread::quoted_round_robin Dispatcher Run-Time Statistics

prop_one_thread::strictly_ordered dispatcher creates the following types of data-sources:

  • with prefix like disp/pot-qrr/<DISP-ID> and suffix so_5::stats::suffixes::agent_count(). It informs quantity of agents bound to that dispatcher. This data-source is created always, even if there is no agents bound to the dispatcher;
  • with prefix like disp/pot-qrr/p<PRIORITY> and suffix so_5::stats::suffixes::demand_quote() for every priority. It informs a quote size for a priority;
  • with prefix like disp/pot-qrr/p<PRIORITY> and suffix so_5::stats::suffixes::work_thread_queue_size() for every priority. It informs event queue length for a priority;
  • with prefix like disp/pot-qrr/p<PRIORITY> and suffix so_5::stats::suffixes::agent_count() for every priority. It informs a number of agents with priority <PRIORITY>.

Example:

disp/pot-qrr/0x50e260/p0/demands.quote: 20
disp/pot-qrr/0x50e260/p0/agent.count: 5
disp/pot-qrr/0x50e260/p0/demands.count: 0
disp/pot-qrr/0x50e260/p1/demands.quote: 20
disp/pot-qrr/0x50e260/p1/agent.count: 0
disp/pot-qrr/0x50e260/p1/demands.count: 0
disp/pot-qrr/0x50e260/p2/demands.quote: 20
disp/pot-qrr/0x50e260/p2/agent.count: 0
disp/pot-qrr/0x50e260/p2/demands.count: 0
disp/pot-qrr/0x50e260/p3/demands.quote: 20
disp/pot-qrr/0x50e260/p3/agent.count: 0
disp/pot-qrr/0x50e260/p3/demands.count: 0
disp/pot-qrr/0x50e260/p4/demands.quote: 20
disp/pot-qrr/0x50e260/p4/agent.count: 0
disp/pot-qrr/0x50e260/p4/demands.count: 0
disp/pot-qrr/0x50e260/p5/demands.quote: 20
disp/pot-qrr/0x50e260/p5/agent.count: 0
disp/pot-qrr/0x50e260/p5/demands.count: 0
disp/pot-qrr/0x50e260/p6/demands.quote: 20
disp/pot-qrr/0x50e260/p6/agent.count: 0
disp/pot-qrr/0x50e260/p6/demands.count: 0
disp/pot-qrr/0x50e260/p7/demands.quote: 20
disp/pot-qrr/0x50e260/p7/agent.count: 0
disp/pot-qrr/0x50e260/p7/demands.count: 0
disp/pot-qrr/0x50e260/agent.count: 5

Work Thread Activity Tracking

It is also possible to receive information about the time spent inside event-handlers and time spent during waiting for the next event.

Collecting of such information must be explicitely enabled. This information is not collected by default to avoid negative impact on dispatchers' performance. Work thread activity tracking can be enabled for an individual dispatcher:

using namespace so_5::disp::active_obj;
auto disp = make_dispatcher( env, "my_disp",
      // Collecting of work thread activity is enabled
      // in the dispatcher's parameters.
      disp_params_t{}.turn_work_thread_activity_tracking_on() ); 

Also, it can be done for the whole SObjectizer Environment:

so_5::launch(
   []( so_5::environment_t & env ) {
      ... // some intitial actions.
   }
   []( so_5::environment_params_t & params ) {
      // Enable work thread activity statistics collection explicitly.
      params.turn_work_thread_activity_tracking_on();
      ...
   } ); 

In that case, collecting of information of work thread activity will be enabled for all dispatchers inside SObjectizer Environment. If it is not appropriate for some dispatcher then thread activity tracking can be disabled in parameters of that dispatcher:

so_5::launch(
   []( so_5::environment_t & env ) {
      ... // Some initial actions.

      // Create dispatcher and disable work thread activity tracking for it.
      auto my_disp = so_5::disp::one_thread::make_dispatcher(
            env, "my_disp",
            so_5::disp::one_thread::disp_params_t{}
                  .turn_work_thread_activity_tracking_off() );
      ...
   []( so_5::environment_params_t & params ) {
      // Work thread activity tracking will be enabled for the whole environment. 
      params.turn_work_thread_activity_tracking_on();
      ...
   } );

Distribution of information about work thread activity is done in a usual way -- by messages which are sent to message box of stats_collector. These messages have type work_thread_activity which is defined in so_5::stats::messages namespace.

class activity_listener final : public so_5::agent_t
   {
   public :
      activity_listener( context_t ctx )
         :  so_5::agent_t( ctx )
         {
            so_default_state().event(
                  so_environment().stats_controller().mbox(),
                  &activity_listener::evt_monitor_activity );
         }

   private :
      void evt_monitor_activity(
         const so_5::stats::messages::work_thread_activity & evt )
         {
            namespace stats = so_5::stats;

            std::cout << evt.m_prefix << evt.m_suffix
                  << " [" << evt.m_thread_id << "] ->\n"
                  << "  working: " << evt.m_stats.m_working_stats << "\n"
                  << "  waiting: " << evt.m_stats.m_waiting_stats << std::endl;
            ...
         }
   }; 

There are well-known fields m_prefix and m_suffix inside work_thread_activity. Also, there are additional fields:

  • m_thread_id with ID of working thread (activity stats information is related to thread with this ID);
  • m_working_stats with information about time spent in event-handlers;
  • m_waiting_stats with inforation about time spent on waiting for new events.

Fields m_working_stats and m_waiting_stats have type so_5::stats::activity_stats_t:

struct activity_stats_t
   {
      //! Count of events in that period of time.
      std::uint_fast64_t m_count{};

      //! Total time spent for events in that period of time.
      duration_t m_total_time{};

      //! Average time for one event.
      duration_t m_avg_time{};
   }; 

Where duration_t is std::chrono::high_resolution_clock::duration or std::chrono::steady_lock::duration.

Message work_thread_activity allows to know how many events have been handled by a worker thread (from the very beginning of its work), how much time has been spent inside event-handlers (from the very beginning of its work), the average time of event-handler's work. Similarly for waiting for new events (e.g. count of waitings, total time spent in waiting, average waiting time).

This is a small example of how that information can look like:

disp/atp/0x2b7fb70/threads.count: 4
disp/atp/0x2b7fb70/agent.count: 5
disp/atp/0x2b7fb70/wt-14/thread.activity [14] ->
  working: [count=5651;total=3.86549ms;avg=0.000655ms]
  waiting: [count=5652;total=492.831ms;avg=3.60857ms]
disp/atp/0x2b7fb70/wt-15/thread.activity [15] ->
  working: [count=62394;total=40.4094ms;avg=0.000644ms]
  waiting: [count=62395;total=363.346ms;avg=0.201897ms]
disp/atp/0x2b7fb70/wt-16/thread.activity [16] ->
  working: [count=69073;total=46.2095ms;avg=0.000637ms]
  waiting: [count=69073;total=361.668ms;avg=0.000813ms]
disp/atp/0x2b7fb70/wt-17/thread.activity [17] ->
  working: [count=80587;total=52.904ms;avg=0.000656ms]
  waiting: [count=80588;total=325.136ms;avg=0.003536ms]
disp/atp/0x2b7fb70/cq/__so5_au...e_2__/agent.count: 5
disp/atp/0x2b7fb70/cq/__so5_au...e_2__/demands.count: 4

This is a dump of monitoring information for adv_thread_pool-dispatcher with four work threads. A single coop with five agents inside works on that dispatcher. It is possible to see information on each worker thread (there is thread ID in square brackets). For this particular case, it is possible to see that work threads spend more time during waiting for new events than for event-handling.

Some Words About the Performance Impact.

It highly depends on dispatcher's type and load profile. For one_thread-dispatcher and heavy message-stream this impact is hardly noticed. But for active_obj-dispatcher with occasional messages, the performance can be decreased by 3-4 times. Because of that it impossible to provide some numbers of cumulative performance loss: there is too big variety in results.

In any case, this mechanism is considered as secondary. It is intended for debugging and profiling purposes. Because of that, it is necessary to check the performance impact of work thread activity tracking on your application. And only after that, the decision should be made whether to enable this mechanism on the production environment.

Clone this wiki locally