fix(gproc_pool): fix `rand:uniform/1` range #193

thalesmg · 2023-05-03T19:04:40Z

Since [`rand:uniform/1`](https://www.erlang.org/doc/man/rand.html#uniform-1) returns an integer in a closed interval (`1 =< X =< N`), we shouldn't call it with the pool size + 1; otherwise we get a non-uniform distribution of results in which one of the workers receives substantially more work than all the others.
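
To illustrate the off-by-one, here is a minimal sketch that models the pick as a draw over numbered slots. It is not gproc's code: the module name `pool_pick_sim` and the `wrap/2` helper are hypothetical, and mapping the out-of-range draw onto slot 1 is an assumption chosen only because it produces the kind of skew described here.

```erlang
-module(pool_pick_sim).
-export([slot_weights/2, simulate/3]).

%% Hypothetical model (an assumption, not taken from gproc): a drawn
%% index beyond the last slot wraps around to slot 1.
wrap(Index, Size) when Index > Size -> 1;
wrap(Index, _Size) -> Index.

%% Exact number of draw outcomes that land on each slot when the draw is
%% rand:uniform(Range), which returns an integer in 1..Range.
slot_weights(Size, Range) ->
    lists:foldl(
      fun(Draw, Acc) ->
          maps:update_with(wrap(Draw, Size), fun(N) -> N + 1 end, 1, Acc)
      end,
      #{},
      lists:seq(1, Range)).

%% Empirical counterpart: draw NumPoints times and count hits per slot.
simulate(Size, Range, NumPoints) ->
    lists:foldl(
      fun(_, Acc) ->
          Slot = wrap(rand:uniform(Range), Size),
          maps:update_with(Slot, fun(N) -> N + 1 end, 1, Acc)
      end,
      #{},
      lists:seq(1, NumPoints)).
```

Under this model, `pool_pick_sim:slot_weights(8, 9)` gives slot 1 a weight of 2 and every other slot a weight of 1, while `pool_pick_sim:slot_weights(8, 8)` gives every slot a weight of 1.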

To verify this, I've started a pool with 8 workers using the random load-balancing algorithm and measured the frequencies of each PID in gproc_pool.

PoolName = my_pool.
PoolSize = 8.
ok = gproc_pool:new(PoolName, random, [{size, PoolSize}]).
lists:foreach(fun(I) ->
  gproc_pool:add_worker(PoolName, {PoolName, I}, I),
  spawn(fun() ->
    true = gproc_pool:connect_worker(PoolName, {PoolName, I}),
    receive
        stop -> ok
    end
  end),
  ok
end, lists:seq(1, PoolSize)).
MeasureFreqs = fun(Pool, NumPoints) ->
  Pids = lists:map(fun(_) -> gproc_pool:pick_worker(Pool) end, lists:seq(1, NumPoints)),
  lists:foldl(
    fun(Pid, Acc) ->
      maps:update_with(Pid, fun(N) -> N + 1 end, 1, Acc)
    end,
    #{},
    Pids
  )
end.
MeasureFreqs(PoolName, 100_000).

Results prior to the fix (note how <0.172.0> has almost double the frequency of other PIDs):

 #{<0.172.0> => 22222,<0.173.0> => 11138,<0.174.0> => 11000,
  <0.175.0> => 11046,<0.176.0> => 11141,<0.177.0> => 11165,
  <0.178.0> => 11166,<0.179.0> => 11122}

After the fix:

 #{<0.172.0> => 12460,<0.173.0> => 12367,<0.174.0> => 12541,
  <0.175.0> => 12494,<0.176.0> => 12649,<0.177.0> => 12399,
  <0.178.0> => 12439,<0.179.0> => 12651}
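
As a sanity check on these numbers, the expected counts for 100,000 picks can be worked out by hand. This assumes (it is not confirmed from the gproc source here) that before the fix the draw had 8 + 1 = 9 equally likely outcomes, two of which mapped to the same worker:

$$
\begin{aligned}
\text{before, overloaded worker:}\quad & 100000 \cdot \tfrac{2}{9} \approx 22222 \\
\text{before, each other worker:}\quad & 100000 \cdot \tfrac{1}{9} \approx 11111 \\
\text{after, every worker:}\quad & 100000 \cdot \tfrac{1}{8} = 12500
\end{aligned}
$$

The measured 22222 and the 11000 to 11166 range are in line with the first two values, and the post-fix counts (12367 to 12651) sit close to 12500.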

uwiger · 2023-05-04T15:05:58Z

Thanks!

Includes fix: uwiger/gproc#193

Prior to the fix, when using the `random` pool strategy, one of the workers receives about double the load of the other workers, which decreases the throughput of bridges like webhook.

thalesmg · 2023-05-08T20:25:48Z

Hi @uwiger, do you have plans to tag a new version that contains this fix?

We are interested in using the upstream version if possible.

Thanks! 😸

uwiger · 2023-05-12T08:56:31Z

Published version 0.9.1

thalesmg · 2023-05-12T16:02:46Z

Thank you! 🍻

uwiger approved these changes May 4, 2023

uwiger merged commit 4ca45e0 into uwiger:master May 4, 2023

thalesmg deleted the fix-random-pool-uniform-distribution branch May 5, 2023 14:17

thalesmg mentioned this pull request May 8, 2023

chore: bump gproc -> 0.9.0.1 emqx/emqx#10641

Merged

8 tasks

thalesmg mentioned this pull request May 8, 2023

chore: bump gproc -> 0.9.0.1 emqx/ehttpc#47

Merged

thalesmg mentioned this pull request May 9, 2023

chore: update gproc -> 0.9.0.1 [v4.4] emqx/emqx#10652

Merged

8 tasks

thalesmg added a commit to thalesmg/emqx that referenced this pull request May 9, 2023

chore: update gproc -> 0.9.0.1

601f750

See: uwiger/gproc#193
Also (appup and tag): emqx/gproc#1

thalesmg mentioned this pull request May 16, 2023

chore: bump gproc -> 0.9.0.1 (r5.0) emqx/emqx#10725

Merged

8 tasks

thalesmg added a commit to thalesmg/emqx that referenced this pull request May 17, 2023

chore: bump gproc -> 0.9.0.1 (r5.0)

060efd6

Includes this fix: uwiger/gproc#193
