From 2efef40d94379ce9ab768787a146d559015d9d57 Mon Sep 17 00:00:00 2001 From: Michael Nikitochkin Date: Sun, 9 Oct 2022 14:18:26 +0200 Subject: [PATCH] Found that there is checking on ENV variable on each acquire invoking method. Move logic of disabled? for Circuit breaker to Initalization part. Circtuit breaker should have any knowledge of ENV or configurations. Update using of SEMIAN_DISABLED to disable Semian without creating any resources. It would allow only check only once if CircuitBreaker exists in resource acquire. If there is any of ENV `SEMIAN_CIRCUIT_BREAKER_DISABLED` or `SEMIAN_DISABLED` exists, then it would disable Circuit breaker initialization. --- CHANGELOG.md | 3 + README.md | 37 +++++++---- lib/semian.rb | 10 +-- lib/semian/adapter.rb | 2 + lib/semian/circuit_breaker.rb | 6 -- lib/semian/platform.rb | 2 +- test/adapters/net_http_test.rb | 20 +++++- test/circuit_breaker_test.rb | 20 +++--- test/resource_test.rb | 2 +- test/semian_test.rb | 115 +++++++++++++++++++++++++++++++-- 10 files changed, 175 insertions(+), 42 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index ed67f44fa..e8f645d1f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,9 @@ * Avoid prepending the same prefix twice to errors messages. (#423) This mostly happens with the `redis-rb 5+` gem as it translate `redis-client` errors. * Fix running tests in DEBUG mode to test missing semaphores resources. (#430) +* Environment variable `SEMIAN_DISABLED` created a Semian unprotected resource. (#427) + Introduces environment variables `SEMIAN_CIRCUIT_BREAKER_DISABLED` to disable only + `circuit_breaker` and `SEMIAN_BULKHEAD_DISABLED` only `bulkhead`. # v0.16.0 diff --git a/README.md b/README.md index 607cf81e9..90ad26205 100644 --- a/README.md +++ b/README.md @@ -419,9 +419,11 @@ response time. This is the problem Semian solves by failing fast. ## How does Semian work? -Semian consists of two parts: **Circuit breaker** and **Bulkheading**. To understand -Semian, and especially how to configure it, we must understand these patterns -and their implementation. +Semian consists of two parts: **Circuit Breaker** and **Bulkheading**. +To understand Semian, and especially how to configure it, +we must understand these patterns and their implementation. + +Disable Semian via environment variable `SEMIAN_DISABLED=1`. ### Circuit Breaker @@ -452,18 +454,28 @@ all workers on a server. There are four configuration parameters for circuit breakers in Semian: -* **error_threshold**. The amount of errors a worker encounters within error_threshold_timeout amount of time before - opening the circuit, that is to start rejecting requests instantly. -* **error_threshold_timeout**. The amount of time in seconds that error_threshold errors must occur to open the circuit. Defaults to error_timeout seconds if not set. +* **circuit_breaker**. Enable or Disable Circuit Breaker. Defaults to `true` if not set. +* **error_threshold**. The amount of errors a worker encounters within `error_threshold_timeout` + amount of time before opening the circuit, + that is to start rejecting requests instantly. +* **error_threshold_timeout**. The amount of time in seconds that `error_threshold` + errors must occur to open the circuit. + Defaults to `error_timeout` seconds if not set. * **error_timeout**. The amount of time in seconds until trying to query the resource again. -* **error_threshold_timeout_enabled**. If set to false it will disable the time window for evicting old exceptions. `error_timeout` is still used and will reset - the circuit. Defaults to `true` if not set. +* **error_threshold_timeout_enabled**. If set to false it will disable + the time window for evicting old exceptions. `error_timeout` is still used and + will reset the circuit. Defaults to `true` if not set. * **success_threshold**. The amount of successes on the circuit until closing it again, that is to start accepting all requests to the circuit. -* **half_open_resource_timeout**. Timeout for the resource in seconds when the circuit is half-open (supported for MySQL, Net::HTTP and Redis). +* **half_open_resource_timeout**. Timeout for the resource in seconds when + the circuit is half-open (supported for MySQL, Net::HTTP and Redis). + +It is possible to disable Circuit Breaker with environment variable +`SEMIAN_CIRCUIT_BREAKER_DISABLED=1`. -For more information about configuring these parameters, please read [this post](https://engineering.shopify.com/blogs/engineering/circuit-breaker-misconfigured). +For more information about configuring these parameters, please read +[this post](https://engineering.shopify.com/blogs/engineering/circuit-breaker-misconfigured). ### Bulkheading @@ -516,11 +528,15 @@ still experimenting with ways to figure out optimal ticket numbers. Generally something below half the number of workers on the server for endpoints that are queried frequently has worked well for us. +* **bulkhead**. Enable or Disable Bulkhead. Defaults to `true` if not set. * **tickets**. Number of workers that can concurrently access a resource. * **timeout**. Time to wait in seconds to acquire a ticket if there are no tickets left. We recommend this to be `0` unless you have very few workers running (i.e. less than ~5). +It is possible to disable Bulkhead with environment variable +`SEMIAN_BULKHEAD_DISABLED=1`. + Note that there are system-wide limitations on how many tickets can be allocated on a system. `cat /proc/sys/kernel/sem` will tell you. @@ -550,7 +566,6 @@ ipcs -si $(ipcs -s | grep 0x48af51ea | awk '{print $2}') Which should output something like: ``` - Semaphore Array semid=5570729 uid=8192 gid=8192 cuid=8192 cgid=8192 mode=0660, access_perms=0660 diff --git a/lib/semian.rb b/lib/semian.rb index 3422dfdca..65efd2cd0 100644 --- a/lib/semian.rb +++ b/lib/semian.rb @@ -181,6 +181,8 @@ def to_s # # Returns the registered resource. def register(name, **options) + return UnprotectedResource.new(name) if ENV.key?("SEMIAN_DISABLED") + circuit_breaker = create_circuit_breaker(name, **options) bulkhead = create_bulkhead(name, **options) @@ -275,8 +277,8 @@ def thread_safe=(thread_safe) private def create_circuit_breaker(name, **options) - circuit_breaker = options.fetch(:circuit_breaker, true) - return unless circuit_breaker + return if ENV.key?("SEMIAN_CIRCUIT_BREAKER_DISABLED") + return unless options.fetch(:circuit_breaker, true) require_keys!([:success_threshold, :error_threshold, :error_timeout], options) @@ -315,8 +317,8 @@ def implementation(**options) end def create_bulkhead(name, **options) - bulkhead = options.fetch(:bulkhead, true) - return unless bulkhead + return if ENV.key?("SEMIAN_BULKHEAD_DISABLED") + return unless options.fetch(:bulkhead, true) permissions = options[:permissions] || default_permissions timeout = options[:timeout] || 0 diff --git a/lib/semian/adapter.rb b/lib/semian/adapter.rb index fb2922266..7a1c6e96d 100644 --- a/lib/semian/adapter.rb +++ b/lib/semian/adapter.rb @@ -32,6 +32,8 @@ def clear_semian_resource private def acquire_semian_resource(scope:, adapter:, &block) + # NOTICE: It could be happen that Circuit breaker/Bulkhead skipped + # if the resource was acquired return yield if resource_already_acquired? semian_resource.acquire(scope: scope, adapter: adapter, resource: self) do diff --git a/lib/semian/circuit_breaker.rb b/lib/semian/circuit_breaker.rb index b777412c5..3ecee20f9 100644 --- a/lib/semian/circuit_breaker.rb +++ b/lib/semian/circuit_breaker.rb @@ -36,8 +36,6 @@ def initialize(name, exceptions:, success_threshold:, error_threshold:, end def acquire(resource = nil, &block) - return yield if disabled? - transition_to_half_open if transition_to_half_open? raise OpenCircuitError unless request_allowed? @@ -166,10 +164,6 @@ def notify_state_transition(new_state) Semian.notify(:state_change, self, nil, nil, state: new_state) end - def disabled? - ENV["SEMIAN_CIRCUIT_BREAKER_DISABLED"] || ENV["SEMIAN_DISABLED"] - end - def maybe_with_half_open_resource_timeout(resource, &block) if half_open? && @half_open_resource_timeout && resource.respond_to?(:with_resource_timeout) resource.with_resource_timeout(@half_open_resource_timeout) do diff --git a/lib/semian/platform.rb b/lib/semian/platform.rb index 897434b2a..6b92567bf 100644 --- a/lib/semian/platform.rb +++ b/lib/semian/platform.rb @@ -13,6 +13,6 @@ def semaphores_enabled? end def disabled? - ENV["SEMIAN_SEMAPHORES_DISABLED"] || ENV["SEMIAN_DISABLED"] + ENV.key?("SEMIAN_SEMAPHORES_DISABLED") || ENV.key?("SEMIAN_DISABLED") end end diff --git a/test/adapters/net_http_test.rb b/test/adapters/net_http_test.rb index f65410286..f58ca51a1 100644 --- a/test/adapters/net_http_test.rb +++ b/test/adapters/net_http_test.rb @@ -279,8 +279,11 @@ def test_disable_semian_should_be_false def test_disable_semian_for_all_http_requests_with_flag with_server do with_semian_configuration do - http = Net::HTTP.new(SemianConfig["toxiproxy_upstream_host"], SemianConfig["http_toxiproxy_port"], - semian: false) + http = Net::HTTP.new( + SemianConfig["toxiproxy_upstream_host"], + SemianConfig["http_toxiproxy_port"], + semian: false, + ) assert_predicate(http, :disabled?) end @@ -455,6 +458,19 @@ def test_persistent_state_after_server_restart end end + def test_semian_disabled_env + ENV["SEMIAN_DISABLED"] = "1" + with_server do + with_semian_configuration do + http = Net::HTTP.new("example.com", 80) + + assert_equal("nethttp_example.com_80", http.semian_identifier) + end + end + ensure + ENV.delete("SEMIAN_DISABLED") + end + private def half_open_cicuit!(backwards_time_travel = 10) diff --git a/test/circuit_breaker_test.rb b/test/circuit_breaker_test.rb index 45601cfb9..10b0b2814 100644 --- a/test/circuit_breaker_test.rb +++ b/test/circuit_breaker_test.rb @@ -358,22 +358,20 @@ def test_error_threshold_timeout_is_skipped_when_not_using_error_threshold_and_n def test_env_var_disables_circuit_breaker ENV["SEMIAN_CIRCUIT_BREAKER_DISABLED"] = "1" - open_circuit! + resource = Semian.register( + :env_var_disables_circuit_breaker, + tickets: 1, + error_threshold: 1, + error_timeout: 10, + success_threshold: 1, + ) + open_circuit!(resource) - assert_circuit_closed + assert_circuit_closed(resource) ensure ENV.delete("SEMIAN_CIRCUIT_BREAKER_DISABLED") end - def test_semian_wide_env_var_disables_circuit_breaker - ENV["SEMIAN_DISABLED"] = "1" - open_circuit! - - assert_circuit_closed - ensure - ENV.delete("SEMIAN_DISABLED") - end - class RawResource def timeout @timeout || 2 diff --git a/test/resource_test.rb b/test/resource_test.rb index 1ed0601f5..d54362074 100644 --- a/test/resource_test.rb +++ b/test/resource_test.rb @@ -513,7 +513,7 @@ def test_resize_tickets_decrease_with_fork # This is sort of racey, because it's waiting on enough workers to quit # So that it has room to adjust the ticket count to the desired value. # This must happen in less than 5 seconds, or an internal timeout occurs. - resource = create_resource(:testing, tickets: (tickets / 2).floor) + resource = create_resource(:test_resize_tickets_decrease_with_fork, tickets: (tickets / 2).floor) # Immediately on the above call returning, the tickets should be correct assert_equal((tickets / 2).floor, resource.tickets) diff --git a/test/semian_test.rb b/test/semian_test.rb index a99b890e5..8ec1f670f 100644 --- a/test/semian_test.rb +++ b/test/semian_test.rb @@ -3,12 +3,6 @@ require "test_helper" class TestSemian < Minitest::Test - def setup - Semian.destroy(:testing) - rescue - nil - end - def test_unsupported_acquire_yields acquired = false Semian.register(:testing, tickets: 1, error_threshold: 1, error_timeout: 2, success_threshold: 1) @@ -109,4 +103,113 @@ def test_disabled_via_semian_wide_env_var ensure ENV.delete("SEMIAN_DISABLED") end + + def test_disabled_both_bulkheading_and_circuit_breaker + exception = assert_raises(ArgumentError) do + Semian.register( + :disabled_bulkhead_and_circuit_breaker, + bulkhead: false, + circuit_breaker: false, + ) + end + + assert_equal("Both bulkhead and circuitbreaker cannot be disabled.", + exception.message) + end + + def test_disabled_bulkheading + resource = Semian.register( + :disabled_bulkhead, + bulkhead: false, + success_threshold: 1, + error_threshold: 1, + error_timeout: 1, + ) + + assert_nil(resource.bulkhead) + end + + def test_disabled_bulkhead_via_env + ENV["SEMIAN_BULKHEAD_DISABLED"] = "1" + + resource = Semian.register( + :disabled_bulkhead_via_env, + success_threshold: 1, + error_threshold: 1, + error_timeout: 1, + ) + + assert_nil(resource.bulkhead) + ensure + ENV.delete("SEMIAN_BULKHEAD_DISABLED") + end + + def test_disabled_bulkhead_via_env_with_option_enabled + ENV["SEMIAN_BULKHEAD_DISABLED"] = "1" + + resource = Semian.register( + :disabled_bulkhead_via_env, + bulkhead: true, + tickets: 1, + success_threshold: 1, + error_threshold: 1, + error_timeout: 1, + ) + + assert_nil(resource.bulkhead) + ensure + ENV.delete("SEMIAN_BULKHEAD_DISABLED") + end + + def test_disabled_circuit_breaker + resource = Semian.register( + :disabled_circuit_breaker, + tickets: 1, + circuit_breaker: false, + ) + + assert_nil(resource.circuit_breaker) + end + + def test_disabled_circuit_breaker_via_env + ENV["SEMIAN_CIRCUIT_BREAKER_DISABLED"] = "1" + + resource = Semian.register( + :disabled_circuit_breaker_via_env, + tickets: 1, + ) + + assert_nil(resource.circuit_breaker) + ensure + ENV.delete("SEMIAN_CIRCUIT_BREAKER_DISABLED") + end + + def test_disabled_circuit_breaker_via_semian_env + ENV["SEMIAN_DISABLED"] = "1" + + resource = Semian.register(:disabled_semina_via_env) + + assert_nil(resource.circuit_breaker) + assert_nil(resource.bulkhead) + ensure + ENV.delete("SEMIAN_DISABLED") + end + + def test_enabled_circuit_breaker_by_options_but_disabled_by_env + ENV["SEMIAN_CIRCUIT_BREAKER_DISABLED"] = "1" + + resource = Semian.register( + :disabled_circuit_breaker_conflict, + bulkhead: true, + tickets: 1, + circuit_breaker: true, + success_threshold: 1, + error_threshold: 1, + error_timeout: 1, + ) + + assert_nil(resource.circuit_breaker) + ensure + ENV.delete("SEMIAN_CIRCUIT_BREAKER_DISABLED") + end end