Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Response header values folded over multiple lines can't be parsed #38

Open
vytautaskubilius opened this issue Sep 12, 2024 · 7 comments

Comments

@vytautaskubilius
Copy link

vytautaskubilius commented Sep 12, 2024

When receiving a response with a header that is folded over multiple lines, the following error is returned:

Protocol::HTTP1::BadHeader: Could not parse header

I first noticed this issue when trying to migrate to using the async-http-faraday adapter - some requests to external APIs started returning these responses. I was able to reproduce this behaviour by adapting the example from the readme with a simple localhost NGINX server that returns a multi-line header:

repro.rb

require 'async'
require 'io/stream'
require 'async/http/endpoint'
require 'protocol/http1/connection'

Async do
	endpoint = Async::HTTP::Endpoint.parse("http://localhost:8042", alpn_protocols: ["http/1.1"])

	peer = endpoint.connect

	puts "Connected to #{peer} #{peer.remote_address.inspect}"

	# IO Buffering...
	stream = IO::Stream::Buffered.new(peer)
	client = Protocol::HTTP1::Connection.new(stream)

	def client.read_line
		@stream.read_until(Protocol::HTTP1::Connection::CRLF) or raise EOFError
	end

	puts "Writing request..."
	client.write_request("localhost:8042", "GET", "/", "HTTP/1.1", [["Accept", "*/*"]])
	client.write_body("HTTP/1.1", nil)

	puts "Reading response..."
	response = client.read_response("GET")

	puts "Got response: #{response.inspect}"

	puts "Closing client..."
	client.close
end

nginx.conf

user  nginx;
worker_processes  auto;

error_log  /var/log/nginx/error.log notice;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       /etc/nginx/mime.types;
    add_header "Test-Multiline-Header" "foo;
		bar";
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}

Start an NGINX container with this configuration and run the repro script to observe the error:

$ docker run --name test-nginx -v ./nginx.conf:/etc/nginx/nginx.conf -d -p 8042:80 nginx
66c3263f56467000d11fec50ffe76e3b29d3f3cc2e1396a559c801ea80313be6
$ ruby repro.rb
Connected to #<Socket:0x00000001206bbab8> #<Addrinfo: [::1]:8042 TCP>
Writing request...
Reading response...
  0.0s     warn: Async::Task [oid=0x230] [ec=0x244] [pid=24080] [2024-09-12 16:40:27 +0300]
               | Task may have ended with unhandled exception.
               |   Protocol::HTTP1::BadHeader: Could not parse header: "Test-Multiline-Header: foo;\n\t\tbar"
               |   → /Users/vytautaskubilius/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/protocol-http1-0.22.0/lib/protocol/http1/connection.rb:243 in `read_headers'
               |     /Users/vytautaskubilius/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/protocol-http1-0.22.0/lib/protocol/http1/connection.rb:222 in `read_response'
               |     repro.rb:26 in `block in <main>'
               |     /Users/vytautaskubilius/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/async-2.16.1/lib/async/task.rb:197 in `block in run'
               |     /Users/vytautaskubilius/.rbenv/versions/3.3.3/lib/ruby/gems/3.3.0/gems/async-2.16.1/lib/async/task.rb:422 in `block in schedule'

The HTTP/1.1 specs indicate that multiline header values are acceptable as long at they start with horizontal tabs or spaces:

HTTP/1.1 header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab. All linear
white space, including folding, has the same semantics as SP. A
recipient MAY replace any linear white space with a single SP before
interpreting the field value or forwarding the message downstream.

@ioquatix
Copy link
Member

https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.4

According to this, folded headers are obsolete and a bad request is appropriate. Wdyt?

@vytautaskubilius
Copy link
Author

Thanks for pointing that out - I hadn't noticed the link to the newer RFC. Reading through that though I still think that in this case parsing the folded value instead of returning an error would be more appropriate. The RFC only provides a single option for user agents receiving the headers in a response:

   A user agent that receives an obs-fold in a response message that is
   not within a message/http container MUST replace each received
   obs-fold with one or more SP octets prior to interpreting the field
   value.

That same option is also valid for servers and proxies, so doing that should account for all use cases.

@ioquatix
Copy link
Member

ioquatix commented Sep 13, 2024

I'm kind of curious about what servers are generating such a response in a production environment.

Are you able to provide more details?

I'd be willing to accept a PR to implement this (conversion of obs-fold with spaces) but I'm slightly against it as it is deprecated behaviour.

@jakeonfire
Copy link

we integrate with many 3rd party APIs, and one of them (an IoT hardware company from the UK) is returning a header with an obs-fold value, so we're seeing the following error:

Protocol::HTTP1::BadHeader
Could not parse header: "Strict-Transport-Security: max-age=31536000;\n\t\t\t   includeSubDomains; preload"

@ioquatix
Copy link
Member

What about reporting a bug to them too?

@vytautaskubilius
Copy link
Author

I’m afraid that’s not a reliable (and potentially not scalable) option for a couple of reasons:

  • We might not always be able to have the third party make changes to their API in a timely manner, especially when nothing on their end is functionally broken. That would cause significant inconvenience and confusion for our users whose integrations would stop working if we did migrate to async-http-faraday, which would mean that we would have to keep reverting the migration until we manage to catch all such issues.
  • Because we have hundreds of integrations with many API endpoints for each of them, testing each one with all possible use cases in advance is prohibitively expensive in terms if both time and engineering resources.

@ioquatix
Copy link
Member

I would be happy to accept a PR to parse folded headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants