cURL error on concurrent Lambda invocations #2991

Open
mcopik opened this issue Jun 6, 2024 · 1 comment
Labels
feature-request A feature should be added or improved. p3 This is a minor priority issue

Comments

mcopik commented Jun 6, 2024

Describe the bug

I wrote a client that uses the Lambda component to invoke functions asynchronously. I created my own callback, and it seemed to work pretty well. However, when I scaled up to 512 functions, the performance dropped quite quickly, whereas I expected behavior close to linear scaling. I investigated further and found that the default configuration limits the client to a maximum of 25 TCP connections. So, I changed this parameter to 520, and it started scaling again, although performance became very unpredictable. Unfortunately, for 256 concurrent invocations, we now get an error inside the SDK.

I do not believe this is an issue with Lambda. Our custom implementation of a Lambda invoker, based on an HTTP2 client, can scale to a much higher number of concurrent invocations without any issues.

Expected Behavior

The Lambda client from the SDK should scale up to the limit of concurrent connections without (a) errors and (b) performance degradation. With the default limit of 25 connections, the SDK can handle even 512 concurrent invocations; it's just very slow.

Current Behavior

I observed the following error on the client. I don't see any failed Lambda invocations in AWS metrics.

Error with Lambda::InvokeRequest. curlCode: 6, Couldn't resolve host name

Reproduction Steps

This is the main invocation code. np is equal to the number of invocations, which in this case is equal to 256. lambda_name refers to the name of my function.

    Aws::SDKOptions options;
    Aws::InitAPI(options);

    // Raise the connection limit above the default of 25.
    Aws::Client::ClientConfiguration clientConfig;
    clientConfig.maxConnections = 520;
    Aws::Lambda::LambdaClient client(clientConfig);

    int id = 0;
    for (int i = 0; i < np; i++) {
        Aws::Lambda::Model::InvokeRequest request;
        request.SetFunctionName(lambda_name);

        // Each invocation gets n / np iterations as its JSON payload.
        Aws::Utils::Json::JsonValue jsonPayload;
        jsonPayload.WithInt64("iterations", n / np);

        // Tag the request with a sequential ID so the callback can identify it.
        std::shared_ptr<Aws::Client::AsyncCallerContext> context =
                Aws::MakeShared<Aws::Client::AsyncCallerContext>("tag");
        context->SetUUID(std::to_string(id++).c_str());

        std::shared_ptr<Aws::IOStream> payload = Aws::MakeShared<Aws::StringStream>(
                "FunctionTest");
        *payload << jsonPayload.View().WriteReadable();
        request.SetBody(payload);
        request.SetContentType("application/json");

        // Returns immediately; handler is called once the invocation completes.
        client.InvokeAsync(request, handler, context);
    }

The handler function fails immediately at this step:

    void handler(
        const Aws::Lambda::LambdaClient*, const Aws::Lambda::Model::InvokeRequest&,
        Aws::Lambda::Model::InvokeOutcome outcome,
        const std::shared_ptr<const Aws::Client::AsyncCallerContext>& ctx
    )
    {
      // Abort on the first failed invocation and print the SDK error message.
      if (!outcome.IsSuccess()) {
        std::cerr << "Error with Lambda::InvokeRequest. "
                  << outcome.GetError().GetMessage()
                  << std::endl;
        exit(1);
      }
      // Success path not shown in the report.
    }

Possible Solution

No response

Additional Information/Context

No response

AWS CPP SDK version used

Current git master, commit f067d45

Compiler and Version used

Clang 15 (custom fork)

Operating System and version

Ubuntu 22.04

@mcopik mcopik added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 6, 2024
@SergeyRyabinin
Contributor

Hi @mcopik,

Thanks a lot for submitting this issue.

Why it happens

Our current async execution model is quite bad: it simply takes the regular blocking sync calls and runs each of them on a separate thread through the thread executor, such as:

    void Lambda::InvokeAsync(...)
    {
      // Simplified: the blocking Invoke call is handed to the executor as a task.
      threadExecutor->Submit(std::function<...>(Lambda::Invoke(...)));
    }

The default thread executor spawns a separate thread for each submitted async operation.
There is a slightly better PooledThreadExecutor that uses a fixed set of threads, which avoids spawning one thread per operation call (520 threads for 520 calls).
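The difference between the two executor styles can be sketched with the standard library alone. This is a minimal model, not the SDK's implementation: `ThreadPerTaskExecutor`, `FixedPoolExecutor`, and `RunBlockingCalls` are hypothetical names introduced for illustration, and a 1 ms sleep stands in for a blocking `Invoke` call.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a thread-per-task executor: every Submit starts a new thread.
class ThreadPerTaskExecutor {
public:
    void Submit(std::function<void()> task) { threads_.emplace_back(std::move(task)); }
    std::size_t ThreadsSpawned() const { return threads_.size(); }
    void Join() {
        for (auto& t : threads_) if (t.joinable()) t.join();
    }
    ~ThreadPerTaskExecutor() { Join(); }
private:
    std::vector<std::thread> threads_;
};

// Hypothetical stand-in for a pooled executor: a fixed set of workers drains a task queue.
class FixedPoolExecutor {
public:
    explicit FixedPoolExecutor(std::size_t poolSize) {
        assert(poolSize > 0);
        for (std::size_t i = 0; i < poolSize; ++i)
            workers_.emplace_back([this] { Run(); });
    }
    void Submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
    std::size_t ThreadsSpawned() const { return workers_.size(); }
    // Lets the workers finish all queued tasks, then stops them.
    void Join() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) if (w.joinable()) w.join();
    }
    ~FixedPoolExecutor() { Join(); }
private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping was requested and the queue is drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Submits n simulated blocking calls, waits for them, and reports the threads used.
template <typename Executor>
std::size_t RunBlockingCalls(Executor& executor, int n, std::atomic<int>& completed) {
    for (int i = 0; i < n; ++i) {
        executor.Submit([&completed] {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // simulated blocking call
            ++completed;
        });
    }
    executor.Join();
    return executor.ThreadsSpawned();
}
```

In this model, submitting 64 calls to `ThreadPerTaskExecutor` creates 64 threads, while a `FixedPoolExecutor` of size 8 completes the same 64 calls on 8 threads. Neither variant makes the individual calls non-blocking, which is why pooling limits resource usage without raising throughput.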

Another big problem with our current async model is that we use sync/blocking HTTP clients, such as WinHTTP in sync mode or libCurl using curl_easy_handle, so the SDK can't send out the HTTP request and then let the thread execute other code while waiting for the response.

Therefore, when you submit 520 async requests, the SDK is going to spawn 520 threads and 520 curl_easy_handles, each creating its own HTTP connection (including the TLS session).

Is there any mitigation

I'd suggest using PooledThreadExecutor. It won't improve overall throughput, but it will reduce the number of threads being spawned.
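As a rough sketch of that mitigation, the executor can be set on the client configuration before the client is constructed. The helper name `MakePooledConfig`, the allocation tag, and the idea of matching `maxConnections` to the pool size are assumptions made for illustration; the fragment requires the AWS SDK for C++ and has not been tuned for any particular workload.

```cpp
#include <cstddef>

#include <aws/core/Aws.h>
#include <aws/core/client/ClientConfiguration.h>
#include <aws/core/utils/threading/Executor.h>
#include <aws/lambda/LambdaClient.h>

// Hypothetical helper: builds a configuration whose async calls run on a fixed pool.
// Call it only after Aws::InitAPI().
Aws::Client::ClientConfiguration MakePooledConfig(std::size_t poolSize)
{
    Aws::Client::ClientConfiguration clientConfig;
    // Assumption: one connection per pool thread, since each thread blocks on one request.
    clientConfig.maxConnections = static_cast<unsigned>(poolSize);
    // Replace the default executor with a pool of poolSize threads.
    clientConfig.executor =
        Aws::MakeShared<Aws::Utils::Threading::PooledThreadExecutor>("PooledExecutor", poolSize);
    return clientConfig;
}
```

The client from the reproduction steps could then be created as `Aws::Lambda::LambdaClient client(MakePooledConfig(64));`, where 64 is an arbitrary example value. Requests beyond the pool size wait in the executor's queue instead of each getting a thread of their own.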

Long-term plan

We plan to refactor our async model to use proper async request handling with an async HTTP client, such as "curl_multi_handle" and the AWS CRT HTTP client in async mode.
Right now it lives on this branch: https://github.com/aws/aws-sdk-cpp/tree/sr/curlMulti2
Unfortunately, we can't give any ETA here.

I'll mark this issue as a feature request.
Please let us know if you have any other questions about the SDK.

Best regards,
Sergey

@SergeyRyabinin SergeyRyabinin added feature-request A feature should be added or improved. and removed bug This issue is a bug. labels Jun 6, 2024
@jmklix jmklix added p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Jun 6, 2024