
Seeing elevated network read ECONNRESET from AWS #10

Closed
nodesocket opened this issue Apr 28, 2016 · 67 comments

@nodesocket

nodesocket commented Apr 28, 2016

Been seeing lots of npm ERR! network read ECONNRESET when invoking npm install.

Does this indicate a CDN issue on npm's side? This is from multiple AWS instances in the us-west-1 and us-west-2 regions.

npm ERR! network read ECONNRESET
npm ERR! network This is most likely not a problem with npm itself
npm ERR! network and is related to network connectivity.
npm ERR! network In most cases you are behind a proxy or have bad network settings.
npm ERR! network
npm ERR! network If you are behind a proxy, please make sure that the
npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
npm ERR! Linux 3.13.0-48-generic
npm ERR! argv "node" "/home/node/.nvm/versions/node/v0.12.7/bin/npm" "install" "--production"
npm ERR! node v0.12.7
npm ERR! npm  v2.14.3
npm ERR! code ECONNRESET
npm ERR! errno ECONNRESET
npm ERR! syscall read
@nodesocket
Author

nodesocket commented Apr 28, 2016

In tandem, I'm also seeing:

npm ERR! Linux 3.13.0-48-generic
npm ERR! argv "node" "/home/node/.nvm/versions/node/v0.12.7/bin/npm" "install" "--production"
npm ERR! node v0.12.7
npm ERR! npm  v2.14.3

npm ERR! Callback called more than once.
npm ERR!
npm ERR! If you need help, you may report this error at:
npm ERR!     https://github.com/npm/npm/issues

npm ERR! Please include the following file with any support request:
npm ERR!     npm-debug.log

@chrisdickinson
Contributor

I'm looking into this — our CDN isn't reporting any issues. Would you mind sharing the output of traceroute registry.npmjs.org?

@blopker

blopker commented Apr 29, 2016

See also: npm/npm#12484

@othiym23

@blopker: I think you mean npm/npm#12484 – this is a separate repository.

@blopker

blopker commented Apr 29, 2016

Yep, already fixed it 😄

@nodesocket
Author

@chrisdickinson sorry for the delay, here is the result of traceroute:

$ traceroute registry.npmjs.org
traceroute to registry.npmjs.org (23.235.47.162), 30 hops max, 60 byte packets
 1  ec2-50-112-0-164.us-west-2.compute.amazonaws.com (50.112.0.164)  1.648 ms  1.662 ms  1.686 ms
 2  100.64.1.53 (100.64.1.53)  1.031 ms 100.64.1.63 (100.64.1.63)  1.024 ms 100.64.1.9 (100.64.1.9)  1.332 ms
 3  100.64.0.70 (100.64.0.70)  1.339 ms 100.64.0.96 (100.64.0.96)  1.357 ms 100.64.0.34 (100.64.0.34)  1.159 ms
 4  100.64.16.111 (100.64.16.111)  0.401 ms 100.64.16.159 (100.64.16.159)  0.418 ms 100.64.16.97 (100.64.16.97)  0.415 ms
 5  54.239.48.190 (54.239.48.190)  0.711 ms 54.239.48.188 (54.239.48.188)  0.734 ms 54.239.48.190 (54.239.48.190)  0.822 ms
 6  205.251.232.144 (205.251.232.144)  0.854 ms 205.251.232.162 (205.251.232.162)  0.718 ms 205.251.232.144 (205.251.232.144)  0.672 ms
 7  54.239.42.7 (54.239.42.7)  21.606 ms 205.251.232.109 (205.251.232.109)  19.991 ms 54.239.42.7 (54.239.42.7)  21.312 ms
 8  205.251.229.174 (205.251.229.174)  19.945 ms  20.479 ms 205.251.229.138 (205.251.229.138)  19.295 ms
 9  eqix-sv1.fastly.sv5-2.com (206.223.116.70)  19.626 ms 54.240.242.69 (54.240.242.69)  19.374 ms 54.240.242.73 (54.240.242.73)  19.969 ms
10  eqix-sv1.fastly.sv5-2.com (206.223.116.70)  19.740 ms * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Also did mtr:

$ mtr --report registry.npmjs.org
Start: Fri Apr 29 17:07:40 2016
HOST: ip-172-31-14-205            Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  2.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  3.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  4.|-- 100.64.16.23               0.0%    10    0.4   0.4   0.4   0.5   0.0
  5.|-- 54.239.48.188              0.0%    10    0.8   1.1   0.7   1.8   0.0
  6.|-- 205.251.232.144            0.0%    10    1.0   0.7   0.5   1.3   0.0
  7.|-- 205.251.232.109            0.0%    10   27.3  21.5  19.8  28.5   3.3
  8.|-- 205.251.229.170            0.0%    10   19.5  20.2  19.5  21.2   0.0
  9.|-- eqix-sv1.fastly.sv5-1.com  0.0%    10   19.8  19.9  19.8  19.9   0.0
 10.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
 11.|-- 23.235.47.162              0.0%    10   19.9  20.1  19.9  20.7   0.0

@achaudhry

Guys, I've pretty much lost all my hair trying to figure this out. Finally stumbled upon this issue. Any progress? I can't deploy anything on any of my EC2 instances. Thank you in advance!

npm ERR! Linux 4.4.5-15.26.amzn1.x86_64
  npm ERR! argv "/opt/elasticbeanstalk/node-install/node-v4.3.0-linux-x64/bin/node" "/opt/elasticbeanstalk/node-install/node-v4.3.0-linux-x64/bin/npm" "--production" "install"
  npm ERR! node v4.3.0
  npm ERR! npm  v2.14.12
  npm ERR! code ECONNRESET
  npm ERR! errno ECONNRESET
  npm ERR! syscall read

  npm ERR! network read ECONNRESET
  npm ERR! network This is most likely not a problem with npm itself
  npm ERR! network and is related to network connectivity.
  npm ERR! network In most cases you are behind a proxy or have bad network settings.
  npm ERR! network 
  npm ERR! network If you are behind a proxy, please make sure that the
  npm ERR! network 'proxy' config is set properly.  See: 'npm help config'

@nodesocket
Author

@achaudhry what AWS region are you in?

I've been trying to debug this all day with no luck. We are in us-west-1 and us-west-2. I've tried increasing the SSD EBS volume size to 300GB to improve the IOPS, with no luck. We even have EBS-optimized instances of type c4.4xlarge, which should have plenty of CPU and memory.

It seems to be intermittent, though it fails more often than not.

@achaudhry

@nodesocket My instances are in us-west-2.

@achaudhry

Although I wasn't actually going to use http, I tried changing the registry to http just out of curiosity, but that didn't work for me either. Just FYI...

@nodesocket
Author

nodesocket commented Apr 30, 2016

@achaudhry is it a 100% error rate, or does it work sometimes? Also, what instance type are you using? Are you using SSD EBS?

@achaudhry

@nodesocket It worked once about 5 hours ago, but it's been 100% errors for the last 4 or so hours, which is bizarre! I've tried on a couple of instances, about 5-7 times on each.

I have nothing fancy at the moment. Both instances are t1.micro - don't laugh, I'm a poor entrepreneur!

@nodesocket
Author

nodesocket commented May 2, 2016

Your instance size may be the problem; 't1.micro' instances barely have any power or I/O. Would you be able to try a 'c4.large' just for a few minutes? It should only cost 15 cents or so.

I've tried 't2.large', 'c4.2xlarge' and even 'c4.4xlarge'.

@tolgaek

tolgaek commented May 2, 2016

Having the same problem on an EC2 instance through Elastic Beanstalk.

@soldair
Contributor

soldair commented May 2, 2016

We are discussing internally how best to gather enough data to figure out the issue.

We are successfully serving so many requests from all EC2 zones that we know it's not a general registry issue. This makes it very tricky to move forward, but we are kicking around some ideas. Hope to have more information soon.

If anyone comes up with a way to profile their requests to the registry and finds out more, please let us know.
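
As a rough sketch of what that could look like (nothing official; the package name and log file name below are arbitrary examples), you could time a handful of registry requests with curl and capture a verbose log from a failing install:

$ # time five requests to the registry; prints connect/total time and HTTP status per request
$ for i in 1 2 3 4 5; do curl -s -o /dev/null -w 'connect=%{time_connect}s total=%{time_total}s status=%{http_code}\n' https://registry.npmjs.org/express; done
$ # rerun the failing install with verbose logging and keep the output
$ npm install --production --loglevel=verbose > npm-verbose.log 2>&1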

@tolgaek

tolgaek commented May 2, 2016

I created a forum post on the AWS forums: https://forums.aws.amazon.com/thread.jspa?threadID=230574&tstart=0

Maybe that can get some attention from the AWS side.

@acusti

acusti commented May 2, 2016

Quoting a suggestion from @othiym23 in npm/npm#9418 (comment) that hasn’t been mentioned yet in this thread:

This is a network issue that occurs somewhere between the CLI and the CDN, and may or may not be due to the large number of simultaneous HTTPS connections initiated by the CLI. If it's the latter, then updating to the newest versions of npm@2 or npm@3 might mitigate the problem somewhat, due to the fact that they now limit the number of sockets open simultaneously.
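
If that theory holds, a possible mitigation (just a sketch based on the quote above, not something npm has confirmed fixes this) is to update the client and cap concurrent connections via npm's maxsockets setting:

$ npm install -g npm@2          # or npm@3 for the latest 3.x
$ npm config set maxsockets 5   # default is 50; fewer sockets means fewer simultaneous HTTPS connections
$ npm install --production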

@tolgaek

tolgaek commented May 2, 2016

Not sure if anything changed on the AWS or registry side, but I was just able to deploy 3 times in a row.

@nodesocket
Author

nodesocket commented May 2, 2016

Just confirmed that npm install is magically working again. This leads me to believe one of two things:

  1. The problem is related to npm or npm's CDN (Fastly).
  2. The problem is related to AWS.

I'm leaning toward 1, because the AWS instances were up and running and everything besides npm seemed to be working correctly.

@achaudhry

@nodesocket Agreed, it worked for me as well on the micro instance. I'm confused about what was going wrong for two days straight. I'd love to find out for future reference, since that was not a very fun exercise...

That said, I'm happy to test it on a slightly more powerful instance, but since it's working now, I highly doubt the instance type was the issue. Thanks!

@tolgaek

tolgaek commented May 2, 2016

@achaudhry I was blocked on very powerful instances as well, so I don't think the issue was related to that.

@sicloudhosting

traceroute from c3.2xlarge in us-west-2

traceroute to registry.npmjs.org (199.27.79.162), 30 hops max, 60 byte packets
1 ec2-50-112-0-164.us-west-2.compute.amazonaws.com (50.112.0.164) 1.067 ms ec2-50-112-0-198.us-west-2.compute.amazonaws.com (50.112.0.198) 0.963 ms ec2-50-112-0-164.us-west-2.compute.amazonaws.com (50.112.0.164) 1.027 ms
2 100.64.1.207 (100.64.1.207) 1.023 ms 100.64.1.11 (100.64.1.11) 1.992 ms 100.64.1.223 (100.64.1.223) 1.188 ms
3 100.64.0.38 (100.64.0.38) 1.102 ms 100.64.0.130 (100.64.0.130) 1.292 ms 100.64.0.228 (100.64.0.228) 1.922 ms
4 100.64.16.211 (100.64.16.211) 0.305 ms 100.64.16.11 (100.64.16.11) 0.334 ms 100.64.16.77 (100.64.16.77) 0.314 ms
5 54.239.48.188 (54.239.48.188) 0.613 ms 205.251.230.124 (205.251.230.124) 0.621 ms 54.239.48.188 (54.239.48.188) 0.568 ms
6 205.251.232.162 (205.251.232.162) 0.694 ms 205.251.232.144 (205.251.232.144) 0.758 ms 205.251.232.150 (205.251.232.150) 0.888 ms
7 205.251.232.93 (205.251.232.93) 8.024 ms 205.251.232.95 (205.251.232.95) 8.534 ms 7.906 ms
8 52.95.52.24 (52.95.52.24) 7.245 ms 52.95.52.56 (52.95.52.56) 17.797 ms 52.95.52.192 (52.95.52.192) 12.497 ms
9 52.95.52.53 (52.95.52.53) 7.815 ms 52.95.52.119 (52.95.52.119) 7.841 ms 52.95.52.157 (52.95.52.157) 8.028 ms
10 be-128-pe04.seattle.wa.ibone.comcast.net (66.208.228.209) 8.370 ms 7.358 ms be-127-pe04.seattle.wa.ibone.comcast.net (50.248.119.141) 8.264 ms
11 hu-1-2-0-9-cr01.seattle.wa.ibone.comcast.net (68.86.84.45) 8.972 ms hu-1-2-0-5-cr01.seattle.wa.ibone.comcast.net (68.86.82.229) 8.953 ms hu-0-7-0-4-cr02.seattle.wa.ibone.comcast.net (68.86.84.53) 9.475 ms
12 be-12125-cr01.9greatoaks.ca.ibone.comcast.net (68.86.85.198) 26.031 ms 25.998 ms be-10821-cr01.seattle.wa.ibone.comcast.net (68.86.85.81) 8.472 ms
13 be-10925-cr01.sunnyvale.ca.ibone.comcast.net (68.86.87.157) 27.973 ms 27.782 ms be-12125-cr01.9greatoaks.ca.ibone.comcast.net (68.86.85.198) 26.605 ms
14 be-10915-cr02.losangeles.ca.ibone.comcast.net (68.86.86.98) 31.558 ms 30.357 ms 32.436 ms
15 hu-0-2-0-1-pe02.600wseventh.ca.ibone.comcast.net (68.86.88.14) 30.807 ms be-10915-cr02.losangeles.ca.ibone.comcast.net (68.86.86.98) 32.551 ms 32.511 ms
16 hu-0-2-0-1-pe02.600wseventh.ca.ibone.comcast.net (68.86.88.14) 30.846 ms 30.992 ms 31.025 ms
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

@blopker

blopker commented May 3, 2016

Don't know if it helps, but we're getting this error on CentOS 6, though not on 5 or 7.

@nodesocket
Author

nodesocket commented May 3, 2016

@chrisdickinson just started happening again. Seeing elevated connection timeouts consistently.

Region: us-west-2
Instance Type: t2.large

This is seriously affecting us, since all of our CI builds are failing.

npm ERR! network read ECONNRESET
npm ERR! network This is most likely not a problem with npm itself
npm ERR! network and is related to network connectivity.
npm ERR! network In most cases you are behind a proxy or have bad network settings.
npm ERR! network
npm ERR! network If you are behind a proxy, please make sure that the
npm ERR! network 'proxy' config is set properly.  See: 'npm help config'
npm ERR! Linux 3.13.0-48-generic
npm ERR! argv "node" "/home/node/.nvm/versions/node/v0.12.7/bin/npm" "install" "--production"
npm ERR! node v0.12.7
npm ERR! npm  v2.14.3
npm ERR! code ECONNRESET
npm ERR! errno ECONNRESET
npm ERR! syscall read

@tolgaek

tolgaek commented May 3, 2016

Can confirm I'm once again having issues as well :(

@thesmart

thesmart commented May 3, 2016

This issue is intermittent in our us-west-2 EC2 clusters, occurring daily and maybe for a few minutes at a time. Kicking the build a second time usually does the trick. Considering the number of people reporting this issue, I doubt it's a local problem, unless something is messed up with Amazon's network. There does seem to be a high number of Amazon users reporting in.
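
In case it helps anyone else, the "kick it again" workaround in CI is roughly the following (a crude sketch, not our exact setup; the retry count is arbitrary):

$ # retry the install up to three times; the exit code is nonzero only if every attempt fails
$ npm install --production || npm install --production || npm install --production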

@terinjokes

I wonder if this is a rate limit kicking in from Fastly; the reports in the last day all seem to be from AWS us-west.

@ricoli

ricoli commented May 3, 2016

Having the same issue on eu-west-1.

@tolgaek

tolgaek commented May 3, 2016

It started happening again. Does anyone know of any workarounds until this is resolved? Can't get builds out!

@deleteme

deleteme commented May 5, 2016

I can confirm that upgrading node from 5.1.1 to 5.11.0 fixes it.
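
For anyone managing node with nvm (the paths earlier in this thread suggest several of us are), the upgrade itself is just something like the following; 5.11.0 is the version that worked here, substitute whatever release you're targeting:

$ nvm install 5.11.0
$ nvm use 5.11.0
$ node --version && npm --version   # confirm the new node and its bundled npm are active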

@blopker

blopker commented May 5, 2016

Just one more datapoint: 4.4.3 fixed it for us too.

@adrianovalente

I had the same problem deploying to AWS EB with npm 2.4.12. Changing the registry to http worked for me!
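
For reference, the registry switch is just a couple of commands. Note that it drops TLS, so treat it as a temporary diagnostic rather than a permanent fix, and switch back afterwards:

$ npm config set registry http://registry.npmjs.org/
$ npm install --production
$ npm config set registry https://registry.npmjs.org/   # restore the default once done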

@achaudhry

@nodesocket @soldair After about 5 months, the issue has suddenly resurfaced for me, and it's happening consistently today. Is anyone else seeing it again? Last time I tried everything mentioned above and nothing worked, but it randomly started working one day. I've again tried everything mentioned here, with no luck so far. Has anyone figured out the root cause?

@soldair
Contributor

soldair commented Oct 1, 2016

Node and npm client version?
To debug we need more data. If you install with stats-npm and send me the log for the failures, I stand the best chance of figuring it out.

We haven't had any infrastructure or system changes recently that could cause this, so we are in much the same place as before.

@atungare

atungare commented Oct 6, 2016

@soldair We are seeing a bunch of these ECONNRESETs recently on Travis, with e.g. node 5.8.0 and npm 3.7.3 (but other versions of node & npm are also experiencing errors).

@soldair
Contributor

soldair commented Oct 6, 2016

@atungare We are not experiencing any downtime or service interruption as far as I can tell. We'll have to find a way to get stats-npm logs for your install, or at least an npm-debug.log, in order to debug this.
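
If it helps, a minimal way to produce something useful: rerun the failing install with verbose logging and grab the npm-debug.log that npm writes to the current directory when the install errors out. Roughly:

$ npm install --loglevel=verbose 2>&1 | tee npm-install-verbose.log
$ ls npm-debug.log   # created automatically on failure; attach this to the report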

@clareliguori

clareliguori commented Jan 14, 2017

My team is seeing this on AWS today (see https://github.com/npm/registry/issues/112).

I'll throw out a possibility: does the npm registry (or Fastly) do any IP-based throttling? It seems fairly consistent that builds run "in the cloud" (on AWS or on Travis) are affected, and I'm assuming their requests to the registry would all come from the same IP ranges. I'm wondering if rate limiting is kicking in.

@jboler

jboler commented Mar 9, 2017

About 50% of our builds on our paid Travis-CI account fail with this error. node 4.1.2, npm 2.15.11.

@pjar

pjar commented Oct 8, 2017

It's holding up my work too. npm is so broken; how the hell did it become a kind of standard?
Sorry, I've just lost so much time on npm errors this week and I'm getting frustrated...
