tserver does not start with 'Too many open files' error #380

Closed
vincentvictoria opened this issue Jul 12, 2018 · 6 comments
Labels: kind/question (This is a question)

Comments

@vincentvictoria

First, let me say Yugabyte looks awesome.
Now to my problem:
I'm on a Mac (macOS 10.13.2), following https://docs.yugabyte.com/latest/quick-start/create-local-cluster/

I set up the ifconfig aliases, then ran

./bin/yb-ctl create

followed by

./bin/yb-ctl status

which displays

2018-07-12 16:54:29,813 INFO: Server is running: type=master, node_id=1, PID=947, admin service=http://127.0.0.1:7000
2018-07-12 16:54:29,834 INFO: Server is running: type=master, node_id=2, PID=950, admin service=http://127.0.0.2:7000
2018-07-12 16:54:29,857 INFO: Server is running: type=master, node_id=3, PID=953, admin service=http://127.0.0.3:7000
2018-07-12 16:54:29,880 INFO: Server tserver-1 is not running
2018-07-12 16:54:29,898 INFO: Server tserver-2 is not running
2018-07-12 16:54:29,921 INFO: Server tserver-3 is not running

I looked at /private/tmp/yugabyte-local-cluster/node-1/disk-1/tserver.err


Could not create logging file: Too many open files
COULD NOT CREATE A LOGGINGFILE 20180712-165429.956!F0712 16:54:29.066678 44068864 reactor.cc:103] LibEV fatal error: (libev) error creating signal/async pipe: Too many open files [24]
Fatal failure details written to /tmp/yugabyte-local-cluster/node-1/disk-1/yb-data/tserver/logs/yb-tserver.FATAL.details.2018-07-12T16_54_29.pid956.txt
F20180712 16:54:29 ../../../../../src/yb/rpc/reactor.cc:103] LibEV fatal error: (libev) error creating signal/async pipe: Too many open files [24]
    @        0x10e873c8b  google::LogDestination::LogToSinks()
    @        0x10e872daf  google::LogMessage::SendToLog()
    @        0x10e873775  google::LogMessage::Flush()
    @        0x10e8735f3  google::LogMessage::~LogMessage()
    @        0x10e8744ee  google::ErrnoLogMessage::~ErrnoLogMessage()
    @        0x10dfbd1be  yb::rpc::(anonymous namespace)::LibevSysErr()
    @        0x10dc580a8  evpipe_init
    @        0x10dc59203  ev_async_start
    @        0x10dfb73c0  yb::rpc::Reactor::Init()
    @        0x10df991eb  yb::rpc::MessengerBuilder::Build()
    @        0x10ce51d43  yb::client::YBClientBuilder::Build()
    @        0x10ce3e4f6  yb::client::AsyncClientInitialiser::InitClient()
    @        0x10ce4047e  _ZNSt3__114__thread_proxyINS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEENS_6__bindIMN2yb6client22AsyncClientInitialiserEFvvEJPSA_EEEEEEEEPvSG_
    @     0x7fff78df26c1  _pthread_body
    @     0x7fff78df256d  _pthread_start
    @     0x7fff78df1c5d  thread_start

*** Check failure stack trace: ***
    @        0x10e87400a  google::LogMessage::Fail()
    @        0x10e873058  google::LogMessage::SendToLog()
    @        0x10e873775  google::LogMessage::Flush()
    @        0x10e8735f3  google::LogMessage::~LogMessage()
    @        0x10e8744ee  google::ErrnoLogMessage::~ErrnoLogMessage()
    @        0x10dfbd1be  yb::rpc::(anonymous namespace)::LibevSysErr()
    @        0x10dc580a8  evpipe_init
    @        0x10dc59203  ev_async_start
    @        0x10dfb73c0  yb::rpc::Reactor::Init()
    @        0x10df991eb  yb::rpc::MessengerBuilder::Build()
    @        0x10ce51d43  yb::client::YBClientBuilder::Build()
    @        0x10ce3e4f6  yb::client::AsyncClientInitialiser::InitClient()
    @        0x10ce4047e  _ZNSt3__114__thread_proxyINS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEENS_6__bindIMN2yb6client22AsyncClientInitialiserEFvvEJPSA_EEEEEEEEPvSG_
    @     0x7fff78df26c1  _pthread_body
    @     0x7fff78df256d  _pthread_start
    @     0x7fff78df1c5d  thread_start

Then I ran

sysctl kern.maxfiles
kern.maxfiles: 524288
sysctl kern.maxfilesperproc
kern.maxfilesperproc: 65535

So I'm stuck here.
Any help will be appreciated :)

@bmatican
Contributor

Hey @vincentvictoria, can you also post your ulimit -a output?

Also, here are my local settings for the same sysctls:

sysctl -a | grep maxfiles
kern.maxfiles: 1048576
kern.maxfilesperproc: 1048576

We should enhance our prereq docs section to add a note on this!

@rven1
Contributor

rven1 commented Jul 12, 2018

We see about 1800 open files in lsof for the 6 Yugabyte processes.
Can you run the following command before you start the cluster?
lsof | wc
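
A note on reading these numbers: lsof | wc counts entries across every process lsof can see, and lsof also lists things like working directories and mapped libraries, so it is only a rough proxy. The error in tserver.err ("Too many open files [24]", i.e. EMFILE) is about the per-process descriptor limit. About 1800 entries over 6 processes averages to roughly 300 per process if they were spread evenly (an assumption), which is already above the 256 reported by ulimit -n later in this thread. Below is a minimal Python sketch for comparing one process's own descriptor count with its soft limit. It assumes /dev/fd is available (it is on macOS and Linux); the helper names open_fd_count and fd_headroom are made up for illustration and are not part of yb-ctl.

```python
import os
import resource


def open_fd_count():
    """Count descriptors currently open in this process.

    Lists /dev/fd, which exposes the calling process's descriptors on
    macOS and Linux. The count may include the descriptor used for the
    listing itself, so treat it as approximate; differences between two
    calls are still meaningful.
    """
    return len(os.listdir("/dev/fd"))


def fd_headroom():
    """Return (used, remaining) relative to the soft RLIMIT_NOFILE.

    remaining is None when the soft limit is unlimited.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    used = open_fd_count()
    if soft == resource.RLIM_INFINITY:
        return used, None
    return used, soft - used


if __name__ == "__main__":
    used, remaining = fd_headroom()
    print("open descriptors:", used, "remaining before EMFILE:", remaining)
```

This only describes the Python process that runs it. For a tserver or master you would still look at that process specifically (for example with lsof -p and its PID) rather than at the system-wide total.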

@vincentvictoria
Author

I changed my sysctl settings to match @bmatican's.

sysctl -a | grep maxfiles

kern.maxfiles: 1048576
kern.maxfilesperproc: 1048576

ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1418
virtual memory          (kbytes, -v) unlimited

Now I'm getting a slightly different error, but it is essentially the same problem:

libc++abi.dylib: terminating with uncaught exception of type std::__1::system_error: random_device failed to open /dev/urandom: Too many open files
*** Aborted at 1531441747 (unix time) try "date -d @1531441747" if you are using GNU date ***
PC: @     0x7fff7338de3e __pthread_kill
*** SIGABRT (@0x7fff7338de3e) received by PID 2720 (TID 0x70000fad4000) stack trace: ***
    @     0x7fff734bff5a _sigtramp
    @            0x20008 (unknown)
    @     0x7fff732ea312 abort
    @     0x7fff712c5f8f abort_message
    @     0x7fff712c6113 default_terminate_handler()
    @     0x7fff72650eab _objc_terminate()
    @     0x7fff712e17c9 std::__terminate()
    @     0x7fff712e126d __cxa_throw
    @     0x7fff712b47af std::__1::__throw_system_error()
    @     0x7fff712a749d std::__1::random_device::random_device()
    @        0x105f32792 yb::Seed<>()
    @        0x105f32711 yb::ThreadLocalRandom()
    @        0x1058ea8ea yb::rpc::RpcRetrier::DelayedRetry()
    @        0x104c74c26 yb::master::GetLeaderMasterRpc::Finished()
    @        0x104c74abd yb::master::GetLeaderMasterRpc::GetMasterRegistrationRpcCbForNode()
    @        0x104c771ae _ZNSt3__110__function6__funcINS_6__bindIMN2yb6master18GetLeaderMasterRpcEFviRKNS3_6StatusERKNS_10shared_ptrINS3_3rpc10RpcCommandEEEN5boost9container22stable_vector_iteratorIPSC_Lb0EEEEJPS5_RiRKNS_12placeholders4__phILi1EEERSC_RSJ_EEENS_9allocatorISV_EEFvS8_EEclES8_
    @        0x104c77708 yb::master::(anonymous namespace)::GetMasterRegistrationRpc::Finished()
    @        0x1058cdb1e yb::rpc::OutboundCall::CallCallback()
    @        0x1058e1dc6 yb::rpc::Reactor::AssignOutboundCall()
    @        0x1058df08e yb::rpc::Reactor::ProcessOutboundQueue()
    @        0x1058e17eb yb::rpc::Reactor::AsyncHandler()
    @        0x105a28d59 ev_invoke_pending
    @        0x105a299ea ev_run
    @        0x1058df5ac yb::rpc::Reactor::RunThread()
    @        0x105f4a86a yb::Thread::SuperviseThread()
    @     0x7fff734c96c1 _pthread_body
    @     0x7fff734c956d _pthread_start
    @     0x7fff734c8c5d thread_start

Before yb-ctl create:

lsof |wc
9595 91101 1398418

After yb-ctl create (with the error):
9992 94581 1454919

@bmatican
Contributor

Oh, as I suspected, it seems ulimit is taking precedence over sysctl: your max open files in ulimit is only 256.

@vincentvictoria Can you try sudo ulimit -n 1048576?

For reference, my settings:

ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1048576
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2500
virtual memory          (kbytes, -v) unlimited
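
For what it's worth, the same per-process limit can be inspected and raised from inside a process. Below is a minimal sketch using Python's standard resource module. The helper name raise_nofile_soft_limit is hypothetical, and 1048576 is simply the value used in this thread, not a documented requirement. An unprivileged process can raise its soft limit only up to its hard limit, and as far as I know macOS may additionally refuse or clamp values above kern.maxfilesperproc, so the sketch catches a failed attempt and reports whatever limit is actually in effect.

```python
import resource


def raise_nofile_soft_limit(target):
    """Try to raise the soft RLIMIT_NOFILE to target; never lowers it.

    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # A soft limit cannot exceed the hard limit, so cap the request.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    if soft != resource.RLIM_INFINITY and soft < target:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            # The OS rejected the request (for example a kernel-side cap).
            # Leave the limit as it was and let the caller see the result.
            pass

    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling raise_nofile_soft_limit(1048576) early in a launcher script would affect that script and any servers it starts afterwards, since children inherit the limit. It does not change the limit of the shell that invoked it.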

@vincentvictoria
Author

sudo ulimit -n 1048576 didn't work, so I did a search and found a solution: https://superuser.com/a/1171026

It now works for me.
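
A likely reason sudo ulimit -n had no effect (an inference, not confirmed in this thread): resource limits are per-process and are inherited by child processes, so a limit changed inside the process that sudo starts never reaches the shell that later launches yb-ctl. Here is a minimal Python sketch of that behavior. It changes the limit in a child process and shows the parent's limit is untouched. The child lowers its limit to 64, an arbitrary value picked for illustration because lowering is always permitted; the function name is made up.

```python
import resource
import subprocess
import sys

# Code run in a separate child interpreter: lower the child's own soft
# limit to at most 64 and print the resulting value.
CHILD_CODE = """
import resource
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
new_soft = 64 if (soft == resource.RLIM_INFINITY or soft > 64) else soft
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])
"""


def limits_seen_by_parent_and_child():
    """Return (parent_before, child_soft, parent_after) soft limits."""
    before = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    child_soft = int(result.stdout.strip())
    after = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    return before, child_soft, after


if __name__ == "__main__":
    before, child_soft, after = limits_seen_by_parent_and_child()
    print("parent before:", before, "child:", child_soft, "parent after:", after)
```

The parent's value is the same before and after, which is why the limit has to be changed in the shell that starts the servers, or system-wide as in the linked answer.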

In summary, I did the following to make it work on High Sierra.

Created /etc/sysctl.conf:

kern.maxfiles=1048576
kern.maxfilesperproc=1048576

Then I did what's described in the link above and restarted.
It works!

Thanks for the help!

@kmuthukk
Collaborator

Hi @vincentvictoria - thanks for sharing the tip for High Sierra (Mac).

@rven1: Could you please take an action item to document the recommended ulimit settings in the "Prerequisites" section of the docs for macOS here: https://docs.yugabyte.com/latest/quick-start/install/#macos? Thanks.

kmuthukk added the kind/question label Jul 14, 2018
jasonyb pushed a commit that referenced this issue Jun 11, 2024
PG-542: Performance improvement of pg_stat_monitor.