rpm 19.17.4.11 (altinity) -> 20.1.2.4 (yandex) = Segmentation fault #8925

Closed
morozovsk opened this issue Jan 31, 2020 · 7 comments · Fixed by #10025
Labels
bug (Confirmed user-visible misbehaviour in official release), crash (Crash / segfault / abort), st-need-info (We need extra data to continue (waiting for response))

Comments

@morozovsk
Contributor

CentOS 7.6.1810.

Previously, ClickHouse 19.17.4.11 was installed from the Altinity_clickhouse/x86_64 rpm package.
After upgrading ClickHouse from the repo.yandex.ru_clickhouse_rpm_stable_x86_64 rpm package,
it started crashing periodically (it has crashed at midnight two days in a row).

2020.01.31 00:00:06.531882 [ 118 ] {} <Fatal> BaseDaemon: ########################################
2020.01.31 00:00:06.531938 [ 118 ] {} <Fatal> BaseDaemon: (version 20.1.2.4 (official build)) (from thread 51) (query_id: 3b7f94c7-7aae-4fe6-863e-5248cfe97cc3) Received signal Segmentation fault (11).
2020.01.31 00:00:06.531960 [ 118 ] {} <Fatal> BaseDaemon: Address: 0x7fd3e767f000 Access: read. Attempted access has violated the permissions assigned to the memory area.
2020.01.31 00:00:06.531970 [ 118 ] {} <Fatal> BaseDaemon: Stack trace: 0xbc03280 0x5695ad8 0x95c1263 0x95e8b5b 0x95e8f66 0x95e94c5 0x9651341 0x9587e60 0x4fa4657 0x4fa4c84 0x4fa3b77 0x4fa212f 0x7fd4a1aeadd5 0x7fd4a1407ead
2020.01.31 00:00:06.531993 [ 118 ] {} <Fatal> BaseDaemon: 3. 0xbc03280 memcpy  in /usr/bin/clickhouse
2020.01.31 00:00:06.532015 [ 118 ] {} <Fatal> BaseDaemon: 4. 0x5695ad8 DB::ColumnString::insertData(char const*, unsigned long)  in /usr/bin/clickhouse
2020.01.31 00:00:06.532041 [ 118 ] {} <Fatal> BaseDaemon: 5. 0x95c1263 void DB::Aggregator::convertToBlockImplFinal<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >, StringHashMap<char*, Allocator<true, true> > >(DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, StringHashMap<char*, Allocator<true, true> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&) const  in /usr/bin/clickhouse
2020.01.31 00:00:06.532074 [ 118 ] {} <Fatal> BaseDaemon: 6. 0x95e8b5b void DB::Aggregator::convertToBlockImpl<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >, StringHashMap<char*, Allocator<true, true> > >(DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, StringHashMap<char*, Allocator<true, true> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool) const  in /usr/bin/clickhouse
2020.01.31 00:00:06.532094 [ 118 ] {} <Fatal> BaseDaemon: 7. 0x95e8f66 DB::Block DB::Aggregator::prepareBlockAndFill<DB::Block DB::Aggregator::convertOneBucketToBlock<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> > >(DB::AggregatedDataVariants&, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, bool, unsigned long) const::'lambda'(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool)>(DB::AggregatedDataVariants&, bool, unsigned long, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&&) const  in /usr/bin/clickhouse
2020.01.31 00:00:06.532113 [ 118 ] {} <Fatal> BaseDaemon: 8. 0x95e94c5 DB::Block DB::Aggregator::convertOneBucketToBlock<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> > >(DB::AggregatedDataVariants&, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, bool, unsigned long) const  in /usr/bin/clickhouse
2020.01.31 00:00:06.532126 [ 118 ] {} <Fatal> BaseDaemon: 9. 0x9651341 DB::MergingAndConvertingBlockInputStream::thread(int, std::__1::shared_ptr<DB::ThreadGroupStatus>)  in /usr/bin/clickhouse
2020.01.31 00:00:06.532139 [ 118 ] {} <Fatal> BaseDaemon: 10. 0x9587e60 std::__1::__function::__func<std::__1::__bind<void (DB::MergingAndConvertingBlockInputStream::*)(int, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::MergingAndConvertingBlockInputStream*, int&, std::__1::shared_ptr<DB::ThreadGroupStatus> >, std::__1::allocator<std::__1::__bind<void (DB::MergingAndConvertingBlockInputStream::*)(int, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::MergingAndConvertingBlockInputStream*, int&, std::__1::shared_ptr<DB::ThreadGroupStatus> > >, void ()>::operator()()  in /usr/bin/clickhouse
2020.01.31 00:00:06.532151 [ 118 ] {} <Fatal> BaseDaemon: 11. 0x4fa4657 ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>)  in /usr/bin/clickhouse
2020.01.31 00:00:06.532163 [ 118 ] {} <Fatal> BaseDaemon: 12. 0x4fa4c84 ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() const  in /usr/bin/clickhouse
2020.01.31 00:00:06.532173 [ 118 ] {} <Fatal> BaseDaemon: 13. 0x4fa3b77 ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)  in /usr/bin/clickhouse
2020.01.31 00:00:06.532194 [ 118 ] {} <Fatal> BaseDaemon: 14. 0x4fa212f ?  in /usr/bin/clickhouse
2020.01.31 00:00:06.532207 [ 118 ] {} <Fatal> BaseDaemon: 15. 0x7dd5 start_thread  in /usr/lib64/libpthread-2.17.so
2020.01.31 00:00:06.532218 [ 118 ] {} <Fatal> BaseDaemon: 16. 0xfdead clone  in /usr/lib64/libc-2.17.so

On a second server (CentOS 7.3.1611), ClickHouse was also upgraded from the Altinity package to the Yandex one, but this error does not occur there; that server does have different databases and queries, though.

@morozovsk morozovsk added the bug Confirmed user-visible misbehaviour in official release label Jan 31, 2020
@filimonov
Contributor

Can you check / share the query which caused that? Most probably you will be able to find it grepping logs for 3b7f94c7-7aae-4fe6-863e-5248cfe97cc3 (query_id from stacktrace).
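The lookup suggested above can be sketched in Python as an alternative to a plain grep. This is a minimal illustration, not part of ClickHouse: the helper names `extract_query_id` and `lines_for_query` are hypothetical, and the regular expression assumes the `(query_id: ...)` format seen in the BaseDaemon line of this report.

```python
import re

# Matches "(query_id: <uuid>)" as printed in the BaseDaemon fatal line
# quoted in this issue. The exact format is an assumption based on that line.
QUERY_ID_RE = re.compile(r"\(query_id: ([0-9a-f-]{36})\)")


def extract_query_id(fatal_line):
    """Return the query_id found in a fatal log line, or None if absent."""
    match = QUERY_ID_RE.search(fatal_line)
    return match.group(1) if match else None


def lines_for_query(log_lines, query_id):
    """Return every log line that mentions query_id anywhere in its text."""
    return [line for line in log_lines if query_id in line]
```

To use it, read the server log into a list of lines (the log location depends on the installation) and pass it to `lines_for_query` together with the id returned by `extract_query_id`. The query text only shows up if the log level was verbose enough to record it when the query ran.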

@filimonov filimonov added st-need-info We need extra data to continue (waiting for response) crash Crash / segfault / abort labels Jan 31, 2020
@morozovsk
Contributor Author

The query is not in the logs because the logging level was set to warning; I have changed it to trace.
I also ran systemctl enable clickhouse-server.

I got a new error at the same time of day as the previous ones. Previously ClickHouse simply crashed; now it loads the CPU to 100% and is unavailable.

2020.02.01 00:00:05.218606 [ 63 ] {9e24c969-e99c-44ef-911c-078f691fcb0c} <Error> executeQuery: Code: 173, e.displayText() = DB::ErrnoException: Allocator: Cannot realloc from 1.00 MiB to 0.00 B., errno: 0, strerror: Success (version 20.1.2.4 (official build)) (from 127.0.0.1:55718) (in query: -- Metabase SELECT `nginx`.`report_by_query`.`query` AS `query` FROM `nginx`.`report_by_query` GROUP BY `nginx`.`report_by_query`.`query` ORDER BY `nginx`.`report_by_query`.`query` ASC LIMIT 5000 FORMAT TabSeparatedWithNamesAndTypes;), Stack trace (when copying this message, always include the lines below):

0. 0xbc31d9c Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int)  in /usr/bin/clickhouse
1. 0x4f6ee17 DB::ErrnoException::ErrnoException(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, int, std::__1::optional<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const&)  in /usr/bin/clickhouse
2. 0x4968044 DB::throwFromErrno(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, int)  in /usr/bin/clickhouse
3. 0x4f919ac Allocator<false, false>::realloc(void*, unsigned long, unsigned long, unsigned long)  in /usr/bin/clickhouse
4. 0x5695aa4 DB::ColumnString::insertData(char const*, unsigned long)  in /usr/bin/clickhouse
5. 0x95c1263 void DB::Aggregator::convertToBlockImplFinal<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >, StringHashMap<char*, Allocator<true, true> > >(DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, StringHashMap<char*, Allocator<true, true> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&) const  in /usr/bin/clickhouse
6. 0x95e8b5b void DB::Aggregator::convertToBlockImpl<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >, StringHashMap<char*, Allocator<true, true> > >(DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, StringHashMap<char*, Allocator<true, true> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool) const  in /usr/bin/clickhouse
7. 0x95e8f66 DB::Block DB::Aggregator::prepareBlockAndFill<DB::Block DB::Aggregator::convertOneBucketToBlock<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> > >(DB::AggregatedDataVariants&, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, bool, unsigned long) const::'lambda'(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool)>(DB::AggregatedDataVariants&, bool, unsigned long, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&&) const  in /usr/bin/clickhouse
8. 0x95e94c5 DB::Block DB::Aggregator::convertOneBucketToBlock<DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> > >(DB::AggregatedDataVariants&, DB::AggregationMethodStringNoCache<TwoLevelStringHashMap<char*, Allocator<true, true>, StringHashMap> >&, bool, unsigned long) const  in /usr/bin/clickhouse
9. 0x9651341 DB::MergingAndConvertingBlockInputStream::thread(int, std::__1::shared_ptr<DB::ThreadGroupStatus>)  in /usr/bin/clickhouse
10. 0x9587e60 std::__1::__function::__func<std::__1::__bind<void (DB::MergingAndConvertingBlockInputStream::*)(int, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::MergingAndConvertingBlockInputStream*, int&, std::__1::shared_ptr<DB::ThreadGroupStatus> >, std::__1::allocator<std::__1::__bind<void (DB::MergingAndConvertingBlockInputStream::*)(int, std::__1::shared_ptr<DB::ThreadGroupStatus>), DB::MergingAndConvertingBlockInputStream*, int&, std::__1::shared_ptr<DB::ThreadGroupStatus> > >, void ()>::operator()()  in /usr/bin/clickhouse
11. 0x4fa4657 ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>)  in /usr/bin/clickhouse
12. 0x4fa4c84 ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()&&...)::'lambda'()::operator()() const  in /usr/bin/clickhouse
13. 0x4fa3b77 ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)  in /usr/bin/clickhouse
14. 0x4fa212f ?  in /usr/bin/clickhouse
15. 0x7dd5 start_thread  in /usr/lib64/libpthread-2.17.so
16. 0xfdead clone  in /usr/lib64/libc-2.17.so

The SQL query that triggers it:

SELECT nginx.report_by_query.query AS query FROM nginx.report_by_query GROUP BY nginx.report_by_query.query ORDER BY nginx.report_by_query.query ASC LIMIT 5000 FORMAT TabSeparatedWithNamesAndTypes;

Apparently Metabase rebuilds its internal indexes once a day and runs this query.
The table has 65 million rows and takes up about 1 GB.

@filimonov
Contributor

What about the table structure? SHOW CREATE TABLE nginx.report_by_query

@morozovsk
Contributor Author

morozovsk commented Feb 2, 2020

CREATE TABLE nginx.report_by_query 
(`timestamp` DateTime, `country` String, `os` String, `query` String, `n` UInt64) 
ENGINE = SummingMergeTree 
PARTITION BY toYYYYMM(timestamp) 
ORDER BY (timestamp, country, query) 
SETTINGS index_granularity = 8192

@morozovsk
Contributor Author

The problem may have been in the table structure: it contains the text field query but does not include it in "ORDER BY". This may cause a problem for the SummingMergeTree engine in newer ClickHouse versions.
I have fixed the table structure, and so far the error has not reproduced.

@morozovsk
Contributor Author

@romanitalian it looks like you posted in the wrong ticket; there was not a word about Kafka here.

@romanitalian

My apologies. I will delete all my messages, if you don't mind.
