disagg: too many requests make TiFlash compute node crash #9334
Labels
affects-7.5
This bug affects the 7.5.x(LTS) versions.
affects-8.1
This bug affects the 8.1.x(LTS) versions.
component/storage
type/enhancement
The issue or PR belongs to an enhancement.
Comments
/assign CalvinNeo
/severity critical
Changed to an enhancement because the crash is caused by a large number of requests creating too many threads. We will try to reduce the number of threads created for handling disaggregated requests.
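To illustrate the direction described in the comment above, here is a minimal sketch of bounding thread creation with a fixed set of workers and a task queue, so that a burst of requests queues up instead of spawning one thread per task. This is not the actual TiFlash change: `BoundedPool` and `runBurst` are hypothetical names invented for this example, and the sketch uses only the C++ standard library.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool (not a TiFlash class): exactly `num_workers`
// threads exist, no matter how many tasks are submitted.
class BoundedPool {
public:
    explicit BoundedPool(std::size_t num_workers) {
        for (std::size_t i = 0; i < num_workers; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Drains the remaining queued tasks, then joins all workers.
    ~BoundedPool() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto & t : workers_)
            t.join();
    }

    // Enqueues a task; never creates a new thread.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

    std::size_t threadCount() const { return workers_.size(); }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return; // only reachable when stopping and fully drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task(); // run outside the lock so other workers can dequeue
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};

// Submits `num_tasks` trivial tasks to a pool of `num_workers` threads and
// returns how many ran. The pool destructor drains the queue before returning.
int runBurst(std::size_t num_workers, int num_tasks) {
    std::atomic<int> done{0};
    {
        BoundedPool pool(num_workers);
        for (int i = 0; i < num_tasks; ++i)
            pool.submit([&done] { ++done; });
    }
    return done.load();
}
```

Calling `runBurst(4, 1000)` runs all 1000 tasks on 4 threads. The trade-off is that tasks wait in the queue under load, so a real implementation would likely also need a queue limit or back-pressure.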
This was referenced Nov 29, 2024
Bug Report
Please answer these questions before submitting your issue. Thanks!
1. Minimal reproduce step (Required)
1. Run ch.
2. Inject a network partition on one of the compute nodes (CN).
2. What did you expect to see? (Required)
No crash.
3. What did you see instead (Required)
A TiFlash compute node (CN) crashes after the network isolation is recovered.
Error log (stream: stdout, container: errorlog, pod: secondary-tc-tiflash-0, namespace: ha-test-disagg-tiflash-tps-7552417-1-58, time: 2024-08-19T17:34:59.20448412Z):
[2024/08/20 01:34:58.361 +08:00] [ERROR] [BaseDaemon.cpp:560] ["
  0x55a4c9778b9e  faultSignalHandler(int, siginfo_t*, void*) [tiflash+124169118]
      libs/libdaemon/src/BaseDaemon.cpp:211
  0x7fb214a5e6f0  [libc.so.6+255728]
  0x55a4c95a7d9a  DB::DM::SegmentReadTask::SegmentReadTask(std::__1::shared_ptrDB::Logger const&, DB::Context const&, std::__1::shared_ptrDB::DM::ScanContext const&, DB::DM::RemotePb::RemoteSegment const&, DB::DM::DisaggTaskId const&, unsigned long, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator> const&, unsigned int, long) [tiflash+122264986]
      /usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:884
  0x55a4cad9eb63  std::__1::__function::__func<DB::StorageDisaggregated::buildReadTaskForWriteNodeTable(DB::Context const&, std::__1::shared_ptrDB::DM::ScanContext const&, DB::DM::DisaggTaskId const&, unsigned long, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator> const&, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator> const&, std::__1::mutex&, std::__1::list<std::__1::shared_ptrDB::DM::SegmentReadTask, std::__1::allocator<std::__1::shared_ptrDB::DM::SegmentReadTask>>&)::$_0, std::__1::allocator<DB::StorageDisaggregated::buildReadTaskForWriteNodeTable(DB::Context const&, std::__1::shared_ptrDB::DM::ScanContext const&, DB::DM::DisaggTaskId const&, unsigned long, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator> const&, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator> const&, std::__1::mutex&, std::__1::list<std::__1::shared_ptrDB::DM::SegmentReadTask, std::__1::allocator<std::__1::shared_ptrDB::DM::SegmentReadTask>>&)::$_0>, void ()>::operator()() (.139ff689715caee4ff84ce0b2eee41ae) [tiflash+147393379]
      /usr/local/bin/../include/c++/v1/__memory/construct_at.h:41
  0x55a4c9a903b5  auto DB::wrapInvocable<std::__1::function<void ()>>(bool, std::__1::function<void ()>&&)::'lambda'()::operator()() [tiflash+127411125]
      /usr/local/bin/../include/c++/v1/__functional/function.h:517
  0x55a4c41e60c5  std::__1::packaged_task<void ()>::operator()() [tiflash+34439365]
      /usr/local/bin/../include/c++/v1/future:1891
  0x55a4c419e4d6  DB::DynamicThreadPool::executeTask(std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>&) [tiflash+34145494]
      dbms/src/Common/DynamicThreadPool.cpp:124
  0x55a4c419e973  DB::DynamicThreadPool::dynamicWork(std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>) [tiflash+34146675]
      dbms/src/Common/DynamicThreadPool.cpp:148
  0x55a4c419f3df  void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct>, std::__1::thread DB::ThreadFactory::newThread<void (DB::DynamicThreadPool::)(std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>), DB::DynamicThreadPool, std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>>(bool, std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator>, void (DB::DynamicThreadPool::&&)(std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>), DB::DynamicThreadPool&&, std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>&&)::'lambda'(auto&&...), DB::DynamicThreadPool*, std::__1::unique_ptr<DB::IExecutableTask, std::__1::default_deleteDB::IExecutableTask>>>(void*) [tiflash+34149343]
      /usr/local/bin/../include/c++/v1/__type_traits/invoke.h:308
  0x7fb214aa9c02  start_thread [libc.so.6+564226]
"] [source=BaseDaemon] [thread_id=30184]
4. What is your TiFlash version? (Required)
/tiflash/tiflash version
TiFlash
Release Version: v8.3.0-alpha
Edition: Community
Git Commit Hash: 14ed7c0
Git Branch: heads/refs/tags/v8.3.0-alpha
UTC Build Time: 2024-08-15 11:39:16
Enable Features: jemalloc sm4(GmSSL) mem-profiling avx2 avx512 unwind thinlto
Profile: RELWITHDEBINFO
Compiler: clang++ 17.0.6
Raft Proxy
Git Commit Hash: 4ebe44d321d4c738d89bc145d418b1d6f3464862
Git Commit Branch: HEAD
UTC Build Time: ""
Rust Version: rustc 1.77.0-nightly (89e2160c4 2023-12-27)
Storage Engine: tiflash
Prometheus Prefix: tiflash_proxy_
Profile: release
Enable Features: external-je