Aurora MySQL crashing randomly

This is the third Aurora cluster I have had fail at random, taking my application down with it. The AWS support team has not responded to my support case.

Engine version: 8.0.mysql_aurora.3.01.0

/etc/rds/oscar-start-cmd: line 39:  2687 Killed                  /rdsdbbin/oscar/bin/mysqld --core-file --innodb_shared_buffer_pool_uses_huge_pages='1' "$@"
grover/runtime/overlay.cpp:2270: Assertion failed: err == 0   Stack trace:
	/rdsdbbin/oscar/bin/mysqld() [0x2be2f08]
	/rdsdbbin/oscar/bin/mysqld(_Z27log_grover_pid_from_page_nomm+0x1d) [0x2850bdd]
	    <inline> (in buf_page_t::set_grover_vol_pid(unsigned long, unsigned long) at /local/p4clients/pkgbuild-FRTaI/workspace/src/OscarMysql80/storage/innobase/include/ut0lock_free_hash.h:638)
	/rdsdbbin/oscar/bin/mysqld() [0x2597395] (in buf_page_init at /local/p4clients/pkgbuild-FRTaI/workspace/src/OscarMysql80/storage/innobase/buf/buf0buf.cc:6645)
	/rdsdbbin/oscar/bin/mysqld(_Z22buf_page_init_for_readP7dberr_tmRK9page_id_tRK11page_size_tm+0x2e0) [0x25a3cf0]
	/rdsdbbin/oscar/bin/mysqld(_Z17buf_read_page_lowP7dberr_tbmmRK9page_id_tRK11page_size_tbbb+0x91) [0x25c6c91]
	/rdsdbbin/oscar/bin/mysqld(_Z13buf_read_pageRK9page_id_tRK11page_size_tb+0x3c) [0x25c76bc]
	/rdsdbbin/oscar/bin/mysqld(_ZN9Buf_fetchI16Buf_fetch_normalE9read_pageEv+0x27) [0x2597ce7]
	/rdsdbbin/oscar/bin/mysqld(_ZN16Buf_fetch_normal3getERP11buf_block_t+0xb2) [0x259ed82]
	/rdsdbbin/oscar/bin/mysqld(_ZN9Buf_fetchI16Buf_fetch_normalE11single_pageEv+0x4e) [0x25a654e]
	/rdsdbbin/oscar/bin/mysqld(_Z16buf_page_get_genRK9page_id_tRK11page_size_tmP11buf_block_t10Page_fetchPKcmP5mtr_tb+0x1d9) [0x25a75a9]
	/rdsdbbin/oscar/bin/mysqld() [0x2637bc1]
	/rdsdbbin/oscar/bin/mysqld(_Z28fseg_alloc_free_page_generalPhjhmP5mtr_tS1_+0x1d0) [0x2639160]
	/rdsdbbin/oscar/bin/mysqld(_Z14btr_page_allocP12dict_index_tjhmP5mtr_tS2_+0xd5) [0x256ecc5]
	/rdsdbbin/oscar/bin/mysqld(_ZN3lob14alloc_lob_pageEP12dict_index_tP5mtr_tjb+0x216) [0x28bb676]
	/rdsdbbin/oscar/bin/mysqld(_ZN3lob12first_page_t5allocEP5mtr_tb+0x24) [0x28ab0c4]
	/rdsdbbin/oscar/bin/mysqld(_ZN3lob6insertEPNS_13InsertContextEP5trx_tRNS_5ref_tEP15big_rec_field_tm+0x14f) [0x28b78df]
	/rdsdbbin/oscar/bin/mysqld(_ZN3lob31btr_store_big_rec_extern_fieldsEP5trx_tP10btr_pcur_tPK5upd_tPmPK9big_rec_tP5mtr_tNS_6opcodeE+0xb16) [0x26edbb6]
	/rdsdbbin/oscar/bin/mysqld() [0x277331d]
	/rdsdbbin/oscar/bin/mysqld(_Z29row_ins_clust_index_entry_lowjmP12dict_index_tmP8dtuple_tP10btr_pcur_tP9que_thr_tb+0x646) [0x2774906]
	/rdsdbbin/oscar/bin/mysqld(_Z25row_ins_clust_index_entryP12dict_index_tP8dtuple_tP10btr_pcur_tP9que_thr_tb+0xe8) [0x277b158]
	/rdsdbbin/oscar/bin/mysqld(_Z12row_ins_stepP9que_thr_t+0x274) [0x277b7d4]
	/rdsdbbin/oscar/bin/mysqld() [0x278ca73]
	/rdsdbbin/oscar/bin/mysqld(_ZN11ha_innobase9write_rowEPh+0x226) [0x268fac6]
	/rdsdbbin/oscar/bin/mysqld(_ZN7handler12ha_write_rowEPh+0x177) [0x14a4867]
	/rdsdbbin/oscar/bin/mysqld(_Z12write_recordP3THDP5TABLEP9COPY_INFOS4_+0x5d4) [0x172e3d4]
	/rdsdbbin/oscar/bin/mysqld(_ZN21Sql_cmd_insert_values13execute_innerEP3THD+0xbaf) [0x173018f]
	/rdsdbbin/oscar/bin/mysqld(_ZN11Sql_cmd_dml7executeEP3THD+0x6cc) [0x119905c]
	/rdsdbbin/oscar/bin/mysqld(_Z30mysql_execute_command_internalP3THDb+0x1143) [0x1139f33]
	/rdsdbbin/oscar/bin/mysqld(_Z21mysql_execute_commandP3THDb+0x17b) [0x113d31b]
	/rdsdbbin/oscar/bin/mysqld(_Z20dispatch_sql_commandP3THDP12Parser_state+0x351) [0x113df91]
21:03:27 UTC - mysqld got signal 6 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x14652cf4e000
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 1465869fba9f thread_stack 0x40000
/rdsdbbin/oscar/bin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x2d) [0x246ac4d]
/rdsdbbin/oscar/bin/mysqld(handle_fatal_signal+0x532) [0x1310292]
/lib64/libpthread.so.0(+0x117df) [0x147cc9b707df]
/lib64/libc.so.6(gsignal+0x110) [0x147cc8ef3c20]
/lib64/libc.so.6(abort+0x147) [0x147cc8ef50c7]
/rdsdbbin/oscar/bin/mysqld() [0xf963d7]
/rdsdbbin/oscar/bin/mysqld() [0x2dba17a]
/rdsdbbin/oscar/bin/mysqld() [0x2dba333]
/rdsdbbin/oscar/bin/mysqld() [0x2be2f08]
/rdsdbbin/oscar/bin/mysqld(log_grover_pid_from_page_no(unsigned long, unsigned long)+0x1d) [0x2850bdd]
/rdsdbbin/oscar/bin/mysqld() [0x2597395]
/rdsdbbin/oscar/bin/mysqld(buf_page_init_for_read(dberr_t*, unsigned long, page_id_t const&, page_size_t const&, unsigned long)+0x2e0) [0x25a3cf0]
/rdsdbbin/oscar/bin/mysqld(buf_read_page_low(dberr_t*, bool, unsigned long, unsigned long, page_id_t const&, page_size_t const&, bool, bool, bool)+0x91) [0x25c6c91]
/rdsdbbin/oscar/bin/mysqld(buf_read_page(page_id_t const&, page_size_t const&, bool)+0x3c) [0x25c76bc]
/rdsdbbin/oscar/bin/mysqld(Buf_fetch<Buf_fetch_normal>::read_page()+0x27) [0x2597ce7]
/rdsdbbin/oscar/bin/mysqld(Buf_fetch_normal::get(buf_block_t*&)+0xb2) [0x259ed82]
/rdsdbbin/oscar/bin/mysqld(Buf_fetch<Buf_fetch_normal>::single_page()+0x4e) [0x25a654e]
/rdsdbbin/oscar/bin/mysqld(buf_page_get_gen(page_id_t const&, page_size_t const&, unsigned long, buf_block_t*, Page_fetch, char const*, unsigned long, mtr_t*, bool)+0x1d9) [0x25a75a9]
/rdsdbbin/oscar/bin/mysqld() [0x2637bc1]
/rdsdbbin/oscar/bin/mysqld(fseg_alloc_free_page_general(unsigned char*, unsigned int, unsigned char, unsigned long, mtr_t*, mtr_t*)+0x1d0) [0x2639160]
/rdsdbbin/oscar/bin/mysqld(btr_page_alloc(dict_index_t*, unsigned int, unsigned char, unsigned long, mtr_t*, mtr_t*)+0xd5) [0x256ecc5]
/rdsdbbin/oscar/bin/mysqld(lob::alloc_lob_page(dict_index_t*, mtr_t*, unsigned int, bool)+0x216) [0x28bb676]
/rdsdbbin/oscar/bin/mysqld(lob::first_page_t::alloc(mtr_t*, bool)+0x24) [0x28ab0c4]
/rdsdbbin/oscar/bin/mysqld(lob::insert(lob::InsertContext*, trx_t*, lob::ref_t&, big_rec_field_t*, unsigned long)+0x14f) [0x28b78df]
/rdsdbbin/oscar/bin/mysqld(lob::btr_store_big_rec_extern_fields(trx_t*, btr_pcur_t*, upd_t const*, unsigned long*, big_rec_t const*, mtr_t*, lob::opcode)+0xb16) [0x26edbb6]
/rdsdbbin/oscar/bin/mysqld() [0x277331d]
/rdsdbbin/oscar/bin/mysqld(row_ins_clust_index_entry_low(unsigned int, unsigned long, dict_index_t*, unsigned long, dtuple_t*, btr_pcur_t*, que_thr_t*, bool)+0x646) [0x2774906]
/rdsdbbin/oscar/bin/mysqld(row_ins_clust_index_entry(dict_index_t*, dtuple_t*, btr_pcur_t*, que_thr_t*, bool)+0xe8) [0x277b158]
/rdsdbbin/oscar/bin/mysqld(row_ins_step(que_thr_t*)+0x274) [0x277b7d4]
/rdsdbbin/oscar/bin/mysqld() [0x278ca73]
/rdsdbbin/oscar/bin/mysqld(ha_innobase::write_row(unsigned char*)+0x226) [0x268fac6]
/rdsdbbin/oscar/bin/mysqld(handler::ha_write_row(unsigned char*)+0x177) [0x14a4867]
/rdsdbbin/oscar/bin/mysqld(write_record(THD*, TABLE*, COPY_INFO*, COPY_INFO*)+0x5d4) [0x172e3d4]
/rdsdbbin/oscar/bin/mysqld(Sql_cmd_insert_values::execute_inner(THD*)+0xbaf) [0x173018f]
/rdsdbbin/oscar/bin/mysqld(Sql_cmd_dml::execute(THD*)+0x6cc) [0x119905c]
/rdsdbbin/oscar/bin/mysqld(mysql_execute_command_internal(THD*, bool)+0x1143) [0x1139f33]
/rdsdbbin/oscar/bin/mysqld(mysql_execute_command(THD*, bool)+0x17b) [0x113d31b]
/rdsdbbin/oscar/bin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x351) [0x113df91]
/rdsdbbin/oscar/bin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1b39) [0x113ff99]
/rdsdbbin/oscar/bin/mysqld(do_command(THD*)+0x1c6) [0x1140f46]
/rdsdbbin/oscar/bin/mysqld(THD_task::process_connection()+0x134) [0x12fcfc4]
/rdsdbbin/oscar/bin/mysqld(Thread_pool::worker_loop()+0x180) [0x12fbc80]
/rdsdbbin/oscar/bin/mysqld(Thread_pool::worker_launch(void*)+0x20) [0x12fbea0]
/rdsdbbin/oscar/bin/mysqld() [0x296c531]
/lib64/libpthread.so.0(+0x740a) [0x147cc9b6640a]
/lib64/libc.so.6(clone+0x3e) [0x147cc8fad09e]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (1479bc268028): [omitted]
Connection ID (thread ID): 45980
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
aurora backtrace compare flag : 1
Writing a core file
[...]
tobias
asked 2 years ago, 887 views
1 Answer

Hi,

This appears to be related to a known issue. If you need immediate engagement, I would suggest requesting a live call or chat, which will be faster. Alternatively, if you have the support case number, let me know and I can take a look at the case.

AWS
SUPPORT ENGINEER
Kevin_Z
answered 2 years ago
