MySQL Crash Recovery分析

作者: Huang Jinqiang 分类: MySQL 发布时间: 2018-04-10 22:15

某云服务器配置4核8G内存
操作系统版本: CentOS release 6.9 (Final)
数据库版本: mysql-5.6.39-linux-glibc2.12-x86_64

(1) free -m 可以看到内存使用情况
total used free shared buffers cached
Mem: 7870 7390 480 0 1 1234
-/+ buffers/cache: 6153 1716
Swap: 0 0 0
从内存使用情况可以看出,内存基本耗尽了

(2) error log日志显示
2018-04-10 17:33:38 8531 [Note] Plugin 'FEDERATED' is disabled.
2018-04-10 17:33:38 8531 [Note] InnoDB: Using atomics to ref count buffer pool pages
2018-04-10 17:33:38 8531 [Note] InnoDB: The InnoDB memory heap is disabled
2018-04-10 17:33:38 8531 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2018-04-10 17:33:38 8531 [Note] InnoDB: Memory barrier is not used
2018-04-10 17:33:38 8531 [Note] InnoDB: Compressed tables use zlib 1.2.3
2018-04-10 17:33:38 8531 [Note] InnoDB: Using Linux native AIO
2018-04-10 17:33:38 8531 [Note] InnoDB: Using CPU crc32 instructions
2018-04-10 17:33:38 8531 [Note] InnoDB: Initializing buffer pool, size = 5.0G
2018-04-10 17:33:38 8531 [Note] InnoDB: Completed initialization of buffer pool
2018-04-10 17:33:38 8531 [Note] InnoDB: Opened 3 undo tablespaces
2018-04-10 17:33:38 8531 [Note] InnoDB: Highest supported file format is Barracuda.
2018-04-10 17:33:38 8531 [Note] InnoDB: Log scan progressed past the checkpoint lsn 96827763326
2018-04-10 17:33:38 8531 [Note] InnoDB: Database was not shutdown normally!
2018-04-10 17:33:38 8531 [Note] InnoDB: Starting crash recovery.
2018-04-10 17:33:38 8531 [Note] InnoDB: Reading tablespace information from the .ibd files...
2018-04-10 17:33:38 8531 [Note] InnoDB: Restoring possible half-written data pages
2018-04-10 17:33:38 8531 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Doing recovery: scanned up to log sequence number 96827859103
InnoDB: Transaction 39736326 was in the XA prepared state.
InnoDB: Transaction 39736326 was in the XA prepared state.
InnoDB: 1 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 0 row operations to undo
InnoDB: Trx id counter is 39736832
2018-04-10 17:33:38 8531 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 470991980, file name mysql-bin.000064
2018-04-10 17:33:39 8531 [Note] InnoDB: 128 rollback segment(s) are active.
InnoDB: Starting in background the rollback of uncommitted transactions
2018-04-10 17:33:39 7f0721bff700 InnoDB: Rollback of non-prepared transactions completed
2018-04-10 17:33:39 8531 [Note] InnoDB: Waiting for purge to start
2018-04-10 17:33:39 8531 [Note] InnoDB: 5.6.39 started; log sequence number 96827859103
2018-04-10 17:33:39 8531 [Note] Recovering after a crash using /data/mysql/mysql3306/bin_logs/mysql-bin
2018-04-10 17:33:41 8531 [Note] Starting crash recovery...
2018-04-10 17:33:41 7f0898815780 InnoDB: Starting recovery for XA transactions...
2018-04-10 17:33:41 7f0898815780 InnoDB: Transaction 39736326 in prepared state after recovery
2018-04-10 17:33:41 7f0898815780 InnoDB: Transaction contains changes to 3 rows
2018-04-10 17:33:41 7f0898815780 InnoDB: 1 transactions in prepared state after recovery
2018-04-10 17:33:41 8531 [Note] Found 1 prepared transaction(s) in InnoDB
2018-04-10 17:33:41 8531 [Note] Crash recovery finished.
2018-04-10 17:33:41 8531 [Note] Server hostname (bind-address): '*'; port: 3306
2018-04-10 17:33:41 8531 [Note] IPv6 is not available.
2018-04-10 17:33:41 8531 [Note] - '0.0.0.0' resolves to '0.0.0.0';
2018-04-10 17:33:41 8531 [Note] Server socket created on IP: '0.0.0.0'.
2018-04-10 17:33:41 8531 [Warning] Recovery from master pos 39854021 and file mysql-bin.000060. Previous relay log pos and relay log file had been set to 39854184, /data/mysql/mysql3306/relay_logs/relay-bin.000328 respectively.
2018-04-10 17:33:41 8531 [Note] Event Scheduler: Loaded 0 events
2018-04-10 17:33:41 8531 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.6.39-log' socket: '/tmp/mysql3306.sock' port: 3306 MySQL Community Server (GPL)
2018-04-10 20:45:20 8531 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2018-04-10 20:45:20 8531 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000060' at position 39854021, relay log '/data/mysql/mysql3306/relay_logs/relay-bin.000329' position: 4
2018-04-10 20:45:20 8531 [Note] Slave I/O thread: connected to master 'repl@xxx:3306',replication started in log 'mysql-bin.000060' at position 39854021
从error log可以看出MySQL发生了Crash Recovery, 没有其他报错信息

(3) 在slow log里面可以看crash前后有慢日志
grep 'Time: 180410 17:33' slow.log
# Time: 180410 17:33:02
# Time: 180410 17:33:14
# Time: 180410 17:33:18
# Time: 180410 17:33:19
# Time: 180410 17:33:20
# Time: 180410 17:33:27
# Time: 180410 17:33:29
# Time: 180410 17:33:30
# Time: 180410 17:33:33
# Time: 180410 17:33:35
# Time: 180410 17:33:36
# Time: 180410 17:33:37
# Time: 180410 17:33:48
# Time: 180410 17:33:49
# Time: 180410 17:33:50
# Time: 180410 17:33:53
# Time: 180410 17:33:55

(4) /var/log/messages可以看当时发生OOM
Apr 10 17:33:37 db_slave2 kernel: [17459736.633116] mysqld invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
Apr 10 17:33:37 db_slave2 kernel: [17459736.634680] mysqld cpuset=/ mems_allowed=0
Apr 10 17:33:37 db_slave2 kernel: [17459736.635449] Pid: 31767, comm: mysqld Not tainted 2.6.32-696.10.2.el6.x86_64 #1
Apr 10 17:33:37 db_slave2 kernel: [17459736.636762] Call Trace:
Apr 10 17:33:37 db_slave2 kernel: [17459736.637220] [<ffffffff81131630>] ? dump_header+0x90/0x1b0
Apr 10 17:33:37 db_slave2 kernel: [17459736.638160] [<ffffffff8123c24c>] ? security_real_capable_noaudit+0x3c/0x70
Apr 10 17:33:37 db_slave2 kernel: [17459736.639338] [<ffffffff81131ab2>] ? oom_kill_process+0x82/0x2a0
Apr 10 17:33:37 db_slave2 kernel: [17459736.640315] [<ffffffff811319f1>] ? select_bad_process+0xe1/0x120
Apr 10 17:33:37 db_slave2 kernel: [17459736.641416] [<ffffffff81131ef0>] ? out_of_memory+0x220/0x3c0
Apr 10 17:33:37 db_slave2 kernel: [17459736.642408] [<ffffffff8113e8cc>] ? __alloc_pages_nodemask+0x93c/0x950
Apr 10 17:33:37 db_slave2 kernel: [17459736.643470] [<ffffffff81145634>] ? ____pagevec_lru_add+0x164/0x180
Apr 10 17:33:37 db_slave2 kernel: [17459736.644503] [<ffffffff81177bda>] ? alloc_pages_vma+0x9a/0x150
Apr 10 17:33:37 db_slave2 kernel: [17459736.645497] [<ffffffff81159f0d>] ? handle_pte_fault+0x71d/0xab0
Apr 10 17:33:37 db_slave2 kernel: [17459736.646578] [<ffffffff81056767>] ? pte_alloc_one+0x37/0x50
Apr 10 17:33:37 db_slave2 kernel: [17459736.647619] [<ffffffff81194109>] ? do_huge_pmd_anonymous_page+0xb9/0x3b0
Apr 10 17:33:37 db_slave2 kernel: [17459736.648869] [<ffffffff8115a54a>] ? handle_mm_fault+0x2aa/0x3f0
Apr 10 17:33:37 db_slave2 kernel: [17459736.649955] [<ffffffff81052111>] ? __do_page_fault+0x141/0x500
Apr 10 17:33:37 db_slave2 kernel: [17459736.651039] [<ffffffff8154a7ae>] ? schedule+0x3ee/0xb70
Apr 10 17:33:37 db_slave2 kernel: [17459736.666483] [<ffffffff812a86c8>] ? call_rwsem_wake+0x18/0x30
Apr 10 17:33:37 db_slave2 kernel: [17459736.668105] [<ffffffff81550e8e>] ? do_page_fault+0x3e/0xa0
Apr 10 17:33:37 db_slave2 kernel: [17459736.669707] [<ffffffff8154e185>] ? page_fault+0x25/0x30
Apr 10 17:33:37 db_slave2 kernel: [17459736.671262] Mem-Info:
Apr 10 17:33:37 db_slave2 kernel: [17459736.672340] Node 0 DMA per-cpu:
Apr 10 17:33:37 db_slave2 kernel: [17459736.673536] CPU 0: hi: 0, btch: 1 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.675010] CPU 1: hi: 0, btch: 1 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.676472] CPU 2: hi: 0, btch: 1 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.677944] CPU 3: hi: 0, btch: 1 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.679423] Node 0 DMA32 per-cpu:
Apr 10 17:33:37 db_slave2 kernel: [17459736.680665] CPU 0: hi: 186, btch: 31 usd: 62
Apr 10 17:33:37 db_slave2 kernel: [17459736.682148] CPU 1: hi: 186, btch: 31 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.683544] CPU 2: hi: 186, btch: 31 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.684926] CPU 3: hi: 186, btch: 31 usd: 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.686305] Node 0 Normal per-cpu:
Apr 10 17:33:37 db_slave2 kernel: [17459736.687494] CPU 0: hi: 186, btch: 31 usd: 60
Apr 10 17:33:37 db_slave2 kernel: [17459736.688850] CPU 1: hi: 186, btch: 31 usd: 55
Apr 10 17:33:37 db_slave2 kernel: [17459736.690253] CPU 2: hi: 186, btch: 31 usd: 68
Apr 10 17:33:37 db_slave2 kernel: [17459736.691620] CPU 3: hi: 186, btch: 31 usd: 52
Apr 10 17:33:37 db_slave2 kernel: [17459736.693050] active_anon:1952931 inactive_anon:99 isolated_anon:0
Apr 10 17:33:37 db_slave2 kernel: [17459736.693051] active_file:86 inactive_file:382 isolated_file:0
Apr 10 17:33:37 db_slave2 kernel: [17459736.693052] unevictable:0 dirty:0 writeback:0 unstable:0
Apr 10 17:33:37 db_slave2 kernel: [17459736.693052] free:25338 slab_reclaimable:3850 slab_unreclaimable:6781
Apr 10 17:33:37 db_slave2 kernel: [17459736.693053] mapped:153 shmem:137 pagetables:5394 bounce:0
Apr 10 17:33:37 db_slave2 kernel: [17459736.700885] Node 0 DMA free:15724kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15336kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr 10 17:33:37 db_slave2 kernel: [17459736.710935] lowmem_reserve[]: 0 3512 8057 8057
Apr 10 17:33:37 db_slave2 kernel: [17459736.712460] Node 0 DMA32 free:47492kB min:29404kB low:36752kB high:44104kB active_anon:3263628kB inactive_anon:0kB active_file:60kB inactive_file:124kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3596504kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:1564kB slab_unreclaimable:576kB kernel_stack:256kB pagetables:4872kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:85 all_unreclaimable? no
Apr 10 17:33:37 db_slave2 kernel: [17459736.723018] lowmem_reserve[]: 0 0 4545 4545
Apr 10 17:33:37 db_slave2 kernel: [17459736.724576] Node 0 Normal free:37808kB min:38052kB low:47564kB high:57076kB active_anon:4548604kB inactive_anon:436kB active_file:408kB inactive_file:1340kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:4654080kB mlocked:0kB dirty:0kB writeback:0kB mapped:692kB shmem:548kB slab_reclaimable:13596kB slab_unreclaimable:26580kB kernel_stack:4352kB pagetables:16916kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:341 all_unreclaimable? yes
Apr 10 17:33:37 db_slave2 kernel: [17459736.735436] lowmem_reserve[]: 0 0 0 0
Apr 10 17:33:37 db_slave2 kernel: [17459736.736835] Node 0 DMA: 3*4kB 2*8kB 1*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15724kB
Apr 10 17:33:37 db_slave2 kernel: [17459736.740180] Node 0 DMA32: 205*4kB 133*8kB 142*16kB 128*32kB 91*64kB 70*128kB 32*256kB 18*512kB 7*1024kB 0*2048kB 0*4096kB = 47612kB
Apr 10 17:33:37 db_slave2 kernel: [17459736.743667] Node 0 Normal: 1290*4kB 1119*8kB 553*16kB 232*32kB 80*64kB 10*128kB 2*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 37808kB
Apr 10 17:33:37 db_slave2 kernel: [17459736.747348] 589 total pagecache pages
Apr 10 17:33:37 db_slave2 kernel: [17459736.748641] 0 pages in swap cache
Apr 10 17:33:37 db_slave2 kernel: [17459736.749910] Swap cache stats: add 0, delete 0, find 0/0
Apr 10 17:33:37 db_slave2 kernel: [17459736.751414] Free swap = 0kB
Apr 10 17:33:37 db_slave2 kernel: [17459736.752617] Total swap = 0kB
Apr 10 17:33:37 db_slave2 kernel: [17459736.767626] 2097151 pages RAM
Apr 10 17:33:37 db_slave2 kernel: [17459736.768819] 82363 pages reserved
Apr 10 17:33:37 db_slave2 kernel: [17459736.770032] 2892 pages shared
Apr 10 17:33:37 db_slave2 kernel: [17459736.771216] 1983378 pages non-shared
Apr 10 17:33:37 db_slave2 kernel: [17459736.772481] [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
............................
Apr 10 17:33:37 db_slave2 kernel: [17459736.897551] Out of memory: Kill process 21822 (mysqld) score 964 or sacrifice child
Apr 10 17:33:37 db_slave2 kernel: [17459736.900107] Killed process 21822, UID 27, (mysqld) total-vm:9180284kB, anon-rss:7758064kB, file-rss:328kB

 

(5) 执行show global status可以看到
Created_tmp_disk_tables | 852
Created_tmp_tables | 23187
从crash之后可以看到,产生磁盘上的临时表和内存上的临时表次数较高
(6) MySQL部份参数如下
tmp_table_size = 64M
max_heap_table_size = 64M
innodb_buffer_pool_size = 5G

这个服务器目前主要是后台查数据使用,可能会查询一样统计分析类的SQL,产生内存临时表甚至产生磁盘级的临时表,最后导致内存不够用而触发OOM,导致MySQL Crash Recovery

发表评论

电子邮件地址不会被公开。 必填项已用*标注