Troubleshooting an Abnormal Process Exit

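After the machine was relocated, child processes of a PHP multi-process program that had always been stable suddenly started exiting abnormally. The exits were not very frequent, and the process logs showed nothing that would explain them, so the problem was rather puzzling. That kicked off a round of troubleshooting.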

First, check the kernel log with dmesg. Scrolling to the end turns up errors like the ones below, so the processes really are crashing.

[4791991.998535] php[16776]: segfault at 7f6443ee18c8 ip 00007f6443ee18c8 sp 00007fff4d4ba818 error 15 in libc-2.17.so[7f6443ee1000+2000]
[4792165.192628] php[609]: segfault at 0 ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
[4792423.164949] traps: php[2337] general protection ip:75919d sp:7fff0c6e0578 error:0 in php[400000+94b000]
[4793914.900298] traps: php[589] general protection ip:7576b6 sp:7fff0c6e0460 error:0 in php[400000+94b000]
[4794155.124685] php[25418]: segfault at 35007265746c ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
[4794677.119847] traps: php[2314] general protection ip:75959b sp:7fff4d4ba840 error:0 in php[400000+94b000]
[4795121.747090] php[4642]: segfault at 0 ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
[4795666.787427] php[2372]: segfault at 40 ip 000000000075958c sp 00007fff0c6e0500 error 4 in php[400000+94b000]
[4796212.001686] php[6224]: segfault at 10 ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
[4796224.510583] traps: php[6156] general protection ip:75919d sp:7fff0c6e0578 error:0 in php[400000+94b000]
[4796337.623455] php[562]: segfault at 247ec40 ip 000000000247ec40 sp 00007fff0c6e04d8 error 15
[4796427.436886] php[1711]: segfault at ffffffffffffffff ip 00000000007576b6 sp 00007fff0c6e0460 error 5 in php[400000+94b000]
[4796554.025960] php[6662]: segfault at 6b6f01000040 ip 000000000075958c sp 00007fff0c6e0500 error 4 in php[400000+94b000]
[4797141.552356] php[6658]: segfault at 18 ip 0000000000758daf sp 00007fff0c6e04d0 error 4 in php[400000+94b000]
[4797495.302089] php[7239]: segfault at 110 ip 00000000007576d2 sp 00007fff0c6e0460 error 4 in php[400000+94b000]
[4797867.446166] php[8265]: segfault at 247e730 ip 000000000247e730 sp 00007fff0c6e04d8 error 15
[4798245.596106] php[8223]: segfault at 247ef40 ip 000000000247ef40 sp 00007fff0c6e04d8 error 15
[4798514.326132] traps: php[8152] general protection ip:75919d sp:7fff0c6e0578 error:0 in php[400000+94b000]
[4798769.904337] traps: php[7255] general protection ip:7576d2 sp:7fff0c6e0460 error:0 in php[400000+94b000]
[4799427.934198] php[2297]: segfault at 17b57d0 ip 00000000017b57d0 sp 00007fff4d4ba838 error 15
[4800091.548467] php[9826]: segfault at 247ed10 ip 000000000247ed10 sp 00007fff0c6e04d8 error 15
[4800607.342570] php[11239]: segfault at 100000007 ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
[4800806.439680] php[9796]: segfault at 247ec90 ip 000000000247ec90 sp 00007fff0c6e04d8 error 15
[4801110.909591] php[8317]: segfault at 247ed20 ip 000000000247ed20 sp 00007fff0c6e04d8 error 15
[4801417.477197] php[9326]: segfault at 0 ip 00000000007576d2 sp 00007fff0c6e0460 error 4 in php[400000+94b000]

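Luck was on my side: there is already a lead. Messages like the ones above are usually caused by invalid memory accesses, such as reading or jumping outside the bounds of valid memory. Let's pick one entry and take a closer look.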
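Before digging into a single entry, it can help to decode the fields of a segfault line mechanically. The following is only a minimal sketch, not part of the original investigation. It assumes the x86 kernel message format seen above, in which (to my understanding) the `error` field is printed in hexadecimal and its low bits follow the x86 page-fault error code layout: bit 0 for protection violation vs. page not present, bit 1 for write, bit 2 for user mode, bit 4 for instruction fetch. The names `SEGFAULT_RE`, `decode_error`, and `parse_segfault` are made up for illustration.

```python
import re

# Matches lines such as:
#   php[609]: segfault at 0 ip 000000000075919d sp 00007fff0c6e0578 error 4 in php[400000+94b000]
# The trailing "in object[base+size]" part is optional; some entries above lack it.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(code):
    """Interpret the low bits of an x86 page-fault error code (assumed layout)."""
    if code & 0x10:
        access = "instruction fetch"
    elif code & 0x2:
        access = "write"
    else:
        access = "read"
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": access,
        "mode": "user" if code & 0x4 else "kernel",
    }


def parse_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "addr": int(m["addr"], 16),          # faulting address
        "ip": int(m["ip"], 16),              # instruction pointer at the fault
        "error": decode_error(int(m["err"], 16)),  # error field assumed to be hex
        "object": m["obj"],                  # mapped file, if the kernel reported one
        "offset": None,                      # ip relative to the mapping base
    }
    if m["base"] is not None:
        info["offset"] = info["ip"] - int(m["base"], 16)
    return info


if __name__ == "__main__":
    sample = ("[4792165.192628] php[609]: segfault at 0 ip 000000000075919d "
              "sp 00007fff0c6e0578 error 4 in php[400000+94b000]")
    print(parse_segfault(sample))
```

Read this way, the `error 4` entries would be user-mode reads of an unmapped page (several at or near address 0, which looks like a null or garbage pointer dereference), while the `error 15` entries, where the faulting address equals `ip`, would be attempts to execute code at an address that is not executable. Treat that interpretation as a hypothesis to verify against the kernel version actually in use.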