Date: Mon, 26 Jan 2015 20:32:22 +0100
From: Mark Martinec <Mark.Martinec+freebsd@ijs.si>
To: freebsd-current@freebsd.org, perl@freebsd.org
Subject: Memory corruption in a master perl process after child exits - only under FreeBSD 10.0 amd64 (not in 10.1 or 9.*)
Message-ID: <1ac9f02be1360da3969ddb9501d0375a@mailbox.ijs.si>

There has been a problem report in the Perl bug tracker since July 2014
which seems to affect only FreeBSD 10.0 amd64 (regardless of the version
of Perl, and regardless of whether clang or gcc is used as the compiler):

  https://rt.perl.org/Ticket/Display.html?id=122199

I wonder if someone intimately familiar with the handling of virtual
memory, fork, swap, and process exit / wait(2) under FreeBSD would be
able to recognize what changed in these areas between 9.2 -> 10.0 and
10.0 -> 10.1, such that only 10.0 misbehaves while 10.1 apparently
fixed the problem again.

Below is my short summary of the issue (it is the last comment in the
referenced problem report). Further details are in that PR.

It's been a real mystery, difficult to reproduce, but definitely there.
It might be a Perl bug, but it looks ever more likely that it is a
FreeBSD issue.

  Mark


After upgrading to FreeBSD 10.1 (from 10.0) and running the same
application with the same version of Perl for two months now, with the
master process periodically retiring child processes and spawning
replacements just as it did previously under FreeBSD 9.x, I can now
confirm that the problem no longer occurs.

I can also confirm that the problem under 10.0 can be avoided by not
letting child processes exit voluntarily, so that the master process
never sees a child termination in wait() and never needs to spawn
(fork) another child process.

A brief summary of the problem:

Setup: an application consisting of a master perl process spawning
worker child processes, which periodically self-terminate voluntarily,
to be replaced by a fresh child process forked from the master process.

Environment:
- occurs only on FreeBSD 10.0 amd64, with any recent version of perl,
  built with gcc or clang;
- does not occur on FreeBSD 9.x or 10.1, does not occur on i386,
  and is not reproducible on Linux.

What seems to be happening:
- a child process, after doing some work (possibly touching swap),
  does a normal exit;
- the parent process gets a SIGCHLD signal, handles it with a wait(),
  and for some obscure reason some of its memory gets corrupted;
- the parent process forks, creating a new worker child process, which
  inherits the corrupted sections of the parent's memory. This later
  leads to a crash of the child if it happens to use that part of the
  memory (opcodes or data structures) during its normal work.
  Every subsequently forked child process inherits the same memory
  corruption and crashes in the same way.

So it seems the problem is somehow connected with how FreeBSD 10.0
on amd64 manages virtual memory (fork, exit, wait, possibly involving
swap). The problem is apparently fixed in 10.1, and is not present
in 9.x.

Does anybody have a sound explanation?
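
For anyone who wants to probe the pattern described above, here is a
minimal sketch of the fork / voluntary exit / wait / re-fork cycle. It is
not the affected application and is not taken from the bug report: it is
written in Python rather than Perl, it uses a blocking waitpid() instead
of a SIGCHLD handler, and the names run_master, payload and reference are
hypothetical. The master keeps a buffer with a known digest, re-checks it
after every reaped child, and each child checks the copy it inherited and
reports a mismatch through its exit status. There is no guarantee that
this small a test triggers the problem, which was hard to reproduce even
with the real workload.

```python
import hashlib
import os


def run_master(generations=5, workers=2, payload_size=1 << 16):
    """Fork workers, reap them as they exit, and re-check the master's memory.

    Returns (reaped, corrupted): the number of children reaped and the
    number of digest mismatches seen by the master or reported by a child.
    """
    # Hypothetical stand-in for the master's long-lived data structures.
    payload = bytearray(os.urandom(payload_size))
    reference = hashlib.sha256(payload).hexdigest()

    def spawn():
        pid = os.fork()
        if pid == 0:
            # Child: verify the memory inherited from the master, do a
            # little work that touches memory, then exit voluntarily.
            ok = hashlib.sha256(payload).hexdigest() == reference
            scratch = bytearray(payload)
            scratch.reverse()
            os._exit(0 if ok else 1)
        return pid

    total = generations * workers
    live = {spawn() for _ in range(workers)}
    spawned = workers
    reaped = 0
    corrupted = 0

    while live:
        # Block until any child terminates (the wait() step in the summary).
        pid, status = os.waitpid(-1, 0)
        live.discard(pid)
        reaped += 1

        # A child that crashed or saw a bad digest inherited corrupt memory.
        if not os.WIFEXITED(status) or os.WEXITSTATUS(status) != 0:
            corrupted += 1
        # Re-check the master's own copy right after reaping.
        if hashlib.sha256(payload).hexdigest() != reference:
            corrupted += 1

        # Replace the retired worker with a freshly forked one.
        if spawned < total:
            live.add(spawn())
            spawned += 1

    return reaped, corrupted


if __name__ == "__main__":
    print(run_master())
```

On a system without the problem the second value returned should stay at
zero. To get closer to the reported conditions one would presumably need
a much larger payload, many more generations, and enough memory pressure
to involve swap; those parameters are guesses, not findings.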