Date:      Mon, 27 Mar 2017 19:18:33 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        "K. Macy" <kmacy@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: Help needed to identify golang fork / memory corruption issue on FreeBSD
Message-ID:  <20170327161833.GL43712@kib.kiev.ua>
In-Reply-To: <5ba92447-945e-6fea-ad4f-f58ac2a0012e@multiplay.co.uk>
References:  <20161206125919.GQ54029@kib.kiev.ua> <8b502580-4d2d-1e1f-9e05-61d46d5ac3b1@multiplay.co.uk> <20161206143532.GR54029@kib.kiev.ua> <e160381c-9935-6edf-04a9-1ff78e95d818@multiplay.co.uk> <CAHM0Q_Mg662u9D0KJ9knEWWqi9Ydy38qKDnjLt6XaS0ks%2B9-iw@mail.gmail.com> <18b40a69-4460-faf2-c0ce-7491eca92782@multiplay.co.uk> <20170317082333.GP16105@kib.kiev.ua> <180a601b-5481-bb41-f7fc-67976aabe451@multiplay.co.uk> <20170317124437.GR16105@kib.kiev.ua> <5ba92447-945e-6fea-ad4f-f58ac2a0012e@multiplay.co.uk>

On Mon, Mar 27, 2017 at 12:47:11PM +0100, Steven Hartland wrote:
> OK, now that the similar but unrelated issue with signal stacks is
> solved, I've moved back to the initial issue.
> 
> I've made some progress with a reproduction case as detailed here:
> https://github.com/golang/go/issues/15658#issuecomment-288747812
> 
> In short, it seems that having a running child while the parent runs GC
> is somehow responsible for memory corruption in the parent.
> 
> The reason I believe this is that if I run the same GC in the parent
> after the child exits, instead of while it's running, I've been unable
> to reproduce the issue.
> 
> As the memory segments are COW, the issue might be in the VM subsystem.
Well, it might be, but that would be a strange corruption mode.

> 
> In order to confirm / deny this I was wondering if there was a way to 
> force a full copy of all segments for the child instead of using the COW 
> optimisation.
No, there is not. By design, copying only occurs on faults, when the VM
detects that the map entry needs copying. Doing the actual copy at fork
time would require writing a lot of new code.
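
To illustrate the copy-on-fault behaviour described above, here is a
minimal userland sketch in C. It is not part of the original reply, and
the function name cow_isolation_holds() is made up for illustration. The
parent fills one page, forks, and the child overwrites its view of that
page; if COW behaves as designed, the parent still sees its own data
after the child exits.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the thread.
 * Returns 1 if the parent's page is unchanged after the child wrote to
 * its own copy, 0 if the parent observed the child's write, and -1 if
 * fork/wait/malloc failed.
 */
int cow_isolation_holds(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    assert(pagesize > 0);
    size_t len = (size_t)pagesize;

    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'A', len);      /* touch the page before fork */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: the first write faults and gets a private copy. */
        memset(buf, 'B', len);
        _exit(buf[0] == 'B' ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }

    /* Parent: its view must still contain only 'A'. */
    int ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        buf[0] == 'A' && buf[len - 1] == 'A';
    free(buf);
    return ok;
}
```

Calling cow_isolation_holds() from a small main() and checking for a
return value of 1 is enough to exercise it. A sketch like this only
checks the basic isolation guarantee; it does not reproduce the
GC-while-child-runs scenario from the Go issue.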

Does Go have a FreeBSD/i386 port? If yes, is the issue reproducible there?

Another blind experiment to try is to comment out the call to
vm_object_collapse() in sys/vm/vm_map.c:vm_map_copy_entry() and see if
it changes anything.

What could be quite interesting is to look at the parent and possibly
child address map after the error occurred, using procstat -v. At
least for the parent, this should be relatively easy to set up: just
make the Go runtime spin or pause on panic, instead of exiting, and then
use procstat.
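
The "pause instead of exiting" idea can be sketched in C as follows.
This is only an analogy for what would have to be done inside the Go
runtime's panic path; hold_for_inspection() and format_procstat_hint()
are hypothetical names, not existing functions in Go or FreeBSD. The
process prints its pid and then blocks, so its address map stays
available to procstat -v.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes "procstat -v <pid>" into buf and returns
 * the length snprintf() reports.
 */
int format_procstat_hint(char *buf, size_t len, pid_t pid)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "procstat -v %ld", (long)pid);
}

/*
 * Hypothetical fatal-error hook: instead of exiting, report the pid
 * and block forever so the address map can be examined from another
 * terminal. This function never returns.
 */
void hold_for_inspection(const char *reason)
{
    char hint[64];

    format_procstat_hint(hint, sizeof(hint), getpid());
    fprintf(stderr, "fatal: %s\nprocess held; inspect with: %s\n",
        reason, hint);
    for (;;)
        pause();    /* returns only after a handled signal; wait again */
}
```

A program would call hold_for_inspection("memory corruption detected")
at the point where it would otherwise abort, and the held process can be
killed once the map has been captured.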

> 
> Is this something that would be relatively easy to hack into the kernel?
> If so, pointers would be appreciated.

BTW, I looked some more at the Go code, and I noted that the
runtime<stupid UTF-8 char>mmap() implementation looks very strange.
It ignores the %rflags.C bit, which identifies an error, and instead
callers of mmap() compare the return value with 4096, assuming
Linux-style error reporting.  This would certainly break if the mmap(2)
syscall returns ERESTART one day.
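
The difference between the two ways of detecting a failed syscall can
be modelled in plain C. This is a sketch under assumptions, not the Go
runtime's code: the struct and function names are invented, the
threshold check is assumed to have the form "value < 4096", and the
value -1 merely stands in for a kernel-internal code such as ERESTART.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sys_result {
    bool failed;        /* did the decoder classify this as an error? */
    int err;            /* error code, valid only if failed */
    uintptr_t value;    /* return value, valid only if !failed */
};

/* Carry-flag convention: carry set means rax holds the error code. */
struct sys_result decode_carry(uintptr_t rax, bool carry)
{
    struct sys_result r = { false, 0, rax };

    /* "Error 0" is not used by the convention. */
    assert(!carry || rax != 0);
    if (carry) {
        r.failed = true;
        r.err = (int)(intptr_t)rax;
        r.value = 0;
    }
    return r;
}

/* Threshold heuristic (assumed form): ignore carry, test rax < 4096. */
struct sys_result decode_threshold(uintptr_t rax)
{
    struct sys_result r = { false, 0, rax };

    if (rax < 4096) {
        r.failed = true;
        r.err = (int)rax;
        r.value = 0;
    }
    return r;
}

/* True when both decoders classify the same raw return identically. */
bool decoders_agree(uintptr_t rax, bool carry)
{
    return decode_carry(rax, carry).failed ==
        decode_threshold(rax).failed;
}
```

With an ordinary errno such as 12 and the carry flag set, both decoders
report a failure. With carry set and a raw value of (uintptr_t)-1, the
carry-based decoder reports a failure while the threshold heuristic
treats the value as a valid pointer, which is the kind of breakage
described above.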


