Date: Thu, 05 Sep 2002 21:41:05 +0100
From: Duncan Barclay <dmlb@dmlb.org>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: marcel@xclint.net, dmlb@dmlb.org
Subject: kern/42457: Hack to allow Linux Matlab to exit
Message-ID: <E17n3R3-0000Ms-00@slave.my.domain>
>Number: 42457
>Category: kern
>Synopsis: Hack to allow Linux Matlab to exit
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Sep 05 13:50:01 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator: Duncan Barclay
>Release: FreeBSD 4.6-PRERELEASE i386
>Organization:
>Environment:
System: FreeBSD slave.my.domain 4.6-PRERELEASE FreeBSD 4.6-PRERELEASE #2: Thu Sep 5 21:11:18 BST 2002 dmlb@slave.my.domain:/usr/src-CVSup/sys/compile/SLAVE i386
>Description:
Linux Matlab versions 6 and 6.1, and possibly 6.5, are known to hang
on exit when the Matlab Java VM is in use. A kill -9 is required to
terminate the process.
When using its JVM, Matlab creates a number of threads:
    matlab
    matlab thread #1
    matlab thread #1.1
    matlab thread #1.2
    matlab thread #1.3
On exit, threads #1.1, #1.2 and #1.3 die gracefully and are reaped by
thread #1. However, thread #1 itself is not reaped correctly: matlab
apparently issues a
    linux_wait4(-1, &foo, 0, 0)
With no options set, this call reaps processes, not threads.
Thread #1 is created with
    linux_clone(0xf00, *bar())
The low byte of the flags value 0xf00 is zero, so the mask specifies a
thread that does not want its parent to be sent a signal when it dies.
From Linux clone(2):
    The low byte of flags contains the number of the signal sent
    to the parent when the child dies. If this signal is specified
    as anything other than SIGCHLD, then the parent process must
    specify the __WALL or __WCLONE options when waiting for the
    child with wait(2). If no signal is specified, then the
    parent process is not signaled when the child terminates.
[note the last sentence]
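To make the man page excerpt concrete, the following is a minimal sketch that splits a clone() flags word into its exit signal (the low byte) and its sharing bits. The numeric values are the ones used by the Linux headers; the LINUX_* macro names and the helper functions are illustrative names chosen for this sketch, not identifiers from the FreeBSD linuxulator.

```c
#include <assert.h>
#include <stdio.h>

/* Values as in the Linux headers; the LINUX_* names are illustrative. */
#define LINUX_CSIGNAL        0x000000ffUL  /* low byte: exit signal number */
#define LINUX_CLONE_VM       0x00000100UL  /* share address space */
#define LINUX_CLONE_FS       0x00000200UL  /* share filesystem information */
#define LINUX_CLONE_FILES    0x00000400UL  /* share file descriptor table */
#define LINUX_CLONE_SIGHAND  0x00000800UL  /* share signal handlers */

/* Exit signal encoded in a clone() flags word; 0 means "no signal". */
unsigned long clone_exit_signal(unsigned long flags)
{
    return flags & LINUX_CSIGNAL;
}

/* The flags word with the exit signal byte masked off. */
unsigned long clone_sharing_bits(unsigned long flags)
{
    return flags & ~LINUX_CSIGNAL;
}

/* Print a one-line breakdown of a flags word. */
void print_clone_flags(unsigned long flags)
{
    printf("flags=%#lx exit_signal=%lu vm=%d fs=%d files=%d sighand=%d\n",
        flags, clone_exit_signal(flags),
        (flags & LINUX_CLONE_VM) != 0, (flags & LINUX_CLONE_FS) != 0,
        (flags & LINUX_CLONE_FILES) != 0, (flags & LINUX_CLONE_SIGHAND) != 0);
}

/* Check the value reported for thread #1 in this PR. */
void check_matlab_clone_flags(void)
{
    unsigned long flags = 0xf00UL;

    assert(clone_exit_signal(flags) == 0);  /* parent is not to be signalled */
    assert(clone_sharing_bits(flags) ==
        (LINUX_CLONE_VM | LINUX_CLONE_FS | LINUX_CLONE_FILES |
         LINUX_CLONE_SIGHAND));
}
```

Calling print_clone_flags(0xf00) from a small main() shows all four sharing bits set and an exit signal of 0, which is the "no signal is specified" case from the man page.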
FreeBSD always sends a signal to the parent when terminating
a process, from /sys/kern/kern_exit.c:exit1()
    if (p->p_sigparent && p->p_pptr != initproc) {
        psignal(p->p_pptr, p->p_sigparent);
    } else {
        psignal(p->p_pptr, SIGCHLD);
    }
FreeBSD therefore sends matlab a SIGCHLD. Matlab has a SIGCHLD handler
that issues the above wait4. This is shown in the following ktrace
output with matlab pid = 6255, and thread #1 pid = 6304.
6304 matlab CALL linux_kill(0x186f,0x20)
6255 matlab PSIG SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
6304 matlab RET linux_kill 0
6304 matlab CALL exit(0)
6255 matlab RET linux_rt_sigsuspend -1 errno 4 Interrupted system call
6255 matlab PSIG SIGCHLD caught handler=0x28c97460 mask=0x80000000 code=0x0
6255 matlab CALL linux_wait4(0xffffffff,0xbfbfa1b0,0,0)
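ktrace prints the system call arguments as raw hexadecimal words. As a reading aid, here is a small sketch that converts them back to the values discussed in the text. The helper names are hypothetical; the __WCLONE value 0x80000000 is the one from the Linux headers.

```c
#include <assert.h>
#include <stdint.h>

/* Linux wait option from the Linux headers; the macro name is illustrative. */
#define LINUX_WCLONE 0x80000000u  /* __WCLONE: wait for clone children only */

/* Reinterpret a 32-bit argument word printed by ktrace as a signed int. */
int32_t arg_as_signed(uint32_t raw)
{
    int64_t wide = raw;

    if (wide > INT32_MAX)
        wide -= 4294967296LL;  /* undo two's complement wrap-around */
    return (int32_t)wide;
}

/* Nonzero if the wait4() options word asks for clone children. */
int wait_asks_for_clone_children(uint32_t options)
{
    return (options & LINUX_WCLONE) != 0;
}

/* Check the argument values that appear in the trace above. */
void check_first_trace(void)
{
    assert(arg_as_signed(0x186fu) == 6255);      /* linux_kill target: matlab */
    assert(arg_as_signed(0x20u) == 32);          /* signal number passed to kill */
    assert(arg_as_signed(0xffffffffu) == -1);    /* wait4 pid: any child */
    assert(!wait_asks_for_clone_children(0));    /* options 0: no __WCLONE */
}
```

Read this way, the trace shows thread #1 (pid 6304) signalling matlab (pid 6255) and exiting, after which the SIGCHLD handler waits for any child with no __WCLONE option.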
If the above code in kern_exit.c is replaced with
    if (p->p_sigparent && p->p_pptr != initproc) {
        psignal(p->p_pptr, p->p_sigparent);
    } else if (p->p_sigparent != 0) {
        psignal(p->p_pptr, SIGCHLD);
    }
so that no SIGCHLD is sent when p_sigparent is zero, then matlab reaps
the thread. This is shown in the following ktrace output with
matlab pid = 808, and thread #1 pid = 857.
857 matlab CALL linux_kill(0x328,0x20)
808 matlab PSIG SIG(null) caught handler=0x28c96e10 mask=0x80000000 code=0x0
857 matlab RET linux_kill 0
857 matlab CALL exit(0)
808 matlab RET linux_rt_sigsuspend -1 errno 4 Interrupted system call
808 matlab CALL linux_sigreturn(0xbfbfa928)
808 matlab RET linux_sigreturn JUSTRETURN
808 matlab CALL linux_wait4(0x359,0,0x80000000,0)
808 matlab RET linux_wait4 857/0x359
808 matlab CALL munmap(0x2d75d000,0x1000)
808 matlab RET munmap 0
808 matlab CALL exit(0)
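The difference between the original and the modified exit1() fragment can be tabulated with a small userland model. This is only a sketch under stated assumptions: sigparent stands in for p->p_sigparent, parent_is_init stands in for the test p->p_pptr == initproc, and a return value of 0 means "no signal is posted". Both function names are hypothetical, and nothing here is kernel code.

```c
#include <assert.h>
#include <signal.h>

/* Signal posted to the parent by the original fragment. */
int exit_signal_original(int sigparent, int parent_is_init)
{
    if (sigparent && !parent_is_init)
        return sigparent;
    return SIGCHLD;
}

/* Signal posted to the parent by the modified fragment (0 = none). */
int exit_signal_patched(int sigparent, int parent_is_init)
{
    if (sigparent && !parent_is_init)
        return sigparent;
    else if (sigparent != 0)
        return SIGCHLD;
    return 0;
}

/* Walk through the cases that matter for this PR. */
void check_exit_signal_model(void)
{
    /* Thread #1: cloned with no exit signal, parent is matlab. */
    assert(exit_signal_original(0, 0) == SIGCHLD);
    assert(exit_signal_patched(0, 0) == 0);

    /* Ordinary child with SIGCHLD as its exit signal: unchanged. */
    assert(exit_signal_original(SIGCHLD, 0) == SIGCHLD);
    assert(exit_signal_patched(SIGCHLD, 0) == SIGCHLD);

    /* Child with a custom exit signal whose parent is init: unchanged. */
    assert(exit_signal_original(SIGUSR1, 1) == SIGCHLD);
    assert(exit_signal_patched(SIGUSR1, 1) == SIGCHLD);

    /* No exit signal and parent is init: the patch also drops SIGCHLD. */
    assert(exit_signal_original(0, 1) == SIGCHLD);
    assert(exit_signal_patched(0, 1) == 0);
}
```

The last case is worth noting: in this model the modified fragment posts no signal even when the parent is init, which is one way the change alters exit signalling beyond the Matlab case.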
>How-To-Repeat:
Run Linux Matlab with its JVM in use and type "exit" at the prompt.
>Fix:
The snippet of code above is suggested as a change to kern_exit.c,
but it is probably dangerous as it stands because it changes exit
signalling behaviour.
The maintainers of kern_exit.c and the linuxulator are requested to
implement a more robust solution.
>Release-Note:
>Audit-Trail:
>Unformatted:
