Date: Wed, 5 Jan 2011 23:51:44 GMT From: Hugh Blandford <hugh@island.net.au> To: freebsd-gnats-submit@FreeBSD.org Subject: ports/153715: FreeRADIUS exiting with Signal 11 on FreeBSD Message-ID: <201101052351.p05NpiSk005362@red.freebsd.org> Resent-Message-ID: <201101060000.p0600KRI051925@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
>Number: 153715 >Category: ports >Synopsis: FreeRADIUS exiting with Signal 11 on FreeBSD >Confidential: no >Severity: serious >Priority: low >Responsible: freebsd-ports-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Thu Jan 06 00:00:20 UTC 2011 >Closed-Date: >Last-Modified: >Originator: Hugh Blandford >Release: 8.1 >Organization: Island Internet >Environment: FreeBSD 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #0: Thu Oct 7 12:09:14 EST 2010 hugh@island.net.au:/usr/obj/usr/src/sys/GENERIC amd64 >Description: A ktrace of the problem: /var/log/radius.log: Sat Dec 18 02:43:43 2010 : Info: Exiting normally. /var/log/messages: Dec 18 02:43:43 radius kernel: pid 655 (radiusd), uid 133: exited on signal 11 $ kdump | tail -20 655 radiusd CALL fstat(0xa,0xbf8fb280) 655 radiusd STRU struct stat {dev=83, ino=2378852, mode=-rw------- , nlink=1, uid=133, gid=133, rdev=0, atime=1292640223, stime=1292640223, ctime=1292640223, birthtime=1292640223, size=0, blksize=16384, blocks=0, flags=0x0 } 655 radiusd RET fstat 0 655 radiusd CALL lseek(0xa,0,SEEK_SET,0x2) 655 radiusd RET lseek 0 655 radiusd CALL write(0xa,0x28752000,0x54) 655 radiusd GIO fd 10 wrote 84 bytes "Sat Dec 18 02:43:43 2010 Timestamp = 1292640223 Request-Authenticator = Verified " 655 radiusd RET write 84/0x54 655 radiusd CALL close(0xa) 655 radiusd RET close 0 655 radiusd PSIG SIGSEGV SIG_DFL 655 radiusd RET _umtx_op -1 errno 4 Interrupted system call 655 radiusd RET _umtx_op -1 errno 4 Interrupted system call 655 radiusd RET _umtx_op -1 errno 4 Interrupted system call 655 radiusd RET _umtx_op -1 errno 4 Interrupted system call gdb output: (I have cut some out as I ran out of characters for the post. Let me know if I might have chopped something important) Starting program: /usr/local/sbin/radiusd -f [New LWP 100056] [New Thread 28401140 (LWP 100056)] [New Thread 28567ac0 (LWP 100051)] [New Thread 28567980 (LWP 100068)] [New Thread 28567840 (LWP 100069)] [New Thread 28567700 (LWP 100070)] [New Thread 285675c0 (LWP 100102)] Program received signal SIGABRT, Aborted. [Switching to Thread 28567ac0 (LWP 100051)] 0x2831866b in thr_kill () from /lib/libc.so.7 (gdb) (gdb) (gdb) where #0 0x2831866b in thr_kill () from /lib/libc.so.7 #1 0x282c7896 in pthread_kill () from /lib/libthr.so.3 #2 0x282c53b3 in raise () from /lib/libthr.so.3 #3 0x283bac4a in abort () from /lib/libc.so.7 #4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for "rad_assert_fail". ) at util.c:365 #5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at threads.c:412 #6 0x0806b18b in request_handler_thread (arg=0x28542b60) at threads.c:516 #7 0x282c175f in pthread_getprio () from /lib/libthr.so.3 #8 0x00000000 in ?? () (gdb) thread apply all bt full (these were the same so I have shortened them) Thread 7 (Thread 285675c0 (LWP 100102)): Thread 6 (Thread 28567700 (LWP 100070)): Thread 5 (Thread 28567840 (LWP 100069)): Thread 4 (Thread 28567980 (LWP 100068)): #0 0x282cc227 in __error () from /lib/libthr.so.3 No symbol table info available. #1 0x282cbc19 in __error () from /lib/libthr.so.3 No symbol table info available. #2 0x285484fc in ?? () No symbol table info available. #3 0x0000000f in ?? () No symbol table info available. #4 0x00000000 in ?? () No symbol table info available. #5 0x00000000 in ?? () No symbol table info available. #6 0x00000000 in ?? () No symbol table info available. #7 0x282cdbf4 in ?? () from /lib/libthr.so.3 No symbol table info available. #8 0xbf8fdf68 in ?? () No symbol table info available. #9 0x282c0fcc in sem_wait () from /lib/libthr.so.3 No symbol table info available. Previous frame identical to this frame (corrupt stack?) Thread 3 (Thread 28567ac0 (LWP 100051)): ---Type <return> to continue, or q <return> to quit--- #0 0x2831866b in thr_kill () from /lib/libc.so.7 No symbol table info available. #1 0x282c7896 in pthread_kill () from /lib/libthr.so.3 No symbol table info available. #2 0x282c53b3 in raise () from /lib/libthr.so.3 No symbol table info available. #3 0x283bac4a in abort () from /lib/libc.so.7 No symbol table info available. #4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for "rad_assert_fail". ) at util.c:365 No locals. #5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at threads.c:412 blocked = 134809920 i = RAD_LISTEN_ACCT start = RAD_LISTEN_ACCT entry = (request_queue_t *) 0x0 last_complained = 0 #6 0x0806b18b in request_handler_thread (arg=0x28542b60) at threads.c:516 fun = 0x804e250 <rad_accounting> self = (THREAD_HANDLE *) 0x28542b60 #7 0x282c175f in pthread_getprio () from /lib/libthr.so.3 No symbol table info available. #8 0x00000000 in ?? () No symbol table info available. Thread 2 (Thread 28401140 (LWP 100056)): #0 0x283abf9f in unlink () from /lib/libc.so.7 No symbol table info available. #1 0x0805daf9 in command_socket_free (this=0x28569100) at command.c:1839 sock = (fr_command_socket_t *) 0x285b4800 #2 0x08060053 in listen_free (head=0x8090b7c) at listen.c:2257 next = (rad_listen_t *) 0x285690a0 this = (rad_listen_t *) 0x28569100 #3 0x08061e7c in free_mainconfig () at mainconfig.c:992 cc = (cached_config_t *) 0x0 next = (cached_config_t *) 0x0 #4 0x0806859a in main (argc=2, argv=0xbfbfec74) at radiusd.c:443 rcode = 0 argval = -1 spawn_flag = 1 dont_fork = 1 flag = 0 act = {__sigaction_u = {__sa_handler = 0x80687f0 <sig_fatal>, __sa_sigaction = 0x80687f0 <sig_fatal>}, sa_flags = 0, sa_mask = {__bits = {0, 0, 0, 0}}} #0 0x2831866b in thr_kill () from /lib/libc.so.7 Alan DeKok the main FreeRADIUS developer looked at the output and commented: Code: > #4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for > > "rad_assert_fail". > > ) at util.c:365 > > #5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at > > threads.c:412 Ugh. Something is free'ing the request when it's still queued. This *only* can happen when the request is in the queue for more than 30s. So... why is your server so slow? Fix that, and the problem won't be fixed, but it *will* go away. I can take a look at the code, but this is a very odd edge case, and hard to track down/fix. >How-To-Repeat: just let the server run. It will eventually fall over. It does seem to be random. >Fix: not known >Release-Note: >Audit-Trail: >Unformatted:
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201101052351.p05NpiSk005362>