From owner-freebsd-stable@FreeBSD.ORG Wed Nov 16 18:20:38 2005
Message-ID: <64897.213.236.228.129.1132165201.squirrel@mail.adventuras.no>
In-Reply-To: <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>
References: <20051115065740.GH39882@cirb503493.alcatel.com.au>
    <20051115100813.74195.qmail@web36214.mail.mud.yahoo.com>
    <20051115103821.GJ39882@cirb503493.alcatel.com.au>
    <54759.213.236.228.129.1132153296.squirrel@mail.adventuras.no>
    <20051116162421.GE76352@green.homeunix.org>
    <63732.213.236.228.129.1132161564.squirrel@mail.adventuras.no>
Date: Wed, 16 Nov 2005 19:20:01 +0100 (CET)
From: "Lars Kristiansen"
To: freebsd-stable@freebsd.org
User-Agent: SquirrelMail/1.4.4-1
MIME-Version: 1.0
Content-Type: multipart/mixed;boundary="----=_20051116192001_86947"
Subject: Re: Swapfile problem in 6?
List-Id: Production branch of FreeBSD source code

------=_20051116192001_86947
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

>> On Wed, Nov 16, 2005 at 04:01:36PM +0100, Lars Kristiansen wrote:
>>> > On Tue, 2005-Nov-15 02:08:12 -0800, Rob wrote:
>>> >> makeoptions DEBUG=-g
>>> >> options INVARIANTS
>>> >> options WITNESS
>>> >> options WITNESS_KDB
>>> >> options KDB
>>> >> options DDB
>>> >> options DDB_NUMSYM
>>> >> options GDB
>>> >>
>>> >>Is that enough?
>>> >
>>> > If your system is headless, you probably want
>>> > 'options BREAK_TO_DEBUGGER' as well.
>>> >
>>> > First question is: Does the system still deadlock? INVARIANTS and
>>> > WITNESS will have added sanity checks which might have picked up the
>>> > problem.
>>> >
>>> >>1) Can I debug a kernel that does not crash, but
>>> >> just hangs in a deadlock? Everything seems to
>>> >> be frozen, except pinging the PC....
>>> >
>>> > Have a look at
>>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-online-ddb.html
>>> > and ddb(4). Unless you have another system handy, you might like to
>>> > print out ddb(4) - it's difficult to read man pages when you're in
>>> > the kernel debugger :-).
>>> >
>>> >>2) Is such debugging possible on a headless PC
>>> >> without a keyboard attached?
>>> >> I do have serial console access.
>>> >
>>> > Yes. See above URL. The advantage is that you can (hopefully)
>>> > capture a log of your debug session. Send a serial BREAK and you
>>> > should get a DDB> prompt.
>>> >
>>> > Basically, wait until your system deadlocks. BREAK into DDB.
>>> > As a start, run 'show lockedvnods', 'ps'. My guess is that you'll
>>> > see a lock that has a number of waiters - which is probably the
>>> > culprit. Use 'panic' to get a crashdump and then you can use kgdb
>>> > to rummage around once you reboot - see
>>> > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-gdb.html
>>> >
>>> > If in doubt, post the output from the above commands here and someone
>>> > will hopefully provide further input.
>>>
>>> Hello again, I am the "me too"-guy with console-access.
>>> I am not a programmer and it is the first time I see a debugging screen.
>>>
>>> It deadlocked again, and I did as advised above:
>>> (ddb: show lockedvnods; ps ; panic)
>>> but did not understand much of the output.
>>> Looked maybe like syncer and swap_pager were locked?
>>> Do I need to write all this down or can I get the output saved
>>> somewhere?
>>>
>>> I got a 32MB coredump but the same lack of understanding applies.
>>>
>>> Please tell me if I can be of any help! This is fun.
>>
>> Do you have the ability to connect another computer by RS-232?
>> It's easy to get a serial terminal console going (err that is
>> if you find the right guide as opposed to stabbing blindly and
>> just referencing man pages as I like to do.) The coredump
>> should supply the same (and more) information, and someone
>> can walk through with you doing a post-mortem gdb session.
>>
>> For example, try doing the following now that you have the coredump:
>> # ps wwwauxlH -N /boot/kernel/kernel -M /var/crash/vmcore.whatever
>
> Sure, I will get a serial terminal console going and try to repeat this
> process from it.
>
> In the meantime here is output from the above ps command provided as
> attachment.
>
> --
> Lars

Yes, it deadlocked almost immediately. A debug session is attached.
-- 
Lars

------=_20051116192001_86947
Content-Type: text/plain; name="swapfile_ds.txt"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="swapfile_ds.txt"

KDB: enter: Line break on console
[thread pid 11 tid 100005 ]
Stopped at      0xc0671fcb = kdb_enter+0x2b:    nop
db> show lockedvnods
Locked vnodes

0xc12cdbb0: tag syncer, type VNON
    usecount 1, writecount 0, refcount 2 mountedhere 0
    flags ()
     lock type syncer: EXCL (count 1) by thread 0xc1143a80 (pid 40)

0xc12cdaa0: tag ufs, type VREG
    usecount 1, writecount 0, refcount 3 mountedhere 0
    flags (VV_SYSTEM)
     lock type snaplk: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 231, on dev ad0s1f

0xc1301550: tag ufs, type VREG
    usecount 1, writecount 1, refcount 111 mountedhere 0
    flags ()
    v_object 0xc1306dec ref 0 pages 436
     lock type ufs: EXCL (count 1) by thread 0xc126d780 (pid 175)
        ino 8155, on dev ad0s1f
db> ps
  pid     proc  uid  ppid  pgrp    flag stat wmesg wchan cmd
  724 c1544624    0   675   675 0004002 [SLPQ pfault 0xc09b2a98][SLP] sh
  675 c1544a3c    0   648   675 0004002 [SLPQ piperd 0xc1296b28][SLP] ruby18
  648 c1544c48    0   647   648 0004002 [SLPQ pause 0xc1544c7c][SLP][SWAP] csh
  647 c1546000    0     1   647 0004102 [SLPQ wait 0xc1546000][SLP][SWAP] login
  646 c131020c    0     1   646 0004002 [SLPQ ttyin 0xc1226410][SLP][SWAP] getty
  645 c1310830    0     1   645 0004002 [SLPQ ttyin 0xc1226810][SLP][SWAP] getty
  644 c146420c    0     1   644 0004002 [SLPQ ttyin 0xc1226c10][SLP][SWAP] getty
  643 c1464830    0     1   643 0004002 [SLPQ ttyin 0xc1216010][SLP][SWAP] getty
  642 c1464c48    0     1   642 0004002 [SLPQ ttyin 0xc1212010][SLP][SWAP] getty
  641 c11ddc48    0     1   641 0004002 [SLPQ ttyin 0xc1212c10][SLP][SWAP] getty
  640 c1310418    0     1   640 0004002 [SLPQ ttyin 0xc1218010][SLP][SWAP] getty
  639 c1464418    0     1   639 0004002 [SLPQ ttyin 0xc1211010][SLP][SWAP] getty
  623 c1464624    0     1   623 0000000 [SLPQ select 0xc09a46e4][SLP][SWAP] inetd
  606 c1464000    0     1   605 0000000 [SLPQ nanslp 0xc0959bcc][SLP][SWAP] smartd
  594 c1310a3c 1002     1   594 0000100 [SLPQ select 0xc09a46e4][SLP][SWAP] dhcpd
  566 c1310000    0     1   566 0000000 [SWAP] cron
  554 c1464a3c   25     1   554 0000100 [SLPQ pause 0xc1464a70][SLP][SWAP] sendmail
  550 c126cc48    0     1   550 0000100 [SLPQ select 0xc09a46e4][SLP] sendmail
  544 c126c624    0     1   544 0000100 [SWAP] sshd
  526 c12c820c    0     1   526 0000000 [SLPQ pfault 0xc09b2a98][SLP] ntpd
  499 c126c418    0     1   499 0000000 [SWAP] usbd
  459 c126ca3c    0     1   459 0000000 [SWAP] amd
  444 c12c8418    0     1   444 0000000 [SWAP] rpcbind
  403 c12c8830   53     1   403 0000100 [SWAP] named
  343 c126c830    0     1   343 0000000 [SWAP] syslogd
  310 c11dd830    0     1   310 0000000 [SLPQ select 0xc09a46e4][SLP][SWAP] devd
  276 c11dd624   64   267   267 0000100 [SLPQ pfault 0xc09b2a98][SLP] pflogd
  267 c12c8a3c    0     1   267 0000000 [SLPQ sbwait 0xc13220a8][SLP][SWAP] pflogd
  227 c12c8c48   65     1   227 0000100 [SWAP] dhclient
  207 c11dda3c    0     1    47 0000002 [SLPQ select 0xc09a46e4][SLP][SWAP] dhclient
  175 c12c8624    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] md0
   46 c126a000    0     0     0 0000204 [SLPQ - 0xc4715d04][SLP] schedcpu
   45 c126a20c    0     0     0 0000204 [SLPQ - 0xc09acc6c][SLP] nfsiod 3
   44 c126a418    0     0     0 0000204 [SLPQ - 0xc09acc68][SLP] nfsiod 2
   43 c126a624    0     0     0 0000204 [SLPQ - 0xc09acc64][SLP] nfsiod 1
   42 c126a830    0     0     0 0000204 [SLPQ - 0xc09acc60][SLP] nfsiod 0
   41 c126aa3c    0     0     0 0000204 [SLPQ vlruwt 0xc126aa3c][SLP] vnlru
   40 c126ac48    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] syncer
   39 c126c000    0     0     0 0000204 [SLPQ psleep 0xc09a4c2c][SLP] bufdaemon
   38 c1142c48    0     0     0 000020c [SLPQ pgzero 0xc09b3224][SLP] pagezero
    9 c11dc000    0     0     0 0000204 [SLPQ psleep 0xc09b2d74][SLP] vmdaemon
    8 c11dc20c    0     0     0 0000204 [SLPQ wswbuf0 0xc09b2514][SLP] pagedaemon
   37 c11dc418    0     0     0 0000204 [IWAIT] swi0: sio
    7 c11dc624    0     0     0 0000204 [SLPQ - 0xc121023c][SLP] fdc0
   36 c11dc830    0     0     0 0000204 [SLPQ usbtsk 0xc0956884][SLP] usbtask
   35 c11dca3c    0     0     0 0000204 [SLPQ usbevt 0xc11e9210][SLP] usb0
   34 c11dcc48    0     0     0 0000204 [IWAIT] swi5:+
    6 c11dd000    0     0     0 0000204 [SLPQ - 0xc1117400][SLP] thread taskq
   33 c11dd20c    0     0     0 0000204 [IWAIT] swi6:+
   32 c11dd418    0     0     0 0000204 [IWAIT] swi6: task queue
   31 c1133624    0     0     0 0000204 [IWAIT] swi2: cambio
    5 c1133830    0     0     0 0000204 [SLPQ - 0xc1117800][SLP] kqueue taskq
   30 c1133a3c    0     0     0 0000204 [SLPQ - 0xc09545a0][SLP] yarrow
    4 c1133c48    0     0     0 0000204 [SLPQ - 0xc09570c8][SLP] g_down
    3 c1142000    0     0     0 0000204 [SLPQ - 0xc09570c4][SLP] g_up
    2 c114220c    0     0     0 0000204 [SLPQ - 0xc09570bc][SLP] g_event
   29 c1142418    0     0     0 0000204 [IWAIT] swi3: vm
   28 c1142624    0     0     0 000020c [IWAIT] swi4: clock sio
   27 c1142830    0     0     0 0000204 [IWAIT] swi1: net
   26 c1142a3c    0     0     0 0000204 [IWAIT] irq15: ata1
   25 c111e20c    0     0     0 0000204 [IWAIT] irq14: ata0
   24 c111e418    0     0     0 0000204 [IWAIT] irq13:
   23 c111e624    0     0     0 0000204 [IWAIT] irq12:
   22 c111e830    0     0     0 0000204 [IWAIT] irq11: xl0 uhci0
   21 c111ea3c    0     0     0 0000204 [IWAIT] irq10: ed0
   20 c111ec48    0     0     0 0000204 [IWAIT] irq9:
   19 c1133000    0     0     0 0000204 [IWAIT] irq8: rtc
   18 c113320c    0     0     0 0000204 [IWAIT] irq7: ppc0
   17 c1133418    0     0     0 0000204 [IWAIT] irq6: fdc0
   16 c1118000    0     0     0 0000204 [IWAIT] irq5:
   15 c111820c    0     0     0 0000204 [IWAIT] irq4: sio0
   14 c1118418    0     0     0 0000204 [IWAIT] irq3:
   13 c1118624    0     0     0 0000204 [IWAIT] irq1: atkbd0
   12 c1118830    0     0     0 0000204 [IWAIT] irq0: clk
   11 c1118a3c    0     0     0 000020c [CPU 0] idle
    1 c1118c48    0     0     1 0004200 [SLPQ wait 0xc1118c48][SLP] init
   10 c111e000    0     0     0 0000204 [SLPQ ktrace 0xc0957b18][SLP] ktrace
    0 c09571c0    0     0     0 0000200 [SLPQ vmwait 0xc09b2a98][SLP] swapper
db>
------=_20051116192001_86947--
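
The advice quoted earlier in this thread was to look in the ps output for a lock with a number of waiters. The following is not part of the original attachment: it is a minimal Python sketch of one way to do that by grouping sleeping processes by wait channel (wchan). It assumes the row layout seen in this particular debug session (pid, proc, uid, ppid, pgrp, flag, then "[SLPQ wmesg wchan]"), and the name sleepers_by_wchan is hypothetical. The sample rows are copied from the session above. The script only groups rows and does not by itself identify the cause of the deadlock.

```python
import re
from collections import defaultdict

# A few rows copied from the `ps` output of the attached debug session.
DDB_PS = """\
  724 c1544624    0   675   675 0004002 [SLPQ pfault 0xc09b2a98][SLP] sh
  526 c12c820c    0     1   526 0000000 [SLPQ pfault 0xc09b2a98][SLP] ntpd
  276 c11dd624   64   267   267 0000100 [SLPQ pfault 0xc09b2a98][SLP] pflogd
  175 c12c8624    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] md0
   40 c126ac48    0     0     0 0000204 [SLPQ vmwait 0xc09b2a98][SLP] syncer
    8 c11dc20c    0     0     0 0000204 [SLPQ wswbuf0 0xc09b2514][SLP] pagedaemon
    0 c09571c0    0     0     0 0000200 [SLPQ vmwait 0xc09b2a98][SLP] swapper
  566 c1310000    0     1   566 0000000 [SWAP] cron
"""

# Assumed layout: pid proc uid ppid pgrp flag [SLPQ wmesg wchan][STATE]... cmd
# Rows without an [SLPQ ...] entry (e.g. [SWAP], [IWAIT], [CPU 0]) do not match.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\[SLPQ\s+(?P<wmesg>\S+)\s+(?P<wchan>0x[0-9a-f]+)\]"
    r"(?:\[\w+\])*\s+(?P<cmd>.+)$"
)


def sleepers_by_wchan(text):
    """Map each wait channel to the (pid, wmesg, cmd) tuples sleeping on it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            groups[m["wchan"]].append(
                (int(m["pid"]), m["wmesg"], m["cmd"].strip())
            )
    return dict(groups)


if __name__ == "__main__":
    groups = sleepers_by_wchan(DDB_PS)
    # Print the most contended wait channels first.
    for wchan, procs in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        names = ", ".join(f"{cmd}({pid}, {wmesg})" for pid, wmesg, cmd in procs)
        print(f"{wchan}: {len(procs)} sleeper(s): {names}")
```

Fed the full listing from the session, the same grouping would show sh, ntpd, pflogd (pid 276), md0, syncer and swapper all sleeping on wchan 0xc09b2a98, with wmesg pfault or vmwait, while pagedaemon sleeps on wswbuf0.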