From: Ash
Date: Sat, 23 Apr 2005 16:21:46 -0500
To: Ash
cc: freebsd-stable@freebsd.org
Subject: Re: 5.4-RC3 hung - DDB trace/ps provided [possible cause found]
Message-ID: <426ABC6A.3080408@speakeasy.net>
In-Reply-To: <42685722.1040107@speakeasy.net>
References: <426825BA.4020609@speakeasy.net> <42685722.1040107@speakeasy.net>
List-Id: Production branch of FreeBSD source code

Ash wrote:
> Ash wrote:
>
>> I've included copies of ddb's ps output and a dmesg as attachments.
>> Here is a trace (pid 95 is bufdaemon):
>>
>
> Replying to myself because I just noticed that the ddb ps output wasn't
> attached as promised. Computers are hard, let me try this again... :)
>

Apologies for replying to myself yet again.
I realized that the ps output from ddb did not attach, so I'm going to include it at the end of this e-mail. Unfortunately, the machine came back online shortly after I noticed it had hung, so I was unable to generate a dump the last time this happened.

I think I found the cause of this system's lockups:

# ps -auxwwwwww | grep 'o snapshot'
root  6508  0.0  0.0  1260   720  ??  I    12:13AM   0:00.00 mount -u -o snapshot /xvols/sysvol00/.snap/daily.0 /xvols/sysvol00
root  6509  0.0  0.0  1260   720  ??  D    12:13AM   0:00.25 mount -u -o snapshot /xvols/sysvol00/.snap/daily.0 /xvols/sysvol00
root  9289  0.0  0.0  1260   720  ??  I     9:00AM   0:00.00 mount -u -o snapshot /xvols/vmvol00/.snap/hourly.0 /xvols/vmvol00
root  9290  0.0  0.0  1260   720  ??  D     9:00AM   0:00.24 mount -u -o snapshot /xvols/vmvol00/.snap/hourly.0 /xvols/vmvol00

It is now nearly 16:00 and these processes are still idle. I am unable to access either /xvols/vmvol00 or /xvols/sysvol00. I imagine that if this were to happen on /, /usr or /var, the symptoms would be in line with what I've been experiencing (the system becoming unresponsive for an indefinite period of time).

I probably won't be able to use this machine for much testing, as I have to put it into production (either without snapshots or with a different OS). However, I can duplicate the configuration on another machine with different hardware if anyone has testing suggestions.
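
For anyone trying to reproduce this on another machine, the check above (grepping ps for snapshot mounts) could be scripted so that stuck snapshots are noticed sooner. The following is only a minimal sketch in Python. It assumes the 11-column `ps aux` layout shown above with no spaces inside the STARTED column, and the function name `stuck_snapshot_mounts` is made up for illustration. The sample rows are copied from the ps output in this message.

```python
# Sample rows copied from the ps output quoted in this message.
PS_OUTPUT = """\
root 6508 0.0 0.0 1260 720 ?? I 12:13AM 0:00.00 mount -u -o snapshot /xvols/sysvol00/.snap/daily.0 /xvols/sysvol00
root 6509 0.0 0.0 1260 720 ?? D 12:13AM 0:00.25 mount -u -o snapshot /xvols/sysvol00/.snap/daily.0 /xvols/sysvol00
root 9289 0.0 0.0 1260 720 ?? I 9:00AM 0:00.00 mount -u -o snapshot /xvols/vmvol00/.snap/hourly.0 /xvols/vmvol00
root 9290 0.0 0.0 1260 720 ?? D 9:00AM 0:00.24 mount -u -o snapshot /xvols/vmvol00/.snap/hourly.0 /xvols/vmvol00
"""


def stuck_snapshot_mounts(ps_text):
    """Return (pid, stat, filesystem) for snapshot mounts in disk wait (D)."""
    hits = []
    for line in ps_text.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND.
        # maxsplit=10 keeps the whole command line together in the last field.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or anything that is not a process row
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if "-o snapshot" in command and stat.startswith("D"):
            # The last argument of the mount command is the filesystem.
            hits.append((pid, stat, command.split()[-1]))
    return hits


if __name__ == "__main__":
    for pid, stat, filesystem in stuck_snapshot_mounts(PS_OUTPUT):
        print(f"pid {pid} ({stat}) is blocked snapshotting {filesystem}")
```

On a live system the text would come from running ps rather than from a pasted string, and a process in state D is only suspicious if it stays there across repeated checks.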
As for the promised ddb ps output:

  pid proc      uid  ppid  pgrp flag    stat wmesg wchan cmd
 1426 c3c0ea98    0  1422  1422 0000100 [SLPQ sbwait 0xc3c8a464][SLP] sshd
 1422 c393da98    0   497  1422 0000100 [SLPQ wait 0xc393da98][SLP] sshd
 1413 c393d710    0   612  1413 0004002 [SLPQ ttyin 0xc38ab210][SLP] vi
 1409 c393d000    0  1408   920 0000000 [SLPQ physrd 0xd750e450][SLP] mount
 1408 c3c0e000    0  1370   920 0004000 [SLPQ wait 0xc3c0e000][SLP] mount
 1370 c3c0e54c    0   936   920 0004000 [SLPQ wait 0xc3c0e54c][SLP] sh
  936 c39391c4    0   920   920 0004000 [SLPQ wait 0xc39391c4][SLP] sh
  920 c3c0e8d4    0   916   920 0004000 [SLPQ wait 0xc3c0e8d4][SLP] sh
  916 c3939a98    0   523   523 0000000 [SLPQ piperd 0xc391ed80][SLP] cron
  815 c39398d4    0   812   815 0004002 [SLPQ ttyin 0xc35e2610][SLP] csh
  812 c3939388    0   497   812 0000100 [SLPQ flswai 0xc09541c4][SLP] sshd
  612 c393dc5c    0   608   612 0004002 [SLPQ pause 0xc393dc94][SLP] csh
  608 c3c0e1c4    0   497   608 0000100 [SLPQ flswai 0xc09541c4][SLP] sshd
  572 c3c12000    0     1   572 0000000 [SLPQ select 0xc0953b84][SLP] amd
  571 c3c121c4    0     1   571 0004002 [RUNQ] getty
  570 c393954c    0     1   570 0004002 [SLPQ ttyin 0xc362ec10][SLP] getty
  569 c38ace20    0     1   569 0004002 [SLPQ ttyin 0xc362ee10][SLP] getty
  568 c38d5388    0     1   568 0004002 [SLPQ ttyin 0xc3693010][SLP] getty
  567 c393d388    0     1   567 0004002 [SLPQ ttyin 0xc3693210][SLP] getty
  566 c38dbc5c    0     1   566 0004002 [SLPQ ttyin 0xc3693410][SLP] getty
  565 c3939710    0     1   565 0004002 [SLPQ ttyin 0xc3693610][SLP] getty
  564 c3939e20    0     1   564 0004002 [SLPQ ttyin 0xc3693810][SLP] getty
  563 c393d1c4    0     1   563 0004002 [SLPQ ttyin 0xc3693a10][SLP] getty
  523 c38d51c4    0     1   523 0000000 [RUNQ] cron
  507 c393d54c   25     1   507 0000100 [SLPQ pause 0xc393d584][SLP] sendmail
  503 c38d5000    0     1   503 0000100 [RUNQ] sendmail
  497 c38dbe20    0     1   497 0000100 [SLPQ select 0xc0953b84][SLP] sshd
  407 c38dba98    0     1   407 0000000 [SLPQ select 0xc0953b84][SLP] rpcbind
  366 c3939000    0     1   366 0000000 [RUNQ] syslogd
  348 c3939c5c    0     1   348 0000000 [SLPQ select 0xc0953b84][SLP] devd
  119 c38d554c    0     0     0 0000204 [SLPQ - 0xc38a8600][SLP] gv_v datavol00
  118 c38d5710    0     0     0 0000204 [SLPQ - 0xc38a8500][SLP] gv_v sysvol00
  117 c38d58d4    0     0     0 0000204 [SLPQ - 0xc38a8400][SLP] gv_v vmvol00
  116 c38d5a98    0     0     0 0000204 [SLPQ - 0xc38a8300][SLP] gv_v qbvol04
  115 c38d5c5c    0     0     0 0000204 [SLPQ - 0xc38a8200][SLP] gv_v qbvol03
  114 c38d5e20    0     0     0 0000204 [SLPQ - 0xc38a8700][SLP] gv_v qbvol02
  113 c38db000    0     0     0 0000204 [SLPQ - 0xc38a8900][SLP] gv_v qbvol01
  112 c38db1c4    0     0     0 0000204 [SLPQ - 0xc38a8b00][SLP] gv_v qbvol00
  111 c38db388    0     0     0 0000204 [SLPQ - 0xc35e2a00][SLP] gv_p datavol00.p0
  110 c38db54c    0     0     0 0000204 [SLPQ - 0xc38aa600][SLP] gv_p sysvol00.p0
  109 c38db710    0     0     0 0000204 [SLPQ - 0xc35e2c00][SLP] gv_p vmvol00.p0
  108 c38db8d4    0     0     0 0000204 [SLPQ - 0xc38ab800][SLP] gv_p qbvol04.p0
  107 c353b1c4    0     0     0 0000204 [SLPQ - 0xc35e2e00][SLP] gv_p qbvol03.p0
  106 c353b388    0     0     0 0000204 [SLPQ - 0xc38ab600][SLP] gv_p qbvol02.p0
  105 c353b54c    0     0     0 0000204 [SLPQ - 0xc3623000][SLP] gv_p qbvol01.p0
  104 c353b710    0     0     0 0000204 [SLPQ - 0xc38aac00][SLP] gv_p qbvol00.p0
  103 c353b8d4    0     0     0 0000204 [SLPQ - 0xc38e0e00][SLP] gv_d array00
  102 c353ba98    0     0     0 0000204 [RUNQ] schedcpu
  101 c353bc5c    0     0     0 0000204 [SLPQ - 0xc0956bac][SLP] nfsiod 3
  100 c353be20    0     0     0 0000204 [SLPQ - 0xc0956ba8][SLP] nfsiod 2
   99 c38ac000    0     0     0 0000204 [SLPQ - 0xc0956ba4][SLP] nfsiod 1
   98 c38ac1c4    0     0     0 0000204 [SLPQ - 0xc0956ba0][SLP] nfsiod 0
   97 c38ac388    0     0     0 0000204 [SLPQ vlruwt 0xc38ac388][SLP] vnlru
   96 c38ac54c    0     0     0 0000204 [RUNQ] syncer
   95 c38ac710    0     0     0 0000204 [CPU 0] bufdaemon
   94 c38ac8d4    0     0     0 000020c [SLPQ pgzero 0xc095d514][SLP] pagezero
    9 c38aca98    0     0     0 0000204 [SLPQ psleep 0xc095d568][SLP] vmdaemon
    8 c38acc5c    0     0     0 0000204 [SLPQ psleep 0xc095d524][SLP] pagedaemon
   93 c351b54c    0     0     0 0000204 [IWAIT] swi0: sio
    7 c351b710    0     0     0 0000204 [SLPQ actask 0xc0aa6bcc][SLP] acpi_task0
   92 c351b8d4    0     0     0 0000204 [IWAIT] swi3: cambio
   91 c351ba98    0     0     0 0000204 [IWAIT] swi2: camnet
   90 c351bc5c    0     0     0 0000204 [IWAIT] swi6:+
   89 c351be20    0     0     0 0000204 [IWAIT] swi6: acpitaskq
    6 c3538000    0     0     0 0000204 [SLPQ - 0xc35c0140][SLP] thread taskq
   88 c35381c4    0     0     0 0000204 [IWAIT] swi6:+
   87 c3538388    0     0     0 0000204 [IWAIT] swi6: task queue
    5 c353854c    0     0     0 0000204 [SLPQ - 0xc35c0280][SLP] kqueue taskq
   86 c3538710    0     0     0 0000204 [SLPQ - 0xc09462a0][SLP] yarrow
    4 c35388d4    0     0     0 0000204 [SLPQ - 0xc094abe8][SLP] g_down
    3 c3538a98    0     0     0 0000204 [SLPQ - 0xc094abe4][SLP] g_up
    2 c3538c5c    0     0     0 0000204 [SLPQ - 0xc094abdc][SLP] g_event
   85 c3538e20    0     0     0 0000204 [IWAIT] swi1: net
   84 c353b000    0     0     0 0000204 [IWAIT] swi4: vm
   83 c3509a98    0     0     0 000020c [IWAIT] swi5: clock sio
   82 c3509c5c    0     0     0 0000204 [IWAIT] irq0: clk
   81 c3509e20    0     0     0 0000204 [IWAIT] irq71:
   80 c3518000    0     0     0 0000204 [IWAIT] irq70:
   79 c35181c4    0     0     0 0000204 [IWAIT] irq69:
   78 c3518388    0     0     0 0000204 [IWAIT] irq68:
   77 c351854c    0     0     0 0000204 [IWAIT] irq67:
   76 c3518710    0     0     0 0000204 [IWAIT] irq66:
   75 c35188d4    0     0     0 0000204 [IWAIT] irq65:
   74 c3518a98    0     0     0 0000204 [IWAIT] irq64:
   73 c3518c5c    0     0     0 0000204 [IWAIT] irq63:
   72 c3518e20    0     0     0 0000204 [IWAIT] irq62:
   71 c351b000    0     0     0 0000204 [IWAIT] irq61:
   70 c351b1c4    0     0     0 0000204 [IWAIT] irq60:
   69 c351b388    0     0     0 0000204 [IWAIT] irq59:
   68 c34fa1c4    0     0     0 0000204 [IWAIT] irq58:
   67 c34fa388    0     0     0 0000204 [IWAIT] irq57:
   66 c34fa54c    0     0     0 0000204 [IWAIT] irq56:
   65 c34fa710    0     0     0 0000204 [IWAIT] irq55:
   64 c34fa8d4    0     0     0 0000204 [IWAIT] irq54:
   63 c34faa98    0     0     0 0000204 [IWAIT] irq53:
   62 c34fac5c    0     0     0 0000204 [IWAIT] irq52:
   61 c34fae20    0     0     0 0000204 [IWAIT] irq51:
   60 c3509000    0     0     0 0000204 [IWAIT] irq50:
   59 c35091c4    0     0     0 0000204 [IWAIT] irq49:
   58 c3509388    0     0     0 0000204 [IWAIT] irq48: twa0
   57 c350954c    0     0     0 0000204 [IWAIT] irq47:
   56 c3509710    0     0     0 0000204 [IWAIT] irq46:
   55 c35098d4    0     0     0 0000204 [IWAIT] irq45:
   54 c34e2a98    0     0     0 0000204 [IWAIT] irq44:
   53 c34e2c5c    0     0     0 0000204 [IWAIT] irq43:
   52 c34e2e20    0     0     0 0000204 [IWAIT] irq42:
   51 c34f6000    0     0     0 0000204 [IWAIT] irq41:
   50 c34f61c4    0     0     0 0000204 [IWAIT] irq40:
   49 c34f6388    0     0     0 0000204 [IWAIT] irq39:
   48 c34f654c    0     0     0 0000204 [IWAIT] irq38:
   47 c34f6710    0     0     0 0000204 [IWAIT] irq37:
   46 c34f68d4    0     0     0 0000204 [IWAIT] irq36:
   45 c34f6a98    0     0     0 0000204 [IWAIT] irq35:
   44 c34f6c5c    0     0     0 0000204 [IWAIT] irq34:
   43 c34f6e20    0     0     0 0000204 [IWAIT] irq33:
   42 c34fa000    0     0     0 0000204 [IWAIT] irq32:
   41 c34d654c    0     0     0 0000204 [IWAIT] irq31: em3
   40 c34d6710    0     0     0 0000204 [IWAIT] irq30: em2
   39 c34d68d4    0     0     0 0000204 [IWAIT] irq29: em1
   38 c34d6a98    0     0     0 0000204 [IWAIT] irq28: em0
   37 c34d6c5c    0     0     0 0000204 [IWAIT] irq27:
   36 c34d6e20    0     0     0 0000204 [IWAIT] irq26:
   35 c34e2000    0     0     0 0000204 [IWAIT] irq25:
   34 c34e21c4    0     0     0 0000204 [IWAIT] irq24:
   33 c34e2388    0     0     0 0000204 [IWAIT] irq23: em4
   32 c34e254c    0     0     0 0000204 [IWAIT] irq22:
   31 c34e2710    0     0     0 0000204 [IWAIT] irq21:
   30 c34e28d4    0     0     0 0000204 [IWAIT] irq20: fxp0
   29 c34831c4    0     0     0 0000204 [IWAIT] irq19:
   28 c3483388    0     0     0 0000204 [IWAIT] irq18:
   27 c348354c    0     0     0 0000204 [IWAIT] irq17:
   26 c3483710    0     0     0 0000204 [IWAIT] irq16:
   25 c34838d4    0     0     0 0000204 [IWAIT] irq15: ata1
   24 c3483a98    0     0     0 0000204 [IWAIT] irq14: ata0
   23 c3483c5c    0     0     0 0000204 [IWAIT] irq13:
   22 c3483e20    0     0     0 0000204 [IWAIT] irq12:
   21 c34d6000    0     0     0 0000204 [IWAIT] irq11:
   20 c34d61c4    0     0     0 0000204 [IWAIT] irq10:
   19 c34d6388    0     0     0 0000204 [IWAIT] irq9: acpi0
   18 c347c000    0     0     0 0000204 [IWAIT] irq8: rtc
   17 c347c1c4    0     0     0 0000204 [IWAIT] irq7: ppc0
   16 c347c388    0     0     0 0000204 [IWAIT] irq6:
   15 c347c54c    0     0     0 0000204 [IWAIT] irq5:
   14 c347c710    0     0     0 0000204 [IWAIT] irq4: sio0
   13 c347c8d4    0     0     0 0000204 [IWAIT] irq3: sio1
   12 c347ca98    0     0     0 0000204 [IWAIT] irq1: atkbd0
   11 c347cc5c    0     0     0 000020c [Can run] idle
    1 c347ce20    0     0     1 0004200 [SLPQ wait 0xc347ce20][SLP] init
   10 c3483000    0     0     0 0000204 [SLPQ ktrace 0xc094e538][SLP] ktrace
    0 c094ace0    0     0     0 0000200 [SLPQ sched 0xc094ace0][SLP] swapper
 1423 c3c0e710   22  1422  1422 0002100 zomb[INACTIVE] sshd
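
A listing this long is easier to skim when filtered by state. The following is a rough Python sketch, not a tool that ships with FreeBSD. It assumes each row has the form pid, proc, uid, ppid, pgrp, flag, a bracketed state, then the command, as in the ddb ps output above. The names `ROW`, `classify`, and `runnable` are hypothetical. Rows whose state is not bracketed, such as the `zomb[INACTIVE]` entry, are skipped.

```python
import re

# A few rows copied from the ddb ps output in this message.
DDB_ROWS = """\
1409 c393d000 0 1408 920 0000000 [SLPQ physrd 0xd750e450][SLP] mount
523 c38d51c4 0 1 523 0000000 [RUNQ] cron
96 c38ac54c 0 0 0 0000204 [RUNQ] syncer
95 c38ac710 0 0 0 0000204 [CPU 0] bufdaemon
58 c3509388 0 0 0 0000204 [IWAIT] irq48: twa0
1423 c3c0e710 22 1422 1422 0002100 zomb[INACTIVE] sshd
"""

# pid, proc, uid, ppid, pgrp, flag, "[state ...]", optional "[SLP]", command
ROW = re.compile(
    r"^\s*(\d+)\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\[([^\]]*)\](?:\[[^\]]*\])?\s*(.*)$"
)


def classify(text):
    """Return (pid, state keyword, command) for each row that parses."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, zombie rows, or anything in another format
        pid, state, cmd = match.groups()
        # The first word inside the brackets is the keyword: SLPQ, RUNQ, CPU, IWAIT.
        rows.append((int(pid), state.split()[0], cmd.strip()))
    return rows


def runnable(text):
    """Rows that are on a run queue or currently on a CPU."""
    return [(pid, cmd) for pid, state, cmd in classify(text)
            if state in ("RUNQ", "CPU")]


if __name__ == "__main__":
    for pid, cmd in runnable(DDB_ROWS):
        print(pid, cmd)
```

Filtering this way only narrows down where to look, for example bufdaemon (pid 95) being on the CPU. It does not by itself say why a process is stuck.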