Message-ID: <5155DB46.3030601@aldan.algebra.com>
Date: Fri, 29 Mar 2013 14:19:50 -0400
From: "Mikhail T."
To: stable@FreeBSD.org
Cc: Борис Попов
Subject: smbfus: panic on the second attempt to reach unavailable server

Hello!

I have my FreeBSD server dump nightly backups onto an entertainment device running embedded Linux. The device has no NFS server, but it does run Samba (3.0.30), which allows access to its internal hard drive. My server mounts that drive as:

    //dune/hdd750_..._32 /dune smbfs rw,noauto,-N,-Ekoi8-u:utf-8

Two nightly cron jobs use dump(8), xz(1), and ccrypt(1) to dump two "important" filesystems (/var/spool/imap and /home). The imap job kicks off at 3:11am and the home job at 3:31am.

This normally works perfectly fine every night, except when somebody accidentally sits on the remote control of the entertainment device in the living room, or otherwise manages to turn the box off. When this happens, the first dump simply fails, as one would expect:

    cannot create /dune/backups/narawntapu.imap.1.Tuesday.dump.xz.cpt: No such file or directory
      DUMP: Date of this level 1 dump: Tue Mar 12 03:11:07 2013
      DUMP: Date of last level 0 dump: Wed Mar 6 01:31:07 2013
      DUMP: Dumping snapshot of /dev/da0a (/var/spool/imap) to standard output
      DUMP: mapping (Pass I) [regular files]
      DUMP: Cache 16 MB, blocksize = 65536
      DUMP: mapping (Pass II) [directories]
      DUMP: estimated 169895 tape blocks.
      DUMP: dumping (Pass III) [directories]
      DUMP: Broken pipe
      DUMP: The ENTIRE dump is aborted.

However, when the second job tries to do the same twenty minutes later, the machine panics. This morning I was able to get a kernel coredump:

    ...
    #6  0xffffffff80750f2f in calltrap () at /cache/src/sys/amd64/amd64/exception.S:228
    No locals.
    #7  0xffffffff805a46ca in turnstile_broadcast (ts=0x0, queue=0) at /cache/src/sys/kern/subr_turnstile.c:838
            _tid =
            ts1 =
            td =
    #8  0xffffffff80550e52 in _mtx_unlock_sleep (m=0xfffffe0105ecd8f0, opts=, file=, line=) at /cache/src/sys/kern/kern_mutex.c:715
            ts = (struct turnstile *) 0x0
    #9  0xffffffff8101a0cd in smb_iod_invrq (iod=) at /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:91
            rqp = (struct smb_rq *) 0xfffffe0105ecd800
    #10 0xffffffff8101b172 in smb_iod_addrq (rqp=0xfffffe0105ecd800) at /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:418
            vcp =
            iod = (struct smbiod *) 0xfffffe009483b800
            error =
            __func__ = "uЪ", '\220'
    #11 0xffffffff81017da2 in smb_rq_simple (rqp=0xfffffe0105ecd800) at /cache/src/sys/modules/smbfs/../../netsmb/smb_rq.c:168
            vcp = (struct smb_vc *) 0xfffffe011f957000
            error =
            i = 0
    #12 0xffffffff81016202 in smb_smb_treeconnect (ssp=0xfffffe015f069200, scred=0xfffffe009483b868) at /cache/src/sys/modules/smbfs/../../netsmb/smb_smb.c:574
            vcp = (struct smb_vc *) 0xfffffe011f957000
            rq = {sr_state = 1720810032, sr_vc = 0xfffffe0002a8c490, sr_share = 0xffffff8366917a90,
              sr_mid = 40352, sr_seqno = 4294967295, sr_rseqno = 1720810112,
              sr_rq = {mb_top = 0xffffffff80574fea, mb_cur = 0x100000001, mb_mleft = 1458488464,
                mb_count = -512, mb_copy = 0xffffff8366917a80, mb_udata = 0xffffffff80755149},
              sr_rqflags = 0 '\0', sr_rqflags2 = 0, sr_wcount = 0x0, sr_bcount = 0xffffff8366917ac0,
              sr_rp = {md_top = 0xffffffff8057546d, md_cur = 0x0,
                md_pos = 0xfffffe0056eec490 "\2005л\200ЪЪЪЪ"},
              sr_rpgen = -1803307004, sr_rplast = -512, sr_flags = 1458488464, sr_rpsize = -512,
              sr_cred = 0xfffffe009483b804, sr_timo = 1458488464, sr_rexmit = -512,
              sr_sendcnt = 1720810208, sr_timesent = {tv_sec = 582, tv_nsec = -2196531595260},
              sr_lerror = 0,
              sr_rqsig = 0xffffff8366917b10 "\200{\221f\203ЪЪЪ\206╚V\200ЪЪЪЪ\200{\221f\203ЪЪЪ\035є\001\201п\a",
              sr_rqtid = 0xffffffff805a0e97, sr_rquid = 0xffffff8366917b10, sr_errclass = 1 '\001',
              sr_serror = 0, sr_error = 0, sr_rpflags = 208 'п', sr_rpflags2 = 0, sr_rptid = 0,
              sr_rppid = 0, sr_rpuid = 0, sr_rpmid = 0,
              sr_slock = {lock_object = {
                  lo_name = 0xffffff8366917b80 "Ю{\221f\203ЪЪЪ\032ґ\001\201ЪЪЪЪП{\221f\203ЪЪЪ\230╦\203\224",
                  lo_flags = 2153163654, lo_data = 4294967295, lo_witness = 0xffffff8366917b80},
                mtx_lock = 8592098960413},
              sr_t2 = 0xffffffff8102517c, sr_link = {tqe_next = 0x9483b820, tqe_prev = 0x0}}
            rqp = (struct smb_rq *) 0xfffffe0105ecd800
            mbp = (struct mbchain *) 0xfffffe0105ecd828
            pp =
            pbuf = 0x0
            encpass = 0x0
            error =
            plen = 1
            upper = 0
    #13 0xffffffff8101ad1a in smb_iod_thread (arg=) at /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:206
            iod = (struct smbiod *) 0xfffffe009483b800
    #14 0xffffffff805365df in fork_exit (callout=0xffffffff8101aa83, arg=0xfffffe009483b800, frame=0xffffff8366917c40) at /cache/src/sys/kern/kern_fork.c:992
            p = (struct proc *) 0xfffffe0181104000
            td = (struct thread *) 0xfffffe0056eec490
    #15 0xffffffff8075145e in fork_trampoline () at /cache/src/sys/amd64/amd64/exception.S:602

Looking inside smb_iod_invrq() (smb_iod.c:91), I'm wondering whether an attempt is made to invalidate/release something twice, causing turnstile_broadcast() to be invoked with ts being NULL the second time. That would explain why the first attempt to use the absent server errors out as normal, and only the second attempt panics.

My kernel is 9.1-PRERELEASE as of Dec 19. Any ideas? Thanks!

Yours,

    -mi
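
The double-release suspicion raised in the message can be illustrated with a small userland model. This is only a sketch of the general pattern, not the netsmb code: `Request`, `invalidate_unguarded`, `invalidate_guarded`, and `run_twice` are hypothetical names, and a Python `threading.Lock` stands in for the kernel mutex. It assumes the failure mode is a second unlock of a lock that is no longer held, which the backtrace suggests but does not prove.

```python
import threading


class Request:
    """Hypothetical stand-in for a queued request (not the kernel's struct smb_rq)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.invalidated = False


def invalidate_unguarded(req):
    """Release the request's lock without checking whether it is still held."""
    # threading.Lock.release() raises RuntimeError on an unlocked lock;
    # this is the userland analogue of the suspected kernel-side failure.
    req.lock.release()


def invalidate_guarded(req):
    """Release the lock at most once; return True only if this call did the work."""
    if req.invalidated:
        return False
    req.invalidated = True
    req.lock.release()
    return True


def run_twice(invalidate):
    """Invalidate one locked request twice and record how each attempt ends."""
    req = Request()
    req.lock.acquire()
    outcomes = []
    for _ in range(2):
        try:
            invalidate(req)
            outcomes.append("ok")
        except RuntimeError:
            outcomes.append("error")
    return outcomes
```

Calling `run_twice(invalidate_unguarded)` succeeds on the first pass and fails on the second, while `run_twice(invalidate_guarded)` completes both passes, because the `invalidated` flag turns the second call into a no-op. Whether the real code path needs such a guard, or has a different root cause, is not established here.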