Date:      Sun, 18 Oct 2015 09:18:12 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Christian Kratzer <ck@cksoft.de>
Cc:        freebsd-stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: smbfs crashes since approx. 10.1-RELEASE
Message-ID:  <1459207327.41372204.1445174292836.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <alpine.BSF.2.20.1510161223390.47677@noc1.cksoft.de>
References:  <alpine.BSF.2.20.1510051157450.16263@noc1.cksoft.de> <358885214.31305796.1444518367048.JavaMail.zimbra@uoguelph.ca> <alpine.BSF.2.20.1510120946150.47677@noc1.cksoft.de> <alpine.BSF.2.20.1510121008010.47677@noc1.cksoft.de> <2135054744.32546564.1444653156980.JavaMail.zimbra@uoguelph.ca> <alpine.BSF.2.20.1510121552090.47677@noc1.cksoft.de> <173739656.33429352.1444704458926.JavaMail.zimbra@uoguelph.ca> <alpine.BSF.2.20.1510161223390.47677@noc1.cksoft.de>

------=_Part_41372202_2128483097.1445174292834
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Christian Kratzer wrote:
> Hi Rick,
> 
> looks like your latest patch nailed the issue. The box has been up for 3
> days:
> 
>      ck@noc3:~ % uptime
>      12:22PM  up 3 days,  4:11, 1 user, load averages: 0.07, 0.10, 0.08
>      ck@noc3:~ %
> 
> If it does not crash over the weekend this seems to be it:
> 
When I took a closer look, PR 172942 turned out to be a different crash, and
that one appears to have been fixed via r264600.

Your problem does not appear to be in the bugs database. (I will commit the
patch in mid-November anyhow, but creating a PR for this might be useful for
others.)

Btw, I think the attached patch (which includes this change) also fixes a
problem that caused a crash during mounting, reported via PR 201912.
(If you'd like to test this one, that would be appreciated. It should be
 applied to code not already patched with the one below, since the patch
 below is included in it.)
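
For anyone reading along without decoding the attachment, here is a rough user-space model of what the mount-time part of the patch is meant to do. It is only a sketch, not the kernel code: struct worker, struct conn, worker_create(), conn_disconnect() and the start_fn callback are made-up stand-ins for smbiod, smb_vc, smb_iod_create(), smb_vc_disconnect() and the thread creation call, and it assumes the worker pointer is already stored in the connection when starting the thread fails.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct smbiod and struct smb_vc. */
struct worker {
	pthread_mutex_t	rqlock;
	pthread_mutex_t	evlock;
};

struct conn {
	struct worker	*w;		/* NULL when no worker exists */
	int		requests;	/* requests handed to the worker */
};

/* Hypothetical thread starter; returns 0 or an errno value. */
typedef int (*start_fn)(struct worker *);

static int
worker_create(struct conn *c, start_fn start)
{
	struct worker *w;
	int error;

	w = malloc(sizeof(*w));
	if (w == NULL)
		return (ENOMEM);
	pthread_mutex_init(&w->rqlock, NULL);
	pthread_mutex_init(&w->evlock, NULL);
	c->w = w;	/* assumption: published before the thread starts */

	error = start(w);
	if (error != 0) {
		/* Undo everything, including the published pointer. */
		c->w = NULL;
		pthread_mutex_destroy(&w->rqlock);
		pthread_mutex_destroy(&w->evlock);
		free(w);
		return (error);
	}
	return (0);
}

static int
conn_disconnect(struct conn *c)
{
	/* Only hand the request over if a worker was actually created. */
	if (c->w != NULL)
		c->requests++;
	return (0);
}

/* Starter that always fails, to exercise the error path. */
static int
start_fails(struct worker *w)
{
	(void)w;
	return (EAGAIN);
}
```

With start_fails as the starter, worker_create() returns the error, leaves c->w NULL, and a later conn_disconnect() does nothing instead of following a pointer to freed memory.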

Thanks for your help with this, rick

> 
> ck@noc3:/usr/src % svn diff sys/netsmb/smb_iod.c
> Index: sys/netsmb/smb_iod.c
> ===================================================================
> --- sys/netsmb/smb_iod.c        (revision 289211)
> +++ sys/netsmb/smb_iod.c        (working copy)
> @@ -659,6 +659,11 @@
>                          break;
>                  tsleep(&iod->iod_flags, PWAIT, "90idle", iod->iod_sleeptimo);
>          }
> +
> +       /* We can now safely destroy the mutexes and free the iod structure. */
> +       smb_sl_destroy(&iod->iod_rqlock);
> +       smb_sl_destroy(&iod->iod_evlock);
> +       free(iod, M_SMBIOD);
>          mtx_unlock(&Giant);
>          kproc_exit(0);
>   }
> @@ -695,9 +700,6 @@
>   smb_iod_destroy(struct smbiod *iod)
>   {
>          smb_iod_request(iod, SMBIOD_EV_SHUTDOWN | SMBIOD_EV_SYNC, NULL);
> -       smb_sl_destroy(&iod->iod_rqlock);
> -       smb_sl_destroy(&iod->iod_evlock);
> -       free(iod, M_SMBIOD);
>          return 0;
>   }
> 
> ck@noc3:/usr/src %
> 
> 
> Can you get this committed to current and later to stable?
> 
> Greetings
> Christian
> 
> 
> 
> On Mon, 12 Oct 2015, Rick Macklem wrote:
> 
> > Christian Kratzer wrote:
> >> Hi Rick,
> >>
> >> On Mon, 12 Oct 2015, Rick Macklem wrote:
> >>
> >>> Christian Kratzer wrote:
> >>>> Hi Rick,
> >>>>
> >>>> there was also a second more recent crash in /var/crash
> >>>>
> >>>>      Mon Oct 12 03:01:16 CEST 2015
> >>>>
> >>>>      FreeBSD noc3.cksoft.de 10.2-STABLE FreeBSD 10.2-STABLE #2 r288980M: Sun Oct 11 08:37:40 CEST 2015
> >>>>      ck@noc3.cksoft.de:/usr/obj/usr/src/sys/NOC  amd64
> >>>>
> >>>>      panic: Assertion mtx_unowned(m) failed at /usr/src/sys/kern/kern_mutex.c:955
> >>>>
> >>> Oops, I screwed up. I should have looked at this panic assertion when you
> >>> reported it before. Ok, so if I understand the assertion correctly, it
> >>> means that another thread has the mutex locked. If this is correct, I'll
> >>> have to take another look at the code and figure out how to wait for these
> >>> other threads to finish with the mutexes.
> >>>
> >>> I do think the patch fixes the race I saw, but there must be other races in
> >>> the code.
> >>>
> >>> I'll take another look, but if anyone else is conversant with netsmb, feel
> >>> free to jump in, because it is all new to me.
> >>>
> >>> Unfortunately, I won't have any way to do testing for the next month or so,
> >>> so any patches I do come up with will be "try this untested..".
> >>
> >> That's no problem.
> >>
> >> Just keep the patches coming when you have time and tell me when to reset
> >> back to stable, current or whatever so we don't lose sync of the status.
> >>
> > Well, you can try the attached one instead of the previous ones (i.e.
> > against stable).
> > It just delays destroying the mutexes until the iod thread is exiting.
> >
> > I can't quite see why the previous patches wouldn't fix it, but this one
> > leaves smb_iod_main() unchanged, so it is a simpler patch and doesn't affect
> > semantics except for a slight delay in destroying the mutexes.
> >
> >> As it looks like the race happens on unmount, I could try putting a
> >> "sleep 60" into the script that does the "mount && rsync && umount" magic,
> >> just before the umount. That would allow anything that is slow to go away
> >> to perhaps release the mutexes before the umount.
> >>
> > If it still crashes with this patch, it might be worth a try.
> >
> > Or, if this patch still crashes, you could just delete the 3 lines that the
> > patch moves, so the mutexes are never destroyed. This would result in a leak,
> > but it would tell us if destroying these mutexes is the problem.
> >
> > Thanks for your willingness to test these, rick
> >
> >> Not a real fix of course but might help to verify what's going on.
> >>
> >> Greetings
> >> Christian
> >>
> >>
> >> --
> >> Christian Kratzer                   CK Software GmbH
> >> Email:   ck@cksoft.de               Wildberger Weg 24/2
> >> Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
> >> Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
> >> Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
> >> Web:     http://www.cksoft.de/
> >> _______________________________________________
> >> freebsd-stable@freebsd.org mailing list
> >> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
> >>
> >
> 

------=_Part_41372202_2128483097.1445174292834
Content-Type: text/x-patch; name=smbiod3.patch
Content-Disposition: attachment; filename=smbiod3.patch
Content-Transfer-Encoding: 7bit

--- smb_iod.c.orig	2015-10-10 18:53:34.000000000 -0400
+++ smb_iod.c	2015-10-16 21:14:51.000000000 -0400
@@ -659,6 +659,11 @@ smb_iod_thread(void *arg)
 			break;
 		tsleep(&iod->iod_flags, PWAIT, "90idle", iod->iod_sleeptimo);
 	}
+
+	/* We can now safely destroy the mutexes and free the iod structure. */
+	smb_sl_destroy(&iod->iod_rqlock);
+	smb_sl_destroy(&iod->iod_evlock);
+	free(iod, M_SMBIOD);
 	mtx_unlock(&Giant);
 	kproc_exit(0);
 }
@@ -685,6 +690,9 @@ smb_iod_create(struct smb_vc *vcp)
 	    RFNOWAIT, 0, "smbiod%d", iod->iod_id);
 	if (error) {
 		SMBERROR("can't start smbiod: %d", error);
+		vcp->vc_iod = NULL;
+		smb_sl_destroy(&iod->iod_rqlock);
+		smb_sl_destroy(&iod->iod_evlock);
 		free(iod, M_SMBIOD);
 		return error;
 	}
@@ -695,9 +703,6 @@ int
 smb_iod_destroy(struct smbiod *iod)
 {
 	smb_iod_request(iod, SMBIOD_EV_SHUTDOWN | SMBIOD_EV_SYNC, NULL);
-	smb_sl_destroy(&iod->iod_rqlock);
-	smb_sl_destroy(&iod->iod_evlock);
-	free(iod, M_SMBIOD);
 	return 0;
 }
 
--- smb_conn.c.sav	2015-10-16 21:09:47.000000000 -0400
+++ smb_conn.c	2015-10-16 21:10:43.000000000 -0400
@@ -683,7 +683,9 @@ int
 smb_vc_disconnect(struct smb_vc *vcp)
 {
 
-	smb_iod_request(vcp->vc_iod, SMBIOD_EV_DISCONNECT | SMBIOD_EV_SYNC, NULL);
+	if (vcp->vc_iod)
+		smb_iod_request(vcp->vc_iod, SMBIOD_EV_DISCONNECT |
+		    SMBIOD_EV_SYNC, NULL);
 	return 0;
 }
 
------=_Part_41372202_2128483097.1445174292834--


