Date: Tue, 4 Jun 2013 13:09:08 +0530
From: Ajit Jain <ajit.jain@cloudbyte.com>
To: freebsd-fs <freebsd-fs@freebsd.org>, Steven Hartland <killing@multiplay.co.uk>
Subject: Fwd: seeing data corruption with zfs trim functionality
Message-ID: <CAA71u6ZmZNKOHECqX=cEuVLFNfZkTCD6yUaz%2BhnG2GKsUHVp7A@mail.gmail.com>
In-Reply-To: <CAA71u6a8d5b5CdaAp50HLGmNvK7p1PBJM6yH8AisCsSf%2B8U3-A@mail.gmail.com>
References: <CAA71u6Y5dKZ9O0rqxCpx-9t7DYgTnPZSoNy-iHOnmzrOUYp%2Bvw@mail.gmail.com>
 <CAA71u6Zh7BbbdC=utqfR2MD1Nn=9euUDXHKqqu9NyBG-Jx%2B=Ow@mail.gmail.com>
 <9681E07546D348168052D4FC5365B4CD@multiplay.co.uk>
 <CAA71u6ZuO9CF0ECFS4z07-E5qPea-6SfNwkvhr_g6pFT5MV5yQ@mail.gmail.com>
 <CAA71u6YKGHDRVg6W_xnCNaA68bJvAZ2Lkp-UisiPqb1vKjJhfA@mail.gmail.com>
 <3E9CA9334E6F433A8F135ACD5C237340@multiplay.co.uk>
 <CAA71u6YZAKrmfTLU32f8UmYecmydwiqRT-OrR1ukZ9V6PGsU%2Bw@mail.gmail.com>
 <A05ACD84EB974E80B7142CE9982E479C@multiplay.co.uk>
 <93D0677B373A452BAF58C8EA6823783D@multiplay.co.uk>
 <CAA71u6bZ_4fb9FxYSwcrHBBApkZog30iQJGyTERi-xFMksud1g@mail.gmail.com>
 <35ABA7AAEB7F4D86A1ED54C4C47FEB49@multiplay.co.uk>
 <CAA71u6ahzRai=uUp5L6nDQxxEZC=d5jd4jBBfPNa2k29OwTZDg@mail.gmail.com>
 <2C2F5CAAE72B4658BFA09E4694A21375@multiplay.co.uk>
 <CAA71u6a3TJ_sO3Q%2BiJa8EHKE2iM0MKh31D37pGAoua7QU_6xYg@mail.gmail.com>
 <6E4EBFE196274519B847A47A062950EE@multiplay.co.uk>
 <CAA71u6bZqYcyW-3RAQj9zjYcWp%2BUXPa4KhH4__nY=S6EuVVR-w@mail.gmail.com>
 <F71FEDB8BA5142C5A3A0F72DF75A6421@multiplay.co.uk>
 <CAA71u6a8d5b5CdaAp50HLGmNvK7p1PBJM6yH8AisCsSf%2B8U3-A@mail.gmail.com>
[-- Attachment #1 --]
Hi Steven,

I am not able to send the full output file to freebsd-fs. I am sending only
the error file in this mail and will send you a separate mail containing the
full untar output.

regards,
ajit

---------- Forwarded message ----------
From: Ajit Jain <ajit.jain@cloudbyte.com>
Date: Mon, Jun 3, 2013 at 11:51 PM
Subject: Re: seeing data corruption with zfs trim functionality
To: Steven Hartland <killing@multiplay.co.uk>
Cc: freebsd-fs <freebsd-fs@freebsd.org>

Hi Steven,

Untarring the tarball fails with the error below:

tar: Error exit delayed from previous errors.

I have downloaded the file from the link 3 times, and every time I see the
same issue. Please find attached the tar output file and the errors (grepped
from the tar output file).

The checksum of the tarball (after unzipping, on FreeBSD) is:

root@everest:/pool_9stable/obj_src/new # cksum stable-9-r251096.tar
2972813925 3474278400 stable-9-r251096.tar

regards,
ajit

On Fri, May 31, 2013 at 4:12 AM, Steven Hartland <killing@multiplay.co.uk> wrote:

> Tar archive of /usr/src and /usr/obj with built world and GENERIC kernel
> for amd64 can be found here:-
> http://blog.multiplay.co.uk/dropzone/freebsd/stable-9-r251096.tar.gz
>
> This is based off r251096 with current proposed MFC of CAM BIO_DELETE &
> ZFS TRIM.
>
> Regards
> Steve
>
> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>
>> Hi Steven,
>>
>> That would be really great. I'll install the build you provide and can
>> quickly update the result. I feel I am asking too much of a favor
>> from you.
>>
>> thanks for the help and for bearing with me,
>> ajit
>>
>> On Wed, May 29, 2013 at 6:39 PM, Steven Hartland
>> <killing@multiplay.co.uk> wrote:
>>
>>> Unfortunately FS corruption is a serious matter, so even though I'm
>>> 99.99% convinced there isn't a problem I'd still prefer to confirm this
>>> was indeed an issue with your code base and not an issue with the
>>> current code prior to MFC'ing.
>>>
>>> Would a pre-patched stable/9 source / build help? If so I can look at
>>> making that available for you.
>>>
>>> Regards
>>> Steve
>>>
>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>
>>>> Hi Steven,
>>>>
>>>> Sorry for the long delay, and there might be a further delay.
>>>> I think the reason for the corruption was that my code
>>>> was not up to date, especially the cam directory.
>>>>
>>>> Please do not stop just because of the issue I reported.
>>>> I'll update my src tree and rerun the experiments I was running;
>>>> if I see some issue then we can probably fix the bug rather than
>>>> stopping the MFC.
>>>>
>>>> thanks,
>>>> ajit
>>>>
>>>> On Wed, May 29, 2013 at 5:19 PM, Steven Hartland
>>>> <killing@multiplay.co.uk> wrote:
>>>>
>>>>> Sorry to pester, but any update on this Ajit?
>>>>>
>>>>> I ask as it's currently blocking the MFC of TRIM to stable/8 & 9 and
>>>>> I've been unable to reproduce this issue even with your testing code
>>>>> on working FW versions.
>>>>>
>>>>> Regards
>>>>> Steve
>>>>>
>>>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>>>
>>>>>> Sure Steven,
>>>>>>
>>>>>> I'll apply the patches and update ASAP.
>>>>>>
>>>>>> thanks
>>>>>> ajit
>>>>>>
>>>>>> On Thu, May 23, 2013 at 3:03 PM, Steven Hartland
>>>>>> <killing@multiplay.co.uk> wrote:
>>>>>>
>>>>>>> I've attached the two patch sets I'm looking to MFC to stable-9; one
>>>>>>> adds BIO_DELETE CAM changes and the other is ZFS TRIM support.
>>>>>>>
>>>>>>> They should both apply cleanly to stable-9. Could you test with
>>>>>>> those on your machine and let me know?
>>>>>>>
>>>>>>> Regards
>>>>>>> Steve
>>>>>>>
>>>>>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>>>>>
>>>>>>>> Hi Steven,
>>>>>>>>
>>>>>>>> FW version on the setup is P15.
>>>>>>>> I will upgrade the FW to P16, but I think my
>>>>>>>> best bet will be to update the code base to 9 stable as, unlike you,
>>>>>>>> I was seeing corruption for all three delete methods.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> ajit
>>>>>>>>
>>>>>>>> On Sat, May 18, 2013 at 4:15 AM, Steven Hartland
>>>>>>>> <killing@multiplay.co.uk> wrote:
>>>>>>>>
>>>>>>>>> ----- Original Message ----- From: "Steven Hartland" <killing@multiplay.co.uk>
>>>>>>>>>
>>>>>>>>>> After initially seeing no issues, our overnight monitoring started
>>>>>>>>>> moaning big time on the test box. So we checked and there was zpool
>>>>>>>>>> corruption as well as a missing boot loader and a corrupt GPT, so I
>>>>>>>>>> believe we have reproduced your issue.
>>>>>>>>>>
>>>>>>>>>> After recovering the machine I created 3 pools on 3 different disks,
>>>>>>>>>> each running a different delete_method.
>>>>>>>>>>
>>>>>>>>>> We then re-ran the tests, which resulted in the pool running with
>>>>>>>>>> delete_method WS16 being so broken it had suspended IO. A reboot
>>>>>>>>>> resulted in it once again reporting no partition table via gpart.
>>>>>>>>>>
>>>>>>>>>> A third test run again produced a corrupt pool for WS16.
>>>>>>>>>>
>>>>>>>>>> I've conducted a preliminary review of the CAM WS16 code path along
>>>>>>>>>> with the SBC-3 spec, which didn't identify any obvious issues.
>>>>>>>>>>
>>>>>>>>>> Given we're both using LSI 2008 based controllers it could be a FW
>>>>>>>>>> issue specific to WS16 but that's just speculation atm, so I'll
>>>>>>>>>> continue to investigate.
>>>>>>>>>>
>>>>>>>>>> If you could re-test your end without using WS16 to see if you can
>>>>>>>>>> reproduce the problem with either UNMAP or ATA_TRIM that would be a
>>>>>>>>>> very useful data point.
>>>>>>>>>
>>>>>>>>> After much playing I narrowed down a test case of one delete which
>>>>>>>>> was causing disc corruption for us (it deleted the partition table
>>>>>>>>> instead of data in the middle of the disk).
>>>>>>>>>
>>>>>>>>> The conclusion is that an LSI 2008 HBA with FW below P13 will eat the
>>>>>>>>> data on your SATA disks if you use WS16 due to the following bug:-
>>>>>>>>> SCGCQ00230159 (DFCT) - Write same command to a SATA drive that
>>>>>>>>> doesn't support SCT write same may write wrong region.
>>>>>>>>>
>>>>>>>>> After updating here to P16, which we would generally be running (the
>>>>>>>>> test box was new and hadn't been updated yet), the corruption issue is
>>>>>>>>> no longer reproducible.
>>>>>>>>>
>>>>>>>>> So Ajit please check your FW version; I'm hoping to hear you're on
>>>>>>>>> something below P13, P12 possibly?
>>>>>>>>>
>>>>>>>>> If so then this is your issue; to fix it simply update to P16 and
>>>>>>>>> the problem should be gone.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Steve
>>>>>>>>>
>>>>>>>>> ================================================
>>>>>>>>> This e.mail is private and confidential between Multiplay (UK) Ltd.
>>>>>>>>> and the person or entity to whom it is addressed. In the event of
>>>>>>>>> misdirection, the recipient is prohibited from using, copying,
>>>>>>>>> printing or otherwise disseminating it or any information contained
>>>>>>>>> in it.
>>>>>>>>> In the event of misdirection, illegible or incomplete transmission
>>>>>>>>> please telephone +44 845 868 1337
>>>>>>>>> or return the E.mail to postmaster@multiplay.co.uk.
[-- Attachment #2 --]
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Skopje: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Skopje'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zagreb: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zagreb'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Isle_of_Man: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Isle_of_Man'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Guernsey: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Guernsey'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Jersey: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Jersey'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Podgorica: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Podgorica'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Vatican: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Vatican'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Ljubljana: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Ljubljana'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zurich: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zurich'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Rome: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Rome'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Belgrade: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Belgrade'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Prague: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Prague'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Mariehamn: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Mariehamn'
x usr/obj/usr/src/share/zoneinfo/builddir/Arctic/Longyearbyen: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Arctic/Longyearbyen'
x usr/obj/usr/src/share/zoneinfo/builddir/Asia/Istanbul: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Asia/Istanbul'
x usr/obj/usr/src/share/zoneinfo/builddir/Asia/Nicosia: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Asia/Nicosia'
x usr/obj/usr/src/share/zoneinfo/builddir/Antarctica/McMurdo: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Antarctica/McMurdo'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Lower_Princes: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Lower_Princes'
x usr/obj/usr/src/share/zoneinfo/builddir/America/New_York: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/New_York'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Kralendijk: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Kralendijk'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Marigot: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Marigot'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Shiprock: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Shiprock'
x usr/obj/usr/src/share/zoneinfo/builddir/America/St_Barthelemy: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/St_Barthelemy'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Universal: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Universal'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT0: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT0'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT+0: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT+0'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Greenwich: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Greenwich'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Zulu: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Zulu'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/UTC: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/UTC'
x usr/obj/usr/src/tmp/legacy/usr/libexec/makewhatis.local: Can't create 'usr/obj/usr/src/tmp/legacy/usr/libexec/makewhatis.local'
x usr/obj/usr/src/tmp/usr/bin/gcpp: Can't create 'usr/obj/usr/src/tmp/usr/bin/gcpp'
x usr/obj/usr/src/tmp/usr/bin/CC: Can't create 'usr/obj/usr/src/tmp/usr/bin/CC'
x usr/obj/usr/src/tmp/usr/bin/cc: Can't create 'usr/obj/usr/src/tmp/usr/bin/cc'
x usr/obj/usr/src/tmp/usr/bin/c++: Can't create 'usr/obj/usr/src/tmp/usr/bin/c++'
x usr/obj/usr/src/tmp/usr/lib/libfl.a: Can't create 'usr/obj/usr/src/tmp/usr/lib/libfl.a'
x usr/obj/usr/src/tmp/usr/lib/libln.a: Can't create 'usr/obj/usr/src/tmp/usr/lib/libln.a'
x usr/obj/usr/src/lib32/usr/lib32/libfl_p.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libfl_p.a'
x usr/obj/usr/src/lib32/usr/lib32/libfl.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libfl.a'
x usr/obj/usr/src/lib32/usr/lib32/libl.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libl.a'
x usr/obj/usr/src/lib32/usr/lib32/libln_p.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libln_p.a'
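
For anyone triaging a log like the attachment above, the failed entries can be grouped by parent directory to see where the extraction errors cluster. The following is only a minimal Python sketch, not something from the thread: the names `FAILED`, `failed_paths` and `count_by_directory` are hypothetical, the sample lines are copied from the attachment plus one made-up successful entry, and it assumes every failure line has the form `x <path>: Can't create '<path>'`.

```python
import re
from collections import Counter

# Assumed line format, taken from the attachment above:
#   x <path>: Can't create '<path>'
FAILED = re.compile(r"^x (?P<path>.+): Can't create '(?P=path)'$")


def failed_paths(lines):
    """Return the paths that tar reported it could not create."""
    paths = []
    for line in lines:
        match = FAILED.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


def count_by_directory(paths):
    """Count failed entries per parent directory."""
    return Counter(path.rsplit("/", 1)[0] for path in paths)


if __name__ == "__main__":
    # Three lines from the attachment and one hypothetical successful entry.
    sample = [
        "x usr/obj/usr/src/tmp/usr/bin/cc: Can't create 'usr/obj/usr/src/tmp/usr/bin/cc'",
        "x usr/obj/usr/src/tmp/usr/bin/c++: Can't create 'usr/obj/usr/src/tmp/usr/bin/c++'",
        "x usr/obj/usr/src/tmp/usr/lib/libfl.a: Can't create 'usr/obj/usr/src/tmp/usr/lib/libfl.a'",
        "x usr/src/Makefile",
    ]
    for directory, count in count_by_directory(failed_paths(sample)).most_common():
        print(count, directory)
```

Lines that do not match the assumed failure format, such as successfully extracted entries, are simply skipped.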
