Date:      Tue, 4 Jun 2013 13:09:08 +0530
From:      Ajit Jain <ajit.jain@cloudbyte.com>
To:        freebsd-fs <freebsd-fs@freebsd.org>, Steven Hartland <killing@multiplay.co.uk>
Subject:   Fwd: seeing data corruption with zfs trim functionality
Message-ID:  <CAA71u6ZmZNKOHECqX=cEuVLFNfZkTCD6yUaz%2BhnG2GKsUHVp7A@mail.gmail.com>
In-Reply-To: <CAA71u6a8d5b5CdaAp50HLGmNvK7p1PBJM6yH8AisCsSf%2B8U3-A@mail.gmail.com>
References:  <CAA71u6Y5dKZ9O0rqxCpx-9t7DYgTnPZSoNy-iHOnmzrOUYp%2Bvw@mail.gmail.com> <CAA71u6Zh7BbbdC=utqfR2MD1Nn=9euUDXHKqqu9NyBG-Jx%2B=Ow@mail.gmail.com> <9681E07546D348168052D4FC5365B4CD@multiplay.co.uk> <CAA71u6ZuO9CF0ECFS4z07-E5qPea-6SfNwkvhr_g6pFT5MV5yQ@mail.gmail.com> <CAA71u6YKGHDRVg6W_xnCNaA68bJvAZ2Lkp-UisiPqb1vKjJhfA@mail.gmail.com> <3E9CA9334E6F433A8F135ACD5C237340@multiplay.co.uk> <CAA71u6YZAKrmfTLU32f8UmYecmydwiqRT-OrR1ukZ9V6PGsU%2Bw@mail.gmail.com> <A05ACD84EB974E80B7142CE9982E479C@multiplay.co.uk> <93D0677B373A452BAF58C8EA6823783D@multiplay.co.uk> <CAA71u6bZ_4fb9FxYSwcrHBBApkZog30iQJGyTERi-xFMksud1g@mail.gmail.com> <35ABA7AAEB7F4D86A1ED54C4C47FEB49@multiplay.co.uk> <CAA71u6ahzRai=uUp5L6nDQxxEZC=d5jd4jBBfPNa2k29OwTZDg@mail.gmail.com> <2C2F5CAAE72B4658BFA09E4694A21375@multiplay.co.uk> <CAA71u6a3TJ_sO3Q%2BiJa8EHKE2iM0MKh31D37pGAoua7QU_6xYg@mail.gmail.com> <6E4EBFE196274519B847A47A062950EE@multiplay.co.uk> <CAA71u6bZqYcyW-3RAQj9zjYcWp%2BUXPa4KhH4__nY=S6EuVVR-w@mail.gmail.com> <F71FEDB8BA5142C5A3A0F72DF75A6421@multiplay.co.uk> <CAA71u6a8d5b5CdaAp50HLGmNvK7p1PBJM6yH8AisCsSf%2B8U3-A@mail.gmail.com>


[-- Attachment #1 --]
Hi Steven,

I am not able to send the full output file to freebsd-fs.
I am sending just the error file in this mail and will
send you another mail containing the full untar output.


regards,
ajit

---------- Forwarded message ----------
From: Ajit Jain <ajit.jain@cloudbyte.com>
Date: Mon, Jun 3, 2013 at 11:51 PM
Subject: Re: seeing data corruption with zfs trim functionality
To: Steven Hartland <killing@multiplay.co.uk>
Cc: freebsd-fs <freebsd-fs@freebsd.org>


Hi Steven,


untar of the tarball is throwing the error below:
tar: Error exit delayed from previous errors.

I have downloaded the file from the link 3 times; every time I see the
same issue.
Please find the tar output file and the error file (grep from the tar
output file) attached to this mail.

The checksum of the tarball (after decompressing, on FreeBSD) is:
root@everest:/pool_9stable/obj_src/new # cksum stable-9-r251096.tar
2972813925 3474278400 stable-9-r251096.tar
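
For reference, a minimal sketch of how the error file can be split out of the verbose tar output. The file names untar.out and untar.err and the helper extract_errors are only illustrative assumptions; the "Can't create" pattern is taken from the attached error lines.

```shell
#!/bin/sh
# Sketch only: split a verbose tar log into an error-only file.
# extract_errors is a hypothetical helper, not part of tar.
# Usage: extract_errors LOGFILE ERRFILE
extract_errors() {
    log=$1
    err=$2
    # Failed entries appear in the log as "<path>: Can't create '<path>'".
    # grep exits non-zero when nothing matches, so ignore its status.
    grep "Can't create" "$log" > "$err" || true
    # Print the number of failed entries (strip wc's padding).
    wc -l < "$err" | tr -d ' '
}

# Typical use (file names are assumptions):
#   tar -xvf stable-9-r251096.tar > untar.out 2>&1
#   extract_errors untar.out untar.err
```

Comparing the cksum output above with the same command run on the sender's copy would show whether the download itself is damaged, independent of the extraction errors.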


regards,
ajit




On Fri, May 31, 2013 at 4:12 AM, Steven Hartland <killing@multiplay.co.uk> wrote:

> Tar archive of /usr/src and /usr/obj with built world and GENERIC kernel
> for amd64 can be found here:-
> http://blog.multiplay.co.uk/dropzone/freebsd/stable-9-r251096.tar.gz
>
> This is based off r251096 with current proposed MFC of CAM BIO_DELETE &
> ZFS TRIM.
>
>
>    Regards
>    Steve
> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>
>
>> Hi Steven,
>>
>> That would be really great. I'll install the build you provide and can
>> quickly update with the result. I feel I am asking too much of a favour
>> from you.
>>
>> Thanks for the help and for bearing with me,
>> ajit
>>
>>
>> On Wed, May 29, 2013 at 6:39 PM, Steven Hartland <killing@multiplay.co.uk> wrote:
>>
>>> Unfortunately FS corruption is a serious matter, so even though I'm 99.99%
>>> convinced there isn't a problem I'd still prefer to confirm this was indeed
>>> an issue with your code base and not an issue with the current code prior
>>> to MFC'ing.
>>>
>>> Would a pre-patched stable/9 source / build help? If so I can look at
>>> making that available for you.
>>>
>>>
>>>    Regards
>>>    Steve
>>>
>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>
>>>
>>>> Hi Steven,
>>>>
>>>> Sorry for the long delay, and it might be delayed even further.
>>>> I think the reason for the corruption was that my code
>>>> was not up to date, especially the cam directory.
>>>>
>>>> I request that you please do not stop just because of the issue I reported.
>>>> I'll update my src tree and rerun the experiments I was running;
>>>> if I see an issue then we can probably fix the bug rather than stopping
>>>> the MFC.
>>>>
>>>> thanks,
>>>> ajit
>>>>
>>>>
>>>>
>>>> On Wed, May 29, 2013 at 5:19 PM, Steven Hartland <killing@multiplay.co.uk> wrote:
>>>>
>>>>
>>>>> Sorry to pester, but any update on this Ajit?
>>>>>
>>>>> I ask as it's currently blocking the MFC of TRIM to stable/8 & 9 and I've
>>>>> been unable to reproduce this issue even with your testing code on working
>>>>> FW versions.
>>>>>
>>>>>
>>>>>    Regards
>>>>>    Steve
>>>>>
>>>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>>>
>>>>>
>>>>>> Sure Steven,
>>>>>>
>>>>>> I'll apply the patches and update ASAP.
>>>>>>
>>>>>> thanks
>>>>>> ajit
>>>>>>
>>>>>>
>>>>>> On Thu, May 23, 2013 at 3:03 PM, Steven Hartland <killing@multiplay.co.uk> wrote:
>>>>>>
>>>>>>
>>>>>>> I've attached the two patch sets I'm looking to MFC to stable-9; one
>>>>>>> adds the BIO_DELETE CAM changes and the other is ZFS TRIM support.
>>>>>>>
>>>>>>> They should both apply cleanly to stable-9. Could you test with
>>>>>>> those on your machine and let me know?
>>>>>>>
>>>>>>>    Regards
>>>>>>>    Steve
>>>>>>>
>>>>>>> ----- Original Message ----- From: "Ajit Jain" <ajit.jain@cloudbyte.com>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Steven,
>>>>>>>>
>>>>>>>> FW version on the setup is P15.
>>>>>>>> I will upgrade the FW to P16, but I think my
>>>>>>>> best bet will be to update the code base to 9 stable since, unlike you,
>>>>>>>> I was seeing corruption for all three delete methods.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> ajit
>>>>>>>>
>>>>>>>> On Sat, May 18, 2013 at 4:15 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> ----- Original Message ----- From: "Steven Hartland" <killing@multiplay.co.uk>
>>>>>>>>>
>>>>>>>>>> After initially seeing no issues, our overnight monitoring started
>>>>>>>>>> moaning big time on the test box. So we checked and there was zpool
>>>>>>>>>> corruption as well as a missing boot loader and a corrupt GPT, so I
>>>>>>>>>> believe we have reproduced your issue.
>>>>>>>>>>
>>>>>>>>>> After recovering the machine I created 3 pools on 3 different disks,
>>>>>>>>>> each running a different delete_method.
>>>>>>>>>>
>>>>>>>>>> We then re-ran the tests, which resulted in the pool running with
>>>>>>>>>> delete_method WS16 being so broken it had suspended IO. A reboot
>>>>>>>>>> resulted in it once again reporting no partition table via gpart.
>>>>>>>>>>
>>>>>>>>>> A third test run again produced a corrupt pool for WS16.
>>>>>>>>>>
>>>>>>>>>> I've conducted a preliminary review of the CAM WS16 code path along
>>>>>>>>>> with the SBC-3 spec, which didn't identify any obvious issues.
>>>>>>>>>>
>>>>>>>>>> Given we're both using LSI 2008 based controllers it could be a FW
>>>>>>>>>> issue specific to WS16, but that's just speculation atm, so I'll
>>>>>>>>>> continue to investigate.
>>>>>>>>>>
>>>>>>>>>> If you could re-test your end without using WS16 to see if you can
>>>>>>>>>> reproduce the problem with either UNMAP or ATA_TRIM that would be a
>>>>>>>>>> very useful data point.
>>>>>>>>>>
>>>>>>>>> After much playing I narrowed down a test case of one delete which was
>>>>>>>>> causing disc corruption for us (deleted the partition table instead of
>>>>>>>>> data in the middle of the disk).
>>>>>>>>>
>>>>>>>>> The conclusion is LSI 2008 HBA with FW below P13 will eat the data on
>>>>>>>>> your SATA disks if you use WS16 due to the following bug:-
>>>>>>>>> SCGCQ00230159 (DFCT) - Write same command to a SATA drive that doesn't
>>>>>>>>> support SCT write same may write wrong region.
>>>>>>>>>
>>>>>>>>> After updating here to P16, which we would generally be running (the
>>>>>>>>> test box was new and hadn't been updated yet), the corruption issue is
>>>>>>>>> no longer reproducible.
>>>>>>>>>
>>>>>>>>> So Ajit please check your FW version, I'm hoping to hear you're on
>>>>>>>>> something below P13, P12 possibly?
>>>>>>>>>
>>>>>>>>> If so then this is your issue; to fix simply update to P16 and the
>>>>>>>>> problem should be gone.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Regards
>>>>>>>>>    Steve
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ================================================
>>>>>>>>> This e.mail is private and confidential between Multiplay (UK) Ltd. and
>>>>>>>>> the person or entity to whom it is addressed. In the event of
>>>>>>>>> misdirection, the recipient is prohibited from using, copying, printing
>>>>>>>>> or otherwise disseminating it or any information contained in it.
>>>>>>>>> In the event of misdirection, illegible or incomplete transmission
>>>>>>>>> please telephone +44 845 868 1337
>>>>>>>>> or return the E.mail to postmaster@multiplay.co.uk.

[-- Attachment #2 --]
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Skopje: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Skopje'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zagreb: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zagreb'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Isle_of_Man: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Isle_of_Man'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Guernsey: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Guernsey'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Jersey: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Jersey'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Podgorica: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Podgorica'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Vatican: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Vatican'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Ljubljana: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Ljubljana'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zurich: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Zurich'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Rome: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Rome'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Belgrade: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Belgrade'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Prague: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Prague'
x usr/obj/usr/src/share/zoneinfo/builddir/Europe/Mariehamn: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Europe/Mariehamn'
x usr/obj/usr/src/share/zoneinfo/builddir/Arctic/Longyearbyen: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Arctic/Longyearbyen'
x usr/obj/usr/src/share/zoneinfo/builddir/Asia/Istanbul: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Asia/Istanbul'
x usr/obj/usr/src/share/zoneinfo/builddir/Asia/Nicosia: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Asia/Nicosia'
x usr/obj/usr/src/share/zoneinfo/builddir/Antarctica/McMurdo: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Antarctica/McMurdo'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Lower_Princes: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Lower_Princes'
x usr/obj/usr/src/share/zoneinfo/builddir/America/New_York: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/New_York'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Kralendijk: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Kralendijk'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Marigot: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Marigot'
x usr/obj/usr/src/share/zoneinfo/builddir/America/Shiprock: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/Shiprock'
x usr/obj/usr/src/share/zoneinfo/builddir/America/St_Barthelemy: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/America/St_Barthelemy'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Universal: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Universal'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT0: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT0'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT+0: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT+0'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/GMT'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Greenwich: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Greenwich'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/Zulu: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/Zulu'
x usr/obj/usr/src/share/zoneinfo/builddir/Etc/UTC: Can't create 'usr/obj/usr/src/share/zoneinfo/builddir/Etc/UTC'
x usr/obj/usr/src/tmp/legacy/usr/libexec/makewhatis.local: Can't create 'usr/obj/usr/src/tmp/legacy/usr/libexec/makewhatis.local'
x usr/obj/usr/src/tmp/usr/bin/gcpp: Can't create 'usr/obj/usr/src/tmp/usr/bin/gcpp'
x usr/obj/usr/src/tmp/usr/bin/CC: Can't create 'usr/obj/usr/src/tmp/usr/bin/CC'
x usr/obj/usr/src/tmp/usr/bin/cc: Can't create 'usr/obj/usr/src/tmp/usr/bin/cc'
x usr/obj/usr/src/tmp/usr/bin/c++: Can't create 'usr/obj/usr/src/tmp/usr/bin/c++'
x usr/obj/usr/src/tmp/usr/lib/libfl.a: Can't create 'usr/obj/usr/src/tmp/usr/lib/libfl.a'
x usr/obj/usr/src/tmp/usr/lib/libln.a: Can't create 'usr/obj/usr/src/tmp/usr/lib/libln.a'
x usr/obj/usr/src/lib32/usr/lib32/libfl_p.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libfl_p.a'
x usr/obj/usr/src/lib32/usr/lib32/libfl.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libfl.a'
x usr/obj/usr/src/lib32/usr/lib32/libl.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libl.a'
x usr/obj/usr/src/lib32/usr/lib32/libln_p.a: Can't create 'usr/obj/usr/src/lib32/usr/lib32/libln_p.a'
