Date: Tue, 9 Jun 2009 20:29:35 -0700
From: Kip Macy <kmacy@freebsd.org>
To: Thomas Vogt <freebsdlists@bsdunix.ch>
Cc: FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: ZFS Crash with latest current from May 26
Message-ID: <3c1674c90906092029x5f8da830yfe741249534fe029@mail.gmail.com>
In-Reply-To: <9B07CA1C-FC50-4E85-B054-1C5DB1483115@bsdunix.ch>
References: <4A1DB307.70007@bsdunix.ch> <3c1674c90905280050i1822380cmb71b7c4dd808ac92@mail.gmail.com> <9B07CA1C-FC50-4E85-B054-1C5DB1483115@bsdunix.ch>

Just a follow-on note to people hitting this. Adding better caching and
throttling to buffer allocation *is* at the front of my short list.
Disabling compression on a file system won't necessarily help because,
unless you copy out and copy back in all the data, you still have to
double buffer and decompress on read.

Cheers,
Kip

On Thu, May 28, 2009 at 3:25 AM, Thomas Vogt <freebsdlists@bsdunix.ch> wrote:
> Hi Kip
>
> On 28.05.2009 at 09:50, Kip Macy wrote:
>>
>> Do you have compression enabled? What types of disks do you have?
>
> Yes compression is enabled at zfs. pool/cvsup has compression on. Everything
> else is "uncompressed".
>
> pool             1.2T      0B    1.2T     0%    /usr/local/data
> pool/cvsup       1.2T    6.0G    1.2T     0%    /usr/local/data/cvsup
> pool/ftp         3.3T    2.1T    1.2T    64%    /usr/local/data/ftp
> pool/portsnap    1.2T    1.3G    1.2T     0%    /usr/local/data/portsnap
> pool/www         1.2T     93M    1.2T     0%    /usr/local/data/www
>
> I use an areca controller 1230 in jbod mode (except for the main system)
>
> Disks are Seagate ST3750640NS R001
>
> da0: <Areca ARC-1230-VOL#00 R001> Fixed Direct Access SCSI-5 device
> da0: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da0: Command Queueing Enabled
> da0: 238418MB (488281088 512 byte sectors: 255H 63S/T 30394C)
> da1 at arcmsr0 bus 0 target 0 lun 1
> da1: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da1: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da1: Command Queueing Enabled
> da1: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da2 at arcmsr0 bus 0 target 0 lun 2
> da2: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da2: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da2: Command Queueing Enabled
> da2: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da3 at arcmsr0 bus 0 target 0 lun 3
> da3: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da3: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da3: Command Queueing Enabled
> da3: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da4 at arcmsr0 bus 0 target 0 lun 4
> da4: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da4: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da4: Command Queueing Enabled
> da4: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da5 at arcmsr0 bus 0 target 0 lun 5
> da5: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da5: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da5: Command Queueing Enabled
> da5: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da6 at arcmsr0 bus 0 target 0 lun 6
> da6: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da6: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da6: Command Queueing Enabled
> da6: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
> da7 at arcmsr0 bus 0 target 0 lun 7
> da7: <Seagate ST3750640NS R001> Fixed Direct Access SCSI-5 device
> da7: 166.666MB/s transfers (83.333MHz, offset 32, 16bit)
> da7: Command Queueing Enabled
> da7: 715404MB (1465149168 512 byte sectors: 255H 63S/T 91201C)
>
> Regards,
> Thomas
>
>>
>> On Wed, May 27, 2009 at 2:39 PM, Thomas Vogt <freebsdlists@bsdunix.ch> wrote:
>>>
>>> Hi
>>>
>>> I updated today to latest current (from a current system late april).
>>> Now, the system is very unstable.
>>>
>>> I started a zpool scrub and the system crashed after a few minutes.
>>>
>>> loader.conf:
>>> autoboot_delay=3
>>> beastie_disable="YES"
>>> zfs_load="YES"
>>> kern.maxfiles="65536"
>>> kern.maxproc="20480"
>>> #vfs.zfs.arc_min="64M"
>>> #vfs.zfs.arc_max="2G"
>>> vfs.zfs.prefetch_disable="1"
>>> vfs.zfs.zil_disable="1"
>>>
>>>
>>> FreeBSD 8.0-CURRENT #0: Wed May 27 20:02:24 UTC 2009
>>> root@lisa.foo.ch:/usr/obj/usr/src/sys/GENERIC amd64
>>>
>>> CPU:
>>> WARNING: WITNESS option enabled, expect reduced performance.
>>> Timecounter "i8254" frequency 1193182 Hz quality 0
>>> CPU: Intel(R) Xeon(R) CPU            5110  @ 1.60GHz (1731.57-MHz K8-class CPU)
>>>  Origin = "GenuineIntel"  Id = 0x6f6  Stepping = 6
>>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
>>> Features2=0x4e33d<SSE3,DTES64,MON,DS_CPL,VMX,TM2,SSSE3,CX16,xTPR,PDCM,DCA>
>>>  AMD Features=0x20100800<SYSCALL,NX,LM>
>>>  AMD Features2=0x1<LAHF>
>>>  TSC: P-state invariant
>>> real memory  = 4294967296 (4096 MB)
>>> avail memory = 4060078080 (3871 MB)
>>> ACPI APIC Table: <INTEL  S5000XVN>
>>> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
>>> FreeBSD/SMP: 1 package(s) x 2 core(s)
>>>  cpu0 (BSP): APIC ID:  0
>>>  cpu1 (AP): APIC ID:  1
>>> This module (opensolaris) contains code covered b
>>>
>>>
>>>
>>> DDB Output:
>>>
>>> FatKernale tlra p 12:p paage gfaeul t fwhaile uin klertnel mode
>>> c puid = 1; apic wiid t= 01
>>> hfa uthe following non-sleepable locks lth veirtual address     =l 0x0
>>> t au:l
>>>  code          = supervisor werite data, pagex noct lpresent
>>> iunstructsion pointier  = 0x2v0:0xfffeff ffsf8l08e28e22p6
>>>  stack poinmter         = 0x28:0xfufftfff80e779x27 a68r0
>>> fcrame pointer          = 0x28m:0sxfrff fff80779276e0
>>> cQode segm enbt         = buasfe 0x0f, leimrit 0xf ffff, tylpe 0x1ob
>>> c                       = DPL 0k, pres 1, long 1,  de(f32 0, gran 1
>>> parocessor eflags       = interrupt enabled, resurme, IOPL = 0
>>> ccurrent prmocess               = 5142 (mv)
>>> [thread pid 5142 tid 100159 ]
>>> Stopped at      bcopy+0x16:     repe movsq      (%rsi),%es:(%rdi)
>>>
>>> b> bt
>>> Tracing pid 5142 tid 100159 td 0xffffff0032172720
>>> bcopy() at bcopy+0x16
>>> dnode_set_blksz() at dnode_set_blksz+0x2ae
>>> dmu_object_set_blocksize() at dmu_object_set_blocksize+0x4c
>>> zfs_grow_blocksize() at zfs_grow_blocksize+0x45
>>> zfs_freebsd_write() at zfs_freebsd_write+0x9e6
>>> VOP_WRITE_APV() at VOP_WRITE_APV+0xfe
>>> vn_write() at vn_write+0x221
>>> dofilewrite() at dofilewrite+0x85
>>> kern_writev() at kern_writev+0x60
>>> write() at write+0x54
>>> syscall() at syscall+0x1bf
>>> Xfast_syscall() at Xfast_syscall+0xd0
>>> --- syscall (4, FreeBSD ELF64, write), rip = 0x80073616c, rsp = 0x7fffffffe068, rbp = 0x7fffff
>>>
>>>
>>> More debug information at
>>> http://www.bsdunix.ch/public/FreeBSD/crash_2009-May_27.txt
>>>
>>> The system was pretty stable with current from late April.
>>>
>>>
>>> Regards,
>>> Thomas
>>> _______________________________________________
>>> freebsd-current@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to
>>> "freebsd-current-unsubscribe@freebsd.org"
>>>
>>
>>
>>
>> --
>> When bad men combine, the good must associate; else they will fall one
>> by one, an unpitied sacrifice in a contemptible struggle.
>>
>>   Edmund Burke
>
> Thomas Vogt
>
>
>

-- 
When bad men combine, the good must associate; else they will fall one
by one, an unpitied sacrifice in a contemptible struggle.

  Edmund Burke