Date: Fri, 19 Oct 2012 22:38:29 -0700
From: Dennis Glatting <freebsd@pki2.com>
To: Andriy Gapon <avg@freebsd.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS hang status update
Message-ID: <1350711509.86715.59.camel@btw.pki2.com>
In-Reply-To: <1350698905.86715.33.camel@btw.pki2.com>
References: <1350698905.86715.33.camel@btw.pki2.com>

On Fri, 2012-10-19 at 19:08 -0700, Dennis Glatting wrote:
> I applied your debugging patch and that system has been running under
> load for 43 hours. I have no idea why.
>
> That said, some of my prior batch jobs have run for over a month. There
> was a time when ZFS was fairly stable but took a dive some months ago.
>

Boom. It hung at roughly 49 hours, after adding an SFTP transfer (60GB
off the disk-1 pool) and an ls (of a directory in the disk-1 pool) in a
while loop.

My pools:

mc# zpool status
  pool: disk-1
 state: ONLINE
  scan: scrub repaired 0 in 0h38m with 0 errors on Tue Oct 16 16:47:51 2012
config:

        NAME        STATE     READ WRITE CKSUM
        disk-1      ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
        cache
          da0       ONLINE       0     0     0

errors: No known data errors

  pool: disk-2
 state: ONLINE
  scan: scrub repaired 0 in 0h6m with 0 errors on Tue Oct 16 17:05:43 2012
config:

        NAME        STATE     READ WRITE CKSUM
        disk-2      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da10    ONLINE       0     0     0

errors: No known data errors

camcontrol output (statically linked and stored in an md):

mc# /mnt/camcontrol tags da0 -v
(no output. session hung.)

mc# /mnt/camcontrol tags da1 -v      (** swap disk **)
(pass1:mps0:0:5:0): dev_openings  255
(pass1:mps0:0:5:0): dev_active    0
(pass1:mps0:0:5:0): devq_openings 255
(pass1:mps0:0:5:0): devq_queued   0
(pass1:mps0:0:5:0): held          0
(pass1:mps0:0:5:0): mintags       2
(pass1:mps0:0:5:0): maxtags       255

mc# /mnt/camcontrol tags da2 -v
(pass2:mps0:0:6:0): dev_openings  245
(pass2:mps0:0:6:0): dev_active    10
(pass2:mps0:0:6:0): devq_openings 245
(pass2:mps0:0:6:0): devq_queued   0
(pass2:mps0:0:6:0): held          0
(pass2:mps0:0:6:0): mintags       2
(pass2:mps0:0:6:0): maxtags       255

mc# /mnt/camcontrol tags da3 -v
(pass3:mps0:0:7:0): dev_openings  245
(pass3:mps0:0:7:0): dev_active    10
(pass3:mps0:0:7:0): devq_openings 245
(pass3:mps0:0:7:0): devq_queued   0
(pass3:mps0:0:7:0): held          0
(pass3:mps0:0:7:0): mintags       2
(pass3:mps0:0:7:0): maxtags       255

mc# /mnt/camcontrol tags da4 -v
(pass4:mps0:0:8:0): dev_openings  245
(pass4:mps0:0:8:0): dev_active    10
(pass4:mps0:0:8:0): devq_openings 245
(pass4:mps0:0:8:0): devq_queued   0
(pass4:mps0:0:8:0): held          0
(pass4:mps0:0:8:0): mintags       2
(pass4:mps0:0:8:0): maxtags       255

mc# /mnt/camcontrol tags da5 -v
(pass5:mps0:0:9:0): dev_openings  245
(pass5:mps0:0:9:0): dev_active    10
(pass5:mps0:0:9:0): devq_openings 245
(pass5:mps0:0:9:0): devq_queued   0
(pass5:mps0:0:9:0): held          0
(pass5:mps0:0:9:0): mintags       2
(pass5:mps0:0:9:0): maxtags       255

mc# /mnt/camcontrol tags da6 -v
(pass6:mps0:0:10:0): dev_openings  245
(pass6:mps0:0:10:0): dev_active    10
(pass6:mps0:0:10:0): devq_openings 245
(pass6:mps0:0:10:0): devq_queued   0
(pass6:mps0:0:10:0): held          0
(pass6:mps0:0:10:0): mintags       2
(pass6:mps0:0:10:0): maxtags       255

mc# /mnt/camcontrol tags da7 -v
(pass7:mps0:0:11:0): dev_openings  245
(pass7:mps0:0:11:0): dev_active    10
(pass7:mps0:0:11:0): devq_openings 245
(pass7:mps0:0:11:0): devq_queued   0
(pass7:mps0:0:11:0): held          0
(pass7:mps0:0:11:0): mintags       2
(pass7:mps0:0:11:0): maxtags       255

mc# /mnt/camcontrol tags da8 -v      (** OS hdw RAID1 **)
(pass8:mps1:0:0:0): dev_openings  245
(pass8:mps1:0:0:0): dev_active    10
(pass8:mps1:0:0:0): devq_openings 245
(pass8:mps1:0:0:0): devq_queued   0
(pass8:mps1:0:0:0): held          0
(pass8:mps1:0:0:0): mintags       2
(pass8:mps1:0:0:0): maxtags       255

mc# /mnt/camcontrol tags da9 -v
(pass9:mps1:0:9:0): dev_openings  251
(pass9:mps1:0:9:0): dev_active    4
(pass9:mps1:0:9:0): devq_openings 251
(pass9:mps1:0:9:0): devq_queued   0
(pass9:mps1:0:9:0): held          0
(pass9:mps1:0:9:0): mintags       2
(pass9:mps1:0:9:0): maxtags       255

mc# /mnt/camcontrol tags da10 -v
(pass10:mps1:0:11:0): dev_openings  251
(pass10:mps1:0:11:0): dev_active    4
(pass10:mps1:0:11:0): devq_openings 251
(pass10:mps1:0:11:0): devq_queued   0
(pass10:mps1:0:11:0): held          0
(pass10:mps1:0:11:0): mintags       2
(pass10:mps1:0:11:0): maxtags       255

I did not run procstat before the reboot. I wasn't sure if that would be
redundant with the information in my prior email.

This is da0 (the cache SSD), on which camcontrol hung. It is on the same
controller.

da0 at mps0 bus 0 scbus0 target 3 lun 0
da0: <ATA M4-CT256M4SSD2 000F> Fixed Direct Access SCSI-6 device
da0: 600.000MB/s transfers
da0: Command Queueing enabled
da0: 244198MB (500118192 512 byte sectors: 255H 63S/T 31130C)

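To compare the per-device counters in the camcontrol output above without reading every block by eye, a small parser can help. The following is only a sketch in Python. It assumes the output of `camcontrol tags <dev> -v` has been saved as text, since camcontrol itself hung on da0. It also assumes the line layout is exactly as shown in this message, and that a non-zero `dev_active` means the device still has commands outstanding. The function names `parse_tags` and `busy_devices` are hypothetical and not part of any FreeBSD tool.

```python
import re
from collections import defaultdict

# Matches lines such as "(pass2:mps0:0:6:0): dev_active 10".
# The layout is assumed from the output quoted in this message.
LINE_RE = re.compile(r"\((pass\d+):(\w+):\d+:\d+:\d+\):\s+(\w+)\s+(\d+)")


def parse_tags(text):
    """Return {passN: {"controller": str, <field>: int, ...}} from saved output."""
    devices = defaultdict(dict)
    for match in LINE_RE.finditer(text):
        dev, controller, field, value = match.groups()
        devices[dev]["controller"] = controller
        devices[dev][field] = int(value)
    return dict(devices)


def busy_devices(devices):
    """Names of devices whose dev_active counter is greater than zero."""
    return sorted(d for d, f in devices.items() if f.get("dev_active", 0) > 0)


# Two excerpts copied from the output above (da1 and da2).
sample = """\
(pass1:mps0:0:5:0): dev_openings 255
(pass1:mps0:0:5:0): dev_active 0
(pass2:mps0:0:6:0): dev_openings 245
(pass2:mps0:0:6:0): dev_active 10
"""

devices = parse_tags(sample)
print(busy_devices(devices))
```

Run against the full output saved to a file, this would list every pass device that reported active commands at the time of the hang. da0 would be absent because camcontrol returned nothing for it.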
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1350711509.86715.59.camel>