From owner-freebsd-questions@FreeBSD.ORG Fri Jun 10 07:48:39 2011
Return-Path:
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 5CBE91065674
    for ; Fri, 10 Jun 2011 07:48:39 +0000 (UTC)
    (envelope-from ml@netfence.it)
Received: from cp-out8.libero.it (cp-out8.libero.it [212.52.84.108])
    by mx1.freebsd.org (Postfix) with ESMTP id B768D8FC0A
    for ; Fri, 10 Jun 2011 07:48:38 +0000 (UTC)
X-CTCH-Spam: Unknown
X-CTCH-RefID: str=0001.0A0B0206.4DF1CC55.00BA,ss=1,re=0.000,fgs=0
X-libjamoibt: 1555
Received: from soth.ventu (151.41.174.43) by cp-out8.libero.it (8.5.133)
    id 4DD2415403591B83 for freebsd-questions@freebsd.org;
    Fri, 10 Jun 2011 09:48:37 +0200
Received: from alamar.ventu (alamar.ventu [10.1.2.18])
    by soth.ventu (8.14.5/8.14.4) with ESMTP id p5A7mALV055678
    for ; Fri, 10 Jun 2011 09:48:10 +0200 (CEST)
    (envelope-from ml@netfence.it)
Message-ID: <4DF1CC3A.4030308@netfence.it>
Date: Fri, 10 Jun 2011 09:48:10 +0200
From: Andrea Venturoli
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; it-IT; rv:1.9.2.17) Gecko/20110429 Thunderbird/3.1.10
MIME-Version: 1.0
To: freebsd-questions@freebsd.org
References: <4DE77F01.1050900@netfence.it>
In-Reply-To: <4DE77F01.1050900@netfence.it>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.68 on 10.1.2.13
Subject: Re: Critical issues with WD green drives
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 10 Jun 2011 07:48:39 -0000

On 06/02/11 14:16, Andrea Venturoli wrote:
> Hello.
>
> In a server of mine (7.3p4/i386) I replaced a 1TB Hitachi SATA drive
> (which worked perfectly), with two brand new Western Digital 2TB disks.
> Now I'm having critical problems, ranging from the disks getting stuck,
> to the box rebooting.

Thanks to everyone who replied.

I've followed Bruce's suggestion: I booted the server into plain old DOS,
ran wdidle3.exe and "turned off" the idle timer. What the utility actually
reports is that the timer gets set to a larger value (something more than
an hour, IIRC).

Anyway, I was then able to newfs the disks and try to use them.

Still:

_ the disks will often come to a stop, blocking all I/O for some minutes,
then start working again;

_ I get a lot of messages in the logs, like:
> Jun 9 18:10:53 david kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 18:11:34 david kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 18:13:45 david kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Jun 9 18:13:45 david kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Jun 9 18:13:45 david kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Jun 9 18:13:45 david kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=0
> Jun 9 18:13:45 david kernel: ad4: FAILURE - SETFEATURES SET TRANSFER MODE status=51 error=84
> Jun 9 18:13:45 david kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=5984
> Jun 9 18:21:41 david smartd[2170]: Device: /dev/ad4, SMART Usage Attribute: 194 Temperature_Celsius changed from 118 to 117
> Jun 9 18:21:42 david smartd[2170]: Device: /dev/ad8, SMART Usage Attribute: 193 Load_Cycle_Count changed from 199 to 198
> Jun 9 18:21:42 david smartd[2170]: Device: /dev/ad8, SMART Usage Attribute: 194 Temperature_Celsius changed from 115 to 114
> Jun 9 18:32:46 david kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 18:32:46 david kernel: ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 18:32:46 david kernel: ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Jun 9 18:32:46 david kernel: ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Jun 9 18:32:46 david kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Jun 9 18:32:46 david kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=56677632
> Jun 9 18:32:46 david kernel: ad4: FAILURE - SETFEATURES SET TRANSFER MODE status=51 error=84
> Jun 9 18:32:46 david kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=56677664
> Jun 9 20:23:24 david kernel: ad4: 1907729MB at ata2-master SATA300
> Jun 9 20:23:24 david kernel: ad8: 1907729MB at ata4-master SATA300
> Jun 9 20:23:24 david kernel: GEOM_STRIPE: Disk ad4 attached to backup.
> Jun 9 20:23:24 david kernel: GEOM_STRIPE: Disk ad8 attached to backup.
> Jun 9 20:23:24 david kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 20:23:24 david kernel: ad8: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
> Jun 9 20:23:24 david kernel: ad8: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
> Jun 9 20:23:24 david kernel: ad8: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
> Jun 9 20:23:24 david kernel: ad8: WARNING - SET_MULTI taskqueue timeout - completing request directly
> Jun 9 20:23:24 david kernel: ad8: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=494338464
> Jun 9 20:23:24 david kernel: ad8: FAILURE - SETFEATURES SET TRANSFER MODE status=51 error=84
> Jun 9 20:23:24 david kernel: ad8: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=494338464

_ I have already had one crash:
> # kgdb kernel.debug /var/crash/vmcore.15
> ...
> Loaded symbols for /boot/kernel/acpi.ko
> #0 doadump () at pcpu.h:196
> 196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) bt
> #0 doadump () at pcpu.h:196
> #1 0xc0563d48 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2 0xc0564025 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3 0xc0732764 in trap_fatal (frame=0xc4dbec50, eva=0) at /usr/src/sys/i386/i386/trap.c:950
> #4 0xc07329b4 in trap_pfault (frame=0xc4dbec50, usermode=0, eva=0) at /usr/src/sys/i386/i386/trap.c:863
> #5 0xc0733351 in trap (frame=0xc4dbec50) at /usr/src/sys/i386/i386/trap.c:541
> #6 0xc0718abb in calltrap () at /usr/src/sys/i386/i386/exception.s:166
> #7 0xc050f18b in g_bioq_first (bq=0xc07dc100) at /usr/src/sys/geom/geom_io.c:105
> #8 0xc050f9cc in g_io_schedule_down (tp=0xc5084d80) at /usr/src/sys/geom/geom_io.c:484
> #9 0xc05100e8 in g_down_procbody () at /usr/src/sys/geom/geom_kern.c:118
> #10 0xc053e9a1 in fork_exit (callout=0xc051007a , arg=0x0, frame=0xc4dbed38) at /usr/src/sys/kern/kern_fork.c:811
> #11 0xc0718b30 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:271

Is there anything else I can try?
Patches? Knobs? Settings?

Any help is appreciated.

bye & Thanks
av.
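
To see at a glance which disk produces which kind of kernel message in logs like the ones quoted above, something along these lines might help. It is only a minimal sketch: the regular expression is inferred from the log excerpt in this message alone, and the name summarize_ata_log is purely illustrative, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Jun 9 18:13:45 david kernel: ad4: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=0
# Assumption: the pattern is derived only from the excerpt in this message;
# other FreeBSD releases or drivers may format these lines differently.
ATA_LINE = re.compile(
    r"kernel: (?P<dev>ad\d+): (?P<kind>WARNING|TIMEOUT|FAILURE) - "
)


def summarize_ata_log(lines):
    """Count (device, kind) pairs for WARNING/TIMEOUT/FAILURE kernel lines.

    Lines that do not match (smartd output, probe messages, GEOM notices)
    are ignored.
    """
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if match:
            counts[(match.group("dev"), match.group("kind"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python ata_summary.py < /var/log/messages
    for (dev, kind), n in sorted(summarize_ata_log(sys.stdin).items()):
        print(f"{dev} {kind}: {n}")
```

Counting per device makes it easier to tell whether both ad4 and ad8 misbehave equally or whether one of them dominates.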