Date:      Tue, 1 Jul 2008 14:58:53 +0200
From:      Søren Schmidt <sos@FreeBSD.ORG>
To:        "Daniel Eriksson" <daniel_k_eriksson@telia.com>
Cc:        legioner.r@gmail.com, morten@lightworkings.dk, freebsd-stable@FreeBSD.ORG
Subject:   Re: MCP55 SATA data corruption in FreeBSD 7
Message-ID:  <7ABD8B47-2ECD-457E-908D-E0BED4C6AE56@FreeBSD.ORG>
In-Reply-To: <4F9C9299A10AE74E89EA580D14AA10A61A1968@royal64.emp.zapto.org>
References:  <4F9C9299A10AE74E89EA580D14AA10A61A1968@royal64.emp.zapto.org>

Hi

OK, the only "modern" nVidia board I have is MCP51-based; however, it
uses the same code path as the MCP55.
Anyhow, there have been fixes for these in -current that are not in any
of the RELENG branches yet.

Please try the attached patch, or even better try a -current kernel.

-Søren


Index: ata-chipset.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ata/ata-chipset.c,v
retrieving revision 1.202.2.7
diff -u -r1.202.2.7 ata-chipset.c
--- ata-chipset.c	1 Apr 2008 15:20:49 -0000	1.202.2.7
+++ ata-chipset.c	1 Jul 2008 12:57:01 -0000
@@ -3147,15 +3475,23 @@
     struct ata_channel *ch = device_get_softc(dev);
     int offset = ctlr->chip->cfg2 & NV4 ? 0x0440 : 0x0010;
     int shift = ch->unit << (ctlr->chip->cfg2 & NVQ ? 4 : 2);
-    u_int32_t istatus = ATA_INL(ctlr->r_res2, offset);
+    u_int32_t istatus;
+
+    /* get interrupt status */
+    if (ctlr->chip->cfg2 & NVQ)
+	istatus = ATA_INL(ctlr->r_res2, offset);
+    else
+	istatus = ATA_INB(ctlr->r_res2, offset);
 
     /* do we have any PHY events ? */
     if (istatus & (0x0c << shift))
 	ata_sata_phy_check_events(dev);
 
     /* clear interrupt(s) */
-    ATA_OUTB(ctlr->r_res2, offset,
-	     (0x0f << shift) | (ctlr->chip->cfg2 & NVQ ? 0x00f000f0 : 0));
+    if (ctlr->chip->cfg2 & NVQ)
+	ATA_OUTL(ctlr->r_res2, offset, (0x0f << shift) | 0x00f000f0);
+    else
+	ATA_OUTB(ctlr->r_res2, offset, (0x0f << shift));
 
     /* do we have any device action ? */
     return (istatus & (0x01 << shift));
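
The bit arithmetic in the patch above can be checked outside the kernel.
What follows is only a minimal sketch under stated assumptions, not driver
code: SKETCH_NVQ is a hypothetical stand-in for the driver's NVQ flag (its
real value is not shown in the patch), and all sketch_* names are invented
for illustration. It computes the per-channel shift and the interrupt-clear
mask, and models a one-byte write as a cast to uint8_t, which is presumably
why the old ATA_OUTB call could not deliver the 0x00f000f0 part of the mask.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's NVQ flag; the real value is
 * defined in the FreeBSD ata(4) sources and is not shown in the patch. */
#define SKETCH_NVQ 0x01u

/* Bit position of a channel's status bits: unit << 4 with NVQ set,
 * unit << 2 otherwise, mirroring the expression in the patch. */
static unsigned
sketch_shift(unsigned unit, unsigned cfg2)
{
    return unit << ((cfg2 & SKETCH_NVQ) ? 4 : 2);
}

/* Mask written to clear interrupts, as in the patched code: four bits
 * for the channel, plus 0x00f000f0 when NVQ is set. */
static uint32_t
sketch_clear_mask(unsigned unit, unsigned cfg2)
{
    uint32_t mask = (uint32_t)0x0f << sketch_shift(unit, cfg2);

    if (cfg2 & SKETCH_NVQ)
        mask |= 0x00f000f0u;
    return mask;
}

/* Model of a one-byte register write: only the low 8 bits survive. */
static uint8_t
sketch_byte_write(uint32_t value)
{
    return (uint8_t)value;
}

/* PHY event test from the patch: bits 0x0c of the channel's field. */
static int
sketch_phy_event(uint32_t istatus, unsigned unit, unsigned cfg2)
{
    return (istatus & ((uint32_t)0x0c << sketch_shift(unit, cfg2))) != 0;
}

/* Device action test from the patch: bit 0x01 of the channel's field. */
static int
sketch_device_action(uint32_t istatus, unsigned unit, unsigned cfg2)
{
    return (istatus & ((uint32_t)0x01 << sketch_shift(unit, cfg2))) != 0;
}
```

With NVQ set, the mask for unit 1 is 0x00ff00f0, but a one-byte write keeps
only 0xf0. Without NVQ, the mask for either unit fits in a single byte,
which matches the patch using ATA_INB/ATA_OUTB on that path and
ATA_INL/ATA_OUTL on the NVQ path.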

On 1 Jul 2008, at 11:01, Daniel Eriksson wrote:

>
> I am having problems with silent data corruption on (some) drives
> connected to an MCP55 SATA controller.
>
> I have two servers, both running RELENG_7_0/amd64. One has the 570 Ultra
> chipset, the other has 570 SLI. Both chipsets have the MCP55 SATA
> controller.
>
> The server with 570 Ultra chipset has a bunch of older 250GB SATA-150
> drives hooked up to the MCP55 controller and it is working just fine.
> The server with 570 SLI chipset has a bunch of new SATA-300 drives
> hooked up to the MCP55 controller and it is giving me silent data
> corruption (easily detectable by running ZFS scrub: every time I run it,
> new checksum errors show up). I know the drives are good because when
> they are hooked up to another controller they work just fine.
>
> Unfortunately the drives do not have a jumper for setting SATA-150
> speed (they are Samsung 1 TB drives), and trying to force the drives to
> SATA-150 speed with the "patch" provided by the manufacturer does not
> seem to work (the drives still negotiate SATA-300 speed). I will try to
> get my hands on another older SATA-150 drive (or a new one that can be
> jumpered) to verify whether the culprit is the MCP55 revision (see below)
> or the interface speed.
>
>
> NOT working (570 SLI)
> ---------------------
> atapci1@pci0:0:5:0:     class=0x010185 card=0x72501462 chip=0x037f10de
> rev=0xa2 hdr=0x00
>    vendor     = 'Nvidia Corp'
>    device     = 'MCP55 SATA Controller'
>    class      = mass storage
>    subclass   = ATA
>
> Working (570 Ultra)
> -------------------
> atapci1@pci0:0:5:0:     class=0x010185 card=0xcb8410de chip=0x037f10de
> rev=0xa3 hdr=0x00
>    vendor     = 'Nvidia Corp'
>    device     = 'MCP55 SATA Controller'
>    class      = mass storage
>    subclass   = ATA
>
> This is most likely related to kern/120296
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/120296) and kern/121396
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/121396).
>
>
> If someone else is having data corruption problems with drives connected
> to an MCP55 controller it might be worth testing if limiting the drives
> to SATA-150 makes a difference. It will most likely take me a while
> before I can verify this.
>
> ---
> Daniel Eriksson (http://www.toomuchdata.com/)
>

-Søren






