From: "Steven Hartland"
Date: Fri, 27 Jul 2012 18:23:53 +0100
Subject: AHCI Timeout errors on Intel Patsburg
List-Id: Production branch of FreeBSD source code

We're seeing some strange timeout errors on some new Supermicro X9DRT-HF MBs we have here when combined with KINGSTON HyperX 3K SSDs. It seems that when connected to the
second channel reads often time out, stalling all I/O under 8.3-RELEASE-p3.

When this happens we see:-

Jul 27 14:35:59 lon059 kernel: ahcich1: Timeout on slot 0 port 0
Jul 27 14:35:59 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000 cmd 0004c017
Jul 27 14:37:41 lon059 kernel: ahcich1: Timeout on slot 0 port 0
Jul 27 14:37:41 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000 cmd 0004c017
Jul 27 14:38:35 lon059 kernel: ahcich1: Timeout on slot 0 port 0
Jul 27 14:38:35 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000 cmd 0004c017
Jul 27 14:39:05 lon059 kernel: ahcich1: Timeout on slot 0 port 0
Jul 27 14:39:05 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000 cmd 0004c017
Jul 27 14:39:39 lon059 kernel: ahcich1: Timeout on slot 0 port 0
Jul 27 14:39:39 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000001 rs 00000001 tfd 40 serr 00880000 cmd 0004c017
Jul 27 13:58:06 lon059 kernel: ahcich1: Timeout on slot 14 port 0
Jul 27 13:58:06 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00004000 rs 00004000 tfd 40 serr 00880000 cmd 0004ce17
Jul 27 14:21:17 lon059 kernel: ahcich1: Timeout on slot 14 port 0
Jul 27 14:21:17 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00004000 rs 00004000 tfd 40 serr 00880000 cmd 0004ce17
Jul 27 14:29:16 lon059 kernel: ahcich1: Timeout on slot 7 port 0
Jul 27 14:29:16 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00000080 rs 00000080 tfd 40 serr 00880000 cmd 0004c717
Jul 27 14:31:43 lon059 kernel: ahcich1: Timeout on slot 12 port 0
Jul 27 14:31:43 lon059 kernel: ahcich1: is 00000000 cs 00000000 ss 00001000 rs 00001000 tfd 40 serr 00880000 cmd 0004cc17

The disk on ahcich0 is identical but doesn't seem to exhibit the same problem.

I thought it might be a disk issue, even though the disks are brand new, but 2 out of the 3 machines tested have the same problem.
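For anyone reading the register dumps above, here is a minimal Python sketch that decodes one of those lines. It is not part of the driver or of any FreeBSD tool. It assumes that `ss` is the port's SActive register (PxSACT), `serr` is PxSERR and `cmd` is PxCMD, and it uses the bit names from the AHCI 1.3 specification. The names `decode`, `set_bits`, `REG_LINE` and `SERR_DIAG_BITS` are made up for this illustration.

```python
import re

# Names of the DIAG bits (16-26) of the AHCI PxSERR register, per AHCI 1.3.
SERR_DIAG_BITS = {
    16: "N: PhyRdy change",
    17: "I: Phy internal error",
    18: "W: Comm wake",
    19: "B: 10B to 8B decode error",
    20: "D: disparity error",
    21: "C: CRC error",
    22: "H: handshake error",
    23: "S: link sequence error",
    24: "T: transport state transition error",
    25: "F: unknown FIS type",
    26: "X: exchanged",
}

# Matches the register line that follows each "Timeout on slot N" message.
REG_LINE = re.compile(
    r"(?P<chan>ahcich\d+): is (?P<is>[0-9a-f]+) cs (?P<cs>[0-9a-f]+) "
    r"ss (?P<ss>[0-9a-f]+) rs (?P<rs>[0-9a-f]+) tfd (?P<tfd>[0-9a-f]+) "
    r"serr (?P<serr>[0-9a-f]+) cmd (?P<cmd>[0-9a-f]+)"
)


def set_bits(value):
    """Return the positions of all bits set in a 32-bit value."""
    return [bit for bit in range(32) if value & (1 << bit)]


def decode(line):
    """Decode one register dump line; return None if the line does not match."""
    m = REG_LINE.search(line)
    if m is None:
        return None
    regs = {k: int(v, 16) for k, v in m.groupdict().items() if k != "chan"}
    return {
        "channel": m.group("chan"),
        # Assumption: "ss" is PxSACT, one bit per outstanding NCQ tag.
        "active_slots": set_bits(regs["ss"]),
        # Assumption: "cmd" is PxCMD, whose bits 12:8 hold the current command slot.
        "current_slot": (regs["cmd"] >> 8) & 0x1F,
        # Bits without a DIAG name (the ERR half, bits 0-15) are reported by number.
        "serr_flags": [SERR_DIAG_BITS.get(b, f"bit {b}") for b in set_bits(regs["serr"])],
    }


if __name__ == "__main__":
    sample = ("Jul 27 13:58:06 lon059 kernel: ahcich1: is 00000000 cs 00000000 "
              "ss 00004000 rs 00004000 tfd 40 serr 00880000 cmd 0004ce17")
    print(decode(sample))
```

Under those assumptions, serr 00880000 has bits 19 and 23 set, which the specification names as a 10B to 8B decode error and a link sequence error, and the slot number in each "Timeout on slot N" message matches both the `ss` bit and the current command slot in `cmd`. This is only a reading of the bits and does not by itself identify the cause.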
In addition, I've not managed to reproduce the issue if I force SATA to rev 2 with:

hint.ahcich.1.sata_rev=2

The machine is running the latest SSD and machine firmware / BIOS.

Could this be an AHCI bug?

dmesg and camcontrol output:-

ahci0: port 0x9050-0x9057,0x9040-0x9043,0x9030-0x9037,0x9020-0x9023,0x9000-0x901f mem 0xdfa22000-0xdfa227ff irq 18 at device 31.2 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.30 with 6 6Gbps ports, Port Multiplier not supported
ahcich0: at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2: at channel 2 on ahci0
ahcich2: [ITHREAD]
ahcich3: at channel 3 on ahci0
ahcich3: [ITHREAD]
ahcich4: at channel 4 on ahci0
ahcich4: [ITHREAD]
ahcich5: at channel 5 on ahci0
ahcich5: [ITHREAD]

camcontrol identify ada1
pass1: ATA-8 SATA 3.x device
pass1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)

protocol              ATA/ATAPI-8 SATA 3.x
device model          KINGSTON SH103S3120G
firmware revision     501ABBF0
serial number         50026B7223027059
WWN                   50026b7223027059
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 512, offset 0
LBA supported         234441648 sectors
LBA48 supported       234441648 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6
media RPM             non-rotating

Feature                        Support  Enabled  Value            Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
overlap                        no
Tagged Command Queuing (TCQ)   no       no
Native Command Queuing (NCQ)   yes               32 tags
SMART                          yes      yes
microcode download             yes      yes
security                       yes      no
power management               yes      yes
advanced power management      yes      yes      254/0xFE
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            yes      no
write-read-verify              yes      no       0/0x0
unload                         yes      yes
free-fall                      no       no
data set management (DSM/TRIM) yes
DSM - max 512byte blocks       yes               8
DSM - deterministic read       yes               any value

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed.
In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.