From owner-freebsd-scsi@FreeBSD.ORG Mon Jun 23 10:29:44 2003
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 1EBBA37B401
    for ; Mon, 23 Jun 2003 10:29:44 -0700 (PDT)
Received: from mail.allocity.com (exchange.allocity.com [65.90.51.20])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 17C0443FAF
    for ; Mon, 23 Jun 2003 10:29:43 -0700 (PDT)
    (envelope-from bbawn@allocity.com)
Received: from allocity.com ([10.105.2.29]) by mail.allocity.com
    with Microsoft SMTPSVC(5.0.2195.5329);
    Mon, 23 Jun 2003 11:29:42 -0600
Message-ID: <3EF738CB.3090901@allocity.com>
Date: Mon, 23 Jun 2003 11:28:43 -0600
From: Bob Bawn
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.4)
    Gecko/20011019 Netscape6/6.2
X-Accept-Language: en-us
MIME-Version: 1.0
To: freebsd-scsi@freebsd.org
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 23 Jun 2003 17:29:42.0200 (UTC)
    FILETIME=[07F30B80:01C339AD]
Subject: problem with large aio_write(2)s to raw device through Compaq
    ciss driver
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Mon, 23 Jun 2003 17:29:44 -0000

Hello,

(I hope this is the right forum for this issue - I tried
freebsd-questions a couple weeks ago and got no response.)

I am running FreeBSD 4.7 on a Compaq DL 380 with a Compaq Smart Array
5i. My application accesses a raw device (e.g. /dev/da0s1g) using
aio_write(2). aio_writes of buffers larger than 224 (512-byte) blocks
but smaller than 257 blocks fail with EIO. I get the following messages
when this happens:

    bus_dmamap_load: Too many segs! buf_len = 0x3000
    ciss0: invalid command, offense size 0 at 52, value 0x0

These writes succeed on various other hardware configurations
(Dell RAID, SCSI disk, IDE disk, etc.)
so I suspect the ciss driver. Synchronous (write(2)) writes in this
size range to the raw device succeed. aio_writes to normal files
succeed.

Glancing through the ciss source, I noticed that 224 * 512 = 28 * 4096
where 28 is CISS_COMMAND_SG_LENGTH (the max number of scatter/gather
elements per command??). So maybe the write fails if the s/g vector
doesn't fit in a single command? (I am a non-expert in this area, so
this is speculative...) I don't understand why writes larger than 256
blocks succeed.

The following patch seems to fix the problem:

*** /usr/src/sys/dev/ciss/cissvar.h.orig  Mon Jun 16 14:16:39 2003
--- /usr/src/sys/dev/ciss/cissvar.h       Mon Jun 16 14:21:40 2003
***************
*** 140,146 ****
   * too small.
   */
  
! #define CISS_COMMAND_ALLOC_SIZE 512 /* XXX tune to get sensible s/g list length */
  #define CISS_COMMAND_SG_LENGTH ((CISS_COMMAND_ALLOC_SIZE - sizeof(struct ciss_command)) \
                                  / sizeof(struct ciss_sg_entry))
  
--- 140,153 ----
   * too small.
   */
  
! /*
!  * 6/16/03 bbawn - aio_write(2)s between 225 and 256 blocks (inclusive)
!  * fail with EIO with CISS_COMMAND_ALLOC_SIZE of 512. Fix (or actually kludge
!  * around this) by having room for enough scatter/gather entries to
!  * exceed 256 blocks (the max size for a SCSI WRITE(6) command??).
!  * #define CISS_COMMAND_ALLOC_SIZE 512
!  */
! #define CISS_COMMAND_ALLOC_SIZE 1024 /* XXX tune to get sensible s/g list length */
  #define CISS_COMMAND_SG_LENGTH ((CISS_COMMAND_ALLOC_SIZE - sizeof(struct ciss_command)) \
                                  / sizeof(struct ciss_sg_entry))
  

Any clues on what's going on here (and, more importantly, whether my
"fix" is adequate)? If time permits, I hope to investigate in the
debugger, but any information would be appreciated. I have a small
program that illustrates the problem - let me know if you want it.

It seems possible that I have something mis-configured.
Here are the boot messages from ciss:

Jun 6 10:00:16 queso /kernel: pci0: on pcib0
Jun 6 10:00:16 queso /kernel: ciss0: port 0x2000-0x20ff mem 0xf5ef0000-0xf5ef3fff,0xf7ec0000-0xf7efffff irq 3 at device 1.0 on pci0
Jun 6 10:00:16 queso /kernel: ciss0: using 256 of 1024 available commands
Jun 6 10:00:16 queso /kernel: ciss0: 3 logical drives configured
Jun 6 10:00:16 queso /kernel: ciss0: firmware 1.92
Jun 6 10:00:16 queso /kernel: ciss0: 2 SCSI channels
Jun 6 10:00:16 queso /kernel: ciss0: signature 'CISS'
Jun 6 10:00:16 queso /kernel: ciss0: valence 1
Jun 6 10:00:16 queso /kernel: ciss0: supported I/O methods 0xe
Jun 6 10:00:16 queso /kernel: ciss0: active I/O method 0x3
Jun 6 10:00:16 queso /kernel: ciss0: 4G page base 0x00000000
Jun 6 10:00:16 queso /kernel: ciss0: interrupt coalesce delay 1000us
Jun 6 10:00:16 queso /kernel: ciss0: interrupt coalesce count 16
Jun 6 10:00:16 queso /kernel: ciss0: max outstanding commands 1024
Jun 6 10:00:16 queso /kernel: ciss0: bus types 0x2
Jun 6 10:00:16 queso /kernel: ciss0: server name ''
Jun 6 10:00:16 queso /kernel: ciss0: heartbeat 0x10000033
Jun 6 10:00:16 queso /kernel: ciss0: 3 logical drives
Jun 6 10:00:16 queso /kernel: ciss0: logical drive 0: RAID 5, 92160MB online
Jun 6 10:00:16 queso /kernel: ciss0: logical drive 1: RAID 5, 92160MB online
Jun 6 10:00:16 queso /kernel: ciss0: logical drive 2: RAID 5, 92160MB online

Thanks,
Bob Bawn
bbawn@allocity.com