From owner-freebsd-scsi@FreeBSD.ORG Sun Mar 15 04:40:37 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-scsi@FreeBSD.org
Subject: [Bug 192247] [cam][cd] READ_CD_CAPACITY could be reaped from scsi_cd.h
Date: Sun, 15 Mar 2015 04:40:37 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=192247

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-scsi@FreeBSD.org

--
You are receiving this mail because:
You are the assignee for the bug.

From owner-freebsd-scsi@FreeBSD.ORG Sun Mar 15 05:02:39 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-scsi@FreeBSD.org
Subject: [Bug 184154] [cam] QUIRK: SYNC_CACHE not supported on IBM ServeRAID 8k
Date: Sun, 15 Mar 2015 05:02:39 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184154

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-scsi@FreeBSD.org
           Keywords|                            |patch

From owner-freebsd-scsi@FreeBSD.ORG Sun Mar 15 11:58:23 2015
From: bugzilla-noreply@freebsd.org
To: freebsd-scsi@FreeBSD.org
Subject: [Bug 184154] [cam] QUIRK: SYNC_CACHE not supported on IBM ServeRAID 8k
Date: Sun, 15 Mar 2015 11:58:23 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=184154

Steven Hartland changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smh@FreeBSD.org

--- Comment #4 from Steven Hartland ---
Could you provide the full output from camcontrol identify (if it works) and
camcontrol inquiry?

The version string you've used is very wide, and disabling cache sync is not
ideal.

Also, which firmware version are you using, and is there an update? This
really should be fixed there.

From owner-freebsd-scsi@FreeBSD.ORG Mon Mar 16 08:26:29 2015
Date: Mon, 16 Mar 2015 16:26:28 +0800
Subject: Re: What does the error code 82 mean?
From: fengyd
To: Ken Merry
Cc: freebsd-scsi@freebsd.org

Hi,

    case SIR_DATA_OVERRUN:
        if (cp) {
            OUTONB (HF_PRT, HF_EXT_ERR);
            cp->xerr_status |= XE_EXTRA_DATA;
            cp->extra_bytes += INL (nc_scratcha);
            tp->tinfo.current.width = BUS_8_BIT; // Added here to trigger re-negotiation
        }
        goto out;

Before SIR_DATA_OVERRUN, tp->tinfo.current.width = tp->tinfo.goal.width =
BUS_16_BIT.
I added the code to set tp->tinfo.current.width to BUS_8_BIT, which triggers
re-negotiation on the next I/O request, and then everything is OK.
I also found that the width value for the next negotiation message is taken
from tp->tinfo.goal.width in the function sym_prepare_nego.

So, is the transfer width still 16-bit after a device reset?
Is re-negotiating with the target all we need to do?
And do you know what re-negotiation actually does?

Thanks

Br.
Yafeng

On Thu, Mar 12, 2015 at 11:21 PM, fengyd wrote:

> Hi,
>
> -Is it always an INQUIRY that is sent from UNIT1 after you reset the
> target from UNIT0?
> Yes.
>
> I did some tests:
> First, UNIT0 reset one device, UNIT1 cannot access it.
> Then, UNIT1 reset the same device, UNIT1 can access it, but UNIT0 cannot
> access it.
>
> I think device reset should restore some device parameters to the original
> values.
> Then after both of UNIT0 and UNIT1 reset the same device, they should be
> able to access the device.
> But it seems not.
>
> Do you know what device reset actually do?
>
> Thanks
>
> Br.
> Yafeng
>
>
> On Thu, Mar 5, 2015 at 12:24 AM, Ken Merry wrote:
>
>> The challenge is that the data transfer rate is reset on the target for
>> both the initiator doing the reset, and the other initiator.
>>
>> So re-negotiating from the initiator that did the reset will do no good.
>> You need to re-negotiate from the other initiator.
>>
>> You can either detect the situation from a unit attention (that you will
>> get in response from a test unit ready) returned from the target, or you
>> can communicate between the nodes so that the other node knows that it
>> needs to re-negotiate.
>>
>> Ken
>> --
>> Ken Merry
>> ken@FreeBSD.ORG
>>
>>
>>
>> On Mar 4, 2015, at 2:44 AM, fengyd wrote:
>>
>> Hi,
>>
>> The code to reset the target:
>> static void sym_reset_dev(hcb_p np, union ccb *ccb)
>> {
>>     tcb_p tp;
>>     struct ccb_hdr *ccb_h = &ccb->ccb_h;
>>
>>     if (ccb_h->target_id == np->myaddr ||
>>         ccb_h->target_id >= SYM_CONF_MAX_TARGET ||
>>         ccb_h->target_lun >= SYM_CONF_MAX_LUN) {
>>         sym_xpt_done2(np, ccb, CAM_DEV_NOT_THERE);
>>         return;
>>     }
>>
>>     tp = &np->target[ccb_h->target_id];
>>
>>     tp->to_reset = 1;
>>     sym_xpt_done2(np, ccb, CAM_REQ_CMP);
>>
>>     np->istat_sem = SEM;
>>     OUTB (nc_istat, SIGP|SEM);
>>     return;
>> }
>>
>> Can target reset set data transfer with the size provided by driver?
>>
>>
>> Thanks for your help.
>>
>> Br.
>> Yafeng
>>
>> On Wed, Mar 4, 2015 at 5:40 PM, fengyd wrote:
>>
>>> Hi,
>>>
>>> It seems that during initialization, data transfer is set as 16-bit by
>>> driver, it is set as 8-bit due to target reset.
>>> So it means default data transfer for the drive is 8-bit?
>>>
>>> -You might try seeing what the ahc(4) and ahd(4) drivers do in this
>>> situation.
>>> I didn't find the code related with ahc or ahd.
>>> Do you know in which release ahc and ahd are implemented?
>>>
>>> -If you have an idea that this may have happened, you can try doing a
>>> bus or target rescan.
>>> I just begin to study FREEBSD driver.
>>> Could you give some instructions how to do bus or target rescan?
>>>
>>> -Just out of curiosity, why are you doing multi-initiator with this
>>> hardware?
>>> Two units needs to access the device at the same time.
>>>
>>> Thanks for your help.
>>>
>>> Br.
>>> Yafeng
>>>
>>> On Wed, Mar 4, 2015 at 12:28 AM, Ken Merry wrote:
>>>
>>>> It sounds like the target reset is causing the drive to reset its
>>>> negotiation parameters, and go back to narrow SCSI.
>>>>
>>>> UNIT1 still thinks it is talking wide SCSI, but the drive is actually
>>>> talking 8 bit.  So the drive sends back the 64 bytes of inquiry data in 64
>>>> bus clocks.  The drive is only changing the bottom 8 bits, but the
>>>> controller thinks it is driving all 16, and records the top 8 bits as zeros.
>>>>
>>>> The result is that you get 64 bytes of “extra” data, and every other
>>>> byte is zero.
>>>>
>>>> So, you’ll need to figure out a way for the sym(4) driver to figure out
>>>> that the target has been reset, and re-negotiate with the drive.
>>>>
>>>> You might try seeing what the ahc(4) and ahd(4) drivers do in this
>>>> situation.  I don’t know whether or not they actually handle it, but it
>>>> might be instructive to look.
>>>>
>>>> If you have an idea that this may have happened, you can try doing a
>>>> bus or target rescan.  That may go through the domain validation path and
>>>> trigger re-negotiation with the target.
>>>>
>>>> Just out of curiosity, why are you doing multi-initiator with this
>>>> hardware?  It would probably be easier to do all of this with more modern
>>>> SAS hardware and expanders.
>>>>
>>>> Ken
>>>> --
>>>> Ken Merry
>>>> ken@FreeBSD.ORG
>>>>
>>>>
>>>>
>>>> On Mar 3, 2015, at 12:50 AM, fengyd wrote:
>>>>
>>>> Hi,
>>>>
>>>> Thanks very much for your reply.
>>>>
>>>> -How are you sending the INQUIRY command?
>>>> Yes.
>>>> -Are you sending it via the pass(4) driver?
>>>> Yes
>>>> -How many bytes are you asking for in the CDB?
>>>> 64
>>>> -How many bytes are you setting in the dxfer_len field in the CCB?
>>>> 64, but it seems the device wants to transfer 128 bytes.
>>>>
>>>> -What kind of device are you talking to?
>>>> Some kernel log:
>>>> da3 at sym1 bus 0 target 0 lun 0
>>>> da3: Fixed Direct Access SCSI-3 device
>>>> da3: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged
>>>> Queueing Enabled
>>>> da3: 70136MB (143638992 512 byte sectors: 255H 63S/T 8941C)
>>>>
>>>>
>>>> The brief connections as above:
>>>> UNIT0 can access DISK0 and DISK1 by IOC0.
>>>> UNIT1 can access DISK0 and DISK1 by IOC1.
>>>>
>>>> The problem happens when UNIT0 sends XPT_RESET_DEV to reset one disk,
>>>> UNIT1 sends INQUIRY to get the basic information from the target, but fails
>>>> to get the correct information.
>>>>
>>>> And I added some log.
>>>>
>>>>
>>>> The right information got from device:
>>>>
>>>> 00 00 03 12 5B 00 01 3A 46 55 4A 49 54 53 55 20
>>>>
>>>> 4D 42 41 33 30 37 33 4E 50 20 20 20 20 20 20 20
>>>>
>>>> 34 37 30 32 42 42 53 32 50 41 41 30 31 31 46 34
>>>>
>>>> 00 00 00 00 00 00 00 00 0F 00 00 40 0B 54 01 3C
>>>>
>>>>
>>>> The wrong information got from device:
>>>>
>>>> 00 00 00 00 03 00 12 00 5B 00 00 00 01 00 3A 00
>>>>
>>>> 46 00 55 00 4A 00 49 00 54 00 53 00 55 00 20 00
>>>>
>>>> 4D 00 42 00 41 00 33 00 30 00 37 00 33 00 4E 00
>>>>
>>>> 50 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00
>>>>
>>>>
>>>> Compared to the right log, it seems one extra byte *00* is added after
>>>> every byte.
>>>>
>>>>
>>>>
>>>> Thanks for your help.
>>>>
>>>> Br.
>>>> Yafeng
>>>>
>>>>
>>>> On Tue, Mar 3, 2015 at 2:50 PM, Kenneth D. Merry wrote:
>>>>
>>>>>
>>>>> An overrun is exactly what the comment below indicates.  It is when the
>>>>> target sends back more data than you asked for.  You will generally see it
>>>>> on commands that receive data from a target.
>>>>>
>>>>> How are you sending the INQUIRY command?  Are you sending it via the
>>>>> pass(4) driver?  How many bytes are you asking for in the CDB?  How many
>>>>> bytes are you setting in the dxfer_len field in the CCB?
>>>>>
>>>>> What kind of device are you talking to?  Obviously, you're using the sym(4)
>>>>> driver, so I'm guessing this is a parallel SCSI device (unless there is a
>>>>> virtualization stack that emulates the sym(4) hardware).
>>>>>
>>>>> Ken
>>>>>
>>>>> On Mon, Mar 02, 2015 at 15:49:57 +0800, fengyd wrote:
>>>>> > Hi,
>>>>> >
>>>>> > I found the related code in the function sym_int_sir:
>>>>> > /*
>>>>> >  * The device wants us to tranfer more data than
>>>>> >  * expected or in the wrong direction.
>>>>> >  * The number of extra bytes is in scratcha.
>>>>> >  * It is a data overrun condition.
>>>>> >  */
>>>>> > case *SIR_DATA_OVERRUN*:
>>>>> >     if (cp) {
>>>>> >         OUTONB (HF_PRT, HF_EXT_ERR);
>>>>> >         *cp->xerr_status |= XE_EXTRA_DATA;*
>>>>> >         cp->extra_bytes += INL (nc_scratcha);
>>>>> >     }
>>>>> >     goto out;
>>>>> >
>>>>> > I'm not familiar with SCSI.
>>>>> > What does DATA_OVERRUN actually mean?
>>>>> > How can it be triggered?
>>>>> > Could you give more details about it?
>>>>> >
>>>>> > Thanks for your help.
>>>>> >
>>>>> > Br.
>>>>> > Yafeng
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Sat, Feb 28, 2015 at 4:50 PM, fengyd wrote:
>>>>> >
>>>>> > > Hi,
>>>>> > >
>>>>> > > It seems the error code 82 & 3F is 0x12.
>>>>> > > And the definition of the error code in the file cam.h:
>>>>> > >     CAM_AUTOSENSE_FAIL = 0x10,/* Autosense: request sense cmd fail */
>>>>> > >     CAM_NO_HBA,               /* No HBA Detected error */
>>>>> > >     CAM_DATA_RUN_ERR,         /* Data Overrun error */
>>>>> > >
>>>>> > > So, it means data overrun error?
>>>>> > >
>>>>> > > Thanks.
>>>>> > >
>>>>> > > Br.
>>>>> > > Yafeng
>>>>> > >
>>>>> > > On Sat, Feb 28, 2015 at 4:32 PM, fengyd wrote:
>>>>> > >
>>>>> > >> Hi,
>>>>> > >>
>>>>> > >> INQUIRY command is sent to the target, but error code 82 is returned.
>>>>> > >> I added some log in the driver:
>>>>> > >> SIR_COMPLETE_ERROR
>>>>> > >> (pass0:sym0:0:0:0): sym_complete_error status = 18
>>>>> > >> (pass0:sym0:0:0:0): status = 82
>>>>> > >>
>>>>> > >> Do you know what does the error code 82 mean?
>>>>> > >>
>>>>> > >> Thanks in advance.
>>>>> > >>
>>>>> > >> Br.
>>>>> > >> Yafeng
>>>>> > >>
>>>>> > _______________________________________________
>>>>> > freebsd-scsi@freebsd.org mailing list
>>>>> > http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>>>>> > To unsubscribe, send any mail to "freebsd-scsi-unsubscribe@freebsd.org"
>>>>>
>>>>> --
>>>>> Kenneth Merry
>>>>> ken@FreeBSD.ORG

From owner-freebsd-scsi@FreeBSD.ORG Mon Mar 16 12:43:27 2015
From: Sibananda Sahu
Date: Mon, 16 Mar 2015 18:13:22 +0530
Subject: RE: Kenrel panic in bus_dmamem_alloc()
To: freebsd-scsi@freebsd.org

Hi,

Regarding this issue, I have some more information.

System info:
- FreeBSD 10.0-RELEASE
- amd64
- 4GB of memory

bus_dmamem_alloc() tries to allocate 3796992 bytes of contiguous memory,
which is exactly 927 pages.

#11 0xffffffff80acc153 in kmem_alloc_contig (vmem=0xffffffff8144bd80,
    size=3796992, flags=1, low=0, high=4294967295, alignment=4)
    at /usr/src/sys/vm/vm_kern.c:239

After loading and unloading the mrsas.ko module several times, it crashes at
bus_dmamem_alloc().

When I changed the size from 3796992 bytes to 3801088 bytes, which is exactly
928 pages, I observed the following:
I loaded mrsas.ko and unloaded it immediately.
I loaded mrsas.ko again, and the system crashed at bus_dmamem_alloc().

Following are the DMA tag and memory allocation calls:

    /*
     * Allocate Chain Frames
     */
    chain_frame_size = sc->chain_frames_alloc_sz; // chain_frame_size accounts for 3796992 bytes
    if (bus_dma_tag_create(sc->mrsas_parent_tag,
        4, 0,
        BUS_SPACE_MAXADDR_32BIT,
        BUS_SPACE_MAXADDR,
        NULL, NULL,
        chain_frame_size,
        1,
        chain_frame_size,
        BUS_DMA_ALLOCNOW,
        NULL, NULL,
        &sc->chain_frame_tag)) {
        device_printf(sc->mrsas_dev, "Cannot create chain frame tag\n");
        return (ENOMEM);
    }
    if (bus_dmamem_alloc(sc->chain_frame_tag, (void **)&sc->chain_frame_mem,
        BUS_DMA_NOWAIT, &sc->chain_frame_dmamap)) {
        device_printf(sc->mrsas_dev, "Cannot alloc chain frame memory\n");
        return (ENOMEM);
    }
    bzero(sc->chain_frame_mem, chain_frame_size);
    if (bus_dmamap_load(sc->chain_frame_tag, sc->chain_frame_dmamap,
        sc->chain_frame_mem, chain_frame_size, mrsas_addr_cb,
        &sc->chain_frame_phys_addr, BUS_DMA_NOWAIT)) {
        device_printf(sc->mrsas_dev, "Cannot load chain frame memory\n");
        return (ENOMEM);
    }

Following are the de-allocations made while unloading the driver:

    /*
     * Free chain frame memory
     */
    if (sc->chain_frame_phys_addr)
        bus_dmamap_unload(sc->chain_frame_tag, sc->chain_frame_dmamap);
    if (sc->chain_frame_mem != NULL)
        bus_dmamem_free(sc->chain_frame_tag, sc->chain_frame_mem,
            sc->chain_frame_dmamap);
    if (sc->chain_frame_tag != NULL)
        bus_dma_tag_destroy(sc->chain_frame_tag);

Why does the kernel panic when I load the driver for the second time but not
the first time?
Am I doing something wrong when allocating the DMA tag or memory?
Or am I failing to release something while unloading the driver?
Are there any memory allocation limitations from either the architecture or
the OS design?

Please suggest what to do in this case and what the reason for this behaviour
is.
Please tell me if more information is required.

I hope I will get some response this time.

Thanks,
Sibananda Sahu

*From:* Sibananda Sahu [mailto:sibananda.sahu@avagotech.com]
*Sent:* Thursday, March 12, 2015 5:34 PM
*To:* 'freebsd-scsi@freebsd.org'
*Subject:* Kenrel panic in bus_dmamem_alloc()

Hi,

Recently I was working with the mrsas(4) driver and found that, after several
operations, when I unload and reload the driver the kernel panics at the
bus_dmamem_alloc() call.

I have attached the core.txt file.

Although this is not primarily related to SCSI, I thought any help from
kernel developers would be very valuable.

Below is the backtrace info extracted from the core text file:

Unread portion of the kernel message buffer:
AVAGO MegaRAID SAS FreeBSD mrsas driver version: 06.708.09.00
mrsas0: port 0xfc00-0xfcff mem 0xdf2f0000-0xdf2fffff,0xdf300000-0xdf3fffff irq 32 at device 0.0 on pci5
mrsas0: Waiting for FW to come to ready state
mrsas0: FW now in Ready state
mrsas0: Using MSI-X with 4 number of vectors
mrsas0: FW supports <96> MSIX vector,Online CPU 4 Current MSIX <4>
mrsas0: Avago Debug: MAX sge 0x106 MAX chain frame size 0x1000
mrsas0: Allocating ver buf memory size 256
mrsas0: Allocating IO req memory 237824
mrsas0: Allocating chain frame memory 3796992

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 32
instruction pointer = 0x20:0xffffffff80ae16f3
stack pointer       = 0x28:0xfffffe01213d41d0
frame pointer       = 0x28:0xfffffe01213d4240
code segment        = base 0x0, limit 0xfffff, type 0x1b
                    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process     = 1406 (kldload)

Error while mapping shared library sections:
./mrsas.ko: No such file or directory.
Reading symbols from /boot/kernel/ums.ko.symbols...done.
Loaded symbols for /boot/kernel/ums.ko.symbols
Reading symbols from /boot/kernel/uftdi.ko.symbols...done.
Loaded symbols for /boot/kernel/uftdi.ko.symbols
Reading symbols from /boot/kernel/ucom.ko.symbols...done.
Loaded symbols for /boot/kernel/ucom.ko.symbols
Error while reading shared library symbols:
./mrsas.ko: No such file or directory.
#0  doadump (textdump=0) at pcpu.h:219
219     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump (textdump=0) at pcpu.h:219
#1  0xffffffff8033f5ae in db_dump (dummy=, dummy2=0, dummy3=0, dummy4=0x0)
    at /usr/src/sys/ddb/db_command.c:543
#2  0xffffffff8033f08d in db_command (cmd_table=)
    at /usr/src/sys/ddb/db_command.c:449
#3  0xffffffff8033ee04 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:502
#4  0xffffffff80341720 in db_trap (type=, code=0)
    at /usr/src/sys/ddb/db_main.c:231
#5  0xffffffff808b9bc3 in kdb_trap (type=9, code=0, tf=)
    at /usr/src/sys/kern/subr_kdb.c:656
#6  0xffffffff80c4b442 in trap_fatal (frame=0xfffffe01213d4120, eva=)
    at /usr/src/sys/amd64/amd64/trap.c:877
#7  0xffffffff80c4b0bf in trap (frame=)
    at /usr/src/sys/amd64/amd64/trap.c:224
#8  0xffffffff80c32d02 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:232
#9  0xffffffff80ae16f3 in vm_reserv_alloc_contig (object=0xffffffff8166d9c8,
    pindex=, npages=, low=0, high=18446735281027570048, alignment=,
    boundary=) at /usr/src/sys/vm/vm_reserv.c:252
#10 0xffffffff80ada2ee in vm_page_alloc_contig (object=0xffffffff8166d9c8,
    pindex=1183744, req=546, npages=927, low=0, high=4294967295, memattr=)
    at /usr/src/sys/vm/vm_page.c:1741
#11 0xffffffff80acc153 in kmem_alloc_contig (vmem=0xffffffff8144bd80,
    size=3796992, flags=1, low=0, high=4294967295, alignment=4)
    at /usr/src/sys/vm/vm_kern.c:239
#12 0xffffffff80d454cd in bus_dmamem_alloc (dmat=0xfffff80099ba3480,
    vaddr=0xfffffe00019aa0b8, flags=, mapp=0xfffffe00019aa0b0)
    at /usr/src/sys/x86/x86/busdma_machdep.c:551
#13 0xffffffff81a2500b in ?? ()
#14 0x0000000000000000 in ?? ()
Current language:  auto; currently minimal
(kgdb)

Can anybody suggest what might have gone wrong to cause this crash?

Thanks,
Sibananda Sahu