From: Warner Losh
To: Ian Lepore
Cc: "freebsd-arm@freebsd.org"
Subject: Re: Panic mounting root on BeagleBone Black
Date: Thu, 12 Sep 2013 09:44:14 -0600
In-Reply-To: <1378997738.1111.631.camel@revolution.hippie.lan>
References: <47E403AE-01A2-4AC8-8028-41F0298FAC3E@freebsd.org> <1378997738.1111.631.camel@revolution.hippie.lan>

On Sep 12, 2013, at 8:55 AM, Ian Lepore wrote:

> On Wed, 2013-09-11 at 06:43 -0700, Tim Kientzle wrote:
>> Just built a new image for BBB from SVN r255438.
>>
>> At the second boot, I got this:
>>
>> Mounting local file systems:.
>> mmcsd0: Error indicated: 1 Timeout
>> g_vfs_done():mmcsd0s2a[READ(offset=2016903168, length=4096)]error = 5
>> vnode_pager_getpages: I/O read error
>> vm_fault: pager read error, pid 126 (ps)
>> mmcsd0: Error indicated: 1 Timeout
>> g_vfs_done():mmcsd0s2a[READ(offset=131072, length=32768)]error = 5
>> sdhci_ti0-slot0: Got data interrupt 0x00000010, but there is no active command.
>> sdhci_ti0-slot0: ============== REGISTER DUMP ==============
>> sdhci_ti0-slot0: Sys addr: 0x00000000 | Version: 0x00003101
>> sdhci_ti0-slot0: Blk size: 0x00000200 | Blk cnt: 0x00000010
>> sdhci_ti0-slot0: Argument: 0x0024679e | Trn mode: 0x0000193a
>> sdhci_ti0-slot0: Present: 0x01f70000 | Host ctl: 0x00000006
>> sdhci_ti0-slot0: Power: 0x0000000d | Blk gap: 0x00000000
>> sdhci_ti0-slot0: Wake-up: 0x00000000 | Clock: 0x00000007
>> sdhci_ti0-slot0: Timeout: 0x0000000d | Int stat: 0x00000000
>> sdhci_ti0-slot0: Int enab: 0x017f00fb | Sig enab: 0x017f00fb
>> sdhci_ti0-slot0: AC12 err: 0x00000000 | Slot int: 0x00000000
>> sdhci_ti0-slot0: Caps: 0x06e10080 | Max curr: 0x00000000
>> sdhci_ti0-slot0: ===========================================
>>
>> … few more similar messages, then …
>>
>> mmcsd0: Error indicated: 1 Timeout
>> g_vfs_done():mmcsd0s2a[WRITE(offset=20808192, length=512)]error = 5
>> g_vfs_done():mmcsd0s2a[WRITE(offset=1276346368, length=24576)]error = 5
>> panic: brelse: inappropriate B_PAGING or B_CLUSTER bp 0xcd148778
>> [bt snipped]
>>
>
> This was a single occurrence, right? Like you're not dead in the water
> or anything?
>
> There's insanity in that info... the register dump shows a multi-block
> write (8kbytes) was set up, but the command that timed out was a read.
> If a prior write had timed out why isn't there a g_vfs_done() error
> logged for it?
>
> I think what we really need is some better error recovery in the mmc and
> sd layers. Retrying a failed IO is cheap and easy. More complex
> recovery is possible too (power cycling and re-initializing the card
> and/or controller). But that has its own difficulties -- what if the
> nature of the problem was that the user swapped cards? -- you don't want
> to retry a write under those conditions.

I'd disagree with this... Retrying often is the wrong thing to do. If the
write didn't work the first time, why would it work the second? Looks like
a programming bug here in controlling the sdhci controller since we got
errors, then we got an interrupt with no pending commands. This suggests
that our timeout isn't quite right...

Warner
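
For illustration only, the trade-off argued over in this thread can be
sketched in user-space C. This is not the FreeBSD mmc/sdhci code: every
name below (io_status, io_req, io_with_retry, fake_dev and its helpers)
is hypothetical, and the "card identity" check is an assumed stand-in for
whatever the driver would use to notice a swapped card. The sketch retries
only on timeout, caps the number of retries, logs each one so a driver bug
is not silently papered over, and refuses to replay a write if the card
changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcome codes; these are not FreeBSD's MMC_ERR_* values. */
enum io_status { IO_OK, IO_TIMEOUT, IO_FAILED };

/* Hypothetical request: direction plus the card identity seen at queue time. */
struct io_req {
	bool is_write;
	unsigned card_id;
};

/* Callbacks standing in for the controller driver (assumed interface). */
typedef enum io_status (*submit_fn)(const struct io_req *req, void *ctx);
typedef unsigned (*card_id_fn)(void *ctx);

/*
 * Submit once, then retry on timeout at most max_retries times.
 * A write is never replayed if the card in the slot has changed.
 */
enum io_status
io_with_retry(const struct io_req *req, int max_retries,
    submit_fn submit, card_id_fn current_card, void *ctx)
{
	assert(max_retries >= 0);

	enum io_status st = submit(req, ctx);
	for (int attempt = 0; st == IO_TIMEOUT && attempt < max_retries; attempt++) {
		if (req->is_write && current_card(ctx) != req->card_id)
			return IO_FAILED;
		/* Log every retry so repeated timeouts stay visible. */
		fprintf(stderr, "retry %d after timeout\n", attempt + 1);
		st = submit(req, ctx);
	}
	return st;
}

/* Test double: times out timeouts_left times, then succeeds. */
struct fake_dev {
	int timeouts_left;
	unsigned card_id;	/* card currently in the slot */
	int submits;		/* number of submit calls seen */
};

enum io_status
fake_submit(const struct io_req *req, void *ctx)
{
	struct fake_dev *d = ctx;

	(void)req;
	d->submits++;
	if (d->timeouts_left > 0) {
		d->timeouts_left--;
		return IO_TIMEOUT;
	}
	return IO_OK;
}

unsigned
fake_card_id(void *ctx)
{
	return ((struct fake_dev *)ctx)->card_id;
}
```

With the fake device, a read that times out twice succeeds on the third
submit when max_retries is 3, while a write queued for one card fails
without a second submit if a different card is now present. A bounded,
logged retry like this does nothing about the underlying problem if the
timeouts come from a controller programming bug, as suggested above; it
only limits how that bug surfaces.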