From: Ian Lepore
To: Peter Jeremy
Cc: freebsd-arm@freebsd.org
Date: Sun, 11 Jan 2015 19:48:57 -0700
Subject: Re: read(2) into some addresses doesn't return data on RPi
Message-ID: <1421030937.14601.153.camel@freebsd.org>
In-Reply-To: <20150110060412.GE77914@server.rulingia.com>
List-Id: "Porting FreeBSD to ARM processors."

On Sat, 2015-01-10 at 17:04 +1100, Peter Jeremy wrote:
> Trying to access the boot partition using mtools consistently fails on my
> RPi because the kernel is returning NULs for the first sector. The second
> sector is correct. If I use dd(2) then the expected data is returned.
>
> This is running 11-current r276818 (but ISTR seeing it on older kernels).
>
> I did some digging and found that read(2)s of the SD card device return
> successful but do not actually write anything to the buffer for some
> addresses (and they happen to contain all NULs in mtools). This doesn't
> appear to affect reads of normal files.
>
> Running the attached program on /dev/mmcsd0s1 gave me the following results:
> - There are no partial reads. Either all 512 bytes are updated or none are.
> - There are two blocks of addresses 0xbfff0e00 thru 0xbfff0e00 and 0xbfff2e00
>   thru 0xbfff2e00 where reads work on a 32-byte alignment but not otherwise.
> - Reads consistently fail between 0xbfff1e08 and 0xbfff1ff8
> - Reads consistently fail between 0xbfff3e08 and 0xbfff3f?? (I got a hang).
> - The program never completes. In 3 runs, I've gotten:
>   - panic: null_fetch_syscall_args
>   - kernel hang
>   - panic: malloc: bad malloc type magic
>   I don't have a serial console and so can't debug kernel panics.
>
> Putting that together, it seems to related to accesses that aren't cache-line
> aligned and cross page boundaries but I'm not sure why it behaves differently
> at different page boundaries. The hangs/panics suggest that it's writing to
> random other kernel addresses instead.
>
> Does this ring a bell for anyone?
>

This turned out to be two problems, both fixed now as of r277038.

The first problem was that the driver couldn't handle a DMA transfer
split across two physically discontiguous pages. When an IO isn't
aligned to a cache line, the ARM busdma logic that auto-bounces it
inherently ends up setting up a split buffer. Since the DMA tag required
a single buffer, the mapping operation would fail with EFBIG.

The second problem was that the RPi sdhci driver was completely ignoring
the status of the busdma mapping calls, so after a failed mapping it
would do the DMA anyway, using who-knows-what for a DMA address, leading
to later panics or crashes due to corrupted memory.

So first I made it handle errors better, then I made it able to handle
an IO that crosses page boundaries.

I couldn't have done any of it without that program that recreated the
failure and confirmed the fix. Thanks, Peter!

-- Ian
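
To make the two failure modes concrete, here is a minimal user-space
sketch of the arithmetic involved. It is not the sdhci driver code and
does not call busdma. The names sketch_map() and sketch_start_io() are
hypothetical, and the 32-byte cache line and 4096-byte page size are
assumed values for the RPi. The model is deliberately simplified: an
unaligned buffer is treated as bounced one page at a time, while an
aligned buffer is assumed to map as a single segment, which is only true
when its physical pages happen to be contiguous.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for the RPi; illustrative only. */
#define SKETCH_CACHE_LINE 32u
#define SKETCH_PAGE_SIZE  4096u

/* Nonzero if both the start and the length are cache-line multiples. */
static int
is_cacheline_aligned(uintptr_t addr, size_t len)
{
	return ((addr % SKETCH_CACHE_LINE) == 0 &&
	    (len % SKETCH_CACHE_LINE) == 0);
}

/* Number of pages touched by the buffer [addr, addr + len). */
static size_t
pages_spanned(uintptr_t addr, size_t len)
{
	assert(len > 0);
	uintptr_t first = addr / SKETCH_PAGE_SIZE;
	uintptr_t last = (addr + len - 1) / SKETCH_PAGE_SIZE;
	return ((size_t)(last - first + 1));
}

/*
 * Hypothetical stand-in for a DMA mapping call.  An unaligned buffer is
 * modelled as bounced page by page, so it needs one segment per page it
 * touches.  Returns 0 on success, or EFBIG when the buffer needs more
 * segments than max_segs allows.
 */
static int
sketch_map(uintptr_t addr, size_t len, size_t max_segs, size_t *nsegs)
{
	size_t need;

	need = is_cacheline_aligned(addr, len) ? 1 : pages_spanned(addr, len);
	if (need > max_segs)
		return (EFBIG);
	*nsegs = need;
	return (0);
}

/*
 * Hypothetical IO start routine: check the mapping status first, and
 * never start a transfer when the mapping failed.
 */
static int
sketch_start_io(uintptr_t addr, size_t len, size_t max_segs)
{
	size_t nsegs;
	int error;

	error = sketch_map(addr, len, max_segs, &nsegs);
	if (error != 0)
		return (error);
	/* A real driver would program nsegs segments into the controller. */
	return (0);
}
```

In this model a 512-byte read at 0xbfff0e00 ends exactly on the page
boundary and fits in one segment. The same read at 0xbfff0e08 is
unaligned and spills into the next page, so it needs two segments: with
a one-segment limit sketch_start_io() returns EFBIG instead of starting
a transfer, and with a two-segment limit it succeeds.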