Date: Wed, 21 Jan 2009 12:28:42 -0700 (MST)
From: "M. Warner Losh" <imp@bsdimp.com>
To: mav@FreeBSD.org
Cc: freebsd-arm@FreeBSD.org
Subject: Re: Mount root from SD card?
Message-ID: <20090121.122842.-1582190967.imp@bsdimp.com>
In-Reply-To: <49776734.8030805@FreeBSD.org>
References: <20090121.100533.-1955669401.imp@bsdimp.com> <20090121.101459.2022307528.imp@bsdimp.com> <49776734.8030805@FreeBSD.org>

In message: <49776734.8030805@FreeBSD.org>
            Alexander Motin <mav@FreeBSD.org> writes:
: M. Warner Losh wrote:
: > In message: <20090121.100533.-1955669401.imp@bsdimp.com>
: >             "M. Warner Losh" <imp@bsdimp.com> writes:
: > : In message: <4977500A.7060902@bulinfo.net>
: > :             Krassimir Slavchev <krassi@bulinfo.net> writes:
: > : : -----BEGIN PGP SIGNED MESSAGE-----
: > : : Hash: SHA1
: > : :
: > : : M. Warner Losh wrote:
: > : : > In message: <4977236E.2020409@bulinfo.net>
: > : : >             Krassimir Slavchev <krassi@bulinfo.net> writes:
: > : : > Boot with verbose messages is here:
: > : : >
: > : : > http://mnemonic.bulinfo.net/~krassi/ARM/arm.verbose
: > : : >
: > : : >> This looks very similar to the data corruption I saw when I had
: > : : >> enabled multiblock read. To track this down, we're going to have to
: > : : >> print the actual data returned for each sector...
: > : : >
: > : : >> Warner
: > : :
: > : :
: > : : Here is a dump of data right after the byte swapping in
: > : : at91_mci_read_done():
: > : :
: > : : http://mnemonic.bulinfo.net/~krassi/ARM/sd.dump
: > : :
: > : : and here is the first 1M of the SD card:
: > : :
: > : : http://mnemonic.bulinfo.net/~krassi/ARM/sd.bin
: > :
: > : Looks like we're getting some data corruption:
: > :
: > : CMD: 11 ARG 0 len 512
: > :
: > : ff ff ff ff fc 31 c0 8e c0 8e d8 8e d0 bc 00 7c
: > : 89 e6 bf 00 06 b9 00 01 f3 a5 89 fd b1 08 f3 ab
: > : fe 45 f2 e9 00 8a f6 46 bb 20 75 08 84 d2 78 07
: > : 80 4e bb 40 8a 56 ba 88 56 00 e8 fc 00 52 bb c2
: > : ...
: > :
: > : and then:
: > :
: > : CMD: 11 ARG 0 len 512
: > :
: > : 00 00 55 aa fc 31 c0 8e c0 8e d8 8e d0 bc 00 7c
: > : 89 e6 bf 00 06 b9 00 01 f3 a5 89 fd b1 08 f3 ab
: > : fe 45 f2 e9 00 8a f6 46 bb 20 75 08 84 d2 78 07
: > : 80 4e bb 40 8a 56 ba 88 56 00 e8 fc 00 52 bb c2
: > : ...
: > :
: > : So it looks like the first 4 bytes are corrupted on the read.
: > : If you
: > : look closely at the data on the device, you'll see that 'fc 31 c0 8e'
: > : are the first 4 bytes of the reads are the 'left over' data from prior
: > : data streams. This didn't used to be the case in the prior code
: > : before the recent changes. The only way we're going to find the bad
: > : change is to do a binary search on the svn changes to find out where
: > : we go off the rails. This problem seems familiar to me, but I can't
: > : quite put my finger on what the root-cause was last time I had it.
: >
: > I should have said 'fc 31 c0 8e' are the first four bytes of the data
: > on the device, and 'ff ff ff ff' and '00 00 55 aa' are the leftover
: > data which is corrupting things. The latter is actually the last 4
: > bytes of the block, which indicates that our PMC usage has stopped too
: > soon, or that we have left over PMC data from a previous "read" that
: > didn't specify enough data to be transferred. I suspect that we're
: > sending a command down and not expecting enough data. On other
: > bridges we toss the data harmlessly. On at91, the data is still in
: > the FIFO for the mci device, so we see it first on the next read. At
: > least that's the theory that just popped into my head, and also the
: > root-cause that I now recall from before when I saw similar
: > problems...
: >
: > Of course, given the number of transfers that had a lot of 'ff' in
: > them, maybe the PMC is transferring data that doesn't really exist
: > yet...
: >
: > Warner
:
: This part looks quite strange for a plain FIFO explanation. Several
: consecutive commands give different results:
:
: CMD: 37 ARG 10000 len 0
: RES: 0
: CMD: d ARG 0 len 64
:
: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 28
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: ...
: RES: 2
: CMD: 37 ARG 10000 len 0
: RES: 0
: CMD: d ARG 0 len 64
:
: ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 00
: 00 00 00 28 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: ...
: RES: 2
: CMD: 37 ARG 10000 len 0
: RES: 0
: CMD: d ARG 0 len 64
:
: ff ff ff ff 00 00 00 00 00 00 00 28 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: ...
: RES: 2
: CMD: 37 ARG 10000 len 0
: RES: 0
: CMD: d ARG 0 len 64
:
: ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 28
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
: ...
: RES: 2
:
: While working on the sdhci driver for an ENE chip I have found that for
: short transfers (less than 1K) the DMA engine returns a set of zeroes
: instead of data. I haven't found a better solution and am just handling
: short transfers by PIO. The same problem exists for PIO also, but there
: it was masked by adding a short delay before reading from the port.

I'll have to take a look at the code in more detail to make sure that
we're doing the right thing. I noticed all the ff's, but didn't
notice until now that they were shifted the same way that the data
blocks were later. In this case, you'll see there's three of them. I
believe that this is the first use of a CMD that generates data that
isn't a full block of data...

Warner
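
For anyone following along, the leftover-FIFO theory above can be
illustrated with a small toy model. This is only a sketch under stated
assumptions: ToyFifo and leading_shift are hypothetical names, the
"sector" is synthetic (it reuses the first 12 bytes from the dump and
ends in 00 00 55 aa), and the 508-byte under-read is an assumed trigger,
not something confirmed in at91_mci.

```python
from collections import deque


class ToyFifo:
    """Hypothetical byte FIFO; a toy model, not the at91 MCI hardware."""

    def __init__(self):
        self._buf = deque()

    def card_sends(self, data: bytes) -> None:
        # Data arriving from the card is queued behind anything stale.
        self._buf.extend(data)

    def host_reads(self, nbytes: int) -> bytes:
        # The host drains at most nbytes from the front of the queue.
        n = min(nbytes, len(self._buf))
        return bytes(self._buf.popleft() for _ in range(n))


def leading_shift(observed: bytes, expected: bytes) -> int:
    """Smallest k such that observed[k:] equals the start of expected.

    Returns -1 if no such k exists. Matches found only at large k rest
    on very few bytes and should be treated with suspicion.
    """
    for k in range(len(observed)):
        tail = observed[k:]
        if expected[:len(tail)] == tail:
            return k
    return -1


# Synthetic 512-byte sector: head bytes taken from the dump in the mail,
# zero padding, and a 55 aa signature so the last 4 bytes are 00 00 55 aa.
HEAD = bytes.fromhex("fc31c08ec08ed88ed0bc007c")
SECTOR = HEAD + bytes(512 - len(HEAD) - 2) + b"\x55\xaa"

fifo = ToyFifo()

# Assumed trigger: the host stops 4 bytes early on the first transfer.
fifo.card_sends(SECTOR)
first = fifo.host_reads(508)

# The next full-length read starts with the 4 stale bytes.
fifo.card_sends(SECTOR)
second = fifo.host_reads(512)

# The stale tail never drains, so the following read is shifted too.
fifo.card_sends(SECTOR)
third = fifo.host_reads(512)

print(second[:8].hex(" "))
print(leading_shift(second, SECTOR), leading_shift(third, SECTOR))
```

In this model the shift stays at 4 bytes on every later read. The
CMD d dumps quoted above show 4, 8 and 12 leading ff bytes instead,
which is the part a plain fixed-leftover model does not reproduce.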