Date:        Wed, 15 Aug 2018 09:31:03 +0300
From:        Toomas Soome <tsoome@me.com>
To:          tech-lists <tech-lists@zyxst.net>
Cc:          freebsd-current@freebsd.org
Subject:     Re: boot errors since upgrading to 12-current
Message-ID:  <C3AC526B-56DD-4273-A3FB-7BBB472563E5@me.com>
In-Reply-To: <24f2e3f5-67b3-a5ac-8394-a7b5ecd0ce39@zyxst.net>
References:  <f3cb9196-0e89-6c4e-5e8f-d3c4e48e16dc@zyxst.net> <22F5A9FD-3167-4029-8CFF-B4096E9E69BB@me.com> <24f2e3f5-67b3-a5ac-8394-a7b5ecd0ce39@zyxst.net>
> On 15 Aug 2018, at 06:06, tech-lists <tech-lists@zyxst.net> wrote:
> 
> On 14/08/2018 21:16, Toomas Soome wrote:
>>> On 14 Aug 2018, at 22:37, tech-lists <tech-lists@zyxst.net> wrote:
>>> Hello,
>>> 
>>> context: amd64, FreeBSD 12.0-ALPHA1 #0 r337682, ZFS. The system is
>>> *not* root-on-zfs. It boots from an SSD. The three disks indicated
>>> below are spinning rust.
>>> 
>>>   NAME        STATE   READ WRITE CKSUM
>>>   storage     ONLINE     0     0     0
>>>     raidz1-0  ONLINE     0     0     0
>>>       ada1    ONLINE     0     0     0
>>>       ada2    ONLINE     0     0     0
>>>       ada3    ONLINE     0     0     0
>>> 
>>> This machine was running 11.2 up until about a month ago.
>>> Recently I've seen this flash up on the screen before getting to
>>> the beastie screen:
>>> 
>>> BIOS drive C: is disk0
>>> BIOS drive D: is disk1
>>> BIOS drive E: is disk2
>>> BIOS drive F: is disk3
>>> BIOS drive G: is disk4
>>> BIOS drive H: is disk5
>>> BIOS drive I: is disk6
>>> BIOS drive J: is disk7
>>> 
>>> [the above is normal and has always been seen on every boot]
>>> 
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> read 1 from 0 to 0xcbdb1330, error: 0x31
>>> 
>>> the above has been happening since upgrading to -current a month
>>> ago
>>> 
>>> ZFS: i/o error - all block copies unavailable
>>> ZFS: can't read MOS of pool storage
>>> 
>>> the above is alarming and has been happening for the past couple of
>>> days, since upgrading to r337682 on the 12th of August.
>>> 
>>> The beastie screen then loads and it boots normally.
>>> Should I be concerned? Is the output indicative of a problem?
>> 
>> Not immediately, and yes. In the BIOS loader we do all disk I/O with
>> INT13, and error 0x31 usually hints at missing media or some other
>> controller-related error. Could you paste the output of the loader's
>> lsdev -v?
>> 
>> The drive list appears as a result of probing the disks in
>> biosdisk.c. The read errors come from an attempt to read 1 sector
>> from sector 0 (that is, to read the partition table from the disk).
>> Why this ends with an error would be interesting to know;
>> unfortunately the error does not tell us which disk was probed.
> 
> Hi Toomas, thanks for looking at this.
> 
> lsdev -v looks like this:
> 
> OK lsdev -v
> disk devices:
>   disk0:   BIOS drive C (16514064 X 512):
>     disk0s1: FreeBSD  111GB
>       disk0s1a: FreeBSD UFS  108GB
>       disk0s1b: FreeBSD swap  3881MB
>   disk1:   BIOS drive D (16514064 X 512):
>   disk2:   BIOS drive E (16514064 X 512):
>   disk3:   BIOS drive F (16514064 X 512):
>   disk4:   BIOS drive G (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
>   disk5:   BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
>   disk6:   BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
>   disk7:   BIOS drive D (2880 X 512):
> read 1 from 0 to 0xcbde0a20, error 0x31
> OK
> 
> disk4 to disk7 correspond to da0 to da3, which are sd/mmc devices
> without any media in them. What made me notice it is that it never
> showed the "read 1 from 0 to $random_value" on 11-stable. The system
> runs 12-current now.

Yeah, it's not about the random value; the rework to handle missing
media is on its way to current, stay tuned :)

> 
> disk1 to disk3 are the hard drives making up ZFS. These are 4TB
> Western Digital SATA-3 WDC WD4001FAEX.
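To make the probe sequence Toomas describes concrete: the sketch below
reads one sector from LBA 0 of each BIOS drive and prints the status
byte on failure, the same pattern as the loader's probe in biosdisk.c.
It is a minimal standalone simulation, not loader code: int13_read()
is a hypothetical stand-in for the real-mode INT 13h extended read
that the loader actually issues, stubbed here to behave like four
empty sd/mmc slots so the output matches the messages quoted above.

#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512

/*
 * Hypothetical stand-in for the INT 13h extended read (AH=0x42) the
 * loader issues from biosdisk.c.  Returns the BIOS status byte, 0 on
 * success.  This stub pretends drives 0x84-0x87 are empty card-reader
 * slots: 0x31 is the INT 13h extensions' "no media in drive" status.
 */
static int
int13_read(int bios_drive, uint64_t lba, int nsect, void *buf)
{
	(void)lba; (void)nsect; (void)buf;
	return (bios_drive >= 0x84 ? 0x31 : 0);
}

/*
 * Probe one drive the way the loader does: read a single sector from
 * LBA 0 (the partition table) and log any failure.
 */
static void
probe_drive(int bios_drive)
{
	static char buf[SECTOR_SIZE];
	int error;

	error = int13_read(bios_drive, 0, 1, buf);
	if (error != 0)
		printf("read 1 from 0 to %p, error: 0x%x\n",
		    (void *)buf, error);
}

int
main(void)
{
	/* BIOS drives 0x80-0x87 are the disk0-disk7 listed at boot. */
	for (int drive = 0x80; drive <= 0x87; drive++)
		probe_drive(drive);
	return (0);
}

Run against the stub this prints four "read 1 from 0 to ..., error:
0x31" lines, one per empty slot, matching the shape of the boot output
above; note the message reports the destination buffer address, not
which BIOS drive was being probed.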
Well, that does explain the problem if you look at the sizes
reported… your BIOS is reporting wrong sizes, so the loader is unable
to access the whole 4TB space, and the ZFS reader is not getting
correct data from the disks, which results in the errors. That's why
you get the errors from the 'storage' pool, and yes, this is harmless
for boot because you have a separate (small) disk for booting.

rgds,
toomas

> 
>>> Since you are getting the errors from the data pool 'storage', it
>>> does not affect the boot. Why the pool storage is unreadable likely
>>> has to do with the errors above, but I can't tell for sure based on
>>> the data presented here….
> 
> Thing is, the data pool works fine once boot completes, i.e. it
> mounts read/write and behaves normally.
> 
> thanks,
> -- 
> J.
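The size mismatch Toomas points to can be checked directly against the
lsdev output above: every drive, the 111GB SSD and the 4TB spinners
alike, is reported as 16514064 sectors, and that number is exactly
16383 * 16 * 63, the legacy CHS geometry cap that ATA drives report
when they exceed roughly 8.4 GB. A short standalone check (plain C,
not loader code; the 4 TB figure is the drive's nominal size):

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	/* Sector count lsdev reported for every drive in the machine. */
	uint64_t reported = 16514064;
	/* 16383 cylinders * 16 heads * 63 sectors: the legacy CHS cap. */
	uint64_t chs_cap = 16383ULL * 16 * 63;
	/* Roughly what a nominal 4 TB drive holds in 512-byte sectors. */
	uint64_t expected = 4000000000000ULL / 512;

	printf("reported: %ju sectors = %.2f GB\n",
	    (uintmax_t)reported, reported * 512 / 1e9);
	printf("CHS cap:  %ju sectors (reported == cap: %s)\n",
	    (uintmax_t)chs_cap, reported == chs_cap ? "yes" : "no");
	printf("expected: ~%ju sectors for a 4 TB disk\n",
	    (uintmax_t)expected);
	return (0);
}

This prints about 8.46 GB for the reported size, so the loader's ZFS
reader can reach only the first few gigabytes of each 4 TB pool member
and never finds a readable copy of the MOS, hence the boot-time
errors. Once the kernel is up it talks to the disks through its own
ATA driver rather than INT13, which is why the pool mounts and behaves
normally after boot.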