Date: Sat, 21 Nov 2020 00:06:53 +0100
From: Mateusz Guzik <mjguzik@gmail.com>
To: mike tancsa <mike@sentex.net>
Cc: Philip Paeps <philip@freebsd.org>, "Bjoern A. Zeeb" <bz@freebsd.org>,
    netperf-admin@freebsd.org, netperf-users@freebsd.org,
    Allan Jude <allanjude@freebsd.org>
Subject: Re: zoo reboot Friday Nov 20 14:00 UTC
Message-ID: <CAGudoHE8-Z1WErjdQJ7ZWVO9h5O-5ys45OwTTPrSAeSCBHHUGw@mail.gmail.com>
In-Reply-To: <f9a074b9-17d3-dcfd-5559-a00e1ac75c07@sentex.net>
References: <1f8e49ff-e3da-8d24-57f1-11f17389aa84@sentex.net>
    <d2ffd0f1-1dd8-dc6b-9975-93f20d7974a4@sentex.net>
    <dc8fed75-0262-c614-3292-6b8ce5addcfc@sentex.net>
    <0ddec867-32b5-f667-d617-0ddc71726d09@sentex.net>
    <CAGudoHHNN8ZcgdkRSy0cSaPA6J9ZHVf+BQFiBcThrtQ0AMP+Ow@mail.gmail.com>
    <5549CA9F-BCF4-4043-BA2F-A2C41D13D955@freebsd.org>
    <ad81b5f3-f6de-b908-c00f-fb8d6ac2a0b8@sentex.net>
    <CAGudoHETJZ0f_YjmCcUjb-Wcf1tKhSF719kXxXUB3p4RB0uuRQ@mail.gmail.com>
    <CAGudoHH=H4Xok5HG3Hbw7S=6ggdsi+N4zHirW50cmLGsLnhd4g@mail.gmail.com>
    <270b65c0-8085-fe2f-cf4f-7a2e4c17a2e8@sentex.net>
    <CAGudoHFLy2dxBMGd2AJZ6q6zBsU+n8uLXLSiFZ1QGi_qibySVg@mail.gmail.com>
    <a716e874-d736-d8d5-9c45-c481f6b3dee7@sentex.net>
    <CAGudoHELFz7KyzQmRN8pCbgLQXPgCdHyDAQ4pzFLF+YswcP87A@mail.gmail.com>
    <163d1815-fc4a-7987-30c5-0a21e8383c93@sentex.net>
    <CAGudoHF3c1e2DFSAtyjMpcrbfzmMV5x6kOA_5BT5jyoDyKEHsA@mail.gmail.com>
    <a1ef98c6-e734-1760-f0cb-a8d31c6acc18@sentex.net>
    <CAGudoHE+xjHdBQAD3cAL84=k-kHDsZNECBGNNOn2LsStL5A7Dg@mail.gmail.com>
    <f9a074b9-17d3-dcfd-5559-a00e1ac75c07@sentex.net>
Good grief. If a complete restore is needed, perhaps there is a better
layout which can be used?

On 11/21/20, mike tancsa <mike@sentex.net> wrote:
> OK. Although it looks like I will have to pull it in from backups now :(
>
> root@zoo2:/home/mdtancsa # zpool import -f -R /mnt zroot
> cannot import 'zroot': I/O error
>         Destroy and re-create the pool from
>         a backup source.
> root@zoo2:/home/mdtancsa #
>
> All the disks are there :( Not sure why it's not importing?
>
> On 11/20/2020 6:02 PM, Mateusz Guzik wrote:
>> Should you go through with it, please make sure zsh is installed. Then
>> I can get in and install some other stuff as needed.
>>
>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>> That looks good to my newbie eyes. I think tomorrow I will just bite
>>> the bullet and put in a pair of 240G SSDs to boot from. I will install
>>> HEAD onto them on another machine, then put them in zoo, boot from
>>> there, and adjust the home directory mounts accordingly, if that's OK
>>> with everyone?
>>>
>>> ---Mike
>>>
>>> On 11/20/2020 5:53 PM, Mateusz Guzik wrote:
>>>> I ran this one-liner:
>>>>
>>>> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
>>>>
>>>> which according to https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot
>>>> should be fine. Hopefully Allan will know better.
>>>>
>>>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>>>> Unfortunately no luck :(
>>>>>
>>>>> ZFS: i/o error - all block copies unavailable
>>>>> ZFS: can't read MOS of pool zroot
>>>>> gptzfsboot: failed to mount default pool zroot
>>>>>
>>>>> FreeBSD/x86 boot
>>>>>
>>>>> What's odd is that it doesn't post all the drives....
>>>>>
>>>>> Zoo predated EFI, so it was booting legacy BIOS. Are the boot blocks
>>>>> that you installed assuming that?
>>>>>
>>>>> On 11/20/2020 1:27 PM, Mateusz Guzik wrote:
>>>>>> Swap and boot partitions resized, the ada0p3 partition got removed
>>>>>> from the pool and inserted back, it is rebuilding now:
>>>>>>
>>>>>> root@zoo2:~ # zpool status
>>>>>>   pool: zroot
>>>>>>  state: DEGRADED
>>>>>> status: One or more devices is currently being resilvered.  The pool
>>>>>>         will continue to function, possibly in a degraded state.
>>>>>> action: Wait for the resilver to complete.
>>>>>>   scan: resilver in progress since Fri Nov 20 23:13:28 2020
>>>>>>         459G scanned at 1.00G/s, 291G issued at 650M/s, 3.47T total
>>>>>>         0B resilvered, 8.17% done, 01:25:48 to go
>>>>>> config:
>>>>>>
>>>>>>         NAME                                            STATE     READ WRITE CKSUM
>>>>>>         zroot                                           DEGRADED     0     0     0
>>>>>>           mirror-0                                      DEGRADED     0     0     0
>>>>>>             replacing-0                                 DEGRADED     0     0     0
>>>>>>               1517819109053923011                       OFFLINE      0     0     0  was /dev/ada0p3/old
>>>>>>               ada0p3                                    ONLINE       0     0     0
>>>>>>             ada1                                        ONLINE       0     0     0
>>>>>>           mirror-1                                      ONLINE       0     0     0
>>>>>>             ada3p3                                      ONLINE       0     0     0
>>>>>>             ada4p3                                      ONLINE       0     0     0
>>>>>>           mirror-2                                      ONLINE       0     0     0
>>>>>>             ada5p3                                      ONLINE       0     0     0
>>>>>>             ada6p3                                      ONLINE       0     0     0
>>>>>>         special
>>>>>>           mirror-3                                      ONLINE       0     0     0
>>>>>>             gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
>>>>>>             mfid1p2                                     ONLINE       0     0     0
>>>>>>
>>>>>> errors: No known data errors
>>>>>>
>>>>>> One pickle: I did 'zpool export zroot' to replace the drive,
>>>>>> otherwise zfs protested. The subsequent zpool import was done
>>>>>> slightly carelessly and it mounted over /, meaning I lost access to
>>>>>> the original ufs. Should there be a need to boot from it again,
>>>>>> someone will have to boot single user and make sure to comment out
>>>>>> swap in /etc/fstab, or we will have to replace the drive again.
>>>>>>
>>>>>> That said, as I understand it, we are in a position to take out the
>>>>>> ufs drive and reboot to be back in business.
>>>>>>
>>>>>> The ufs drive will have to be mounted somewhere to sort out that
>>>>>> swap.
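For reference: the fstab hazard described above (a stale swap entry
pointing at what is now a ZFS pool member, after the drive moved from
ada0 to ada7) can be sidestepped by swapping on a GPT label instead of a
raw unit number. A minimal sketch, assuming the swap partition is index
2 on the relocated drive; the label name is illustrative, not from the
thread:

    # label the freebsd-swap partition so device renumbering can never
    # point swap at a pool member again ("swap0" is an illustrative name)
    gpart modify -i 2 -l swap0 ada7
    # /etc/fstab then references the stable label path instead of ada0p3:
    #   /dev/gpt/swap0   none   swap   sw   0   0
    swapon /dev/gpt/swap0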
>>>>>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>>>>>> On 11/20/2020 1:00 PM, Mateusz Guzik wrote:
>>>>>>>> So this happened after boot:
>>>>>>>>
>>>>>>>> root@zoo2:/home/mjg # swapinfo
>>>>>>>> Device          1K-blocks     Used      Avail Capacity
>>>>>>>> /dev/ada0p3    2928730500        0 2928730500     0%
>>>>>>>>
>>>>>>>> which I presume might have corrupted some of it.
>>>>>>>
>>>>>>> Oh, that makes sense now. When it was installed in the back, the
>>>>>>> drive posted as ada0. When we put it in zoo, it was on a port
>>>>>>> farther down, hence it came up as ada7. I had to manually mount /
>>>>>>> off ada7p2. I already updated fstab so as not to do that again.
>>>>>>> That mystery is solved.
>>>>>>>
>>>>>>> ---Mike
>>>>>>>
>>>>>>>> Allan pasted some one-liners to resize the boot and swap
>>>>>>>> partitions.
>>>>>>>>
>>>>>>>> With your permission I would like to run them and then
>>>>>>>> offline/online the disk to have it rebuild.
>>>>>>>>
>>>>>>>> As for longer-term plans for what to do with it, I think that's a
>>>>>>>> different subject; whatever new drives end up being used, I'm
>>>>>>>> sure the FreeBSD Foundation can reimburse you with no difficulty.
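The offline/online cycle mentioned above is not shown in the thread; a
rough sketch of what it would look like, assuming the zroot pool and the
ada0p3 member from the status output:

    zpool offline zroot ada0p3    # take the member out before repartitioning
    # (run the resize one-liners here)
    zpool online zroot ada0p3     # bring it back; zfs resilvers the member
    zpool status zroot            # confirm the resilver is running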
>>>>>>>>> It's a bit of an evolutionary mess, the current state of zoo.
>>>>>>>>> I wonder if we are better off re-installing the base OS fresh
>>>>>>>>> on a pair of SSD drives, keeping the base OS on them and leaving
>>>>>>>>> all the user data on the current "zroot"... Considering 240G
>>>>>>>>> SSDs are $35 CDN, it might be easier to just install fresh and
>>>>>>>>> not have to worry about resizing etc.
>>>>>>>>>
>>>>>>>>> ---Mike
>>>>>>>>>
>>>>>>>>> On 11/20/2020 12:49 PM, Mateusz Guzik wrote:
>>>>>>>>>> On 11/20/20, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>>>>>>>>>> CC'ing Allan Jude.
>>>>>>>>>>>
>>>>>>>>>>> So:
>>>>>>>>>>>
>>>>>>>>>>>   pool: zroot
>>>>>>>>>>>  state: DEGRADED
>>>>>>>>>>> status: One or more devices could not be opened.  Sufficient
>>>>>>>>>>>         replicas exist for the pool to continue functioning in
>>>>>>>>>>>         a degraded state.
>>>>>>>>>>> action: Attach the missing device and online it using 'zpool online'.
>>>>>>>>>>>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
>>>>>>>>>>>   scan: scrub repaired 0B in 05:17:02 with 0 errors on Tue Aug
>>>>>>>>>>>         18 15:19:00 2020
>>>>>>>>>>> config:
>>>>>>>>>>>
>>>>>>>>>>>         NAME                                            STATE     READ WRITE CKSUM
>>>>>>>>>>>         zroot                                           DEGRADED     0     0     0
>>>>>>>>>>>           mirror-0                                      DEGRADED     0     0     0
>>>>>>>>>>>             1517819109053923011                         UNAVAIL      0     0     0  was /dev/ada0p3
>>>>>>>>>>>             ada1                                        ONLINE       0     0     0
>>>>>>>>>>>           mirror-1                                      ONLINE       0     0     0
>>>>>>>>>>>             ada3p3                                      ONLINE       0     0     0
>>>>>>>>>>>             ada4p3                                      ONLINE       0     0     0
>>>>>>>>>>>           mirror-2                                      ONLINE       0     0     0
>>>>>>>>>>>             ada5p3                                      ONLINE       0     0     0
>>>>>>>>>>>             ada6p3                                      ONLINE       0     0     0
>>>>>>>>>>>         special
>>>>>>>>>>>           mirror-3                                      ONLINE       0     0     0
>>>>>>>>>>>             gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
>>>>>>>>>>>             mfid1p2                                     ONLINE       0     0     0
>>>>>>>>>>>
>>>>>>>>>>> errors: No known data errors
>>>>>>>>>>>
>>>>>>>>>>> # dmesg | grep ada0
>>>>>>>>>>> Trying to mount root from ufs:/dev/ada0p2 [rw]...
>>>>>>>>>>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>>>>>>>>>>> ada0: <WDC WD3003FZEX-00Z4SA0 01.01A01> ACS-2 ATA SATA 3.x device
>>>>>>>>>>> ada0: Serial Number WD-WCC137TALF5K
>>>>>>>>>>> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
>>>>>>>>>>> ada0: Command Queueing enabled
>>>>>>>>>>> ada0: 2861588MB (5860533168 512 byte sectors)
>>>>>>>>>>> ada0: quirks=0x1<4K>
>>>>>>>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2; retrying for 3 more seconds
>>>>>>>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2.
>>>>>>>>>>> vfs.root.mountfrom=ufs:/dev/ada0p2
>>>>>>>>>>> GEOM_PART: Partition 'ada0p3' not suitable for kernel dumps (wrong type?)
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>>>>>>>
>>>>>>>>>>> # gpart show ada0
>>>>>>>>>>> =>        34  5860533101  ada0  GPT  (2.7T)
>>>>>>>>>>>           34           6        - free -  (3.0K)
>>>>>>>>>>>           40          88     1  freebsd-boot  (44K)
>>>>>>>>>>>          128     3072000     2  freebsd-swap  (1.5G)
>>>>>>>>>>>      3072128  5857461000     3  freebsd-zfs  (2.7T)
>>>>>>>>>>>   5860533128           7        - free -  (3.5K)
>>>>>>>>>>>
>>>>>>>>>>> Running a naive dd if=/dev/ada0p3 works, so I don't know what
>>>>>>>>>>> zfs complains about.
>>>>>>>>>>
>>>>>>>>>> Also note Philip's point about the boot partition being 44k. Is
>>>>>>>>>> that too small now?
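Philip's bootblock-size concern (quoted below) can be checked against
the 44K index-1 partition in the gpart output above. A sketch of the
kind of resize Allan's one-liners presumably performed; the exact
commands are not in the thread, the 512k size is illustrative, and
growing index 1 requires temporarily deleting the adjacent swap
partition:

    ls -l /boot/gptzfsboot                 # current bootcode size; recent head wants >44K
    gpart delete -i 2 ada0                 # free the swap that sits right after p1
    gpart resize -i 1 -s 512k ada0         # grow freebsd-boot with headroom
    gpart add -t freebsd-swap -i 2 ada0    # re-create swap in the remaining gap
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0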
>>>>>>>>>>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>>>>>>>>>>> On 11/20/2020 11:40 AM, Philip Paeps wrote:
>>>>>>>>>>>>> On 2020-11-21 00:04:19 (+0800), Mateusz Guzik wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Oh, that's a bummer. I wonder if there is a regression in
>>>>>>>>>>>>>> the boot loader though.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does the pool mount if you boot the system from a cd/over
>>>>>>>>>>>>>> the network/whatever?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's worth checking if the freebsd-boot partition is large
>>>>>>>>>>>>> enough. I noticed during the cluster refresh that we often
>>>>>>>>>>>>> use 108k for freebsd-boot, but recent head wants 117k. I've
>>>>>>>>>>>>> been bumping the bootblocks to 236k.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So far, all the cluster machines I've upgraded booted
>>>>>>>>>>>>> though .. so ... I might be talking ex recto. :)
>>>>>>>>>>>>
>>>>>>>>>>>> I put in an ssd drive and booted from it. One of the drives
>>>>>>>>>>>> might have gotten loose or died in the power cycles, but
>>>>>>>>>>>> there is still redundancy and I was able to mount the pool.
>>>>>>>>>>>> Not sure why it can't find the file?
>>>>>>>>>>>>
>>>>>>>>>>>> root@zoo2:~ # diff /boot/lua/loader.lua /mnt/boot/lua/loader.lua
>>>>>>>>>>>> 29c29
>>>>>>>>>>>> < -- $FreeBSD$
>>>>>>>>>>>> ---
>>>>>>>>>>>> > -- $FreeBSD: head/stand/lua/loader.lua 359371 2020-03-27 17:37:31Z freqlabs $
>>>>>>>>>>>> root@zoo2:~ #
>>>>>>>>>>>>
>>>>>>>>>>>> % ls -l /mnt/boot/lua/
>>>>>>>>>>>> total 110
>>>>>>>>>>>> -r--r--r--  1 root  wheel   4300 Nov 20 08:41 cli.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   3288 Nov 20 08:41 color.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel  18538 Nov 20 08:41 config.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel  12610 Nov 20 08:41 core.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel  11707 Nov 20 08:41 drawer.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2456 Nov 20 08:41 gfx-beastie.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2235 Nov 20 08:41 gfx-beastiebw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   1958 Nov 20 08:41 gfx-fbsdbw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2413 Nov 20 08:41 gfx-orb.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2140 Nov 20 08:41 gfx-orbbw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   3324 Nov 20 08:41 hook.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2395 Nov 20 08:41 loader.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2429 Sep 24 09:09 logo-beastie.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2203 Sep 24 09:09 logo-beastiebw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   1958 Sep 24 09:09 logo-fbsdbw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2397 Sep 24 09:09 logo-orb.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2119 Sep 24 09:09 logo-orbbw.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel  14201 Nov 20 08:41 menu.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   4299 Nov 20 08:41 password.lua
>>>>>>>>>>>> -r--r--r--  1 root  wheel   2227 Nov 20 08:41 screen.lua
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Mateusz Guzik <mjguzik gmail.com>

--
Mateusz Guzik <mjguzik gmail.com>