Date: Fri, 20 Nov 2020 19:27:44 +0100
From: Mateusz Guzik <mjguzik@gmail.com>
To: mike tancsa <mike@sentex.net>
Cc: Philip Paeps <philip@freebsd.org>, "Bjoern A. Zeeb" <bz@freebsd.org>,
    netperf-admin@freebsd.org, netperf-users@freebsd.org,
    Allan Jude <allanjude@freebsd.org>
Subject: Re: zoo reboot Friday Nov 20 14:00 UTC
Message-ID: <CAGudoHELFz7KyzQmRN8pCbgLQXPgCdHyDAQ4pzFLF%2BYswcP87A@mail.gmail.com>
In-Reply-To: <a716e874-d736-d8d5-9c45-c481f6b3dee7@sentex.net>
References: <1f8e49ff-e3da-8d24-57f1-11f17389aa84@sentex.net>
 <2691e1fd-5a27-4dd0-2ef7-b1c06fd4e751@sentex.net>
 <A3934CD4-57C1-4215-99F2-9500CB9EDC7C@neville-neil.com>
 <5A5094BC-D417-4BA6-97E2-7CB522B51368@FreeBSD.org>
 <4ec6ed6f-b3b4-22ae-e1ec-93a46f3d88ea@sentex.net>
 <d2ffd0f1-1dd8-dc6b-9975-93f20d7974a4@sentex.net>
 <dc8fed75-0262-c614-3292-6b8ce5addcfc@sentex.net>
 <0ddec867-32b5-f667-d617-0ddc71726d09@sentex.net>
 <CAGudoHHNN8ZcgdkRSy0cSaPA6J9ZHVf%2BBQFiBcThrtQ0AMP%2BOw@mail.gmail.com>
 <5549CA9F-BCF4-4043-BA2F-A2C41D13D955@freebsd.org>
 <ad81b5f3-f6de-b908-c00f-fb8d6ac2a0b8@sentex.net>
 <CAGudoHETJZ0f_YjmCcUjb-Wcf1tKhSF719kXxXUB3p4RB0uuRQ@mail.gmail.com>
 <CAGudoHH=H4Xok5HG3Hbw7S=6ggdsi%2BN4zHirW50cmLGsLnhd4g@mail.gmail.com>
 <270b65c0-8085-fe2f-cf4f-7a2e4c17a2e8@sentex.net>
 <CAGudoHFLy2dxBMGd2AJZ6q6zBsU%2Bn8uLXLSiFZ1QGi_qibySVg@mail.gmail.com>
 <a716e874-d736-d8d5-9c45-c481f6b3dee7@sentex.net>
Swap and boot partitions have been resized, the ada0p3 partition got removed
from the pool and inserted back, and it is rebuilding now:

root@zoo2:~ # zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Nov 20 23:13:28 2020
        459G scanned at 1.00G/s, 291G issued at 650M/s, 3.47T total
        0B resilvered, 8.17% done, 01:25:48 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           DEGRADED     0     0     0
          mirror-0                                      DEGRADED     0     0     0
            replacing-0                                 DEGRADED     0     0     0
              1517819109053923011                       OFFLINE      0     0     0  was /dev/ada0p3/old
              ada0p3                                    ONLINE       0     0     0
            ada1                                        ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            ada3p3                                      ONLINE       0     0     0
            ada4p3                                      ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            ada5p3                                      ONLINE       0     0     0
            ada6p3                                      ONLINE       0     0     0
        special
          mirror-3                                      ONLINE       0     0     0
            gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
            mfid1p2                                     ONLINE       0     0     0

errors: No known data errors

One pickle: I did 'zpool export zroot' to replace the drive, otherwise zfs
protested.  The subsequent 'zpool import' was done slightly carelessly and it
mounted over /, meaning I lost access to the original ufs.  Should there be a
need to boot from it again, someone will have to boot single user and make
sure to comment out swap in /etc/fstab, or we will have to replace the drive
again.

That said, as I understand it we are in a position to take out the ufs drive
and reboot to be back in business.  The ufs drive will have to be mounted
somewhere to sort out that swap; a rough sketch of the steps is below.
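
For the record, something along these lines should take care of it once the
ufs drive is reachable again.  Treat it as a sketch: the device name and the
stale fstab entry are assumptions (the ufs root probed as ada7p2 in zoo, and
its fstab apparently still has a swap entry pointing at what used to be
ada0p3), so double-check both before running anything:

# mount the old ufs root somewhere out of the way
mount /dev/ada7p2 /mnt

# comment out the stale swap entry so it can never land on the zfs member
# partition again; either edit /mnt/etc/fstab by hand or:
sed -i '' 's|^/dev/ada0p3|#/dev/ada0p3|' /mnt/etc/fstab

umount /mnt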

On 11/20/20, mike tancsa <mike@sentex.net> wrote:
> On 11/20/2020 1:00 PM, Mateusz Guzik wrote:
>> So this happened after boot:
>>
>> root@zoo2:/home/mjg # swapinfo
>> Device          1K-blocks     Used      Avail Capacity
>> /dev/ada0p3    2928730500        0 2928730500     0%
>>
>> which I presume might have corrupted some of it.
>
> Oh, that makes sense now.  When it was installed in the back, the drive
> posted as ada0.  When we put it in zoo, it was on a port farther down,
> hence it came up as ada7.  I had to manually mount / off ada7p2.  I have
> already updated fstab so as not to do that again.  That mystery is solved.
>
>     ---Mike
>
>
>> Allan pasted some one-liners to resize the boot and swap partitions.
>>
>> With your permission I would like to run them and then offline/online
>> the disk to have it rebuild.
>>
>> As for longer-term plans for what to do with it, I think that's a
>> different subject; whatever new drives end up being used, I'm sure the
>> FreeBSD Foundation can reimburse you with no difficulty.
>>
>>
>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>> The current state of zoo is a bit of an evolutionary mess.  I wonder if
>>> we are better off re-installing the base OS fresh on a pair of SSD
>>> drives and have the base OS on it and leave all the user data on the
>>> current "zroot"...  Considering 240G SSDs are $35 CDN, it might be
>>> easier to just install fresh on them and not have to worry about
>>> resizing etc.
>>>
>>>     ---Mike
>>>
>>> On 11/20/2020 12:49 PM, Mateusz Guzik wrote:
>>>> On 11/20/20, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>>>> CC'ing Allan Jude
>>>>>
>>>>> So:
>>>>>
>>>>>   pool: zroot
>>>>>  state: DEGRADED
>>>>> status: One or more devices could not be opened.  Sufficient replicas
>>>>>         exist for the pool to continue functioning in a degraded state.
>>>>> action: Attach the missing device and online it using 'zpool online'.
>>>>>    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
>>>>>   scan: scrub repaired 0B in 05:17:02 with 0 errors on Tue Aug 18 15:19:00 2020
>>>>> config:
>>>>>
>>>>>         NAME                                            STATE     READ WRITE CKSUM
>>>>>         zroot                                           DEGRADED     0     0     0
>>>>>           mirror-0                                      DEGRADED     0     0     0
>>>>>             1517819109053923011                         UNAVAIL      0     0     0  was /dev/ada0p3
>>>>>             ada1                                        ONLINE       0     0     0
>>>>>           mirror-1                                      ONLINE       0     0     0
>>>>>             ada3p3                                      ONLINE       0     0     0
>>>>>             ada4p3                                      ONLINE       0     0     0
>>>>>           mirror-2                                      ONLINE       0     0     0
>>>>>             ada5p3                                      ONLINE       0     0     0
>>>>>             ada6p3                                      ONLINE       0     0     0
>>>>>         special
>>>>>           mirror-3                                      ONLINE       0     0     0
>>>>>             gptid/db15e826-1a9c-11eb-8d25-0cc47a1f2fa0  ONLINE       0     0     0
>>>>>             mfid1p2                                     ONLINE       0     0     0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> # dmesg | grep ada0
>>>>> Trying to mount root from ufs:/dev/ada0p2 [rw]...
>>>>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>>>>> ada0: <WDC WD3003FZEX-00Z4SA0 01.01A01> ACS-2 ATA SATA 3.x device
>>>>> ada0: Serial Number WD-WCC137TALF5K
>>>>> ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
>>>>> ada0: Command Queueing enabled
>>>>> ada0: 2861588MB (5860533168 512 byte sectors)
>>>>> ada0: quirks=0x1<4K>
>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2; retrying for 3 more seconds
>>>>> Mounting from ufs:/dev/ada0p2 failed with error 2.
>>>>> vfs.root.mountfrom=ufs:/dev/ada0p2
>>>>> GEOM_PART: Partition 'ada0p3' not suitable for kernel dumps (wrong type?)
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>> ZFS WARNING: Unable to attach to ada0p3.
>>>>>
>>>>> # gpart show ada0
>>>>> =>        34  5860533101  ada0  GPT  (2.7T)
>>>>>           34           6        - free -  (3.0K)
>>>>>           40          88     1  freebsd-boot  (44K)
>>>>>          128     3072000     2  freebsd-swap  (1.5G)
>>>>>      3072128  5857461000     3  freebsd-zfs  (2.7T)
>>>>>   5860533128           7        - free -  (3.5K)
>>>>>
>>>>> Running a naive dd if=/dev/ada0p3 works, so I don't know what zfs is
>>>>> complaining about.
>>>>>
>>>> Also note Philip's point about the boot partition being 44k.  Is that
>>>> too small now?
>>>>
>>>>> On 11/20/20, mike tancsa <mike@sentex.net> wrote:
>>>>>> On 11/20/2020 11:40 AM, Philip Paeps wrote:
>>>>>>> On 2020-11-21 00:04:19 (+0800), Mateusz Guzik wrote:
>>>>>>>
>>>>>>>> Oh, that's a bummer.  I wonder if there is a regression in the boot
>>>>>>>> loader though.
>>>>>>>>
>>>>>>>> Does the pool mount if you boot the system from a cd/over the
>>>>>>>> network/whatever?
>>>>>>> It's worth checking if the freebsd-boot partition is large enough.  I
>>>>>>> noticed during the cluster refresh that we often use 108k for
>>>>>>> freebsd-boot but recent head wants 117k.  I've been bumping the
>>>>>>> bootblocks to 236k.
>>>>>>>
>>>>>>> So far, all the cluster machines I've upgraded booted though .. so ...
>>>>>>> I might be talking ex recto. :)
>>>>>>>
>>>>>> I put in an ssd drive and booted from it.  One of the drives might have
>>>>>> gotten loose or died in the power cycles, but there is still redundancy
>>>>>> and I was able to mount the pool.  Not sure why it can't find the file?
>>>>>>
>>>>>> root@zoo2:~ # diff /boot/lua/loader.lua /mnt/boot/lua/loader.lua
>>>>>> 29c29
>>>>>> < -- $FreeBSD$
>>>>>> ---
>>>>>>> -- $FreeBSD: head/stand/lua/loader.lua 359371 2020-03-27 17:37:31Z freqlabs $
>>>>>> root@zoo2:~ #
>>>>>>
>>>>>>
>>>>>> % ls -l /mnt/boot/lua/
>>>>>> total 110
>>>>>> -r--r--r--  1 root  wheel   4300 Nov 20 08:41 cli.lua
>>>>>> -r--r--r--  1 root  wheel   3288 Nov 20 08:41 color.lua
>>>>>> -r--r--r--  1 root  wheel  18538 Nov 20 08:41 config.lua
>>>>>> -r--r--r--  1 root  wheel  12610 Nov 20 08:41 core.lua
>>>>>> -r--r--r--  1 root  wheel  11707 Nov 20 08:41 drawer.lua
>>>>>> -r--r--r--  1 root  wheel   2456 Nov 20 08:41 gfx-beastie.lua
>>>>>> -r--r--r--  1 root  wheel   2235 Nov 20 08:41 gfx-beastiebw.lua
>>>>>> -r--r--r--  1 root  wheel   1958 Nov 20 08:41 gfx-fbsdbw.lua
>>>>>> -r--r--r--  1 root  wheel   2413 Nov 20 08:41 gfx-orb.lua
>>>>>> -r--r--r--  1 root  wheel   2140 Nov 20 08:41 gfx-orbbw.lua
>>>>>> -r--r--r--  1 root  wheel   3324 Nov 20 08:41 hook.lua
>>>>>> -r--r--r--  1 root  wheel   2395 Nov 20 08:41 loader.lua
>>>>>> -r--r--r--  1 root  wheel   2429 Sep 24 09:09 logo-beastie.lua
>>>>>> -r--r--r--  1 root  wheel   2203 Sep 24 09:09 logo-beastiebw.lua
>>>>>> -r--r--r--  1 root  wheel   1958 Sep 24 09:09 logo-fbsdbw.lua
>>>>>> -r--r--r--  1 root  wheel   2397 Sep 24 09:09 logo-orb.lua
>>>>>> -r--r--r--  1 root  wheel   2119 Sep 24 09:09 logo-orbbw.lua
>>>>>> -r--r--r--  1 root  wheel  14201 Nov 20 08:41 menu.lua
>>>>>> -r--r--r--  1 root  wheel   4299 Nov 20 08:41 password.lua
>>>>>> -r--r--r--  1 root  wheel   2227 Nov 20 08:41 screen.lua
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>>> Mateusz Guzik <mjguzik gmail.com>
>>>>>
>>
>
--
Mateusz Guzik <mjguzik gmail.com>
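
PS: for completeness, the partition resize mentioned at the top (Allan's
one-liners, plus Philip's point about bootblock size) amounts to roughly the
following on a layout like ada0's.  This is a sketch from memory rather than
the exact commands that were run; the alignment, sizes and indexes are
illustrative only:

# grow freebsd-boot well past 44K and shrink swap to make room; the zfs
# partition at index 3 is left untouched (assumes the old swap is not in use)
gpart delete -i 2 ada0
gpart delete -i 1 ada0
gpart add -a 4k -t freebsd-boot -s 512k -i 1 ada0
gpart add -a 4k -t freebsd-swap -s 1g -i 2 ada0

# reinstall bootcode into the new, larger boot partition (zfs root, hence gptzfsboot)
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0

# put ada0p3 back into the mirror so it resilvers; the plan was a plain
# offline/online, though in the end it went through an export and a replace
# as described above
zpool offline zroot ada0p3
zpool online zroot ada0p3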