From owner-netperf-users@freebsd.org Sat Dec 19 20:16:11 2020
From: Mateusz Guzik
Date: Sat, 19 Dec 2020 21:16:01 +0100
Subject: Re: zoo back online (was Re: zoo hang)
To: mike tancsa
Cc: George Neville-Neil, "netperf-admin@FreeBSD.org", netperf-users@freebsd.org, Paul Holes, Hans Petter Selasky
In-Reply-To: <837ce2bc-9731-85b0-c6a5-1b3c7bcadb72@sentex.net>
List-Id: "Announcements and discussions related to the netperf cluster."

I'm adding hps for USB stack comments.

On 12/19/20, mike tancsa wrote:
> Hmm, this has happened again. Not sure if it's a bug with the driver, the
> firmware or both, but after a period of time the USB drive starts to
> throw errors.
> This unit was working fine on RELENG12 and we swapped it
> with another drive too, but same results. The drive is clean
>
> smartctl -a /dev/da2 -T permissive
>
> da2 at umass-sim0 bus 0 scbus14 target 0 lun 0
> da2: Fixed Direct Access SPC-4 SCSI device
> da2: Serial Number 00000000000000000000
> da2: 400.000MB/s transfers
> da2: 3815447MB (7814037168 512 byte sectors)
> da2: quirks=0xa
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 f6 a5 a8 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 04 5c 65 f0 00 00 40 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 40 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 0 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 ba 00 20 48 00 00 08 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Error 5, Retries exhausted
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 3 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 2 more tries remain
> (da2:umass-sim0:0:0:0): WRITE(10). CDB: 2a 00 b4 00 20 28 00 00 18 00
> (da2:umass-sim0:0:0:0): CAM status: CCB request completed with an error
> (da2:umass-sim0:0:0:0): Retrying command, 1 more tries remain
> Solaris: WARNING: Pool 'zoobackup' has encountered an uncorrectable I/O
> failure and has been suspended.
>
> On 12/18/2020 10:08 AM, George Neville-Neil wrote:
>> OK, once we get the backup complete we should probably work on the
>> rest of the cleanup. Let me know if and how I can help.
>>
>> Best,
>> George
>>
>> On 18 Dec 2020, at 9:14, mike tancsa wrote:
>>
>>> Hi George,
>>>
>>> I think the boot loader is now fixed as those features are white
>>> listed. Will start backups once again via zrepl.
>>>
>>> ---Mike
>>>
>>> On 12/17/2020 1:58 PM, George Neville-Neil wrote:
>>>> Howdy,
>>>>
>>>> How do we want to handle the old tank stuff?
>>>>
>>>> Best,
>>>> George
>>>>
>>>> On 15 Dec 2020, at 16:24, mike tancsa wrote:
>>>>
>>>>> OK, thanks to Josh P's suggestion, deleting the v2 bookmarks from the
>>>>> pool allowed us to boot.
>>>>>
>>>>> Booted from a temp drive, imported the pool,
>>>>>
>>>>> root@zoo-temp:~ # zpool import -R /mnt -f zooroot
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}'
>>>>> zfs destroy zooroot#zrepl_CURSOR_G_77296a02a81c78cc_J_push_to_drive
>>>>> zfs destroy zooroot/ROOT#zrepl_CURSOR_G_e27691751ed1660b_J_push_to_drive
>>>>> zfs destroy zooroot/ROOT/default#zrepl_CURSOR_G_607fa8e4c7df13b5_J_push_to_drive
>>>>> zfs destroy zooroot/tmp#zrepl_CURSOR_G_25ae8e2b8723a008_J_push_to_drive
>>>>> zfs destroy zooroot/usr#zrepl_CURSOR_G_344a884262b3e387_J_push_to_drive
>>>>> zfs destroy zooroot/usr/home#zrepl_CURSOR_G_2e4087f8f219bd83_J_push_to_drive
>>>>> zfs destroy zooroot/usr/ports#zrepl_CURSOR_G_fb8384d458dd82b3_J_push_to_drive
>>>>> zfs destroy zooroot/usr/src#zrepl_CURSOR_G_b867573acd8a57f8_J_push_to_drive
>>>>> zfs destroy zooroot/var#zrepl_CURSOR_G_ea9efdf01fdf65b5_J_push_to_drive
>>>>> zfs destroy zooroot/var/audit#zrepl_CURSOR_G_e71132efb0fee45a_J_push_to_drive
>>>>> zfs destroy zooroot/var/crash#zrepl_CURSOR_G_191c17e9538113f4_J_push_to_drive
>>>>> zfs destroy zooroot/var/log#zrepl_CURSOR_G_f30668295109ad60_J_push_to_drive
>>>>> zfs destroy zooroot/var/mail#zrepl_CURSOR_G_7d1eac92237e2603_J_push_to_drive
>>>>> zfs destroy zooroot/var/tmp#zrepl_CURSOR_G_d593288357e0a319_J_push_to_drive
>>>>> root@zoo-temp:~ # zfs list -t bookmark | grep ^z | awk '{print "zfs destroy "$1}' | sh
>>>>> root@zoo-temp:~ #
>>>>> root@zoo-temp:~ # zpool export zooroot
>>>>> root@zoo-temp:~ #
>>>>>
>>>>> and rebooted and it's up.
>>>>> Sadly, will need to come up with another backup
>>>>> system as sysutils/zrepl uses bookmarks :(
>>>>>
>>>>> ---Mike
>>>>>
>>>>> On 12/15/2020 1:46 PM, mike tancsa wrote:
>>>>>> Looks like the loader does not support v2 bookmarks. I am going to get
>>>>>> Paul to put in another disk to boot from, mjg will log in, either destroy
>>>>>> the bookmarks or hack a loader fix that will allow the box to boot with
>>>>>> this feature. Will be an hour or so as we have an office meeting at 2pm
>>>>>> we both have to attend.
>>>>>>
>>>>>> ---Mike
>>>>>>
>>>>>> On 12/15/2020 1:28 PM, mike tancsa wrote:
>>>>>>> I am guessing because I was using zrepl from the ports to do replication
>>>>>>> / backup to a secondary disk, the use of the bookmark_v2 feature is not
>>>>>>> supported on ZoL? Any way to recover from this?
>>>>>>>
>>>>>>> On 12/15/2020 1:10 PM, mike tancsa wrote:
>>>>>>>> OK, but the first problem to deal with :(
>>>>>>>>
>>>>>>>> BIOS drive C: is disk0
>>>>>>>> BIOS drive D: is disk1
>>>>>>>> ZFS: unsupported feature: com.datto:bookmark_v2
>>>>>>>> ZFS: pool zooroot is not supported
>>>>>>>>
>>>>>>>> Can't find /boot/zfsloader
>>>>>>>>
>>>>>>>> Can't find /boot/loader
>>>>>>>>
>>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86 boot
>>>>>>>> Default: /boot/kernel/kernel
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> Can't find /boot/kernel/kernel
>>>>>>>>
>>>>>>>> FreeBSD/x86 boot
>>>>>>>> Default: /boot/kernel/kernel
>>>>>>>> boot:
>>>>>>>>
>>>>>>>> On 12/15/2020 1:02 PM, Mateusz Guzik wrote:
>>>>>>>>> We need to update to r368649 for a pmap fix regardless of the above. I
>>>>>>>>> can do the work and make the box ready for the next reboot.
>>>>>>>>>
>>>>>>>>> On 12/15/20, mike tancsa wrote:
>>>>>>>>>> The USB backup disk was throwing errors and I was trying to export the
>>>>>>>>>> backup pool and it looks like the box is hung now. I am going to power
>>>>>>>>>> cycle it
>>>>>>>>>>
>>>>>>>>>> ---Mike
>>>>>>>>>>
>>>>
>>
>

--
Mateusz Guzik