Date:      Mon, 13 Jun 2022 20:20:06 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 253954] kernel: g_access(958): provider da8 has error 6 set
Message-ID:  <bug-253954-227-J5vQkNzJzc@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-253954-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-253954-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253954

jnaughto@ee.ryerson.ca changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jnaughto@ee.ryerson.ca

--- Comment #4 from jnaughto@ee.ryerson.ca ---
Any update on this bug?  I just experienced the exact same issue.  I have 8
disks (all SATA) connected to a FreeBSD 12.3 system.  The ZFS pool is set up as
a raidz3.  When I got in today I found that one drive was "REMOVED":

# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: scrub repaired 0 in 0 days 02:32:26 with 0 errors on Sat Jun 11 05:32:26 2022
config:

        NAME                     STATE     READ WRITE CKSUM
        pool                     DEGRADED     0     0     0
          raidz3-0               DEGRADED     0     0     0
            ada0                 ONLINE       0     0     0
            ada1                 ONLINE       0     0     0
            ada2                 ONLINE       0     0     0
            ada3                 ONLINE       0     0     0
            ada4                 ONLINE       0     0     0
            8936423309855741075  REMOVED      0     0     0  was /dev/ada5
            ada6                 ONLINE       0     0     0
            ada7                 ONLINE       0     0     0
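
As an illustration of how a status listing like the one above can be checked
automatically, here is a minimal Python sketch that picks out every config row
whose state is not ONLINE. The function name `non_online_rows` and the `SAMPLE`
text (an abbreviated copy of the output above) are hypothetical, and the parser
assumes the column layout shown here rather than any documented format.

```python
def non_online_rows(status_text):
    """Return (name, state, note) for each config row that is not ONLINE."""
    rows = []
    in_config = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("config:"):
            in_config = True          # the device table starts after this line
            continue
        if stripped.startswith("errors:"):
            break                     # the device table has ended
        if not in_config or not stripped or stripped.startswith("NAME"):
            continue
        # Columns: NAME STATE READ WRITE CKSUM [free-form note]
        fields = stripped.split(None, 5)
        if len(fields) < 5:
            continue
        name, state = fields[0], fields[1]
        note = fields[5] if len(fields) > 5 else ""
        if state != "ONLINE":
            rows.append((name, state, note))
    return rows


# Abbreviated copy of the listing above (assumption: same column layout).
SAMPLE = """\
config:

        NAME                     STATE     READ WRITE CKSUM
        pool                     DEGRADED     0     0     0
          raidz3-0               DEGRADED     0     0     0
            ada4                 ONLINE       0     0     0
            8936423309855741075  REMOVED      0     0     0  was /dev/ada5
            ada6                 ONLINE       0     0     0
"""

for name, state, note in non_online_rows(SAMPLE):
    print(name, state, note)
```

The pool and raidz3-0 rows are reported as well because they are DEGRADED; the
row of interest is the numeric GUID that stands in for the missing ada5.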

I assumed that the drive had died and pulled it.  I put a new drive in place
and attempted to replace it:

# zpool replace pool 8936423309855741075 ada5
cannot replace 8936423309855741075 with ada5: no such pool or dataset


It seems that the old drive is somehow still remembered by the system.  I dug
through the logs and found the following messages, logged when the new drive
was inserted into the system:

Jun 13 13:03:15 server kernel: cam_periph_alloc: attempt to re-allocate valid device ada5 rejected flags 0x118 refcount 1
Jun 13 13:03:15 server kernel: adaasync: Unable to attach to new device due to status 0x6
Jun 13 13:04:23 server kernel: g_access(961): provider ada5 has error 6 set
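
For readers unfamiliar with the number in the g_access line, the following
Python sketch maps it to a symbolic name. It assumes that "error 6" is an
ordinary errno value, which the log itself does not state; the helper name
`describe_errno` is hypothetical. The "status 0x6" on the adaasync line appears
to be a different kind of status code and is deliberately not decoded here.

```python
import errno
import os


def describe_errno(code):
    """Format an errno number with its symbolic name and the local description."""
    name = errno.errorcode.get(code, "unknown")
    # os.strerror() wording differs between operating systems, so no fixed
    # text is assumed here.
    return f"{code} = {name}: {os.strerror(code)}"


# Assumption: the 6 reported by g_access is an errno value.
print(describe_errno(6))
```

On the platforms I am aware of, 6 corresponds to ENXIO, which would be
consistent with the kernel still holding on to a provider whose underlying
device has gone away.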

I rebooted without the new drive in place.  After the reboot the pool status
looked somewhat different:

# zpool status pool
  pool: pool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-4J
  scan: scrub repaired 0 in 0 days 02:32:26 with 0 errors on Sat Jun 11 05:32:26 2022
config:

        NAME                      STATE     READ WRITE CKSUM
        pool                      DEGRADED     0     0     0
          raidz3-0                DEGRADED     0     0     0
            ada0                  ONLINE       0     0     0
            ada1                  ONLINE       0     0     0
            ada2                  ONLINE       0     0     0
            ada3                  ONLINE       0     0     0
            ada4                  ONLINE       0     0     0
            8936423309855741075   FAULTED      0     0     0  was /dev/ada5
            ada5                  ONLINE       0     0     0
            diskid/DISK-Z1W4HPXX  ONLINE       0     0     0

errors: No known data errors

I assumed this was because there was one fewer drive attached and the system
assigned new adaX values to each drive.  At this point, when I inserted the new
drive, it appeared as ada9, so I re-issued the zpool replace command, this time
with ada9.  The command took about 3 minutes to return, which really concerned
me.  However, the server has quite a few users accessing the filesystem, so I
thought that as long as the new drive was resilvering I would be fine.

I do a weekly scrub of the pool and I believe the error crept up after the
scrub.  At 11am today the logs showed the following:


Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status: Command timeout
Jun 13 11:29:15 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Retrying command, 0 more tries remain
Jun 13 11:30:35 172.16.20.66 kernel: ahcich5: Timeout on slot 5 port 0
Jun 13 11:30:35 172.16.20.66 kernel: ahcich5: is 00000000 cs 00000060 ss 00000000 rs 00000060 tfd c0 serr 00000000 cmd 0004c517
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status: Command timeout
Jun 13 11:30:35 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Retrying command, 0 more tries remain
Jun 13 11:31:08 172.16.20.66 kernel: ahcich5: AHCI reset: device not ready after 31000ms (tfd = 00000080)

At 11:39 I believe the following log entries are of note:

Jun 13 11:39:45 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): CAM status: Unconditionally Re-queue Request
Jun 13 11:39:45 172.16.20.66 kernel: (ada5:ahcich5:0:0:0): Error 5, Periph was invalidated
Jun 13 11:39:45 172.16.20.66 ZFS[92964]: vdev state changed, pool_guid=$5100646062824685774 vdev_guid=$8936423309855741075
Jun 13 11:39:45 172.16.20.66 ZFS[92966]: vdev is removed, pool_guid=$5100646062824685774 vdev_guid=$8936423309855741075
Jun 13 11:39:46 172.16.20.66 kernel: g_access(961): provider ada5 has error 6 set
Jun 13 11:39:47 reactor syslogd: last message repeated 1 times
Jun 13 11:39:47 172.16.20.66 syslogd: last message repeated 1 times
Jun 13 11:39:47 172.16.20.66 kernel: ZFS WARNING: Unable to attach to ada5.
Any idea what the issue was?

-- 
You are receiving this mail because:
You are the assignee for the bug.


