Date:      Fri, 28 Nov 2025 08:01:22 -0800
From:      Rick Macklem <rick.macklem@gmail.com>
To:        Anthony Pankov <anthony.pankov@yahoo.com>
Cc:        Warner Losh <imp@bsdimp.com>, freebsd-hackers@freebsd.org
Subject:   Re: any way to recover from I/O hang?
Message-ID:  <CAM5tNy62h7ONqLEB3EnZ%2BRxZgOekDJZ4cBQsjq6R9kkq1Qvhow@mail.gmail.com>
In-Reply-To: <101576524.20251128180929@yahoo.com>
References:  <458400081.20251126171621.ref@yahoo.com> <458400081.20251126171621@yahoo.com> <CANCZdfqsE5aA%2BATrqXnO6MqTWYV34i8DdSCDtJ7yc-Hp6gn-tA@mail.gmail.com> <908785764.20251127131317@yahoo.com> <CANCZdfrM7mFZa1-Dh=26MH2fLgcdP4vH6iBAz2HwvaMBFYB-Hw@mail.gmail.com> <1777092313.20251128124844@yahoo.com> <CAM5tNy5NBaV1avBX-dLHgJLB7w_cJ-oEV=yzS63TEoBy4wXHMA@mail.gmail.com> <101576524.20251128180929@yahoo.com>


On Fri, Nov 28, 2025 at 7:09 AM Anthony Pankov <anthony.pankov@yahoo.com> wrote:
>
>
> Friday, November 28, 2025, 4:22:16 PM, you wrote:
>
> >>
> >> I wonder, does the kernel care about infinite I/O waits? Neither this problem nor the earlier situation with ggate looks like a deadlock to me.
> >> If I remember correctly, with ggate, reviving the ggate server also revived the 'hung' client, which had likewise been unresponsive to SIGKILL and 'umount -f'.
> >>
> >> In any case, I think it is very desirable that 'umount -f' apply more force and not give up with a "device busy" message.
> >>
> >> I partly understand that "more force" is a complex problem. Maybe something like replacing the mountpoint's vnode structures with deadfs structures.
> > This is not easy to do (as in, it is unlikely that anyone will
> > ever implement it). The big issue is that if in-progress file
> > system operations are aborted, data is lost and the file system
> > structure gets messed up. Even if people are willing to live with
> > data loss, there might be new cases that fsck(8) must handle.
>
> I don't think this will introduce new cases for fsck, because those situations ended in a cold reboot anyway. The system shutdown hangs before any 'syncing' phase. Aborting an in-progress file system programmatically and powering off look like the same case to fsck. Maybe I'm wrong.
You might be correct. My understanding (I'm not a UFS guy) is that
updates to on-disk storage are done in specific orderings which
fsck(8) depends on to fix things. (I think the assumption is that
things stop all at once when a crash occurs, but as I said, I'm
not a UFS guy).

Doing what you want *might* cause partial updates that do not
retain that ordering, or maybe the orderings will be retained?

rick

>
> > The exercise of implementing this could be a major project,
> > since all sorts of msleep()s must be changed to return and
> > then all the code that calls them must handle that special case
> > without problems. There would also be lots of waiters on locks
> > and buffer cache blocks that would need to be dealt with.
>
> > I did "umount -N" for the NFS client, which does what you
> > are asking for, and it took quite a while to implement (and
> > I wouldn't be surprised if there are still bugs that make it
> > fail in rare cases). NFS has the advantage that it does
> > not deal with file system structures like inodes and
> > indirect blocks, but it still wasn't easy to get all the
> > "waiters on buffer cache buffers" to error out, etc.
>
> From my point of view, some vnode operation on a mountpoint (a UFS partition backing a samba share) never returned and got stuck somewhere in the UFS module, which made a user process (smbd) hang. Then more user processes hit whatever UFS was stuck on. Meanwhile, other UFS partitions kept working with no problem. This gave me the idea that it should be possible to release all structures and drop all buffers related to the problematic mountpoint while keeping the server alive.
>
> For the server operator, "deadlock" in this case means that there are processes that can't be killed because they are stuck in the failed mount, and an 'umount -f' that can't unmount because of the processes using the mount.
>
> So I'm here to find a way either to get the stuck processes scheduled just long enough to be killed, or to force 'umount -f' to release the stuck processes and destroy the mount.
>
> > As Warner said "If you collect info during the hang and
> > post that, someone might be able to diagnose the
> > problem".
> > As a starting point, the output of "ps axHl" when the
> > hang occurs should show what the various processes
> > are waiting on. Once you have the "wchan" for the
> > processes/threads that are stuck, you can grep around
> > in the sources to find out what it is sleeping on.
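[The diagnostic suggested above can be sketched as a small pipeline. This is an illustration, not part of the original mail: the column positions assume FreeBSD's `ps -l` layout (MWCHAN in column 9, STAT in column 10), and the sample output in the heredoc is fabricated so the pipeline can be exercised anywhere; on a live system you would feed it from `ps axHl` instead.]

```shell
#!/bin/sh
# Sketch: pull out threads stuck in uninterruptible sleep ("D" state)
# together with their wait channels. ps_sample stands in for real
# `ps axHl` output; the rows below are made up for illustration.
ps_sample() {
cat <<'EOF'
UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN  STAT TT     TIME COMMAND
  0   631     1   0  20  0 12345  2345 select  Ss    -  0:00.01 /usr/sbin/syslogd
  0  1122  1100   0  52  0 98765  8765 ufs     D     -  0:00.42 smbd: client
  0  1123  1100   0  52  0 98765  8761 biowr   D     -  0:00.10 smbd: client
EOF
}
# Skip the header; keep rows whose STAT (field 10) begins with "D" and
# print the PID (field 2) and MWCHAN (field 9).
ps_sample | awk 'NR > 1 && $10 ~ /^D/ {print $2, $9}'
```

With the sample data this prints `1122 ufs` and `1123 biowr`; those wchan names are what you would then grep for in the kernel sources.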
>
> > It might also be a resource exhaustion situation, such
> > as "out of vnodes". I know that, when kern.maxvnodes is
> > exceeded, the system doesn't hang, but it can get so slow
> > that it appears hung. (That one is recognized by threads
> > with a wchan that starts with "v", but I cannot remember
> > the exact wchan.)
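[As a rough sketch of that vnode-exhaustion check, not from the original mail: on FreeBSD the live numbers would come from `sysctl -n kern.maxvnodes` and `sysctl -n vfs.numvnodes`; hard-coded sample values stand in here so the arithmetic is visible and the snippet runs anywhere.]

```shell
#!/bin/sh
# Sketch: warn when the active vnode count approaches the limit.
# Sample values; on FreeBSD substitute the sysctl reads shown in the
# comments.
maxvnodes=500000   # sysctl -n kern.maxvnodes
numvnodes=498000   # sysctl -n vfs.numvnodes
threshold=$((maxvnodes * 95 / 100))
if [ "$numvnodes" -ge "$threshold" ]; then
    echo "vnode pressure: $numvnodes of $maxvnodes in use"
fi
```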
>
> No, the other services on the server were working well.
>
> Sorry for the disturbance; this is just a source of grievance for me, because a lot of work went into doing things 'right' and separating the services on the server.
>
>
> > procstat -kk can also be useful, to figure out what locks
> > various processes are waiting for.
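[One quick way to digest that output, again as an illustration rather than part of the original mail: tally the most common kernel-stack frames across all threads, which shows at a glance what most stuck threads are sleeping in. On FreeBSD you would pipe in `procstat -kk -a`; the heredoc sample is fabricated and heavily simplified (real KSTACK frames carry `+0x...` offsets).]

```shell
#!/bin/sh
# Sketch: count the most frequent kernel-stack frames.
# procstat_sample stands in for real `procstat -kk -a` output.
procstat_sample() {
cat <<'EOF'
  PID    TID COMM     TDNAME  KSTACK
 1122 100200 smbd     -       mi_switch sleepq_wait _sleep bwait bufwait
 1123 100201 smbd     -       mi_switch sleepq_wait _sleep bwait bufwait
 1124 100202 smbd     -       mi_switch sleepq_wait _sleep vn_lock ffs_lock
EOF
}
# Skip the header; fields 5..NF are the stack frames. Tally them and
# show the most common ones.
procstat_sample | awk 'NR > 1 {for (i = 5; i <= NF; i++) print $i}' \
    | sort | uniq -c | sort -rn | head -3
```

With the sample data the top entries are mi_switch, sleepq_wait, and _sleep (3 threads each), with the bwait/bufwait pair just below, pointing at buffer-cache waits.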
>
> > rick
>
> >>
> >>
> >> > On Thu, Nov 27, 2025 at 3:13 AM Anthony Pankov <anthony.pankov@yahoo.com>
> >> > wrote:
> >>
> >> >>
> >> >> Thursday, November 27, 2025, 7:58:21 AM, you wrote:
> >> >>
> >> >> > On Wed, Nov 26, 2025 at 7:16 AM Anthony Pankov <anthony.pankov@yahoo.com
> >> >> >
> >> >> > wrote:
> >> >>
> >> >> >>
> >> >> Recently I'm again facing a situation where some UFS (?) error
> >> >> on one partition forced me to reboot the whole server and
> >> >> interrupt people's work. As I understand it, the summary is:
> >> >>
> >> >> Reason: A process in state 'D' (uninterruptible wait/sleep) is
> >> >> blocked on an I/O operation (e.g., waiting for a network file
> >> >> share, a disk operation, or a device that is no longer
> >> >> responding) at the kernel level. The process is not woken to
> >> >> handle signals until the condition it is waiting for is
> >> >> resolved.
> >> >> Solution: You cannot kill a 'D' state process with any signal,
> >> >> including SIGKILL (kill -9). The condition must be resolved
> >> >> (e.g., fixing the network connection, waiting for the I/O to
> >> >> time out, or potentially unmounting the filesystem with
> >> >> 'umount -f' if it's a mounted network share). In some rare
> >> >> cases involving buggy kernel code or hardware failure, a reboot
> >> >> may be the only option.
> >> >>
> >> >> In my case there was a jailed samba sharing one UFS-formatted
> >> >> partition. The samba processes hung one by one and the share
> >> >> stopped working.
> >> >> There was no message in the log. GEOM reported no errors. The
> >> >> disks had no errors. It seems that some inconsistency had been
> >> >> introduced into UFS. My plan was to stop/kill samba, remove
> >> >> samba's jail, unmount the partition, and run fsck.
> >> >>
> >> >> But 'jail -r' exited with something like 'rc.shutdown exited
> >> >> ... 9' and left the jail running. 'pkill -KILL' for the samba
> >> >> processes said nothing and did nothing. 'umount -f' said
> >> >> 'Device busy'. Various utilities such as 'df' hung. So I was
> >> >> forced to shut down the server, which runs other important and
> >> >> working services, to resolve the situation.
> >> >>
> >> >> I wonder, is there any way to handle such cases? Maybe
> >> >> 'umount -f' can have more power...
> >> >> >>
> >> >>
> >> >> > So no errors from geom? Or from CAM? What's the underlying
> >> >> > storage hardware? And what's the wchan / stack traceback for
> >> >> > the processes in 'D' state? And do you know the location of
> >> >> > the file that's waiting for I/O?
> >> >>
> >> >> No errors. UFS was on gmirror and 'gmirror status' said 'OK'.
> >> >> Unfortunately I was unable to do a meaningful investigation
> >> >> under stress. There were a number of smbd processes in the jail
> >> >> which ignored killing and anything else.
> >> >> The first inconsistency found by fsck later was a file with an
> >> >> mtime corresponding to the first report of a problem from one
> >> >> user. An hour later there were more reports, and finally the
> >> >> samba share stopped working. As I understand it, the UFS
> >> >> inconsistency grew over that hour. Because there were multiple
> >> >> smbd processes, each corresponding to a different user, no
> >> >> single file caused the problem.
> >> >>
> >> >>
> >> >> > There's a number of things that might cause this, but they
> >> >> > typically are noisy about it. There's a couple of deadlock
> >> >> > issues that might cause this, but some more information is
> >> >> > needed to understand what might be done to mitigate or
> >> >> > prevent the deadlocks.
> >> >>
> >> >> >> P.S. The previous situation was when I was doing some
> >> >> >> simple (as I thought) experiments with ggated and was forced
> >> >> >> to reboot the server when 'mount' hung.
> >> >>
> >> >> > I assume ggate isn't involved when this happens now, right?
> >> >>
> >> >> That's right. I note this because it seems that stopping the
> >> >> ggate server while a ggate client is running is the simplest
> >> >> way to reproduce the problem.
> >> >>
> >> >> P.S. I was really hoping that "umount -f" would wake the
> >> >> waiting processes so they could be killed.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> >> --
> >> >> >> Best regards,
> >> >> >>  Anthony Pankov                         mailto:anthony.pankov@yahoo.com
> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
> >>
>
>
>

