Date: Fri, 28 Nov 2025 13:36:39 +0000
From: Bob Bishop <rb@gid.co.uk>
To: Anthony Pankov <anthony.pankov@yahoo.com>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>
Subject: Re: any way to recover from I/O hang?
Message-ID: <23A5EAA0-F4E3-4195-92B1-89A090775282@gid.co.uk>
In-Reply-To: <458400081.20251126171621@yahoo.com>
References: <458400081.20251126171621.ref@yahoo.com> <458400081.20251126171621@yahoo.com>
Hi,

> On 26 Nov 2025, at 14:16, Anthony Pankov <anthony.pankov@yahoo.com> wrote:
>
> Recently I again faced a situation where some UFS (?) error on one partition forced me to reboot the whole server and interrupt people's work.
> As I understand it, the brief is:
>
> Reason: A process in state 'D' (uninterruptible wait/sleep) is blocked on an I/O operation (e.g., waiting for a network file share, a disk operation, or a device that is no longer responding) at the kernel level. The process is not awakened to handle signals until the condition it is waiting for is resolved.
> Solution: You cannot kill a 'D' state process with any signal, including SIGKILL (kill -9). The condition must be resolved (e.g., fixing the network connection, waiting for the I/O to time out, or potentially unmounting the filesystem with umount -f if it's a mounted network share). In some rare cases involving buggy kernel code or hardware failure, a reboot may be the only option.
>
> In my case there was a jailed Samba sharing one UFS-formatted partition. The Samba processes hung one by one and sharing stopped working.
> There was no message in the log. GEOM reported no errors. The disks have no errors. It seems that some inconsistency was introduced in UFS.
> My thought was to stop/kill Samba, remove Samba's jail, unmount the partition and run fsck.
>
> But 'jail -r' exited with something like 'rc.shutdown exited ... 9' and left the jail running. 'pkill -KILL' for the Samba processes said nothing and did nothing.
> 'umount -f' said 'Device is busy'. Various utilities such as 'df' hung. So I was forced to shut down the server, which runs other important and working services, to resolve the situation.
>
> I wonder, is there any way to handle such cases? Maybe 'umount -f' could have more power...

FWIW I have found in the past that SIGBUS will sometimes unstick a process that SIGKILL won’t shift.

> P.S.
> The previous situation was when I did simple (as I thought) experiments with ggated and was forced to reboot the server when 'mount' hung.
>
> --
> Best regards,
> Anthony Pankov mailto:anthony.pankov@yahoo.com
>
>

--
Bob Bishop
rb@gid.co.uk
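
For anyone wanting to try the SIGBUS suggestion above, here is a minimal sketch of how one might find the stuck processes first and then send the signal. The function names (filter_dstate, list_dstate_pids, try_sigbus) are made up for illustration, and nothing here guarantees that a process in 'D' state will actually react; it only automates the manual steps.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any FreeBSD tool.

# Read "ps -axo pid,state,wchan,comm" style lines on stdin and print the
# PIDs whose state column starts with "D" (uninterruptible wait).
filter_dstate() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^D/ { print $1 }'
}

# List the PIDs of all processes currently in 'D' state.
list_dstate_pids() {
    ps -axo pid,state,wchan,comm | filter_dstate
}

# Send SIGBUS to one PID, wait briefly, and report whether it is still there.
try_sigbus() {
    pid=$1
    kill -BUS "$pid" 2>/dev/null || { echo "$pid: could not signal"; return 1; }
    sleep 2
    if kill -0 "$pid" 2>/dev/null; then
        echo "$pid: still present after SIGBUS"
        return 1
    fi
    echo "$pid: gone"
}

# Example use (commented out so that sourcing this file signals nothing):
#   for p in $(list_dstate_pids); do try_sigbus "$p"; done
```

Before signalling anything, it may be worth looking at the WCHAN column of the ps output, or at 'procstat -kk <pid>' on FreeBSD, to see where in the kernel the process is sleeping.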
