Date:      Thu, 27 Nov 2025 13:13:17 +0300
From:      Anthony Pankov <anthony.pankov@yahoo.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: any way to recover from I/O hang?
Message-ID:  <908785764.20251127131317@yahoo.com>
In-Reply-To: <CANCZdfqsE5aA%2BATrqXnO6MqTWYV34i8DdSCDtJ7yc-Hp6gn-tA@mail.gmail.com>
References:  <458400081.20251126171621.ref@yahoo.com> <458400081.20251126171621@yahoo.com>  <CANCZdfqsE5aA%2BATrqXnO6MqTWYV34i8DdSCDtJ7yc-Hp6gn-tA@mail.gmail.com>



Thursday, November 27, 2025, 7:58:21 AM, you wrote:

> On Wed, Nov 26, 2025 at 7:16 AM Anthony Pankov <anthony.pankov@yahoo.com>
> wrote:

>>
>> Recently I again faced a situation where some UFS (?) error on one
>> partition forced me to reboot the whole server and interrupt people's work.
>> As I understand it, the short version is:
>>
>> Reason: A process in state 'D' (uninterruptible wait/sleep) is blocked on
>> an I/O operation (e.g., waiting for a network file share, a disk operation,
>> or a device that is no longer responding) at the kernel level. The process
>> is not woken to handle signals until the condition it is waiting for is
>> resolved.
>> Solution: You cannot kill a 'D' state process with any signal, including
>> SIGKILL (kill -9). The condition must be resolved (e.g., fixing the network
>> connection, waiting for the I/O to time out, or potentially unmounting the
>> filesystem with umount -f if it's a mounted network share). In some rare
>> cases involving buggy kernel code or hardware failure, a reboot may be the
>> only option.
>>
>> In my case there was a jailed Samba sharing one UFS-formatted
>> partition. The Samba processes hung one by one and sharing stopped
>> working.
>> There were no messages in the log. GEOM reported no errors. The disks had
>> no errors. It seems that some inconsistency was introduced in UFS.
>> My plan was to stop/kill Samba, remove Samba's jail, unmount the partition
>> and run fsck.
>>
>> But 'jail -r' exited with something like 'rc.shutdown exited ... 9' and left
>> the jail running. 'pkill -KILL' on the Samba processes printed nothing and
>> did nothing. 'umount -f' said 'Device is busy'. Various utilities such as
>> 'df' hung. So I was forced to shut down the server, which runs other
>> important, working services, to resolve the situation.
>>
>> I wonder whether there is any way to handle such cases. Maybe 'umount -f'
>> could be given more power...
>>

> So no errors from geom? Or from CAM? What's the underlying storage
> hardware? And what's the wchan / stack traceback for the processes in 'D'
> state? And do you know the location of the file that's waiting for I/O?

No errors. UFS was on gmirror, and 'gmirror status' said 'OK'. Unfortunately, I was unable to do a meaningful investigation under stress. There were a number of smbd processes in a jail that ignored kill signals and everything else.
The first inconsistency fsck found later was a file whose mtime corresponded to the first problem report from a user. An hour later there were more reports, and finally the Samba share stopped working. As I understand it, the UFS inconsistency grew over that hour. Because there were multiple smbd processes, each serving a different user, no single file caused the problem.
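
For the next occurrence, here is a minimal sketch of how the wchan and kernel stack asked about above might be captured before rebooting. It assumes the FreeBSD ps(1) keywords pid, state, wchan and command, and that 'procstat -kk <pid>' prints the kernel stack of a process. The helper name find_blocked is hypothetical, and on FreeBSD idle kernel threads also show a 'D' state, so the list needs filtering by eye.

```python
import subprocess


def find_blocked(ps_output):
    """Return (pid, state, wchan, command) for every process whose
    state column starts with 'D' (uninterruptible wait)."""
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # command may contain spaces
        if len(fields) < 4:
            continue  # malformed or empty line
        pid, state, wchan, command = fields
        if state.startswith("D"):
            blocked.append((int(pid), state, wchan, command))
    return blocked


if __name__ == "__main__":
    # Assumption: ps accepts these output keywords (true for FreeBSD ps).
    out = subprocess.run(
        ["ps", "-ax", "-o", "pid,state,wchan,command"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, state, wchan, command in find_blocked(out):
        print(f"{pid}\t{state}\t{wchan}\t{command}")
        # Print the follow-up command rather than running it,
        # so the script itself never touches the stuck processes.
        print(f"  kernel stack: procstat -kk {pid}")
```

Saving this output to a file on a different, healthy filesystem avoids writing to the partition that is hanging.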


> There are a number of things that might cause this, but they are typically
> noisy about it. There are a couple of deadlock issues that might cause this,
> but more information is needed to understand what might be
> done to mitigate or prevent the deadlocks.

>> P.S. The previous time was when I was doing simple (as I thought)
>> experiments with ggated and was forced to reboot the server when 'mount' hung.
>>

> I assume ggate isn't involved when this happens now, right?

That's right. I mention this because stopping the ggate server while a ggate client is running seems to be the simplest way to reproduce the problem.

P.S. I was really hoping that "umount -f" would wake the waiting processes so that they could be killed.




>> --
>> Best regards,
>>  Anthony Pankov                         mailto:anthony.pankov@yahoo.com
>>
>>
>>


-- 
Best regards,
Anthony



