From owner-freebsd-questions@FreeBSD.ORG  Mon Mar 12 10:09:17 2007
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
X-Original-To: freebsd-questions@freebsd.org
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 90CBA16A405
	for <freebsd-questions@freebsd.org>;
	Mon, 12 Mar 2007 10:09:17 +0000 (UTC)
	(envelope-from nvass@teledomenet.gr)
Received: from wmail.teledomenet.gr (wmail.teledomenet.gr [213.142.128.16])
	by mx1.freebsd.org (Postfix) with ESMTP id 1DF7413C455
	for <freebsd-questions@freebsd.org>;
	Mon, 12 Mar 2007 10:09:17 +0000 (UTC)
	(envelope-from nvass@teledomenet.gr)
Received: from iris (unknown [192.168.1.71])
	by wmail.teledomenet.gr (Postfix) with ESMTP id 230241C8834;
	Mon, 12 Mar 2007 11:45:17 +0200 (EET)
From: Nikos Vassiliadis <nvass@teledomenet.gr>
To: Modulok <modulok@gmail.com>
Date: Mon, 12 Mar 2007 12:10:18 +0200
User-Agent: KMail/1.9.1
References: <64c038660703080349t3311fa22lf8e6ba736db330ed@mail.gmail.com>
	<200703091608.47529.nvass@teledomenet.gr>
	<64c038660703091352u7d8e498eq8049b78a34933555@mail.gmail.com>
In-Reply-To: <64c038660703091352u7d8e498eq8049b78a34933555@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200703121210.19301.nvass@teledomenet.gr>
Cc: freebsd-questions@freebsd.org
Subject: Re: Kill a hanged disk i/o process...
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 12 Mar 2007 10:09:17 -0000

On Friday 09 March 2007 23:52, Modulok wrote:
> "How do I work-around a situation where cp, hangs forever?"

You can try several things:
1) mount the filesystem read-only and try to copy
   the data. dd(1) might be a better choice than
   cp(1); read more below.

2) unmount and dump(8) the filesystem.

3) use dd(1) to copy the filesystem to a file (use
    conv=noerror,sync to keep going past I/O errors)
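
To see what conv=noerror,sync does without touching a real disk,
you can run dd(1) on an ordinary file. This is only a sketch, and
the file names are made up for illustration. With sync, every
short input block is padded with NUL bytes up to the block size,
so offsets in the copy stay aligned with the source even when a
read comes back short or fails:

```shell
#!/bin/sh
# Demo on scratch files (hypothetical names); no disk device is touched.
workdir=$(mktemp -d)
src="$workdir/source.bin"
img="$workdir/image.bin"

# Create a 1000-byte input, deliberately not a multiple of 512.
dd if=/dev/zero of="$src" bs=1000 count=1 2>/dev/null

# noerror: continue after a read error instead of stopping.
# sync:    pad every short or failed input block with NULs up to bs.
dd if="$src" of="$img" bs=512 conv=noerror,sync 2>/dev/null

src_size=$(wc -c < "$src" | tr -d ' ')
img_size=$(wc -c < "$img" | tr -d ' ')
echo "source: $src_size bytes, image: $img_size bytes"

rm -rf "$workdir"
```

The second block read returns only 488 bytes, so dd pads it to 512
and the image ends up at 1024 bytes. On a failing disk the same
padding replaces the blocks that could not be read.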

I assume that:
1) your operating system lives on a healthy disk and
2) you have enough space on that healthy disk
    to hold the copied data.

Keep in mind the warning from the BUGS section of
the mount(8) manual page:
 It is possible for a corrupted file system to cause a crash.

I would try method 3 first. It gives me the
opportunity to try different things, for example
fsck(8), on the file-backed copy instead of on the
actual filesystem.
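
As a rough sketch of method 3 on FreeBSD, assuming you are root:
the function name, the device /dev/ad1s1a, the image path and the
mount point are all hypothetical, so substitute your own. The
commands themselves are dd(1), mdconfig(8), fsck_ffs(8) and
mount(8):

```shell
#!/bin/sh
# Sketch only: names and paths below are illustrative assumptions.
rescue_image() {
    disk="$1"     # failing partition, e.g. /dev/ad1s1a (hypothetical)
    image="$2"    # image file on the healthy disk (hypothetical path)

    # Copy the raw partition; unreadable blocks are replaced by NULs.
    dd if="$disk" of="$image" bs=64k conv=noerror,sync || return 1

    # Attach the image as a vnode-backed memory disk.
    # mdconfig prints the name of the unit it created, e.g. md0.
    md=$(mdconfig -a -t vnode -f "$image") || return 1

    # Check and repair the copy, never the original disk.
    fsck_ffs -y "/dev/$md"

    # Mount the repaired copy read-only and pull the files out.
    mount -o ro "/dev/$md" /mnt
}

# Example call (hypothetical device and path):
# rescue_image /dev/ad1s1a /backup/ad1s1a.img
```

A smaller bs loses less data around each bad sector but makes the
copy slower. When you are done, umount /mnt and detach the unit
with mdconfig -d -u followed by the unit name.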

HTH, Nikos

> 
> -Modulok-
> 
> On 3/9/07, Nikos Vassiliadis <nvass@teledomenet.gr> wrote:
> > On Friday 09 March 2007 15:28, Modulok wrote:
> > > Thank you for your reply, it was quite informative and very much
> > > appreciated, but the underlying question remains un-answered:
> > >
> > > How do you kill a hanged process that (seemingly) cannot be killed because
> > > of the two conditions below?
> > >
> > > -It's hanged, so it's not ever going to self terminate.
> > > -It's a disk i/o process so not even root can kill it.
> > >
> >
> > As I said before disk I/O is irrelevant.
> >
> > > The gentle shutdown solution doesn't work: Even during shutdown the
> > > process cannot be killed: it's hanged, it's disk i/o.
> > >
> > > How do you kill an un-killable process?
> >
> > What makes you believe there is another official way
> > to kill a process?
> >
> > Perhaps you should ask "How do I work-around a situation
> > where my rm, cp, whatever hang forever?", if that's what
> > you are looking for.
> >
> > > -Modulok-
> > >
> > >
> > > On 3/9/07, Nikos Vassiliadis <nvass@teledomenet.gr> wrote:
> > > >
> > > > On Thursday 08 March 2007 13:49, Modulok wrote:
> > > > > To the best of my knowledge, most processes can be killed explicitly
> > > > > by "kill -s KILL;" There are a few which cannot, such as disk i/o
> > > > > processes. The idea here is data integrity.
> > > >
> > > > A process might be in cannot-be-killed condition while
> > > > in kernel e.g. during a system call. That has to do with
> > > > the completion of the system call, not with data integrity.
> > > > The kernel tries to complete what was asked for.
> > > >
> > > > Also, killing a process with SIGKILL is far from safe. To put
> > > > it another way, "data integrity" can be guaranteed only
> > > > by the program itself. For example, it could have a defined
> > > > behavior when it is signaled by e.g. SIGTERM, such as
> > > > clean up data and exit. Or not. It's up to the programmer.
> > > > Sending a SIGKILL will not give it that chance. SIGKILL cannot
> > > > be handled. The process will be terminated as soon as possible.
> > > >
> > > > Also, separate the meanings "data integrity" and "filesystem
> > > > data integrity". The filesystem will be in fine condition when
> > > > a process gets killed by SIGKILL during file I/O, the data in
> > > > the file most probably not.
> > > >
> > > > >
> > > > > On the rare occasion however, (when attempting to recover data from
> > > > > corrupt disks for example), I've had a process invoked by the "cp"
> > > > > command, hang. This poses a significant problem as these processes are
> > > > > disk i/o processes, and as such cannot be terminated (even by root).
> > > > > So, other than physically hitting the reset button on the case, is
> > > > > there a more eloquent method of forcefully halting a hanged disk i/o
> > > > > process? The idea of "you don't want to terminate a disk i/o process,
> > > > > it could corrupt the data" isn't really a good argument, because if
> > > > > the process hangs and I have to punch the reset button anyway what's
> > > > > the difference?
> > > >
> > > > "Pressing the button" will leave your filesystem in an undefined state,
> > > > you are risking filesystem integrity. Keep in mind that while in use
> > > > (open files etc) a filesystem cannot be unmounted. Anyway, try to shut
> > > > the computer down, it's far more gentle than pressing the button. At
> > > > least the rest of the filesystems will be cleanly unmounted.
> > > >
> > > > Is there something in particular you want to achieve?
> > > >
> > > > Nikos
> > > >
> > >
> >
>