From owner-freebsd-fs@FreeBSD.ORG  Fri May  6 03:12:09 2011
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 81E21106566B
	for <freebsd-fs@freebsd.org>; Fri,  6 May 2011 03:12:09 +0000 (UTC)
	(envelope-from marcel@xcllnt.net)
Received: from mail.xcllnt.net (mail.xcllnt.net [70.36.220.4])
	by mx1.freebsd.org (Postfix) with ESMTP id 559F28FC12
	for <freebsd-fs@freebsd.org>; Fri,  6 May 2011 03:12:09 +0000 (UTC)
Received: from dhcp-192-168-2-13.wifi.xcllnt.net (atm.xcllnt.net [70.36.220.6])
	(authenticated bits=0)
	by mail.xcllnt.net (8.14.4/8.14.4) with ESMTP id p463BxrZ001939
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Thu, 5 May 2011 20:12:05 -0700 (PDT)
	(envelope-from marcel@xcllnt.net)
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Marcel Moolenaar <marcel@xcllnt.net>
In-Reply-To: <FCA2B7F6-F9D1-4DA2-B22C-DEFBC1B55E1B@xcllnt.net>
Date: Thu, 5 May 2011 20:11:59 -0700
Content-Transfer-Encoding: quoted-printable
Message-Id: <6B7B3E48-08D5-47D1-85B4-FAA1EEE6764C@xcllnt.net>
References: <C2BDEBA9-DD29-4B0B-B125-89B93F5997BA@dragondata.com>
	<FCA2B7F6-F9D1-4DA2-B22C-DEFBC1B55E1B@xcllnt.net>
To: Marcel Moolenaar <marcel@xcllnt.net>
X-Mailer: Apple Mail (2.1084)
Cc: freebsd-fs@freebsd.org
Subject: Re: "gpart show" stuck in loop
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 06 May 2011 03:12:09 -0000


On May 5, 2011, at 2:25 PM, Marcel Moolenaar wrote:

>
> On May 5, 2011, at 10:25 AM, Kevin Day wrote:
>
>>
>> We've had one of our boxes getting stuck with "gpart show" (called
>> from rc startup scripts) consuming 100% cpu after each reboot.
>> Manually running "gpart show" gives me:
>
> Can you send me a binary image of the first sector of da0 privately
> and also tell me what FreeBSD version you're using.

(after receiving the dump)

Hi Kevin,

I reproduced the problem:

ns1% sudo mdconfig -a -t malloc -s 5860573173
md0
ns1% sudo gpart create -s mbr md0
md0 created
ns1% gpart show md0
=>        63  4294967229  md0  MBR  (2.7T)
          63  4294967229       - free -  (2.0T)

ns1% sudo dd if=kevin-day.mbr of=/dev/md0
8+0 records in
8+0 records out
4096 bytes transferred in 0.006988 secs (586144 bytes/sec)
ns1% gpart show md0
=>        63  5860573110  md0  MBR  (2.7T)
          63  2147472747    1  freebsd  [active]  (1.0T)
  2147472810  2147472810    2  freebsd  [active]  (1.0T)
  4294945620  -2729352721    3  freebsd  [active]  ()
  1565592899   581879911       - free -  (277G)
  2147472810  2147472810    2  freebsd  [active]  (1.0T)
  4294945620  -2729352721    3  freebsd  [active]  ()
  1565592899   581879911       - free -  (277G)
  2147472810  2147472810    2  freebsd  [active]  (1.0T)
  4294945620  -2729352721    3  freebsd  [active]  ()
  1565592899   581879911       - free -  (277G)
	^C


The first problem you have is that the MBR overflows.
MBR entries hold 32-bit sector numbers, so as you can see
from my initial MBR, only 2.0T of the 2.7T can be addressed,
whereas yours claims the whole 2.7T. There must be an
overflow condition somewhere.
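
To make the overflow concrete, here is a small Python sketch of the
arithmetic. It assumes 512-byte sectors, and it assumes the on-disk
size field of slice 3 is the negative size printed above taken
modulo 2^32; that raw value is inferred, not read from the dump.
The numbers are consistent with the end of slice 3 being computed
in 32 bits, but that is a guess at the mechanism.

```python
SECTOR_SIZE = 512            # bytes per sector, as reported for md0
MBR_LIMIT = 2**32            # MBR start/size fields are 32-bit sector counts

disk_sectors = 5860573173    # size passed to mdconfig above
start = 4294945620           # start of slice 3 from the gpart output
raw_size = (-2729352721) % MBR_LIMIT   # ASSUMED on-disk size field

# Largest disk an MBR can describe with 512-byte sectors.
addressable_bytes = MBR_LIMIT * SECTOR_SIZE

# The true last sector of slice 3 does not fit in 32 bits ...
true_end = start + raw_size - 1
# ... so a 32-bit calculation would wrap it around to a small LBA.
wrapped_end = true_end % MBR_LIMIT
# A size recomputed from the wrapped end comes out negative.
apparent_size = wrapped_end - start + 1

print(addressable_bytes // 2**40, "TiB addressable")
print("raw size field:", raw_size)
print("true end:", true_end, "wrapped end:", wrapped_end)
print("apparent size:", apparent_size)
```

The wrapped end and the apparent size match the end and the size
that gpart shows for slice 3.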

The second problem is that more than 1 slice is marked
active.

Now, on to the infinite recursion in gpart. The XML has
the following pertaining to slice 3:

        <provider id="0xffffff0029ff9900">
          <geom ref="0xffffff002e742d00"/>
          <mode>r0w0e0</mode>
          <name>md0s3</name>
          <mediasize>-1397428593152</mediasize>
          <sectorsize>512</sectorsize>
          <config>
            <start>4294945620</start>
            <end>1565592898</end>
            <index>3</index>
            <type>freebsd</type>
            <offset>2199012157440</offset>
            <length>18446742676280958464</length>
            <rawtype>165</rawtype>
            <attrib>active</attrib>
          </config>
        </provider>
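
The fields in this provider are consistent with a single sector
count computed as end - start + 1 and then scaled by the sector
size. A quick Python check, assuming that is how the values are
derived (the kernel code is not quoted here):

```python
SECTOR_SIZE = 512
start, end = 4294945620, 1565592898    # <start> and <end> from the XML

sectors = end - start + 1              # negative because end < start
mediasize = sectors * SECTOR_SIZE      # signed byte count
length = mediasize % 2**64             # same bits read as unsigned 64-bit
offset = start * SECTOR_SIZE           # byte offset of the slice

print("sectors:  ", sectors)
print("mediasize:", mediasize)
print("length:   ", length)
print("offset:   ", offset)
```

So <length> and <mediasize> are the same 64-bit value, once
printed as unsigned and once as signed.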

Notice how mediasize is negative. This is a bug in the
kernel. This is also what leads to the recursion in gpart,
because gpart looks up the next partition on the disk,
given the LBA of the next sector following the partition
just processed. This allows gpart to detect free space
(the next partition found doesn't start at the given LBA)
and it allows gpart to print the partitions in order on
the disk. In any case: since the end of slice 3 is before
the start of slice 3 and even before the start of slice 2,
due to its negative size, gpart will continuously find the
same partitions:
1.  After partition 3 the "cursor" is at 1565592899.
2.  The next partition found is partition 2, at 2147472810.
3.  Therefore, sectors 1565592899 through 2147472809 are
    reported as free space.
4.  Partition 2 is printed, and partition 3 is found next.
5.  Partition 3 is printed and, due to the negative size,
    goto 1.
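
The steps above can be replayed with a simplified Python model of
the walk. This is not gpart's actual code; the function names and
the row cap are made up for illustration, and the end LBAs are the
ones implied by the output and the XML above.

```python
# Simplified model of the walk described above; not gpart's actual code.
SLICES = [
    # (index, start, end); the end of slice 3 is the wrapped value
    (1, 63, 2147472809),
    (2, 2147472810, 4294945619),
    (3, 4294945620, 1565592898),
]

def find_next(cursor):
    """Return the slice with the lowest start at or after cursor, or None."""
    candidates = [s for s in SLICES if s[1] >= cursor]
    return min(candidates, key=lambda s: s[1]) if candidates else None

def walk(first, max_rows):
    """Yield (start, size, label) rows until at least max_rows were produced."""
    cursor, rows = first, 0
    while rows < max_rows:
        nxt = find_next(cursor)
        if nxt is None:
            break
        index, start, end = nxt
        if start > cursor:                  # gap before the next slice
            yield (cursor, start - cursor, "- free -")
            rows += 1
        yield (start, end - start + 1, index)
        rows += 1
        cursor = end + 1                    # moves backwards after slice 3

for row in walk(63, 9):
    print(row)
```

Without the row cap the loop never ends, because the cursor after
slice 3 is always 1565592899.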

I think we should do two things:
1.  Protect the gpart tool against this,
2.  Fix the kernel to simply reject partitions that
    fall outside of the addressable space (as determined
    by the limitations of the scheme).

In your case it would mean that slice 3 becomes
inaccessible.
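
Both changes can be sketched in a few lines of Python. The names
entry_is_addressable and walk_guarded are hypothetical, the limit
assumes 32-bit LBAs as in the MBR scheme, and the raw size used for
slice 3 is again the inferred value, so treat this as an
illustration of the idea rather than a proposed patch.

```python
MBR_SECTORS = 2**32   # ASSUMED limit: 32-bit LBAs in the MBR scheme

def entry_is_addressable(start, size, limit=MBR_SECTORS):
    """Hypothetical kernel-side check: the whole entry must fit below limit."""
    return start >= 0 and size > 0 and start + size <= limit

def walk_guarded(slices, first):
    """Hypothetical tool-side walk over (start, end) pairs in disk order.

    Raises ValueError instead of looping when a slice ends before it starts.
    """
    cursor, rows = first, []
    while True:
        later = [s for s in slices if s[0] >= cursor]
        if not later:
            return rows
        start, end = min(later)
        if end < start:
            raise ValueError(f"slice at {start} ends before it starts")
        if start > cursor:
            rows.append((cursor, start - cursor, "free"))
        rows.append((start, end - start + 1, "slice"))
        cursor = end + 1          # always advances because end >= start

# Slice 2 fits below the limit; slice 3 (assumed raw size) does not.
print(entry_is_addressable(2147472810, 2147472810))
print(entry_is_addressable(4294945620, 1565614575))
```

With the first check in place, slice 3 would never be exposed, and
the second check keeps the tool from looping even if a bad entry
gets through.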

Given that you've been hit by this: do you feel that such
a change would be a better failure mode?

-- 
Marcel Moolenaar
marcel@xcllnt.net