Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Jul 2013 15:50:59 -0700
From:      Sean Bruno <sean_bruno@yahoo.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r253708 - head/sys/dev/ipmi
Message-ID:  <1375138259.1479.69.camel@localhost>
In-Reply-To: <201307291617.39898.jhb@freebsd.org>
References:  <201307271632.r6RGWYF8046749@svn.freebsd.org> <201307291054.55820.jhb@freebsd.org> <1375127952.1479.32.camel@localhost> <201307291617.39898.jhb@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--=-VuE9jZzijLk+qQydoXyk
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

[sbruno_comment_blocks =3D=3D 4]

>=20
> The identify function in 7.x has no such check:
>=20
> static void
> ipmi_isa_identify(driver_t *driver, device_t parent)
> {
> 	struct ipmi_get_info info;
> 	uint32_t devid;
>=20
> 	if (ipmi_smbios_identify(&info) && info.iface_type !=3D SSIF_MODE &&
> 	    device_find_child(parent, "ipmi", -1) =3D=3D NULL) {

Ok then what is this ^^^^^^^^^ ?  Doesn't this mean that if
device_find_child() returns a child node that we should abort?  Is that
not the same as what I'm going on about?

>=20
> The only check in 7 was the one you just moved in ipmi_isa_attach():
>=20
> static int
> ipmi_isa_attach(device_t dev)
> {
> 	struct ipmi_softc *sc =3D device_get_softc(dev);
> 	struct ipmi_get_info info;
> 	const char *mode;
> 	int count, error, i, type;
>=20
> 	/*
> 	 * Pull info out of the SMBIOS table.  If that doesn't work, use
> 	 * hints to enumerate a device.
> 	 */
> 	if (!ipmi_smbios_identify(&info) &&
> 	    !ipmi_hint_identify(dev, &info))
> 		return (ENXIO);
>=20
> 	/*
> 	 * Give other drivers precedence.  Unfortunately, this doesn't
> 	 * work if we have an SMBIOS table that duplicates a PCI device
> 	 * that's later on the bus than the PCI-ISA bridge.
> 	 */
> 	if (ipmi_attached)
> 		return (EBUSY);
> 	...
> }
>=20
> > Am I just confused on the bus relationship here?
> >=20
> > We've gone over this a couple of times in different emails on different
> > lists.  I've just never sat down and walked through the code.  If you
> > see a better way to keep ipmi(4) from erroneously attaching to the ISA
> > interface, let me know.
>=20
> I haven't seen any mention of this problem before.  I've seen threads abo=
ut
> the watchdog issue (trying to set watchdog on shutdown which wants to use
> threads, etc.), but not this.
>=20

http://lists.freebsd.org/pipermail/freebsd-stable/2012-March/067023.html

The thread gets broken apart by the mail list software though.
Somewhere in there I point at ipmi1 things.  But hell if I can find it
anymore.

> Also, can you provide the console messages without this patch?  The previ=
ous
> check in ipmi_isa_attach() is long before we touch the BMC or ever get
> around to creating /dev/ipmi1.  (Just because you see ipmi1: in dmesg doe=
sn't
> mean it's created /dev/ipmi1.)
>=20
Definitely does *not* create /dev/ipmi1.

> > > >   Move the check for ipmi_attached out of the ipmi_isa_attach funct=
ion and into
> > > >   the ipmi_isa_identify function.  Remove the check of the device t=
ree for
> > > >   ipmi devices attached.
> > > >  =20
> > > >   This probing appears to make Broadcom management firmware on Dell=
 machines
> > > >   crash and emit NMI EISA warnings at various times requiring power=
 cycles
> > > >   of the machines to restore.
> > >=20
> > > This makes no sense.  All you are doing is skipping ipmi_smbios_ident=
ify()
> > > which just looks at the SMBIOS table in RAM.  It doesn't try to probe=
 the
> > > BMC at all (no register accesses, etc.).  If just reading a table in =
memory
> > > causes side effects, then running dmidecode in userland should be hos=
ing your
> > > machines as well.
> > >=20
> >=20
> > Probably right.  I'm not exactly sure what is making the Broadcom
> > firmware fall over and die.  But when I see the console emitting "NMI
> > EISA" error messages, this ususally requires a full reboot as the
> > network interface has crashed.  Hopefully, I can dig up more "fact" soo=
n
> > as testing continues.
>=20
> I'd rather be sure this is the right fix, and if it isn't I'd prefer to
> revert this as I don't think it is actually fixing anything.
>=20
It definitely does *not* have the effect that I advertised in my commit
message.

the commit DOES:
-- remove any attempt to do anything in ipmi_isa_* functions.
-- does not emit any errors on attach failure (which are noisy and
   distracting).
-- make attaching to ipmi0 more "reliable" by blindly raising the
   timeout value to 6 seconds.  (6 seconds is the totally empirical
   value I came up with in testing that never failed to attach across
   100+ reboots).

I disagree that it should be reverted.  We can argue about it if you
wish and I'm open to modifying back to the original code.  I don't think
I'd agree with removing the error messages on attachment failure though.
I view the attachment failures as "sysadmin noise" but they should be
there *if* there is a real attach failure.

sean

--=-VuE9jZzijLk+qQydoXyk
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (FreeBSD)

iQEcBAABAgAGBQJR9vHTAAoJEBkJRdwI6BaHE/gH/3teJsMzMspVr1kW/WZmVK79
xRWARA5nUZjhyKAKWqSw5zQmZM8JmnW/0EVndHnS9idLkTGshAPWR02+EoUfXO8V
wG7l54zLcu/KNM9GFssGgJONsAJciIf2g1Q2k7bLAn2wLH6ucIBMl73ZnAMdymcg
U3dGCl/8M3D3TVZUVqzqUSoYhIG5rMu2pLZPqtGz84oqF4/+ln/VHzDG+4vjJm/t
hd8yUge88pNyw/2ne61vZtYoyiElF34UNWkMy0Blu7UlleTXwAOibiGj/hqTqFy5
VDFR/IEpI8r/5sPbCYYLFZVAWkLMvc+dS4S15UNt8NP5FdNfadM631WuW8jdO9s=
=NcDm
-----END PGP SIGNATURE-----

--=-VuE9jZzijLk+qQydoXyk--




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1375138259.1479.69.camel>