Date: Sat, 15 Sep 2012 15:50:05 +0300
From: Mikolaj Golub <trociny@FreeBSD.org>
To: Miroslav Lachman <000.fbsd@quip.cz>
Cc: Hartmut Brandt <harti@FreeBSD.org>, freebsd-stable@freebsd.org
Subject: Re: bsnmpd always died on HDD detach
Message-ID: <20120915125003.GA91163@gmail.com>
In-Reply-To: <504D10A7.1070701@quip.cz>
References: <504D10A7.1070701@quip.cz>
--ZGiS0Q5IWpPtfppv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:

> I am running bsnmpd with basic snmpd.config (only community and location
> changed).
>
> When there is a problem with HDD and disk disappeared from ATA channel
> (eg.: disc physically removed) the bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I see this for a long time on all releases of 7.x and 8.x branches (i386
> and amd64). I did not test 9.x.

Ok, I was able to reproduce this under qemu doing

  atacontrol detach ata1

It crashes in the snmp_hostres module, in
refresh_device_tbl->refresh_disk_storage_tbl->disk_OS_get_ATA_disks,
when traversing the device_map list and dereferencing map->entry_p,
which is NULL here.

The device_map table is used for consistent device table indexing.

refresh_device_tbl(), the refresh routine for hrDeviceTable, checks the
list of available devices and calls device_entry_delete() for devices
that have gone. It does not remove the entry from the device_map table,
but just sets entry_p to NULL for it (to keep its index from being
reused by another device).

Then refresh_disk_storage_tbl() is called, which in turn calls

  disk_OS_get_ATA_disks();
  disk_OS_get_MD_disks();
  disk_OS_get_disks();

and it crashes in disk_OS_get_ATA_disks() when the removed map entry is
dereferenced.

I am attaching the patch that fixes the issue for me.

I was wondering why the issue was not observed after md device removal,
as disk_OS_get_MD_disks() does the same things. It turned out that
hostres just does not see md devices, so this function is currently
useless. hostres gets devices from devinfo(3), which does not return md
devices.

disk_OS_get_disks() reads the kern.disks sysctl to get the list of
disks, and uses device_map differently, so it is not affected.
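To illustrate the failure mode described above, here is a minimal sketch
of the pattern, not the hostres code itself. The struct layouts and the
helpers map_mark_deleted() and first_live_index() are hypothetical
stand-ins; only the STAILQ macros, the name_key/entry_p/link field
names and the NULL check mirror what the message and the patch show.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-in for a device table entry. */
struct device_entry {
	int index;
};

/* Hypothetical, simplified stand-in for a device_map element. */
struct device_map_entry {
	const char *name_key;		/* device name, e.g. "ad0" */
	struct device_entry *entry_p;	/* NULL once the device is gone */
	STAILQ_ENTRY(device_map_entry) link;
};

STAILQ_HEAD(device_map_t, device_map_entry);

/*
 * Mimic the delete behaviour described in the message: the map element
 * stays on the list so its index remains reserved, and only the entry
 * pointer is cleared.
 */
static void
map_mark_deleted(struct device_map_entry *map)
{
	map->entry_p = NULL;
}

/*
 * Return the table index of the first live device whose name starts
 * with prefix, or -1 if there is none. Without the NULL check, the
 * dereference of entry_p below would fault on a deleted element.
 */
static int
first_live_index(struct device_map_t *head, const char *prefix)
{
	struct device_map_entry *map;

	assert(head != NULL);
	STAILQ_FOREACH(map, head, link) {
		/* Skip deleted entries. */
		if (map->entry_p == NULL)
			continue;
		if (strncmp(map->name_key, prefix, strlen(prefix)) != 0)
			continue;
		return (map->entry_p->index);
	}
	return (-1);
}
```

In this sketch, calling map_mark_deleted() on the element for "ad0" and
then first_live_index(&head, "ad") returns the index of the next live
"ad" device instead of crashing, which is the behaviour the attached
patch aims for in the real traversal loops.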
-- 
Mikolaj Golub

--ZGiS0Q5IWpPtfppv
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: inline; filename="hostres_diskstorage_tbl.c.skip.patch"

Index: usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c
===================================================================
--- usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c	(revision 240529)
+++ usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c	(working copy)
@@ -287,6 +287,9 @@ disk_OS_get_ATA_disks(void)
 
 	/* Walk over the device table looking for ata disks */
 	STAILQ_FOREACH(map, &device_map, link) {
+		/* Skip deleted entries. */
+		if (map->entry_p == NULL)
+			continue;
 		for (found = lookup; found->media != DSM_UNKNOWN; found++) {
 			if (strncmp(map->name_key, found->dev_name,
 			    strlen(found->dev_name)) != 0)
@@ -345,6 +348,9 @@ disk_OS_get_MD_disks(void)
 
 	/* Look for md devices */
 	STAILQ_FOREACH(map, &device_map, link) {
+		/* Skip deleted entries. */
+		if (map->entry_p == NULL)
+			continue;
 		if (sscanf(map->name_key, "md%d", &unit) != 1)
 			continue;
 

--ZGiS0Q5IWpPtfppv--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120915125003.GA91163>