From owner-freebsd-stable@FreeBSD.ORG Sat Sep 15 12:50:10 2012
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id CAECA1065677;
	Sat, 15 Sep 2012 12:50:10 +0000 (UTC)
	(envelope-from to.my.trociny@gmail.com)
Received: from mail-we0-f182.google.com (mail-we0-f182.google.com [74.125.82.182])
	by mx1.freebsd.org (Postfix) with ESMTP id 2365D8FC08;
	Sat, 15 Sep 2012 12:50:09 +0000 (UTC)
Received: by weyx56 with SMTP id x56so3468363wey.13
	for ; Sat, 15 Sep 2012 05:50:08 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=20120113;
	h=sender:date:from:to:cc:subject:message-id:references:mime-version
	:content-type:content-disposition:in-reply-to:user-agent;
	bh=SsfJ72yD4jD8VnVpNXEdGeexSz9vNa4rC3fBkbd3iXc=;
	b=WoO/AsWH+CF88KafhbLCcYeNPGP7zkkVcB/m6WzXlirCbeNQrrShvi4LHsl54jvjMc
	5ocdICuOoiH3vkDcygYCAYq34MusXFgoSklnqQP6C1BbePMg6DOKrDys4/VtyUZbzbBR
	V9By3/YLLwWM0Pr+NOK5wferajjLYDH5UPxueWaSCpkrWWR2U3IIc29Uemj2x0B8Rj9i
	zykAEEng0wvCLsmOZwAD5ImvMFjpAoiDGMtLAiNgfLYYZRAGThhVPX3/647yW9HeYDud
	l4/4uBNQmBwbIH4RW7vs0VbDgW35k1khDb3tc2t+22xKHY0gT0gqechX5pJaCXXfpbxq
	hLQQ==
Received: by 10.217.1.206 with SMTP id n56mr3054949wes.151.1347713408854;
	Sat, 15 Sep 2012 05:50:08 -0700 (PDT)
Received: from localhost ([95.69.174.83])
	by mx.google.com with ESMTPS id hv8sm9172002wib.0.2012.09.15.05.50.06
	(version=TLSv1/SSLv3 cipher=OTHER);
	Sat, 15 Sep 2012 05:50:07 -0700 (PDT)
Sender: Mikolaj Golub
Date: Sat, 15 Sep 2012 15:50:05 +0300
From: Mikolaj Golub
To: Miroslav Lachman <000.fbsd@quip.cz>
Message-ID: <20120915125003.GA91163@gmail.com>
References: <504D10A7.1070701@quip.cz>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="ZGiS0Q5IWpPtfppv"
Content-Disposition: inline
In-Reply-To: <504D10A7.1070701@quip.cz>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Hartmut Brandt, freebsd-stable@freebsd.org
Subject: Re: bsnmpd always died on HDD detach
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Sat, 15 Sep 2012 12:50:11 -0000

--ZGiS0Q5IWpPtfppv
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:

> I am running bsnmpd with basic snmpd.config (only community and location
> changed).
>
> When there is a problem with HDD and disk disappeared from ATA channel
> (eg.: disc physically removed) the bsnmpd always dumps core:
>
> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>
> I see this for a long time on all releases of 7.x and 8.x branches (i386
> and amd64). I did not test 9.x.

Ok, I was able to reproduce this under qemu by running

  atacontrol detach ata1

It crashes in the snmp_hostres module, in
refresh_device_tbl->refresh_disk_storage_tbl->disk_OS_get_ATA_disks,
when traversing the device_map list and dereferencing map->entry_p,
which is NULL here.

The device_map table is used for consistent device table indexing.

refresh_device_tbl(), the refresh routine for hrDeviceTable, checks the
list of available devices and calls device_entry_delete() for devices
that are gone. It does not remove the entry from the device_map table,
but just sets entry_p to NULL for it (so that the index stays reserved
and is not reused by another device).

Then refresh_disk_storage_tbl() is called, which in turn calls

  disk_OS_get_ATA_disks();
  disk_OS_get_MD_disks();
  disk_OS_get_disks();

and it crashes in disk_OS_get_ATA_disks() when the removed map entry is
dereferenced.

I am attaching the patch that fixes the issue for me.

I was wondering why the issue was not observed after md device removal,
as disk_OS_get_MD_disks() does the same thing. It turned out that
hostres simply does not see md devices, so this function is currently
useless. hostres gets devices from devinfo(3), which does not return md
devices.
disk_OS_get_disks() calls kern.disks sysctl to get the list of disks,
and uses device_map differently, so it is not affected.

-- 
Mikolaj Golub

--ZGiS0Q5IWpPtfppv
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: inline; filename="hostres_diskstorage_tbl.c.skip.patch"

Index: usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c
===================================================================
--- usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c	(revision 240529)
+++ usr.sbin/bsnmpd/modules/snmp_hostres/hostres_diskstorage_tbl.c	(working copy)
@@ -287,6 +287,9 @@ disk_OS_get_ATA_disks(void)
 
 	/* Walk over the device table looking for ata disks */
 	STAILQ_FOREACH(map, &device_map, link) {
+		/* Skip deleted entries. */
+		if (map->entry_p == NULL)
+			continue;
 		for (found = lookup; found->media != DSM_UNKNOWN; found++) {
 			if (strncmp(map->name_key, found->dev_name,
 			    strlen(found->dev_name)) != 0)
@@ -345,6 +348,9 @@ disk_OS_get_MD_disks(void)
 
 	/* Look for md devices */
 	STAILQ_FOREACH(map, &device_map, link) {
+		/* Skip deleted entries. */
+		if (map->entry_p == NULL)
+			continue;
 		if (sscanf(map->name_key, "md%d", &unit) != 1)
 			continue;
 

--ZGiS0Q5IWpPtfppv--
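
For illustration only, here is a minimal standalone sketch of the failure
mode described in the message and of the "skip deleted entries" guard the
patch adds. It is not the hostres code: the structures are cut-down,
hypothetical stand-ins that keep only the name_key, entry_p and link
fields mentioned above, and map_add(), map_mark_deleted(),
count_live_ata_disks() and demo() are names invented for this sketch. It
assumes a <sys/queue.h> that provides the STAILQ macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a device table entry. */
struct device_entry {
	int index;
};

/* Hypothetical stand-in for a device_map entry. */
struct device_map_entry {
	int index;			/* index stays reserved after removal */
	char name_key[32];		/* device name, e.g. "ad0" */
	struct device_entry *entry_p;	/* NULL once the device is gone */
	STAILQ_ENTRY(device_map_entry) link;
};

STAILQ_HEAD(device_map_list, device_map_entry);

/* Append a map entry together with a live device entry. */
static struct device_map_entry *
map_add(struct device_map_list *head, int index, const char *name)
{
	struct device_map_entry *m = calloc(1, sizeof(*m));

	assert(m != NULL);
	m->index = index;
	strncpy(m->name_key, name, sizeof(m->name_key) - 1);
	m->entry_p = calloc(1, sizeof(*m->entry_p));
	assert(m->entry_p != NULL);
	m->entry_p->index = index;
	STAILQ_INSERT_TAIL(head, m, link);
	return (m);
}

/* Model a detached device: drop the entry, keep the map slot. */
static void
map_mark_deleted(struct device_map_entry *m)
{
	free(m->entry_p);
	m->entry_p = NULL;
}

/* Count "ad" devices that are still present. */
static int
count_live_ata_disks(struct device_map_list *head)
{
	struct device_map_entry *m;
	int n = 0;

	STAILQ_FOREACH(m, head, link) {
		/* Skip deleted entries; without this check the
		 * dereference below would hit a NULL pointer. */
		if (m->entry_p == NULL)
			continue;
		if (strncmp(m->name_key, "ad", 2) != 0)
			continue;
		if (m->entry_p->index == m->index)
			n++;
	}
	return (n);
}

/* Build three ATA entries, detach the middle one, count the rest. */
int
demo(void)
{
	struct device_map_list head = STAILQ_HEAD_INITIALIZER(head);
	struct device_map_entry *gone, *m;
	int n;

	map_add(&head, 1, "ad0");
	gone = map_add(&head, 2, "ad1");
	map_add(&head, 3, "ad2");
	map_mark_deleted(gone);

	n = count_live_ata_disks(&head);

	while ((m = STAILQ_FIRST(&head)) != NULL) {
		STAILQ_REMOVE_HEAD(&head, link);
		free(m->entry_p);
		free(m);
	}
	return (n);
}
```

Calling demo() walks a list in which one of three entries has had its
entry_p cleared; the guard lets the traversal continue past that slot
instead of dereferencing NULL, which is the same shape as the two hunks
in the attached patch.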