Date: Mon, 24 Mar 2008 17:40:03 GMT
From: Patrick Donnelly <phd@oceancomputer.com>
To: freebsd-sparc64@FreeBSD.org
Subject: Re: sparc64/114349: When executing snmpd it immediately stops with a segmentation fault in disman/event/mteObjects.c
Message-ID: <200803241740.m2OHe3bF030009@freefall.freebsd.org>

The following reply was made to PR sparc64/114349; it has been noted by GNATS.

From: Patrick Donnelly <phd@oceancomputer.com>
To: bug-followup@FreeBSD.org, freebsd-lists-1@thismonkey.com
Cc:
Subject: Re: sparc64/114349: When executing snmpd it immediately stops with a segmentation fault in disman/event/mteObjects.c
Date: Mon, 24 Mar 2008 13:08:17 -0400

This issue is still present on FreeBSD 7-RELEASE; I just ran into it myself while converting a Netra box from OpenSolaris to FreeBSD.

I did some debugging and discovered a workaround. The faulting component is in the disman/event MIB, so simply adding "-I -mteObjects" to snmpd_flags in rc.conf allows snmpd to start up and function, although presumably without mteObjects functionality, whatever that is.

The following patch also *appears* to rectify the problem without disabling mteObjects:

--- table_tdata.c.orig 2008-03-24 12:28:45.062698182 -0400
+++ table_tdata.c 2008-03-24 12:21:04.111822058 -0400
@@ -464,6 +464,9 @@
     if (!table)
         return NULL;
 
+    if (!searchfor)
+        return NULL;
+
     index.oids = searchfor;
     index.len = searchfor_len;
     return CONTAINER_FIND( table->container, &index );

The actual error occurs in mteObjects.c when the code tries to dereference a null pointer. I'm somewhat perplexed as to why this bug appears to manifest only on FreeBSD (the same hardware running Solaris 10/11 with net-snmp compiled from source has no such issue) and only on sparc64.

The problem arises because the netsnmp_tdata_row_get_byoid function is matching newly created netsnmp_tdata_row structures, which have null oid_index.oids pointers, to objects in the objects_table_data global. That match causes a branch in the code to be taken which attempts to increment oid_index.oids[row->oid_index.len] on the row that has just been matched; since oid_index.oids is NULL, the program segfaults.

I'm unsure if the problem is because rows are being matched when they shouldn't be, or if oid_index.oids is just not being initialized properly somewhere.
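
For illustration, here is a minimal sketch of the failure mode described above and of what the NULL guard changes. Everything named sketch_* is a hypothetical stand-in, not the real net-snmp types or container code. The comparison rule (two zero-length indexes compare equal even when their oids pointers are NULL) is an assumption about how such a match could happen, not something confirmed in the net-snmp source. The guard flag exists only so that both behaviours can be shown side by side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; NOT the real net-snmp types. */
typedef unsigned long sketch_oid;

struct sketch_index { sketch_oid *oids; size_t len; };
struct sketch_row   { struct sketch_index oid_index; };
struct sketch_table { struct sketch_row **rows; size_t count; };

/* Assumed comparison: lengths first, then elements. Two zero-length
 * indexes compare equal even when both oids pointers are NULL. */
static int
sketch_index_equal(const struct sketch_index *a, const struct sketch_index *b)
{
    if (a->len != b->len)
        return 0;
    if (a->len == 0)
        return 1;
    return memcmp(a->oids, b->oids, a->len * sizeof(sketch_oid)) == 0;
}

/* Linear lookup standing in for the container search. With guard != 0 it
 * applies the same early return as the patch; with guard == 0 it does not. */
static struct sketch_row *
sketch_row_get_byoid(const struct sketch_table *table,
                     sketch_oid *searchfor, size_t searchfor_len, int guard)
{
    struct sketch_index key;
    size_t i;

    if (!table)
        return NULL;
    if (guard && !searchfor)
        return NULL;

    key.oids = searchfor;
    key.len = searchfor_len;
    for (i = 0; i < table->count; i++)
        if (sketch_index_equal(&table->rows[i]->oid_index, &key))
            return table->rows[i];
    return NULL;
}

/* Returns 1 when the unguarded lookup matches a freshly zeroed row
 * and the guarded lookup refuses to match it. */
static int
sketch_demo(void)
{
    struct sketch_row fresh = { { NULL, 0 } };   /* like a zeroed new row */
    struct sketch_row *rows[1];
    struct sketch_table table;
    struct sketch_row *hit, *miss;

    assert(fresh.oid_index.oids == NULL);
    rows[0] = &fresh;
    table.rows = rows;
    table.count = 1;

    hit  = sketch_row_get_byoid(&table, NULL, 0, 0);
    miss = sketch_row_get_byoid(&table, NULL, 0, 1);

    /* Touching hit->oid_index.oids[hit->oid_index.len] at this point
     * would dereference NULL, which is the crash described above. */
    return hit == &fresh && hit->oid_index.oids == NULL && miss == NULL;
}
```

In this sketch the guard only hides the symptom: a zeroed row can no longer be matched by a NULL search key, but nothing here says whether the row should have been given an index before the lookup.
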
I can't see any initialization code being run in the codepath leading up to here (the new row is malloc'd and zeroed out just 30 lines before netsnmp_tdata_row_get_byoid is called). My naive solution was to change the netsnmp_tdata_row_get_byoid function to exit immediately if given a null OID to search for, which appears to work for me, but I'm not familiar enough with the codebase to tell if that's the right thing to do in the long run.

This appears to be an upstream issue that just got shaken out by the combination of architecture and compiler on sparc64, so I'm submitting it as a bug there as well, as it doesn't seem to exist in their bug tracker yet.

Patrick Donnelly
Enterprise Network Engineer
Ocean Computer Group
90 Matawan Rd Suite 105
Matawan New Jersey, 07747
Office 732-493-1900 x245
phd@oceancomputer.com