Date:      Mon, 24 Mar 2008 17:40:03 GMT
From:      Patrick Donnelly <phd@oceancomputer.com>
To:        freebsd-sparc64@FreeBSD.org
Subject:   Re: sparc64/114349: When executing snmpd it immediately stops with a segmentation fault in disman/event/mteObjects.c
Message-ID:  <200803241740.m2OHe3bF030009@freefall.freebsd.org>

The following reply was made to PR sparc64/114349; it has been noted by GNATS.

From: Patrick Donnelly <phd@oceancomputer.com>
To: bug-followup@FreeBSD.org,
 freebsd-lists-1@thismonkey.com
Cc:  
Subject: Re: sparc64/114349: When executing snmpd it immediately stops with a segmentation fault in disman/event/mteObjects.c
Date: Mon, 24 Mar 2008 13:08:17 -0400

 This issue is still present on FreeBSD 7-RELEASE; I just ran into it
 myself while converting a Netra box from OpenSolaris to FreeBSD. I did
 some debugging and discovered a workaround: the faulting component is
 in the disman/event MIB, so simply adding "-I -mteObjects" to
 snmpd_flags in rc.conf allows snmpd to start up and function, although
 presumably without mteObjects functionality, whatever that is.
 
 The following patch also *appears* to rectify the problem without
 disabling mteObjects:
 
 --- table_tdata.c.orig  2008-03-24 12:28:45.062698182 -0400
 +++ table_tdata.c       2008-03-24 12:21:04.111822058 -0400
 @@ -464,6 +464,9 @@
       if (!table)
           return NULL;
 
 +    if (!searchfor)
 +       return  NULL;
 +
       index.oids = searchfor;
       index.len  = searchfor_len;
       return CONTAINER_FIND( table->container, &index );
 
 
 The actual error occurs in mteObjects.c when the code tries to
 dereference a null pointer. I'm somewhat perplexed as to why this bug
 appears to manifest only on FreeBSD (the same hardware running Solaris
 10/11 with net-snmp compiled from source has no such issue) and only
 on sparc64. The problem arises because the netsnmp_tdata_row_get_byoid
 function is matching newly created netsnmp_tdata_row structures, which
 have null oid_index.oids pointers, to objects in the objects_table_data
 global. That match causes a branch in the code to be taken which
 attempts to increment oid_index.oids[row->oid_index.len] on the row
 that has just been matched; since oid_index.oids is NULL, the program
 segfaults.
 
 I'm unsure whether the problem is that rows are being matched when they
 shouldn't be, or that oid_index.oids is just not being initialized
 properly somewhere. I can't see any initialization code being run in
 the codepath leading up to here (the new row is malloc'd and zeroed
 out just 30 lines before netsnmp_tdata_row_get_byoid is called). My
 naive solution was therefore to change netsnmp_tdata_row_get_byoid
 to return immediately if given a null OID to search for. That appears
 to work for me, but I'm not familiar enough with the codebase to tell
 if that's the right thing to do in the long run.
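 
 To make the failure mode concrete, here is a minimal, self-contained
 sketch of the lookup with and without the guard. None of this is the
 real net-snmp code: the demo_* types, the linear scan, and the rule that
 two zero-length indexes compare equal are assumptions made purely for
 illustration, and the real CONTAINER_FIND may behave differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the net-snmp structures (illustration only). */
typedef unsigned long demo_oid;

typedef struct {
    demo_oid *oids;  /* NULL in a freshly zeroed row */
    size_t    len;   /* 0 in a freshly zeroed row */
} demo_index;

typedef struct {
    demo_index oid_index;
} demo_row;

typedef struct {
    demo_row **rows;
    size_t     count;
} demo_table;

/* Assumed comparison: equal lengths and equal contents. Two empty
 * indexes therefore compare equal even if both pointers are NULL. */
static int demo_index_equal(const demo_index *a, const demo_index *b)
{
    if (a->len != b->len)
        return 0;
    if (a->len == 0)
        return 1;
    return memcmp(a->oids, b->oids, a->len * sizeof(demo_oid)) == 0;
}

/* Shared linear search standing in for the container lookup. */
static demo_row *demo_find(demo_table *table, const demo_index *key)
{
    size_t i;

    for (i = 0; i < table->count; i++) {
        if (demo_index_equal(&table->rows[i]->oid_index, key))
            return table->rows[i];
    }
    return NULL;
}

/* Lookup without the guard: a NULL search key can still match a row. */
demo_row *demo_row_get_byoid_unguarded(demo_table *table,
                                       demo_oid *searchfor,
                                       size_t searchfor_len)
{
    demo_index key;

    if (!table)
        return NULL;

    key.oids = searchfor;
    key.len  = searchfor_len;
    return demo_find(table, &key);
}

/* Lookup with the guard from the patch: bail out on a NULL search key. */
demo_row *demo_row_get_byoid_guarded(demo_table *table,
                                     demo_oid *searchfor,
                                     size_t searchfor_len)
{
    demo_index key;

    if (!table)
        return NULL;

    if (!searchfor)
        return NULL;

    key.oids = searchfor;
    key.len  = searchfor_len;
    return demo_find(table, &key);
}
```

 In this sketch, a caller that receives the zeroed row from the unguarded
 lookup and then touches oid_index.oids[oid_index.len] would dereference
 NULL, which is the shape of the crash described above. The guarded
 variant never hands such a row back for a NULL key.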
 
 This appears to be an upstream issue that just got shaken out by the  
 combination of architecture and compiler on sparc64, so I'm submitting  
 it as a bug there as well, as it doesn't seem to exist in their bug  
 tracker yet.
 
 Patrick Donnelly
 Enterprise Network Engineer
 Ocean Computer Group
 90 Matawan Rd
 Suite 105
 Matawan New Jersey, 07747
 Office 732-493-1900 x245
 phd@oceancomputer.com
 