From owner-freebsd-hackers Mon Apr 9 10:59:40 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from xena.gsicomp.on.ca (cr677933-a.ktchnr1.on.wave.home.com [24.43.230.149])
    by hub.freebsd.org (Postfix) with ESMTP id 46F9237B424
    for ; Mon, 9 Apr 2001 10:59:32 -0700 (PDT)
    (envelope-from matt@gsicomp.on.ca)
Received: from hermes (hermes.gsicomp.on.ca [192.168.0.18])
    by xena.gsicomp.on.ca (8.11.1/8.11.3) with SMTP id f39HvdR23338;
    Mon, 9 Apr 2001 13:57:39 -0400 (EDT)
    (envelope-from matt@gsicomp.on.ca)
Message-ID: <001a01c0c11e$28300830$1200a8c0@gsicomp.on.ca>
From: "Matthew Emmerton"
To: , "David E. Cross"
References: <200104091540.LAA52639@cs.rpi.edu>
Subject: Re: sigh... ypserv bug still very much alive
Date: Mon, 9 Apr 2001 13:54:39 -0400
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> The ypserv bug (the one where ypserv randomly stops responding or
> just seg-faults) is still very much alive. I had to restart it
> about 11 times in the course of 20 minutes this morning. That's
> the bad news, the good news is that I started it each time with
> 'ktrace -i'.
>
> Also, in the last 200 lines of kdump output for each and every crash there
> is the sequence of calls "select(); gettimeofday();"... that sequence of
> calls never appears in the ypserv source code, but does appear in svc_tcp.c
> in librpc... my question is: "ypserv defines its own svc_run, and for
> TCP connections specifically handles things itself very carefully, how is
> the svc_tcp.c code getting called at all?"
> I think the answer to that is
> the source of the problem (it should also be noted that in the case where
> ypserv hasn't died and I have collected ktrace information -- up to 8 gig
> of it -- the "select(); gettimeofday();" sequence is _never_ called.)

I have virtually no experience with RPC or YP/NIS, but I can trace code.
Here's what I found:

Case #1

usr.sbin/ypserv/ypserv.c : main() calls svctcp_create()
lib/libc/rpc/svc_tcp.c : svctcp_create() returns an SVCXPRT with a reference
    to an initialized rendezvous handler

That in itself seems fine, but a rendezvous_request() op on the rendezvous
handler can lead to the select()/gettimeofday() sequence:

lib/libc/rpc/svc_tcp.c : rendezvous_request() calls makefd_xprt()
lib/libc/rpc/svc_tcp.c : makefd_xprt() calls xdrrec_create() with a pointer
    to readtcp()
lib/libc/rpc/svc_tcp.c : readtcp() calls select(), gettimeofday()

Case #2

usr.sbin/ypserv/ypserv.c : main() calls svc_register()
lib/libc/rpc/svc.c : svc_register() calls pmap_set()
lib/libc/rpc/pmap_clnt.c : pmap_set() *may* call clnt_create()
lib/libc/rpc/clnt_generic.c : clnt_create() calls clnttcp_create()
lib/libc/rpc/clnt_tcp.c : clnttcp_create() calls readtcp()
lib/libc/rpc/clnt_tcp.c : readtcp() calls select(), gettimeofday()

In answer to your question about "how is the svc_tcp.c code getting called
at all?":

In case #1, the path is set up when main() starts and creates the initial
TCP listener; readtcp() is then reachable for any connection accepted
through rendezvous_request().
In case #2, it's getting called when main() registers the services.

Hopefully this will aid you (and others) in tracking down this problem.

--
Matt Emmerton
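
The mechanism in case #1 is a stored function pointer, which is why select() and gettimeofday() can show up in a trace even though ypserv never calls them itself. Below is a minimal sketch of that pattern in plain C. It is not code from the FreeBSD tree: demo_readtcp(), demo_stream_create(), demo_stream_fill(), demo_run() and struct rec_stream are hypothetical stand-ins for readtcp(), xdrrec_create() and the record stream, and the select()-then-gettimeofday() order only mirrors what the kdump output reported.

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for the read callback type given to xdrrec_create(). */
typedef int (*read_fn)(void *handle, char *buf, int len);

/* Hypothetical stand-in for a record stream: it only remembers the callback. */
struct rec_stream {
    void   *handle;
    read_fn readit;
};

int select_calls;        /* number of select() calls made by the callback */
int gettimeofday_calls;  /* number of gettimeofday() calls made by the callback */

/* Plays the role of readtcp(): wait for data with select(), then read it. */
int demo_readtcp(void *handle, char *buf, int len)
{
    int fd = *(int *)handle;
    fd_set readfds;
    struct timeval timeout = { 1, 0 }, now;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    select_calls++;
    if (select(fd + 1, &readfds, NULL, NULL, &timeout) <= 0)
        return -1;                      /* timeout or error */
    gettimeofday_calls++;
    gettimeofday(&now, NULL);
    return (int)read(fd, buf, (size_t)len);
}

/* Plays the role of xdrrec_create(): store the callback, call nothing yet. */
void demo_stream_create(struct rec_stream *rs, void *handle, read_fn readit)
{
    rs->handle = handle;
    rs->readit = readit;
}

/* Later, the stream needs input and invokes whatever callback was stored. */
int demo_stream_fill(struct rec_stream *rs, char *buf, int len)
{
    return rs->readit(rs->handle, buf, len);
}

/* The "application": it never names select() or gettimeofday() itself. */
int demo_run(char *out, int outlen)
{
    int fds[2], n;
    struct rec_stream rs;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "ping", 4) != 4) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    demo_stream_create(&rs, &fds[0], demo_readtcp);
    n = demo_stream_fill(&rs, out, outlen);
    close(fds[0]);
    close(fds[1]);
    return n;
}

#ifdef DEMO_MAIN
/* Build with -DDEMO_MAIN to run the sketch as a standalone program. */
int main(void)
{
    char buf[16];
    int n = demo_run(buf, (int)sizeof buf);

    printf("read %d bytes; select() calls: %d, gettimeofday() calls: %d\n",
        n, select_calls, gettimeofday_calls);
    return n == 4 ? 0 : 1;
}
#endif
```

If the real path works like this, a trace of demo_run() would show select() and gettimeofday() only when the stored callback is invoked to read data, not when the stream is created.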