From owner-freebsd-hackers Mon Apr 3 15:47:17 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6)
        id PAA21774 for hackers-outgoing; Mon, 3 Apr 1995 15:47:17 -0700
Received: from skynet.ctr.columbia.edu (skynet.ctr.columbia.edu [128.59.64.70])
        by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id PAA21764;
        Mon, 3 Apr 1995 15:47:12 -0700
Received: (from wpaul@localhost) by skynet.ctr.columbia.edu (8.6.8/8.6.6)
        id RAA00659; Mon, 3 Apr 1995 17:44:20 -0400
From: "House of Debuggin'"
Message-Id: <199504032144.RAA00659@skynet.ctr.columbia.edu>
Subject: emacs + NIS + free() == ????
To: freebsd-bugs@FreeBSD.org
Date: Mon, 3 Apr 1995 17:44:16 -0400 (EDT)
Cc: freebsd-hackers@FreeBSD.org
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Content-Length: 4826
Sender: hackers-owner@FreeBSD.org
Precedence: bulk

Now I've gone and done it.

I've just received a bug report (and confirmed it myself) that
emacs-19.28 SEGVs when NIS is enabled in FreeBSD-current. The crash
happens at startup when endnetgrent() tries to free the list of netgrp
structures created by a previous call to getnetgrent():

(gdb) bt
#0  0x72198 in _free_internal (ptr=0x10eb08) at gmalloc.c:862
#1  0x72252 in free (ptr=0x10eb08) at gmalloc.c:906
#2  0x4dea7 in emacs_blocked_free (ptr=0x10eb08) at alloc.c:230
#3  0x7223c in free (ptr=0x10eb08) at gmalloc.c:904
#4  0x815cf70 in endnetgrent ()
#5  0x815cdd6 in setnetgrent ()
#6  0x815de88 in _createcaches ()
#7  0x815dc34 in __initdb ()
#8  0x815da12 in getpwuid ()
#9  0x5467a in init_editfns () at editfns.c:66
#10 0x23917 in main (argc=3, argv=0xefbfd908, envp=0xefbfd918) at emacs.c:638
(gdb)

What seems to be happening is that endnetgrent() is ending up inside
emacs's own internal version of free(). getnetgrent() and endnetgrent()
are invoked by the code which I added to getpwent.c to do
+@netgroup/-@netgroup overrides.
Since getnetgrent() was never called as part of getpwuid() before, I'm
tempted to think that this was a lurking problem that I foolishly
prodded into the open. That or I screwed something up myself, which is
equally likely.

At the moment, this has me totally stumped. Suggestions would be
greatly appreciated.

On a different note, this incident has helped me uncover another bug
related to RPC. Here's a simple test program:

#include <stdio.h>
#include <pwd.h>

int
main(void)
{
	struct passwd *pw;

	setpwent();
	/* Walk every passwd entry: local file first, then NIS if enabled. */
	while ((pw = getpwent()) != NULL)
		printf ("NAME: [%s] PASS: [%s] UID: [%d] GID: [%d] SHELL: [%s]\n",
			pw->pw_name, pw->pw_passwd, pw->pw_uid,
			pw->pw_gid, pw->pw_shell);
	return 0;
}

This just prints out the contents of your password file. With NIS
enabled, it also prints out the contents of the NIS passwd map.

Now watch:

[/tmp]:marple.ctr.columbia.edu{263}# limit
cputime         unlimited
filesize        unlimited
datasize        131072 kbytes
stacksize       65536 kbytes
coredumpsize    unlimited
memoryuse       unlimited
descriptors     256     <- max file descriptors == FD_SETSIZE
memorylocked    6668 kbytes
maxproc         179
[/tmp]:marple.ctr.columbia.edu{264}# a.out
NAME: [root] PASS: [YEAH_RIGHT] UID: [0] GID: [0] SHELL: [/bin/csh]
NAME: [toor] PASS: [*] UID: [0] GID: [0] SHELL: []
NAME: [daemon] PASS: [*] UID: [1] GID: [31] SHELL: []
NAME: [operator] PASS: [*] UID: [2] GID: [20] SHELL: [/bin/csh]
NAME: [bin] PASS: [*] UID: [3] GID: [7] SHELL: [/nonexistent]
NAME: [games] PASS: [*] UID: [7] GID: [13] SHELL: []
NAME: [news] PASS: [*] UID: [8] GID: [8] SHELL: [/nonexistent]
NAME: [man] PASS: [*] UID: [9] GID: [9] SHELL: []
NAME: [uucp] PASS: [*] UID: [66] GID: [66] SHELL: [/usr/libexec/uucp/uucico]
NAME: [ingres] PASS: [*] UID: [267] GID: [74] SHELL: [/bin/csh]
NAME: [falcon] PASS: [*] UID: [32766] GID: [31] SHELL: [/usr/games/wargames]
NAME: [nobody] PASS: [*] UID: [32767] GID: [9999] SHELL: [/nonexistent]
[nis passwd map data follows]

Now watch again:

[/tmp]:marple.ctr.columbia.edu{266}# unlimit
[/tmp]:marple.ctr.columbia.edu{267}# limit descriptors
descriptors     360     <- max file descriptors > FD_SETSIZE
[/tmp]:marple.ctr.columbia.edu{268}# a.out
yp_match: clnt_call: RPC: Unable to receive; errno = Invalid argument
clnttcp_create: RPC: Port mapper failure - RPC: Unable to receive
clnttcp_create: RPC: Port mapper failure - RPC: Unable to receive
clnttcp_create: RPC: Port mapper failure - RPC: Unable to receive
clnttcp_create: RPC: Port mapper failure - RPC: Unable to receive
clnttcp_create: RPC: Port mapper failure - RPC: Unable to receive
[...]

What's happening is that clntudp_call() is doing a select(), and it's
using the value returned by _rpc_dtablesize() instead of FD_SETSIZE as
select()'s first argument. When _rpc_dtablesize() returns a value
greater than 256 (and it does when 'unlimit' raises the maximum number
of file descriptors to 360), select() returns -1 with errno set to
EINVAL.

The select() man page seems to suggest that values larger than 256 are
valid. The RPC problem could be fixed by clamping the value returned by
_rpc_dtablesize() at 256, but that only gets around what could be buggy
behavior in select(). Again, suggestions would be appreciated.

-Bill

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~T~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 -Bill Paul            (212) 854-6020 | System Manager
 Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
 Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The Møøse Illuminati: ignore it and be confused, or join it and be confusing!
~~~~~~~~ FreeBSD 2.1.0-Development #0: Tue Mar 14 11:11:25 EST 1995 ~~~~~~~~~
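
The clamping workaround described above could be sketched in C roughly
as follows. This is only an illustration under assumptions, not the RPC
library's code: clamp_select_width(), current_select_width() and
poll_readable() are hypothetical names, and sysconf(_SC_OPEN_MAX)
stands in for _rpc_dtablesize(). It clamps at FD_SETSIZE rather than a
literal 256, and it says nothing about whether select() itself ought to
accept larger values.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: limit the first argument of select() to
 * FD_SETSIZE, since an fd_set cannot describe more descriptors than
 * that. A non-positive size (e.g. a failed lookup) also falls back
 * to FD_SETSIZE.
 */
static int clamp_select_width(long table_size)
{
    if (table_size <= 0 || table_size > FD_SETSIZE)
        return FD_SETSIZE;
    return (int)table_size;
}

/*
 * Hypothetical stand-in for a clamped _rpc_dtablesize(): ask the
 * system for the per-process descriptor limit, then clamp it.
 */
static int current_select_width(void)
{
    return clamp_select_width(sysconf(_SC_OPEN_MAX));
}

/*
 * Hypothetical helper: test whether fd is readable right now, given a
 * descriptor table size that may exceed FD_SETSIZE.
 * Returns select()'s result: 1 if readable, 0 if not, -1 on error.
 */
static int poll_readable(int fd, long table_size)
{
    fd_set readfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: do not block */
    int width = clamp_select_width(table_size);

    /* fd must fit in the fd_set and lie below the width we pass. */
    assert(fd >= 0 && fd < width);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(width, &readfds, NULL, NULL, &tv);
}
```

With this sketch, a call such as poll_readable(fd, 360) on a system
where FD_SETSIZE is 256 would hand 256 to select() instead of 360, and
current_select_width() gives the same clamped value for the running
process.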