From: Nate Lawson
Date: Tue, 24 Sep 2002 10:16:14 -0700 (PDT)
To: current@freebsd.org
Cc: des@freebsd.org
Subject: Coredump from pkg_add + analysis

pkg_add coredumps when installing dependencies from remote. This is
really annoying because you have to manually track down dependencies and
install them (say, for a critical package like cvsup). This has been
reported multiple times and is 100% repeatable. It's been broken for a
few months (ever since *conn was added to libfetch). Anyway, here's the
trace and analysis.

(gdb) bt
#0  0x2809d6fe in SSL_write () from /usr/lib/libssl.so.2
#1  0x080564f0 in _fetch_write (conn=0x80940c0, buf=0x8098200 "NOOP", len=4)
    at /usr/src/lib/libfetch/common.c:485
#2  0x080565a7 in _fetch_putln (conn=0x80940c0, str=0x8098200 "NOOP",
    len=134840832) at /usr/src/lib/libfetch/common.c:513
#3  0x08054023 in _ftp_cmd (conn=0x80940c0, fmt=0x8098200 "NOOP")
    at /usr/src/lib/libfetch/ftp.c:187
#4  0x08055524 in _ftp_cached_connect (url=0x80940c0, purl=0x8098200,
    flags=0x8098200 "NOOP") at /usr/src/lib/libfetch/ftp.c:867
#5  0x0805577c in _ftp_request (url=0x8093000, op=0x8059f29 "RETR", us=0x0,
    purl=0x0, flags=0x0) at /usr/src/lib/libfetch/ftp.c:933
#6  0x080558e0 in fetchXGetFTP (url=0x2f363833, us=0x2f363833,
    flags=0x8098200 "NOOP") at /usr/src/lib/libfetch/ftp.c:967
#7  0x08050e76 in fetchXGet (URL=0x8093000, us=0x0, flags=0x0)
    at /usr/src/lib/libfetch/fetch.c:83
#8  0x080511c8 in fetchXGetURL (URL=0x8098200 "NOOP", us=0x8098200,
    flags=0x8098200 "NOOP") at /usr/src/lib/libfetch/fetch.c:180
#9  0x08051200 in fetchGetURL (URL=0x8098200 "NOOP", flags=0x8098200 "NOOP")
    at /usr/src/lib/libfetch/fetch.c:192
#10 0x080500c1 in fileGetURL ()
#11 0x0804b53d in pkg_do (
    pkg=0x805e880 "ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-5-current/All/xpdf-1.01.tbz")
    at /usr/src/usr.sbin/pkg_install/add/perform.c:297
#12 0x0804aceb in pkg_perform (pkgs=0x8090c80)
    at /usr/src/usr.sbin/pkg_install/add/perform.c:50
#13 0x0804aa4f in real_main (argc=-1077937142, argv=0xbfbffb0c)
    at /usr/src/usr.sbin/pkg_install/add/main.c:215
#14 0x0804d41e in main (argc=134840832, argv=0xbfbffb00)
    at /usr/src/usr.sbin/pkg_install/lib/pkgwrap.c:88
#15 0x0804a5c9 in _start ()
(gdb) fr 1
#1  0x080564f0 in _fetch_write (conn=0x80940c0, buf=0x8098200 "NOOP", len=4)
    at /usr/src/lib/libfetch/common.c:485
485                     wlen = SSL_write(conn->ssl, buf, len);
(gdb) list
480                     }
481             }
482             errno = 0;
483     #ifdef WITH_SSL
484             if (conn->ssl != NULL)
485                     wlen = SSL_write(conn->ssl, buf, len);
486             else
487     #endif
488                     wlen = write(conn->sd, buf, len);
489             if (wlen == 0)
(gdb) print conn
$1 = (struct fetchconn *) 0x80940c0
(gdb) print *conn
$2 = {sd = 1651863599, buf = 0x6572462f, bufsize = 1146307173,
  buflen = 1919905839, err = 1764717428, ssl = 0x2f363833,
  ssl_ctx = 0x6b636170, ssl_cert = 0x73656761, ssl_meth = 0x632d352d,
  ref = 1701999221}
(gdb) print conn->ssl
$3 = (struct ssl_st *) 0x2f363833

Obviously, the problem is that we're bogusly sending an FTP command to
SSL because *conn (including conn->ssl) was overwritten with garbage, so
the conn->ssl != NULL test passes. The garbage is not random, either:
read as little-endian bytes, ssl = 0x2f363833 is "386/" and
ssl_ctx = 0x6b636170 is "pack", so the memory now holds part of the
package URL path.

Looking into it further, the problem appears to be with
cached_connection in ftp.c. Often, the connection is closed (when the
reference count goes to 0) but cached_connection is never set to NULL,
so it gets reused even though *conn is no longer valid.

There's a bit of a layering problem with the ftp/fetch semantics.
_fetch_close() is used to shut down the connection (and handles
reference counting), but the connection caching is done at the ftp
layer. Either the connection cache should be moved to the fetch layer so
open/close can deal with it properly (better), or the ftp layer needs to
check for a ref count of 1 and invalidate the cache before closing it
(worse).

A lot of people would really really appreciate it if someone would
choose an approach and fix this.

-Nate
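
For illustration, here is a minimal sketch of the first approach above, where the cache sits next to open/close so that close can invalidate it. This is not libfetch code: struct conn, conn_open(), conn_close(), cached_connect(), cached_conn and demo_reuse_after_close() are all hypothetical names made up for the example, and the socket descriptor is a dummy value. It only shows the pattern: when the reference count drops to 0, the cached pointer is cleared before the object is freed, so a later connect cannot pick up a stale pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ref-counted connection object. */
struct conn {
    int sd;   /* socket descriptor (dummy value in this sketch) */
    int ref;  /* reference count */
};

/* The cache lives in the same layer as open/close (assumption of this sketch). */
static struct conn *cached_conn = NULL;
static int conns_created = 0;  /* allocation counter, for the demo only */

static struct conn *conn_open(int sd)
{
    struct conn *c = calloc(1, sizeof(*c));
    if (c == NULL)
        return NULL;
    c->sd = sd;
    c->ref = 1;
    conns_created++;
    return c;
}

/* Drop one reference; on the last one, invalidate the cache, then free. */
static void conn_close(struct conn *c)
{
    if (--c->ref > 0)
        return;
    if (cached_conn == c)
        cached_conn = NULL;  /* without this, cached_conn would dangle */
    free(c);
}

/* Reuse the cached connection if there is one, else open and cache a new one. */
static struct conn *cached_connect(int sd)
{
    if (cached_conn != NULL) {
        cached_conn->ref++;
        return cached_conn;
    }
    struct conn *c = conn_open(sd);
    if (c != NULL)
        cached_conn = c;  /* the cache holds no reference of its own */
    return c;
}

/* Walk through reuse, final close, and reconnect; returns allocations made. */
static int demo_reuse_after_close(void)
{
    struct conn *a = cached_connect(3);
    struct conn *b = cached_connect(3);   /* reuses a, ref becomes 2 */
    assert(a != NULL && a == b);

    conn_close(a);                        /* ref 2 -> 1, still cached */
    assert(cached_conn == b);
    conn_close(b);                        /* ref 1 -> 0, cache cleared, freed */
    assert(cached_conn == NULL);

    struct conn *c = cached_connect(3);   /* must allocate a fresh object */
    assert(c != NULL && c->ref == 1);
    conn_close(c);
    return conns_created;
}
```

Calling demo_reuse_after_close() once from main() should perform two allocations: the second cached_connect() reuses the first object, and the third one has to open a new connection because the close that dropped the count to 0 also cleared the cache. The second approach from the post would instead put the check in the caller, which is why it is the more fragile of the two.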