From owner-freebsd-bugs Thu Jun 8 21:55:19 1995
Return-Path: bugs-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id VAA12454 for bugs-outgoing; Thu, 8 Jun 1995 21:55:19 -0700
Received: from blob.best.net (blob.best.net [204.156.128.88]) by freefall.cdrom.com (8.6.10/8.6.6) with ESMTP id VAA12448 for ; Thu, 8 Jun 1995 21:55:18 -0700
Received: from shell1.best.com (shell1.best.com [204.156.128.10]) by blob.best.net (8.6.12/8.6.5) with ESMTP id VAA08804 for ; Thu, 8 Jun 1995 21:54:20 -0700
Received: (dillon@localhost) by shell1.best.com (8.6.12/8.6.5) id VAA14568; Thu, 8 Jun 1995 21:54:49 -0701
Date: Thu, 8 Jun 1995 21:54:49 -0701
From: Matt Dillon
Message-Id: <199506090455.VAA14568@shell1.best.com>
To: bugs@FreeBSD.org
Subject: Re: connect() bug found and fixed (uninitialized pointer)
Sender: bugs-owner@FreeBSD.org
Precedence: bulk

    Wow! A wash of responses! Thanks for the info re: large partitions.

    Yes, the bug is pretty obvious, which is why I didn't bother with
    a diff. I am not sure what causes the manifestations myself but
    considering that ftp negotiates a connect() to a specific port, I
    am not surprised.

    Yes, ftp.best.com (our FTP/WWW server) is now running FreeBSD,
    news.best.com is also running FreeBSD. We are going to switch our
    shell machine over soon too.

    Here's a quick list of funky problems we've found so far, just FYI
    in case it rings a bell anywhere:

    * have noticed that sprintf() seems to need its arguments cast to
      the exact type expected by the '%' control elements, as reported
      in warnings if you compile something -Wall (you have no warnings
      if you do not use -Wall). Integer-type to integer-type casts are
      required... for example, char to int for %d, int to char for %c,
      time_t, gid_t, etc... not sure why. This broke wu-ftpd and a
      couple of other programs. As best as I can tell, something gets
      confused and the parameter offsets get skewed, causing corruption
      when a bad pointer (due to the skew) is dereferenced.
    * we still see random lockups occur about once a day. total lockup,
      so no way to get a crash dump, at least not so far. We're trying
      to narrow down the possibilities but no luck so far.

    * have had some problems with our HP DAT drive when DAT errors
      occur (NCR PCI SCSI).. sometimes breaks the whole scsi system.
      Haven't been able to reliably reproduce it yet.

    * our BSDI shell machine has NFS mounts to the FreeBSD FTP/WWW
      machine so users can access their FTP/WWW partition from their
      shell account. If the FTP/WWW machine is rebooted, the shell
      will 'lose' the mounts... get 'nfs server not responding' errors
      until the shell is rebooted. however, mounting new partitions
      during this condition still works. Weird...

    * it's pretty much impossible to set up a totally new scsi disk
      from scratch without using sysinstall to do it, and it won't do
      it unless you actually have it install some files on the new
      disk.

    * would be nice if sysinstall disallowed root partitions greater
      than whatever the BIOS limit is for bootloading a /kernel (30MB
      or so?), or at least gave a warning.

                                        -Matt

    Matthew Dillon   VP Engineering, BEST Internet Communications, Inc.
    [always include a portion of the original email in any response!]

From: gpalmer@westhill.cdrom.com
:In message <199506090217.TAA13416@shell1.best.com>, Matt Dillon writes:
:>    Gotta quick question for you guys too:  Is it safe to create
:>    UFS partitions greater than 2GB ?
:
:If you are running a fairly recent version of the kernel (last 2 or 3
:months - I can't remember exactly when the fixes ), and tools, then
:yes - we have our news server with a 8.6Gb partition!
:
:Gary

From: David Greenman
:>
:>    I stuck a printf() in there to catch the condition as well just to
:>    see how often it occurred... got about a hit every 10 minutes
:>    on our (very busy) FTP/WWW server from ftpd.
:
:   Interesting...I'll have a look. Want to send me a diff for how you think it
:should be fixed?
:
:>    Gotta quick question for you guys too:  Is it safe to create
:>    UFS partitions greater than 2GB ?
:
:   Yes. Walnut Creek's news server has a 9GB drive (w/single partition) for
:its spool/news, and has been working fine for a month or more.
:
:-DG

From: David Greenman
:>>    In case Dima didn't get this off to you, there's a bug in
:>>    netinet/tcp_usrreq.c: tcp_connect()... the ifaddr is left
:>>    uninitialized in the case where in_pcbladdr() fails.  The fix
:>>    is to check the error code from in_pcbladdr() and to return
:>>    it rather than fall through to the remaining code if it comes
:>>    back non-zero.
:>>
:>>    I stuck a printf() in there to catch the condition as well just to
:>>    see how often it occurred... got about a hit every 10 minutes
:>>    on our (very busy) FTP/WWW server from ftpd.
:>
:>   Interesting...I'll have a look. Want to send me a diff for how you think it
:>should be fixed?
:
:   Nevermind; it took all of about 2 seconds to see the obvious problem. It
:then took me a few minutes to determine if the problem would manifest itself.
:I think it would under some circumstances (resulting in a panic or weird
:behavior). The fix has been committed to CVS; thanks.
:
:-DG

From: Amancio Hasty
:
:>>> Matt Dillon said:
:
: >     I stuck a printf() in there to catch the condition as well just to
: >     see how often it occurred... got about a hit every 10 minutes
: >     on our (very busy) FTP/WWW server from ftpd.
: >
:
:Does that mean that ftp.best.com is running FreeBSD ?
:
:Amancio
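
The fix described in this thread (check the error code from in_pcbladdr() and return it rather than fall through) follows a general pattern that can be sketched in isolation. This is only an illustration, not the actual netinet/tcp_usrreq.c code or the committed change: struct fake_ifaddr, fake_laddr_lookup() and fake_connect() are hypothetical stand-ins invented here, and EADDRNOTAVAIL is just an assumed choice of error value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an interface address record. */
struct fake_ifaddr {
    unsigned long addr;
};

static struct fake_ifaddr fake_table[] = { { 0x7f000001UL } };

/*
 * Hypothetical stand-in for a lookup such as in_pcbladdr(): returns 0 and
 * sets *out on success; on failure returns an errno value and leaves *out
 * untouched, so the caller's pointer stays uninitialized.
 */
static int
fake_laddr_lookup(int have_route, struct fake_ifaddr **out)
{
    assert(out != NULL);
    if (!have_route)
        return EADDRNOTAVAIL;
    *out = &fake_table[0];
    return 0;
}

/*
 * Caller in the style of the fix described above: test the error code
 * and return it before anything reads the pointer.
 */
static int
fake_connect(int have_route, unsigned long *laddr)
{
    struct fake_ifaddr *ifa;    /* deliberately not initialized */
    int error;

    error = fake_laddr_lookup(have_route, &ifa);
    if (error)
        return error;           /* do not fall through with ifa unset */

    *laddr = ifa->addr;         /* safe: ifa was set on the success path */
    return 0;
}
```

Without the early return, the failure path would go on to dereference whatever happened to be on the stack in ifa, which fits the "panic or weird behavior" mentioned above.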