From owner-freebsd-hackers Fri Nov 15 14:26:12 1996
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA21875 for hackers-outgoing; Fri, 15 Nov 1996 14:26:12 -0800 (PST)
Received: from Kitten.mcs.com (Kitten.mcs.com [192.160.127.90]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id OAA21866 for ; Fri, 15 Nov 1996 14:26:07 -0800 (PST)
Received: from Mailbox.mcs.com (Mailbox.mcs.com [192.160.127.87]) by Kitten.mcs.com (8.8.2/8.8.2) with ESMTP id QAA19565; Fri, 15 Nov 1996 16:24:35 -0600 (CST)
Received: from Jupiter.Mcs.Net (karl@Jupiter.mcs.net [192.160.127.88]) by Mailbox.mcs.com (8.8.2/8.8.2) with ESMTP id QAA12532; Fri, 15 Nov 1996 16:24:30 -0600 (CST)
Received: (from karl@localhost) by Jupiter.Mcs.Net (8.8.2/8.8.2) id QAA11694; Fri, 15 Nov 1996 16:24:15 -0600 (CST)
From: Karl Denninger
Message-Id: <199611152224.QAA11694@Jupiter.Mcs.Net>
Subject: Re: Sockets question...
To: terry@lambert.org (Terry Lambert)
Date: Fri, 15 Nov 1996 16:24:14 -0600 (CST)
Cc: karl@Mcs.Net, terry@lambert.org, fenner@parc.xerox.com, scrappy@ki.net, jdp@polstra.com, hackers@freebsd.org
In-Reply-To: <199611152209.PAA27148@phaeton.artisoft.com> from "Terry Lambert" at Nov 15, 96 03:09:53 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > No, Karl is doing this:
> >
> > 1) The *writer* is writing records of variable size with a prefix to
> >    indicate how many byte(s) follow.
> >
> > 2) The writer does this ASSUMING that all of the records will get
> >    delivered to the reader.
> >
> > 3) When the writer is done, he writes a "no more records follow"
> >    flag record.
> >
> > 4) All of those writes return with no errors.
> >
> > 5) The READER gets about 2700 of the records (out of 8500!) and NEVER
> >    SEES ANY MORE DATA.  It hangs in read()!
> >
> > This does NOT happen with the 2.6.3 development kit and libraries.  It
> > RELIABLY happens with -current.
>
> Is the data in #1 getting to the wire?

It happens with the local host on both sides (i.e., connect back to the local
hostname, in which case the wire isn't involved).

> Who is losing the data, the writer or the reader?

The writer; the reader never gets the data.

> If the reader, is it because of a buffer overflow?

The reader never sees it, and it's NOT in the mbuf clusters (netstat -an shows
nothing outstanding and the socket in a connected state for both sides).

> If so, is the reader acking for packets it does not aggregate into the
> process's read buffer, or is the writer pretending he got ack's?

See above; the writer never gets ACKs back (he only expects one at the end of
the stream, and since the reader never sees the end record he never sends the
ACK).

> What if the reader is 2.6.3 and the writer is -current?

You're dead.  The writer is the one which is important; the reader is not.

> What if the situation is reversed?

See above.

> We need to localize the problem to the client or the server (if possible),
> and then localize the problem further to the kernel interface at which
> it is occurring.
>
>					Terry Lambert
>					terry@lambert.org

It's on the writing end.  Leaving all else alone, recompiling the writer with
2.7.x breaks it; with 2.6.3 it works.

--
--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
http://www.mcs.net/~karl     | T1's from $600 monthly to FULL DS-3 Service
                             | 33 Analog Prefixes, 13 ISDN, Web servers $75/mo
Voice: [+1 312 803-MCS1 x219]| Email to "info@mcs.net" WWW: http://www.mcs.net/
Fax:   [+1 312 248-9865]     | 2 FULL DS-3 Internet links; 400Mbps B/W Internal
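
For readers following the thread, the exchange described in the quoted steps
1 through 5 (length-prefixed variable-size records, a final "no more records
follow" record, and a single ACK from the reader at the end of the stream)
might look roughly like the sketch below.  It is not the code under
discussion and makes no claim about the cause of the hang.  The 4-byte
big-endian prefix, the zero-length end marker, the one-byte ACK, and every
function name are assumptions chosen for illustration, and a local
socketpair() stands in for the connection back to the local host.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")   # assumed: 4-byte big-endian length prefix
ACK = b"\x06"                  # assumed: single-byte acknowledgement


def send_record(sock, payload):
    """Write one length-prefixed record; sendall() retries short writes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise EOFError("peer closed the connection mid-record")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_record(sock):
    """Read one record; an empty payload is the assumed end marker."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def writer(sock, records):
    """Send all records, then the end marker, then wait for the single ACK."""
    for rec in records:
        send_record(sock, rec)
    send_record(sock, b"")            # "no more records follow"
    return recv_exact(sock, 1) == ACK


def reader(sock):
    """Collect records until the end marker, then acknowledge the stream."""
    received = []
    while True:
        rec = recv_record(sock)
        if not rec:
            break
        received.append(rec)
    sock.sendall(ACK)
    return received


def demo(count=8500):
    """Run writer and reader over a local socket pair.

    Returns (sent_records, received_records, writer_saw_ack).
    """
    a, b = socket.socketpair()
    # Non-empty records of varying size, so none collides with the end marker.
    records = [bytes([i % 251 + 1]) * (i % 97 + 1) for i in range(count)]
    result = {}
    t = threading.Thread(target=lambda: result.update(acked=writer(a, records)))
    t.start()
    received = reader(b)
    t.join()
    a.close()
    b.close()
    return records, received, result["acked"]
```

Calling demo() and comparing the first two return values is one way to check
that every record written was also read.  The writer runs in its own thread
so that it cannot block on a full socket buffer while the reader is idle.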