From owner-freebsd-bugs Fri Jun 13 15:30:31 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id PAA24399
          for bugs-outgoing; Fri, 13 Jun 1997 15:30:31 -0700 (PDT)
Received: from agora.rdrop.com (root@agora.rdrop.com [199.2.210.241])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id PAA24312;
          Fri, 13 Jun 1997 15:29:55 -0700 (PDT)
Received: from george.lbl.gov (george-2.lbl.gov [131.243.2.12])
          by agora.rdrop.com (8.8.5/8.8.5) with SMTP id PAA01177;
          Fri, 13 Jun 1997 15:29:51 -0700 (PDT)
Received: (jin@localhost)
          by george.lbl.gov (8.6.10/8.6.5) id PAA24417;
          Fri, 13 Jun 1997 15:26:20 -0700
Date: Fri, 13 Jun 1997 15:26:20 -0700
From: "Jin Guojun[ITG]"
Message-Id: <199706132226.PAA24417@george.lbl.gov>
To: lambert.org!terry@agora.rdrop.com
Subject: Re: (kern/3827) : was ASUS P/I-P65UP5 with C-P55T2D
Cc: bugs@FreeBSD.ORG, smp@FreeBSD.ORG, smp@csn.net
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

}> The problem has been narrowed to the very small program below.
}> It appears to happen at the subroutine level. That is, fopen/freopen works
}> in the main() program, but fails in a subroutine. If the subroutine call
}> at line 7 in main(), marked /* replace HERE */, is replaced with the body
}> of openread(), then the program works. So, what can cause this problem?
}
}[ ... ]
}
}It looks like you are hanging in the threads scheduler.
}
}When you built your libraries, did you rebuild everything from
}scratch?  If not, that may be your problem.

I did not do anything with the libraries. They come with the 3.0-SNAP
distribution.

}I know that main is "different" in threads, but I don't understand
}why a subroutine call would cause it to fail when it doesn't fail
}in main.

That is the strange part. Most of our programs do not fail, because they
call fopen/freopen in main(). The complicated programs call fopen/freopen
in subroutines, typically threaded subroutines.
So, at first, I thought the hang was related to the program growing in
size. I wrote a small program, openread, in main() style, and it works.
Then I tried to call it from the program that fails at fopen/freopen,
which meant turning openread() into a subroutine. At that point, the simple
program fails too. So it is not a program-size issue. Since openread.c is
so small, the only thing I can see is that fopen/freopen fails in a
subroutine.

}What happens if you link it without the threaded libc?  Does it
}run OK?  (Do you have one?  Or are you running the very recently
}changed stuff?).

That is not the problem. The problem happens only when linking with libc_r,
which comes by default with the 3.0-xxx distribution. Without -lc_r,
everything is OK.

}If it does, there is apparently a hidden stack dependency bug
}which is being triggered by your code.  Check the wrappers for
}the functions called by the functions you call.  It may be
}that freopen() is not a happy camper in general, on the basis
}of descriptor locks, and you are only lucking out in main()
}when it works.
}
}Another thing to try is to take stdio out of the equation.  I
}suspect that it is an implementation assumption in stdio that
}is biting you; it may in fact be a scheduler race which gets
}drawn out when you go down.
}
}If it's a race, make the variables global; if it still fails,
}you should then be able to migrate the code in and out of the
}subroutine a line at a time to localise the error (part of this
}will be to compile the libc(3) pieces yourself, and migrate
}their function boundaries as well).
}
}Sorry I can't give you an "oh, #define FROBOZZ" type solution.
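
The suggestion above to take stdio out of the equation could be tried
with something like the sketch below. It is not the original openread.c,
which is not included in this message. The function names openread_stdio()
and openread_raw() and the OPENREAD_DEMO_MAIN macro are hypothetical,
chosen only for illustration. The first function does fopen/freopen inside
a subroutine, as in the report; the second reads the same file with plain
open(2)/read(2), so the two paths can be compared.

```c
/* Hypothetical sketch; NOT the original openread.c from the report. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* stdio path: fopen() followed by freopen(), both inside a subroutine.
 * Returns the number of bytes read, or -1 on failure. */
long openread_stdio(const char *path)
{
    char buf[512];
    long total = 0;
    size_t n;
    FILE *fp, *fp2;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    /* Reopen the same file on the same stream. */
    fp2 = freopen(path, "r", fp);
    if (fp2 == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, fp2)) > 0)
        total += (long)n;
    if (ferror(fp2)) {
        fclose(fp2);
        return -1;
    }
    fclose(fp2);
    return total;
}

/* Raw path: same read loop with open(2)/read(2), no stdio involved.
 * Returns the number of bytes read, or -1 on failure. */
long openread_raw(const char *path)
{
    char buf[512];
    long total = 0;
    ssize_t n;
    int fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += (long)n;
    close(fd);
    return n < 0 ? -1 : total;
}

#ifdef OPENREAD_DEMO_MAIN /* hypothetical macro: build with -DOPENREAD_DEMO_MAIN */
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 2;
    }
    printf("stdio: %ld bytes\n", openread_stdio(argv[1]));
    printf("raw:   %ld bytes\n", openread_raw(argv[1]));
    return 0;
}
#endif
```

Building the same file once with -lc_r and once without, then running both
on the same input, would show whether only the stdio path hangs or fails
under the threaded libc. If the raw path also misbehaves, stdio is probably
not the culprit.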