Date: Mon, 26 Feb 1996 06:11:11 +1100
From: Bruce Evans <bde@zeta.org.au>
To: bde@zeta.org.au, pst@shockwave.com
Cc: freebsd-current@freebsd.org, jhay@mikom.csir.co.za
Subject: Re: Bug in libc/db/hash/hash.c???
Message-ID: <199602251911.GAA31346@godzilla.zeta.org.au>

> I'm not sure how postponing the stat helps.  The problem seems to be
> with concurrent accesses to the database.  First cap_mkdb opens it and
> it gets initialized.  This hasn't changed.  Then the getcap library
> opens it and it gets initialized again because the file length is 0.
> Oops.

>I'm not sure what code you're looking at, but that doesn't match my version
>of cap_mkdb.  There is no getcap library code linked with this file, it's
>merely opened once, with flags O_CREAT | O_TRUNC, no more.

I'm looking at the standard version of cap_mkdb.c, which hasn't changed
since 4.4lite.  It calls cgetnext().

> I noticed a(nother) Heisenbug in the old code.  statbuf.st_size isn't
> initialized if stat() fails.  This only matters if stat() fails with
> an error other than ENOENT.  (There is a similar bug involving errno.)

>Yes, which is why I changed it to a fstat and I only check statbuf.st_size
>if the fstat succeeded.  Again, I did not save/restore errno because a
>perusal of the surrounding code shows that it's in an indeterminate state
>at that point (that is, there are calls immediately following it that would
>change the state).

I think the fix works because the O_CREAT flag is now honoured (perhaps
it should check O_TRUNC too?).  I think the database was messed up when
cgetnext() opened it without (O_CREAT | O_TRUNC).

>Now, the big question: "Is there still a bug with this?"  Even if cap_mkdb
>doesn't do what you suggest, what happens if someone /does/ do concurrent
>opens of a file?  You're correct, there -is- a race condition for the window
>between the open and the first hash_sync.  We could either reduce that window
>by doing an initial hash_sync immediately after the table is initialized
>(yuck for two reasons), or toss this entire idea as being bad and revert
>back to pre-pst code.

I doubt that the old way survived concurrent opens.  The second opener
got an empty database if the first opener hasn't synced anything.  How
could that work?
I think it usually gets a read error early, so it usually fails safely.
Worse can probably happen if the first opener has the database half
written.

You've certainly introduced a new race, but I wouldn't worry about it
especially.  There must be more opportunities to read inconsistent data
while the database is being updated.

Bruce
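
For illustration only, here is a minimal sketch of the pattern discussed above: open first, then fstat() the descriptor, and look at st_size only if fstat() succeeded. It is not the code in hash.c. The helper name open_db() and its is_new out-parameter are hypothetical, and treating a zero-length file as "needs initializing" is an assumption based on the description in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the real hash.c code: open `path` and report
 * through `is_new` whether the caller should initialize a fresh table.
 * Returns the descriptor, or -1 with errno set by open() or fstat().
 */
static int
open_db(const char *path, int flags, mode_t mode, bool *is_new)
{
	struct stat sb;
	int fd, saved_errno;

	assert(path != NULL && is_new != NULL);

	fd = open(path, flags, mode);
	if (fd == -1)
		return (-1);

	/* O_TRUNC guarantees an empty file, so st_size is not needed. */
	if (flags & O_TRUNC) {
		*is_new = true;
		return (fd);
	}

	/* Read sb.st_size only after fstat() has succeeded. */
	if (fstat(fd, &sb) == -1) {
		saved_errno = errno;	/* close() may overwrite errno */
		(void)close(fd);
		errno = saved_errno;
		return (-1);
	}

	/* Assumption: a zero-length file has never been initialized. */
	*is_new = (sb.st_size == 0);
	return (fd);
}
```

A caller would initialize the table only when is_new comes back true. The sketch still has the window described in the thread: a second opener that arrives before the first has synced anything also sees a length of 0 and would initialize again.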