FreeBSD Mail Archives

Date:      Sun, 25 Feb 1996 11:28:15 -0800
From:      Paul Traina <pst@shockwave.com>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        freebsd-current@freebsd.org, jhay@mikom.csir.co.za
Subject:   Re: Bug in libc/db/hash/hash.c??? 
Message-ID:  <199602251928.LAA00548@precipice.shockwave.com>
In-Reply-To: Your message of "Mon, 26 Feb 1996 06:11:11 %2B1100." <199602251911.GAA31346@godzilla.zeta.org.au>

  From: Bruce Evans <bde@zeta.org.au>
  Subject: Re: Bug in libc/db/hash/hash.c???
  >  I'm not sure how postponing the stat helps.  The problem seems to be
  >  with concurrent accesses to the database.  First cap_mkdb opens it and
  >  it gets initialized.  This hasn't changed.  Then the getcap library
  >  opens it and it gets initialized again because the file length is 0.
  >  Oops.

  >I'm not sure what code you're looking at, but that doesn't match my version
  >of cap_mkdb.  There is no getcap library code linked with this file, it's
  >merely opened once, with flags O_CREAT | O_TRUNC, no more.

  I'm looking at the standard version of cap_mkdb.c, which hasn't changed
  since 4.4lite.  It calls cgetnext().

Yep, sorry, missed the implicit open there.

  >  I noticed a(nother) Heisenbug in the old code.  statbuf.st_size isn't
  >  initialized if stat() fails.  This only matters if stat() fails with
  >  an error other than ENOENT.  (There is a similar bug involving errno.)

  >Yes, which is why I changed it to a fstat and I only check statbuf.st_size
  >if the fstat succeeded.  Again, I did not save/restore errno because a
  >perusal of the surrounding code shows that it's in an indeterminate state
  >at that point (that is, there are calls immediately following it that would
  >change the state).

  I think the fix works because the O_CREAT flag is now honoured (perhaps
  it should check O_TRUNC too?).  I think the database was messed up when
  cgetnext() opened it without (O_CREAT | O_TRUNC).

Do you mean it should require both O_CREAT and O_TRUNC or either?  If either,
that case is covered earlier.  I'm not certain both should be required if
the file is already 0 sized.  However, I'm easy going, whichever you want
is fine by me, just as long as you clarify what you want. :-)

  >Now, the big question: "Is there still a bug with this?"  Even if cap_mkdb
  >doesn't do what you suggest,  what happens if someone /does/ do concurrent
  >opens of a file?  You're correct, there -is- a race condition for the window
  >between the open and the first hash_sync.  We could either reduce that windo
>>w
  >by doing an initial hash_sync immediately after the table is initialized
  >(yuck for two reasons), or toss this entire idea as being bad and revert
  >back to pre-pst code.

  I doubt that the old way survived concurrent opens.  The second opener
  got an empty database if the first opener hasn't synced anything.  How
  could that work?  I think it usually gets read error early, so it usually
  fails safely.  Worse can probably happen if the first opener the database
  is half written.

  You've certainly introduced a new race, but I wouldn't worry about it
  especially.  There must be more opportunities to read inconsistent
  data while the database is being updated.

Yeah, it looks like paging to disk is rather asynchronous in nature.  In my
opinion, the whole damn thing should internally be protected with advisory
locks,  but that completely breaks ndbm semantics where the callers expect
to have to lock the database themselves.  Sigh.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199602251928.LAA00548>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation