Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 13 May 2015 19:58:25 -0700
From:      "Chris H" <bsd-lists@bsdforge.com>
To:        <freebsd-ports@freebsd.org>
Subject:   Re: damage to pkg's sqlite data base
Message-ID:  <b48c34d8d273750f78af21ce077935cc@ultimatedns.net>
In-Reply-To: <201505132152.t4DLqbLK000601@sdf.org>
References:  <201505120617.t4C6HkA3019096@sdf.org> <0dcbdcb9fdfdebc53a2563b11494f61b@ultimatedns.net> <201505130512.t4D5Cpqa011080@sdf.org> <5f1749d3e664fbb487aa98f160e6cf92@ultimatedns.net> <201505130720.t4D7KtMd016553@sdf.org> <a65a967965bd406b674ca618265d8b22@ultimatedns.net> <20150513145004.GA90457@ivaldir.etoilebsd.net>, <201505132152.t4DLqbLK000601@sdf.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 13 May 2015 16:52:37 -0500 Scott Bennett <bennett@sdf.org> wrote

> I'm short on time right now, so I'll defer my replies to Andrew Clark
> and Chris H until late tonight after I (hope to) have had a chance to try
> out Andrew's suggested procedure.  I do have sqlite installed, though (see
> far below).
> 
> Baptiste Daroussin <bapt@freebsd.org> wrote:
> > On Wed, May 13, 2015 at 06:48:12AM -0700, Chris H wrote:
> > > On Wed, 13 May 2015 02:20:55 -0500 Scott Bennett <bennett@sdf.org> wrote
> > > 
> > > > "Chris H" <bsd-lists@bsdforge.com> wrote:
> > > > > On Wed, 13 May 2015 00:12:51 -0500 Scott Bennett <bennett@sdf.org>
> > > wrote > > > "Chris H" <bsd-lists@bsdforge.com> wrote:
> > > > > > > On Tue, 12 May 2015 01:17:46 -0500 Scott Bennett
> > > <bennett@sdf.org> > > > wrote >
> > > > > > > > For nearly two weeks I've been stymied by an apparently damaged
> > > > > > record > > in the sqlite data base used by pkg(8) and
> > > pkg-static(8).  > > > Unfortunately, > > it is a record for a port that
> > > is depended upon rather > > > heavily, lang/gcc. > > lang/gcc compiled
> > > and linked just fine, but any > > > attempt to install the > > result
> > > ends up like this. > > > > > 
> > > > > > > > ===>  Checking if gcc already installed
> > > > > > > > ===>   Registering installation for gcc-4.8.4_3
> > > > > > > > Installing gcc-4.8.4_3...
> > > > > > > > pkg-static: sqlite error while executing iterator in file
> > > > > > > > pkgdb_iterator.c:931: database disk image is malformed
> > > pkg-static: > > > sqlite > > error while executing INSERT OR REPLACE INTO
> > > files (path, > > > sha256, > > package_id) VALUES (?1, ?2, ?3) in file
> > > pkgdb.c:1722: > > > database disk image > > is malformed *** Error code
> > > 70 > > > > > 
> > > > > > > > Stop.
> > > > > > > > make: stopped in /usr/ports/lang/gcc
> > > > > > > > 
> > > > > > > >      I'm not familiar with sqlite commands, nor with the
> > > structure of > > > the > > pkg data base, nor any of several other
> > > relevant topics.  But I > > > do have 
> > > >
> > > >      If someone knows sqlite well enough to look at the error messages
> > > above > and then give me directions to delete the offending records, then
> > > perhaps the > next attempt to install lang/gcc might actually work. 
> > > Anyone? > 
> > > > > > > > questions:
> > > > > > > > 
> > > > > > > >     1) how might the damage have happened?
> > > > > > > > 
> > > > > > > >     2) how might such an event be avoided/prevented in the
> > > future? > > > > > 
> > > > > > > >     3) most urgently, how can I *fix* it?
> > > > > > > You should get a backup made on the nightly cron(8)
> > > (periodic(8)). > > > > It'll be located in /var/backups
> > > > > > > Off the top of my head I can't remember which of the 2, is which.
> > > > > > > But a simple size comparison should tell the tale. :)
> > > > > > 
> > > > > >      The problem began much farther back than two days ago, but I
> > > should > > > be able to restore it from two weeks or more in the past and
> > > then find > > > the most recent version that is not corrupted.
> > > > > > 
> > > > > > > Simply rename your (now) corrupt db, and copy the backup over.
> > > > > > 
> > > > > >      However, if I do that, then what happens to all the ports that
> > > have > > > been updated or added since that version of the data base was
> > > backed up? > > > I have run "portmaster -a" (with some additional
> > > options) quite a few > > > times since the lang/gcc problem first
> > > appeared, so an old local.sqlite > > > will no longer accurately reflect
> > > what is currently installed. > > > >
> > > > > > > I ran into this problem as well, and this was how I was finally
> > > > > > > able to overcome it.
> > > > > > >
> > > > > > > Hope this helps!
> > > > > > >
> > > > > > > --Chris
> > > > > > >
> > > > > > > > 
> > > > > > > >     4) I was unable to find any instructions for recreating a
> > > pkg > > > data > >     base if the data base gets damaged/destroyed.  Is
> > > there a > > > way to > >     do that that I missed?
> > > > > > 
> > > > > >      There must be a way to do this, right?  I mean, really, it's
> > > pretty > > > fundamental that no new data base be put into production
> > > without a way to > > > rebuild it.  The FreeBSD developers haven't really
> > > broken so ancient and > > > basic a principle, have they?  So what's the
> > > trick?  What is the method > > > to rebuild /var/db/pkg/local.sqlite from
> > > scratch based upon the currently > > > installed ports/packages?
> > > > > I whined about it the first time my DB blew up. It's become
> > > > > corrupted several times since on different boxes/versions. *but*
> > > > > after the first time, I made it a habit of making a copy of it
> > > *before* > > embarking on an upgrade, or install of any ports. Seems it's
> > > the only > > way to save yourself. pkg(8), to the best of my knowledge
> > > doesn't > > perform a backup prior to any of this. So I *do*.
> > > > > Annoying, but it beats attempting to recover from a failed port
> > > > > upgrade/install.
> > > > 
> > > >      Chris, thanks for your responses, depressing though they be.
> > > >      I have been using portmaster for ages to update my installed ports
> > > > and usually to install them in the first place.  Does your suggestion
> > > > mean abandoning portmaster?
> > > No. I'm an avid user of portmaster(8). In fact I intend to be it's
> > > maintainer. :)
> > > >  If it is necessary to back local.sqlite up
> > > > before each installation, that would seem to imply that portmaster
> > > would > only be useful in getting a list of ports to be updated, after
> > > which each > port would have to be updated individually, so that
> > > local.sqlite could be > backed up before each one.  I have over 1900
> > > ports installed, so that would > mean just never updating my ports at
> > > all, except by rebuilding all of them > upon upgrading to a new major
> > > release of FreeBSD. I
> > > >      Bapt, oh, Bapt, whatever were you folks thinking?  Did you really
> > > > leave us no practical recovery method?
> > > In Bapt's humble defense. It's difficult, to impossible for him to fix
> > > a problem w/o decent clues. So I wouldn't go pointing any fingers, just
> > > yet. :) In other words; w/o a cause, what should he fix? :)
> > > What recovery would you suggest? Truth is, if you brushed up on sqlite3
> > > you could create a fresh DB, using output from portmaster(8).
> > > Troublesome, perhaps. But at least an alternative.
> 
>      Chris?  You already responded to my question 4) above, which answers
> your question.  Why are you now asking this?  Any data base intended for
> production use MUST be accompanied by a method of recreating it before
> the data base goes into production.  This has been the case as long as
> data bases have been used.  The Wallendas may walk without nets between
> skyscraper roofs, but I assure you that a) not all of them arrive at the
> intended destinations and b) normal mortals have a different perspective
> on that sort of risk.
>      Further, if portmaster produces that output *from the corrupted data
> base*, then it stands to reason that other cases of corruption that differed
> in particulars from this one might prevent portmaster from providing said
> output.
Portmaster(8) has it's own record keeping system (to a degree). The
information it *knows* about regarding *what's* installed, can
allow you to ask it to produce output that will aid in the
*re*creation of a failed DB. Perhaps not perfectly, but often, good
enough. As to my answer above; I "whined" that it seemed that
(at that time) that having a "single point of failure" (sqlite3), was
not good practice for a base/required part of the system. As there
is no (reasonable) recourse in the event of failure. I still stand
by that. I'm not keen on the idea that the average user *must* learn
[yet another] SQL, in order to effectively maintain their system.
If it were up to me, it'd all be simple ASCII based text. Sure, the
DB is text, but not as easily recovered as other text based record
keeping applications provide. In the end, I'm *extremely* grateful
for all the work Bapt, and Co have put into pkg(8). It has become
[seemingly] *quite* stable. But I think it could do with an option
to [at least] create a DB backup *prior* to commencing with any
operation that will affect the DB. See; "safety net". :)
> 
> > > > 
> > Coming very late to the discussion I would like to understand what lead you
> > to a broken database, from what I am aware of the only way to end up with a
> > broken db is using pkg over nfs without rpc lock properly running (note
> > that the default >
> > nfs client service on freebsd does not start rpc lock...)
> >
> > pkg work around this case by detecting the filesystem it is running on and
> > changing the locking system it uses when running on network filesystem so
> > this problem cannot in theory happen anymore even if one forgot to start
> > rpc lockd. >
> > It remains however one case where this can happen: it is one the host is
> > running over nfs and it starts a jail, in that case, the jail does not know
> > it is running over nfs so pkg cannot know it needs to switch the locking
> > system. >
> > Are you in one of those situation, if not can you describe a a bit more
> > your system?
> 
> FreeBSD hellas 10.1-STABLE FreeBSD 10.1-STABLE #54 r282368: Sun May  3
> 15:48:37 CDT 2015     bennett@hellas:/usr/obj/usr/src/sys/hellas  amd64
>      I have only one machine running these days, so there is no NFS.  At
> present, I have no jails set up.
>      Here are data concerning the relevant drive, which is the boot drive.
> There are many other file systems on this machine, but they all involve
> external devices and are unrelated to this issue, except for /buildwork,
> where WRKDIRPREFIX=/buildwork/ports and CCACHE_DIR=/buildwork/ccache.  The
> same sort of information for /buildwork is included below.
> 
> /dev/mirror/fbsds1a on / (ufs, local)
> devfs on /dev (devfs, local, multilabel)
> /dev/mirror/fbsds1d on /var (ufs, local)
> /dev/mirror/fbsds1e on /usr (ufs, local, soft-updates)
> /dev/mirror/fbsds1f on /usr/local (ufs, local)
> /dev/mirror/fbsds3a on /usr/home (ufs, local)
> /dev/mirror/fbsds3d on /usr/ports (ufs, local, soft-updates)
> /dev/ufs/stripework on /buildwork (ufs, local, soft-updates)
> 
>  mirror/fbsds1  COMPLETE  ada0s1 (ACTIVE)
> mirror/fbsds3a  COMPLETE  ada0s3a (ACTIVE)
> mirror/fbsds3d  COMPLETE  ada0s3d (ACTIVE)
>                             ufs/stripework     N/A  stripe/5x15g1
> stripe/5x15g1      UP  da0p5
>                        da2p5
>                        da3p5
>                        da5p5
>                        da6p5
> 
> Note that the mirrors listed above have only one device in each.  At one
> point a few weeks ago, there were two devices in each for a bit under
> eight hours until the brand-new second drive failed.  Unfortunately, it
> turns out that the manufacturer does not provide any support for its
> internal hard drive products, other than replacement under warranty via
> a web page that requires an unsecured web browser to apply for the
> replacement.  Grrr.  Until I get that straightened out and have a second
> drive, all of the mirrors listed above on the boot drive are unreplicated.
> >
> > Now by default pkg creates back up of the db daily to allow recovery, there
> > was an issue before pkg 1.5 which requires a manual intervention and fixed
> > after pkg 1.5 so it is easier to recover.
> 
>      One serious problem for recovery is that I don't know when the
> corruption actually occurred.  All I can tell you is that it must have
> happened no earlier than the previous update to lang/gcc (lang/gcc48).
> The error messages first appeared with the update to gcc-4.8.4_3.
> >
> > sqlite is a pretty solid db I would like to understand how you ended up
> > with a corrupted db if not what I described above.
> 
>      So would I because I would seriously like it not to happen again.
> Long ago, I used to use portupgrade, but I eventually switched to
> portmaster for three reasons:  speed (ruby is awfully slow), dougb@
> massively reworked portmaster and made it work much more correctly
> than portupgrade, and portupgrade's frustrating habit of always deciding
> that the package data base had to be rebuilt from nothing, which always
> took a long time and failed as often as it succeeded.  Because portmaster
> now has to use a data base instead of direct evidence, it seems that that
> fourth complaint has partly returned to haunt us.
>      sqlite3-3.8.9_1 is what is installed on my system.  I have no idea
> which version is incorporated into pkg or pkg-static.
> 
> 
>                                   Scott Bennett, Comm. ASMELG, CFIAG
> **********************************************************************
> * Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
> *--------------------------------------------------------------------*
> * "A well regulated and disciplined militia, is at all times a good  *
> * objection to the introduction of that bane of all free governments *
> * -- a standing army."                                               *
> *    -- Gov. John Hancock, New York Journal, 28 January 1790         *
> **********************************************************************
> _______________________________________________
> freebsd-ports@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-ports
> To unsubscribe, send any mail to "freebsd-ports-unsubscribe@freebsd.org"

--Chris

--





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?b48c34d8d273750f78af21ce077935cc>