From: Bert JW Regeer
Date: Mon, 14 May 2007 23:34:52 -0700
To: Garrett Cooper
Cc: freebsd-hackers@freebsd.org
Subject: Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))
Message-Id: <96C6AAEA-B70A-400F-8614-8DFDE5930D19@0x58.com>
In-Reply-To: <46493F0A.9050303@u.washington.edu>
List-Id: Technical Discussions relating to FreeBSD

On May 14, 2007, at 10:03 PM, Garrett Cooper wrote:
> Bert JW Regeer wrote:
>> On May 12, 2007, at 5:14 AM, Philippe Laquet wrote:
>>> Stanislav Sedov wrote:
>>>> On Fri, 11 May 2007 02:10:05 +0200
>>>> Ivan Voras mentioned:
>>>>
>>>>> - I think it's time to give up on using BDB+directory tree full
>>>>> of text files for storing the installed packages database, and I
>>>>> propose all of this be replaced by a single SQLite database.
>>>>> SQLite is public domain (can be slurped into base system),
>>>>> embeddable, stores all data in a single file, lightweight, fast,
>>>>> and can be used to do fancy things such as reporting.
>>>>>
>>>>
>>>> What is the reason to use an SQL-based database? You'll perform
>>>> direct queries to the database? The packaging system is for
>>>> ordinary users, not sql geeks, so they should not have to use sql
>>>> for managing packages. So a simple set of hashes will suffice for
>>>> our needs. I agree with Julian that we should have a backup of the
>>>> packaging database in plain text format, and a utility to rebuild
>>>> it. This way we can always restore the database if something goes
>>>> wrong. Furthermore, that should not make a great impact on
>>>> performance, since we don't have to rebuild it every day.
>>>>
>>> I agree with Stan ;)
>>>
>>> "fast and improved" package utilities mainly use some indexed
>>> Berkeley DB combined with flat files, don't they?
>>> I, and maybe many other FreeBSD users, use light systems for
>>> efficiency and easier management. If we use some database system it
>>> will require disk space, resources for the DB to run, dependencies
>>> and so on... And we also may be exposed to a "that DB is better"
>>> war ;)
>>>
>> SQLite is compiled inside a program, and as such does not require
>> any resources other than one file handle and some CPU time when
>> querying. The file is stored on disk, and requires no separate
>> process to be running to query. Maybe I misunderstood what you were
>> trying to say. SQLite will require fewer resources than flat text
>> files, since SQLite is a one-time open then process, instead of what
>> is currently happening: having to open and close hundreds of files
>> depending on how many ports are installed. In this regard, SQLite is
>> like BDB, except that SQLite uses standards-compliant SQL statements
>> to get data.
>
> Correct. From what I was reading, shared memory read access and
> locking are two available features of BDB databases.
>
> The only thing is that I do agree that there should be a dumping and
> importing mechanism of some kind for semi-formatted text files, for
> backup, debugging, and modification purposes. That's just my personal
> idea on the topic though :).
>
>>>> --
>>>> Stanislav Sedov
>>>> ST4096-RIPE
>>>>
>>>
>> I am able to understand many of the gripes with using a database,
>> and having to import yet another code base into the FreeBSD base;
>> however, as one of the young ones, and knowing sed/awk/grep and SQL,
>> I prefer SQL over having to process hundreds of text files using
>> text processing tools. It saddens me each time I run one of the
>> pkg_* tools that needs to parse the flat file structure, since it
>> takes so long. I have friends running Ubuntu and their apt-get
>> returns results much faster.
>> In a world where hard drives are becoming more reliable, and are
>> automatically relocating sectors that go bad, do we really have to
>> worry about database corruption as much? I feel that many of the
>> fears that are being put forward will do harm to a text based
>> "storage" system as well. If one block drops out, it can cause tools
>> to not be able to parse the files. Create a backup copy of the
>> database after each successful transaction? There are ways to battle
>> data corruption.
>
> True. I was thinking of backup, and recreation from scratch,
> considering that the database wouldn't be more than a few megs. In
> place replacement just seems like a hairy situation sometimes..
>
>> Using BDB is not a real option either. I can not even count the
>> number of times that the BDB database that portupgrade created has
>> become corrupt because I accidentally ran two portupgrades at the
>> same time, or even remembered that I did not want to upgrade
>> something and hit Ctrl+C.
>
> I'm sorry but nothing's completely solid in that respect, AFAIK. In
> terms of the first problem you mentioned, Wade is working on the
> locking .
>
> In terms of transactions, maybe we should take a look at Subversion
> for inspiration: archive-2005-03/0301.shtml>. I'm a firm believer
> that it's easier to incorporate code than it is to remove it.

I am unable to see any references to transaction support for BDB
databases; maybe I am missing something. Subversion in that thread is
suggesting SQL for a totally different reason. fsfs is what most
people are using as a Subversion backend to help avoid BDB corruption.
Many of the people I have talked to who used to use Subversion with
BDB have had major issues, whereas fsfs has not had any issues at all.

That is just what I have experienced myself as a Subversion repository
administrator.
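
To make the interrupted-upgrade case concrete, here is a minimal
sketch of what I would expect a transactional package database to buy
us. It assumes Python's sqlite3 module and a made-up two-table schema
(`packages`, `files`); the function name and the absolute-path check
are hypothetical too, and none of this comes from the existing pkg_*
tools. A registration that fails halfway is rolled back as a unit
instead of leaving half a record behind.

```python
import sqlite3


def register_package(conn, name, version, files):
    """Record a package and its files as one unit (hypothetical schema)."""
    # "with conn" commits if the block finishes and rolls back if any
    # exception escapes it, including a KeyboardInterrupt from Ctrl+C.
    with conn:
        conn.execute(
            "INSERT INTO packages (name, version) VALUES (?, ?)",
            (name, version),
        )
        for path in files:
            if not path.startswith("/"):
                # Stand-in for any failure in the middle of a registration.
                raise ValueError(f"not an absolute path: {path}")
            conn.execute(
                "INSERT INTO files (package, path) VALUES (?, ?)",
                (name, path),
            )


# An in-memory database keeps the sketch self-contained; a real tool
# would open a single file on disk instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT)")
conn.execute("CREATE TABLE files (package TEXT, path TEXT)")
conn.commit()

register_package(conn, "zsh", "4.3.4", ["/usr/local/bin/zsh"])

try:
    # The second path is invalid, so the whole registration is undone.
    register_package(conn, "broken", "1.0", ["/usr/local/bin/ok", "relative/path"])
except ValueError:
    pass

installed = [row[0] for row in conn.execute("SELECT name FROM packages ORDER BY name")]
file_count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
```

After this runs, only the package that registered cleanly is left in
the database, which is the behaviour I never got out of the
portupgrade BDB file.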
>
>> The experience I got from running SVN with BDB as the back-end
>> database to store my data, I say no thanks. In that case I would
>> much rather stick with the flat text files than go with a database.
>
> Well, a few comments:
>
> -Text files are bloated. Although many people are for XML, it takes
> much longer to parse than binary databases.

The files under /var/db/pkg/ are all plain flat text files. I am not a
supporter of XML at all.

> -Custom text files require custom format capable parsers, no matter
> what the format, and the less coverage a parser has, the more likely
> bugs are, IMO.

We already have these in the pkg_* functions, so I'd hope they are
fairly solid!

> -In the event that features changed or were added, some required
> modifications to the parser could be trivial to major. With
> databases you can get away from that mentality to some degree IMHO.

There is a huge difference between changing an SQL query and
rewriting a parser for text files.

>
> -Garrett

I am not opposed to text files, other than that they can be slow. I am
against BDB because, in my experience over the years, it has proven to
be extremely unreliable and easily corrupted. If we are going to be
making changes to the way the ports/packages store the information
about what exists, it should be done in such a way that it is scalable
and at the same time extensible (is this a word?).

Bert JW Regeer
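
As a rough illustration of the two ideas that keep coming up in this
thread (one database instead of hundreds of small files, plus a plain
text dump that can rebuild it), here is a minimal sketch. It assumes
Python's sqlite3 module; the schema, the helper names and the package
versions are invented for the example and are not taken from the
pkg_* tools.

```python
import sqlite3

# Hypothetical schema: one row per installed package, one per dependency.
SCHEMA = """
CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT NOT NULL);
CREATE TABLE depends (package TEXT NOT NULL, requires TEXT NOT NULL);
"""


def dump_text(conn):
    """Return the whole database as plain SQL text, usable as a backup."""
    return "\n".join(conn.iterdump())


def restore_text(sql_text):
    """Rebuild a fresh database from a text dump made by dump_text()."""
    fresh = sqlite3.connect(":memory:")
    fresh.executescript(sql_text)
    return fresh


def required_by(conn, name):
    """List the installed packages that depend on `name`."""
    rows = conn.execute(
        "SELECT package FROM depends WHERE requires = ? ORDER BY package",
        (name,),
    )
    return [row[0] for row in rows]


# In-memory for the sketch; a real tool would open one file on disk.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?)",
    [("libiconv", "1.9.2"), ("gettext", "0.16.1"), ("bash", "3.2.17")],
)
conn.executemany(
    "INSERT INTO depends VALUES (?, ?)",
    [("gettext", "libiconv"), ("bash", "gettext"), ("bash", "libiconv")],
)
conn.commit()

backup = dump_text(conn)        # human-readable, editable text
rebuilt = restore_text(backup)  # database recreated from that text
```

Asking "what depends on libiconv?" is a single query against one open
handle, and changing the question means changing the query rather than
a parser. The dump is ordinary text, so it could be kept next to the
database for backup or debugging.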