Date: Mon, 14 May 2007 23:34:52 -0700
From: Bert JW Regeer <xistence@0x58.com>
To: Garrett Cooper <youshi10@u.washington.edu>
Cc: freebsd-hackers@freebsd.org
Subject: Re: New FreeBSD package system (a.k.a. Daemon Package System (dps))
Message-ID: <96C6AAEA-B70A-400F-8614-8DFDE5930D19@0x58.com>
In-Reply-To: <46493F0A.9050303@u.washington.edu>
References: <200705102105.27271.blackdragon@highveldmail.co.za> <f20c8u$htp$1@sea.gmane.org> <20070512155059.92011d54.stas@FreeBSD.org> <4645AFAF.7010704@free.fr> <8916C4D5-4DB5-49C0-AF8D-07F9FFA0A6E0@0x58.com> <46493F0A.9050303@u.washington.edu>

On May 14, 2007, at 10:03 PM, Garrett Cooper wrote:

> Bert JW Regeer wrote:
>> On May 12, 2007, at 5:14 AM, Philippe Laquet wrote:
>>> Stanislav Sedov wrote:
>>>> On Fri, 11 May 2007 02:10:05 +0200
>>>> Ivan Voras <ivoras@fer.hr> mentioned:
>>>>
>>>>> - I think it's time to give up on using BDB+directory tree full
>>>>> of text files for storing the installed packages database, and I
>>>>> propose all of this be replaced by a single SQLite database.
>>>>> SQLite is public domain (can be slurped into base system),
>>>>> embeddable, stores all data in a single file, lightweight, fast,
>>>>> and can be used to do fancy things such as reporting.
>>>>>
>>>>
>>>> What is the reason to use an SQL-based database? Will you perform
>>>> direct queries to the database? The packaging system is for
>>>> ordinary users, not SQL geeks, so they should not have to use SQL
>>>> for managing packages. So a simple set of hashes will suffice for
>>>> our needs. I agree with Julian that we should have a backup of the
>>>> packaging database in plain text format, and a utility to rebuild
>>>> it. This way we can always restore the database if something goes
>>>> wrong. Furthermore, that should not make a great impact on
>>>> performance, since we don't have to rebuild it every day.
>>>>
>>> I agree with Stan ;)
>>>
>>> "Fast and improved" package utilities mainly use some indexed
>>> Berkeley DB combined with flat files, don't they? I, and maybe many
>>> other FreeBSD users, use light systems for efficiency and easier
>>> management; if we use some database system it will require disk
>>> space, resources for the DB to run, dependencies and so on... And
>>> we may also be exposed to a "that DB is better" war ;)
>>>
>> SQLite is compiled inside a program, and as such does not require
>> any resources other than one file handle and some CPU time when
>> querying. The file is stored on disk, and requires no separate
>> process to be running to query. Maybe I misunderstood what you were
>> trying to say. SQLite will require fewer resources than flat text
>> files, since SQLite opens one file once and then processes it,
>> instead of what is currently happening: having to open and close
>> hundreds of files depending on how many ports are installed. In this
>> regard SQLite is like BDB, except that SQLite uses
>> standards-compliant SQL statements to get data.
>
> Correct. From what I was reading, shared memory read access and
> locking are two available features of BDB databases.
>
> The only thing is that I do agree that there should be a dumping and
> importing mechanism of some kind for semi-formatted text files, for
> backup, debugging, and modification purposes. That's just my personal
> idea on the topic though :).
>
>>>> --
>>>> Stanislav Sedov
>>>> ST4096-RIPE
>>>>
>>>
>> I am able to understand many of the gripes with using a database,
>> and with having to import yet another code base into the FreeBSD
>> base. However, as one of the young ones, and knowing sed/awk/grep
>> and SQL, I prefer SQL over having to process hundreds of text files
>> using text processing tools. It saddens me each time I run one of
>> the pkg_* tools that needs to parse the flat file structure, since
>> it takes so long. I have friends running Ubuntu and their apt-get
>> returns results much faster.
>> In a world where hard drives are becoming more reliable, and are
>> automatically relocating sectors that go bad, do we really have to
>> worry about database corruption as much? I feel that many of the
>> fears that are being put forward apply equally to a text based
>> "storage" system. If one block drops out, it can cause tools to not
>> be able to parse the files. Create a backup copy of the database
>> after each successful transaction? There are ways to battle data
>> corruption.
>
> True. I was thinking of backup, and recreation from scratch,
> considering that the database wouldn't be more than a few megs.
> In-place replacement just seems like a hairy situation sometimes..
>
>> Using BDB is not a real option either. I cannot even count the
>> number of times that the BDB database that portupgrade created has
>> become corrupt because I accidentally ran two portupgrades at the
>> same time, or even remembered that I did not want to upgrade
>> something and hit Ctrl+C.
>
> I'm sorry but nothing's completely solid in that respect, AFAIK. In
> terms of the first problem you mentioned, Wade is working on the
> locking <http://wiki.freebsd.org/WadeWesolowsky>.
>
> In terms of transactions, maybe we should take a look at Subversion
> for inspiration:
> <http://svn.haxx.se/dev/archive-2005-03/0301.shtml>. I'm a firm
> believer that it's easier to incorporate code than it is to remove
> it.

I am unable to see any references to transaction support for BDB
databases; maybe I am missing something. Subversion in that thread is
suggesting SQL for a totally different reason. fsfs is what most
people are using as a Subversion backend to help avoid BDB corruption.
Many of the people I have talked to who used Subversion with BDB had
major issues, whereas fsfs has not had any issues at all.

That is just what I have experienced myself as a Subversion repository
administrator.

>
>> Given the experience I got from running SVN with BDB as the back-end
>> database to store my data, I say no thanks. In that case I would
>> much rather stick with the flat text files than go with a database.
>
> Well, a few comments:
>
> -Text files are bloated. Although many people are for XML, it takes
> much longer to parse than binary databases.

The files in /var/db/pkg/ are all plain flat text files. I am not a
supporter of XML at all.

> -Custom text files require custom format capable parsers, no matter
> what the format, and the less coverage a parser has, the more
> probable the likelihood of bugs IMO.

We already have these in the pkg_* functions, so I'd hope they are
fairly solid!

> -In the event that features changed or were added, some required
> modifications to the parser could be trivial to major. With
> databases you can get away from that mentality to some degree IMHO.

Changing an SQL query versus re-writing a parser for text files is a
huge difference.

>
> -Garrett

I am not opposed to text files, other than that they can be slow. I am
against BDB because over the years, in my experience, it has shown
itself to be extremely unreliable and easily corrupted. If we are
going to be making changes to the way the ports/packages store the
information about what exists, it should be done in such a way that it
is scalable and at the same time extensible (is this a word?).

Bert JW Regeer
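
To make the two ideas discussed above concrete (a single SQLite file for
the installed-packages database, plus a plain-text dump that can be used
to rebuild it), here is a minimal sketch in Python using only the
standard sqlite3 module. The table layout, the function names and the
package names are all hypothetical and chosen for illustration; nothing
here reflects an actual FreeBSD schema or tool.

```python
import sqlite3

# Hypothetical schema: one table of installed packages, one of dependencies.
SCHEMA = """
CREATE TABLE IF NOT EXISTS packages (
    name    TEXT PRIMARY KEY,
    version TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS depends (
    package    TEXT NOT NULL,
    depends_on TEXT NOT NULL,
    PRIMARY KEY (package, depends_on)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the single-file package database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def register(conn, name, version, deps=()):
    """Record a package and its dependencies in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO packages VALUES (?, ?)", (name, version)
        )
        conn.execute("DELETE FROM depends WHERE package = ?", (name,))
        conn.executemany(
            "INSERT INTO depends VALUES (?, ?)", [(name, d) for d in deps]
        )


def required_by(conn, name):
    """One query replaces scanning every package's flat files."""
    rows = conn.execute(
        "SELECT package FROM depends WHERE depends_on = ? ORDER BY package",
        (name,),
    )
    return [row[0] for row in rows]


def dump_text(conn):
    """Plain-text backup: the database as a series of SQL statements."""
    return "\n".join(conn.iterdump())


def restore_text(dump):
    """Rebuild a fresh database from a text dump."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(dump)
    return conn


if __name__ == "__main__":
    db = open_db()
    register(db, "libfoo", "1.0")               # made-up package names
    register(db, "bar", "2.3", deps=["libfoo"])
    print(required_by(db, "libfoo"))
    print(dump_text(db))
```

The dump is ordinary text, so it can be kept as the backup copy that was
asked for earlier in the thread, inspected with the usual text tools, and
fed back through restore_text() if the binary file is ever damaged.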