Date: Fri, 13 Jul 2007 09:02:24 -0700
From: Garrett Cooper <youshi10@u.washington.edu>
To: Tim Kientzle <kientzle@freebsd.org>
Cc: ports@freebsd.org, hackers@freebsd.org, krion@freebsd.org
Subject: Re: Finding slowdowns in pkg_install (continuations of previous threads)
Message-ID: <4697A210.2020301@u.washington.edu>
In-Reply-To: <4696C0D2.6010809@u.washington.edu>
References: <468C96C0.1040603@u.washington.edu> <468C9718.1050108@u.washington.edu> <468E60E9.80507@freebsd.org> <468E6C81.4060908@u.washington.edu> <468E7192.8030105@freebsd.org> <4696C0D2.6010809@u.washington.edu>

Garrett Cooper wrote:
> Tim Kientzle wrote:
>>> -I tried ... buffering ... the +CONTENTS file parsing function,
>>> and the
>>> majority of the time it yielded good results ....
>>
>> One approach I prototyped sometime back was to use
>> libarchive in pkg_add as follows:
>>   * Open the archive
>>   * Read +CONTENTS directly into memory (it's
>>     guaranteed to always be first in the archive)
>>   * Parse all of +CONTENTS at once
>>   * Continue scanning the archive, disposing
>>     of each file as it appears in the archive.
>>
>> Based on my experience with this, I would
>> suggest you just read all of +CONTENTS
>> directly into memory at once and parse
>> the whole thing in a single shot.
>> fopen(), then fstat() to get the size,
>> then allocate a buffer and read the whole
>> thing, then fclose(). You can then
>> parse it all at once.
>>
>> As a bonus, your parser then becomes a nice
>> little bit of reusable code that reads
>> a block of memory and returns a structure describing
>> the package metadata.
>>
>> Tim Kientzle
> I'm not 100% sure because I'm not comparing apples (virtual disk on
> desktop via VMware) to apples (real disk on server), but I'm showing a
> 2.5-fold speedup after adding the simple parser:
>
> Virtual disk:
>         4.42 real         1.37 user         1.47 sys
>
> Real disk:
>        10.26 real         5.36 user         0.99 sys
>
> I'll run a battery of tests just to ensure whether or not that's the
> case.
>
> Be back with results in a few more days.
>
> -Garrett

Hello,

As promised, here are some results from my work.

By modifying the parser and the heuristics in plist_cmd, I appear to have
reduced every figure except plist_cmd (discussed below) to well below its
original value.

The only drawback is that I appear to have triggered a bug, either in
malloc'ing memory, in printf/varargs handling, or in transferring large
amounts of data via pipes: some of my debug messages are making it into
plist_cmd(..) from obtainbymatch(..), which accounts for the 3-fold
increase in reported plist_cmd(..) iterations.

I'm going to try replacing the debug commands with standard print
statements wherever possible, then replace all tar commands with
libarchive APIs, and see whether the problem resolves itself.

Notes:
1. This sample is based on x11-libs/atk.
2. It isn't the final set of results.
3. Graphs are coming soon (I need to simulate values in Excel on my work
   machine and convert them to screenshots when I have a break, probably
   around noon). I'll repost when I have them available.
4. CSV files are available at:
   http://students.washington.edu/youshi10/posted/atk-results.tgz
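
For reference, the read-it-all-at-once sequence Tim describes in the quoted text (fopen(), fstat() for the size, allocate, read, fclose()) might look roughly like the sketch below. This is not the pkg_install code: slurp_file() and count_plist_commands() are hypothetical names made up for illustration, and the only assumption about +CONTENTS is that command lines begin with '@'.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: read an entire file into a NUL-terminated heap
 * buffer. Returns NULL on failure; the caller frees the result.
 */
char *slurp_file(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;

    /* fstat() on the open descriptor gives the size to allocate. */
    struct stat sb;
    if (fstat(fileno(fp), &sb) != 0 || sb.st_size < 0) {
        fclose(fp);
        return NULL;
    }

    size_t size = (size_t)sb.st_size;
    char *buf = malloc(size + 1);   /* +1 for the terminating NUL */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* One read for the whole file instead of line-by-line I/O. */
    size_t n = fread(buf, 1, size, fp);
    if (ferror(fp)) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[n] = '\0';
    *out_len = n;
    return buf;
}

/*
 * Hypothetical single-pass scan over the in-memory buffer: counts the
 * lines that start with '@' (assumed here to mark packing-list commands).
 */
size_t count_plist_commands(const char *buf, size_t len)
{
    size_t count = 0;
    int at_line_start = 1;

    for (size_t i = 0; i < len; i++) {
        if (at_line_start && buf[i] == '@')
            count++;
        at_line_start = (buf[i] == '\n');
    }
    return count;
}
```

A real parser would return a structure describing the package metadata rather than a count; the scan is only there to show that once the buffer is in memory, parsing needs no further file I/O.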