Date: Sat, 14 Jul 2007 04:04:21 -0700
From: Garrett Cooper <youshi10@u.washington.edu>
To: Tim Kientzle <kientzle@freebsd.org>
Cc: ports@freebsd.org, hackers@freebsd.org, krion@freebsd.org
Subject: Re: Finding slowdowns in pkg_install (continuations of previous threads)
Message-ID: <4698ADB5.7080600@u.washington.edu>
In-Reply-To: <4697A210.2020301@u.washington.edu>
References: <468C96C0.1040603@u.washington.edu> <468C9718.1050108@u.washington.edu>
 <468E60E9.80507@freebsd.org> <468E6C81.4060908@u.washington.edu>
 <468E7192.8030105@freebsd.org> <4696C0D2.6010809@u.washington.edu>
 <4697A210.2020301@u.washington.edu>

Garrett Cooper wrote:
> Garrett Cooper wrote:
>> Tim Kientzle wrote:
>>>> -I tried ... buffering ... the +CONTENTS file parsing function,
>>>> and the majority of the time it yielded good results ....
>>>
>>> One approach I prototyped sometime back was to use
>>> libarchive in pkg_add as follows:
>>>   * Open the archive
>>>   * Read +CONTENTS directly into memory (it's
>>>     guaranteed to always be first in the archive)
>>>   * Parse all of +CONTENTS at once
>>>   * Continue scanning the archive, disposing
>>>     of each file as it appears in the archive.
>>>
>>> Based on my experience with this, I would
>>> suggest you just read all of +CONTENTS
>>> directly into memory at once and parse
>>> the whole thing in a single shot.
>>> fopen(), then fstat() to get the size,
>>> then allocate a buffer and read the whole
>>> thing, then fclose(). You can then
>>> parse it all at once.
>>>
>>> As a bonus, your parser then becomes a nice
>>> little bit of reusable code that reads
>>> a block of memory and returns a structure describing
>>> the package metadata.
>>>
>>> Tim Kientzle
>> I'm not 100% sure because I'm not comparing apples (virtual disk on
>> desktop via VMware) to apples (real disk on server), but I'm showing
>> a 2.5-fold speedup after adding the simple parser:
>>
>> Virtual disk:
>>         4.42 real         1.37 user         1.47 sys
>>
>> Real disk:
>>        10.26 real         5.36 user         0.99 sys
>>
>> I'll run a battery of tests just to ensure whether or not that's the
>> case.
>>
>> Be back with results in a few more days.
>>
>> -Garrett
> Hello,
>     As promised, here are some results for my work:
>
>     By modifying the parser and heuristics in plist_cmd I appear to
> have decreased all figures (except plist_cmd, which I will note later)
> from their original values to much lower values. The only drawback is
> that I appear to have stimulated a bug with either malloc'ing memory,
> printf/vargs, or transferring large amounts of data via pipes, where
> some of my debug messages are making it into plist_cmd(..) from
> obtainbymatch(..), which accounts for the 3-fold increase in
> reported plist_cmd(..) iterations.
>
>     I'm going to try replacing the debug commands with standard print
> statements wherever possible, then replace all tar commands with
> libarchive APIs, and see if the problem solves itself.
>
> Notes:
> 1. This sample is based off x11-libs/atk.
> 2. It isn't the final set of results.
> 3. Graphs coming soon (need to simulate values in Excel on work
>    machine and convert to screenshots later on when I have a break --
>    thinking around noon). I'll repost when I have them available.
> 4. CSV files available at:
>    http://students.washington.edu/youshi10/posted/atk-results.tgz.

I've posted HTML results of the interpreted spreadsheet on
<http://students.washington.edu/posted/atk.htm>.

I'll provide commentary tomorrow after I get some sleep.

-Garrett
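
For reference, here is a minimal sketch in C of the read-everything-then-parse
approach Tim describes in the quoted text (fopen(), fstat() for the size,
one allocation, one read, fclose(), then parse the buffer). It is not the
pkg_install code. The names read_whole_file() and count_plist_lines() are
hypothetical, and the counting function is only a stand-in for a real
+CONTENTS parser; it assumes nothing beyond packing-list commands starting
with '@' and every other non-empty line naming a file.

```c
#define _POSIX_C_SOURCE 200809L	/* for fileno(); harmless if already set */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: read an entire file into one malloc'd buffer.
 * Returns a NUL-terminated buffer (caller frees) and stores the byte
 * count in *out_len, or returns NULL on any error.
 */
char *
read_whole_file(const char *path, size_t *out_len)
{
	FILE *fp;
	struct stat sb;
	char *buf;
	size_t len, nread;

	assert(path != NULL && out_len != NULL);

	if ((fp = fopen(path, "r")) == NULL)
		return (NULL);
	/* fstat() on the open descriptor gives the size up front. */
	if (fstat(fileno(fp), &sb) == -1 || sb.st_size < 0) {
		fclose(fp);
		return (NULL);
	}
	len = (size_t)sb.st_size;
	if ((buf = malloc(len + 1)) == NULL) {
		fclose(fp);
		return (NULL);
	}
	/* One read for the whole file instead of one per line. */
	nread = fread(buf, 1, len, fp);
	fclose(fp);
	if (nread != len) {
		free(buf);
		return (NULL);
	}
	buf[len] = '\0';
	*out_len = len;
	return (buf);
}

/*
 * Hypothetical stand-in for the real parser: walk the in-memory buffer
 * once, counting '@' command lines and plain file lines. Empty lines
 * are skipped.
 */
void
count_plist_lines(const char *buf, size_t len, size_t *ncmds, size_t *nfiles)
{
	const char *p = buf;
	const char *end = buf + len;
	const char *nl;

	*ncmds = 0;
	*nfiles = 0;
	while (p < end) {
		nl = memchr(p, '\n', (size_t)(end - p));
		if (nl == NULL)
			nl = end;	/* last line has no trailing newline */
		if (nl > p) {
			if (*p == '@')
				(*ncmds)++;
			else
				(*nfiles)++;
		}
		p = (nl < end) ? nl + 1 : end;
	}
}
```

A caller would do something like buf = read_whole_file("+CONTENTS", &len),
hand buf and len to the parser, and free(buf) afterwards. Because the parser
only sees a block of memory, the same routine could be fed from a file on
disk or from an archive entry read via libarchive.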