Date: Fri, 4 Nov 2005 11:29:12 -0600
From: Kirk Strauser <kirk@strauser.com>
To: freebsd-questions@freebsd.org
Subject: Re: Fast diff command for large files?
Message-ID: <200511041129.17912.kirk@strauser.com>
In-Reply-To: <436B8ADF.4000703@mac.com>
References: <200511040956.19087.kirk@strauser.com> <436B8ADF.4000703@mac.com>
On Friday 04 November 2005 10:22, Chuck Swiger wrote:

> Multigigabyte? Find another approach to solving the problem, a text-base
> diff is going to require excessive resources and time. A 64-bit platform
> with 2 GB of RAM & 3GB of swap requires ~1000 seconds to diff ~400MB.

There really aren't many options. For the patient, here's what's happening:

Our legacy application runs on FoxPro. Our web application runs on a
PostgreSQL database that's a mirror of the FoxPro tables.

We do the mirroring by running a program that dumps the FoxPro tables out as
tab-delimited files. Thus far, we'd been using PostgreSQL's "copy from"
command to read those files into the database. In reality, though, a very,
very small percentage of rows in those tables actually change. So, I wrote
a program that takes the output of diff and converts it into a series of
"delete" and "insert" commands; benchmarking shows that this is roughly 300
times faster in our use.

And that's why I need a fast diff. Even if it takes as long as the database
bulk loads, we can run it on another server and use 20 seconds of CPU for
PostgreSQL instead of 45 minutes. The practical upshot is that the
database will never get sluggish, even if the other "diff server" is loaded
to the gills.

-- 
Kirk Strauser
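For illustration only, the diff-to-SQL step described above might look roughly like the Python sketch below. This is not the program mentioned in the message. It assumes diff's default output format ("< " for lines only in the old dump, "> " for lines only in the new one), that the first tab-delimited column is a unique key, and that every value can be sent as a quoted string. The function names, table name, and column names are hypothetical, and NULL or backslash handling is left out.

```python
def quote(value):
    """Quote a value as an SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def diff_to_sql(diff_lines, table, columns):
    """Turn default-format diff output into DELETE and INSERT statements.

    Assumption (not from the original message): rows are tab-delimited
    and the first column uniquely identifies a row.
    """
    deletes, inserts = [], []
    for line in diff_lines:
        line = line.rstrip("\n")
        if line.startswith("< "):
            # Row exists only in the old dump: delete it by key.
            key = line[2:].split("\t")[0]
            deletes.append(
                f"DELETE FROM {table} WHERE {columns[0]} = {quote(key)};"
            )
        elif line.startswith("> "):
            # Row exists only in the new dump: insert it in full.
            values = ", ".join(quote(v) for v in line[2:].split("\t"))
            inserts.append(
                f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values});"
            )
        # Everything else ("2c2", "---", ...) is diff bookkeeping and is skipped.
    # Deletes go first so a changed row is removed before its new version arrives.
    return deletes + inserts
```

Feeding it the lines of `diff old.txt new.txt` for one table would yield a short list of statements that touch only the changed rows, rather than a full "copy from" reload.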