Date:      Sat, 25 Mar 2023 22:29:47 +0000
From:      Tom Jones <thj@freebsd.org>
To:        Jamie Landeg-Jones <jamie@catflap.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: diff(1) goes into cpu-hogging endless loop
Message-ID:  <ZB9127V85OOPDtW9@spacemonster>
In-Reply-To: <202303252155.32PLtEPF072349@donotpassgo.dyslexicfish.net>
References:  <202303252155.32PLtEPF072349@donotpassgo.dyslexicfish.net>

On Sat, Mar 25, 2023 at 09:55:14PM +0000, Jamie Landeg-Jones wrote:
> Hi, A "diff" of 2 files:
> 
> 1  77,933,904 bytes
> 2  63,013,818 bytes
> 
> , goes into an endless loop, whilst "gdiff" completes the operation in
> about 5 seconds.
> 
> I tested using the latest "diff" from current, and get the same result.
> 
> Splitting both files into 10Mb chunks, and diffing these was successful.
> 
> A ktrace of the "diff" actually stops producing any output after about
> 5 seconds, whilst the cpu looping continues.
> 
> Any ideas on what to do next? Does anyone else get the same result?
> 
> The files are just utf-8 freebsd git logs, and are available here if
> anyone would like to test:
> 
> http://www.catflap.org/jamie/1.xz (13,282,864 bytes)
> http://www.catflap.org/jamie/2.xz (12,221,164 bytes)
> 
> Cheers, Jamie

My guess is that you are hitting a worst case in the Stone algorithm
that diff(1) uses. I have a work-in-progress review to integrate the
Myers algorithm from libdiff here:

https://reviews.freebsd.org/D36860
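
For anyone unfamiliar with it, here is a rough sketch of the greedy
Myers O(ND) idea in Python. It is not the libdiff code or the code in
the review, only an illustration; the function name myers_edit_distance
is made up for this example. It computes the length of the shortest
edit script (insertions plus deletions) between two sequences, and its
cost grows with the number of differences D rather than with the
product of the input sizes.

```python
def myers_edit_distance(a, b):
    """Length of the shortest edit script (insertions + deletions) from a to b.

    Illustrative sketch of Myers' greedy O(ND) search; hypothetical helper,
    not taken from libdiff or FreeBSD diff(1).
    """
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # move down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1    # move right: drop an element from a (deletion)
            y = x - k
            # Follow the "snake": matching elements cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


if __name__ == "__main__":
    old = ["commit 1", "Author: a", "msg one"]
    new = ["commit 1", "Author: b", "msg one"]
    print(myers_edit_distance(old, new))
```

The sequences can be lists of lines, as in the small usage example at
the bottom, or plain strings compared character by character.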

- Tom


