Date: Tue, 9 Aug 2022 17:40:00 -0700 From: David Christensen <dpchrist@holgerdanske.com> To: questions@freebsd.org Subject: Re: zfs and git upload-pack Message-ID: <34cea530-841a-527c-b080-ab3e1379d86f@holgerdanske.com> In-Reply-To: <92ca8cb253ec6c84e44c76d82e98c1e1.philipp@bureaucracy.de> References: <20220807102839.7c69f387@bureaucracy.de> <a94ec43d-8ae8-fd20-a5a9-0e1e67750107@holgerdanske.com> <20220807195750.0233e2f3@bureaucracy.de> <348470bf-0f11-f7b3-e782-881c3f864ffb@holgerdanske.com> <20220807211348.401ee1c3@bureaucracy.de> <62ced745-9db4-021f-ae0a-fdb4aba03a13@holgerdanske.com> <92ca8cb253ec6c84e44c76d82e98c1e1.philipp@bureaucracy.de>
next in thread | previous in thread | raw e-mail | index | archive | help
On 8/9/22 06:16, Philipp wrote: > [2022-08-07 20:52] David Christensen >> On 8/7/22 12:13, Philipp Takacs wrote: >>> On Sun, 7 Aug 2022 11:12:20 -0700 David Christensen >>>> On 8/7/22 10:57, Philipp Takacs wrote: >>>>> On Sun, 7 Aug 2022 09:54:41 -0700 David Christensen >>>>>> On 8/7/22 01:28, Philipp wrote: >>>>>>> Hi all >>>>>>> >>>>>>> I host a quite uncommon git repository mostly out of binary >>>>>>> files. I have the problem every time this repo is cloned the host >>>>>>> allocate memory and going to swap. This leads to the host being >>>>>>> unusable and need to force rebooted. >>>>>>> >>>>>>> The repo is stored on a zfs and nullmounted in a jail to run the >>>>>>> git service over ssh. The host is a FreeBSD 13.1 with 4GB RAM and >>>>>>> 4GB swap. >>>>>>> >>>>>>> What I have noticed is that the biggest memory consumtion is from >>>>>>> mmap() a pack file. For the given repo this has the size of 6,7G. >>>>>>> I suspect this file is mapped in memory but not correctly >>>>>>> handled/unmaped (by the kernel) when not enough memory is >>>>>>> available. >>>>>>> >>>>>>> I have tested some options to solve/workaround this issue: >>>>>>> >>>>>>> * limit the zfs ARC size in loader.conf >>>>>>> * zfs set primarycache none for the dataset >>>>>>> * limit datasize, memoryuse and vmemoryuse via login.conf >>>>>>> * limit git packedGitLimit > I have [restored the options to previous values]. Now the behavior has changed. Now one clone was > succsessfull and at later clones stop with an error (Cannot allocate > memory). This is better but still not good. If you reboot the server, does the clone work the first them and then always fail? >> What happens if the clone is attempted by a different user on the same >> workstation? >> >> >> What happens if the clone is attempted from another workstation? > > The same as described. That supports a hypothesis that the problem is the server. >> Please post client console sessions that demonstrate correct operation >> and failed operation. > successfull: > > satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo > Cloning into 'bigrepo'... > remote: Objekte aufzählen: 9661, fertig. > remote: Gesamt 9661 (Delta 0), Wiederverwendet 0 (Delta 0), Pack wiederverwendet 9661 > Receiving objects: 100% (9661/9661), 6.73 GiB | 5.96 MiB/s, done. > Resolving deltas: 100% (3/3), done. > Updating files: 100% (6591/6591), done. Are those status messages in German? I do not know German... > unsuccessfull: > > satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo > Cloning into 'bigrepo'... > remote: Enumerating objects: 9661, done. Rerror: git upload-pack: git-pack-objects died with error.iB/s > fatal: git upload-pack: aborting due to possible repository corruption on the remote side. > remote: fatal: packfile ./objects/pack/pack-6fee671a31a59454b539c88d674373d88ad67780.pack cannot be mapped: Cannot allocate memory > remote: aborting due to possible repository corruption on the remote side. > fatal: early EOF > fatal: index-pack failed > > As mentioned earlier the "Cannot allocate memory" is new. The old > behavior was that the server was unusable till I restarted the server. > I currently don't know how this exactly looks on the client, but there > is not mutch info in the output. Now the status messages are in English. Do you know why the language changed? Have you run 'git fsck'? https://git-scm.com/docs/git-fsck Does the server have ECC memory? >>> This is a server, a client connect with a >>> git client over ssh and use git-upload-pack >> >> >> https://git-scm.com/docs/git-upload-pack > > Yes this programm, but I post hear because I susspect this is an freebsd > issue not an issue with git. This programm basicly mmap() some files, > parse them and write parts (based on stdin) of the content to stdout. I desire stable OS's, so I typically use an older production release. My servers have: 2022-08-09 16:17:40 dpchrist@f3 ~ $ freebsd-version ; uname -a 12.3-RELEASE-p5 FreeBSD f3.tracy.holgerdanske.com 12.3-RELEASE-p5 FreeBSD 12.3-RELEASE-p5 GENERIC amd64 Perhaps you should grab a blank SSD, do an install of FreeBSD-12.3-RELEASE, install and set up the software you need without using jails, restore the Git repository onto it, and test if that works. >>> to receive the content of >>> the repo. The communication of the git client and git-upload-pack works >>> with stdin/stdout. I can give the logs of my git authorization handler >>> (inside jail): >> >> <snip> >> >> What file? > > This file is called fugit.log and is created by a authorization handler > for git over ssh called fugit. I use fugit to manage authorization for > multible git repositories. See https://github.com/cbdevnet/fugit/blob/master/fugit Is fugit is validated and supported on your OS? Is there a support community? Has anyone else seen this problem? Has anyone else seen this problem on Git? > So it looks like the git-upload-pack and the corresponding > IO causes a lot of cache allocation. This leads to no free memory left > for the rest of the operation of the server. As the clone works the first time, and then fails every time thereafter with memory-related errors, perhaps there is a memory leak? > I don't know where to increase the verbosity. ssh just starts a session > and goes on. The logs of fugit are already posted completly. git-upload-pack > does not log. Perhaps we can explore that later. >> # zpool iostat -v 60 > > This looks quite normal here parts of the output: > > Normal operation without a git clone running: > > pool alloc free read write read write > ---------- ----- ----- ----- ----- ----- ----- > zroot 51.8G 408G 3 14 25.4K 150K > mirror-0 51.8G 408G 3 14 25.4K 150K > ada1p3 - - 1 7 12.2K 74.8K > ada0p3 - - 1 7 13.2K 74.8K > ---------- ----- ----- ----- ----- ----- ----- > > During a clone: > > capacity operations bandwidth > pool alloc free read write read write > ---------- ----- ----- ----- ----- ----- ----- > zroot 51.8G 408G 22 24 2.87M 226K > mirror-0 51.8G 408G 22 24 2.87M 226K > ada1p3 - - 11 12 1.47M 113K > ada0p3 - - 11 12 1.40M 113K > ---------- ----- ----- ----- ----- ----- ----- > > As expected the read goes up during the the clone. But not > to a level I have conserne about the load. I assume ada0 and ada1 are HDD's (?). (If so, an SSD cache device really helps; but that should be unrelated to this issue.) Does the "During a clone" output correspond to the first clone after a reboot or to a second or subsequent clone? You want to look at the former; the latter may not show much because of caching. >> Start the following command in another terminal on the server to monitor >> CPU and/or IO activity (press 'm' to switch between the two) (press 'q' >> to exit): >> >> # top -S -s 60 > > This also looks as expected. The git process (chiled of git-upload-pack) > uses cpu and memory also creates IO. I have some output some secounds > befor the git was killed (sorted by RES): > > Mem: 598M Active, 426M Inact, 166M Laundry, 1223M Wired, 1020M Free > ARC: 543M Total, 185M MFU, 82M MRU, 16K Anon, 8618K Header, 267M Other > 98M Compressed, 240M Uncompressed, 2,45:1 Ratio > Swap: 4096M Total, 17M Used, 4079M Free > Displaying CPU statistics. > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 45719 satanist 1 30 0 1289M 986M pipdwt 2 0:26 17,77% git > 53388 10001 44 52 0 2755M 237M uwait 2 2:23 0,15% java > > After the java process there are only processes with less then 100MB > reserved. > > I don't know excactly, but it looks like RES and SIZE adds memory > allocation and memory mapped files. In this case I would argue there is > sufficient memory availible to drop, because it can be read from disk. Check 'man top' for SIZE and RES field definitions just to be sure. My top(1) looks somewhat different. The difference between SIZE and RES is 303M for git and 2518M for java; 2821M total. This total is larger than any Mem, ARC (ZFS), or Swap statistic. My guess is that the overall statistics do not include memory-mapped files (?). See if your top(1) can display the "RSfd" field -- Resident File-Backed Memory Size, which should include file mappings. "1020M Free" is shown. Was the server hung when you ran top(1)? Does the server have a console? Do you have access? If you log in to the console before testing and renice that process to a negative priority, the console might keep working when everything else hangs. If you try the same trick over SSH, it could interact with SSH and Git. David
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?34cea530-841a-527c-b080-ab3e1379d86f>