Message-ID: <92ca8cb253ec6c84e44c76d82e98c1e1.philipp@bureaucracy.de>
From: Philipp
To: questions@freebsd.org
Subject: Re: zfs and git upload-pack
In-reply-to: <62ced745-9db4-021f-ae0a-fdb4aba03a13@holgerdanske.com>
References: <20220807102839.7c69f387@bureaucracy.de>
 <20220807195750.0233e2f3@bureaucracy.de>
 <348470bf-0f11-f7b3-e782-881c3f864ffb@holgerdanske.com>
 <20220807211348.401ee1c3@bureaucracy.de>
 <62ced745-9db4-021f-ae0a-fdb4aba03a13@holgerdanske.com>
Comments: In-reply-to David Christensen message dated "Sun, 07 Aug 2022 20:52:55 -0700."
List-Archive: https://lists.freebsd.org/archives/freebsd-questions
Date: Tue, 09 Aug 2022 15:16:16 +0200

[2022-08-07 20:52] David Christensen
> On 8/7/22 12:13, Philipp Takacs wrote:
> > On Sun, 7 Aug 2022 11:12:20 -0700
> > David Christensen wrote:
> >
> >> On 8/7/22 10:57, Philipp Takacs wrote:
> >>> On Sun, 7 Aug 2022 09:54:41 -0700
> >>> David Christensen wrote:
> >>>
> >>>> On 8/7/22 01:28, Philipp wrote:
> >>>>> Hi all
> >>>>>
> >>>>> I host a quite uncommon git repository mostly out of binary
> >>>>> files.
> >>>>> I have the problem every time this repo is cloned the host
> >>>>> allocate memory and going to swap. This leads to the host being
> >>>>> unusable and need to force rebooted.
> >>>>>
> >>>>> The repo is stored on a zfs and nullmounted in a jail to run the
> >>>>> git service over ssh. The host is a FreeBSD 13.1 with 4GB RAM and
> >>>>> 4GB swap.
> >>>>>
> >>>>> What I have noticed is that the biggest memory consumption is from
> >>>>> mmap() a pack file. For the given repo this has the size of 6,7G.
> >>>>> I suspect this file is mapped in memory but not correctly
> >>>>> handled/unmapped (by the kernel) when not enough memory is
> >>>>> available.
> >>>>>
> >>>>> I have tested some options to solve/workaround this issue:
> >>>>>
> >>>>> * limit the zfs ARC size in loader.conf
> >>>>> * zfs set primarycache none for the dataset
> >>>>> * limit datasize, memoryuse and vmemoryuse via login.conf
> >>>>> * limit git packedGitLimit
> >>>>>
> >>>>> None of them have solved the issue.
>
> I would restore them to previous values.

I have done this, and the behavior has changed: one clone was
successful, and later clones stop with an error (Cannot allocate
memory). This is better but still not good.

> >>> this repo gets cloned a few times a month. Currently
> >>> the Host dies because one client try to clone this repo.
>
> What happens if the clone is attempted by a different user on the same
> workstation?
>
> What happens if the clone is attempted from another workstation?

The same as described.

> >> Please post console sessions that demonstrate cloning without failure
> >> and cloning with failure.
> >
> > Not sure what you mean.
>
> Please post client console sessions that demonstrate correct operation
> and failed operation.

Let me rephrase this: I'm not sure what you expect from this, but ok:

successful:

satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo
Cloning into 'bigrepo'...
remote: Objekte aufzählen: 9661, fertig.
remote: Gesamt 9661 (Delta 0), Wiederverwendet 0 (Delta 0), Pack wiederverwendet 9661
Receiving objects: 100% (9661/9661), 6.73 GiB | 5.96 MiB/s, done.
Resolving deltas: 100% (3/3), done.
Updating files: 100% (6591/6591), done.

unsuccessful:

satanist@hell tmp$ git clone -v ssh://bigrepo@git.bureaucracy.de:2222/bigrepo
Cloning into 'bigrepo'...
remote: Enumerating objects: 9661, done.
Rerror: git upload-pack: git-pack-objects died with error.iB/s
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: fatal: packfile ./objects/pack/pack-6fee671a31a59454b539c88d674373d88ad67780.pack cannot be mapped: Cannot allocate memory
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed

As mentioned earlier, the "Cannot allocate memory" error is new. The old
behavior was that the server was unusable until I restarted it. I
currently don't know exactly how that looks on the client, but there is
not much info in the output.

> > This is a server, a client connect with a
> > git client over ssh and use git-upload-pack
>
> https://git-scm.com/docs/git-upload-pack

Yes, this program, but I post here because I suspect this is a FreeBSD
issue, not an issue with git. This program basically mmap()s some files,
parses them and writes parts of the content (selected based on stdin) to
stdout.

> > to receive the content of
> > the repo. The communication of the git client and git-upload-pack works
> > with stdin/stdout. I can give the logs of my git authorization handler
> > (inside jail):
> >
>
> What file?

This file is called fugit.log and is created by an authorization handler
for git over ssh called fugit. I use fugit to manage authorization for
multiple git repositories. See
https://github.com/cbdevnet/fugit/blob/master/fugit

> > The last line mean the clone was finished[0].
> > But at this time
> > everything else on the host was unusable. Here the corresponding content
> > of /var/log/messages:
> >
>
> > Between 12:00 and 14:00 the server started to be slow and running
> > ssh/mosh session stopped working. Starting new sessions over ssh was
> > not possible. The root login at 14:38 was me over ipmi try to somehow
> > get the host working again. But I could login and only run top then
> > this session was also unusable. I have then restarted the server over
> > ipmi.
>
> It looks like the server is getting overloaded with incoming TCP
> packets, the client is closing connections, the client is timing out
> when reconnecting, etc.. Near the end, I see jails being killed. I do
> not see reasons why. Perhaps there are clues in other logs. Perhaps
> you can increase the logging verbosity of the Git and/or SSH services to
> obtain clues.

From my perspective this looks a bit different, because I could
reproduce the behavior. It looks like git-upload-pack and the
corresponding IO cause a lot of cache allocation, which leaves no free
memory for the rest of the server's operation.

I don't know where to increase the verbosity. ssh just starts a session
and goes on. The fugit logs are already posted completely.
git-upload-pack does not log.
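To check whether this access pattern alone is enough to cause the memory
pressure, independent of git, it could be approximated with a small
script. This is only a sketch of the pattern described above (map a file
read-only, copy it out in chunks), not what git-upload-pack does
internally; the name stream_mapped_file and the 1 MiB chunk size are made
up for illustration.

```python
import mmap
import os


def stream_mapped_file(path, out, chunk_size=1 << 20):
    """Map `path` read-only and copy it to `out` in chunks.

    `out` is any binary writable object (hypothetical usage:
    sys.stdout.buffer). Returns the number of bytes written.
    """
    written = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        # Length 0 maps the whole file; ACCESS_READ gives a read-only mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, size, chunk_size):
                chunk = m[offset:offset + chunk_size]  # copies the slice to bytes
                out.write(chunk)
                written += len(chunk)
    return written
```

Running this against the pack file with sys.stdout.buffer as the target
and stdout redirected to /dev/null, while watching top on the host,
should show whether the mapped pages alone push the host into swap.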
> Start the following command in a terminal on the server to monitor ZFS
> disk activity (press Ctrl+C to exit):
>
> # zpool iostat -v 60

This looks quite normal. Here are parts of the output.

Normal operation without a git clone running:

pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       51.8G   408G      3     14  25.4K   150K
  mirror-0  51.8G   408G      3     14  25.4K   150K
    ada1p3      -      -      1      7  12.2K  74.8K
    ada0p3      -      -      1      7  13.2K  74.8K
----------  -----  -----  -----  -----  -----  -----

During a clone:

              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       51.8G   408G     22     24  2.87M   226K
  mirror-0  51.8G   408G     22     24  2.87M   226K
    ada1p3      -      -     11     12  1.47M   113K
    ada0p3      -      -     11     12  1.40M   113K
----------  -----  -----  -----  -----  -----  -----

As expected, reads go up during the clone, but not to a level where I
would be concerned about the load.

> Start the following command in another terminal on the server to monitor
> CPU and/or IO activity (press 'm' to switch between the two) (press 'q'
> to exit):
>
> # top -S -s 60

This also looks as expected. The git process (child of git-upload-pack)
uses CPU and memory and also creates IO. Here is some output from a few
seconds before git was killed (sorted by RES):

Mem: 598M Active, 426M Inact, 166M Laundry, 1223M Wired, 1020M Free
ARC: 543M Total, 185M MFU, 82M MRU, 16K Anon, 8618K Header, 267M Other
     98M Compressed, 240M Uncompressed, 2,45:1 Ratio
Swap: 4096M Total, 17M Used, 4079M Free

Displaying CPU statistics.
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
45719 satanist      1  30    0  1289M   986M pipdwt   2   0:26  17,77% git
53388 10001        44  52    0  2755M   237M uwait    2   2:23   0,15% java

After the java process there are only processes with a RES below 100MB.
I don't know exactly, but it looks like RES and SIZE count both allocated
memory and memory-mapped files. In that case I would argue there is
enough memory that could be dropped, because it can be read back from
disk.

Philipp