Date: Fri, 9 Jul 2021 12:46:47 -0700
From: Mark Millard via freebsd-ports <freebsd-ports@freebsd.org>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-ports@freebsd.org
Subject: Re: Too many pythons in poudriere
Message-ID: <6D9BAA86-CB85-42E9-B4FC-6272141CFFDB@yahoo.com>
In-Reply-To: <24795168-CEBF-4049-A4BD-D529F70E7F8D@yahoo.com>
References: <044A7E63-2734-41F4-A1A2-AE5096C6A62C@yahoo.com> <CDDBF956-9647-464C-9114-CB1021C14699@yahoo.com> <EE5052EA-0393-4AEB-A6A7-AE24F63F8E46@yahoo.com> <20210708154558.GB60914@www.zefox.net> <E1984653-A703-4B92-A48D-131514C92E92@yahoo.com> <20210708175436.GA70414@www.zefox.net> <24795168-CEBF-4049-A4BD-D529F70E7F8D@yahoo.com>
On 2021-Jul-8, at 14:11, Mark Millard <marklmi at yahoo.com> wrote:

> On 2021-Jul-8, at 10:54, bob prohaska <fbsd at www.zefox.net> wrote:
>
>> On Thu, Jul 08, 2021 at 09:41:13AM -0700, Mark Millard wrote:
>>>
>>>
>>> On 2021-Jul-8, at 08:45, bob prohaska <fbsd at www.zefox.net> wrote:
>>>
>>>> Even with -J1 and no ALLOW_MAKE_JOBS I'm still
>>>> seeing five pythons occupying at least 3 GB on
>>>> the loose.
>>>
>>> Actually I just looked and saw:
>>>
>>> Swapinfo 7.36%
>>>
>>> (Unlike the 83% or so I saw roughly 3 hours ago.)
>>>
>>> Load Averages (220%) 2.20 2.18 1.76
>>>
>>> Elapsed 12:54:56
>>>
>>> I do not see a swaplog in http://www.zefox.org/~bob/swaplogs/
>>> to look at. So I cannot see how much the peak swap space
>>> usage was so far (approximately).
>>
>> Started a new swaplog:
>> http://www.zefox.org/~bob/swaplogs/202107182930.log
>>
>> It came within a whisker of running out of swap, then abruptly
>> the python threads vanished and the build seems to be proceeding.
>
> Did you update anything, such as /usr/ports/ , between the 50+ hr
> run and the new one?
> The new one spent over 6 hours at:
>
> [05:50:59] [ 8% 3914/47953] python ../../third_party/blink/renderer/build/scripts/run_with_pythonpath.py -I ../../third_party/blink/renderer/build/scripts -I ../../third_party -I ../../tools ../../third_party/blink/renderer/build/scripts/make_names.py gen/third_party/blink/renderer/modules/event_interface_modules_names.json5 --output_dir gen/third_party/blink/renderer/modules
> [12:22:32] [ 8% 3915/47953] python ../../third_party/blink/renderer/bindings/scripts/generate_bindings.py --web_idl_database gen/third_party/blink/renderer/bindings/web_idl_database.pickle --root_src_dir ../../ --root_gen_dir gen --output_core_reldir third_party/blink/renderer/bindings/core/v8/ --output_modules_reldir third_party/blink/renderer/bindings/modules/v8/ enumeration callback_function callback_interface interface namespace typedef union
> [12:22:36] [ 8% 3916/47953] touch obj/third_party/blink/renderer/bindings/generate_bindings_all.stamp
> [12:22:39] [ 8% 3917/47953] touch obj/third_party/blink/renderer/bindings/modules/event_modules_names.stamp
> [12:22:42] [ 8% 3918/47953] python ../../third_party/blink/renderer/build/scripts/run_with_pythonpath.py -I ../../third_party/blink/renderer/build/scripts -I ../../third_party -I ../../tools ../../third_party/blink/renderer/build/scripts/make_names.py ../../third_party/blink/renderer/modules/event_target_modules_names.json5 --output_dir gen/third_party/blink/renderer/modules
> [12:22:42] [ 8% 3919/47953] touch obj/third_party/blink/renderer/bindings/modules/event_target_modules_names.stamp
>
>
> The 50+ hour one did not:
>
> [03:56:05] [ 8% 3848/47953] python ../../third_party/blink/renderer/build/scripts/run_with_pythonpath.py -I ../../third_party/blink/renderer/build/scripts -I ../../third_party -I ../../tools ../../third_party/blink/renderer/build/scripts/make_names.py gen/third_party/blink/renderer/modules/event_interface_modules_names.json5 --output_dir gen/third_party/blink/renderer/modules
> [03:56:05] [ 8% 3849/47953] touch obj/third_party/blink/renderer/bindings/modules/event_modules_names.stamp
> [03:56:06] [ 8% 3850/47953] python ../../third_party/blink/renderer/bindings/scripts/collect_idl_files.py --idl_list_file __third_party_blink_renderer_bindings_web_idl_in_core_for_testing___build_toolchain_linux_clang_arm64__rule.rsp --component core --output gen/third_party/blink/renderer/bindings/web_idl_in_core_for_testing.pickle --for_testing
> [03:56:06] [ 8% 3851/47953] touch obj/third_party/blink/renderer/bindings/web_idl_in_core_for_testing.stamp
> [03:56:06] [ 8% 3852/47953] touch obj/third_party/blink/renderer/bindings/scripts/cached_jinja_templates.stamp
> [03:56:06] [ 8% 3853/47953] python ../../third_party/blink/renderer/build/scripts/run_with_pythonpath.py -I ../../third_party/blink/renderer/build/scripts -I ../../third_party -I ../../tools ../../third_party/blink/renderer/build/scripts/make_names.py ../../third_party/blink/renderer/modules/event_target_modules_names.json5 --output_dir gen/third_party/blink/renderer/modules
> [03:56:09] [ 8% 3854/47953] python ../../third_party/blink/renderer/build/scripts/run_with_pythonpath.py -I ../../third_party/blink/renderer/build/scripts -I ../../third_party -I ../../tools ../../third_party/blink/renderer/build/scripts/core/style/make_computed_style_initial_values.py ../../third_party/blink/renderer/core/css/css_properties.json5 ../../third_party/blink/renderer/core/css/computed_style_field_aliases.json5 ../../third_party/blink/renderer/platform/runtime_enabled_features.json5 ../../third_party/blink/renderer/core/style/computed_style_extra_fields.json5 --output_dir gen/third_party/blink/renderer/core/style --gperf gperf
> [03:56:09] [ 8% 3855/47953] touch obj/third_party/blink/renderer/bindings/modules/event_target_modules_names.stamp
>
>
> The build step numbers are different for the same
> command:
>
> 3914/47953
> vs.
> 3848/47953
>
> (But I do not know if the build technique tries to
> keep the partial ordering for build steps stable
> across build attempts from similar starting
> conditions.)
>
> It almost seems like a system-level check of which
> process(es) have large amounts of swap space in use
> would be appropriate when this sort of thing happens,
> not that I've done such before. My understanding is that
> top's per-process reporting of swap space usage is
> problematic when the display is set to show such
> information.

Going in a different direction, when you updated:

/usr/local/poudriere/poudriere-system

to have Jail OSVERSION: 1400024 (and probably matching the
host OS main-n247590-5dd84e315a9), matching /usr/src/
as well, did you rebuild all your ports under the coherent
combination? Or are you still using port builds in poudriere
that were built with the jail OSVERSION being older than the
/usr/src/ the builds were based on?

Just like devel/llvm10 had a command that only sometimes
had problems when built via the incoherent combination,
other ports could have odd behavior, but only sometimes.

My guess is that you have not done such a rebuild because
it would have taken far more time than I've noticed in your
activities.

I previously recommended doing such a rebuild of the ports
and then a reinstall of them. Given the odd behaviors being
observed that do not repeat each time tried, I still make
the recommendation.

>> I'm curious if this was blind luck, or some adaptive behavior
>> by poudriere.
>
> Poudriere does not control chrome's internal build steps.
> The notation like [ 8% 3849/47953] is not from poudriere.
>
>> One other oddity: occasionally one sees in top a
>> PID using more than 100% WCPU. Is one thread occupying two cores?
>
> A process can have multiple threads instead of just one
> but each thread runs on at most one cpu (core here) at a
> time.
>
> Top can do either of:
>
> A) show a count of threads in each process shown
> vs.
> B) show a line for each thread instead of for each
> process
>
> Top can also display the unique thread number instead
> of the process number (shared by all threads in the
> process). (But not both at the same time.) Showing
> thread numbers is probably only commonly selected when
> (B) is in use.
>
> In (B) mode, if the process number is being shown,
> then the same number appears once for each thread
> shown that belongs to that process.
>
> But I'll note that, if I remember right, python
> (CPython) runs only one thread of python code at a
> time because of its global interpreter lock. It would
> probably take use of language bindings to other
> languages for multiple threads to be busy at once
> (indirectly) in a python process.
>
>>>
>>>> I'm fairly sure this didn't happen
>>>> when using make by itself (IIRC it was -j2).
>
> It did not happen at that point in the 50+ hr run
> either.
>
>>>> I also got rid of the mistaken directive in
>>>> poudriere.d/make.conf.
>>>
>>> When I look at http://www.zefox.org/~bob/poudriere.d/make.conf
>>> now I see:
>>>
>>> ALLOW_MAKE_JOBS=yes
>>> #MAKE_JOBS_NUMBER=2
>>> #.if ${.CURDIR:M*www/chromium}
>>> #MAKE_JOBS_NUMBER_LIMIT=2
>>> #.endif
>>> #.if ${.CURDIR:M*databases/sqlite3}
>>> #MAKE_JOBS_NUMBER_LIMIT=2
>>> #.endif
>>> #.if ${.CURDIR:M*www/firefox}
>>> #MAKE_JOBS_NUMBER_LIMIT=2
>>> #.endif
>>>
>>> which does not match your wording.
>>>
>>
>> Thank you for catching my error. _now_ it's fixed.
>>
>> [snip]
>>> To see what is getting CPU time that leads to
>>> the load averages being around 2 might take
>>> using something like top sorted by cpu time
>>> and watching for a while.
>>>
>>>> There is a
>>>> #MAX_MEMORY=8
>>>> in poudriere.conf, presumably GB.
>>>
>>> Documented as GiB:
>>>
>>> # How much memory to limit jail processes to for *each builder*
>>> # in GiB (default: none)
>>> #MAX_MEMORY=8
>>>
>>> Per builder, not per-make-process.
>>> Within a builder each make-process shares
>>> that size space with the others.
>>>
>>>> That
>>>> looks like a good knob to play with. Would
>>>> setting it to something like 3 or 4 help?
>>>
>>> If the memory use exceeds what you set, the builder
>>> process is likely killed.
>> [snip]
>>
>> I was hopeful it might inhibit starting new PIDs
>> when memory/swap is below some threshold. Guess not.
>
> poudriere does not control the internals of the
> builder's attempted operations beyond what the
> port supports. (And there are not port-specific
> interfaces for such control. poudriere treats
> ports the same way in general.)
>

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
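
The "over 6 hours" figure for the newer run can be checked from the
timestamps in the quoted log. What follows is only a minimal Python
sketch, not something from the thread. It assumes the lines keep the
"[HH:MM:SS] [ P% N/TOTAL] command" layout seen above, and the names
STEP_RE, parse_steps and largest_gap are made up for illustration.

```python
import re

# Assumed layout, taken from the quoted log:
#   "[HH:MM:SS] [ 8% 3914/47953] command ..."
STEP_RE = re.compile(r"\[(\d+):(\d{2}):(\d{2})\]\s+\[\s*\d+%\s+(\d+)/(\d+)\]")


def parse_steps(lines):
    """Return (elapsed_seconds, step_number) for each line that matches."""
    steps = []
    for line in lines:
        m = STEP_RE.search(line)
        if m:
            hours, minutes, seconds, step, _total = (int(g) for g in m.groups())
            steps.append((hours * 3600 + minutes * 60 + seconds, step))
    return steps


def largest_gap(steps):
    """Return (gap_seconds, step_before, step_after) for the longest pause."""
    best = (0, None, None)
    for (t0, n0), (t1, n1) in zip(steps, steps[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, n0, n1)
    return best


# Timestamps and step numbers from the newer run; the commands are
# shortened to "..." here because only the prefixes matter.
new_run = [
    "[05:50:59] [ 8% 3914/47953] python ...",
    "[12:22:32] [ 8% 3915/47953] python ...",
    "[12:22:36] [ 8% 3916/47953] touch ...",
]

gap, before, after = largest_gap(parse_steps(new_run))
print(f"{gap // 3600}h{gap % 3600 // 60:02d}m{gap % 60:02d}s "
      f"between steps {before} and {after}")
```

For the three lines shown, the longest pause is 6h31m33s, between the
lines for steps 3914 and 3915. Feeding a whole build log through
parse_steps would be one way to spot where a run stalled, provided the
log keeps this prefix format throughout.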