Date:      Tue, 09 May 2017 19:23:38 +0200
From:      pmhausen <pmhausen@me.com>
To:        python@freebsd.org
Subject:   Proposal for pkg-message for python 3.x ports
Message-ID:  <36BF7EEF-EE06-4A70-B0F2-69D1454CD1FF@me.com>

Dear python port maintainers,

I discovered an odd performance problem with python 3.6 on servers with
large amounts of memory.

You can find all the details in a thread on the help@python.org
mailing list with the subject

"Python bug? Closing *all* file descriptors on fork() instead of only
open ones?"

Unfortunately I cannot get at the list archives. Otherwise I would have
sent you a direct link.

Short version: since python 3.3 (if my google-fu worked) python tries to
close all open file descriptors in the child after fork(). Since there is
no standard POSIX-compliant way to do this for *open* descriptors only,
it tries to use /dev/fd on FreeBSD, and if that is not available it closes
*all* descriptors (resulting in ERR#9 for the majority). On a 64 bit FreeBSD
11 system with 128 GB of RAM the auto-tuned kern.maxfilesperproc
is between 3 and 4 million. That's a heck of a lot of system calls for each
fork() ;-)
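
In Python terms the two strategies look roughly like this (a sketch of my
own, not the actual CPython code -- the real work is done in C in
_posixsubprocess, and the function names below are made up):

    import os

    def close_all_possible_fds(keep_up_to=2):
        # Fallback: close every possible descriptor up to the per-process
        # limit, whether it is open or not. On a big machine this means
        # millions of close() calls, almost all failing with EBADF (ERR#9).
        max_fd = os.sysconf("SC_OPEN_MAX")   # follows kern.maxfilesperproc
        os.closerange(keep_up_to + 1, max_fd)

    def close_open_fds_via_dev_fd(keep_up_to=2):
        # Fast path: close only descriptors that are actually open, as
        # listed in /dev/fd -- which on FreeBSD requires fdescfs.
        for name in os.listdir("/dev/fd"):
            fd = int(name)
            if fd > keep_up_to:
                try:
                    os.close(fd)
                except OSError:
                    pass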

I suggest adding a hint to the pkg-message of the affected python ports,
telling users to mount fdescfs and to add it to /etc/fstab.
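
Something along these lines would do (exact wording up to the maintainers,
of course):

    To avoid excessive close() calls on every fork(), mount fdescfs:

        mount -t fdescfs fdesc /dev/fd

    and add it to /etc/fstab to make the mount permanent:

        fdesc   /dev/fd     fdescfs     rw  0   0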

I'll attach my first message to the list below. The solution is to mount
fdescfs on /dev/fd.

Kind regards,
Patrick
-----------------------
Hi, all,

last week we encountered an odd behaviour of a particular python application
on a freshly deployed server. The application in question is:
https://github.com/iocage/iocage

On our test system (SMALLBOX) everything seemed to be working fine while on
the new designated production system (BIGBOX) every invocation of iocage was
incredibly slow …

This was surprising since BIGBOX features 20 cores, 128 GB of RAM and SSDs only,
while SMALLBOX is a run-of-the-mill server with 4 cores, 16 GB of RAM and
spinning drives that we used for development.

I finally found that the problem may not lie with iocage. To reproduce
with just python installed:

SMALLBOX:
$ uname -a
FreeBSD testhost.proserver.punkt.de 11.0-RELEASE-p10 FreeBSD 11.0-RELEASE-p10 #2 r317823: Fri May  5 11:41:45 CEST 2017     root@testhost.proserver.punkt.de:/usr/obj/usr/src/sys/VIMAGE  amd64
$ python3.6 -V
Python 3.6.1
$ time pip3 list
[…]
real	0m0.982s
user	0m0.614s
sys	0m0.370s

BIGBOX:
$ uname -a
FreeBSD ph002.proserver.punkt.de 11.0-RELEASE-p10 FreeBSD 11.0-RELEASE-p10 #2 r317823: Fri May  5 10:55:27 CEST 2017     root@ph002.proserver.punkt.de:/usr/obj/usr/src/sys/VIMAGE  amd64
$ python3.6 -V
Python 3.6.1
$ time pip3 list
real	0m4.140s
user	0m1.242s
sys	0m2.900s

Well … something's fishy. After double checking OS and python versions, SSD
configuration, CPU and I/O load, I decided (being a sysadmin and not really a
developer) to use truss to have a look at what the systems were actually *doing*.

Result: on every fork of a python process we get:
77906: close(4)                                  ERR#9 'Bad file descriptor'
77906: close(5)                                  ERR#9 'Bad file descriptor'
77906: close(6)                                  ERR#9 'Bad file descriptor'
77906: close(7)                                  ERR#9 'Bad file descriptor'
77906: close(9)                                  ERR#9 'Bad file descriptor'
77906: close(10)                                 ERR#9 'Bad file descriptor'
77906: close(11)                                 ERR#9 'Bad file descriptor'
[… continued for literally millions of calls ...]

I'm perfectly familiar with the concept of closing *open* file descriptors
on fork/exec, but closing *all* of them regardless of whether they are in use
or not means that on the two systems we get this number of system calls:

SMALLBOX:
$ truss -f pip3 list 2>&1 | grep 'ERR#9' | wc -l
1877703

BIGBOX:
$ truss -f pip3 list 2>&1 | grep 'ERR#9' | wc -l
15082323

Which rather nicely corresponds to the maximum number of file descriptors per
process on each system, multiplied by 4, since the initial python process seems
to fork 4 times to get the pip3 job done:

SMALLBOX:
$ sysctl kern.maxfilesperproc
kern.maxfilesperproc: 469431
$ dc
469431 4 * p
1877724

BIGBOX:
$ sysctl kern.maxfilesperproc
kern.maxfilesperproc: 3770586
$ dc
3770586 4 * p
15082344

OK - obviously even on a system like BIGBOX 15 million invocations of close()
take a couple of seconds. And this will get worse with 256 or 512 GB of RAM,
since kernel tables will dynamically increase accordingly.

What now? Is this a bug? It should be sufficient to close descriptors 0 to 3
after fork, which is precisely what the invoking shell does before exec()ing
python3.6 …
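
For what it's worth, the effect is easy to time without pip at all; a minimal
sketch (my own, purely for illustration):

    import subprocess, time

    t0 = time.monotonic()
    for _ in range(10):
        # close_fds=True is the default in python3; each run() forks once
        subprocess.run(["/usr/bin/true"], close_fds=True)
    print("10 forks took", round(time.monotonic() - t0, 2), "seconds")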

Thanks in advance for any hints
Patrick


