Date: Fri, 8 Feb 2008 20:43:18 +0100
From: Mel <fbsd.questions@rachie.is-a-geek.net>
To: freebsd-questions@freebsd.org
Cc: lachlan@lkla.org, Alex Zbyslaw <xfb52@dial.pipex.com>, mark@msapiro.net
Subject: Re: Memory Error using Mailman on FreeBSD. How to debug?
Message-ID: <200802082043.19282.fbsd.questions@rachie.is-a-geek.net>
In-Reply-To: <47AC4E08.1060801@dial.pipex.com>
References: <1153.137.153.0.37.1202210274.squirrel@sm.lkla.org> <26921.137.153.0.25.1202463164.squirrel@sm.lkla.org> <47AC4E08.1060801@dial.pipex.com>
On Friday 08 February 2008 13:41:44 Alex Zbyslaw wrote:
> Lachlan Michael wrote:
> >>Real puzzler. I'm surprised not to have at least one process growing,
> >>though. Maybe it's not using much CPU and you're not spotting it.
> >
> >Following your advice, as far as I can tell, the mailman qrunner process
> >
> >  /usr/local/bin/python2.5 /usr/local/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s
> >
> >is the one that crashes: all other mailman processes are unaffected. I
> >couldn't see it increase much in size (maybe it went from 8.5M to 12.5M),
> >then it just bombed and a new process was spawned (easy to tell by the
> >large increase in PID).
>
> All I can think is that qrunner asks for such a large amount of memory
> in one go that it bombs out without ever growing. That fits with the
> ktrace output as well. Regrettably, I don't think you can tell *how*
> much memory was asked for. (The normal pattern with out-of-memory
> errors is for the process to grow and grow and grow and die; but it's
> not the only one.)
>
> >>Other things to try: Up the stack size
> >>    ulimit -s 262144
> >>
> >>inside the mailman startup. Again, I've had processes in the past which
> >>needed this.
> >
> >Ok, I am going to gradually try different limits. It seems as though
> >setting kern.maxssiz="256M" and so on in /boot/loader.conf will allow me
> >to increase the limits. Having to reboot is a pain, though. How far can
> >I go? 512M? (Physical memory is 1GB)
>
> Certainly not more than physical memory :-) To be honest, if 256M
> doesn't do it then this probably isn't the problem. I'm not
> particularly hopeful that this will do it, but in your circumstance I
> would try it.
>
> At the same time, you could also increase the data size (maxdsiz?) to
> 1Gb (yours looks like 0.5Gb, half your physical memory).
>
> My limit settings (also 1Gb) look like:
>
>     datasize     1048576 kbytes
>     stacksize    262144 kbytes
>
> which come from trying to set 256Mb and 1024Mb in the kernel config (old
> FreeBSD - no sysctls).
>
> Keep the ulimit -a in the mailman startup script so you can confirm that
> you really get these numbers.
>
> >>Can you email a file of the size you are
> >>trying not through mailman? Maybe your MTA (sendmail/postfix etc) has a
> >>limit that somehow causes mailman to get this error.
> >
> >This is definitely not the case. Users can receive (and send) similar
> >sized large attachments individually, so the MTA (sendmail in this case)
> >is not the cause.
>
> OK - rule that out. The ktrace showing qrunner failing a break pretty
> much does that too.
>
> >>The final suggestion is to try to trace (ktrace, strace from ports) the
> >>process that is dying,
> >
> >I'll admit it is my first time to try a ktrace, but after noting which
> >process it was that crashed I could identify the newly spawned PID, and
> >obtained a ktrace.out (binary) and a kdump (called
> >mailman_process_log.txt) when the problem occurs by sending another large
> >mail attachment. I'll leave the files up for a couple of days. (Both
> >files are about 2MB in size)
> >
> >http://lachlan.lkla.org/tmp/mailman_memory_error/
> >
> >Not that I can properly interpret the results, but it seems the mail file
> >is completely read, but whatever happens next causes the memory error.
> >
> > 52506 python2.5 RET   read 354/0x162
> > 52506 python2.5 CALL  break(0x8add000)
> > 52506 python2.5 RET   break 0
> > 52506 python2.5 CALL  break(0x8cc3000)
> > 52506 python2.5 RET   break -1 errno 12 Cannot allocate memory
>
> The kdump output is the only useful bit, really. Your analysis seems
> correct to me.

This looks like a classic uninitialized variable to me, as in asking for,
say, 5397590320 bytes of memory because something like "msgSize" was unset.
I'd attach gdb with the -p flag and check how much memory it's asking for.
If that doesn't work for you, maybe you can find out in the Python source
where it is asking for this memory and, instead of saying "Cannot allocate",
make it say "Cannot allocate this many bytes".

-- 
Mel
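
As a rough illustration of the "Cannot allocate this many bytes" idea: the sketch below wraps an allocation so that a failure reports the size that was requested. It is not Mailman or CPython code. The helper name allocate_buffer and its allocator parameter are hypothetical, and it assumes a current Python 3 rather than the python2.5 seen in the thread (the "raise ... from" syntax does not exist there).

```python
def allocate_buffer(size, allocator=bytearray):
    """Allocate `size` bytes; on failure, say how many bytes were requested.

    `allocate_buffer` and `allocator` are hypothetical names used only for
    this illustration. `allocator` defaults to bytearray and can be swapped
    for a stand-in when experimenting.
    """
    try:
        return allocator(size)
    except MemoryError as exc:
        # Re-raise with the requested size in the message, so the log shows
        # how much memory was asked for and not just that allocation failed.
        raise MemoryError("Cannot allocate %d bytes" % size) from exc


if __name__ == "__main__":
    # Simulate a failing allocation without actually exhausting memory.
    def always_fails(size):
        raise MemoryError()

    try:
        allocate_buffer(5397590320, allocator=always_fails)
    except MemoryError as err:
        print(err)
```

A wildly wrong size (for example from an unset variable) would then show up directly in the error message, without needing to attach a debugger.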