Date: Sun, 31 May 2009 01:20:45 -0500
From: Richard Todd <rmtodd@ichotolot.servalan.com>
To: freebsd-current@freebsd.org
Subject: Bug in recent large_alloc changes to the ZFS zio code?
Message-ID: <20090531064517.EB9ADCC9@mx1.synetsystems.com>
Okay, I'm looking at the recent changes in the ZFS zio code that change how data buffers are allocated (svn r192207).

The old code for zio_data_buf_alloc just called kmem_alloc (the Solaris compatibility one), which in turn called malloc() with M_WAITOK, so it was always guaranteed to get a valid, non-NULL pointer. Fair enough.

The new code has an alternate code path: in "arc_large_memory_enabled" mode, it calls the new function zio_large_malloc instead. zio_large_malloc in turn tries a few times to allocate the required pages using vm_phys_alloc_contig, but if that fails it goes ahead and returns NULL.

Here's the problem. As near as I can tell, none of the code that calls zio_data_buf_alloc checks for the possibility that the returned pointer could be NULL, which I guess is reasonable, as the original code never could return NULL. However, the new large malloc code *can* return NULL, so a caller may end up using a NULL data buffer.

The other day I mentioned here a panic I saw where, under sufficiently heavy load, the GEOM code was complaining that it had been given a NULL data pointer. It seems to me that this was likely because zio had tried to allocate a data buffer and gotten a NULL pointer instead.
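To make the failure mode concrete, here is a minimal userland sketch of the pattern described above. It is not the code from r192207. All of the sim_* names are hypothetical stand-ins: sim_large_malloc plays the role of zio_large_malloc (a few attempts, then NULL), sim_waitok_alloc plays the role of the old kmem_alloc/M_WAITOK path (never returns NULL), and sim_contig_failures is a knob that forces the contiguous allocation to fail. The wrapper shows one possible way to restore the "never NULL" contract, by falling back to the blocking path; whether that is the right fix for the real code is an assumption, not something established here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical knob: how many upcoming "contiguous" attempts should fail. */
static int sim_contig_failures = 0;

/*
 * Hypothetical stand-in for zio_large_malloc: try a few times,
 * then give up and return NULL, as the post describes.
 */
static void *
sim_large_malloc(size_t size)
{
	for (int attempt = 0; attempt < 3; attempt++) {
		if (sim_contig_failures > 0) {
			sim_contig_failures--;	/* this attempt failed */
			continue;
		}
		return (malloc(size));
	}
	return (NULL);
}

/*
 * Hypothetical stand-in for kmem_alloc with M_WAITOK. The kernel
 * version sleeps until memory is available; this userland model
 * aborts instead, so it likewise never returns NULL to its caller.
 */
static void *
sim_waitok_alloc(size_t size)
{
	void *p = malloc(size);

	if (p == NULL)
		abort();
	return (p);
}

/*
 * Sketch of a data-buffer allocator that keeps the old contract:
 * prefer the large path when enabled, but fall back to the blocking
 * path rather than handing NULL to callers that never check for it.
 */
static void *
sim_data_buf_alloc(size_t size, int large_enabled)
{
	assert(size > 0);
	if (large_enabled) {
		void *p = sim_large_malloc(size);

		if (p != NULL)
			return (p);
	}
	return (sim_waitok_alloc(size));
}
```

With sim_contig_failures set to 3, sim_large_malloc returns NULL on its own, while sim_data_buf_alloc still hands back a usable buffer. The alternative would be to audit every caller of zio_data_buf_alloc and add NULL checks there.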