Date: Wed, 1 Oct 2008 05:52:52 -0600
From: "Cyrus Rahman" <crahman@gmail.com>
To: freebsd-geom@freebsd.org
Subject: gjournal deadlock
Message-ID: <9e77bdb50810010452r3bd4a01bs14facb8fa9a97b4a@mail.gmail.com>
I continue to experience deadlocks using gjournal with large files. In a previous message I mentioned that they occur frequently with snapshots. Snapshots are useful, but it is certainly possible to do without them. Lately, however, I have hit the same deadlocks in another context: building nanobsd images.

The problem occurs when writing out the image file through the md(4) device. Writing 128MB images causes no trouble, but moving to a 2GB image causes the deadlock every time. In fact, I was only able to succeed by building the image on a non-journaled filesystem.

The deadlock occurs while sleeping on wdrain. Here is the ps(1) output of the processes involved in one such event:

   0    51     0  0 -16  0     0    16 wdrain DL   ??    1:24.22 [g_journal switcher]
   0 52022 52018  0 -16  0  4640  1152 wdrain D    ??    0:00.02 newsyslog
1001 52069  1725  0 -16  0  2596   636 wdrain D    p3    0:00.01 sync
   0 51935 51933  0 -16  0  4640  1124 wdrain T    p7    0:00.38 cpio -dump /usr/obj/nanobsd.img
   0 51924     0  0  75  0     0    16 suspfs DL   ??    0:00.12 [md0]

These values are used when deciding whether to msleep in wdrain:

vfs.hirunningspace: 1048576
vfs.lorunningspace: 524288
vfs.runningbufspace: 1956352

They remain static after the deadlock.

The really unacceptable aspect of this is that if you don't notice the deadlock has occurred, you can continue to work for many hours on other projects, yet none of the changes made to the filesystem after the deadlock will be committed to the disk. All your work, including any notes about the deadlock, will vanish when you reboot. It's strange seeing all those deleted files reappear and your code revert ten hours to the instant the deadlock occurred, and this issue represents a serious danger to anyone using gjournal in a production environment.

Furthermore, the problem affects all gjournaled filesystems, not just the one involved in the observed deadlock. So, for example, your successfully received mail and such will also vanish.

I expect what happens is that all the changes after the deadlock pile up in the journals, and so remain visible until the inevitable reboot, at which time they are discarded.
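
The wdrain condition described above can be modeled roughly as follows. This is a simplified sketch in Python, not the kernel code: the function names are made up for illustration, and it assumes that a writer sleeps while vfs.runningbufspace exceeds vfs.hirunningspace and that sleepers are only woken once it has drained to vfs.lorunningspace or below. The three numbers are the sysctl values reported in this message.

```python
# Simplified model of the wdrain sleep/wakeup thresholds.
# Assumption: writers sleep while runningbufspace > hirunningspace and are
# woken only when runningbufspace <= lorunningspace. All function names here
# are hypothetical and do not correspond to kernel symbols.

# sysctl values reported in the message above
HIRUNNINGSPACE = 1048576
LORUNNINGSPACE = 524288
RUNNINGBUFSPACE = 1956352


def writer_must_sleep(runningbufspace, hirunningspace):
    """True if a new writer would go to sleep on wdrain."""
    return runningbufspace > hirunningspace


def sleepers_can_wake(runningbufspace, lorunningspace):
    """True if in-flight writes have drained far enough to wake sleepers."""
    return runningbufspace <= lorunningspace


def bytes_to_drain(runningbufspace, lorunningspace):
    """Bytes of in-flight writes that must complete before a wakeup."""
    return max(0, runningbufspace - lorunningspace)


if __name__ == "__main__":
    print("writer sleeps:", writer_must_sleep(RUNNINGBUFSPACE, HIRUNNINGSPACE))
    print("sleepers wake:", sleepers_can_wake(RUNNINGBUFSPACE, LORUNNINGSPACE))
    print("bytes to drain:", bytes_to_drain(RUNNINGBUFSPACE, LORUNNINGSPACE))
```

Under these assumptions, with the reported figures every new writer sleeps, and 1432064 bytes of in-flight writes would have to complete before anything is woken. Since the values stay static after the deadlock, that never happens.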