Date: Thu, 15 Mar 2007 11:00:31 +0200 From: Kostik Belousov <kostikbel@gmail.com> To: David Cecil <david.cecil@nokia.com> Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek <pjd@freebsd.org> Subject: Re: FFS writes to read-only mount Message-ID: <20070315090031.GB80993@deviant.kiev.zoral.com.ua> In-Reply-To: <45F8EE27.6070208@nokia.com> References: <45F776AE.8090702@nokia.com> <20070314161041.GI7847@garage.freebsd.pl> <45F8EE27.6070208@nokia.com>
next in thread | previous in thread | raw e-mail | index | archive | help
--qcHopEYAB45HaUaB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 15, 2007 at 04:56:39PM +1000, David Cecil wrote: > Hi Pawel, >=20 > here is what I've found. Note that this problem doesn't occur=20 > frequently, but does happen often enough to be a problem. I am still=20 > debugging the live system. >=20 > I put breakpoints in g_print_bio and g_io_request. I then continued=20 > until the breakpoint at g_print_bio breakpoint was hit, but each time=20 > g_io_request was hit, I printed the backtrace. Then, at g_print_bio I=20 > could identify the struct bio * and look for the corresponding=20 > g_io_request trace with a matching struct bio *. >=20 > That trace looks like this: > db> bt > Tracing pid 47 tid 100050 td 0x860d0640 > g_io_request(864258c4,860ec440,a0020024,98b92188,a5281bf0,...) at=20 > g_io_request+0x107 > g_vfs_strategy(86104ce4,98b92188,a0020024,98b92188,86104c30,...) at=20 > g_vfs_strategy+0x63 > ffs_geom_strategy(86104ce4,98b92188) at ffs_geom_strategy+0x129 > bufwrite(98b92188,4000,98b92188,a5281c54,805b20c1,...) at bufwrite+0x146 > ffs_bufwrite(98b92188) at ffs_bufwrite+0x282 > vfs_bio_awrite(98b92188) at vfs_bio_awrite+0x221 > vop_stdfsync(a5281cbc,a5281cbc,a5281c98,806eff3e,a5281cbc,...) at=20 > vop_stdfsync+0x121 It would be helpful, at this frame, to print the actual vnode, as well as the dirty buffer header together with dump of buffer data. It seems that the easiest way to do that would be to obtain coredump and then use kgdb. > devfs_fsync(a5281cbc) at devfs_fsync+0x23 > VOP_FSYNC_APV(80795080,a5281cbc) at VOP_FSYNC_APV+0x7e > sync_vnode(86104ce4,860d0640) at sync_vnode+0x100 > sched_sync(0,a5281d38,0,805c068c,0,...) at sched_sync+0x1ed > fork_exit(805c068c,0,a5281d38) at fork_exit+0xa0 > fork_trampoline() at fork_trampoline+0x8 >=20 > From ps, it identifies pid 47 as "syncer". >=20 > So it appears that all I've discovered is that the syncer thread is=20 > writing all the dirty buffers for gm0s1a to the disk :-( I guess the=20 > question is, why are there any dirty buffers for the read-only=20 > partition? I think the partition was mounted rw for a short while=20 > during the boot process. Maybe something there that didn't get=20 > flushed? Or is it something internal to FFS? >=20 > The output of ps is listed below. Note that some processes might differ= =20 > from a standard FreeBSD system as we're modifying the code. As this=20 > problem seems to be generic to some FreeBSD systems, I thought it valid= =20 > to pursue the problem here. I hope that's okay, and anticipate the fix= =20 > would apply to FBSD, as I've seen what appears to be the same problem=20 > logged elsewhere. >=20 > db> ps > pid ppid pgrp uid state wmesg wchan cmd > 856 841 856 0 S+ ttyin 0x858bd810 csh > 841 1 841 0 Ss+ wait 0x8612f800 login > 791 551 791 0 Ss select 0x8081231c sshd-x > 790 551 790 0 Ss nanslp 0x807c6400 cron > 789 551 789 0 Ss select 0x8081231c monitord > 788 551 788 0 Ss select 0x8081231c snmpd > 787 551 787 0 Ss select 0x8081231c clishd > 785 551 785 0 Ss select 0x8081231c inetd > 731 727 727 80 S accept 0x861cde7e httpd > 730 727 727 80 S accept 0x861cde7e httpd > 729 551 729 0 Ss select 0x8081231c ifm > 728 551 728 0 Ss select 0x8081231c xpand > 727 551 727 0 Ss select 0x8081231c httpd > 725 551 725 0 Ss select 0x8081231c ipsrd > 559 551 559 0 Ss select 0x8081231c syslogd > 551 1 551 0 Ss select 0x8081231c pm > 49 0 0 0 SL - 0xa5071d04 [schedcpu] > 48 0 0 0 SL sdflush 0x808ca6cc [softdepflush] > 47 0 0 0 SL syncer 0x807c5da4 [syncer] > 46 0 0 0 SL vlruwt 0x860ce600 [vnlru] > 45 0 0 0 SL psleep 0x8081276c [bufdaemon] > 44 0 0 0 SL pgzero 0x808cb41c [pagezero] > 43 0 0 0 SL psleep 0x808caf84 [vmdaemon] > 42 0 0 0 SL psleep 0x808caf40 [pagedaemon] > 41 0 0 0 SL m:w1 0x860a1800 [g_mirror gm0s1] > 40 0 0 0 SL ip_inp 0x80854c00 [ip_in_0] > 39 0 0 0 SL ip6_inpu 0x808c9560 [ip6 input [1]] > 38 0 0 0 SL ip6_inpu 0x808c9560 [ip6 input [0]] > 37 0 0 0 WL [swi0: sio] > 36 0 0 0 WL [irq17: ichsmb0] > 35 0 0 0 WL [irq18: atapci1] > 34 0 0 0 WL [irq15: ata1] > 33 0 0 0 WL [irq14: ata0] > 32 0 0 0 SL usbevt 0x84b7d210 [usb2] > 31 0 0 0 WL [irq23: ehci0] > 30 0 0 0 SL usbevt 0x8588c210 [usb1] > 29 0 0 0 WL [irq19: uhci1] > 28 0 0 0 SL usbtsk 0x807c2c64 [usbtask] > 27 0 0 0 SL usbevt 0x8587a210 [usb0] > 26 0 0 0 WL [irq16: uhci0] > 25 0 0 0 WL [irq144: cavium0] > 24 0 0 0 SL cbb cv 0x84b7dbe4 [cbb1] > 23 0 0 0 SL cbb cv 0x84b7f3e4 [cbb0] > 22 0 0 0 WL [irq96: cbb0 cbb1] > 21 0 0 0 WL [irq9: acpi0] > 20 0 0 0 WL [swi5: +] > 9 0 0 0 SL - 0x84b56bc0 [thread taskq] > 19 0 0 0 WL [swi6: +] > 18 0 0 0 WL [swi6: task queue] > 8 0 0 0 SL - 0x84b56d00 [acpi_task2] > 7 0 0 0 SL - 0x84b56d00 [acpi_task1] > 6 0 0 0 SL - 0x84b56d00 [acpi_task0] > 17 0 0 0 WL [swi2: cambio] > 5 0 0 0 SL - 0x84b56e80 [kqueue taskq] > 16 0 0 0 SL - 0x807c2440 [yarrow] > 4 0 0 0 SL - 0x807c335c [g_down] > 3 0 0 0 RL CPU 0 [g_up] > 2 0 0 0 SL - 0x807c3350 [g_event] > 15 0 0 0 WL [swi1: net] > 14 0 0 0 WL [swi3: vm] > 13 0 0 0 WL [swi4: clock sio] > 12 0 0 0 RL [idle: cpu0] > 11 0 0 0 RL CPU 1 [idle: cpu1] > 1 0 1 0 SLs wait 0x84ac8e00 [init] > 10 0 0 0 SL ktrace 0x807c4610 [ktrace] > 0 0 0 0 WLs [swapper] >=20 > One list member asked if I have mounted the root partition with noatime= =20 > as well as ro. I have not specified noatime. It's worth noting that=20 > files in / are regularly read/accessed, as you'd expect, but this bug=20 > only occurs once a week or so on a number of machines. >=20 > Thanks a lot for your time. >=20 > Regards, > Dave >=20 > ext Pawel Jakub Dawidek wrote: > >On Wed, Mar 14, 2007 at 02:14:38PM +1000, David Cecil wrote: > > =20 > >>Hi, > >> > >>I have seen the following message (or equivalent) occasionally on a=20 > >>FreeBSD 6.1 system: > >>g_vfs_done():mirror/gm0s1a[WRITE(offset=3D1349091328, length=3D16384)]e= rror =3D=20 > >>1 > >> > >>The partition in question is the root partition, and it is mounted=20 > >>read-only. I have verified that the problem occurs due to the write=20 > >>request returning EPERM due to the check in g_io_check: > >>case BIO_WRITE: > >>case BIO_DELETE: > >> if (cp->acw =3D=3D 0) > >> return (EPERM); > >> > >>I have been trying to determine what within FFS would be trying to writ= e=20 > >>to the partition. The "bio_from" in the bio structure indicates (in th= e=20 > >>geom) that it's ffs.mirror/gm0s1a that's trying to write. The contents= =20 > >>of the buffer looks somewhat like a directory (lots of files listed, bu= t=20 > >>comparison to the actual directory that contians these files reveals it= 's=20 > >>somewhat different), followed by a binary (ELF header). However, I'm a= t=20 > >>a loss to understand who's actually doing the writing. Is it coming fr= om=20 > >>within FFS or is there an application that's done the write? (I can't= =20 > >>understand how an application would be permitted to do it though.) > >> > >>I have seen this sort of problem (same error number) reported on the=20 > >>Internet occasionally, but it doesn't seem it's been satisfactorily=20 > >>resolved in all instances. > >> > >>Any help you can provide would be much appreciated. > >> =20 > > > >Will be good if you could place kdb_enter() into g_vfs_done() error path > >and once in DDB try which processes wait for I/O and collect their > >backtraces or just put output of 'alltrace' on some web page. > > > > =20 >=20 >=20 > --=20 > Software Engineer > Secure and Mobile Connectivity > Nokia Enterprise Solutions > +61 7 5553 8307 (office) > +61 412 728 222 (cell) >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" --qcHopEYAB45HaUaB Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) iD8DBQFF+QsuC3+MBN1Mb4gRAoUzAKCtaA3JA8mLU9K2xE5iJnaSojyOVwCdH6uy mmrcLNQXhl9I3ywn4F32qIw= =FJAx -----END PGP SIGNATURE----- --qcHopEYAB45HaUaB--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070315090031.GB80993>