Date: Sun, 11 Dec 2011 17:30:06 GMT From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <des@des.no> To: freebsd-bugs@FreeBSD.org Subject: Re: kern/163076: It is not possible to read in chunks from linprocfs and procfs. Message-ID: <201112111730.pBBHU6UU019001@freefall.freebsd.org>
next in thread | raw e-mail | index | archive | help
The following reply was made to PR kern/163076; it has been noted by GNATS. From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= <des@des.no> To: "Poul-Henning Kamp" <phk@phk.freebsd.dk> Cc: Jaakko Heinonen <jh@FreeBSD.org>, Petr Salinger <Petr.Salinger@seznam.cz>, bug-followup@FreeBSD.org, mdf@FreeBSD.org Subject: Re: kern/163076: It is not possible to read in chunks from linprocfs and procfs. Date: Sun, 11 Dec 2011 18:28:59 +0100 "Poul-Henning Kamp" <phk@phk.freebsd.dk> writes: > Jaakko Heinonen <jh@FreeBSD.org> writes: > > One problem is the different malloc() semantics. The kernel version uses > > M_WAITOK allocations while user space malloc(3) can fail. > Yes, that's Dag-Erlings and my point: The semantics are too different. There is another, more important issue. Synthetic (pseudofs-based) filesystems use sbufs extensively. If a program seeks to a position far into a file in such a filesystem and then reads some data from it, we have to allocate an sbuf large enough to contain the entire file *up to and including* the data the program actually asked for, unless the underlying filler function can somehow skip the part the program didn't ask for (or a large part of it), which is not always possible. This basically allows the userland to trick pseudofs into allocating and dirtying arbitrary amounts of kernel memory, which is obviously not a good thing. The way sbuf_*printf() was originally implemented, it would have been trivial to extend the API to have it allocate no more memory than required to hold the amount of data the program requested, and discard the first N bytes written to it. This is not possible with the current code: the userland *printf() implementation is too different from the kernel *printf() implementation, so they had to tear sbuf_*printf() apart. So what did we do instead? We added checks in pseudofs to prevent the scenario I described. This effectively sets an arbitrary upper limit on the size of sbuf-backed files in pseudofs-based filesystems. I was very unhappy about it, but the only way around it would have been to hide this functionality behind ifdefs so it was only available in the kernel, and turn the sbuf implementation into an even bigger mess than it already is - a mess which is entirely ascribed to the boneheaded decision to use the same code in userland as in the kernel. Before you point out that this *can* in fact be implemented using sbuf_drain(): sbuf_drain() has no place in the sbuf API, and should be removed. It breaks the sbuf semantics by a) providing access to the contents of an sbuf at a time when the sbuf is, by definition, in an inconsistent state, and b) emptying the sbuf at irregular intervals and when sbuf_finish() is called, preventing reuse of its contents. It's bad enough that I had to extend sbufs to support binary data; let's please not compound that mistake by deviating even further from their original purpose. DES --=20 Dag-Erling Sm=C3=B8rgrav - des@des.no
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201112111730.pBBHU6UU019001>