Date: Mon, 28 Aug 2006 20:53:06 -0500
From: Brooks Davis <brooks@one-eyed-alien.net>
To: Antony Mawer <fbsd-arch@mawer.org>
Cc: Max Laier <max@love2party.net>, "Marc G. Fournier" <scrappy@freebsd.org>, freebsd-arch@freebsd.org
Subject: Re: BSDStats - What is involved ... ?
Message-ID: <20060829015306.GA6722@lor.one-eyed-alien.net>
In-Reply-To: <44F39ACB.6090703@mawer.org>
References: <20060825233420.V82634@hub.org> <20060826112115.GG16768@turion.vk2pj.dyndns.org>
 <20060826132138.H82634@hub.org> <200608261848.16513.max@love2party.net>
 <20060826165209.V82634@hub.org> <20060828130247.GA77702@lor.one-eyed-alien.net>
 <20060828170450.M82634@hub.org> <44F39ACB.6090703@mawer.org>

On Tue, Aug 29, 2006 at 11:39:23AM +1000, Antony Mawer wrote:
> On 29/08/2006 6:07 AM, Marc G. Fournier wrote:
> >On Mon, 28 Aug 2006, Brooks Davis wrote:
> >
> >>While I understand (or think I understand) the motivations for this
> >>design goal, it's contrary to allowing collection of statistics from
> >>many people.  I'd love to be able to publish data from the FreeBSD
> >>systems (300+) at work, but unless I can do it in an anonymized
> >>aggregate form it's not going to happen.  I just can't justify leaking
> >>that much internal configuration information given a policy of hiding
> >>it (right or wrong and not subject to debate).  If I could run my own
> >>stats server and publish from it that might be possible.
> >
> >Aggregate submissions will never be possible, as it will definitely
> >break any attempts at keeping the data 'clean' :(  I do understand that
> >we will never be able to get *everyone* reporting, but we will try as
> >much as possible to make it easy for as many as possible to report
> >*within* limits ...
> >
> >I'm going to work on an 'email submission' method in September, that
> >would allow reporting to go *thru* one mailbox, and will include a
> >confirmation/challenge stage *per* server though ...
>
> Brooks, what sort of information are you looking to "anonymise" before
> sending it out? Aggregating to say that I have X of this kind of CPU, Y
> of this IDE chipset, etc, rather than linking it specifically to each
> machine? Where would you feel a comfortable balance lay? Obviously some
> effort needs to be made to minimise fraudulent entries.
>
> Perhaps aggregate submissions could be conducted using a registration
> mechanism...
>
> Other thoughts would be having a local stats aggregation server that
> pushes summaries up to the master server... the aggregation server keeps
> the individual details, and some sort of challenge mechanism could be
> randomly selected by the master server to reduce the ease with which the
> numbers can be 'faked'?
>
> ... just rambling as I thought of potential ways around this ...

I'd prefer not to expose host names or IP addresses.  Hardware
information and OS version aren't really a problem if they can't be
traced to a host name.  The requirement to register an aggregation
server would be fine with me.  A challenge mechanism would be tricky
because it would have to occur during a push to the central server,
since connections back are not really possible.

-- Brooks
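
To make the aggregation idea above a little more concrete, here is a minimal sketch of what a local aggregation step might look like: per-host reports go in, host names and IP addresses are dropped, and only per-value counts come out. Everything here is an assumption for illustration. The field names (hostname, ip, cpu, os), the function names, and the summary layout are hypothetical and are not taken from BSDStats, and the registration and challenge steps discussed in the thread are not modelled.

```python
from collections import Counter

# Hypothetical names of the fields that identify a machine. The real
# BSDStats report format is not shown in this thread.
IDENTIFYING_FIELDS = {"hostname", "ip"}


def anonymize(report):
    """Return a copy of one host's report without identifying fields."""
    return {k: v for k, v in report.items() if k not in IDENTIFYING_FIELDS}


def aggregate(reports):
    """Count how many hosts reported each value of each remaining field.

    The result records totals only; it does not say which host
    contributed which value.
    """
    counts = {}
    for report in reports:
        for field, value in anonymize(report).items():
            counts.setdefault(field, Counter())[value] += 1
    return {
        "hosts": len(reports),
        "counts": {field: dict(c) for field, c in counts.items()},
    }


# Made-up example data using documentation-reserved names and addresses.
reports = [
    {"hostname": "node1.example.org", "ip": "192.0.2.1",
     "cpu": "Opteron 270", "os": "FreeBSD 6.1"},
    {"hostname": "node2.example.org", "ip": "192.0.2.2",
     "cpu": "Opteron 270", "os": "FreeBSD 6.1"},
    {"hostname": "node3.example.org", "ip": "192.0.2.3",
     "cpu": "Xeon 5160", "os": "FreeBSD 5.5"},
]

summary = aggregate(reports)
print(summary)
```

Only the summary would be pushed to the central server in this sketch; the individual reports stay on the aggregation host, which matches the split described in the quoted suggestion.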