Date: Mon, 07 Apr 2008 20:46:17 -0300
From: João Carlos Mendes Luís <jonny@jonny.eng.br>
To: pav@FreeBSD.org
Cc: hubs@FreeBSD.org
Subject: Re: package distribution crisis - CDN needed
Message-ID: <47FAB249.2020001@jonny.eng.br>
In-Reply-To: <1207605059.1031.38.camel@ikaros.oook.cz>
References: <1207605059.1031.38.camel@ikaros.oook.cz>
Pav Lucistnik wrote:
> Okay the situation recently was that the mirrors had no chance keeping
> up with all the package sets I've been uploading to ftp-master.
>
> We clearly need to move beyond rsync/cvsup synced ftp mirrors. This does
> not scale.
>
> I do propose a creation of a CDN (Content Delivery Network), having
> these features:
>
> - no mirroring of a complete package set! (Also no directory listings.)
>   When client requests the file, and the file is not in the local cache,
>   the file is downloaded from the upstream server and while it's being
>   obtained, it's already being sent to the client. This is basically
>   squid.
>
> - if the file is present in the local cache, it's returned from local
>   cache.
>
> - local cache is invalidated when a new package set is available on
>   an upstream server. Invalidating mechanism:
>     option a) cronjob that polls upstream server every 5 minutes for a
>               file that gives current package set IDs (pull)
>     option b) master server sends notification to all mirrors to
>               invalidate a package set (push)
>   optimization: when package set was invalidated, don't delete old
>   files, instead on next hit, verify timestamp against upstream server
>
> - atomic package set uploads to master from pointyhat (probably having
>   two directories that are switched over on master)
>
> - everything runs over http
>
> - default source of files for "pkg_add -r" command
>
> The goal is to refresh a package set on a daily basis.
>
>
> I don't know if we can use some existing software for this (Squid?
> Apache mod_proxy?) or if we will need to put something new together.
> Ideas?
>

I am not sure this would solve anything, but if we go further in this direction, I'd like to see an architecture with prefetch capability. Note also that a real CDN would hide the real data location from the end user; that location would be selected by some sort of proximity and/or load information.
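To make the prefetch idea concrete, here is a minimal sketch in Python of a pull-through cache combined with the poll-based invalidation quoted above (option a). Everything in it is an assumption made for illustration: the `PackageCache` class, its `fetch` and `current_set_id` callables, and the `upstream_hits` counter are hypothetical names, not part of any existing FreeBSD tool. It also simplifies the proposal: it drops the cache outright when the set ID changes instead of revalidating timestamps on the next hit, and it does not stream a file to the client while it is still downloading.

```python
from typing import Callable, Dict, Iterable


class PackageCache:
    """Pull-through cache for one package set (illustrative sketch only)."""

    def __init__(self,
                 fetch: Callable[[str], bytes],
                 current_set_id: Callable[[], str]) -> None:
        # fetch(name) is assumed to return the file body from upstream.
        # current_set_id() is assumed to return the upstream package set ID.
        self._fetch = fetch
        self._current_set_id = current_set_id
        self._set_id = current_set_id()
        self._files: Dict[str, bytes] = {}
        self.upstream_hits = 0  # number of files actually pulled from upstream

    def poll(self) -> bool:
        """Pull-style invalidation: drop the cache if the set ID changed."""
        latest = self._current_set_id()
        if latest == self._set_id:
            return False
        self._set_id = latest
        self._files.clear()
        return True

    def get(self, name: str) -> bytes:
        """Serve from the local cache, fetching from upstream on a miss."""
        if name not in self._files:
            self._files[name] = self._fetch(name)
            self.upstream_hits += 1
        return self._files[name]

    def prefetch(self, names: Iterable[str]) -> None:
        """Warm the cache ahead of client requests, e.g. right after poll()."""
        for name in names:
            self.get(name)
```

A mirror could call `poll()` from a periodic job and, whenever it returns `True`, call `prefetch()` with a list of popular packages, so that the first clients after a new package set do not all wait on the upstream server.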
Some CDNs do use a proxy cache in front of a central server as a means of populating their own data, but proxy caching is only a small part of the solution.

I did not follow whatever situation happened recently, but in the past I had some trouble with late announcements to mirror administrators. Sometimes I received the announcement at the same time as any other FreeBSD user. And even in those cases, packages were distributed well before the final release.

Jonny

-- 
João Carlos Mendes Luís - Networking Engineer - jonny@jonny.eng.br