From owner-freebsd-arch@FreeBSD.ORG Sun May 3 13:04:15 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9411114F for ; Sun, 3 May 2015 13:04:15 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 5D61B1B9C for ; Sun, 3 May 2015 13:04:14 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-241-118.lns20.per4.internode.on.net [121.45.241.118]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id t43D48jY020780 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Sun, 3 May 2015 06:04:12 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <55461CC3.2020306@freebsd.org> Date: Sun, 03 May 2015 21:04:03 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Mateusz Guzik , Bruce Evans CC: freebsd-arch@freebsd.org Subject: Re: [PATCH 1/2] Generalised support for copy-on-write structures shared by threads. References: <1430188443-19413-1-git-send-email-mjguzik@gmail.com> <1430188443-19413-2-git-send-email-mjguzik@gmail.com> <20150428181802.F1119@besplex.bde.org> <20150501165633.GA7112@dft-labs.eu> In-Reply-To: <20150501165633.GA7112@dft-labs.eu> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2015 13:04:15 -0000 On 5/2/15 12:56 AM, Mateusz Guzik wrote: > On Tue, Apr 28, 2015 at 06:45:01PM +1000, Bruce Evans wrote: >> On Tue, 28 Apr 2015, Mateusz Guzik wrote: >>> diff --git a/sys/sys/proc.h b/sys/sys/proc.h >>> index 64b99fc..f29d796 100644 >>> --- a/sys/sys/proc.h >>> +++ b/sys/sys/proc.h >>> @@ -225,6 +225,7 @@ struct thread { >>> /* Cleared during fork1() */ >>> #define td_startzero td_flags >>> int td_flags; /* (t) TDF_* flags. */ >>> + u_int td_cowgeneration;/* (k) Generation of COW pointers. */ >>> int td_inhibitors; /* (t) Why can not run. */ >>> int td_pflags; /* (k) Private thread (TDP_*) flags. */ >>> int td_dupfd; /* (k) Ret value from fdopen. XXX */ >> This name is so verbose that it messes up the comment indentation. >> > Yeah, that's crap, but the naming is already inconsistent and verbose. > For instance there is td_generation alrady. > > Is _cowgen variant ok? td_cowgen is much preferable with comment "(k) generation of c.o.w. pointers." > >>> @@ -830,6 +832,11 @@ extern pid_t pid_max; >>> KASSERT((p)->p_lock == 0, ("process held")); \ >>> } while (0) >>> >>> +#define PROC_UPDATE_COW(p) do { \ >>> + PROC_LOCK_ASSERT((p), MA_OWNED); \ >>> + p->p_cowgeneration++; \ >> Missing parentheses. > Oops, fixed. > >>> +} while (0) >>> + >>> /* Check whether a thread is safe to be swapped out. */ >>> #define thread_safetoswapout(td) ((td)->td_flags & TDF_CANSWAP) >>> >>> @@ -976,6 +983,10 @@ struct thread *thread_alloc(int pages); >>> int thread_alloc_stack(struct thread *, int pages); >>> void thread_exit(void) __dead2; >>> void thread_free(struct thread *td); >>> +void thread_get_cow_proc(struct thread *newtd, struct proc *p); >>> +void thread_get_cow(struct thread *newtd, struct thread *td); >>> +void thread_free_cow(struct thread *td); >>> +void thread_update_cow(struct thread *td); >> Insertion sort errors. >> >> Namespace errors. I don't like the style of naming things with objects >> first and verbs last, but it is good for sorting related objects. Here >> the verbs "get" and "free" are in the middle of the objects >> "thread_cow_proc" and "thread_cow". Also, shouldn't it be "thread_proc_cow" >> (but less verbose, maybe "tpcow"), not "thread_cow_proc", to indicate >> that the cow is hung of the proc? I didn't notice the details, but it >> makes no sense to hang a proc of a cow :-). >> > Well all current funcs are named thread_*, so tpcow and the like would > be inconsistent. > > On another look existence of thread_suspend_* suggests thread_cow_* > naming. > > With this putting _proc variant anywhere but at the end also breaks > consistency. 'thread_cow_from_proc' would increase verbosity. > > That said, I would say the patch below is ok enough. > > diff --git a/sys/amd64/amd64/trap.c b/sys/amd64/amd64/trap.c > index 193d207..cef3221 100644 > --- a/sys/amd64/amd64/trap.c > +++ b/sys/amd64/amd64/trap.c > @@ -257,8 +257,8 @@ trap(struct trapframe *frame) > td->td_pticks = 0; > td->td_frame = frame; > addr = frame->tf_rip; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > > switch (type) { > case T_PRIVINFLT: /* privileged instruction fault */ > diff --git a/sys/arm/arm/trap-v6.c b/sys/arm/arm/trap-v6.c > index abafa86..7463d3c 100644 > --- a/sys/arm/arm/trap-v6.c > +++ b/sys/arm/arm/trap-v6.c > @@ -394,8 +394,8 @@ abort_handler(struct trapframe *tf, int prefetch) > p = td->td_proc; > if (usermode) { > td->td_pticks = 0; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > } > > /* Invoke the appropriate handler, if necessary. */ > diff --git a/sys/arm/arm/trap.c b/sys/arm/arm/trap.c > index 0f142ce..d7fb73a 100644 > --- a/sys/arm/arm/trap.c > +++ b/sys/arm/arm/trap.c > @@ -214,8 +214,8 @@ abort_handler(struct trapframe *tf, int type) > if (user) { > td->td_pticks = 0; > td->td_frame = tf; > - if (td->td_ucred != td->td_proc->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != td->td_proc->p_cowgen) > + thread_cow_update(td); > > } > /* Grab the current pcb */ > @@ -644,8 +644,8 @@ prefetch_abort_handler(struct trapframe *tf) > > if (TRAP_USERMODE(tf)) { > td->td_frame = tf; > - if (td->td_ucred != td->td_proc->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != td->td_proc->p_cowgen) > + thread_cow_update(td); > } > fault_pc = tf->tf_pc; > if (td->td_md.md_spinlock_count == 0) { > diff --git a/sys/i386/i386/trap.c b/sys/i386/i386/trap.c > index d783a2b..b118e73 100644 > --- a/sys/i386/i386/trap.c > +++ b/sys/i386/i386/trap.c > @@ -306,8 +306,8 @@ trap(struct trapframe *frame) > td->td_pticks = 0; > td->td_frame = frame; > addr = frame->tf_eip; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > > switch (type) { > case T_PRIVINFLT: /* privileged instruction fault */ > diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c > index b77b788..e0042e9 100644 > --- a/sys/kern/init_main.c > +++ b/sys/kern/init_main.c > @@ -522,8 +522,6 @@ proc0_init(void *dummy __unused) > #ifdef MAC > mac_cred_create_swapper(newcred); > #endif > - td->td_ucred = crhold(newcred); > - > /* Create sigacts. */ > p->p_sigacts = sigacts_alloc(); > > @@ -555,6 +553,10 @@ proc0_init(void *dummy __unused) > p->p_limit->pl_rlimit[RLIMIT_MEMLOCK].rlim_max = pageablemem; > p->p_cpulimit = RLIM_INFINITY; > > + PROC_LOCK(p); > + thread_cow_get_proc(td, p); > + PROC_UNLOCK(p); > + > /* Initialize resource accounting structures. */ > racct_create(&p->p_racct); > > @@ -842,10 +844,10 @@ create_init(const void *udata __unused) > audit_cred_proc1(newcred); > #endif > proc_set_cred(initproc, newcred); > + cred_update_thread(FIRST_THREAD_IN_PROC(initproc)); > PROC_UNLOCK(initproc); > sx_xunlock(&proctree_lock); > crfree(oldcred); > - cred_update_thread(FIRST_THREAD_IN_PROC(initproc)); > cpu_set_fork_handler(FIRST_THREAD_IN_PROC(initproc), start_init, NULL); > } > SYSINIT(init, SI_SUB_CREATE_INIT, SI_ORDER_FIRST, create_init, NULL); > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index c3dd792..0dfecff 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -496,7 +496,6 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > p2->p_swtick = ticks; > if (p1->p_flag & P_PROFIL) > startprofclock(p2); > - td2->td_ucred = crhold(p2->p_ucred); > > if (flags & RFSIGSHARE) { > p2->p_sigacts = sigacts_hold(p1->p_sigacts); > @@ -526,6 +525,8 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > */ > lim_fork(p1, p2); > > + thread_cow_get_proc(td2, p2); > + > pstats_fork(p1->p_stats, p2->p_stats); > > PROC_UNLOCK(p1); > diff --git a/sys/kern/kern_kthread.c b/sys/kern/kern_kthread.c > index ee94de0..863bbc6 100644 > --- a/sys/kern/kern_kthread.c > +++ b/sys/kern/kern_kthread.c > @@ -289,7 +289,7 @@ kthread_add(void (*func)(void *), void *arg, struct proc *p, > cpu_set_fork_handler(newtd, func, arg); > > newtd->td_pflags |= TDP_KTHREAD; > - newtd->td_ucred = crhold(p->p_ucred); > + thread_cow_get_proc(newtd, p); > > /* this code almost the same as create_thread() in kern_thr.c */ > p->p_flag |= P_HADTHREADS; > diff --git a/sys/kern/kern_prot.c b/sys/kern/kern_prot.c > index 9c49f71..b531763 100644 > --- a/sys/kern/kern_prot.c > +++ b/sys/kern/kern_prot.c > @@ -1946,9 +1946,8 @@ cred_update_thread(struct thread *td) > > p = td->td_proc; > cred = td->td_ucred; > - PROC_LOCK(p); > + PROC_LOCK_ASSERT(p, MA_OWNED); > td->td_ucred = crhold(p->p_ucred); > - PROC_UNLOCK(p); > if (cred != NULL) > crfree(cred); > } > @@ -1987,6 +1986,8 @@ proc_set_cred(struct proc *p, struct ucred *newcred) > > oldcred = p->p_ucred; > p->p_ucred = newcred; > + if (newcred != NULL) > + PROC_UPDATE_COW(p); > return (oldcred); > } > > diff --git a/sys/kern/kern_syscalls.c b/sys/kern/kern_syscalls.c > index dada746..3d3df01 100644 > --- a/sys/kern/kern_syscalls.c > +++ b/sys/kern/kern_syscalls.c > @@ -31,6 +31,8 @@ __FBSDID("$FreeBSD$"); > #include > #include > #include > +#include > +#include > #include > #include > #include > diff --git a/sys/kern/kern_thr.c b/sys/kern/kern_thr.c > index 6911bb97..a53bd25 100644 > --- a/sys/kern/kern_thr.c > +++ b/sys/kern/kern_thr.c > @@ -228,13 +228,13 @@ create_thread(struct thread *td, mcontext_t *ctx, > bcopy(&td->td_startcopy, &newtd->td_startcopy, > __rangeof(struct thread, td_startcopy, td_endcopy)); > newtd->td_proc = td->td_proc; > - newtd->td_ucred = crhold(td->td_ucred); > + thread_cow_get(newtd, td); > > if (ctx != NULL) { /* old way to set user context */ > error = set_mcontext(newtd, ctx); > if (error != 0) { > + thread_cow_free(newtd); > thread_free(newtd); > - crfree(td->td_ucred); > goto fail; > } > } else { > @@ -246,8 +246,8 @@ create_thread(struct thread *td, mcontext_t *ctx, > /* Setup user TLS address and TLS pointer register. */ > error = cpu_set_user_tls(newtd, tls_base); > if (error != 0) { > + thread_cow_free(newtd); > thread_free(newtd); > - crfree(td->td_ucred); > goto fail; > } > } > diff --git a/sys/kern/kern_thread.c b/sys/kern/kern_thread.c > index 0a93dbd..063dfe9 100644 > --- a/sys/kern/kern_thread.c > +++ b/sys/kern/kern_thread.c > @@ -324,8 +324,7 @@ thread_reap(void) > mtx_unlock_spin(&zombie_lock); > while (td_first) { > td_next = TAILQ_NEXT(td_first, td_slpq); > - if (td_first->td_ucred) > - crfree(td_first->td_ucred); > + thread_cow_free(td_first); > thread_free(td_first); > td_first = td_next; > } > @@ -381,6 +380,44 @@ thread_free(struct thread *td) > uma_zfree(thread_zone, td); > } > > +void > +thread_cow_get_proc(struct thread *newtd, struct proc *p) > +{ > + > + PROC_LOCK_ASSERT(p, MA_OWNED); > + newtd->td_ucred = crhold(p->p_ucred); > + newtd->td_cowgen = p->p_cowgen; > +} > + > +void > +thread_cow_get(struct thread *newtd, struct thread *td) > +{ > + > + newtd->td_ucred = crhold(td->td_ucred); > + newtd->td_cowgen = td->td_cowgen; > +} > + > +void > +thread_cow_free(struct thread *td) > +{ > + > + if (td->td_ucred) > + crfree(td->td_ucred); > +} > + > +void > +thread_cow_update(struct thread *td) > +{ > + struct proc *p; > + > + p = td->td_proc; > + PROC_LOCK(p); > + if (td->td_ucred != p->p_ucred) > + cred_update_thread(td); > + td->td_cowgen = p->p_cowgen; > + PROC_UNLOCK(p); > +} > + > /* > * Discard the current thread and exit from its context. > * Always called with scheduler locked. > @@ -518,7 +555,7 @@ thread_wait(struct proc *p) > cpuset_rel(td->td_cpuset); > td->td_cpuset = NULL; > cpu_thread_clean(td); > - crfree(td->td_ucred); > + thread_cow_free(td); > thread_reap(); /* check for zombie threads etc. */ > } > > diff --git a/sys/kern/subr_syscall.c b/sys/kern/subr_syscall.c > index 1bf78b8..070ba28 100644 > --- a/sys/kern/subr_syscall.c > +++ b/sys/kern/subr_syscall.c > @@ -61,8 +61,8 @@ syscallenter(struct thread *td, struct syscall_args *sa) > p = td->td_proc; > > td->td_pticks = 0; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > if (p->p_flag & P_TRACED) { > traced = 1; > PROC_LOCK(p); > diff --git a/sys/kern/subr_trap.c b/sys/kern/subr_trap.c > index 93f7557..e5e55dd 100644 > --- a/sys/kern/subr_trap.c > +++ b/sys/kern/subr_trap.c > @@ -213,8 +213,8 @@ ast(struct trapframe *framep) > thread_unlock(td); > PCPU_INC(cnt.v_trap); > > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > if (td->td_pflags & TDP_OWEUPC && p->p_flag & P_PROFIL) { > addupc_task(td, td->td_profil_addr, td->td_profil_ticks); > td->td_profil_ticks = 0; > diff --git a/sys/powerpc/powerpc/trap.c b/sys/powerpc/powerpc/trap.c > index 0ceb170..bfbd94d 100644 > --- a/sys/powerpc/powerpc/trap.c > +++ b/sys/powerpc/powerpc/trap.c > @@ -196,8 +196,8 @@ trap(struct trapframe *frame) > if (user) { > td->td_pticks = 0; > td->td_frame = frame; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > > /* User Mode Traps */ > switch (type) { > diff --git a/sys/sparc64/sparc64/trap.c b/sys/sparc64/sparc64/trap.c > index b4f0e27..e9917e5 100644 > --- a/sys/sparc64/sparc64/trap.c > +++ b/sys/sparc64/sparc64/trap.c > @@ -277,8 +277,8 @@ trap(struct trapframe *tf) > td->td_pticks = 0; > td->td_frame = tf; > addr = tf->tf_tpc; > - if (td->td_ucred != p->p_ucred) > - cred_update_thread(td); > + if (td->td_cowgen != p->p_cowgen) > + thread_cow_update(td); > > switch (tf->tf_type) { > case T_DATA_MISS: > diff --git a/sys/sys/proc.h b/sys/sys/proc.h > index 64b99fc..5033957 100644 > --- a/sys/sys/proc.h > +++ b/sys/sys/proc.h > @@ -225,6 +225,7 @@ struct thread { > /* Cleared during fork1() */ > #define td_startzero td_flags > int td_flags; /* (t) TDF_* flags. */ > + u_int td_cowgen; /* (k) Generation of COW pointers. */ > int td_inhibitors; /* (t) Why can not run. */ > int td_pflags; /* (k) Private thread (TDP_*) flags. */ > int td_dupfd; /* (k) Ret value from fdopen. XXX */ > @@ -531,6 +532,7 @@ struct proc { > pid_t p_oppid; /* (c + e) Save ppid in ptrace. XXX */ > struct vmspace *p_vmspace; /* (b) Address space. */ > u_int p_swtick; /* (c) Tick when swapped in or out. */ > + u_int p_cowgen; /* (c) Generation of COW pointers. */ > struct itimerval p_realtimer; /* (c) Alarm timer. */ > struct rusage p_ru; /* (a) Exit information. */ > struct rusage_ext p_rux; /* (cu) Internal resource usage. */ > @@ -830,6 +832,11 @@ extern pid_t pid_max; > KASSERT((p)->p_lock == 0, ("process held")); \ > } while (0) > > +#define PROC_UPDATE_COW(p) do { \ > + PROC_LOCK_ASSERT((p), MA_OWNED); \ > + (p)->p_cowgen++; \ > +} while (0) > + > /* Check whether a thread is safe to be swapped out. */ > #define thread_safetoswapout(td) ((td)->td_flags & TDF_CANSWAP) > > @@ -974,6 +981,10 @@ void cpu_thread_swapin(struct thread *); > void cpu_thread_swapout(struct thread *); > struct thread *thread_alloc(int pages); > int thread_alloc_stack(struct thread *, int pages); > +void thread_cow_get_proc(struct thread *newtd, struct proc *p); > +void thread_cow_get(struct thread *newtd, struct thread *td); > +void thread_cow_free(struct thread *td); > +void thread_cow_update(struct thread *td); > void thread_exit(void) __dead2; > void thread_free(struct thread *td); > void thread_link(struct thread *td, struct proc *p); From owner-freebsd-arch@FreeBSD.ORG Sun May 3 14:50:04 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6DCDE662 for ; Sun, 3 May 2015 14:50:04 +0000 (UTC) Received: from mail.grem.de (outcast.grem.de [213.239.217.27]) by mx1.freebsd.org (Postfix) with SMTP id C974A149D for ; Sun, 3 May 2015 14:50:03 +0000 (UTC) Received: (qmail 91526 invoked by uid 89); 3 May 2015 14:49:55 -0000 Received: from unknown (HELO bsd64.grem.de) (mg@grem.de@185.17.207.96) by mail.grem.de with ESMTPA; 3 May 2015 14:49:55 -0000 Date: Sun, 3 May 2015 16:49:49 +0200 From: Michael Gmelin To: arch@freebsd.org Cc: Juli Mallett Subject: Further changes to smb(4) API Message-ID: <20150503164949.5095447e@bsd64.grem.de> X-Mailer: Claws Mail 3.11.1 (GTK+ 2.24.25; amd64-portbld-freebsd10.0) X-Face: $wrgCtfdVw_H9WAY?S&9+/F"!41z'L$uo*WzT8miX?kZ~W~Lr5W7v?j0Sde\mwB&/ypo^}> +a'4xMc^^KroE~+v^&^#[B">soBo1y6(TW6#UZiC]o>C6`ej+i Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAAJFBMVEWJBwe5BQDl LASZU0/LTEWEfHbyj0Txi32+sKrp1Mv944X8/fm1rS+cAAAACXBIWXMAAAsTAAAL EwEAmpwYAAAAB3RJTUUH3wESCxwC7OBhbgAAACFpVFh0Q29tbWVudAAAAAAAQ3Jl YXRlZCB3aXRoIFRoZSBHSU1QbbCXAAAAAghJREFUOMu11DFvEzEUAGCfEhBVFzuq AKkLd0O6VrIQsLXVSZXoWE5N1K3DobBBA9fQpRWc8OkWouaIjedWKiyREOKs+3PY fvalCNjgLVHeF7/3bMtBzV8C/VsQ8tecEgCcDgrzjekwKZ7TwsJZd/ywEKwwP+ZM 8P3drTsAwWn2mpWuDDuYiK1bFs6De0KUUFw0tWxm+D4AIhuuvZqtyWYeO7jQ4Aea 7jUqI+ixhQoHex4WshEvSXdood7stlv4oSuFOC4tqGcr0NjEqXgV4mMJO38nld4+ xKNxRDon7khyKVqY7YR4d+Cg0OMrkWXZOM7YDkEfKiilCn1qYv4mighZiynuHHOA Wq9QJq+BIES7lMFUtcikMnkDGHUoncA+uHgrP0ctIEqfwLHzeSo+eUA66AqzwN6n 2ZHJhw6Qh/PoyC/QENyEyC/AyNjq74Bs+3UH0xYwzDUC4B97HgLocg1QLYgDDO1v f3UX9Y307Ew4AHh67YAFFsxEpkXwpXY3eIgMhAAE3R19L919nNnuD2wlPcDE3UeT L2ytEICQib9BXgS2fU8PrD82ToYO1OEmMSnYTjSqSv9wdC0tPYC+rQRQD9ESnldF CyqfmiYW+tlALt8gH2xrMdC/youbjzPXEun+/ReXsMCDyve3dZc09fn2Oas8oXGc Jj6/fOeK5UmSMPmf/jL+GD8BEj0k/Fn6IO4AAAAASUVORK5CYII= MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2015 14:50:04 -0000 Porting some interesting features, r281985 also introduced a breaking ABI/API change to the smbus interface. As Juli pointed out, since this is not backwards compatible anyway, we might as well go the whole nine yards and try to make the interface as good as possible. There's still plenty of time to 11-RELEASE and I'm willing to incorporate further changes, assuming there is enough feedback. As there are only a few ports using this interface and they're all very similar in the way it is used, it would be really good to get some feedback from actual users of this interface in real world projects. Juli suggested as a first step to change rbuf and wbuf in smbcmd from char* to void* to avoid casting in case of SMB_READW and SMB_WRITEW. Based on what I've seen in ports using this API, they all seem to create wrapper functions around these ioctl commands anyway, so I'm curious if it wouldn't make sense to provide functions that wrap the smb interface in a library (either in base or as a port): sysutils/bsdhwmon: uint8_t read_byte(int fd, int slave, const char idxreg); void write_byte(int fd, int slave, const char idxreg, const char value); sysutils/consolehm: int ReadByte(u_char *return_value, int addr); int WriteByte(int addr, int value); sysutils/gkrellm2: static gint get_data(int iodev, u_char command, int interface, u_char *ret); sysutils/healthd: static int WriteByte(int addr,int value); static int ReadByte(int addr); sysutils/xmbmon: int smbioctl_readB(int slave, int addr); void smbioctl_writeB(int slave, int addr, int value); int smbioctl_readW(int smb_slave, int addr); void smbioctl_writeW(int slave, int addr, int value); Cheers, Michael -- Michael Gmelin From owner-freebsd-arch@FreeBSD.ORG Sun May 3 17:26:50 2015 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7B21F93B; Sun, 3 May 2015 17:26:50 +0000 (UTC) Received: from mail-ie0-x230.google.com (mail-ie0-x230.google.com [IPv6:2607:f8b0:4001:c03::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 46AF9134D; Sun, 3 May 2015 17:26:50 +0000 (UTC) Received: by iecrt8 with SMTP id rt8so116726262iec.0; Sun, 03 May 2015 10:26:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=ytRMXFZagxE6Kjvb8Lpl8UCd9pj56vzrZ+LZzJSQvH8=; b=rgifB/WCNoLVgx5BoAqkxcQ04pVe1OzpPJ1TozJPeLRovUzuS8RafWoM4510ia0Miq uwIs2cUrpIIWVHfxw9YUWnp6MvYF5OMQhaRtksFeLOCU4PZ7f2XmItI/CfPGagZUHtL1 e9V8UjLS9YkegsUcV/lAdfu+6e0g4iRjik0qegVTPvftxyHOr4QdCnM1qME//PSuM43J REkMCHx2SmzH6/vXEv2rPeL6dlEq3O9/7leGq+YigG4nGSFGRYJXMmvklCJLK7SQ3mpi hThL3to++FruRAUdI1VIy6NtEIYeHw+tcgI3nesCkWCSJ5tQRL6iEeGGyITxh2zg7tSl HUhA== MIME-Version: 1.0 X-Received: by 10.107.155.81 with SMTP id d78mr23443271ioe.29.1430674009614; Sun, 03 May 2015 10:26:49 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.36.38.133 with HTTP; Sun, 3 May 2015 10:26:49 -0700 (PDT) In-Reply-To: <20150503164949.5095447e@bsd64.grem.de> References: <20150503164949.5095447e@bsd64.grem.de> Date: Sun, 3 May 2015 10:26:49 -0700 X-Google-Sender-Auth: Vx2inGUwj73Z-uFpiwLJeCrlbn4 Message-ID: Subject: Re: Further changes to smb(4) API From: Adrian Chadd To: Michael Gmelin Cc: "freebsd-arch@freebsd.org" , Juli Mallett Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2015 17:26:50 -0000 Yes, please do create a library. -a On 3 May 2015 at 07:49, Michael Gmelin wrote: > Porting some interesting features, r281985 also introduced a breaking > ABI/API change to the smbus interface. As Juli pointed out, since this > is not backwards compatible anyway, we might as well go the whole nine > yards and try to make the interface as good as possible. There's still > plenty of time to 11-RELEASE and I'm willing to incorporate further > changes, assuming there is enough feedback. > > As there are only a few ports using this interface and they're all > very similar in the way it is used, it would be really good to get some > feedback from actual users of this interface in real world projects. > > Juli suggested as a first step to change rbuf and wbuf in smbcmd from > char* to void* to avoid casting in case of SMB_READW and SMB_WRITEW. > > Based on what I've seen in ports using this API, they all seem to > create wrapper functions around these ioctl commands anyway, so I'm > curious if it wouldn't make sense to provide functions that wrap > the smb interface in a library (either in base or as a port): > > sysutils/bsdhwmon: > uint8_t read_byte(int fd, int slave, const char idxreg); > void write_byte(int fd, int slave, const char idxreg, const char value); > > sysutils/consolehm: > int ReadByte(u_char *return_value, int addr); > int WriteByte(int addr, int value); > > sysutils/gkrellm2: > static gint > get_data(int iodev, u_char command, int interface, u_char *ret); > > sysutils/healthd: > static int WriteByte(int addr,int value); > static int ReadByte(int addr); > > sysutils/xmbmon: > int smbioctl_readB(int slave, int addr); > void smbioctl_writeB(int slave, int addr, int value); > int smbioctl_readW(int smb_slave, int addr); > void smbioctl_writeW(int slave, int addr, int value); > > Cheers, > Michael > > -- > Michael Gmelin > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Sun May 3 23:47:40 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 95C30A47 for ; Sun, 3 May 2015 23:47:40 +0000 (UTC) Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au [211.29.132.80]) by mx1.freebsd.org (Postfix) with ESMTP id 5B4E61B81 for ; Sun, 3 May 2015 23:47:40 +0000 (UTC) Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197]) by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id E798CD64EE7; Mon, 4 May 2015 09:47:36 +1000 (AEST) Date: Mon, 4 May 2015 09:47:36 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Mateusz Guzik cc: freebsd-arch@freebsd.org Subject: Re: [PATCH 1/2] Generalised support for copy-on-write structures shared by threads. In-Reply-To: <20150501165633.GA7112@dft-labs.eu> Message-ID: <20150504092712.X3339@besplex.bde.org> References: <1430188443-19413-1-git-send-email-mjguzik@gmail.com> <1430188443-19413-2-git-send-email-mjguzik@gmail.com> <20150428181802.F1119@besplex.bde.org> <20150501165633.GA7112@dft-labs.eu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=QeFf4Krv c=1 sm=1 tr=0 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=qFd6ZJhFGKGVsV3VXbMA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2015 23:47:40 -0000 On Fri, 1 May 2015, Mateusz Guzik wrote: > On Tue, Apr 28, 2015 at 06:45:01PM +1000, Bruce Evans wrote: >> On Tue, 28 Apr 2015, Mateusz Guzik wrote: >>> diff --git a/sys/sys/proc.h b/sys/sys/proc.h >>> index 64b99fc..f29d796 100644 >>> --- a/sys/sys/proc.h >>> +++ b/sys/sys/proc.h >>> @@ -225,6 +225,7 @@ struct thread { >>> /* Cleared during fork1() */ >>> #define td_startzero td_flags >>> int td_flags; /* (t) TDF_* flags. */ >>> + u_int td_cowgeneration;/* (k) Generation of COW pointers. */ >>> int td_inhibitors; /* (t) Why can not run. */ >>> int td_pflags; /* (k) Private thread (TDP_*) flags. */ >>> int td_dupfd; /* (k) Ret value from fdopen. XXX */ >> >> This name is so verbose that it messes up the comment indentation. > > Yeah, that's crap, but the naming is already inconsistent and verbose. > For instance there is td_generation alrady. td_generation is a much worse. It looks like it might be a generation counter for reused thread instances, but is actually just a thread preemption counter. > Is _cowgen variant ok? Yes. It is more verbose than _cowg, but easier to guess what it means. Everyone knows what a cow is, but not what a g is. However, since td_cowgen is not for threads generally, but just for a couple of things hung off threads, perhaps its name should give a hint about those things and not say anything about "cow". I didn't notice how you named these things. >>> @@ -830,6 +832,11 @@ extern pid_t pid_max; >>> KASSERT((p)->p_lock == 0, ("process held")); \ >>> } while (0) >>> >>> +#define PROC_UPDATE_COW(p) do { \ >>> + PROC_LOCK_ASSERT((p), MA_OWNED); \ >>> + p->p_cowgeneration++; \ >> >> Missing parentheses. > > Oops, fixed. Further naming problems. This macro doesn't update cow things, but just increases the generation count. > That said, I would say the patch below is ok enough. OK. Bruce From owner-freebsd-arch@FreeBSD.ORG Sun May 3 23:59:07 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4B9B4BB2; Sun, 3 May 2015 23:59:07 +0000 (UTC) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 0E0621C53; Sun, 3 May 2015 23:59:06 +0000 (UTC) Received: from c211-30-166-197.carlnfd1.nsw.optusnet.com.au (c211-30-166-197.carlnfd1.nsw.optusnet.com.au [211.30.166.197]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 1E55F424B6E; Mon, 4 May 2015 09:58:53 +1000 (AEST) Date: Mon, 4 May 2015 09:58:48 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Julian Elischer cc: Mateusz Guzik , Bruce Evans , freebsd-arch@freebsd.org Subject: Re: [PATCH 1/2] Generalised support for copy-on-write structures shared by threads. In-Reply-To: <55461CC3.2020306@freebsd.org> Message-ID: <20150504094742.C3339@besplex.bde.org> References: <1430188443-19413-1-git-send-email-mjguzik@gmail.com> <1430188443-19413-2-git-send-email-mjguzik@gmail.com> <20150428181802.F1119@besplex.bde.org> <20150501165633.GA7112@dft-labs.eu> <55461CC3.2020306@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=A5NVYcmG c=1 sm=1 tr=0 a=KA6XNC2GZCFrdESI5ZmdjQ==:117 a=PO7r1zJSAAAA:8 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=bUfv3vSTku50TrIb1NEA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 May 2015 23:59:07 -0000 On Sun, 3 May 2015, Julian Elischer wrote: > On 5/2/15 12:56 AM, Mateusz Guzik wrote: >> On Tue, Apr 28, 2015 at 06:45:01PM +1000, Bruce Evans wrote: >>> On Tue, 28 Apr 2015, Mateusz Guzik wrote: >>>> diff --git a/sys/sys/proc.h b/sys/sys/proc.h >>>> index 64b99fc..f29d796 100644 >>>> --- a/sys/sys/proc.h >>>> +++ b/sys/sys/proc.h >>>> @@ -225,6 +225,7 @@ struct thread { >>>> /* Cleared during fork1() */ >>>> #define td_startzero td_flags >>>> int td_flags; /* (t) TDF_* flags. */ >>>> + u_int td_cowgeneration;/* (k) Generation of COW pointers. >>>> */ >>>> int td_inhibitors; /* (t) Why can not run. */ >>>> int td_pflags; /* (k) Private thread (TDP_*) flags. >>>> */ >>>> int td_dupfd; /* (k) Ret value from fdopen. XXX */ >>> This name is so verbose that it messes up the comment indentation. >>> >> Yeah, that's crap, but the naming is already inconsistent and verbose. >> For instance there is td_generation alrady. >> >> Is _cowgen variant ok? > td_cowgen is much preferable with comment "(k) generation of c.o.w. > pointers." > + u_int td_cowgen; /* (k) generation of c.o.w. pointers. */ It is less preferable with such a comment. The change in the comment just adds 3 style bugs: - inconsistent capitalization. In this file, comments on struct members start with a capital letter - punctuation for COW. Acronyms are not punctuated. Perhaps this one should be in lower case (like vm is usually in lower case). - verboseness from the previous bug. It expands the line line to 80. But there is a problem with the comment. It doesn't do much more than echo the variable name with minor expansions and rearrangements. The only useful parts of it are "(k)" and "pointers". Bruce From owner-freebsd-arch@FreeBSD.ORG Mon May 4 05:03:57 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 91D02E8D for ; Mon, 4 May 2015 05:03:57 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 61B771958 for ; Mon, 4 May 2015 05:03:57 +0000 (UTC) Received: from Julian-MBP3.local (ppp121-45-241-118.lns20.per4.internode.on.net [121.45.241.118]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id t4453odk023594 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Sun, 3 May 2015 22:03:53 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <5546FDB0.30309@freebsd.org> Date: Mon, 04 May 2015 13:03:44 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Bruce Evans CC: Mateusz Guzik , freebsd-arch@freebsd.org Subject: Re: [PATCH 1/2] Generalised support for copy-on-write structures shared by threads. References: <1430188443-19413-1-git-send-email-mjguzik@gmail.com> <1430188443-19413-2-git-send-email-mjguzik@gmail.com> <20150428181802.F1119@besplex.bde.org> <20150501165633.GA7112@dft-labs.eu> <55461CC3.2020306@freebsd.org> <20150504094742.C3339@besplex.bde.org> In-Reply-To: <20150504094742.C3339@besplex.bde.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 05:03:57 -0000 On 5/4/15 7:58 AM, Bruce Evans wrote: > On Sun, 3 May 2015, Julian Elischer wrote: td_cowgen is much > preferable with comment "(k) generation of c.o.w. pointers." > >> + u_int td_cowgen; /* (k) generation of c.o.w. >> pointers. */ > > But there is a problem with the comment. It doesn't do much more than > echo the variable name with minor expansions and rearrangements. The > only useful parts of it are "(k)" and "pointers". cow is not immediately obvious to everyone. > > Bruce > > From owner-freebsd-arch@FreeBSD.ORG Mon May 4 08:24:37 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1CAFD95B; Mon, 4 May 2015 08:24:37 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9A81D1E45; Mon, 4 May 2015 08:24:36 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t448OQBj088050 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 4 May 2015 11:24:26 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t448OQBj088050 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t448OQRF088049; Mon, 4 May 2015 11:24:26 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 4 May 2015 11:24:26 +0300 From: Konstantin Belousov To: Gleb Smirnoff Cc: alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150504082426.GC2390@kib.kiev.ua> References: <20150430142408.GS546@nginx.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150430142408.GS546@nginx.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 08:24:37 -0000 On Thu, Apr 30, 2015 at 05:24:08PM +0300, Gleb Smirnoff wrote: > Hi! > > The reason to write down this patch emerges from the > projects/sendfile branch, where vm_pager_get_pages() is > used in the sendfile(2) system call. Although the new > sendfile works flawlessly, it makes some assumptions > about vnode_pager that theoretically may not be valid, > however always hold in our current code. > > Going deeper in the problem I have found more important > points, which yielded in the suggested patch. To start, > let me display the current KPI assumptions: > > 1) vm_pager_get_pages() works on an array of consequtive > array of pages. Pindex of (n+1)-th pages must be pindex > of n-th + 1. One page is special, it is called reqpage. > 2) vm_pager_get_pages() guarantees to swapin only the reqpage, > and may skip or fail other pages for different reasons, that > may vary from pager to pager. > 3) There also is function vm_pager_has_page(), which reports > availability of a page at given index in the pager, and also > provides hints on how many consequtive pages before this one > and after this one can be swapped in in single pager request. > Most pagers return zeros in these hints. The vnode pager for > UFS returns a strong promise, that one can later utilize in > vm_pager_get_pages(). > 4) All pages must be busied on enter. On exit only reqpage > will be left busied. The KPI doesn't guarantee that rest > of the pages is still in place. The pager usually calls > vm_page_readahead_finish() on them, which can either free, > or put the page on active/inactive queue, using quite > a strange approach to choose a queue. > 5) The pages must not be wired, since vm_page_free() may be > called on them. However, this is violated by several > consumers of KPI, relying on lack of errors in the pager. > Moreover, the swap pager has a special function to skip > wired pages, while doing the sweep, to avoid this problem. > So, passing wired pages to swapper is OK, while to the > reset is not. > 6) Pagers may replace a page in the object with a new one. > The sg_pager actually does that. To protect from this > event, consumers of vm_pager_get_pages() always run > vm_page_lookup() over the array of pages to relookup the pages. > However, not all consumers do this. > > Speaking of pagers and their consumers: > - 11 consumers request array of size 1, a single page > - 3 consumers actually request array > > My suggestion is to change the KPI assumptions to the following: > > 1) There is no reqpage. All pages are entered busied, all pages > are returned busied and validated. If pager fails to validate > all pages it must return error. > 2) The consumer (not the pager!) is to decide what to do with the > pages: vm_page_active, vm_page_deactivate, vm_page_flash or just > vm_page_free them. The consumer also unbusies pages, if it > wants to. The consumer is free to wire pages before the call. > 3) Consumers must first query the pager via vm_pager_has_page(), > and use the after/before hints to limit the size of the > requested pages array. > 4) In case if pager replaces pages, it must also update the array, > so that consumer doesn't need to do relookup. > > Doing this sweep, I also noticed that all pagers have a copy-pasted > code of zeroing invalid regions of partially valid pages. Also, > many pagers got a set of assertions copy and pasted from each > other. So, I decided to un-inline the vm_pager_get_pages(), bring > it to the vm_pager.c file and gather all these copy-pastes > into one place. > > The suggested patch is attached. As expected, it simplifies and > removes quite a lot of code. > > Right now it is tested on UFS only, testing NFS and ZFS is on my list. > There is one panic known, but it seems unrelated, and Peter pho@ says > that once it has been seen before. Below is the summary of my part of the internal discussion about the changes. Traditionally, Unix allows the filesystems to perform the short reads. Most fundamental change in the patch removes this freedom from the filesystem implementation, and I think that only local filesystems could be compliant with the proposed strictness. IMO, the response from vm_pager_haspages() is only advisory, since filesystem might not control the external entities which are the source of the required data. From owner-freebsd-arch@FreeBSD.ORG Mon May 4 09:11:48 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 778A5842; Mon, 4 May 2015 09:11:48 +0000 (UTC) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 04D7D1317; Mon, 4 May 2015 09:11:46 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.9/8.14.9) with ESMTP id t449BbCx042336 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 4 May 2015 12:11:37 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.9/8.14.9/Submit) id t449Bbgw042335; Mon, 4 May 2015 12:11:37 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 4 May 2015 12:11:37 +0300 From: Gleb Smirnoff To: Konstantin Belousov Cc: alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150504091137.GH34544@glebius.int.ru> References: <20150430142408.GS546@nginx.com> <20150504082426.GC2390@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150504082426.GC2390@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 09:11:48 -0000 On Mon, May 04, 2015 at 11:24:26AM +0300, Konstantin Belousov wrote: K> Below is the summary of my part of the internal discussion about the changes. Quite short. Is it truncated? K> Traditionally, Unix allows the filesystems to perform the short reads. K> Most fundamental change in the patch removes this freedom from the K> filesystem implementation, and I think that only local filesystems could K> be compliant with the proposed strictness. K> K> IMO, the response from vm_pager_haspages() is only advisory, since K> filesystem might not control the external entities which are the source K> of the required data. That's why remote filesystems use vop_stdbmap() (or similar), which always return zeroes for "after" and "before" hints. -- Totus tuus, Glebius. From owner-freebsd-arch@FreeBSD.ORG Mon May 4 09:50:51 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C57672DC; Mon, 4 May 2015 09:50:51 +0000 (UTC) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 8DABC1823; Mon, 4 May 2015 09:50:51 +0000 (UTC) Received: from Julian-MBP3.local (ppp121-45-241-118.lns20.per4.internode.on.net [121.45.241.118]) (authenticated bits=0) by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id t449ojR1024484 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO); Mon, 4 May 2015 02:50:48 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <554740EF.7030808@freebsd.org> Date: Mon, 04 May 2015 17:50:39 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Konstantin Belousov , Gleb Smirnoff CC: alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() References: <20150430142408.GS546@nginx.com> <20150504082426.GC2390@kib.kiev.ua> In-Reply-To: <20150504082426.GC2390@kib.kiev.ua> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 09:50:51 -0000 On 5/4/15 4:24 PM, Konstantin Belousov wrote: > On Thu, Apr 30, 2015 at 05:24:08PM +0300, Gleb Smirnoff wrote: >> Hi! >> >> The reason to write down this patch emerges from the >> projects/sendfile branch, where vm_pager_get_pages() is >> used in the sendfile(2) system call. Although the new >> sendfile works flawlessly, it makes some assumptions >> about vnode_pager that theoretically may not be valid, >> however always hold in our current code. >> >> Going deeper in the problem I have found more important >> points, which yielded in the suggested patch. To start, >> let me display the current KPI assumptions: >> >> 1) vm_pager_get_pages() works on an array of consequtive >> array of pages. Pindex of (n+1)-th pages must be pindex >> of n-th + 1. One page is special, it is called reqpage. >> 2) vm_pager_get_pages() guarantees to swapin only the reqpage, >> and may skip or fail other pages for different reasons, that >> may vary from pager to pager. >> 3) There also is function vm_pager_has_page(), which reports >> availability of a page at given index in the pager, and also >> provides hints on how many consequtive pages before this one >> and after this one can be swapped in in single pager request. >> Most pagers return zeros in these hints. The vnode pager for >> UFS returns a strong promise, that one can later utilize in >> vm_pager_get_pages(). >> 4) All pages must be busied on enter. On exit only reqpage >> will be left busied. The KPI doesn't guarantee that rest >> of the pages is still in place. The pager usually calls >> vm_page_readahead_finish() on them, which can either free, >> or put the page on active/inactive queue, using quite >> a strange approach to choose a queue. >> 5) The pages must not be wired, since vm_page_free() may be >> called on them. However, this is violated by several >> consumers of KPI, relying on lack of errors in the pager. >> Moreover, the swap pager has a special function to skip >> wired pages, while doing the sweep, to avoid this problem. >> So, passing wired pages to swapper is OK, while to the >> reset is not. >> 6) Pagers may replace a page in the object with a new one. >> The sg_pager actually does that. To protect from this >> event, consumers of vm_pager_get_pages() always run >> vm_page_lookup() over the array of pages to relookup the pages. >> However, not all consumers do this. >> >> Speaking of pagers and their consumers: >> - 11 consumers request array of size 1, a single page >> - 3 consumers actually request array >> >> My suggestion is to change the KPI assumptions to the following: >> >> 1) There is no reqpage. All pages are entered busied, all pages >> are returned busied and validated. If pager fails to validate >> all pages it must return error. >> 2) The consumer (not the pager!) is to decide what to do with the >> pages: vm_page_active, vm_page_deactivate, vm_page_flash or just >> vm_page_free them. The consumer also unbusies pages, if it >> wants to. The consumer is free to wire pages before the call. >> 3) Consumers must first query the pager via vm_pager_has_page(), >> and use the after/before hints to limit the size of the >> requested pages array. >> 4) In case if pager replaces pages, it must also update the array, >> so that consumer doesn't need to do relookup. >> >> Doing this sweep, I also noticed that all pagers have a copy-pasted >> code of zeroing invalid regions of partially valid pages. Also, >> many pagers got a set of assertions copy and pasted from each >> other. So, I decided to un-inline the vm_pager_get_pages(), bring >> it to the vm_pager.c file and gather all these copy-pastes >> into one place. >> >> The suggested patch is attached. As expected, it simplifies and >> removes quite a lot of code. >> >> Right now it is tested on UFS only, testing NFS and ZFS is on my list. >> There is one panic known, but it seems unrelated, and Peter pho@ says >> that once it has been seen before. > Below is the summary of my part of the internal discussion about the changes. > > Traditionally, Unix allows the filesystems to perform the short reads. > Most fundamental change in the patch removes this freedom from the > filesystem implementation, and I think that only local filesystems could > be compliant with the proposed strictness. > > IMO, the response from vm_pager_haspages() is only advisory, since > filesystem might not control the external entities which are the source > of the required data. Also since the backing object is not locked, a truncate() may be performed between the operations making the prior return information invalid. Certainly in remote filesystems, possibly on local ones too. > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Mon May 4 09:51:22 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 47137389; Mon, 4 May 2015 09:51:22 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B2B8A182D; Mon, 4 May 2015 09:51:21 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t449pGGD009361 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 4 May 2015 12:51:16 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t449pGGD009361 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t449pGhP009360; Mon, 4 May 2015 12:51:16 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 4 May 2015 12:51:16 +0300 From: Konstantin Belousov To: Gleb Smirnoff Cc: alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150504095116.GF2390@kib.kiev.ua> References: <20150430142408.GS546@nginx.com> <20150504082426.GC2390@kib.kiev.ua> <20150504091137.GH34544@glebius.int.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150504091137.GH34544@glebius.int.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 09:51:22 -0000 On Mon, May 04, 2015 at 12:11:37PM +0300, Gleb Smirnoff wrote: > On Mon, May 04, 2015 at 11:24:26AM +0300, Konstantin Belousov wrote: > K> Below is the summary of my part of the internal discussion about the changes. > > Quite short. Is it truncated? No. IMO, I pointed out the most important point about the patch. If other changes in the patch are unrelated, they must be extracted and discussed (and committed) separately. Due to the fundamental nature of the code being changed, the extra work to make it easier to bisect and detect regressions worth it. > -- > Totus tuus, Glebius. > > K> Traditionally, Unix allows the filesystems to perform the short reads. > K> Most fundamental change in the patch removes this freedom from the > K> filesystem implementation, and I think that only local filesystems could > K> be compliant with the proposed strictness. > K> > K> IMO, the response from vm_pager_haspages() is only advisory, since > K> filesystem might not control the external entities which are the source > K> of the required data. > > That's why remote filesystems use vop_stdbmap() (or similar), which > always return zeroes for "after" and "before" hints. Which precludes useful optimizations, at all, in the future. From owner-freebsd-arch@FreeBSD.ORG Mon May 4 09:55:03 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 76F71469; Mon, 4 May 2015 09:55:03 +0000 (UTC) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 04870183F; Mon, 4 May 2015 09:55:01 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.9/8.14.9) with ESMTP id t449sxej042498 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 4 May 2015 12:54:59 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.9/8.14.9/Submit) id t449sxvM042497; Mon, 4 May 2015 12:54:59 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 4 May 2015 12:54:59 +0300 From: Gleb Smirnoff To: Julian Elischer Cc: Konstantin Belousov , alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150504095459.GI34544@glebius.int.ru> References: <20150430142408.GS546@nginx.com> <20150504082426.GC2390@kib.kiev.ua> <554740EF.7030808@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <554740EF.7030808@freebsd.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 09:55:03 -0000 On Mon, May 04, 2015 at 05:50:39PM +0800, Julian Elischer wrote: J> > IMO, the response from vm_pager_haspages() is only advisory, since J> > filesystem might not control the external entities which are the source J> > of the required data. J> J> Also since the backing object is not locked, a truncate() may be performed J> between the operations making the prior return information invalid. J> Certainly in remote filesystems, possibly on local ones too. Of course, the object shouldn't be unlocked between vm_pager_haspage() and vm_pager_get_pages(). All current consumers do not unlock. -- Totus tuus, Glebius. From owner-freebsd-arch@FreeBSD.ORG Mon May 4 18:38:57 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CB1FAFE4; Mon, 4 May 2015 18:38:57 +0000 (UTC) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 553C813A0; Mon, 4 May 2015 18:38:56 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.9/8.14.9) with ESMTP id t44IcqK1044291 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 4 May 2015 21:38:52 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.9/8.14.9/Submit) id t44Icq2u044290; Mon, 4 May 2015 21:38:52 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Mon, 4 May 2015 21:38:52 +0300 From: Gleb Smirnoff To: Konstantin Belousov Cc: alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150504183852.GK34544@glebius.int.ru> References: <20150430142408.GS546@nginx.com> <20150504082426.GC2390@kib.kiev.ua> <20150504091137.GH34544@glebius.int.ru> <20150504095116.GF2390@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150504095116.GF2390@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 May 2015 18:38:57 -0000 On Mon, May 04, 2015 at 12:51:16PM +0300, Konstantin Belousov wrote: K> > On Mon, May 04, 2015 at 11:24:26AM +0300, Konstantin Belousov wrote: K> > K> Below is the summary of my part of the internal discussion about the changes. K> > K> > Quite short. Is it truncated? K> No. IMO, I pointed out the most important point about the patch. If K> other changes in the patch are unrelated, they must be extracted and K> discussed (and committed) separately. Due to the fundamental nature of K> the code being changed, the extra work to make it easier to bisect and K> detect regressions worth it. Of course, I'm not going to commit it as one. Some parts can be separated. K> > K> Traditionally, Unix allows the filesystems to perform the short reads. K> > K> Most fundamental change in the patch removes this freedom from the K> > K> filesystem implementation, and I think that only local filesystems could K> > K> be compliant with the proposed strictness. K> > K> K> > K> IMO, the response from vm_pager_haspages() is only advisory, since K> > K> filesystem might not control the external entities which are the source K> > K> of the required data. K> > K> > That's why remote filesystems use vop_stdbmap() (or similar), which K> > always return zeroes for "after" and "before" hints. K> K> Which precludes useful optimizations, at all, in the future. If in the future there appears a new consumer, who is fine with partial success, and if in the future there would appear a pager, that may fail to satisfy its own hints, then we will simply: 1) Relax assertions at the end of vm_pager_get_pages() and thus allow partial success. 2) Let the new consumer scan returned pages and check their 'valid' bits. 3) All current consumers, who prefer all or none result, would still look at the return value, as they do now. No change for them. So, I'd insist that patch doesn't preclude future optimizations. Instead, the current 'one page valid & busy, all pages unbusied' approach blocks different optimizations. The consumer usually has better idea on what to do with the pages. P.S. Meanwhile Peter pho@ reported 32 hour successful test run with the patch. -- Totus tuus, Glebius. From owner-freebsd-arch@FreeBSD.ORG Wed May 6 01:17:02 2015 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7955F12E for ; Wed, 6 May 2015 01:17:02 +0000 (UTC) Received: from mail-ig0-x22e.google.com (mail-ig0-x22e.google.com [IPv6:2607:f8b0:4001:c05::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 46FB21666 for ; Wed, 6 May 2015 01:17:02 +0000 (UTC) Received: by igblo3 with SMTP id lo3so3948101igb.0 for ; Tue, 05 May 2015 18:17:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=VpS0H9doZa6tLEwiN6ERUdcNlF5tPzS77GYg1j7Bvd0=; b=LFFgLby7TT4t9lxx7FRpxdkiRj7Bs6EgFF9nikfEQm2gXlRGZbJmN6l3viPikPB9sC CXp+ZZKEI7Al5XcAKM06M3p9ryDwp1P+2SV/vDVDHIWC17iSxkpu1kxGlhK47osiuGd/ c4OevHYLGqqu1N91IOBmTVcKBLUKQyWoJLUCuilsM2UfVbP5wIx6TfufvP1kG3hpcce8 lbqp2EvFe0IKF/AWX/OyMvOW7U25JnCDzRgtqwUWyLzvuMu9syzNEVCBnz/tsBh/e32N YEQ3Mv2UPUkirQRkQIA7adzVt8L7pUaQ1JLNYJ+2G+pfykDNtfAAHiNuo2pyfbN67KB4 h4ZA== MIME-Version: 1.0 X-Received: by 10.50.114.9 with SMTP id jc9mr5310892igb.49.1430875021684; Tue, 05 May 2015 18:17:01 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.36.38.133 with HTTP; Tue, 5 May 2015 18:17:01 -0700 (PDT) Date: Tue, 5 May 2015 18:17:01 -0700 X-Google-Sender-Auth: 9xHTrsro2zI1xemKMLsOgUzYrpE Message-ID: Subject: CFR: ACPI SLIT (domain locality) parsing From: Adrian Chadd To: "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 May 2015 01:17:02 -0000 Hi, I've put together a simple patch to parse the SLIT (domain locality) information from ACPI. https://reviews.freebsd.org/D2460 I'd like to push this into -HEAD soon so I can continue poking at other NUMA-y bits that I'd like to land in -HEAD. I'd appreciate comments/feedback. Thanks, -adrian From owner-freebsd-arch@FreeBSD.ORG Wed May 6 11:46:00 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3DF1B5FF; Wed, 6 May 2015 11:46:00 +0000 (UTC) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id B4E001A07; Wed, 6 May 2015 11:45:58 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.9/8.14.9) with ESMTP id t46BjnvF055828 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 6 May 2015 14:45:49 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.9/8.14.9/Submit) id t46BjnuC055827; Wed, 6 May 2015 14:45:49 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Wed, 6 May 2015 14:45:49 +0300 From: Gleb Smirnoff To: kib@FreeBSD.org, alc@FreeBSD.org Cc: arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150506114549.GS34544@FreeBSD.org> References: <20150430142408.GS546@nginx.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="xaMk4Io5JJdpkLEb" Content-Disposition: inline In-Reply-To: <20150430142408.GS546@nginx.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 May 2015 11:46:00 -0000 --xaMk4Io5JJdpkLEb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi! I'm splitting the patch into a serie. This is step 1: Pagers are responsible to update the array of pages in case if they replace pages in an object. All array entries must be valid, if pager returns VM_PAGER_OK. Note: the only pager that replaces pages is sg_pager, and it does that correctly. -- Totus tuus, Glebius. --xaMk4Io5JJdpkLEb Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="vm_pagers.step_1.diff" Index: sys/dev/drm2/i915/i915_gem.c =================================================================== --- sys/dev/drm2/i915/i915_gem.c (revision 282533) +++ sys/dev/drm2/i915/i915_gem.c (working copy) @@ -3175,9 +3175,6 @@ i915_gem_wire_page(vm_object_t object, vm_pindex_t if (m->valid != VM_PAGE_BITS_ALL) { if (vm_pager_has_page(object, pindex, NULL, NULL)) { rv = vm_pager_get_pages(object, &m, 1, 0); - m = vm_page_lookup(object, pindex); - if (m == NULL) - return (NULL); if (rv != VM_PAGER_OK) { vm_page_lock(m); vm_page_free(m); Index: sys/fs/tmpfs/tmpfs_subr.c =================================================================== --- sys/fs/tmpfs/tmpfs_subr.c (revision 282533) +++ sys/fs/tmpfs/tmpfs_subr.c (working copy) @@ -1320,7 +1320,7 @@ tmpfs_reg_resize(struct vnode *vp, off_t newsize, struct tmpfs_mount *tmp; struct tmpfs_node *node; vm_object_t uobj; - vm_page_t m, ma[1]; + vm_page_t m; vm_pindex_t idx, newpages, oldpages; off_t oldsize; int base, rv; @@ -1367,11 +1367,9 @@ retry: VM_WAIT; VM_OBJECT_WLOCK(uobj); goto retry; - } else if (m->valid != VM_PAGE_BITS_ALL) { - ma[0] = m; - rv = vm_pager_get_pages(uobj, ma, 1, 0); - m = vm_page_lookup(uobj, idx); - } else + } else if (m->valid != VM_PAGE_BITS_ALL) + rv = vm_pager_get_pages(uobj, &m, 1, 0); + else /* A cached page was reactivated. */ rv = VM_PAGER_OK; vm_page_lock(m); Index: sys/kern/kern_exec.c =================================================================== --- sys/kern/kern_exec.c (revision 282533) +++ sys/kern/kern_exec.c (working copy) @@ -955,13 +955,10 @@ exec_map_first_page(imgp) } initial_pagein = i; rv = vm_pager_get_pages(object, ma, initial_pagein, 0); - ma[0] = vm_page_lookup(object, 0); - if ((rv != VM_PAGER_OK) || (ma[0] == NULL)) { - if (ma[0] != NULL) { - vm_page_lock(ma[0]); - vm_page_free(ma[0]); - vm_page_unlock(ma[0]); - } + if (rv != VM_PAGER_OK) { + vm_page_lock(ma[0]); + vm_page_free(ma[0]); + vm_page_unlock(ma[0]); VM_OBJECT_WUNLOCK(object); return (EIO); } Index: sys/kern/uipc_shm.c =================================================================== --- sys/kern/uipc_shm.c (revision 282533) +++ sys/kern/uipc_shm.c (working copy) @@ -187,14 +187,6 @@ uiomove_object_page(vm_object_t obj, size_t len, s if (m->valid != VM_PAGE_BITS_ALL) { if (vm_pager_has_page(obj, idx, NULL, NULL)) { rv = vm_pager_get_pages(obj, &m, 1, 0); - m = vm_page_lookup(obj, idx); - if (m == NULL) { - printf( - "uiomove_object: vm_obj %p idx %jd null lookup rv %d\n", - obj, idx, rv); - VM_OBJECT_WUNLOCK(obj); - return (EIO); - } if (rv != VM_PAGER_OK) { printf( "uiomove_object: vm_obj %p idx %jd valid %x pager error %d\n", @@ -421,7 +413,7 @@ static int shm_dotruncate(struct shmfd *shmfd, off_t length) { vm_object_t object; - vm_page_t m, ma[1]; + vm_page_t m; vm_pindex_t idx, nobjsize; vm_ooffset_t delta; int base, rv; @@ -463,12 +455,10 @@ retry: VM_WAIT; VM_OBJECT_WLOCK(object); goto retry; - } else if (m->valid != VM_PAGE_BITS_ALL) { - ma[0] = m; - rv = vm_pager_get_pages(object, ma, 1, + } else if (m->valid != VM_PAGE_BITS_ALL) + rv = vm_pager_get_pages(object, &m, 1, 0); - m = vm_page_lookup(object, idx); - } else + else /* A cached page was reactivated. */ rv = VM_PAGER_OK; vm_page_lock(m); Index: sys/kern/uipc_syscalls.c =================================================================== --- sys/kern/uipc_syscalls.c (revision 282533) +++ sys/kern/uipc_syscalls.c (working copy) @@ -2026,10 +2026,7 @@ sendfile_readpage(vm_object_t obj, struct vnode *v if (vm_pager_has_page(obj, pindex, NULL, NULL)) { rv = vm_pager_get_pages(obj, &m, 1, 0); SFSTAT_INC(sf_iocnt); - m = vm_page_lookup(obj, pindex); - if (m == NULL) - error = EIO; - else if (rv != VM_PAGER_OK) { + if (rv != VM_PAGER_OK) { vm_page_lock(m); vm_page_free(m); vm_page_unlock(m); Index: sys/vm/vm_fault.c =================================================================== --- sys/vm/vm_fault.c (revision 282533) +++ sys/vm/vm_fault.c (working copy) @@ -680,18 +680,8 @@ vnode_locked: * Found the page. Leave it busy while we play * with it. */ - - /* - * Relookup in case pager changed page. Pager - * is responsible for disposition of old page - * if moved. - */ - fs.m = vm_page_lookup(fs.object, fs.pindex); - if (!fs.m) { - unlock_and_deallocate(&fs); - goto RetryFault; - } - + /* Pager could have changed the page. */ + fs.m = marray[reqpage]; hardfault++; break; /* break to PAGE HAS BEEN FOUND */ } Index: sys/vm/vm_glue.c =================================================================== --- sys/vm/vm_glue.c (revision 282533) +++ sys/vm/vm_glue.c (working copy) @@ -230,7 +230,7 @@ vsunlock(void *addr, size_t len) static vm_page_t vm_imgact_hold_page(vm_object_t object, vm_ooffset_t offset) { - vm_page_t m, ma[1]; + vm_page_t m; vm_pindex_t pindex; int rv; @@ -238,11 +238,7 @@ vm_imgact_hold_page(vm_object_t object, vm_ooffset pindex = OFF_TO_IDX(offset); m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL); if (m->valid != VM_PAGE_BITS_ALL) { - ma[0] = m; - rv = vm_pager_get_pages(object, ma, 1, 0); - m = vm_page_lookup(object, pindex); - if (m == NULL) - goto out; + rv = vm_pager_get_pages(object, &m, 1, 0); if (rv != VM_PAGER_OK) { vm_page_lock(m); vm_page_free(m); @@ -571,7 +567,7 @@ vm_thread_swapin(struct thread *td) { vm_object_t ksobj; vm_page_t ma[KSTACK_MAX_PAGES]; - int i, j, k, pages, rv; + int i, j, pages, rv; pages = td->td_kstack_pages; ksobj = td->td_kstack_obj; @@ -594,8 +590,6 @@ vm_thread_swapin(struct thread *td) panic("vm_thread_swapin: cannot get kstack for proc: %d", td->td_proc->p_pid); vm_object_pip_wakeup(ksobj); - for (k = i; k < j; k++) - ma[k] = vm_page_lookup(ksobj, k); vm_page_xunbusy(ma[i]); } else if (vm_page_xbusied(ma[i])) vm_page_xunbusy(ma[i]); Index: sys/vm/vm_object.c =================================================================== --- sys/vm/vm_object.c (revision 282533) +++ sys/vm/vm_object.c (working copy) @@ -2042,7 +2042,7 @@ vm_object_page_cache(vm_object_t object, vm_pindex boolean_t vm_object_populate(vm_object_t object, vm_pindex_t start, vm_pindex_t end) { - vm_page_t m, ma[1]; + vm_page_t m; vm_pindex_t pindex; int rv; @@ -2050,11 +2050,7 @@ vm_object_populate(vm_object_t object, vm_pindex_t for (pindex = start; pindex < end; pindex++) { m = vm_page_grab(object, pindex, VM_ALLOC_NORMAL); if (m->valid != VM_PAGE_BITS_ALL) { - ma[0] = m; - rv = vm_pager_get_pages(object, ma, 1, 0); - m = vm_page_lookup(object, pindex); - if (m == NULL) - break; + rv = vm_pager_get_pages(object, &m, 1, 0); if (rv != VM_PAGER_OK) { vm_page_lock(m); vm_page_free(m); --xaMk4Io5JJdpkLEb-- From owner-freebsd-arch@FreeBSD.ORG Wed May 6 16:59:55 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E6A8F8B4; Wed, 6 May 2015 16:59:55 +0000 (UTC) Received: from pp2.rice.edu (proofpoint2.mail.rice.edu [128.42.201.101]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 955391FF0; Wed, 6 May 2015 16:59:55 +0000 (UTC) Received: from pps.filterd (pp2.rice.edu [127.0.0.1]) by pp2.rice.edu (8.14.5/8.14.5) with SMTP id t46GtoU9031070; Wed, 6 May 2015 11:59:53 -0500 Received: from mh3.mail.rice.edu (mh3.mail.rice.edu [128.42.199.10]) by pp2.rice.edu with ESMTP id 1u75yagdr1-1; Wed, 06 May 2015 11:59:53 -0500 X-Virus-Scanned: by amavis-2.7.0 at mh3.mail.rice.edu, auth channel Received: from 108-254-203-201.lightspeed.hstntx.sbcglobal.net (108-254-203-201.lightspeed.hstntx.sbcglobal.net [108.254.203.201]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: alc) by mh3.mail.rice.edu (Postfix) with ESMTPSA id 82A26403F6; Wed, 6 May 2015 11:59:52 -0500 (CDT) Message-ID: <554A4887.9080308@rice.edu> Date: Wed, 06 May 2015 11:59:51 -0500 From: Alan Cox User-Agent: Mozilla/5.0 (X11; FreeBSD i386; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Gleb Smirnoff , kib@FreeBSD.org, alc@FreeBSD.org CC: arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() References: <20150430142408.GS546@nginx.com> <20150506114549.GS34544@FreeBSD.org> In-Reply-To: <20150506114549.GS34544@FreeBSD.org> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 kscore.is_bulkscore=2.43328298512036e-05 kscore.compositescore=0 circleOfTrustscore=0 compositescore=0.714301145562988 suspectscore=3 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0 kscore.is_spamscore=0 rbsscore=0.714301145562988 recipient_to_sender_totalscore=0 recipient_domain_to_sender_domain_totalscore=0 spamscore=0 recipient_to_sender_domain_totalscore=0 urlsuspectscore=0.714301145562988 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1402240000 definitions=main-1505060201 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 May 2015 16:59:56 -0000 On 05/06/2015 06:45, Gleb Smirnoff wrote: > Hi! > > I'm splitting the patch into a serie. This is step 1: > > Pagers are responsible to update the array of pages in > case if they replace pages in an object. All array entries > must be valid, if pager returns VM_PAGER_OK. > > Note: the only pager that replaces pages is sg_pager, and > it does that correctly. > No, look again, e.g., device_pager. From owner-freebsd-arch@FreeBSD.ORG Wed May 6 17:16:17 2015 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2E080F44; Wed, 6 May 2015 17:16:17 +0000 (UTC) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "cell.glebius.int.ru", Issuer "cell.glebius.int.ru" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 8AE9D1257; Wed, 6 May 2015 17:16:15 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.9/8.14.9) with ESMTP id t46HGBDN057648 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 6 May 2015 20:16:11 +0300 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.9/8.14.9/Submit) id t46HGBFS057647; Wed, 6 May 2015 20:16:11 +0300 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Wed, 6 May 2015 20:16:11 +0300 From: Gleb Smirnoff To: Alan Cox Cc: kib@FreeBSD.org, alc@FreeBSD.org, arch@FreeBSD.org Subject: Re: more strict KPI for vm_pager_get_pages() Message-ID: <20150506171611.GF56147@glebius.int.ru> References: <20150430142408.GS546@nginx.com> <20150506114549.GS34544@FreeBSD.org> <554A4887.9080308@rice.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <554A4887.9080308@rice.edu> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 May 2015 17:16:17 -0000 On Wed, May 06, 2015 at 11:59:51AM -0500, Alan Cox wrote: A> > Pagers are responsible to update the array of pages in A> > case if they replace pages in an object. All array entries A> > must be valid, if pager returns VM_PAGER_OK. A> > A> > Note: the only pager that replaces pages is sg_pager, and A> > it does that correctly. A> > A> A> No, look again, e.g., device_pager. Yes, now I see it, it goes trough cdev op. But it appears also to work correctly - it updates the page in the array. -- Totus tuus, Glebius.