From owner-freebsd-virtualization@FreeBSD.ORG Sun Jul 24 08:31:14 2011 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0083C106564A; Sun, 24 Jul 2011 08:31:14 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-fx0-f44.google.com (mail-fx0-f44.google.com [209.85.161.44]) by mx1.freebsd.org (Postfix) with ESMTP id 033D58FC14; Sun, 24 Jul 2011 08:31:12 +0000 (UTC) Received: by fxe6 with SMTP id 6so5589375fxe.17 for ; Sun, 24 Jul 2011 01:31:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:sender:date:message-id:user-agent:mime-version :content-type; bh=bKYjvR428Pvdzq8ORs717BgyxhsAekqaCQ3hcz4XYgg=; b=fZDtdnVYRFwEShSlWVnxeKAdpZ96g4mNsu2K6yFKK4+Nk4G/9+iVzFFFw5ewW6/uTn Hj1A4yag7/LjlpJlfM/Q03EilU4p8h30Ja3tBgpW8gzS+R6QGa4KW8OHmwx2DLROn/Rk 2sNfGX9sLsz2zkbdJr7Gi+NxQKDmYQhiMBHyE= Received: by 10.223.21.220 with SMTP id k28mr4847405fab.63.1311494545964; Sun, 24 Jul 2011 01:02:25 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id a24sm3049960fak.36.2011.07.24.01.02.23 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 24 Jul 2011 01:02:24 -0700 (PDT) From: Mikolaj Golub To: freebsd-virtualization@freebsd.org Sender: Mikolaj Golub Date: Sun, 24 Jul 2011 11:02:21 +0300 Message-ID: <86r55ghzsi.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: Kostik Belousov , Marko Zec , "Bjoern A. Zeeb" , Robert Watson , Pawel Jakub Dawidek Subject: FIX: accessing module's virtualized global variables from another module X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 24 Jul 2011 08:31:14 -0000 --=-=-= Hi, Some time ago I reported about this problem (the thread "vnet: acessing module's virtualized global variables from another module"). I got a reply only from Marko but failed to convinced him that it was not issue with curvnet context being not set properly. I will try to explain the issue again, as this time I come with a patch that might be a solution. Below are some things that may be obvious for people who are involved in the subject. Please skip to the last paragraphs then :-). So, kernel and modules have 'set_vnet' linker set and the virtualized global variables (defined via VNET_DEFINE() macro) are placed in this set. This serves two purposes: 1) When a new virtual network stack instance is allocated it is initialized copying the globals from kernel 'set_vnet' linker set. 2) The variable name acts as unique global name by which the variable can be referred to. The location of a per-virtual instance variable is calculated at run-time using accessor macros (VNET_VNET, VNET, ...), like in the example below for ifnet variable in the vnet0: vnet0->vnet_data_base + (uintptr_t) & vnet_entry_ifnet When a vnet is created its vnet_data_base is set accordingly to make this work. This works for virtualized variables from kernel 'set_vnet' linker set, but modules need additional work. When a module is loaded its global variables from 'set_vnet' linker set are copied to the kernel 'set_vnet', which has additional space ('modspace') reserved for this purpose. This makes (1) work. And to make the formula (2) valid for the module's virtual global the linker reallocates all its references according to this formula: (x - ef->vnet_start) + ef->vnet_base (see kern/link_elf.c:elf_relocaddr()). Here is an example. In the ipfw module layer3_chain virtual global is defined: kopusha:~% nm /boot/kernel/ipfw.ko |grep -i 'set_vnet\|layer3_chain' 00010820 A __start_set_vnet 00010b0c A __stop_set_vnet 00010840 d vnet_entry_layer3_chain So it has relative address 0x00010840. The module is loaded at 0x9337f000: kopusha:~% kldstat|grep ipfw.ko 22 2 0x9337f000 11000 ipfw.ko and vnet_entry_layer3_chain is at 0x9337f000 + 0x00010840 = 0x9338f840: (kgdb) p &vnet_entry_layer3_chain $1 = (struct ip_fw_chain *) 0x9338f840 But inside ipfw.ko, the linker relocated the 0x9338f840 references to 0x86dac580, otherwise VNET macros will access wrong location: (kgdb) p *(struct ip_fw_chain *)(vnet0->vnet_data_base + (uintptr_t) &vnet_entry_layer3_chain) Error accessing memory address 0x9908b2c0: Bad address. (kgdb) p *(struct ip_fw_chain *)(vnet0->vnet_data_base + (uintptr_t) 0x86dac580) $2 = {rules = 0x0, reap = 0x872d296c, default_rule = 0x937bdbd0, n_rules = -1823040548, ... So far so good. But what if the global defined in one module is accessed from another module (as it is in the case of layer3_chain, which is accessed from ipfw_nat)? The linker relocates references only for "local" globals (defined in its own module), so inside ipfw_nat the global is accessed using unmodified vnet_entry_layer3_chain location, thus by wrong address. The fix could be to make the linker on a module load recognize "external" virtual globals defined in previously loaded modules and relocate them accordingly. For this we need set_vnet list, where the addresses of modules 'set_vnet' linker sets are stored in. The attach patch implements this and works for me. Also, could someone explain what is kern/link_elf_obj.c for? The patch does not deal with this file, but I see the reloactions here similar to those in link_elf.c. So I might need to do similar modifiactions in link_elf_obj.c too, but I don't know where this code is used (it looks like it is not executed on my box, so I don't know how to test it). -- Mikolaj Golub --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=vnet.vnet_set.patch Index: sys/net/vnet.c =================================================================== --- sys/net/vnet.c (revision 224286) +++ sys/net/vnet.c (working copy) @@ -152,6 +152,11 @@ struct vnet *vnet0; * initialized, in order to propagate the new static initializations to all * existing virtual network stack instances so that the soon-to-be executing * module will find every network stack instance with proper default values. + * + * The addresses of modules 'set_vnet' linker sets are stored in + * vnet_set list. The list is used by the kernel module linker to bind + * references of virtualized global variables defined in another + * modules. */ /* @@ -302,6 +307,102 @@ vnet_destroy(struct vnet *vnet) } /* + * set_vnet list. + */ +MALLOC_DEFINE(M_SET_VNET, "set_vnet", "VNET data"); + +struct set_vnet { + uintptr_t vs_start; + uintptr_t vs_stop; + uintptr_t vs_base; + TAILQ_ENTRY(set_vnet) vs_link; +}; + +static TAILQ_HEAD(set_vnet_head, set_vnet) set_vnet_list; + +static void +set_vnet_insert(struct set_vnet *set) +{ + struct set_vnet *iter; + + TAILQ_FOREACH(iter, &set_vnet_list, vs_link) { + + KASSERT((set->vs_start < iter->vs_start && set->vs_stop < iter->vs_stop) || + (set->vs_start > iter->vs_start && set->vs_stop > iter->vs_stop), + ("vnet sets intersection: to insert: 0x%jx-0x%jx; inserted: 0x%jx-0x%jx", + (uintmax_t)set->vs_start, (uintmax_t)set->vs_stop, + (uintmax_t)iter->vs_start, (uintmax_t)iter->vs_stop)); + + if (iter->vs_start > set->vs_start) { + TAILQ_INSERT_BEFORE(iter, set, vs_link); + break; + } + } + + if (iter == NULL) + TAILQ_INSERT_TAIL(&set_vnet_list, set, vs_link); +} + +static void +set_vnet_remove(struct set_vnet *set) +{ + TAILQ_REMOVE(&set_vnet_list, set, vs_link); +} + +static struct set_vnet * +set_vnet_search(uintptr_t addr) +{ + struct set_vnet *set; + + TAILQ_FOREACH(set, &set_vnet_list, vs_link) { + if (addr < set->vs_start) + return (NULL); + if (addr < set->vs_stop) + return (set); + } + return (NULL); +} + +void +set_vnet_add(uintptr_t start, uintptr_t stop, uintptr_t base) +{ + struct set_vnet *set; + + set = malloc(sizeof(*set), M_SET_VNET, M_WAITOK | M_ZERO); + set->vs_start = start; + set->vs_stop = stop; + set->vs_base = base; + set_vnet_insert(set); +} + +void +set_vnet_delete(uintptr_t start) +{ + struct set_vnet *set; + + set = set_vnet_search(start); + KASSERT(set != NULL, ("deleting unknown vnet set (start = 0x%jx)", + (uintmax_t)start)); + set_vnet_remove(set); + free(set, M_SET_VNET); +} + +int +set_vnet_find(uintptr_t addr, uintptr_t *start, uintptr_t *base) +{ + struct set_vnet *set; + + set = set_vnet_search(addr); + if (set != NULL) { + *start = set->vs_start; + *base = set->vs_base; + return (1); + } else { + return (0); + } +} + +/* * Boot time initialization and allocation of virtual network stacks. */ static void @@ -312,6 +413,7 @@ vnet_init_prelink(void *arg) sx_init(&vnet_sxlock, "vnet_sxlock"); sx_init(&vnet_sysinit_sxlock, "vnet_sysinit_sxlock"); LIST_INIT(&vnet_head); + TAILQ_INIT(&set_vnet_list); } SYSINIT(vnet_init_prelink, SI_SUB_VNET_PRELINK, SI_ORDER_FIRST, vnet_init_prelink, NULL); Index: sys/net/vnet.h =================================================================== --- sys/net/vnet.h (revision 224286) +++ sys/net/vnet.h (working copy) @@ -109,6 +109,20 @@ struct vnet *vnet_alloc(void); void vnet_destroy(struct vnet *vnet); /* + * Functions to manage vnet_set list. + */ +void vnet_set_add(uintptr_t start, uintptr_t end, uintptr_t base); +void vnet_set_delete(uintptr_t start); +int vnet_set_find(uintptr_t addr, uintptr_t *start, uintptr_t *base); + +/* + * Functions to manage set_vnet list. + */ +void set_vnet_add(uintptr_t start, uintptr_t end, uintptr_t base); +void set_vnet_delete(uintptr_t start); +int set_vnet_find(uintptr_t addr, uintptr_t *start, uintptr_t *base); + +/* * The current virtual network stack -- we may wish to move this to struct * pcpu in the future. */ Index: sys/kern/link_elf.c =================================================================== --- sys/kern/link_elf.c (revision 224286) +++ sys/kern/link_elf.c (working copy) @@ -544,6 +544,7 @@ parse_vnet(elf_file_t ef) return (ENOSPC); memcpy((void *)ef->vnet_base, (void *)ef->vnet_start, count); vnet_data_copy((void *)ef->vnet_base, count); + set_vnet_add(ef->vnet_start, ef->vnet_stop, ef->vnet_base); return (0); } @@ -1000,6 +1001,7 @@ link_elf_unload_file(linker_file_t file) if (ef->vnet_base != 0) { vnet_data_free((void *)ef->vnet_base, ef->vnet_stop - ef->vnet_start); + set_vnet_delete(ef->vnet_start); } #endif #ifdef GDB @@ -1438,6 +1440,10 @@ elf_lookup(linker_file_t lf, Elf_Size symidx, int elf_file_t ef = (elf_file_t)lf; const Elf_Sym *sym; const char *symbol; + Elf_Addr addr; +#ifdef VIMAGE + uintptr_t vnet_start, vnet_base; +#endif /* Don't even try to lookup the symbol if the index is bogus. */ if (symidx >= ef->nchains) @@ -1469,7 +1475,13 @@ elf_lookup(linker_file_t lf, Elf_Size symidx, int if (*symbol == 0) return (0); - return ((Elf_Addr)linker_file_lookup_symbol(lf, symbol, deps)); + addr = ((Elf_Addr)linker_file_lookup_symbol(lf, symbol, deps)); + +#ifdef VIMAGE + if (set_vnet_find(addr, &vnet_start, &vnet_base)) + addr = addr - vnet_start + vnet_base; +#endif + return addr; } static void --=-=-=--