Date: Thu, 30 Jul 2009 09:30:29 GMT
From: Sylvestre Gallon <syl@FreeBSD.org>
To: Perforce Change Reviews <perforce@FreeBSD.org>
Subject: PERFORCE change 166781 for review
Message-ID: <200907300930.n6U9UTGj029859@repoman.freebsd.org>
http://perforce.freebsd.org/chv.cgi?CH=166781

Change 166781 by syl@syl_twoflowers on 2009/07/30 09:30:23

	Add more tests.
	Fix typo and english.

Affected files ...

.. //depot/projects/soc2009/syl_usb/hps_report/DifferentAPI.tex#3 edit
.. //depot/projects/soc2009/syl_usb/hps_report/GenericCode.tex#4 edit
.. //depot/projects/soc2009/syl_usb/hps_report/Proposition.tex#3 edit
.. //depot/projects/soc2009/syl_usb/hps_report/SomeTests.tex#3 edit
.. //depot/projects/soc2009/syl_usb/hps_report/report.tex#3 edit

Differences ...

==== //depot/projects/soc2009/syl_usb/hps_report/DifferentAPI.tex#3 (text+ko) ====

@@ -1,11 +1,11 @@
 \chapter{Differents APIs}
-Only three portable Operating System have a USB Function Stack.
+Only three portable Operating Systems have a USB Function Stack.
 In this chapter we will focus on the different API for Adding a new Device Controler Interface on Linux, Windows CE and FreeBSD.
 \section{Linux}
-	A Linux UDC (USB DEVICE CONTROLLER) is compound of 3 main structure.
+	A Linux UDC (USB DEVICE CONTROLLER) is compounded of 3 main structures.
 The first one is usb\_gadget :
 \begin{lstlisting}
@@ -69,8 +69,8 @@
 \section{Windows CE}
-	Windows ce driver architecture is a little bit different than Unix one.
-	Driver are always separate in two layer MDD PDD. The MDD is the microsoft
+	Windows ce driver architecture is a little bit different than the Unix one.
+	Drivers are always separated in two layers MDD PDD. The MDD is the microsoft
 code that you can modify. The pdd is the physical layer that you need to implement to have a working driver. Here is the PDD of USBFN in WinCe :
@@ -105,7 +105,7 @@
 } UFN_PDD_INTERFACE_INFO, *PUFN_PDD_INTERFACE_INFO;
 \end{lstlisting}
-A paste of the MSDN could explain all this symbols :
+A paste of the MSDN could explain all these symbols :
 - pfnDeinit : This function deinitializes the PDD.
@@ -159,10 +159,10 @@
 - pfnIOControl : This function executes IOCTLs.
 \section{FreeBSD}
-	The code is in dev/usb/usb\_controller.h. FreeBSD use the same
-	structure for both Host and Device Controllers Interfaces.
+	The code is in dev/usb/usb\_controller.h. FreeBSD uses the same
+	structure for both Host and device controller interfaces.
 An USB DCI must implement an usb\_bus\_methods structure. These
-	structure are shared between DCI and HCI.
+	structures are shared between DCI and HCI.
 \begin{lstlisting}
 struct usb_bus_methods {
@@ -241,8 +241,8 @@
 };
 \end{lstlisting}
-	For describing possibilities of the different endpoints contained
-	by the xCI FreeBSD have the structure usb\_hw\_ep\_profile.
+	To describe the possibilities of the different endpoints contained
+	by the xCI FreeBSD has the structure usb\_hw\_ep\_profile.
 \begin{lstlisting}
 /*

==== //depot/projects/soc2009/syl_usb/hps_report/GenericCode.tex#4 (text+ko) ====

@@ -11,7 +11,7 @@
 \end{lstlisting}
 We can see that only few functions are different from a driver to another.
-	The different code is most of the time Hardware specific. Lets us describes
+	The different code is most of the time Hardware specific. Lets us describe
 these functions :
 \section{Hardware Dependant Code}
@@ -58,15 +58,15 @@
 - xxxdci\_start\_standard\_chain
 Clock\_on, clock\_off, pull\_up, pull\_down, wakeup\_peer can be
-	factorised. A big part of this function do not move and we can create
+	factorised. A big part of this function does not move and we can create
 a generic function called directly after or before Hardware changes.
 Perhaps there is also something that could be factorised in xxxdci\_uninit
-	if there are not other possible access to the Hardware that interrupt disabling
+	if there is no other possible access to the hardware that interrupts disabling
 in this function.
-	There are pehaps something to do with xxxdci\_device\_done and xxxdci\_start\_standard\_chain
-	beacuse the only hardware code present int this function just enable Endpoint interrupt.
+	There is perhaps something to do with xxxdci\_device\_done and xxxdci\_start\_standard\_chain
+	beacuse the only hardware code present int these function just enables endpoint interrupt.
 \section{Hardware Independant Code}
@@ -99,17 +99,17 @@
 - xxxdci\_xfer\_do\_fifo
 xxdci\_ep\_init and xxxdci\_xfer\_setup are more or less generic. The only change we can have
-	between 2 controller afecting this function is the support or no of asynchronous endpoints.
-	This problem could be resolved if we parse xxxdci\_ep\_profile to know if the device support
+	between 2 controllers affecting this function is the support of asynchronous endpoints.
+	This problem could be solved if we parse xxxdci\_ep\_profile to know if the device supports
 asynchronous endpoints.
 All the HUB descriptor code are the same between the controller drivers. We can easily factorise it without performance modification.
-	There is some big change on the way to handle basic control request like GET\_STATUS etc. Linux and
-	Wince seems to handle That on a generic file who perform basic request during the configuration process.
-	In FreeBSD all these functionality are present in the controlleur code in the function roothub\_exec.
-	This function looks like the same on each controller.
+	There are some big change on the way to handle basic control request like GET\_STATUS etc. Linux and
+	Wince seem to handle this on a generic file that perform basic request during the configuration process.
+	In FreeBSD all these functionalities are present in the controlleur code in the function roothub\_exec.
+	These functions are similar on each controller.
 There is something for factorising the code for xxxdci\_device\_XXX\_methods. I have implemented it on s3c24xxxdci driver this way :
@@ -162,4 +162,4 @@
 };
 \end{lstlisting}
-	All the others functions seems to be generic.
+	All the other functions seem to be generic.
==== //depot/projects/soc2009/syl_usb/hps_report/Proposition.tex#3 (text+ko) ====

@@ -23,15 +23,15 @@
 \end{lstlisting}
 With this Change we can easily implement the xxxdci\_device\_XXX\_methods
-	with less code following the mothod used for the s3c2xxdci implementation.
+	with less code following the method used for the s3c2xxdci implementation.
 \section{roothub}
-	We can Create A GenericRootHub function who will takes a
+	We can Create A GenericRootHub function that will takes a
 struct with callback as parameters.
-	It will permit the driver to implement roothub\_exec or call
+	It will enable the driver to implement roothub\_exec or call
 GenericRootHub.
 \section{other generic functions}
-	I Have no idea now how to factorise them, but any advices are welcome !
+	I Have no idea now how to factorise them, but advices are welcome !

==== //depot/projects/soc2009/syl_usb/hps_report/SomeTests.tex#3 (text+ko) ====

@@ -1,7 +1,7 @@
 \chapter{Test}
-Here is the result of some test used for finding the best implementation
-for the Improvement into USB.
+Here is the result of some tests used to find the best implementation
+for the improvement into USB.
 \section{ops structure Versus Simple call}
@@ -77,86 +77,266 @@
 84e4: e12fff13 bx r3
 \end{lstlisting}
+	Finally there is no great difference between callbacks and simple call.
+
 \section{Single demultiplex vs multiple demultiplex}
-	Finally there is no great difference between callbacks and simple call.
+	I have also made some tests between single demultiplex and multiple demultiplex.
+	Here is the code I have tested :
 \begin{lstlisting}
-int
+static int
 switcha(int toto)
 {
-	switch(toto) {
-	case 3:
-		break ;
-	case 4:
-		break ;
-	}
+	switch(toto) {
+	case 2:
+		break ;
+	case 1:
+		break ;
+	}
 }
-int
+static int
 switchb(int toto)
 {
-	switch(toto) {
-	case 5:
-		break ;
-	case 6:
-		break ;
-	}
+	switch(toto) {
+	case 1:
+		break ;
+	case 2:
+		break ;
+	}
 }
-int
+int
 multipleswitch(int toto)
 {
-	__asm("nop");
-	__asm("nop");
-	switch(toto) {
-	case 1:
-		switcha(toto -2 );
-		break;
-	case 2:
-		switchb(toto - 4);
-		break;
-	}
-	__asm("nop");
-	__asm("nop");
+	__asm("nop");
+	__asm("nop");
+	switch(toto) {
+	case 4:
+	case 3:
+		switcha(toto -2 );
+		break;
+	case 5:
+	case 6:
+		switchb(toto - 4);
+		break;
+	}
+	__asm("nop");
+	__asm("nop");
+}
+
+int
+singleswitch(int toto)
+{
+	__asm("nop");
+	__asm("nop");
+	switch (toto) {
+	case 4:
+	case 3:
+		switch (toto - 2) {
+		case 2:
+			break ;
+		case 1:
+			break ;
+		}
+		break ;
+	case 5:
+	case 6:
+		switch(toto -4) {
+		case 1:
+			break ;
+		case 2:
+			break ;
+		}
+		break ;
+	}
+	__asm("nop");
+	__asm("nop");
 }
+	\end{lstlisting}
+	I have test with this main :
+
+	\begin{lstlisting}
+
 int
-singleswitch(int toto)
+main()
 {
-	__asm("nop");
-	__asm("nop");
-	switch (toto) {
-	case 1:
-		switch (toto - 2) {
-		case 3:
-			break ;
-		case 4:
-			break ;
-		}
-		break ;
-	case 2:
-		switch(toto -4) {
-		case 5:
-			break ;
-		case 6:
-			break ;
-		}
-		break ;
-	}
-	__asm("nop");
-	__asm("nop");
+	multipleswitch(42);
+	singleswitch(42);
 }
+	\end{lstlisting}
+
+	With this code gcc dumps exactly the same switch function wich is not used.
+	But for the multipleswitch the functionn switcha and switchb are also generated if
+	we not add the attribute static to the function.
+
+	I have also make test with this main :
+
+	\begin{lstlisting}
 int
 main()
 {
-	singleswitch(a);
-	singleswitch(b);
+	multipleswitch(3);
+	multipleswitch(4);
+	multipleswitch(5);
+	multipleswitch(6);
+	multipleswitch(42);
+	singleswitch(3);
+	singleswitch(4);
+	singleswitch(5);
+	singleswitch(6);
+	singleswitch(42);
 }
 \end{lstlisting}
-	With this code gcc dump exactly the same switch function that do nothing.
-	But for the multipleswitch the functionn switcha and switchb are also generated.
+	Here is the assembly done by this code for x86
+
+	\begin{lstlisting}
+08048374 <switcha>:
+ 8048374: 55 push %ebp
+ 8048375: 89 e5 mov %esp,%ebp
+ 8048377: 83 ec 04 sub $0x4,%esp
+ 804837a: c9 leave
+ 804837b: c3 ret
+
+0804837c <switchb>:
+ 804837c: 55 push %ebp
+ 804837d: 89 e5 mov %esp,%ebp
+ 804837f: 83 ec 04 sub $0x4,%esp
+ 8048382: c9 leave
+ 8048383: c3 ret
+
+08048384 <multipleswitch>:
+ 8048384: 55 push %ebp
+ 8048385: 89 e5 mov %esp,%ebp
+ 8048387: 83 ec 0c sub $0xc,%esp
+ 804838a: 90 nop
+ 804838b: 90 nop
+ 804838c: 8b 45 08 mov 0x8(%ebp),%eax
+ 804838f: 89 45 f8 mov %eax,-0x8(%ebp)
+ 8048392: 83 7d f8 03 cmpl $0x3,-0x8(%ebp)
+ 8048396: 7c 2c jl 80483c4 <multipleswitch+0x40>
+ 8048398: 83 7d f8 04 cmpl $0x4,-0x8(%ebp)
+ 804839c: 7e 08 jle 80483a6 <multipleswitch+0x22>
+ 804839e: 83 7d f8 06 cmpl $0x6,-0x8(%ebp)
+ 80483a2: 7f 20 jg 80483c4 <multipleswitch+0x40>
+ 80483a4: eb 10 jmp 80483b6 <multipleswitch+0x32>
+ 80483a6: 8b 45 08 mov 0x8(%ebp),%eax
+ 80483a9: 83 e8 02 sub $0x2,%eax
+ 80483ac: 89 04 24 mov %eax,(%esp)
+ 80483af: e8 c0 ff ff ff call 8048374 <switcha>
+ 80483b4: eb 0e jmp 80483c4 <multipleswitch+0x40>
+ 80483b6: 8b 45 08 mov 0x8(%ebp),%eax
+ 80483b9: 83 e8 04 sub $0x4,%eax
+ 80483bc: 89 04 24 mov %eax,(%esp)
+ 80483bf: e8 b8 ff ff ff call 804837c <switchb>
+ 80483c4: 90 nop
+ 80483c5: 90 nop
+ 80483c6: c9 leave
+ 80483c7: c3 ret
+
+080483c8 <singleswitch>:
+ 80483c8: 55 push %ebp
+ 80483c9: 89 e5 mov %esp,%ebp
+ 80483cb: 83 ec 08 sub $0x8,%esp
+ 80483ce: 90 nop
+ 80483cf: 90 nop
+ 80483d0: 8b 45 08 mov 0x8(%ebp),%eax
+ 80483d3: 89 45 f8 mov %eax,-0x8(%ebp)
+ 80483d6: 83 7d f8 03 cmpl $0x3,-0x8(%ebp)
+ 80483da: 7c 0a jl 80483e6 <singleswitch+0x1e>
+ 80483dc: 83 7d f8 04 cmpl $0x4,-0x8(%ebp)
+ 80483e0: 7e 04 jle 80483e6 <singleswitch+0x1e>
+ 80483e2: 83 7d f8 06 cmpl $0x6,-0x8(%ebp)
+ 80483e6: 90 nop
+ 80483e7: 90 nop
+ 80483e8: c9 leave
+ 80483e9: c3 ret
+
+	\end{lstlisting}
+
+	Here is the assembly done by this code for armv4i
+
+	\begin{lstlisting}
+0000842c <switcha>:
+ 842c: e1a0c00d mov ip, sp
+ 8430: e92dd800 push {fp, ip, lr, pc}
+ 8434: e24cb004 sub fp, ip, #4 ; 0x4
+ 8438: e24dd008 sub sp, sp, #8 ; 0x8
+ 843c: e50b0010 str r0, [fp, #-16]
+ 8440: e24bd00c sub sp, fp, #12 ; 0xc
+ 8444: e89d6800 ldm sp, {fp, sp, lr}
+ 8448: e12fff1e bx lr
+
+0000844c <switchb>:
+ 844c: e1a0c00d mov ip, sp
+ 8450: e92dd800 push {fp, ip, lr, pc}
+ 8454: e24cb004 sub fp, ip, #4 ; 0x4
+ 8458: e24dd008 sub sp, sp, #8 ; 0x8
+ 845c: e50b0010 str r0, [fp, #-16]
+ 8460: e24bd00c sub sp, fp, #12 ; 0xc
+ 8464: e89d6800 ldm sp, {fp, sp, lr}
+ 8468: e12fff1e bx lr
+
+0000846c <multipleswitch>:
+ 846c: e1a0c00d mov ip, sp
+ 8470: e92dd800 push {fp, ip, lr, pc}
+ 8474: e24cb004 sub fp, ip, #4 ; 0x4
+ 8478: e24dd008 sub sp, sp, #8 ; 0x8
+ 847c: e50b0010 str r0, [fp, #-16]
+ 8480: e1a00000 nop (mov r0,r0)
+ 8484: e1a00000 nop (mov r0,r0)
+ 8488: e51b3010 ldr r3, [fp, #-16]
+ 848c: e2433003 sub r3, r3, #3 ; 0x3
+ 8490: e3530003 cmp r3, #3 ; 0x3
+ 8494: 979ff103 ldrls pc, [pc, r3, lsl #2]
+ 8498: ea00000c b 84d0 <multipleswitch+0x64>
+ 849c: 000084ac .word 0x000084ac
+ 84a0: 000084ac .word 0x000084ac
+ 84a4: 000084c0 .word 0x000084c0
+ 84a8: 000084c0 .word 0x000084c0
+ 84ac: e51b3010 ldr r3, [fp, #-16]
+ 84b0: e2433002 sub r3, r3, #2 ; 0x2
+ 84b4: e1a00003 mov r0, r3
+ 84b8: ebffffdb bl 842c <switcha>
+ 84bc: ea000003 b 84d0 <multipleswitch+0x64>
+ 84c0: e51b3010 ldr r3, [fp, #-16]
+ 84c4: e2433004 sub r3, r3, #4 ; 0x4
+ 84c8: e1a00003 mov r0, r3
+ 84cc: ebffffde bl 844c <switchb>
+ 84d0: e1a00000 nop (mov r0,r0)
+ 84d4: e1a00000 nop (mov r0,r0)
+ 84d8: e24bd00c sub sp, fp, #12 ; 0xc
+ 84dc: e89d6800 ldm sp, {fp, sp, lr}
+ 84e0: e12fff1e bx lr
+
+000084e4 <singleswitch>:
+ 84e4: e1a0c00d mov ip, sp
+ 84e8: e92dd800 push {fp, ip, lr, pc}
+ 84ec: e24cb004 sub fp, ip, #4 ; 0x4
+ 84f0: e24dd008 sub sp, sp, #8 ; 0x8
+ 84f4: e50b0010 str r0, [fp, #-16]
+ 84f8: e1a00000 nop (mov r0,r0)
+ 84fc: e1a00000 nop (mov r0,r0)
+ 8500: e51b3010 ldr r3, [fp, #-16]
+ 8504: e2433003 sub r3, r3, #3 ; 0x3
+ 8508: e3530003 cmp r3, #3 ; 0x3
+ 850c: 979ff103 ldrls pc, [pc, r3, lsl #2]
+ 8510: ea000003 b 8524 <singleswitch+0x40>
+ 8514: 00008524 .word 0x00008524
+ 8518: 00008524 .word 0x00008524
+ 851c: 00008524 .word 0x00008524
+ 8520: 00008524 .word 0x00008524
+ 8524: e1a00000 nop (mov r0,r0)
+ 8528: e1a00000 nop (mov r0,r0)
+ 852c: e24bd00c sub sp, fp, #12 ; 0xc
+ 8530: e89d6800 ldm sp, {fp, sp, lr}
+ 8534: e12fff1e bx lr
+	\end{lstlisting}
+	We can easily conclude that single demultiplex dump less code
+	than multiple one.

==== //depot/projects/soc2009/syl_usb/hps_report/report.tex#3 (text+ko) ====
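
Note on the demultiplex test in SomeTests.tex: the case bodies in the tested functions are empty, so the compiler is free to discard most of the inner dispatch. A minimal sketch of a variant follows, assuming each case returns a distinct value so that both forms can first be checked for identical behaviour before their generated code is compared (for example with objdump -d). The names multiple_demux and single_demux and the return values 10 to 40 are made up for this illustration and are not part of the change.

```c
/*
 * Sketch only: same request layout as the test in SomeTests.tex
 * (requests 3..6, anything else unhandled), but every case returns
 * a hypothetical result code so the dispatch cannot be optimised out.
 */

/* Second-level handler for requests 3 and 4 (receives req - 2). */
static int
switcha(int sub)
{
	switch (sub) {
	case 1:
		return (10);
	case 2:
		return (20);
	}
	return (-1);
}

/* Second-level handler for requests 5 and 6 (receives req - 4). */
static int
switchb(int sub)
{
	switch (sub) {
	case 1:
		return (30);
	case 2:
		return (40);
	}
	return (-1);
}

/* Multiple demultiplex: pick a handler, then decode again inside it. */
int
multiple_demux(int req)
{
	switch (req) {
	case 3:
	case 4:
		return (switcha(req - 2));
	case 5:
	case 6:
		return (switchb(req - 4));
	}
	return (-1);
}

/* Single demultiplex: every request is decoded in one switch. */
int
single_demux(int req)
{
	switch (req) {
	case 3:
		return (10);
	case 4:
		return (20);
	case 5:
		return (30);
	case 6:
		return (40);
	}
	return (-1);
}
```

Calling both functions with 3, 4, 5, 6 and 42 (the inputs used in the second main of the test) should give the same result from each; only then is a size comparison of the two listings meaningful.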