From owner-svn-doc-all@freebsd.org Sun Sep 8 20:08:15 2019 Message-Id: <201909082008.x88K8FBD016322@repo.freebsd.org> From: Benedict Reuschling Date: Sun, 8 Sep 2019 20:08:15 +0000 (UTC) To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org Subject: svn commit: r53386 - head/en_US.ISO8859-1/books/developers-handbook/x86 X-SVN-Group: doc-head X-SVN-Commit-Author: bcr X-SVN-Commit-Paths: head/en_US.ISO8859-1/books/developers-handbook/x86 X-SVN-Commit-Revision: 53386 
X-SVN-Commit-Repository: doc MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Author: bcr Date: Sun Sep 8 20:08:15 2019 New Revision: 53386 URL: https://svnweb.freebsd.org/changeset/doc/53386 Log: Mass cleanup of textproc/igor warnings including: - use two spaces at sentence start - space before content - wrap long line - start content on same line - straggling - put listing on same line - add blank line after on previous line Modified: head/en_US.ISO8859-1/books/developers-handbook/x86/chapter.xml Modified: head/en_US.ISO8859-1/books/developers-handbook/x86/chapter.xml ============================================================================== --- head/en_US.ISO8859-1/books/developers-handbook/x86/chapter.xml Sun Sep 8 19:40:52 2019 (r53385) +++ head/en_US.ISO8859-1/books/developers-handbook/x86/chapter.xml Sun Sep 8 20:08:15 2019 (r53386) @@ -532,16 +532,16 @@ sys.err: The library approach may seem inconvenient at first because it requires you to produce a separate file your code depends on. But it has many advantages: For one, you only need to write it - once and can use it for all your programs. You can even let + once and can use it for all your programs. You can even let other assembly language programmers use it, or perhaps use one - written by someone else. But perhaps the greatest advantage of + written by someone else. But perhaps the greatest advantage of the library is that your code can be ported to other systems, even by other programmers, by simply writing a new library without any changes to your code. 
If you do not like the idea of having a library, you can at least place all your system calls in a separate assembly - language file and link it with your main program. Here, again, + language file and link it with your main program. Here, again, all porters have to do is create a new object file to link with your main program. @@ -554,7 +554,7 @@ sys.err: include in your code. Porters of your software will simply write a new include - file. No library or external object file is necessary, yet your + file. No library or external object file is necessary, yet your code is portable without any need to edit the code. @@ -651,111 +651,100 @@ access.the.bsd.kernel: Lines 3-5 are the data: Line 3 starts the data section/segment. Line 4 contains the string "Hello, World!" - followed by a new line (0Ah). Line 5 creates + followed by a new line (0Ah). Line 5 creates a constant that contains the length of the string from line 4 in bytes. - Lines 7-16 contain the code. Note that FreeBSD uses the + Lines 7-16 contain the code. Note that FreeBSD uses the elf file format for its executables, which requires every program to start at the point labeled _start (or, more precisely, the linker expects - that). This label has to be global. + that). This label has to be global. Lines 10-13 ask the system to write hbytes bytes of the hello string to stdout. Lines 15-16 ask the system to end the program with the return - value of 0. The + value of 0. The SYS_exit syscall never returns, so the code ends there. If you have come to &unix; from &ms-dos; assembly language background, you may be used to writing - directly to the video hardware. You will never have to worry - about this in FreeBSD, or any other flavor of &unix;. As far as + directly to the video hardware. You will never have to worry + about this in FreeBSD, or any other flavor of &unix;. As far as you are concerned, you are writing to a file known as - stdout. This can be the video screen, or a + stdout. 
This can be the video screen, or a telnet terminal, or an actual file, - or even the input of another program. Which one it is, is for + or even the input of another program. Which one it is, is for the system to figure out. - Assembling the Code + + Assembling the Code - Type the code (except the line numbers) in an editor, and save - it in a file named hello.asm. You need - nasm to assemble it. + Type the code (except the line numbers) in an editor, and + save it in a file named hello.asm. You + need nasm to assemble it. - Installing <application>nasm</application> + + Installing <application>nasm</application> If you do not have nasm, type: -&prompt.user; su + &prompt.user; su Password:your root password &prompt.root; cd /usr/ports/devel/nasm &prompt.root; make install &prompt.root; exit &prompt.user; - -You may type make install clean instead of just -make install if you do not want to keep -nasm source code. - + You may type make install clean + instead of just make install if you do + not want to keep nasm source + code. - -Either way, FreeBSD will automatically download -nasm from the Internet, -compile it, and install it on your system. - + Either way, FreeBSD will automatically download + nasm from the Internet, compile it, + and install it on your system. - - -If your system is not FreeBSD, you need to get -nasm from its -home -page. You can still use it to assemble FreeBSD code. - - + + If your system is not FreeBSD, you need to get + nasm from its home + page. You can still use it to assemble FreeBSD + code. + - -Now you can assemble, link, and run the code: - + Now you can assemble, link, and run the code: -&prompt.user; nasm -f elf hello.asm + &prompt.user; nasm -f elf hello.asm &prompt.user; ld -s -o hello hello.o &prompt.user; ./hello Hello, World! 
&prompt.user; - - - - - + + -Writing &unix; Filters + Writing &unix; Filters - -A common type of &unix; application is a filter—a program -that reads data from the stdin, processes it -somehow, then writes the result to stdout. - + A common type of &unix; application is a filter—a + program that reads data from the stdin, + processes it somehow, then writes the result to + stdout. - -In this chapter, we shall develop a simple filter, and -learn how to read from stdin and write to -stdout. This filter will convert each byte -of its input into a hexadecimal number followed by a -blank space. - + In this chapter, we shall develop a simple filter, and + learn how to read from stdin and write to + stdout. This filter will convert each byte + of its input into a hexadecimal number followed by a blank + space. - -%include 'system.inc' + %include 'system.inc' section .data hex db '0123456789ABCDEF' @@ -793,102 +782,85 @@ _start: .done: push dword 0 - sys.exit - - -In the data section we create an array called hex. -It contains the 16 hexadecimal digits in ascending order. -The array is followed by a buffer which we will use for -both input and output. The first two bytes of the buffer -are initially set to 0. This is where we will write -the two hexadecimal digits (the first byte also is -where we will read the input). The third byte is a -space. - + sys.exit - -The code section consists of four parts: Reading the byte, -converting it to a hexadecimal number, writing the result, -and eventually exiting the program. - + In the data section we create an array called + hex. It contains the 16 hexadecimal digits + in ascending order. The array is followed by a buffer which + we will use for both input and output. The first two bytes of + the buffer are initially set to 0. This + is where we will write the two hexadecimal digits (the first + byte also is where we will read the input). The third byte is + a space. 
- -To read the byte, we ask the system to read one byte -from stdin, and store it in the first byte -of the buffer. The system returns the number -of bytes read in EAX. This will be 1 -while data is coming, or 0, when no more input -data is available. Therefore, we check the value of -EAX. If it is 0, -we jump to .done, otherwise we continue. - + The code section consists of four parts: Reading the byte, + converting it to a hexadecimal number, writing the result, and + eventually exiting the program. - - -For simplicity sake, we are ignoring the possibility -of an error condition at this time. - - + To read the byte, we ask the system to read one byte from + stdin, and store it in the first byte of + the buffer. The system returns the number + of bytes read in EAX. This + will be 1 while data is coming, or + 0, when no more input data is available. + Therefore, we check the value of EAX. If it is + 0, we jump to .done, + otherwise we continue. - -The hexadecimal conversion reads the byte from the -buffer into EAX, or actually just -AL, while clearing the remaining bits of -EAX to zeros. We also copy the byte to -EDX because we need to convert the upper -four bits (nibble) separately from the lower -four bits. We store the result in the first two -bytes of the buffer. - + + For simplicity sake, we are ignoring the possibility of + an error condition at this time. + - -Next, we ask the system to write the three bytes -of the buffer, i.e., the two hexadecimal digits and -the blank space, to stdout. We then -jump back to the beginning of the program and -process the next byte. - + The hexadecimal conversion reads the byte from the + buffer into EAX, or actually just AL, while clearing the remaining + bits of EAX to zeros. We + also copy the byte to EDX + because we need to convert the upper four bits (nibble) + separately from the lower four bits. We store the result in + the first two bytes of the buffer. 
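[Editor's illustration, not part of r53386.] The nibble-by-nibble conversion described in the paragraph above can be sketched in C. This is only a rough rendering of the idea, not the chapter's assembly: the function name byte_to_hex and the three-byte output layout are assumptions made for this example, chosen to mirror the three-byte buffer the text describes.

```c
#include <assert.h>
#include <stddef.h>

/* The 16 hexadecimal digits in ascending order, like the chapter's hex array. */
static const char hex_digits[] = "0123456789ABCDEF";

/*
 * Store the two hex digits of `byte`, followed by a blank, in out[0..2].
 * byte_to_hex is a hypothetical helper; the handbook does this inline
 * with registers instead of a function call.
 */
static void byte_to_hex(unsigned char byte, char out[3])
{
    assert(out != NULL);
    out[0] = hex_digits[byte >> 4];    /* upper four bits (nibble) */
    out[1] = hex_digits[byte & 0x0F];  /* lower four bits (nibble) */
    out[2] = ' ';                      /* separator, third byte of the buffer */
}
```

Calling byte_to_hex('H', out) leaves the characters 4, 8 and a blank in out, matching the first group of the sample session shown further down in the diff.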
- -Once there is no more input left, we ask the system -to exit our program, returning a zero, which is -the traditional value meaning the program was -successful. - + Next, we ask the system to write the three bytes of the + buffer, i.e., the two hexadecimal digits and the blank space, + to stdout. We then jump back to the + beginning of the program and process the next byte. - -Go ahead, and save the code in a file named hex.asm, -then type the following (the ^D means press the -control key and type D while holding the -control key down): - + Once there is no more input left, we ask the system to + exit our program, returning a zero, which is the traditional + value meaning the program was successful. -&prompt.user; nasm -f elf hex.asm + Go ahead, and save the code in a file named + hex.asm, then type the following (the + ^D means press the control key and type + D while holding the control key + down): + + &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! 48 65 6C 6C 6F 2C 20 57 6F 72 6C 64 21 0A Here I come! 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; - - -If you are migrating to &unix; from &ms-dos;, -you may be wondering why each line ends with 0A -instead of 0D 0A. -This is because &unix; does not use the cr/lf convention, but -a "new line" convention, which is 0A in hexadecimal. - - + + If you are migrating to &unix; from + &ms-dos;, you may be wondering why each + line ends with 0A instead of + 0D 0A. This is because &unix; does not + use the cr/lf convention, but a "new line" convention, which + is 0A in hexadecimal. + - -Can we improve this? Well, for one, it is a bit confusing because -once we have converted a line of text, our input no longer -starts at the beginning of the line. We can modify it to print -a new line instead of a space after each 0A: - + Can we improve this? 
Well, for one, it is a bit confusing + because once we have converted a line of text, our input no + longer starts at the beginning of the line. We can modify it + to print a new line instead of a space after each + 0A: - -%include 'system.inc' + %include 'system.inc' section .data hex db '0123456789ABCDEF' @@ -935,29 +907,26 @@ _start: .done: push dword 0 - sys.exit - - -We have stored the space in the CL register. We can -do this safely because, unlike &microsoft.windows;, &unix; system -calls do not modify the value of any register they do not use -to return a value in. - + sys.exit - -That means we only need to set CL once. We have, therefore, -added a new label .loop and jump to it for the next byte -instead of jumping at _start. We have also added the -.hex label so we can either have a blank space or a -new line as the third byte of the buffer. - + We have stored the space in the CL register. We can do this + safely because, unlike &microsoft.windows;, &unix; system + calls do not modify the value of any register they do not use + to return a value in. - -That means we only need to set CL once. We have, therefore, + added a new label .loop and jump to it for + the next byte instead of jumping at _start. + We have also added the .hex label so we can + either have a blank space or a new line as the third byte of + the buffer. -&prompt.user; nasm -f elf hex.asm + Once you have changed hex.asm to + reflect these changes, type: + + &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! @@ -966,42 +935,33 @@ these changes, type: 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; - -That looks better. But this code is quite inefficient! 
We + are making a system call for every single byte twice (once to + read it, another time to write the output). + - + + Buffered Input and Output - -Buffered Input and Output + We can improve the efficiency of our code by buffering our + input and output. We create an input buffer and read a whole + sequence of bytes at one time. Then we fetch them one by one + from the buffer. - -We can improve the efficiency of our code by buffering our -input and output. We create an input buffer and read a whole -sequence of bytes at one time. Then we fetch them one by one -from the buffer. - + We also create an output buffer. We store our output in + it until it is full. At that time we ask the kernel to write + the contents of the buffer to + stdout. - -We also create an output buffer. We store our output in it until -it is full. At that time we ask the kernel to write the contents -of the buffer to stdout. - + The program ends when there is no more input. But we + still need to ask the kernel to write the contents of our + output buffer to stdout one last time, + otherwise some of our output would make it to the output + buffer, but never be sent out. Do not forget that, or you + will be wondering why some of your output is missing. - -The program ends when there is no more input. But we still need -to ask the kernel to write the contents of our output buffer -to stdout one last time, otherwise some of our output -would make it to the output buffer, but never be sent out. -Do not forget that, or you will be wondering why some of your -output is missing. - + %include 'system.inc' - -%include 'system.inc' - %define BUFSIZE 2048 section .data @@ -1092,39 +1052,35 @@ write: add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now - ret - - -We now have a third section in the source code, named -.bss. This section is not included in our -executable file, and, therefore, cannot be initialized. We use -resb instead of db. 
-It simply reserves the requested size of uninitialized memory -for our use. - + ret - -We take advantage of the fact that the system does not modify the -registers: We use registers for what, otherwise, would have to be -global variables stored in the .data section. This is -also why the &unix; convention of passing parameters to system calls -on the stack is superior to the Microsoft convention of passing -them in the registers: We can keep the registers for our own use. - + We now have a third section in the source code, named + .bss. This section is not included in our + executable file, and, therefore, cannot be initialized. We + use resb instead of + db. It simply reserves + the requested size of uninitialized memory for our use. - -We use EDI and ESI as pointers to the next byte -to be read from or written to. We use EBX and -ECX to keep count of the number of bytes in the -two buffers, so we know when to dump the output to, or read more -input from, the system. - + We take advantage of the fact that the system does not + modify the registers: We use registers for what, otherwise, + would have to be global variables stored in the + .data section. This is also why the + &unix; convention of passing parameters to system calls on the + stack is superior to the Microsoft convention of passing them + in the registers: We can keep the registers for our own + use. - -Let us see how it works now: - + We use EDI and + ESI as pointers to the next + byte to be read from or written to. We use EBX and ECX to keep count of the number of + bytes in the two buffers, so we know when to dump the output + to, or read more input from, the system. -&prompt.user; nasm -f elf hex.asm + Let us see how it works now: + + &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! @@ -1133,17 +1089,15 @@ Let us see how it works now: 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; - -Not what you expected? 
The program did not print the output -until we pressed ^D. That is easy to fix by -inserting three lines of code to write the output every time -we have converted a new line to 0A. I have marked -the three lines with > (do not copy the > in your -hex.asm). - + Not what you expected? The program did not print the + output until we pressed ^D. That is + easy to fix by inserting three lines of code to write the + output every time we have converted a new line to + 0A. I have marked the three lines with + > (do not copy the > in your + hex.asm). - -%include 'system.inc' + %include 'system.inc' %define BUFSIZE 2048 @@ -1238,14 +1192,11 @@ write: add esp, byte 12 sub eax, eax sub ecx, ecx ; buffer is empty now - ret - + ret - -Now, let us see how it works: - + Now, let us see how it works: -&prompt.user; nasm -f elf hex.asm + &prompt.user; nasm -f elf hex.asm &prompt.user; ld -s -o hex hex.o &prompt.user; ./hex Hello, World! @@ -1254,265 +1205,214 @@ Now, let us see how it works: 48 65 72 65 20 49 20 63 6F 6D 65 21 0A ^D &prompt.user; - -Not bad for a 644-byte executable, is it! - + Not bad for a 644-byte executable, is it! - - -This approach to buffered input/output still -contains a hidden danger. I will discuss—and -fix—it later, when I talk about the -dark -side of buffering. - + + This approach to buffered input/output still + contains a hidden danger. I will discuss—and + fix—it later, when I talk about the dark side of + buffering. + - -How to Unread a Character + + How to Unread a Character - -This may be a somewhat advanced topic, mostly of interest to -programmers familiar with the theory of compilers. If you wish, -you may skip to the next -section, and perhaps read this later. - - - -While our sample program does not require it, more sophisticated -filters often need to look ahead. In other words, they may need -to see what the next character is (or even several characters). 
-If the next character is of a certain value, it is part of the -token currently being processed. Otherwise, it is not. - + + This may be a somewhat advanced topic, mostly of + interest to programmers familiar with the theory of + compilers. If you wish, you may skip to the next + section, and perhaps read this later. + - -For example, you may be parsing the input stream for a textual -string (e.g., when implementing a language compiler): If a -character is followed by another character, or perhaps a digit, -it is part of the token you are processing. If it is followed by -white space, or some other value, then it is not part of the -current token. - + While our sample program does not require it, more + sophisticated filters often need to look ahead. In other + words, they may need to see what the next character is (or + even several characters). If the next character is of a + certain value, it is part of the token currently being + processed. Otherwise, it is not. - -This presents an interesting problem: How to return the next -character back to the input stream, so it can be read again -later? - + For example, you may be parsing the input stream for a + textual string (e.g., when implementing a language + compiler): If a character is followed by another character, + or perhaps a digit, it is part of the token you are + processing. If it is followed by white space, or some other + value, then it is not part of the current token. - -One possible solution is to store it in a character variable, -then set a flag. We can modify getchar to check the flag, -and if it is set, fetch the byte from that variable instead of the -input buffer, and reset the flag. But, of course, that slows us -down. - + This presents an interesting problem: How to return the + next character back to the input stream, so it can be read + again later? - -The C language has an ungetc() function, just for that -purpose. Is there a quick way to implement it in our code? 
-I would like you to scroll back up and take a look at the -getchar procedure and see if you can find a nice and -fast solution before reading the next paragraph. Then come back -here and see my own solution. - + One possible solution is to store it in a character + variable, then set a flag. We can modify + getchar to check the flag, and if it is + set, fetch the byte from that variable instead of the input + buffer, and reset the flag. But, of course, that slows us + down. - -The key to returning a character back to the stream is in how -we are getting the characters to start with: - + The C language has an ungetc() + function, just for that purpose. Is there a quick way to + implement it in our code? I would like you to scroll back + up and take a look at the getchar + procedure and see if you can find a nice and fast solution + before reading the next paragraph. Then come back here and + see my own solution. - -First we check if the buffer is empty by testing the value -of EBX. If it is zero, we call the -read procedure. - + The key to returning a character back to the stream is + in how we are getting the characters to start with: - -If we do have a character available, we use lodsb, then -decrease the value of EBX. The lodsb -instruction is effectively identical to: - + First we check if the buffer is empty by testing the + value of EBX. If it is + zero, we call the read + procedure. - - mov al, [esi] - inc esi - + If we do have a character available, we use lodsb, then decrease the value of + EBX. The lodsb instruction is effectively + identical to: - -The byte we have fetched remains in the buffer until the next -time read is called. We do not know when that happens, -but we do know it will not happen until the next call to -getchar. 
Hence, to "return" the last-read byte back -to the stream, all we have to do is decrease the value of -ESI and increase the value of EBX: - + mov al, [esi] + inc esi - -ungetc: + The byte we have fetched remains in the buffer until the + next time read is called. We do not know + when that happens, but we do know it will not happen until the + next call to getchar. Hence, to "return" + the last-read byte back to the stream, all we have to do is + decrease the value of ESI + and increase the value of EBX: + + ungetc: dec esi inc ebx - ret - + ret - -But, be careful! We are perfectly safe doing this if our look-ahead -is at most one character at a time. If we are examining more than -one upcoming character and call ungetc several times -in a row, it will work most of the time, but not all the time -(and will be tough to debug). Why? - + But, be careful! We are perfectly safe doing this if our + look-ahead is at most one character at a time. If we are + examining more than one upcoming character and call + ungetc several times in a row, it will + work most of the time, but not all the time (and will be tough + to debug). Why? - -Because as long as getchar does not have to call -read, all of the pre-read bytes are still in the buffer, -and our ungetc works without a glitch. But the moment -getchar calls read, -the contents of the buffer change. - + Because as long as getchar does not + have to call read, all of the pre-read + bytes are still in the buffer, and our + ungetc works without a glitch. But the + moment getchar calls + read, the contents of the buffer + change. - -We can always rely on ungetc working properly on the last -character we have read with getchar, but not on anything -we have read before that. - + We can always rely on ungetc working + properly on the last character we have read with + getchar, but not on anything we have read + before that. 
- -If your program reads more than one byte ahead, you have at least -two choices: - + If your program reads more than one byte ahead, you have + at least two choices: - -If possible, modify the program so it only reads one byte ahead. -This is the simplest solution. - + If possible, modify the program so it only reads one byte + ahead. This is the simplest solution. - -If that option is not available, first of all determine the maximum -number of characters your program needs to return to the input -stream at one time. Increase that number slightly, just to be -sure, preferably to a multiple of 16—so it aligns nicely. -Then modify the .bss section of your code, and create -a small "spare" buffer right before your input buffer, -something like this: - + If that option is not available, first of all determine + the maximum number of characters your program needs to return + to the input stream at one time. Increase that number + slightly, just to be sure, preferably to a multiple of + 16—so it aligns nicely. Then modify the + .bss section of your code, and create a + small "spare" buffer right before your input buffer, something + like this: - -section .bss + section .bss resb 16 ; or whatever the value you came up with ibuffer resb BUFSIZE -obuffer resb BUFSIZE - +obuffer resb BUFSIZE - -You also need to modify your ungetc to pass the value -of the byte to unget in AL: - + You also need to modify your ungetc + to pass the value of the byte to unget in AL: - -ungetc: + ungetc: dec esi inc ebx mov [esi], al - ret - + ret - -With this modification, you can call ungetc -up to 17 times in a row safely (the first call will still -be within the buffer, the remaining 16 may be either within -the buffer or within the "spare"). - + With this modification, you can call + ungetc up to 17 times in a row safely + (the first call will still be within the buffer, the remaining + 16 may be either within the buffer or within the + "spare"). 
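[Editor's illustration, not part of r53386.] The "spare" buffer trick described above may be easier to follow in C. The sketch below is an approximation under stated assumptions: the names storage, fill, get_byte and unget_byte are invented for this example, `next` stands in for ESI and `avail` for EBX, and fill replaces the real read system call with a plain copy so the sketch has no I/O dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSIZE 2048
#define SPARE   16      /* room reserved right before the input buffer */

static unsigned char storage[SPARE + BUFSIZE];
static unsigned char *const ibuffer = storage + SPARE;
static unsigned char *next = storage + SPARE;  /* next byte to fetch (ESI) */
static size_t avail = 0;                       /* bytes left to fetch (EBX) */

/* Hypothetical stand-in for the read procedure: refill the input buffer. */
static void fill(const unsigned char *src, size_t n)
{
    assert(n <= BUFSIZE);
    memcpy(ibuffer, src, n);
    next = ibuffer;
    avail = n;
}

/* Fetch one byte, or -1 when the buffer is empty. */
static int get_byte(void)
{
    if (avail == 0)
        return -1;
    avail--;
    return *next++;
}

/* Push a byte back; it may land in the spare area below ibuffer. */
static void unget_byte(unsigned char c)
{
    assert(next > storage);   /* the spare area is not exhausted */
    *--next = c;
    avail++;
}
```

Because the byte is written back explicitly, unget_byte keeps working even after a refill has overwritten the buffer, as long as no more than SPARE bytes are pushed below the start of the input buffer.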
+ + - + + Command Line Arguments - + Our hex program will be more + useful if it can read the names of an input and output file from + its command line, i.e., if it can process the command line + arguments. But... Where are they? -Command Line Arguments + Before a &unix; system starts a program, it pushes some data on the stack, then + jumps at the _start label of the program. + Yes, I said jumps, not calls. That means the data can be + accessed by reading [esp+offset], or by + simply popping it. - -Our hex program will be more useful if it can -read the names of an input and output file from its command -line, i.e., if it can process the command line arguments. -But... Where are they? - + The value at the top of the stack contains the number of + command line arguments. It is traditionally called + argc, for "argument count." - -Before a &unix; system starts a program, it pushes some -data on the stack, then jumps at the _start -label of the program. Yes, I said jumps, not calls. That means the -data can be accessed by reading [esp+offset], -or by simply popping it. - + Command line arguments follow next, all + argc of them. These are typically referred + to as argv, for "argument value(s)." That + is, we get argv[0], + argv[1], ..., + argv[argc-1]. These are not the actual + arguments, but pointers to arguments, i.e., memory addresses of + the actual arguments. The arguments themselves are + NUL-terminated character strings. - -The value at the top of the stack contains the number of -command line arguments. It is traditionally called -argc, for "argument count." - + The argv list is followed by a NULL + pointer, which is simply a 0. There is + more, but this is enough for our purposes right now. - -Command line arguments follow next, all argc of them. -These are typically referred to as argv, for -"argument value(s)." That is, we get argv[0], -argv[1], ..., -argv[argc-1]. 
These are not the actual -arguments, but pointers to arguments, i.e., memory addresses of -the actual arguments. The arguments themselves are -NUL-terminated character strings. - + + If you have come from the &ms-dos; + programming environment, the main difference is that each + argument is in a separate string. The second difference is + that there is no practical limit on how many arguments there + can be. + - -The argv list is followed by a NULL pointer, -which is simply a 0. There is more, but this is -enough for our purposes right now. - + Armed with this knowledge, we are almost ready for the next + version of hex.asm. First, however, we + need to add a few lines to + system.inc: - - -If you have come from the &ms-dos; programming -environment, the main difference is that each argument is in -a separate string. The second difference is that there is no -practical limit on how many arguments there can be. - - + First, we need to add two new entries to our list of system + call numbers: - -Armed with this knowledge, we are almost ready for the next -version of hex.asm. First, however, we need to -add a few lines to system.inc: - + %define SYS_open 5 +%define SYS_close 6 - -First, we need to add two new entries to our list of system -call numbers: - + Then we add two new macros at the end of the file: - -%define SYS_open 5 -%define SYS_close 6 - - - -Then we add two new macros at the end of the file: - - - -%macro sys.open 0 + %macro sys.open 0 system SYS_open %endmacro %macro sys.close 0 system SYS_close -%endmacro - +%endmacro - -Here, then, is our modified source code: - + Here, then, is our modified source code: - -%include 'system.inc' + %include 'system.inc' %define BUFSIZE 2048 @@ -1653,234 +1553,192 @@ write: *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***