Date:      Wed, 27 Jun 2007 22:55:07 +0100
From:      Duncan Barclay <dmlb@dmlb.org>
To:        Mark Jayson Alvarez <mjalvarez@fastmail.fm>
Cc:        freebsd-chat@FreeBSD.ORG
Subject:   Re: Where software meets hardware.. (Hello World program dissected)
Message-ID:  <4682DCBB.1080801@dmlb.org>
In-Reply-To: <1182670310.32703.1196739027@webmail.messagingengine.com>
References:  <200706211633.l5LGXWvG095148@lurza.secnetix.de> <1182670310.32703.1196739027@webmail.messagingengine.com>

Hello Mark,

I'd like to try and move your understanding on - your enthusiasm is 
infectious. In this reply I can only touch on the topics that you raise. 
I would wholeheartedly recommend getting a copy of Hennessy & Patterson 
mentioned in another post - it is excellent.

The first thing I would like to say to help you is to ask you to think 
in terms of voltages and not worry about charge, current and 
capacitance. These last three physical quantities are vitally important 
to the design of the electronics within a computer (and for disk drives 
the magnetic analogues), but at the level you are trying to understand 
things, voltage is more useful. The electronics within a computer are, 
for the most part, voltage driven, and capacitance is, unfortunately, a 
parasitic effect that detracts from the ideal operation of the circuit. 
Voltages are what is wanted, and a model of the underlying circuit in 
terms of voltages gives the most insight into how it performs a useful 
function.

Mark Jayson Alvarez wrote:
> Hi,
> 
> To fully understand things, here's another scenario.
> A "Hello World" program. This time I'm going to dissect all
> the processes, as far as I can.
> 
> 
> Upon boot up, electric charge fills some capacitors in 
> the memory in a certain pattern of "charged and not charged" (1 and 0).
> The pattern that was formed points into "another some capacitors" or
> addresses
> in the memory. This pattern of "charged and not charged" capacitors is
> called the "Interrupt Handler Table". "Another some capacitors" are
> the actual "Interrupt Handlers"

Yes.

> Back to the "Hello World" program.
> 
> Before anything else, first when the computer boots up,
> the operating system is loaded into the memory. The operating
> system also loads the driver for every device it finds in 
> the computer. All of these are in the form of charged and not charged
> capacitor patterns.
>
> "Hello World" program continued...
> 
> First, I pressed a series of keys in my keyboard then
> electronic current flows into the keyboard port attached
> to the motherboard. Then this port has a circuit connected
> to a particular pin in the CPU. Then the CPU upon getting
> interrupted by the electronic current that has arrived into
> this particular pin, through some sort of transistors/capacitor magic,
> accesses the "Interrupt Handler Table" in the memory and look up the
> address
> that points into the actual "Interrupt Handler" for the keyboard.
> Now that "Interrupt Handler" is also a pattern of "Charged and not
> charged"
> capacitors which represents a binary driver written to the memory 
> by the operating system upon bootup. 
> 
> The processor does all its machine codes stuffs to dissect and process
> this keyboard driver via an Operating System routine, store the result
> first into
> its register and then write it back to the RAM, return the execution and 
> the results back to the operating system routine that called the driver.
> Now the operating system knows that I have actually pressed some keys in 
> the keyboard and do the same processes to echoed everything that I have
> typed into the monitor and also write it on the surface of my hard 
> drive using magnetism.
> 
> 
> Everything that I have typed, echoed into the monitor, and got written
> into the surface of my hard drive looks like:
> 
> #include <stdio.h>
> 
> int main(void)
> {
>   for(;;)
>       { 
>           printf ("Hello World!\n");
>       }
> }
> 
> 
> Now I run the compiler, read the above series of
> characters which are written on the surface of my hard drive
> as pattern of 1's and 0's or "magnetized and not magnetized =)"
> and write everything back into a binary format or a pattern of
> 1's and 0's, again??

Yes, but these are a different pattern of 1's & 0's to the 1's & 0's 
that represent the characters in your "text file", using a common 
convention called ASCII encoding.

> The process of actually executing the routine in the Operating system
> that writes these binary code into the memory for the processor to
> process
> and run it, is actually running the program itself.

Yes.

> Things are more clearer now. It's all about electric charges, either
> "on" or "off".
> The most essential parts are the processor and the memory. The processor
> do
> all the processing using some sort of transistors and capacitors magic.
> Everything
> that gets processed are written by the processor into the memory in a
> form of 
> electronically charged and not charged capacitors.

Yes.

> The only confusing thing that is left is that binary format thing. I
> thought everything
> that is written in the hard drive is already a series of 1's and 0's. If
> I compile
> the program, it is converted into a machine readable format or a binary
> format. 
> What's the physical difference of binary and non binary format?? The
> only difference
> I can see is that when I open the non binary format in an editor, it is
> quite readable
> unlike the binary format which is full of mixed and unrecognizable
> characters.

Okay, this is where ASCII, high level languages and machine code come in.

Let's work from the bottom up. The CPU is an electronic machine (a 
"state machine") that can perform a large number of simple operations in 
response to instructions given to it. For example, it can add the 
contents of two memory locations together and store the result in a 
third. The instructions are read by the CPU from memory and are in code. 
The codes are represented by a number from 0 to 4294967295 (2^32-1). So 
maybe the add instruction is represented by the code number 14578. These 
codes are called "machine code". Machine code is the only set of 
instructions that the CPU understands.

Now, machine code is a hassle for humans to understand. Imagine writing 
your "hello world" program as a series of numbers from 0 to 4294967295 - 
it would be very hard to do, and even harder for someone else to read it 
(maybe to review whether you've got it right).

This is where we create models or layers of abstraction. The first step 
is to maybe use a more readable representation of the codes for a human, 
for example, let's use the mnemonic "ADD" to represent code 14578. We can 
then write programs that look like:

	ADD 1235375, 54623783, 57564

which means add the two numbers at locations in memory 1235375 and 
54623783, then store the result in location 57564.

This is called "assembly language" and has been around for a long time. 
A lot of small computers are still programmed in it - for example those 
in the remote controls for your TV, DVD player etc.

Assembly is/was a great step forward from the raw "machine code". My 
mother had to enter raw machine code into computers in the 1960's and it 
was hard work!

The translation from assembly language to machine code used to be done 
by hand, but very quickly programmers wrote programs that could read a 
file that contained assembly language and turn it into machine code - 
these were called assemblers.

People came to realize that assembly language, whilst an improvement, 
wasn't great. So they started to develop "higher level" languages. 
Wouldn't it be easier to write in a text file:

	variable a is at location 1235375
	variable b is at location 54623783
	variable result is at 57564

	result = a + b

So they wrote programs that could convert this into the assembly 
language - a compiler. The early languages were sometimes a mix of the 
example above and assembly language.

People then realized the compiler itself could work out where to put the 
variables in memory itself so one would be able to write

	variable a
	variable b
	variable result

	result = a + b

And high level languages were born! Famous early ones were BASIC, 
FORTRAN, PL1, C etc.

Nowadays, modern compilers translate the text file into assembly 
language and then use an assembler to translate that into machine code.

Returning to your question:

> What's the physical difference of binary and non binary format??

I hope you can now see that the difference is in what the binary 
numbers represent. In a text file the binary number 1100001 (97 in 
decimal) is interpreted by a compiler as the letter "a" - because most 
computers use the ASCII codes to map numbers to letters. Compilers take 
sequences of 
numbers, interpret them according to the rules (grammar and syntax) of 
the language they understand and convert them to another set of numbers 
representing the machine code instructions.

> 
> Thanks for the time.
> 
> -jay

Duncan

> 
> On Thu, 21 Jun 2007 18:33:32 +0200 (CEST), "Oliver Fromme"
> <olli@lurza.secnetix.de> said:
>> Mark Jayson Alvarez wrote:
>>  > Let's say I have a very simple washing machine program.
>>  > Now it has a timer which the duration of the spinning can be set.
>>  > If I press the 3-minute button, wires beneath will get shorted.
>>  > Electric current will flow into pin number 5 of the parallel cable
>>  > connected to the parallel port of my PC. Now the CPU has a pin
>>  > connected to this port. If it receives let's say 5V, it will stop
>>  > what it's doing and 
>>  > 
>>  > > fetches the
>>  > > address of an "interrupt handler routine" from memory,
>>  > > and jumps to that address (i.e. starts executing
>>  > > instructions from that address).  That handler is
>>  > > usually installed in memory by the operating system.
>>  > > The code checks which device caused the interrupt,
>>  > > and then executes the appropriate routine in the
>>  > > corresponding device driver.
>>  > 
>>  > And when exactly did the Operating system installed this interrupt
>>  > handler??
>>
>> When it boots.  The processor supports an interrupt
>> handler address table.  That's simply a list of memory
>> addresses which is itself stored in memory.  Each
>> kind of interrupt (they're numbered) has an entry
>> in that table that points to the appropriate interrupt
>> handler which has been installed by the OS upon boot.
>>
>> For example, let's say interrupt line 7 is connected
>> to the parallel port.  So when the processor receives
>> a signal on that line, it looks up the address that
>> is stored in entry #7 in the interrupt table.  Then
>> it will execute commands at that memory address, and
>> afterwards it will resume whatever was interrupted.
>>
>>  > And suppose this handler runs the driver and the 
>>  > appropriate routine inside it, how did the driver able to convert the
>>  > electric
>>  > current into a machine understandable data and was able to pass it
>>  > to a program and the program receive the data as 3 minutes?
>>
>> It depends.  If you have one interrupt per button,
>> then there's a one-to-one relationship between
>> buttons and interrupt numbers.  So if you press
>> that 3-minutes-button, let's say it's connected to
>> interrupt pin #7, so the processor will run the
>> handler that has been registered for interrupt #7.
>> That handler is specific to that interrupt and to
>> that button, so it "knows" that the 3-minutes-button
>> has been pressed when it is called (because that's
>> why it was installed for the interrupt in the first
>> place).  There is no need for the driver to "convert
>> the electric current".  The handler is called as a
>> reaction to the interrupt signal, and that reaction
>> in itself contains the information about the press
>> of the button.
>>
>> However -- normally you don't have one interrupt
>> per button, but rather one interrupt per device.
>> Having one interrupt per key (on a keyboard) would
>> be very inefficient.  Instead, there is one interrupt
>> for the whole keyboard (or for all the buttons on a
>> device).  So, any button press will cause the same
>> handler to be executed.  The device driver routine
>> knows that a button has been pressed, but it still
>> has to find out which one.  How does it do that?
>>
>> Well, in simple cases (like embedded systems in a
>> washing machine), the electrical lines from the
>> buttons are connected to I/O pins on the processor,
>> or on a separate I/O chip which is connected to the
>> actual processor.  Basically this is similar to an
>> interrupt line, in that it causes a pin to go from
>> 0 V to 5 V (or whatever voltage levels are used).
>> But the difference is that it does not cause an
>> interrupt to occur.  The processor simply ignores
>> those I/O pins during normal operation.  However,
>> the processor supports machine codes that can read
>> the current state of the I/O pins.  If the processor
>> executes such a code (i.e. a certain byte sequence),
>> it copies the current state of the I/O pins into a
>> register, where it can be dissected and examined
>> with other machine instructions.  That state is
>> usually encoded in a binary format, where each bit
>> corresponds to one I/O pin.  A single byte has 8 bits,
>> so it can contain the information of 8 such I/O pins.
>> If a pin is 0 V, the corresponding bit is 0,
>> otherwise it is 1.
>>
>>  > Driver is just a software right?
>>
>> Right.
>>
>>  > I'm sure if I can find out how electric current have been actually
>>  > converted into 1's and 0's I will not have trouble understanding
>>  > how it can be converted the other way around.
>>
>> Actually nothing needs to be converted.  "0" and "1"
>> are just interpretations of different voltage levels.
>>
>>  > It has something to do with registers right? What are this
>>  > registers looks like? A microchip that can get written using
>>  > electric current?
>>
>> Processor registers are simply small pieces of memory
>> inside the processor.  They are required for the
>> processor to perform calculations and other things,
>> because they cannot be performed directly in RAM.
>> In order to do anything with data stored in RAM, the
>> processor has to load values into registers, and
>> when it's done, the results have to be stored back
>> into RAM.
>>
>> For example, in order to add two numbers that are
>> stored in memory, the processor loads both of them
>> into two of its registers.  Once they're there, they
>> are added by the ALU (== arithmetic-logical unit,
>> part of the processor), and the result is again
>> stored in a register.  Then the contents of that
>> register are written back to main memory.
>>
>>  > What are these 1's and 0's look like anyway? How are they written in the
>>  > memory? A chemical reaction when electric current flows into the ram?
>>
>> No, it's all electrophysical, not chemical.  Well, a
>> "0" usually looks like 0 V, and a "1" looks like 5 V
>> (or 3 V or whatever).  Inside electronic components
>> such as processors, RAM, graphics and network cards
>> etc., bits are almost always represented as voltage
>> levels.
>>
>>  > Data that is written in the RAM differs the way they are written in a
>>  > hard drive or a CD right? But the truth is they are all 1's and 0's?
>>
>> Yes.  All media (RAM, flash, disks, tapes, CD, DVD
>> and even punch cards) have in common that they store
>> data as "0" or "1" in one form or another.  The
>> important property is that the media is capable of
>> having two distinct states, so one of the states is
>> assigned the "0" value and the other the "1" value.
>>
>> For example, CDs have tiny "pits" on the surface.
>> A laser beam measures the width of those pits (small
>> ones and large ones), and a DSP converts that into a
>> sequence of "0" and "1".
>>
>> On hard disks the same information is stored using
>> magnetism.  RAM (DRAM == dynamic RAM) uses tiny
>> capacitors to hold a very small electric charge that
>> represents the bit value.
>>
>> If you want to know more details about how a processor
>> access data in memory, how address bus and data bus
>> works, how a processor is built up from transistor
>> functions, I strongly recommend that you buy a good
>> beginners book of processor design.
>>
>> I remember at school we've built simple electronic
>> components ourselves:  a flipflop (that's a simple
>> 1-bit memory) from two transistors, logical gates
>> (i.e. "and", "or", "not" circuits), bit counters,
>> adders and similar things.  At that time that was
>> very enlightening for me.  I suggest you try
>> something like that, too.  You can buy electronic
>> construction and experimentation kits at toy shops.
>> Don't be afraid that they're intended for children,
>> I know quite some adults who play with things like
>> that once in a while, including myself.  :-)
>>
>> Best regards
>>    Oliver
>>
>> -- 
>> Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
>> Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
>> secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
>> chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
>>
>> FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
>>
>> "C++ is the only current language making COBOL look good."
>>         -- Bertrand Meyer




