Computers have become ubiquitous, yet we take the process of computing for granted. Many people still cannot answer "how does a computer work?" even though we interact with many computers daily. Some of us may have an idea that computers operate using the binary system and that computer hardware consists of "chips" and "memory", but we may not have a deep understanding of how these components actually work and are able to process and store large amounts of data. In this article, I will try to provide a detailed explanation of how computer hardware is able to process and store information. I will cover the following:
Explain what semiconductors are and why they are suitable for use in computers
Explain how semiconductors can be used to build transistors, which can in turn be combined to create things known as "logic gates"
Explain how transistors (and other electrical components) can be combined to create computer components like a CPU, memory, a clock etc. and how these components work together to achieve the computing processes that we take for granted
A lot of what follows has been extensively influenced by the following books:
I highly recommend both of them.
What are semiconductors?
Semiconductors are materials with electrical conductivity between that of a conductor (metals like Copper, Aluminium) and insulators (rubber, wood, plastic). This partial conductivity (i.e. ability to selectively conduct electricity in only certain states) is a very useful property. Computer processing is based on binary representation of data and the partial conductivity makes semiconductors an ideal candidate for the chips that power modern computers.
Silicon is the best-known semiconductor. Silicon has an atomic number of 14 (i.e. an atom of silicon has 14 electrons). These 14 electrons are distributed among 3 orbits (around the nucleus of the silicon atom) containing 2, 8 and 4 electrons respectively (refer to exhibit 3). Typically, the outermost orbit is the most "interesting" orbit for an atom and is responsible for most of its chemical properties. This is true in the case of silicon as well, and the presence of 4 electrons in its outermost orbit (also known as the valence shell) is responsible for the semiconductor properties that silicon exhibits.
This property is not unique to silicon and there are other elements that have a similar atomic structure. For example, germanium is right below silicon in the periodic table, has 4 valence electrons too and exhibits semiconductor properties as well. However, silicon is the most popular semiconductor material as it is cheap and easily available (the most common constituent of sand is silica, i.e. silicon dioxide, which is the main source of industrial silicon).
Why are semiconductors important for the computer industry?
How exactly does the presence of 4 valence electrons make silicon a good semiconductor? Pure silicon (used in the chip industry) is manufactured as a giant crystal, and we will need to do a deep dive into the crystal structure of silicon to understand the source of its semiconducting properties. In a silicon crystal, each silicon atom is surrounded by 4 other silicon atoms. Thus, each pair of electrons (one electron from the valence shell of each of 2 neighboring silicon atoms) forms a covalent bond (refer to exhibit 4).
Electricity is essentially a flow of electrons. Metals have 1–2 electrons in their valence shell that can easily be enticed to move to neighboring atoms in the metal's lattice (when an electric potential or voltage is applied). As these electrons flow from one metal atom to the next, the metal conducts electricity. Since all the electrons in silicon's valence shell are paired and not available, pure silicon does not conduct electricity. Therefore, impurities are added to pure silicon in a process known as doping to make it conduct electricity, and the resultant silicon is known as "doped silicon".
2 chemical elements are commonly used for doping — boron and phosphorus. Boron has an atomic number of 5 and has 3 electrons in its valence shell. Therefore, some boron atoms "steal" an electron from a silicon atom in the silicon lattice. This creates an absence of electrons within the silicon crystal; the absence of an electron is represented by a "hole", i.e. an empty space that should have been occupied by an electron. Since the lattice site has "lost" an electron to boron, the hole behaves like a positive charge carrier. Hence, this type of doping results in a p-type semiconductor (p stands for positive).
The doped semiconductor is able to conduct electricity because the “hole” is able to move from one silicon atom to another silicon atom in the lattice, thereby, conducting electricity from one “end” to the other “end” of the crystal (movement of a hole can be considered the opposite of movement of an electron — in either case, there is a flow i.e. an electric current).
The opposite of this effect happens when silicon is doped with phosphorus. Phosphorus has an atomic number of 15 (electronic configuration of 2, 8 and 5). It thus donates an electron to the silicon crystal, thereby leaving extra electrons in the lattice. Since the mobile charge carriers are now these (negatively charged) electrons, this type of doping results in an n-type semiconductor (n stands for negative). It is important to note that both p-type and n-type semiconductors conduct electricity, as both of them have free charge carriers (either electrons or "holes") that can move through the crystal.
Let's summarize where we are right now: pure silicon does not conduct electricity. Addition of impurities (doping) results in a p-type or n-type semiconductor, both of which conduct electricity. However, a p-type or n-type semiconductor will conduct electricity all the time. As discussed earlier, we are looking for something that can conduct electricity sometimes and not conduct electricity at other times. This takes us to the next step — the creation of a semiconductor junction.
From semiconductors to transistors:
A junction is simply the boundary between 2 different types of semiconductor. For example, when we combine an n-type semiconductor with a p-type, we get an n-p junction. The free electrons in the n-type migrate towards the p-type (as the p-type has "holes" that these electrons can fill). As the electron concentration at the n-p junction increases, the junction repels additional electrons from moving towards it. 2 such junctions can be combined (known as an n-p-n junction) to produce the bedrock of modern computing: the transistor.
Put simply, a transistor is just a switch (albeit with certain properties that make it suitable for use in electronic circuitry). An external voltage applied to a transistor can control the output current. For example, an n-p-n transistor looks like the following:
Essentially, when a positive voltage is applied to the "base", the transistor will be in the "on" state and current will flow from the collector to the emitter. Similarly, when no voltage is applied to the base, the transistor will be in the "off" state and no current will flow. Therefore, the n-p-n transistor works like a light switch. The following video summarizes this operation as well:
Switches (made of transistors) can be combined to form interesting structures or circuits known as logic gates. A logic gate is a collection of switches that can perform a logical function. For example, consider an AND logic gate. An AND logic gate has 2 inputs: A and B. If both of them have a voltage (i.e. have a state of 1), then the AND gate will return an output of 1. If even one of them has zero voltage (i.e. has a state of 0), then the AND gate will return an output of 0.
As shown below (exhibit 7), 2 n-p-n transistors can be combined to form an AND gate. In the diagram below, the circuit will be complete only when both the A and B inputs have a voltage. When the circuit is complete, a current will flow through it and a wire drawn from the circuit (labelled A*B, the output of the AND gate) will carry a voltage as well. Other logic gates (like OR, NAND etc.) can be built from transistors as well.
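To make this concrete, here is a small Python sketch (my own illustration, not a circuit-accurate model) that treats each n-p-n transistor as an ideal switch and wires two of them in series to get an AND gate:

```python
def npn_switch(base):
    """Idealized n-p-n transistor: current flows from collector to
    emitter (returns 1) only when the base voltage is high."""
    return 1 if base else 0

def and_gate(a, b):
    # Two transistors in series: the circuit is complete (output 1)
    # only when both switches are on.
    return npn_switch(a) & npn_switch(b)

for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b} -> A*B={and_gate(a, b)}")
```

Wiring the two switches in parallel instead of in series would give an OR gate, which is why the physical layout of the transistors determines the logical function.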
Time for a quick summary again: we saw how semiconductors can be doped to make them conduct. These doped semiconductors can then be combined to build a transistor — which is nothing but a glorified electric switch. These transistors can then be combined to form logic gates, which are simple electrical circuits that can perform simple logical functions. Modern processors and chips then combine billions of these transistors (and, by extension, logic gates) into a computer. It might be difficult to believe, but a simple switch is responsible for all the "magic" that computers can perform. In the next article in this series, I will try to explain how combinations of logic gates can lead to well-known computing operations like CPU processing and memory storage.
How are semiconductors manufactured?
Since semiconductor manufacturing is CAPEX heavy, firms employ different manufacturing models. Typically, there are 3 models prevalent in the industry:
Integrated manufacturers: These firms design as well as manufacture their own chips. The factories for manufacturing the chips are known as “fabs”. E.g. Intel, Samsung
Fabless manufacturers: These firms design the chips but then outsource the manufacturing to a foundry or the manufacturing unit of integrated manufacturers. E.g. Qualcomm, Nvidia
Foundries: These firms manufacture chips for branded players. E.g. TSMC is the largest foundry firm in the world but does not show up on the list above. This is because they manufacture chips on behalf of other firms. Hence, the sales of TSMC would be included under the sales of the firms listed above (especially fabless firms like Qualcomm and Nvidia, as they do not manufacture any chips in-house).
Silicon is obtained from silica (which is the major constituent of sand). The silica is melted and then a silicon seed is dipped into the mixture and slowly pulled out to obtain a cylinder of pure silicon.
The diameter of the silicon cylinder is a key metric that is tracked in the semiconductor industry with silicon cylinders available in diameters like 200 mm and 300 mm.
The industry is moving towards larger diameter cylinders as shown in exhibit 10 (due to the cost advantage — more small chips can be fit onto a disc cut from a wider cylinder and handling costs are spread over more chips).
Once the cylinder is obtained, it is sliced into thin discs. The chip design is then printed on the disc using a process known as photolithography. In this process, light is passed through a "stencil" (which contains the chip design) to etch or "burn in" the circuit design on the disc. A disc is 200–300 mm in diameter and 1000s of chips will be etched onto it at any given time. A key metric in photolithography is the wavelength of the light used for etching the design. The shorter the wavelength, the narrower the "lines" that can be etched on the device.
The width of these lines is another key metric and is known as the semiconductor geometry or the transistor size. The semiconductor industry has been innovating to reduce the geometry over the last decades, with 14 nm as the state-of-the-art geometry currently. Smaller geometries have various advantages: (1) faster chips, as electrons have to cover less distance from one point to another, (2) cheaper devices, as more transistors can be fit into the same area, i.e. more transistors can be built from the same amount of material, and (3) lower power consumption, as the transistors that have to be powered are smaller as well.
How are transistors used to manufacture memory?
Memory is needed to store data. However, just a data store is not enough. A computer also needs a system to retrieve the stored data and, therefore, requires every data storage unit to have an address. In this section, I will be talking about primary computer memory or Random Access Memory (RAM). Most modern computers have RAM that is measured in GBs. For example, iPhone 11 models have 4GB of RAM.
The following diagram (exhibit 1) shows how gates can be combined to store a piece of information.
Firstly, the 4 gates (numbered 1, 2, 3 and 4) shown in the above diagram are NAND gates. A NAND gate takes the output of an AND gate and reverses it. Therefore, a NAND gate will output 0 if and only if both of its inputs are 1. It will return a 1 in all other cases.
In the diagram, "i" is the input bit, i.e. the bit that we want to store in the system. A bit is a unit of information. Therefore, when we say that we can store a bit, it implies that we can store information, because information can be encoded in bits. For example, the lowercase letter 'm' can be represented by the bits 01101101 (in ASCII). "s" can be considered a "set" bit, which controls the above circuit and determines whether the circuit should store the "i" bit. When s = 1, we want the circuit to store whatever is being inputted via "i", and when s = 0, we want the circuit to ignore whatever is being inputted via "i". "O" is the output, i.e. the bit that the above system stores/remembers.
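The ASCII encoding mentioned above can be checked in a couple of lines of Python:

```python
# Every ASCII character maps to a number that fits in 8 bits.
# 'm' has ASCII code 109; format it as an 8-digit binary string.
bits = format(ord('m'), '08b')
print(bits)  # 01101101
```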
Let us run through some cycles to see how this combination of gates is able to remember information. Let's start by making s = 1 (i.e. we want to store the number being inputted via i) and i = 1. We can see that doing this changes the output "o" to 1 as well (i.e. the same as the input). We then change s to 0 and i to 0 as well. However, we notice that "o" does not change. Therefore, the circuit remembered that in the previous cycle it was asked to store the bit 1. In step 3, we change "s" to 1 and keep "i" at zero, implying that we want to store the bit 0 in the memory, and we are able to achieve that. Note that bit 0 does not mean that there is no information. In step 4, we change "s" to zero and change "i" to 1, and we see that the circuit again remembers the digit that it stored in step 3.
We can continue repeating this cycle, but you can trust me when I say that the above circuit will store "i" when "s" is 1 and will remember the previous input "i" when "s" is zero. It may be easy to overlook this, but I want to stress the fact that through a simple combination of a few logic gates (in fact just one type of gate — the NAND gate), we were able to achieve the seemingly difficult task of storing/remembering information.
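If you would rather not trace the wires by hand, the following Python sketch models the four NAND gates and replays the cycles described above. The gate numbering and wiring follow my reading of exhibit 1 (gates 1 and 2 gate the input, gates 3 and 4 are cross-coupled and hold the bit), so treat it as an illustration rather than the exact circuit:

```python
def nand(a, b):
    return 0 if (a and b) else 1

class OneBitMemory:
    def __init__(self):
        self.o = 0        # output of gate 3 (the stored bit)
        self.oc = 1       # output of gate 4 (its complement)

    def cycle(self, i, s):
        g1 = nand(i, s)   # gate 1
        g2 = nand(g1, s)  # gate 2
        for _ in range(3):                 # let the feedback loop settle
            self.o = nand(g1, self.oc)     # gate 3
            self.oc = nand(g2, self.o)     # gate 4
        return self.o

m = OneBitMemory()
print(m.cycle(i=1, s=1))  # 1: stores the input
print(m.cycle(i=0, s=0))  # 1: remembers the previous bit
print(m.cycle(i=0, s=1))  # 0: stores the new input
print(m.cycle(i=1, s=0))  # 0: still remembers 0
```

The four printed values reproduce the four steps walked through above.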
This collection of 4 NAND gates can remember and store only one bit of information. Storing one letter of the alphabet takes about 8 bits (in ASCII). Therefore, this circuit is not even enough to store a single letter and is a far cry from our modern memory needs. We can address this shortcoming by adding more gates. For example, if we increase the total number of NAND gates to 32, we will be able to store 8 bits of information, which is enough to store any ASCII character.
As mentioned earlier, storing/remembering is only one task of memory. There is no point in storing stuff if we cannot retrieve it. However, before diving deep into information retrieval, it is important to understand another concept: Register. A register is a combination of a memory storage circuit and another component known as an enabler. There are typically 3 inputs into a register: input bits, set and enable. The input bits, “i”, are just bits or information flowing into the register from other parts of the computer.
Typically, the input bits will come in a collection of 8 wires or 8 bits, as that makes a byte. When "s" is 1, the register stores that information (using the memory circuit that we discussed above). The register shown below has 8 input bits coming in and, therefore, needs 32 NAND gates to store all of them. When "e" is 1, the register outputs whatever value is stored in it onto the output wires, "o".
What is the use of a register? Think of a register as a temporary scratch pad/sticky note. You can type something on the sticky note and then decide to leave it in the pad or tear the note and pass it along. Therefore, registers play an important role in serving as temporary storage location and moving information around the computer.
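A register can be sketched in a few lines of Python. This is a simplification of the circuit above: the byte is held by 32 NAND gates in the real register, and `None` here stands in for a disconnected output wire when the enabler is off:

```python
class Register:
    """8-bit register: latches the input byte when s=1 and drives
    the stored byte onto the output wires only when e=1."""
    def __init__(self):
        self.stored = [0] * 8

    def cycle(self, i=None, s=0, e=0):
        if s:                        # "set": store the input bits
            self.stored = list(i)
        if e:                        # "enable": output the stored bits
            return list(self.stored)
        return None                  # output disconnected

r = Register()
r.cycle([0, 1, 1, 0, 1, 1, 0, 1], s=1)   # write a byte; output stays off
print(r.cycle(e=1))                      # read it back later
```

The separation of "set" and "enable" is what lets a register either keep its note to itself or pass it along, exactly like the sticky-note analogy.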
Let’s do a deep dive into the memory retrieval system now. Look at the following circuit (exhibit 4):
The above combination is known as a decoder and it consists of 2 types of gates. The triangular-shaped gate with a circle at its tip is known as a NOT gate. This gate reverses the input fed into it. As shown in the diagram, when the input is off, the output of a NOT gate is on, and vice versa. The other type of gate in the circuit is the AND gate. An AND gate outputs 1 if and only if all of its inputs are 1. It returns a zero for every other combination. The truth table (a table that lays out the output for all possible combinations of the inputs) for the above circuit is shown below:
Therefore, a decoder is able to take various combinations of 2 inputs and will switch on one (and only one) of 4 possible outputs. To generalize this, a decoder can take ’n’ inputs and will switch on one (and only one) of 2^n possible outputs.
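The decoder can be written directly from its gates; the sketch below generalizes to n inputs, with each output being an AND of the inputs or their NOTs:

```python
def decode(*inputs):
    """n-input decoder built from NOT and AND gates: exactly one of
    the 2**n outputs is 1 for any combination of inputs."""
    n = len(inputs)
    outputs = []
    for sel in range(2 ** n):
        # AND together each input (or its NOT) per the bit pattern of sel
        term = 1
        for pos, a in enumerate(inputs):
            bit = (sel >> (n - 1 - pos)) & 1
            term &= a if bit else (1 - a)
        outputs.append(term)
    return outputs

print(decode(0, 1))     # [0, 1, 0, 0]: only output 1 is on
print(decode(1, 0, 1))  # 8 outputs, only output 5 is on
```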
What value can such a circuit have? Think of the decoder as fulfilling the job of a coordinate system. For example, in the Cartesian coordinate system, an x and y coordinate pair is able to identify a point in the plane. Similarly, a decoder can take in the a and b inputs (which can encode the address of a memory location) and the output can be used to select the memory location (that the address points to).
We have now covered all the components needed to build a memory, so let's look at the complete memory circuit:
There is a lot happening in the above, so let's take it one step at a time. There is a register in the top left-hand corner of the circuit. This is a special type of register known as the memory address register. This register stores the "address" of a memory location when the "sa" input is on. The address is just a combination of bits like 10011001. This memory address register uses 8 bits, i.e. it can reference 2⁸ = 256 different locations. This memory address register is connected to 2 decoders.
Each of these decoders takes in 4 inputs (i.e. the memory address is broken into 2 parts and each part is fed into a decoder) and switches on one of the 16 wires that come out of it. Therefore, for every unique memory address, only one horizontal wire and only one vertical wire is switched on. Therefore, the combination of the memory address register and the 2 decoders can select one of the 256 available locations or intersections in the above diagram. Now let's zoom into an intersection point. Every intersection in the above matrix has the following circuit:
As we discussed, once a memory location has been selected then both the vertical and horizontal wires are on. Therefore, the AND gate ‘x’ has an output of 1 as well. Every other grid location will also have a similar circuit but for all those locations the horizontal and the vertical wires will be off (due to the property of the decoder, which selects one and only one output). Therefore, the AND gate ‘x’ for those locations will be switched off and the entire circuit will be passive.
Now let’s look at the bottom of the circuit. There are 3 sets of wires: a set wire, an enable wire and something known as the bus. We will do a deep dive into the bus afterwards but think of the bus as a set of rails/highway on which the information moves around the computer. Wires ‘s’ and ‘e’ are control wires.
The rectangle labelled R is yet another register. It consists of the memory circuit shown in exhibit 1 and is the place where the information is actually stored. The 3 sets of wires at the bottom of the circuit all connect into the rectangle/register R. When “s” is on then whatever is on the “bus” is fed into the register R and is stored in it i.e. information gets “written” into the memory. When “e” is on then whatever is in the register is put onto the bus and is sent to a different part of the computer i.e. information gets “read” from the memory.
Therefore, let’s think through how the memory unit works. First, something (we will see what this is in the next section) sends an address to the memory address register. This address points to a memory location that has data of interest. This address feeds 2 decoders and the combination of this is able to select one of the 256 memory locations that are available.
Each of these memory locations has the capacity to either (1) store 8 bits of information (when s is on) that is sent to it from some other part of the computer via the bus (or the computer information highway) OR (2) send the 8 bits already stored at the memory location to another part of the computer (by putting it on the bus or the information highway) when the enable wire is on. This is in essence what a computer memory is. Information can be systematically written into it or read from it. Therefore, this combination serves as an information storage and retrieval system.
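Putting the pieces together, here is a sketch of the whole 256-byte memory in Python. The two 4-to-16 decoders are modelled by splitting the 8-bit address into a row half and a column half; this is my simplification of the circuit, not a gate-level model:

```python
class Memory256:
    """256 locations x 8 bits, addressed via a memory address register."""
    def __init__(self):
        self.mar = 0                               # memory address register
        self.cells = [[0] * 8 for _ in range(256)]

    def set_address(self, address):
        self.mar = address & 0xFF                  # 8-bit address

    def cycle(self, bus=None, s=0, e=0):
        row, col = self.mar >> 4, self.mar & 0x0F  # the two decoder halves
        cell = self.cells[row * 16 + col]          # selected intersection
        if s:                                      # write: bus -> register R
            cell[:] = bus
        if e:                                      # read: register R -> bus
            return list(cell)
        return None

mem = Memory256()
mem.set_address(0b10011001)
mem.cycle(bus=[0, 1, 1, 0, 1, 1, 0, 1], s=1)       # write a byte
print(mem.cycle(e=1))                              # read it back
```

Note how the interface has exactly the pieces described above: an address, a bus, and the "s" and "e" control wires.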
The memory that we saw has a capacity of 256 locations x 8 bits/location (= 1 byte/location) = 256 bytes = 0.256 kilobytes. This is 0.0000064% of 4GB, which is the RAM available in an iPhone 11 Pro. This is where the size of a transistor (which makes up the gates) becomes important. A computer chip consists of billions of transistors and, through sheer scale, it is able to achieve the typical memory sizes that we have become used to.
We have covered a lot of ground. However, till now we have focused on only storing information and retrieving it. We have not really covered the “compute” part of the computer. We also saw that something is constantly communicating with the memory and sending it memory addresses, control instructions (e.g. whether information needs to be read or written) as well as the data that needs to be written/read. This something is the CPU and we will discuss that next.
How are transistors used to manufacture a CPU?
A CPU is responsible for processing all the data that is stored in a computer. The output of this processing can be more data that can be stored. We may think that computers do all sorts of complicated stuff, but it is ultimately all simple math operations. And almost all math reduces to the most basic of operations — addition, subtraction, multiplication, division and some logical calculations, e.g. greater-than and less-than comparisons.
In this section, I will focus on showing how combinations of gates can result in addition. I won’t be showing the circuits for other operations but I hope showing a circuit for addition will help explain the general concept that a seemingly simple combination of gates can achieve compute operations (in addition to storing information that we saw in the previous section).
We will start with the simple circuit shown below. This is known as a half-adder.
This circuit can add 2 binary digits, a and b. It consists of 2 gates: an AND gate and an XOR gate. We have already discussed the AND gate. The XOR gate is known as an exclusive OR gate. The XOR gate gives an output of 1 if and only if exactly one of its inputs is 1. In every other case, it returns an output of zero.
It may appear that things are getting complicated here as we are introducing a new type of gate. However, an XOR gate can be constructed from the following combination of 4 NAND gates:
Therefore, an XOR gate can be thought of as a simpler way of representing a collection of 4 NAND gates that are connected as shown as above. Now let’s look at the truth table for a half-adder:
If we look carefully at the above, we will notice that the truth table represents what we would expect when adding the 2 input bits. For example, adding 2 binary digits that are both zero results in zero as well. If exactly one of the inputs is one, then the sum is one as well. If both inputs are one, then the binary sum is 10 (because that is the next number in binary, as the digit 2 is not allowed). However, each wire in our circuit can carry only 1 bit (and not the 2 bits needed for a number like 10). Therefore, the '0' digit of binary 10 is represented by "sum" and the '1' digit of binary 10 is represented by "carry".
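Both constructions can be checked in a few lines of Python; the XOR below is built from the 4-NAND combination described earlier, and the half-adder combines it with an AND gate:

```python
def nand(a, b):
    return 0 if (a and b) else 1

def xor(a, b):
    # The 4-NAND construction of XOR
    n1 = nand(a, b)
    return nand(nand(a, n1), nand(b, n1))

def half_adder(a, b):
    return xor(a, b), a & b          # (sum, carry)

for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(f"{a} + {b} -> carry {c}, sum {s}")
```

The printed rows reproduce the half-adder truth table above.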
In a generic addition operation, we should assume that the carry from a previous operation may also be coming in and this is incorporated in the full adder circuit shown below:
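A full adder chains two half-adders (plus an OR for the carries), and 8 full adders in a row — a so-called ripple-carry adder — can add 2 whole bytes. A quick Python sketch of the idea (using bitwise operators in place of the gates already shown):

```python
def full_adder(a, b, carry_in):
    s1, c1 = (a ^ b), (a & b)                  # first half-adder
    s2, c2 = (s1 ^ carry_in), (s1 & carry_in)  # second half-adder
    return s2, c1 | c2                         # (sum, carry out)

def add8(x, y):
    """Add two 8-bit numbers one bit at a time, rippling the carry."""
    result, carry = 0, 0
    for i in range(8):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result                              # the final carry out is dropped

print(add8(109, 42))   # 151
print(add8(255, 1))    # 0: the 8-bit result wraps around
```

The wrap-around in the second example is exactly the overflow behavior of fixed-width binary arithmetic.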
Before we move on, let's walk through a typical addition cycle. For an addition, a CPU will first need to know what the inputs 'a' and 'b' are. This data would be stored in the memory. Therefore, the CPU will first send the memory address of input 'a' to the memory and read that value (by putting the value on the bus). The CPU can temporarily store the value of 'a' in a register inside the CPU. It will then send the memory address of input 'b' to the memory and do the same for 'b'. The CPU will then enable the registers containing 'a' and 'b' so that they flow into the full adder circuit shown above to generate the output of the addition.
Typically, this value would need to be stored in the memory as well. Therefore, the CPU will send the output of the addition with the desired memory location (at which the output should be stored) to the memory so that the value can be “set” into that memory location.
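The addresses below are made up for illustration, but the sequence of reads and writes mirrors the cycle just described (with a plain Python dict standing in for the memory unit):

```python
memory = {0x10: 7, 0x11: 35}  # hypothetical operand addresses and values

# 1. CPU sends address 0x10; memory enables that location onto the bus
a = memory[0x10]              # value lands in a CPU register
# 2. Same again for the second operand
b = memory[0x11]
# 3. Both registers are enabled into the adder circuit
result = a + b
# 4. CPU sends the result and a destination address; memory "sets" it
memory[0x12] = result
print(memory[0x12])           # 42
```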
Everything that I mentioned above is a simplistic explanation of what is happening inside the computer for a simple addition operation. Computers are able to achieve complicated tasks because millions of such steps are happening every second. But typically, instructions and information are constantly being exchanged between the CPU and the memory.
This sets the stage for the final component required for our discussion. As we saw above, information is constantly moving around in a computer and it is very important to coordinate/control the timing of this information exchange. This is done by the computer's clock, which is responsible for orchestrating all these actions. A common analogy is that of an orchestra conductor. The operations inside a computer are similar to those of a symphony and everything has to be timed perfectly for optimum performance. And this orchestration is done by the clock.
How are transistors used to manufacture a clock?
A clock can also be created using gates (although the implementation may be different in modern computers). For example, look at the following circuit:
This circuit will keep flip-flopping between 0 and 1 as the output of the NOT gate is fed back into it as an input. However, there will be some lag before the output of the clock can change, as the electricity takes some time to travel back and forth. This lag can be increased by increasing the length of the wire, as shown below:
The key point is that a clock can be made via gates and the output of this is a signal that oscillates between 0 and 1 with a frequency (based on the lag — if we increase the wire length then the lag will increase and this will decrease the frequency) that can be controlled.
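With made-up numbers for the delays (real gate and wire delays depend entirely on the hardware), the relationship between lag and frequency looks like this:

```python
# Illustrative values only; not real hardware figures.
gate_delay_ns = 1.0   # time for the NOT gate to flip its output
wire_delay_ns = 0.5   # extra lag from the feedback wire
# One full 0 -> 1 -> 0 cycle takes two trips around the loop.
period_ns = 2 * (gate_delay_ns + wire_delay_ns)
frequency_ghz = 1 / period_ns
print(round(frequency_ghz, 3))  # 0.333; a longer wire -> lower frequency
```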
A clock's speed/frequency is typically measured in GHz, which is a simple measure of the oscillations or cycles that the clock completes in one second. For example, the iPhone 11 Pro has 2 high-powered Lightning cores with a clock speed of 2.66 GHz and 4 low-power Thunder cores that operate at 1.82 GHz. We will not go into cores in this article, but I wanted to emphasize that modern CPUs have a clock speed of a few GHz.
Typically the higher the clock speed the faster a CPU would be as the CPU will be able to execute commands faster (and, thus, complete more commands in a second). However, just increasing the clock speed of your CPU may not help you in improving your computer’s performance as the bottleneck/constraint may be something else (for e.g. if the CPU has to wait for memory to respond back with the requested data).
I am going to stop here! We have discussed most of the core components that are needed for operating a computer. If there is only one thing that you take away from this article then it should be this: complex computer functions like storing information or computing data can be generated by simple combinations of logic gates, which are in turn combinations of transistors and other simple electrical components.
We did not discuss a number of other parts of a modern computer. For example, what role does a hard disk play? How do the display, trackpad and keyboard fit into all of this? While these devices are definitely important, most of them are peripherals and are controlled by the components that we discussed in this article. For example, a keyboard ultimately generates a set of binary data/information that is sent to the CPU, and the CPU decides what to do with that data and what computation it should perform (if at all). Similarly, a CPU can decide that the information it just generated should be sent to the display and shown on the computer screen. The information that you see on the screen is again just a collection of bits/information that the display receives from the CPU and then renders in a way that can be easily read by humans.
Everything that I have discussed in this article has been heavily influenced by the book "But How Do It Know?" by J Clark Scott. A video explanation of how the parts that we discussed work together can be found below: