Ethereum 101: Part 1 - A gentle introduction to Ethereum
Ethereum is an open source, globally decentralized computing infrastructure that executes programs called smart contracts. It uses a blockchain to synchronize and store the system’s state changes. It uses a cryptocurrency - ether - to meter and constrain execution resource costs.
The idea for Ethereum was originally proposed by Vitalik Buterin in a white paper published in 2014. In the white paper, Vitalik first introduces the concept of a “state” and a “state transition system”. From the white paper:
In a standard banking system, for example, the state is a balance sheet, a transaction is a request to move $X from A to B, and the state transition function reduces the value in A's account by $X and increases the value in B's account by $X. If A's account has less than $X in the first place, the state transition function returns an error.
The “state” of a system (such as a bank) is a snapshot of the bank’s ledger or balance sheet: the account balances of all the bank’s account holders. The state of the ledger can change only via a transaction, and a transaction has to follow certain rules (e.g. a user cannot transfer more funds from an account than its current balance). The “system” that takes the existing state of the ledger (let’s call it state A) and applies a transaction to produce a new state (let’s call it state B) is known as a state transition system.
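To make the banking example concrete, here is a minimal Python sketch of such a state transition function. The names (apply_transfer, TransitionError) and the balances are my own illustrative assumptions, not something taken from the white paper.

```python
class TransitionError(Exception):
    """Raised when a transaction breaks a rule of the system."""


def apply_transfer(state: dict, sender: str, recipient: str, amount: int) -> dict:
    """Apply one transfer to state A and return the resulting state B."""
    if amount <= 0:
        raise TransitionError("amount must be positive")
    if state.get(sender, 0) < amount:
        # The sender holds less than $X, so the transition returns an error.
        raise TransitionError("insufficient funds")
    new_state = dict(state)  # state A itself is left untouched
    new_state[sender] -= amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


# Hypothetical ledger: A holds 100, B holds 20.
state_a = {"A": 100, "B": 20}
state_b = apply_transfer(state_a, "A", "B", 30)
```

Returning a fresh dictionary instead of editing the old one keeps state A and state B as two separate snapshots, which mirrors the way the article talks about them.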
Vitalik then applies the above terminology to Bitcoin as follows:
From a technical standpoint, the ledger of a cryptocurrency such as Bitcoin can be thought of as a state transition system, where there is a "state" consisting of the ownership status of all existing bitcoins and a "state transition function" that takes a state and a transaction and outputs a new state which is the result.
Exhibit 1 shows how a transaction can modify Bitcoin’s state. The transaction consists of bitcoin transfers from certain accounts, each of which has been adequately signed (i.e. authorized by the account holder). The bitcoins being spent (1 bitcoin from account ‘7b53ab84’ and 2 bitcoins from account ‘3ce6f712’) have to cover the outputs: the outputs can never add up to more than the inputs.
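The same idea can be sketched in Python. This is a simplified model under my own assumptions: signatures are reduced to a set of account names that have signed, and the coin ids, the transaction id "tx-1" and the "recipient" account are hypothetical. Only the two spending accounts and their amounts come from the exhibit.

```python
def apply_bitcoin_tx(unspent: dict, tx_id: str, inputs: list, signers: set, outputs: list) -> dict:
    """Spend the input coins and create the output coins.

    unspent maps a coin id to (owner, amount); outputs is a list of (owner, amount).
    """
    if len(set(inputs)) != len(inputs):
        raise ValueError("an input is listed twice")
    total_in = 0
    for coin_id in inputs:
        if coin_id not in unspent:
            raise ValueError(f"coin {coin_id} does not exist or is already spent")
        owner, amount = unspent[coin_id]
        if owner not in signers:
            raise ValueError(f"missing signature from {owner}")
        total_in += amount
    total_out = sum(amount for _, amount in outputs)
    if total_in < total_out:
        raise ValueError("outputs exceed inputs")
    # New state: every coin that was not spent, plus the newly created coins.
    new_unspent = {k: v for k, v in unspent.items() if k not in inputs}
    for i, (owner, amount) in enumerate(outputs):
        new_unspent[f"{tx_id}:{i}"] = (owner, amount)
    return new_unspent


unspent = {"coin-1": ("7b53ab84", 1), "coin-2": ("3ce6f712", 2)}
new_unspent = apply_bitcoin_tx(
    unspent, "tx-1", ["coin-1", "coin-2"], {"7b53ab84", "3ce6f712"}, [("recipient", 3)]
)
```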
Vitalik then explains the deficiencies that exist in Bitcoin and its state transition system. One of the biggest issues is that the Bitcoin system can execute only very simple state changes (e.g. debit bitcoin from one account, credit bitcoin to another). The system is not capable of handling more complicated state changes, like introducing if-then-else statements inside the state transition function to enable more complex use cases for transferring value from one account to another. This was the primary motivation behind the creation of Ethereum.
Ethereum is a blockchain with a built-in Turing-complete programming language, allowing anyone to write smart contracts and decentralized applications where they can create their own arbitrary rules for ownership, transaction formats and state transition functions. I unpack this definition below:
“Turing-complete programming language”: This means that, in principle, any program of any complexity can be computed by Ethereum. As discussed above, Bitcoin’s “script” system can be used only for relatively simple use cases and is, therefore, not “Turing complete”.
Smart contracts: A smart contract is a program/code that is written on top of Ethereum. This code has some special properties:
It is visible to everyone. Therefore, anyone can audit the code and verify if the code actually does what it claims to do
Anyone can interact with/execute the code, provided they follow certain rules for running it. Therefore, smart contracts are permissionless: a user does not need permission from a company/government to be able to interact with a smart contract
Code in a smart contract is deterministic i.e. outcome of execution of a smart contract is the same for everyone who runs it and no one gets a preferential treatment
Contract code cannot be changed. However, it can be deleted, but only if an adequate command/function to delete the contract code is included in the code itself. If such a function is not included (which anyone can verify, since the code is visible to everyone) then users interacting with the smart contract can rest assured that the contract code will always exist
Vitalik describes smart contracts as cryptographic boxes that contain value and only unlock if certain conditions are met. I will go into more detail on smart contracts in the next article in this series.
Decentralized applications: These are also known as dapps. A dapp is a collection of smart contracts and typically a user interface that makes it easy for users to interact with the underlying smart contracts.
Therefore, Ethereum is a blockchain that allows users to write smart contracts (which is just code) that can execute state transitions that are more complicated than some of the simpler transitions that we discussed earlier (e.g. bank accounts, Bitcoin). An example of Ethereum state transition function is shown in exhibit 2.
We will cover this state transition in more detail later in this article. Before doing that, we need to understand 2 other Ethereum concepts: (1) Ethereum accounts and (2) Transactions.
There are 2 types of accounts in Ethereum:
Externally owned accounts (EOAs): These accounts are controlled by private keys and typically belong to retail users, who can create them with popular Ethereum wallets like MetaMask. A private key in Ethereum is just a random number that is greater than zero and less than 115792089237316195423570985008687907852837564279074904382605163141518161494337 (the order of the secp256k1 elliptic curve that Ethereum uses). Through some complicated math (using properties of elliptic curves), a unique public key is derived from the private key. The last 20 bytes of the Keccak-256 “hash” of the public key give the “address” of the account. Users can share this address with other users to receive ETH (or other “tokens”) from them. An example of a wallet address is: 0xbd5e4d8fcbae9198f6b2373a7cfc770dbc3f0dd3
Contract accounts: A contract account has “smart contract” code. Contract accounts also have addresses, like EOAs. However, they do not have a private key. Instead, they are owned and controlled by the logic of their code
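As a rough illustration of the first step for an EOA, the sketch below picks a random private key using only Python's standard library. It assumes that valid keys are the integers from 1 up to n - 1, where n is the order of the secp256k1 curve. The function names are my own. Deriving the public key and the address is left out on purpose, because the elliptic curve math and the Keccak-256 hash are not part of the standard library.

```python
import secrets
import string

# Order n of the secp256k1 curve: a valid private key k satisfies 1 <= k < n.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def is_valid_private_key(key: int) -> bool:
    return 1 <= key < SECP256K1_N


def generate_private_key() -> int:
    # randbelow(n - 1) returns a value in 0 .. n-2, so adding 1 gives 1 .. n-1.
    return secrets.randbelow(SECP256K1_N - 1) + 1


def looks_like_address(text: str) -> bool:
    """Shape check only: '0x' followed by 40 hex characters (20 bytes)."""
    return (
        len(text) == 42
        and text.startswith("0x")
        and all(c in string.hexdigits for c in text[2:])
    )


private_key = generate_private_key()
private_key_hex = f"{private_key:064x}"  # 32 bytes written as 64 hex characters
```

A real wallet does much more than this (key storage, backups, signing), so treat the snippet purely as a picture of what "just a random number" means.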
The properties of the above types of accounts are described below:
Transactions are the only thing that can trigger a change of state or trigger the execution of a smart contract (which in turn can change the state). Ethereum doesn’t run autonomously and everything starts with a transaction. A transaction contains the following:
Recipient: A 20-byte Ethereum address, which can belong to either an EOA or a contract account. When the transaction destination is a contract address, it causes that contract to run, using the transaction and the transaction data (described below) as inputs. You can also send ether to an address that has no corresponding private key or contract, thereby burning the ether, since nobody will ever be able to spend it.
Signature: As discussed earlier, a transaction can be triggered only by an EOA. The owner of the EOA has to sign the transaction using their private key (as only the owner of the EOA can have access to the private key) to confirm that they want to trigger the transaction (and the corresponding state change that will be implemented as part of that)
Nonce: This ensures that transactions are processed in the order intended by the account that triggered them. The nonce is an attribute of the originating address: every transaction from an originating address has a unique nonce, and transactions are added to the blockchain in increasing order of nonce. With the nonce value included in the transaction data, every transaction is unique, even when sending the same amount of ether to the same receiving address multiple times. The Ethereum network processes the transactions of an account sequentially, based on the nonce. If you transmit a transaction with nonce 0 and then transmit a transaction with nonce 2, the second transaction will not be included in any block. It will be stored in the Ethereum "mempool" while the network waits for the missing nonce (1) to appear. All nodes will assume that the transaction with the missing nonce has been delayed.
Value: Amount of ether that should be transferred from the sender to the recipient. Ether can be sent to an EOA or to a contract
Data: This is an optional field. When your transaction contains data, it is most likely addressed to a contract address. You can also send data to an EOA. However, the interpretation of the data is completely up to the wallet you use to access the EOA. The data included in the transaction can be used by the contract code.
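The nonce rule described above can be pictured with a small Python sketch. It is a toy model of a single account's pending transactions, not how any real Ethereum client is written. The function name and the string placeholders for transactions are assumptions of mine.

```python
def drain_pending(next_nonce: int, pending: dict):
    """Split pending transactions into those ready for inclusion and those still waiting.

    pending maps a nonce to a transaction. Only an unbroken run of nonces
    starting at next_nonce can be included in a block.
    """
    ready = []
    waiting = dict(pending)
    while next_nonce in waiting:
        ready.append(waiting.pop(next_nonce))
        next_nonce += 1
    return ready, next_nonce, waiting


# Nonce 0 and nonce 2 were transmitted, nonce 1 is missing.
ready, next_nonce, waiting = drain_pending(0, {0: "tx nonce 0", 2: "tx nonce 2"})
# Only nonce 0 is ready; nonce 2 waits in the mempool.

# Once the missing nonce shows up, both remaining transactions can go through in order.
waiting[1] = "tx nonce 1"
ready_later, next_nonce_later, waiting_later = drain_pending(next_nonce, waiting)
```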
Before going into the last 2 attributes of a transaction (gas limit and gas price), we first need to understand the concept behind gas. As discussed earlier, Ethereum is Turing complete, which means that any program of any complexity can be computed by Ethereum. This brings some security and resource management challenges. Ethereum can't predict whether a smart contract will terminate, or how long it will run, without actually running it. Whether by accident or on purpose, a smart contract can be created such that it runs forever when a node attempts to validate it. Attackers could use this property to attack/hijack the Ethereum system by flooding it with “complex”, compute-intensive transactions to take down the system (similar to a DDoS attack). To address this, Ethereum introduced a metering mechanism called gas. This ensures that every user has to pay an appropriate price for consuming the system’s resources and makes the cost of an attack very high for a malicious attacker. Vitalik explains the concept behind gas fees in his white paper:
In order to prevent accidental or hostile infinite loops or other computational wastage in code, each transaction is required to set a limit to how many computational steps of code execution it can use. The fundamental unit of computation is "gas"; usually, a computational step costs 1 gas, but some operations cost higher amounts of gas because they are more computationally expensive, or increase the amount of data that must be stored as part of the state. There is also a fee of 5 gas for every byte in the transaction data. The intent of the fee system is to require an attacker to pay proportionately for every resource that they consume, including computation, bandwidth and storage; hence, any transaction that leads to the network consuming a greater amount of any of these resources must have a gas fee roughly proportional to the increment.
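The metering idea in this passage can be sketched as follows. This is a toy model with names of my own choosing: a "program" is just a stream of per-step gas costs, and the limit of 1,000 is arbitrary. The real Ethereum Virtual Machine charges per instruction according to a detailed fee schedule that I do not reproduce here.

```python
import itertools


class OutOfGas(Exception):
    """Raised when execution would need more gas than the transaction allows."""


def run_metered(step_costs, gas_limit: int) -> int:
    """Consume steps one by one, stopping as soon as the gas limit would be exceeded."""
    gas_used = 0
    for cost in step_costs:
        if gas_used + cost > gas_limit:
            raise OutOfGas(f"needed more than {gas_limit} gas")
        gas_used += cost
    return gas_used


# A short program: two cheap steps and one more expensive step.
used = run_metered([1, 1, 5], gas_limit=100)

# An "infinite loop" costing 1 gas per step is cut off by the limit.
halted = False
try:
    run_metered(itertools.repeat(1), gas_limit=1_000)
except OutOfGas:
    halted = True
```

The point of the sketch is that the node never has to decide in advance whether the program terminates: it simply stops once the gas that was paid for has run out.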
Gas limit: The maximum number of units of gas the transaction originator is willing to buy/spend in order to complete the transaction. For simple transactions, like transferring ether from one EOA to another EOA, the gas needed is fixed at 21,000 gas units.
Gas price: The price that the transaction originator is willing to pay in exchange for gas. The price is measured in wei (a subunit of ether; 1 ether = 10^18 wei) per unit of gas. Wallets can adjust the gas price to achieve faster confirmation times. The minimum value that this field can be set to is zero.
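Putting the two fields together, the most a transaction can cost in fees is the gas limit multiplied by the gas price. The short sketch below works this out for a plain ether transfer. The gas price of 20 gwei is a made-up figure for illustration, and the helper names are mine.

```python
from decimal import Decimal

WEI_PER_ETHER = 10**18
WEI_PER_GWEI = 10**9  # gwei is a commonly used unit for quoting gas prices


def max_fee_wei(gas_limit: int, gas_price_wei: int) -> int:
    """Upper bound on the fee: every unit of gas up to the limit, at the offered price."""
    return gas_limit * gas_price_wei


def wei_to_ether(wei: int) -> Decimal:
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


# A simple EOA-to-EOA transfer needs 21,000 gas.
fee_wei = max_fee_wei(21_000, 20 * WEI_PER_GWEI)
fee_ether = wei_to_ether(fee_wei)
```

With these assumed numbers the fee comes to 420,000,000,000,000 wei, i.e. 0.00042 ether.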
We are now ready to understand how a transaction can be used to change the state of the Ethereum system, as shown in exhibit 2. The original state in exhibit 2 (let's refer to it as state A) consists of 4 accounts:
2 x EOA accounts:
Account 14c5f8ba: With a balance of 1024 ether
Account 4096ad65: With a balance of 77 ether
2 x contract accounts:
Account bb75a980: Balance of 5202 eth and some contract code. The contract account also has some data represented as [0, 235235, 0, ALICE, …]. This data storage format is referred to as an array. For simplicity, an array can be thought of as a list/collection of data.
Account 892bf92f: Balance of 0 eth and some contract code
Let’s now look at the transaction in exhibit 2. The transaction originates from account 14c5f8ba. This is expected, as only an EOA can trigger transactions and account 14c5f8ba is an EOA. The recipient of this transaction is bb75a980, which is a contract account. This implies that the transaction can trigger the contract code. The “value” in the transaction is 10, i.e. 10 eth is being transferred from account 14c5f8ba to account bb75a980. If we look at the future state in the diagram, we can verify that the balance of account 14c5f8ba has been reduced by 10 eth, from 1024 eth to 1014 eth, and that the balance of the contract account bb75a980 has increased from 5202 eth to 5212 eth due to this value transfer.
We also have some data that consists of: 2, Charlie. The data in the transaction is organized as an array, which is a type of data structure (i.e. a way to organize data). The contract code refers to this array as “tx.data[x]”. The ‘x’ inside the square brackets refers to the position (index) of an item inside the array. For this transaction, the data is represented as follows:
tx.data = [2, Charlie]
The length of this array is 2, as there are 2 data points inside it. An "index", counted from 0, can be used to pick out a specific item inside the array:
tx.data[0] = 2, tx.data[1] = Charlie
Now let's look at the code included inside the contract account:
if !contract.storage[tx.data[0]]: contract.storage[tx.data[0]] = tx.data[1]
This contract is referring to 2 arrays:
tx.data array, which consists of the data that is being included in the transaction
contract.storage array, which consists of the data that already is present in the contract account in state A
contract.storage = [0, 235235, 0, ALICE, ….]
contract.storage[0] = 0, contract.storage[1] = 235235, contract.storage[2] = 0, contract.storage[3] = ALICE
The contract code runs in the following manner:
Check the value of contract.storage[tx.data[0]] = contract.storage[2] = 0
If the above is 0 then execute: contract.storage[tx.data[0]] = tx.data[1]. Since the value in step 1 is 0, step 2 gets executed as follows:
contract.storage[tx.data[0]] = tx.data[1] = Charlie
contract.storage[2] = Charlie
Therefore, the contract’s storage array will be updated because of the transaction as shown in the future state in the diagram.
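The whole walk-through can be replayed in Python. This is a simplified sketch under several assumptions of mine: gas, signatures and nonces are ignored, accounts are plain dictionaries, only the four storage entries visible in the exhibit are modelled, and the contract code is taken to use the first data item as the storage index and the second as the value to store. The function names are hypothetical.

```python
from copy import deepcopy


def register(storage: list, tx_data: list) -> None:
    """Write tx_data[1] into the slot tx_data[0], but only if that slot is still empty (0)."""
    if not storage[tx_data[0]]:
        storage[tx_data[0]] = tx_data[1]


def do_nothing(storage: list, tx_data: list) -> None:
    """Stand-in for the second contract's code, which the exhibit does not describe."""


def apply_transaction(state: dict, sender: str, recipient: str, value: int, data: list) -> dict:
    """Return the future state produced by one transaction (gas is ignored in this sketch)."""
    new_state = deepcopy(state)  # leave state A untouched
    if new_state[sender]["balance"] < value:
        raise ValueError("insufficient balance")
    new_state[sender]["balance"] -= value
    new_state[recipient]["balance"] += value
    code = new_state[recipient].get("code")
    if code is not None:  # the recipient is a contract account, so its code runs
        code(new_state[recipient]["storage"], data)
    return new_state


state_a = {
    "14c5f8ba": {"balance": 1024},
    "4096ad65": {"balance": 77},
    "bb75a980": {"balance": 5202, "code": register, "storage": [0, 235235, 0, "ALICE"]},
    "892bf92f": {"balance": 0, "code": do_nothing, "storage": []},
}
state_b = apply_transaction(state_a, "14c5f8ba", "bb75a980", 10, [2, "Charlie"])
```

Sending the same transaction again would move another 10 eth but leave the storage alone, because slot 2 is no longer empty.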
Ethereum offers more flexibility than Bitcoin as it is Turing complete and allows highly complex state transition functions
Ethereum consists of 2 types of accounts: EOA (Externally Owned Account) and Contract accounts
EOAs are accounts that are typically controlled by retail users (using their private key, which is just a random number less than a very large number)
Contract accounts contain code known as a smart contract. Smart contracts have some special properties: they are permissionless, deterministic, permanent, etc.
State can be changed only via a transaction that is triggered by an EOA
A transaction can transfer ETH from one account to another
A transaction can trigger contract code and cause it to take some action (e.g. updating the values that are stored in an account)