All posts by Stefan Beyer

Blockchain — A Definition for Software Engineers

Blockchain has become a buzzword that grew out of Bitcoin and subsequent cryptocurrencies. The original Bitcoin paper did not use the term blockchain, and it took some time for the term to emerge as a name for the underlying technology that permits implementing digital currencies and other applications.

Consequently, and also due to the decentralised nature of blockchain communities, there is no official definition of the term. Many definitions out there are aimed at the investment community or the general public, which is fine, but means they lack technical depth. Other definitions merely reflect the author’s lack of understanding. Those with a background in Computer Science or Software Engineering who would like to understand the underlying technologies need a more precise definition.

Therefore, here is a definition aimed at software engineers and computer scientists:

A blockchain is a linked list data structure, implementing an infinite state engine with immutable state transitions, on top of a peer-to-peer network with a byzantine failure model.

Let’s take a look at this definition one concept at a time.

Linked list data structure

The data held in a blockchain is represented as a list of blocks, with each block linking back to the previous one. The following diagram is from the original Bitcoin paper and shows how data (transactions) is organised in blocks.


Bitcoin block structure

Blocks are linked by including the cryptographic hash of the previous block. All blockchains have a similar way of organising data. However, just defining a blockchain as a linked list of blocks is not enough. Some additional properties are required.
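To make the linking concrete, here is a minimal sketch of a hash-linked list of blocks in Javascript, using Node.js's built-in crypto module. The block layout and the function names (hashBlock, appendBlock, isValidChain) are assumptions made for illustration and do not correspond to Bitcoin's actual block format.

```javascript
const crypto = require("crypto");

// Hash a block's data together with the hash of its predecessor.
function hashBlock(prevHash, data) {
  return crypto
    .createHash("sha256")
    .update(prevHash + JSON.stringify(data))
    .digest("hex");
}

// Append a block that links back to the current tip of the chain.
function appendBlock(chain, data) {
  const prevHash =
    chain.length === 0 ? "0".repeat(64) : chain[chain.length - 1].hash;
  return [...chain, { prevHash, data, hash: hashBlock(prevHash, data) }];
}

// Recompute every link and every block hash.
function isValidChain(chain) {
  return chain.every((block, i) => {
    const expectedPrev = i === 0 ? "0".repeat(64) : chain[i - 1].hash;
    return (
      block.prevHash === expectedPrev &&
      block.hash === hashBlock(block.prevHash, block.data)
    );
  });
}

let chain = [];
chain = appendBlock(chain, ["alice pays bob 5"]);
chain = appendBlock(chain, ["bob pays carol 2"]);
console.log(isValidChain(chain)); // true

// Changing the data of an earlier block breaks validation.
const tampered = chain.map((block) => ({ ...block }));
tampered[0].data = ["alice pays bob 500"];
console.log(isValidChain(tampered)); // false
```

Because each hash covers the previous block's hash, modifying any block would require recomputing every block that follows it.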

Infinite state engine

Blockchain systems represent state engines. State engines are systems modelled as a series of states through which the system transitions. State engines are termed finite if there is a finite number of possible system states, all of which are known in advance. In the case of infinite state engines, there is an endless number of possible states.


Example of a state engine

Blockchains are thus infinite state engines. Transactions take the system from one state to another. System state may be account balances, as in Bitcoin’s case, represented by the set of UTXOs (unspent transaction outputs). In more general-purpose blockchains, such as Ethereum, state can represent any piece of data.

A virtual machine (VM) typically executes transactions to take the system from one state to another. The Bitcoin VM executes a domain-specific script language encoded in transaction inputs and outputs. The Ethereum VM executes more complex operations, as it is Turing complete (it can be used to execute anything that can be modelled computationally), but the concept is the same.
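The idea of transactions as state transitions can be sketched in a few lines of Javascript. This is a deliberately simplified account-balance model, not Bitcoin's UTXO model or the Ethereum VM. The applyTransaction function and the transaction shape are assumptions made for illustration.

```javascript
// Apply one transaction to a state and return the new state.
// The previous state object is left untouched.
function applyTransaction(state, tx) {
  const fromBalance = state[tx.from] || 0;
  if (tx.from === tx.to || tx.amount <= 0 || fromBalance < tx.amount) {
    throw new Error("invalid transaction");
  }
  return {
    ...state,
    [tx.from]: fromBalance - tx.amount,
    [tx.to]: (state[tx.to] || 0) + tx.amount,
  };
}

const genesis = { alice: 10 };
const transactions = [
  { from: "alice", to: "bob", amount: 4 },
  { from: "bob", to: "carol", amount: 1 },
];

// Replaying the ordered transaction history reproduces the current state.
const finalState = transactions.reduce(
  (state, tx) => applyTransaction(state, tx),
  genesis
);
console.log(finalState); // { alice: 6, bob: 3, carol: 1 }
```

The number of reachable states is unbounded, since any number of accounts and balances may appear, which is the sense in which the engine is infinite.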

Transaction immutability

Transactions, which we have just defined as state transitions, are immutable in a blockchain system. This means that once a transaction has been confirmed, i.e. included in a valid block, it cannot be undone. State can only be reversed by issuing another transaction, but the history of transactions cannot be modified.

Peer-to-peer network (P2P)

The data structure and the infinite state engine a system implements are not enough to make it a blockchain. A real blockchain executes on a P2P network. Each full node has a copy of the data structure and, importantly, all nodes reach consensus on the correct version of the data. That is, all nodes agree on the same system state. A blockchain system thus implements various P2P protocols, such as node and neighbour detection, group communication protocols and consensus protocols.

Byzantine failure model

The biggest achievement of blockchains is probably that they achieve all of the above in a byzantine failure model. In distributed systems research, a failure model is the set of assumptions about what may go wrong in a system. For instance, the crash failure model assumes that nodes may crash, and the partition failure model assumes that nodes (or groups of nodes) may be temporarily isolated due to network faults. Both failure models, however, assume that nodes try to behave correctly, i.e. there are no malicious participants.

In the byzantine failure model we assume that a node may behave in any way, i.e. system state integrity may not be the goal of every node and there may be malicious participants. Thus, no node is assumed to be trusted.

I was working in distributed systems research in the early 2000s and we managed to implement pretty reliable systems with crash and partition failure models, but byzantine failure models were considered extremely difficult. Blockchains present a working solution to this problem, which is why they can actually be used to model financial transactions in trustless systems.

Blockchains achieve operation with a byzantine failure model by implementing strict consensus mechanisms which include financial incentives for nodes to maintain state integrity, such as the proof-of-work or proof-of-stake based models implemented by cryptocurrencies. This also explains why blockchains usually come with their own currency and why transactions tend to have an associated transaction fee.

Supporting a byzantine failure model is very costly. This is the main reason why blockchain solutions scale badly. It is important to think carefully about whether a problem requires a blockchain-based solution, or whether it may work on a traditional centralised system or even a distributed system that does not require a byzantine failure model.

Relaxed definitions and technology advances

The above definition is muddied in some systems which are usually also grouped in the blockchain category. These include systems that relax the failure model by including some trusted nodes, for example systems that use proof-of-authority consensus schemes, in which blocks are created by a set of trusted nodes.

At the other end of the scale there are systems with similar properties that replace the underlying data structure. For example there are systems that build on directed acyclic graphs instead of linked lists of blocks, such as RaiBlocks and IOTA.

Ethereum Development Guide — Part 1

Developing an Ethereum Document Certification Application

Introduction

Document certification, also known as timestamping or proof of existence, is one of the most obvious use cases for blockchain technology beyond digital currencies. Document certification consists of saving a tamper-proof, timestamped fingerprint of a document (or binary file) on the blockchain. This serves as proof that the document existed in a certain version at a certain time and can be used to prove the integrity of the file. That is, you can prove that a document has not been modified since its certification. Use cases cover registering private contracts, protecting copyright, sealing log files and any other case where file integrity is important.

In this tutorial we will look at how to create an Ethereum contract that allows us to store data on the blockchain and read it, using document certification as an example. To do so, we will create a simple smart contract that saves a SHA-256 hash of a file on the blockchain, together with a timestamp. SHA-256 hashes identify sets of data by means of a cryptographic hash function and are, for all practical purposes, unique. We will not go into detail on hash functions and cryptography here, but you should know that a SHA-256 hash is a 32-byte fingerprint derived from the input data. The reverse operation, i.e. calculating the data from the hash value, is not feasible, so data protection is a welcome side-effect.
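As a quick illustration of such a fingerprint, the following sketch computes a SHA-256 hash with Node.js's built-in crypto module and formats it as a 0x-prefixed hexadecimal string. The function name fingerprint and the sample inputs are assumptions made for this example.

```javascript
const crypto = require("crypto");

// Return the SHA-256 fingerprint of a Buffer or string
// as a 0x-prefixed hexadecimal string.
function fingerprint(data) {
  return "0x" + crypto.createHash("sha256").update(data).digest("hex");
}

const original = fingerprint("contract text, version 1");
const modified = fingerprint("contract text, version 2");

console.log(original.length); // 66: "0x" plus 64 hex characters (32 bytes)
console.log(original === modified); // false
```

To fingerprint a file rather than a string, the file content can be read into a Buffer first, for example with fs.readFileSync, and passed to the same function.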

Document certification is fairly easy to implement, which means it is ideal for explaining certain concepts and techniques without unnecessary complexity, whilst still being more useful than typical “Hello World” examples.

Prerequisites and Learning outcomes

To follow this tutorial you need a basic understanding of the command line, text editing and how blockchains work. You should have Node.js installed on your system and know some Javascript. After following this tutorial you will know how to:

  • Write a simple smart contract for the Ethereum blockchain
  • Store and retrieve data on the blockchain
  • Understand the difference between a transaction and a call
  • Deploy and test a smart contract in a local test environment
  • Use basic functionality of a professional Ethereum development framework

Developing the smart contract

We will use the Solidity programming language to write our contract and the Truffle framework to ease development.

Install Truffle with the following command (the -g option may require root privileges on your system):

npm install -g truffle

Now, we are ready to initialize our smart contract development project. In a new directory type the following:

truffle init

This creates various directories and files. For now we will create a Solidity source file in the contracts directory. We can use truffle to create the file with some scaffolding code for us:

truffle create contract Notary

This creates a file called Notary.sol in the contracts directory. In the same directory there is also a Migrations.sol file. This is a contract used by truffle to aid blockchain deployment. You should leave this file untouched. Let’s edit Notary.sol to contain the following code:

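pragma solidity ^0.4.4;
contract Notary {
    // One certification record per document hash.
    struct Record {
        uint mineTime;    // timestamp of the block containing the transaction
        uint blockNumber; // number of that block
    }
    // Maps a document's SHA-256 hash to its record.
    mapping (bytes32 => Record) private docHashes;
    function Notary() public {
        // constructor
    }
    // Store the current block's timestamp and number under the given hash.
    function addDocHash (bytes32 hash) public {
        Record memory newRecord = Record(now, block.number);
        docHashes[hash] = newRecord;
    }
    // Return (mineTime, blockNumber) for the hash, or (0, 0) if it is unknown.
    function findDocHash (bytes32 hash) public constant returns(uint, uint) {
        return (docHashes[hash].mineTime, docHashes[hash].blockNumber);
    }
}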

This is all the Solidity code we require for this application. First of all, we use a pragma directive to specify the compiler version for this code. The caret in ^0.4.4 sets 0.4.4 as the minimum version and excludes 0.5.0 and later, so the code should compile with any compiler from 0.4.4 up to, but not including, 0.5.0.

Next we declare the Notary contract. Inside the contract we declare a struct datatype, instances of which will be used to store records of the documents we wish to fingerprint. The struct has two entries of type uint, which is an unsigned 256-bit integer. The fields are the timestamp and the number of the block in which the transaction supplying the hash value of our file is mined.

In the next line we create the following state variable:

mapping (bytes32 => Record) private docHashes;

This will actually be stored on the blockchain and maps 32-byte values, our SHA-256 hashes, to an instance of our previously declared Record struct. A mapping is essentially a hash table, which you may know from other programming languages. The SHA-256 hash will actually be used as the key to find the corresponding record.

The next piece of code is the constructor of our contract. Constructors are called at contract deployment and serve for initialisation tasks. As our contract is very simple we do not need to do anything special here, so we just leave the constructor empty.

Next, we declare the function that is used to store records on the blockchain:

function addDocHash (bytes32 hash) public {
    Record memory newRecord = Record(now, block.number);
    docHashes[hash] = newRecord;
}

Note that the function takes a hash value as an argument. We could send the actual file content to the contract and calculate the hash value in the contract code, but this would be a poorer design decision for a number of reasons. First of all, contract execution has an associated cost in gas, which is translated to Ether via the gas price and is charged to the sender of the transaction. The more work to be done, the costlier the transaction will be. Secondly, users should not have to send their files across the network. There is no reason for user data to leave the owner’s machine, as SHA-256 hashes can be calculated locally. Finally, independent of transaction fees, sending large files over the network is far less efficient than sending the SHA-256 hash.

As can be seen above, the addDocHash function is declared public, so that it can be accessed externally. The memory keyword used in the local Record declaration means that the local variable itself is not saved to storage. The record is only persisted when it is assigned to the docHashes mapping, which is an in-storage state variable.

When the addDocHash function is called in a transaction, the same transaction is executed by all nodes trying to include the transaction in a new block. The winning miner node that eventually seals the block sets the correct block number and mine time. Note that whilst the timestamp can in theory be manipulated by the miner, it still needs to be larger than the previous block’s timestamp and lower than the next block’s. Therefore, any discrepancies will be minimal.

Having provided a way to save a hash, we now need a way to verify whether a hash exists on the blockchain and retrieve the corresponding record:

function findDocHash (bytes32 hash) public constant returns(uint, uint) {
    return (docHashes[hash].mineTime, docHashes[hash].blockNumber);
}

This function simply returns the corresponding mine time and block number for a given hash as a tuple (Solidity functions can have multiple return values). If a hash is not in the mapping, the code will return (0,0). We could in fact add a test for whether the hash exists and throw an exception, but as (0,0) is a pretty clear indicator of failure to find a hash, we can deal with this off-chain in the client code. A good rule of thumb is to only include code that benefits from the blockchain’s properties in your contract. The rest is better off in client-side code.

One important fact to notice is the constant keyword in the findDocHash function declaration. This indicates that the function does not alter state and can therefore be executed in a call, rather than a transaction. Calls may read state of a local Ethereum node and do not have to be propagated through the network and mined. They therefore do not require any gas and are free to use.

To compile this code with truffle type:

truffle compile

This creates JSON files for our contracts in the build directory. Feel free to look at Notary.json. We will discuss the content of the file later in this series of articles. For now, we let Truffle handle the file.

Testing the contract

To test our contract we will deploy it onto an in-memory Ethereum blockchain simulator. It would be unwise to directly deploy our contract onto the real network without prior testing. We will use the Ganache tool for testing. Installation of the tool is straightforward. Once executed, Ganache presents the following graphical user interface:

As you can see, there are a number of accounts created automatically, each of which holds plenty of test Ether. You can also see details on the chain’s blocks and transactions, which is useful for testing.

Now we will deploy our contract onto the test blockchain. To do so, we use truffle to first create a migration file for our Notary contract:

truffle create migration Notary

This will create a new Javascript file in the migrations directory with some scaffolding code. Edit this file to contain the following code:

var NotaryContract = artifacts.require("Notary");
module.exports = function(deployer) {
    deployer.deploy(NotaryContract);
};

The code tells Truffle to obtain a reference to the Notary contract and deploy it. However, before we can execute this code and proceed with the actual deployment, we have to configure Truffle to use our local test blockchain for deployment.

To do so we need to edit the file truffle.js which is the main configuration file, and add the following content:

module.exports = {
    networks: {
        development: {
            host: "localhost",
            port: 7545,
            network_id: "*"
        }
    }
};

This code tells Truffle to look for an Ethereum node’s RPC interface on port 7545 on localhost and deploy onto the network we find there, whatever the network id (Ethereum networks are identified by ids, with 1 being the official Ethereum main network). Ganache is listening on this port by default. We have also named this deployment configuration development, so that we can add further networks with different names later on.

Now we can deploy the contract by typing:

truffle migrate --network development

It is not actually necessary to explicitly specify the network, as truffle will deploy to the first network on the list. However, it is good practice to do so, in order to avoid mistakes later on. During development we might fix some bugs and redeploy. To do a re-deploy we have to add a reset option:

truffle migrate --network development --reset

Note that we have automatically deployed the Truffle-supplied Migrations contract as well. This is used by Truffle for migration tasks and we will leave it alone. You should see the address each contract is assigned.

We now have a version of our contract on our local blockchain simulator and can interact with it. There are various ways to test a contract and Truffle actually ships with a sophisticated test framework, but for now we connect to our blockchain using Truffle’s console mode, which provides a Javascript console with the Web3 library and some further Truffle extensions. The Web3 library provides a standard Javascript interface to allow applications to communicate with the blockchain. Type the following to enter console mode and receive a Javascript command prompt:

truffle console

Truffle makes life easy by injecting a Notary object with some important fields, such as the contract’s Application Binary Interface (ABI) and its address. These are the two details you need to know about any contract you wish to interact with on the blockchain. We can use the injected object to create a reference to our deployed contract:

var notary = Notary.at(Notary.address);

We can now send a hash value of a file to the blockchain. As we have not implemented a client which calculates hash values yet, we can either send any integer number for testing purposes or use openssl or any other tool you have installed locally to generate a SHA-256 hash of a file. Alternatively you can just use the value I use in my example below. To send a hash value to the blockchain we can use the following code in the Truffle console:

notary.addDocHash("0x5abf61c361e5ef91582e70634dfbf2214fbdb6f29c949160b69f27ae947d919d");

Note that we are passing a string datatype as an argument. As long as we format the string correctly with a leading (0x) to indicate a hexadecimal value, the web3 library translates this correctly to our contract’s ABI.
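Since a malformed string is easy to produce by accident, a client could check the format before sending it. The following is a small sketch of such a check in plain Javascript. The helper name isBytes32Hex is an assumption made for this example and is not part of Web3 or Truffle.

```javascript
// Check that a value is a 0x-prefixed hexadecimal string of exactly
// 32 bytes (64 hex characters), as expected for a bytes32 argument.
function isBytes32Hex(value) {
  return typeof value === "string" && /^0x[0-9a-fA-F]{64}$/.test(value);
}

console.log(isBytes32Hex("0x" + "ab".repeat(32))); // true
console.log(isBytes32Hex("ab".repeat(32))); // false, missing the 0x prefix
console.log(isBytes32Hex("0x1234")); // false, too short
```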

On return, you should be presented with a receipt of your transaction, including a transaction id, some information on the block the transaction has been mined in and the gas used. Note that in our testing environment the transaction is mined instantly. On a real Ethereum network things would be considerably slower and testing would be cumbersome.

Let’s now use the findDocHash function to check whether our hash value exists on the blockchain:

notary.findDocHash("0x5abf61c361e5ef91582e70634dfbf2214fbdb6f29c949160b69f27ae947d919d");

The result is an array with two objects of type BigNumber. Web3 uses the BigNumber library, as Javascript is notoriously bad at dealing with large numbers. We could use the library to decode these values properly, but it is easy to see that the first element of the array corresponds to our first return value, the timestamp, and the second element to the block number.

You may try repeating the call with a hash value which we have not saved onto the blockchain to get two zero values as return values.
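Client code could interpret the returned pair along the following lines. This is only a sketch: the function name interpretRecord is hypothetical, and it assumes each returned value is either a plain number or an object exposing a toNumber() method, as BigNumber objects do. The stand-in objects in the usage lines are made up for illustration.

```javascript
// Interpret the (mineTime, blockNumber) pair returned by findDocHash.
// Assumption: each value is a plain number or exposes toNumber().
function interpretRecord(result) {
  const [mineTime, blockNumber] = result.map((value) =>
    typeof value === "number" ? value : value.toNumber()
  );
  if (mineTime === 0 && blockNumber === 0) {
    return null; // (0,0) means the hash was never certified
  }
  return {
    certifiedAt: new Date(mineTime * 1000), // block timestamps are in seconds
    blockNumber,
  };
}

// Stand-ins for the two values returned by the contract call.
const found = [{ toNumber: () => 1500000000 }, { toNumber: () => 42 }];
const missing = [0, 0];

console.log(interpretRecord(found).blockNumber); // 42
console.log(interpretRecord(missing)); // null
```

This keeps the decision about what an unknown hash means in the client, in line with the rule of thumb of keeping such logic off-chain.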

We have thus shown that our contract works correctly.

Next Steps

Once we have a working contract our next steps could be as follows:

  1. Develop a client application that calculates hash values for provided files and sends them to the blockchain and also allows checking if hashes exist.
  2. Test our client with our test environment.
  3. Deploy the contract on the Ethereum test network, a real blockchain that works with worthless test Ether to allow inexpensive real world testing.
  4. Test our client against the contract deployed on the test network.
  5. Deploy our contract on the Ethereum main network.

We will look at these steps in part 2 of this tutorial series.

The source code for this tutorial can be found on GitHub:

stbeyer/docCertTutorial

If you find this tutorial useful, you may consider donating a modest amount:

Ethereum: 0xd9fa3fb148154ca0c70ecb89009df54ae9f9924b

Bitcoin: 1Jk13X9BTcQ2huMy1RS6Dx71oFtMMUmfXc

Litecoin: LcfTyh7xbLgi1gFStEG3xBNpGhmVAXg2Yj

Ethereum Development Guide — Part 1

Developing an Ethereum Document Certification Application

Introduction

Document certification, also known as timestamping or proof of existence, is one of the most obvious use cases for blockchain technology beyond digital currencies. Document certification consists in saving a tamper-proof timestamped fingerprint of a document (or binary file) on the blockchain. This basically serves as proof that the document existed in a certain version at a certain time and can be used to prove integrity of the file. That is, you can prove that a document has not been modified since its certification. Use cases cover registering private contracts, protecting copyright, sealing log files and any other case where file integrity is important.

In this tutorial we will look at how to create an Ethereum contract that allows to store data on the blockchain and read it, by using document certification as an example. To do so, we will create a simple smart contract that saves an SHA-256 hash of a file on the blockchain, together with a timestamp. SHA-256 hashes uniquely identify sets of data by means of a cryptographic hash function. We will not go into detail on hash functions and cryptography here, but you should know that a SHA-256 hash is a 32-byte fingerprint derived from the input data. The reverse operation, i.e. calculating the data from the hash value, is not feasible, so data protection is a welcome side-effect.

Document certification is fairly easy to implement, which means it is ideal for explaining certain concepts and techniques without unnecessary complexity, whilst still being more useful than typical “Hello World” examples.

Prerequisites and Learning outcomes

To follow this tutorial you need some basic understanding of the command line, text editing and a basic understanding of how blockchains work. You should have Node.js installed on your system and know some Javascript. After following this tutorial you will know how to:

  • Write a simple smart contract for the Ethereum blockchain
  • Store and retrieve data on the blockchain
  • Understand the difference between a transaction and a call
  • Deploy and test a smart contract in a local test environment
  • Use basic functionality of a professional Ethereum development framework

Developing the smart contract

We will use the Solidity programming language to write our contract and the Truffle framework to ease development.

Install Truffle with the following command (the -g option may require root privileges on your system):

npm install -g truffle

Now, we are ready to initialize our smart contract development project. In a new directory type the following:

truffle init

This creates various directories and files. For now we will create a Solidity source file in the contracts directory. We can use truffle to create the file with some scaffolding code for us:

truffle create contract Notary

This creates a file called Notary.sol in the contracts directory. In the same directory there is also a Migrations.sol file. This is a contract used by truffle to aid blockchain deployment. You should leave this file untouched. Let’s edit Notary.sol to contain the following code:

pragma solidity ^0.4.4;
contract Notary {
struct Record {
uint mineTime;
uint blockNumber;
}
mapping (bytes32 => Record) private docHashes;
function Notary() public {
// constructor
}
function addDocHash (bytes32 hash) public {
Record memory newRecord = Record(now, block.number);
docHashes[hash] = newRecord;
}
function findDocHash (bytes32 hash) public constant returns(uint, uint) {
return (docHashes[hash].mineTime, docHashes[hash].blockNumber);
}
}

This is all the Solidity code we require for this application. First of all, we use a pragma expression to specify a minimum compiler version (0.4.4) for this code. The code should compile with anything below version 0.5.0.

Next we use declare the Notary contract. Inside the contract we declare a struct datatype, instances of which will be used to store records of the documents we wish to fingerprint. The struct has two entries of type uint, which is an unsigned 256-bit integer. The fields are a timestamp and the block number at which the transaction supplying the hash value of our files is mined.

In the next line we create the following state variable:

mapping (bytes32 => Record) private docHashes;

This will actually be stored on the blockchain and maps 32-byte values, our SHA-256 hashes, to an instance of our previously declared Record struct. A mapping is essentially a hash table, which you may know from other programming languages. The SHA-256 hash will actually be used as the key to find the corresponding record.

The next piece of code is the constructor of our contract. Constructors are called at contract deployment and serve for initialisation tasks. As our contract is very simple we do not need to do anything special here, so we just leave the constructor empty.

Next, we declare the function that is used to store records on the blockchain:

function addDocHash (bytes32 hash) public {
    Record memory newRecord = Record(now, block.number);
    docHashes[hash] = newRecord;
}

Note that the function takes a hash value as an argument. We could send the actual file content to the contract and calculate the hash value in the contract code, but this would be a poorer design for a number of reasons. First, contract execution has an associated cost in gas, which is translated to Ether via the gas price and charged to the sender of the transaction; the more work there is to do, the costlier the transaction. Second, users should not have to send their files across the network: SHA-256 hashes can be calculated locally, so there is no reason for user data to leave the owner’s machine. Finally, independent of transaction fees, sending large files over the network is far less efficient than sending a 32-byte hash.

As can be seen above, the addDocHash function is declared public, so that it can be called externally. The memory keyword in the local Record declaration means that newRecord itself is a temporary value that is not written to storage. The data is persisted only when it is assigned to the docHashes mapping, which is a state variable held in storage.
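The local hashing step mentioned above can be sketched in a few lines of Node.js. This is only an illustration, assuming Node's built-in crypto module; the helper name fingerprint is hypothetical and not part of the tutorial's code. It returns the digest as a 0x-prefixed hex string, the form a bytes32 argument is passed in later on.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: hash the given bytes locally and return the digest
// as a 0x-prefixed hex string (32 bytes, i.e. 64 hex digits).
function fingerprint(data) {
  const digest = createHash('sha256').update(data).digest('hex');
  return '0x' + digest;
}

// For a real document you would pass the file contents instead, for example
// the Buffer returned by require('fs').readFileSync(somePath).
console.log(fingerprint(Buffer.from('abc')));
```

Only the resulting 66-character string ever needs to leave the machine; the document itself stays where it is.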

When the addDocHash function is called in a transaction, the same transaction is executed by all nodes trying to include the transaction in a new block. The winning miner node that eventually seals the block sets the block number and mine time. Note that whilst the miner can in theory manipulate the timestamp, it still needs to be larger than the previous block’s timestamp and lower than the next block’s. Therefore, any discrepancies will be minimal.

Having provided a way to save a hash, we now need a way to verify whether a hash exists on the blockchain and retrieve the corresponding record:

function findDocHash (bytes32 hash) public constant returns(uint, uint) {
    return (docHashes[hash].mineTime, docHashes[hash].blockNumber);
}

This function simply returns the corresponding mine time and block number for a given hash as a tuple (Solidity functions can have multiple return values). If a hash is not in the mapping, the code will return (0,0). We could in fact add a test whether the hash exists and throw an exception, but as (0,0) is a pretty clear indicator of failure to find a hash, we can deal with this off-chain in the client code. A good rule of thumb is to only include code that benefits from the blockchain’s properties in your contract. The rest is better off in client-side code.
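Why an unknown hash yields (0,0) rather than an error is easiest to see in a small model. The following plain JavaScript class is a rough off-chain imitation of the contract's storage semantics, not of the EVM itself: the class name NotaryModel is made up for this sketch, and timestamps and block numbers are passed in by hand instead of being set by a miner.

```javascript
// Off-chain toy model of the Notary contract. Not Solidity; names are illustrative.
class NotaryModel {
  constructor() {
    this.docHashes = new Map(); // plays the role of the docHashes mapping
  }

  // Mirrors addDocHash: the caller stands in for the miner and supplies
  // the timestamp and block number.
  addDocHash(hash, mineTime, blockNumber) {
    this.docHashes.set(hash, { mineTime, blockNumber });
  }

  // Mirrors findDocHash: a Solidity mapping yields zero values for keys
  // that were never written, which we imitate with a default record.
  findDocHash(hash) {
    const record = this.docHashes.get(hash) || { mineTime: 0, blockNumber: 0 };
    return [record.mineTime, record.blockNumber];
  }
}

const model = new NotaryModel();
model.addDocHash('0xaa', 1518000000, 7);
console.log(model.findDocHash('0xaa')); // [ 1518000000, 7 ]
console.log(model.findDocHash('0xbb')); // [ 0, 0 ]
```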

One important fact to notice is the constant keyword in the findDocHash function declaration. This indicates that the function does not alter state and can therefore be executed in a call, rather than a transaction. Calls read the state held by a local Ethereum node and do not have to be propagated through the network and mined. They therefore do not require any gas and are free to use.

To compile this code with Truffle, type:

truffle compile

This creates JSON files for our contracts in the build directory. Feel free to look at Notary.json. We will discuss the content of this file later in this series of articles. For now, we let Truffle handle the file.

Testing the contract

To test our contract we will deploy it onto an in-memory Ethereum blockchain simulator. It would be unwise to directly deploy our contract onto the real network without prior testing. We will use the Ganache tool for testing. Installation of the tool is straightforward. Once executed, Ganache presents the following graphical user interface:

As you can see, a number of accounts are created automatically, each of which holds plenty of test Ether. You can also see details on the chain’s blocks and transactions, which is useful for testing.

Now we will deploy our contract onto the test blockchain. To do so, we first use Truffle to create a migration file for our Notary contract:

truffle create migration Notary

This will create a new JavaScript file in the migrations directory with some scaffolding code. Edit this file to contain the following code:

var NotaryContract = artifacts.require("Notary");

module.exports = function(deployer) {
    deployer.deploy(NotaryContract);
};

The code tells Truffle to obtain a reference to the Notary contract and deploy it. However, before we can execute this code and proceed with the actual deployment, we have to configure Truffle to use our local test blockchain for deployment.

To do so, we need to edit truffle.js, which is the main configuration file, and add the following content:

module.exports = {
    networks: {
        development: {
            host: "localhost",
            port: 7545,
            network_id: "*"
        }
    }
};

This code tells Truffle to look for an Ethereum node’s RPC interface on port 7545 on localhost and deploy onto the network it finds there, whatever the network id (Ethereum networks are identified by ids, with 1 being the official Ethereum main network). Ganache listens on this port by default. We have also named this deployment configuration development, so that we can add further networks with different names later on.
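To give an idea of what adding a further network could look like, here is a sketch of the same configuration with a second entry. The entry name testnet, its port and its network id are placeholders chosen for illustration only; they assume a second Ethereum node running locally and need to be adapted to your own setup.

```javascript
// truffle.js with a second, hypothetical network entry.
const config = {
  networks: {
    development: {
      host: "localhost",
      port: 7545,       // Ganache's default port, as above
      network_id: "*"   // accept whatever network id the node reports
    },
    // Assumed example: another node reachable on this machine.
    // Name, port and id are placeholders.
    testnet: {
      host: "localhost",
      port: 8545,
      network_id: 3
    }
  }
};

module.exports = config;
```

With such an entry in place, the name is what you would pass to the --network option of truffle migrate.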

Now we can deploy the contract by typing:

truffle migrate --network development

It is not actually necessary to specify the network explicitly here, as Truffle falls back to the network named development when none is given. However, it is good practice to do so, in order to avoid mistakes later on. During development we might fix some bugs and redeploy. To redeploy, we have to add the reset option:

truffle migrate --network development --reset

Note that the Truffle-supplied Migrations contract has been deployed automatically as well. It is used by Truffle for migration tasks and we will leave it alone. You should see the address each contract is assigned.

We now have a version of our contract on our local blockchain simulator and can interact with it. There are various ways to test a contract and Truffle actually ships with a sophisticated test framework, but for now we connect to our blockchain using Truffle’s console mode, which provides a JavaScript console with the Web3 library and some further Truffle extensions. The Web3 library provides a standard JavaScript interface to allow applications to communicate with the blockchain. Type the following to enter console mode and receive a JavaScript command prompt:

truffle console

Truffle makes life easy by injecting a Notary object with some important fields, such as the contract’s Application Binary Interface (ABI) and its address. These are the two details you need to know about any contract you wish to interact with on the blockchain. We can use the injected object to create a reference to our deployed contract:

var notary = Notary.at(Notary.address);

We can now send a hash value of a file to the blockchain. As we have not implemented a client which calculates hash values yet, we can either send any integer number for testing purposes or use openssl, or any other tool installed locally, to generate a SHA-256 hash of a file. Alternatively, you can just use the value from my example below. To send a hash value to the blockchain we can use the following code in the Truffle console:

notary.addDocHash("0x5abf61c361e5ef91582e70634dfbf2214fbdb6f29c949160b69f27ae947d919d");

Note that we are passing a string datatype as an argument. As long as we format the string correctly with a leading (0x) to indicate a hexadecimal value, the web3 library translates this correctly to our contract’s ABI.
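Because a malformed string would only be noticed once it reaches the contract call, a client may want to check the format first. The following is a minimal sketch of such a guard; isBytes32Hex is a hypothetical helper name, not a web3 or Truffle function.

```javascript
// Hypothetical client-side guard: accept only "0x" followed by exactly
// 64 hexadecimal digits, i.e. the 32 bytes of a SHA-256 digest.
function isBytes32Hex(value) {
  return typeof value === 'string' && /^0x[0-9a-fA-F]{64}$/.test(value);
}

const exampleHash =
  '0x5abf61c361e5ef91582e70634dfbf2214fbdb6f29c949160b69f27ae947d919d';

console.log(isBytes32Hex(exampleHash));          // true
console.log(isBytes32Hex(exampleHash.slice(2))); // false: prefix missing
console.log(isBytes32Hex('0x5abf'));             // false: too short
```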

On return, you should be presented with a receipt of your transaction, including a transaction id, some information on the block the transaction has been mined in and the gas used. Note that in our testing environment the transaction is mined instantly. On a real Ethereum network things would be considerably slower and testing would be cumbersome.

Let’s now use the findDocHash function to check whether our hash value exists on the blockchain:

notary.findDocHash("0x5abf61c361e5ef91582e70634dfbf2214fbdb6f29c949160b69f27ae947d919d");

The result is an array with two objects of type BigNumber. Web3 uses the BigNumber library because JavaScript numbers are double-precision floats, which represent integers exactly only up to 2^53 - 1, far short of the range of a 256-bit uint. We could use the library to decode these values properly, but it is easy to see that the first element of the array corresponds to our first return value, the timestamp, and the second element to the block number.
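The precision limit can be demonstrated directly. Note that this sketch uses JavaScript's built-in BigInt rather than the BigNumber library that Web3 relies on, purely to keep the example free of dependencies.

```javascript
// JavaScript numbers are IEEE 754 doubles: integers are exact only
// up to 2^53 - 1.
console.log(Number.MAX_SAFE_INTEGER); // 9007199254740991
console.log(2 ** 53 === 2 ** 53 + 1); // true: the +1 is silently lost

// A uint256 can be far larger. An arbitrary-precision type keeps every digit.
const maxUint256 = 2n ** 256n - 1n;
console.log(maxUint256.toString().length); // 78 decimal digits
```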

You may try repeating the call with a hash value which we have not saved onto the blockchain to get two zero values as return values.
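Client code will typically want to turn this raw pair into something more convenient and treat two zeros as "not found". Below is one possible sketch; decodeRecord is a hypothetical helper, and the fake function merely stands in for BigNumber objects (which expose a toNumber method) so that the example runs without web3.

```javascript
// Hypothetical helper for client code. Each element may be a plain number
// or an object exposing toNumber(), as BigNumber instances do.
function decodeRecord(result) {
  const [mineTime, blockNumber] = result.map((v) =>
    typeof v === 'number' ? v : v.toNumber()
  );
  if (mineTime === 0 && blockNumber === 0) {
    return null; // (0,0) means the hash was never stored
  }
  return {
    minedAt: new Date(mineTime * 1000), // block timestamps are in seconds
    blockNumber: blockNumber
  };
}

// Stand-in for a BigNumber so the sketch runs without web3.
const fake = (n) => ({ toNumber: () => n });

console.log(decodeRecord([fake(1518000000), fake(7)]));
console.log(decodeRecord([fake(0), fake(0)])); // null
```

In the Truffle console, where contract functions return promises, something along the lines of notary.findDocHash(hash).then(decodeRecord) should work with this helper.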

We have thus shown that our contract behaves as expected for both a stored and an unknown hash.

Next Steps

Once we have a working contract our next steps could be as follows:

  1. Develop a client application that calculates hash values for provided files and sends them to the blockchain and also allows checking if hashes exist.
  2. Test our client with our test environment.
  3. Deploy the contract on the Ethereum test network, a real blockchain that works with worthless test Ether to allow inexpensive real world testing.
  4. Test our client against the contract deployed on the test network.
  5. Deploy our contract on the Ethereum main network.

We will look at these steps in part 2 of this tutorial series.

The source code for this tutorial can be found on GitHub:

stbeyer/docCertTutorial
