Citizen Science on the Map

Putting Citizen Science on the Map

Citizen Science

Crowdcrafting, a web-based service that invites volunteers to contribute to scientific projects, defines citizen science as “the active contribution of people who are not professional scientists to science. It provides volunteers with the opportunity to contribute intellectually to the research of others, to share resources or tools at their disposal, or even to start their own research projects. Volunteers provide real value to ongoing research while they themselves acquire a better understanding of the scientific method.”

As definitions go, though, it is a little unfair, for it somewhat trivialises the importance of the citizen scientist in achieving Sustainable Development Goals. We are in this together, and if any party matters more than another it is the citizen scientist: it is the amateur, not the professional, who is pivotal to our sustainable future. Without citizen scientists most projects would be impossible, or would take decades to achieve a fraction of what the crowd can do in a matter of months. It is only through the collective effort of the crowd that much of this data can be gathered and analysed. With more and more data being generated, that volume is set to grow exponentially, and as it does the importance of recruiting citizen scientists will grow with it; for without citizen scientists, all efforts to turn this blue planet green will be little more than academic exercises: the half-hearted efforts of fools and sycophants holding summits in capital cities.

Desktop and Field Work

For citizen scientists the highs come from taking part and from seeing completed projects, projects that can be divided into two distinct types: desktop, where citizen scientists look at images or data on a screen and then answer questions or annotate the image in some way; and field work, where citizen scientists take an active part in collecting the data. Examples of desktop studies would be the analysis of satellite or drone images, the identification of insect or plant specimens, or the measurement of lichen growth or plant populations. Examples of field work projects would be OpenStreetMap, mapping seaweed distribution along the UK’s shoreline, or mapping insect distribution inland.

There is, though, an ever-growing number of citizen science projects that one can now get involved with. Below are links to some of the larger collections and larger projects. Most are built using open software platforms, with those platforms similarly providing a place to promote the projects. They encourage anyone and everyone to create a project, so there are opportunities to volunteer one’s services for everything from the sublime to the ridiculous. The last three sites provide search facilities to find more projects, but this is in no way a complete list, for the complete list needs its own dedicated site.

The OpenStreetMap Project

Whilst Google’s Map, Earth and Street View projects are great applications, they are also subject to copyright restrictions. Google owns the resources and, whilst it currently allows ‘anyone’ to use them, it retains the option to change its mind at some point in the future; an eventuality made more likely should Google ever become the world’s sole cartographer. Fortunately there’s the OpenStreetMap project.

The open source alternative to Google, OpenStreetMap is the largest and most contributed-to citizen science project to date. Supported by the OpenStreetMap Foundation, the OSM project is “an initiative created to provide free geographic data, such as street maps, to anyone”. A not-for-profit organization maintained by donations and membership fees (you can join here), the OSM Foundation supports rather than controls the development of the OpenStreetMap project. Over the last ten years the OSM Foundation has mobilised more than 2 million volunteers to create free open street maps across the globe. If you are looking to get involved and want to find a project in your area, I have created a Twitter list of all the OSM projects and resources. The list currently contains over 50 country and city projects as well as resources to support OSM groups; if you know of any more please advise.



Zooniverse

Zooniverse declares itself “the world’s largest and most popular platform for people-powered research. Research made possible by hundreds of thousands of volunteers from around the world who come together to assist professional researchers in work that would not be possible, or practical, otherwise.” With just 46 projects in 11 categories (art, biology, climate, history, language, literature, medicine, nature, physics, social science, space) that might seem like a hollow boast, but Zooniverse’s software is used by some of the biggest scientific data projects in the world. This professionalism comes through in the projects Zooniverse showcases, which are well documented and fun.


Crowdcrafting

If Zooniverse is the biggest, then Crowdcrafting, a “web-based service that invites volunteers to contribute to scientific projects developed by citizens, professionals or institutions”, with over 400 projects in 6 categories, is the broadest. As with the previous two projects, Crowdcrafting uses and develops open source software to “help solve problems, analyze data or complete challenging tasks that can’t be done by machines alone.” However it is none too discerning about the projects it promotes, and some, whilst valid, are perhaps a little too esoteric for most. If, though, odd and obscure is what you are looking for, then Crowdcrafting is a good place to start.

NASA Earth Observatory

NASA’s Terra satellite, which they describe as being about the size of a school bus and falling about the Earth on a circular sun-synchronous polar orbit, carries five imaging and monitoring instruments to aid understanding of air and water processes and quality. There’s not too much I can say about this project after the successful planting of a mental gif containing a yellow school bus slaloming its way around the planet on an endless loop. If you live in the States and that idea doesn’t make you dizzy, then you could help NASA by doing some essential ground measurements and observations of air and water quality to support the work of the school bus falling above you.


GeoSurvey

Just as data from the Earth Observing System instruments on NASA’s Terra satellite is being used to map air and water quality in the USA, so data from the European Space Agency’s Sentinel-2 satellite is being used to map Africa’s soils and habitats. GeoSurvey, in conjunction with the Africa Soil Information Service, is currently running 30 desktop land use projects in over 10 African countries. Whilst GeoSurvey is built on the open source web framework Django, the land classification survey is unique in that it has all been built in house. The net result is perhaps the most user-friendly interface that I have so far come across. GeoSurvey’s simple, clean approach makes it a good place to begin citizen science mapping work.


The Natural History Museum

The Natural History Museum has a mission to digitize its collection of 80 million specimens. It is similarly one of the big data collection holders that use Zooniverse software to harness the efforts of citizen scientists to digitize those collections. Current projects range from transcribing microscope slides from the comfort of your home to visiting the coast to map the distribution of seaweed or report whale and dolphin strandings. The Natural History Museum has eight projects currently in progress and provides a huge amount of support and educational material around those projects. If you want to get really active and turn trips to the seaside into science expeditions, then the Natural History Museum is the place to go.

Centre for Ecology and Hydrology


As with the Natural History Museum, the Centre for Ecology and Hydrology uses Zooniverse software to power its citizen science projects. Taking further advantage of smartphone technology, the CEH has been mapping the distribution of more than 10,000 insect species in the UK and, in other projects, giving volunteers a backpack containing a particle monitor and a GPS logger to monitor their personal exposure to air pollution. The Centre for Ecology and Hydrology also offers guides on best practice in designing and executing citizen science projects. If the Natural History Museum’s citizen science projects are designed for kids, then the Centre for Ecology and Hydrology’s are designed for their grown-up counterparts: those who can’t stop fiddling with their smartphones, drones or wearable tech. Now you can do it all in the name of science.


British Geological Survey (BGS)

The BGS has for a number of years been using Crowdmap as a way for amateur geologists to record observations and share images of temporary geological exposures. The BGS also manages the mySoil app, which allows citizen scientists to record soil properties and upload them to the BGS database to refine the UK’s ageing legacy soil maps.

RSPB Birdwatch

The RSPB (Royal Society for the Protection of Birds) has been campaigning for the protection of birds for over 100 years. Having conducted surveys amongst its members for over half a century, it is perhaps the earliest adopter of citizen science and continues to be a major contributor with national birdwatch surveys. On the other side of the Atlantic, ornithological citizen science projects are run by the Cornell Lab of Ornithology.




SciStarter

If you haven’t found the perfect citizen science project to get involved in through any of the above links, then SciStarter might be able to help. SciStarter has a mission to “bring together the millions of citizen scientists in the world,” a mission I share and support. It does, however, and despite being in receipt of a Shuttleworth Foundation fellowship/flash grant, have one of the most retro-looking web sites I’ve seen in years; a real blast back to 1995. It’s a great concept, but it needs some serious web design on top and some lateral thinking underneath.



Scientific American

Scientific American has a similar but somewhat smaller mandate than SciStarter. A little more discerning, it wants volunteers to “Help make science happen by volunteering for a real research project”. What qualifies as ‘real research’ is perhaps a little subjective; that said, they do give citizen scientists the opportunity to engage with 240 of what they determine to be real projects (i.e. those coming from subscribing institutions). That’s not to say other projects are not valid; they are as real as any that Scientific American feels it has the authority to say are not.


Socientize

An EU-funded initiative, Socientize received €710,000 to fund a two-year project with the “aim of creating a common forum for cooperation between e-Infrastructure providers and citizen science infrastructures providers.” Four years after receiving that funding, Socientize has five projects listed and a simple list of links to other projects (most of which can be found through the links above). Whilst I don’t doubt that this was money well spent, the results of the investment are not immediately apparent through the web site.

Paid or Voluntary

Whilst a great variety of projects needing the help of citizen scientists can be found through the above links, nearly all have one thing in common: none of them pay citizen scientists for their efforts. Some, such as the GeoSurvey website, award points which are later converted into chances to win a prize, but I have yet to come across a site that pays even a nominal sum to citizen scientists for their efforts. It’s likely that millions of contributors to thousands of scientific projects to date have not even received an acknowledgement. The crowd’s contribution is both anonymous and voluntary, but the researcher’s is likely neither. Whilst it’s not a fair deal for the citizen scientist, it similarly shortchanges the research project. Paying citizen scientists would result in greater participation, greater accuracy and earlier completion. It’s also extremely cheap, as a draft paper I created, Pay to Map, explains.

A Common Framework

Whilst all the projects above conform to a single voluntary principle with respect to citizen scientists, none make any real effort to conform when it comes to categorizing projects. None make it particularly easy to find or organize projects according to factors such as execution (desktop/field), geographical location, study subject or research objectives. In most cases the research objectives are not even shared, suggesting either that citizen scientists are uninterested in the objectives of the projects they freely give their time and effort to, or that researchers are uninterested in the very people they want and need to engage. If researchers only looked after the citizen scientists, then the citizen scientists would complete their projects.

The Future of Citizen Science

Citizen science, crowd mapping and collective distributed actions are unlikely to be passing fads. The technological and data-driven path we now find ourselves on is not one we can choose to simply step off. Whilst individually we have the option not to engage and to act only in a selfish capacity, as a society we don’t. It’s a whole new paradigm.

To take advantage of this and reap the collective rewards, citizen science sites need to be highly functional: a dashboard that allows a visitor to easily see mapping projects according to geographical relevance, desktop or field execution, study subject and research objectives, organized into environmental, social and economic categories. A dashboard that is interactive and visual.

Projects need to provide detailed information on the nature of the study, where it is based, who is running it and who is sponsoring it. Once the study is finished, if not before, citizen scientists should be given access to the project data so that they can see the fruits of their labours.

Consideration should be given to identifying projects suitable for the education of schoolchildren. To fully engage schools in citizen science projects, additional educational material should be provided to assist teachers in using and contributing to projects as part of, or in sympathy with, the curriculum.

There is no such thing as a free lunch that isn’t stale or lacking in flavour. To make and keep people more engaged, a nominal payment system that allows citizen scientists to earn convertible value, such as a cryptocurrency, should be introduced. The payment does not need to be substantial, but should be sufficient to allow an industrious individual to earn up to $1 an hour. This would make participation attractive to groups who have the time and access to an interface but would not otherwise engage: agency and zero-hours contract workers, the unemployed, homeworkers, the disabled, carers and the retired would all be more likely to engage if they earned a nominal amount for their efforts. Paying contributors, I would aver, would result in higher recruitment and a faster individual work rate, shortening study periods considerably. Time and funds previously wasted on promotion could be reclaimed as the prospect of reward encourages viral promotion and longer participation. Overall, paying contributors would likely lower overheads and improve efficiency.

It’s a whole new Paradigm is this Citizen Science, Data Driven, Sharing, IoT Society Thingy…

Data, Databases and Distributed Networks

Random Thoughts on Cargo, Ships and Oceans



We tend to regard data as if it were a thing with dimensions and boundaries. A product of the information age we live in, it travels like the cargo of a ship on the virtual ocean that is the information highway; when in fact the cargo, the ship and the information highway are all data. There is only ocean.

This ocean of data drives society, determines national budgets, aids decisions in industry and pigeonholes us into social and economic groups. From the global to the personal level, data plays a significant role in all the decision processes of everyone’s life. Processes that, if based on poor, inaccurate, out-of-date or misleading data, risk producing decisions that are equally poor, misleading and out of date.

So, if we are to make good decisions, we need to know the outcomes, benefits and consequences of our actions on ourselves, our neighbours and our environment. We need to understand the relationship between the macro and the micro, the local and the global, and the only way to do that is through the data.

According to some reports we have generated more data in the last five years than in our entire history, and each year we generate more. With this explosion in data come opportunities for improving our decision processes and achieving global sustainability objectives. However, with those opportunities come challenges in handling, differentiating and working out just what is and is not useful. For no data is better than the wrong data. The right data, however, despite what Mark Twain would aver, makes for good statistics, and good statistics support good decision processes. But what is the ‘right data’ in an information age awash with the stuff?

What Is Data

The internet is data; everything on it and every piece of software on a computer is made up of data. However, in the context herein data has the narrower scientific definition of

“a set of values or measurements of qualitative or quantitative variables, records or information collected together for reference or analysis” (Wikipedia);

it is the cargo on our ship…

The contents of a telephone book are an example of data collected for reference, data that can be, and is, put into databases for analysis. Once entered it can be re-organized and sorted so as to reveal how names are distributed, measure their frequency and estimate ethnic or socio-economic distributions. The analysis might reveal odd correlations, trends and anomalies that would otherwise be missed, such as the frequency at which three sixes appear in the telephone numbers of people with double-barrelled names. Such anomalies can fuel conspiracies and are examples of statistics being used as a drunk uses a lamp post: more for support than illumination. In truth there is little one can get from a telephone book other than a telephone number and an address. That’s not to say the data isn’t useful.

Types Of Data

Data categorisation is very much dependent on purpose; there is no single category structure applicable to all. With that in mind, I propose four data spheres to initially distinguish data types.

Personal Data

A telephone book is just one source of personal data, as is a mailing list, a club membership, a bank account or a tax office receipt. Individually these data sources provide limited information about an individual, but they contain fields (name, address, etc.) that make it easy to link the data so that collectively it documents extensive details about an individual’s personal and financial life. Scary stuff, and whilst it’s the most precious kind of data, it makes up an insignificant fraction of the total data currently held or being generated on the internet.

Economic Data

The state of the nation, the productivity of industry and the movement of goods and services within and between trading entities rely on the supply of good data. The budget, government policy and changes to or the creation of new laws all rely on good, relevant data. Without it there would be no means to balance the books, to calculate a nation’s GDP or to value its currency. However, data collection currently lags behind the policy that relies on it. At best the figures are for the previous quarter, but more often than not they are estimates aggregated together from different sources.

Sociopolitical Data

Domestic government policy on health and education, as well as changes to and the creation of new laws, all rely on good data. At the regional level, data determines how policy will be implemented and budgets distributed between schools, policing, refuse collection, etc. National and local government therefore need quantitative and qualitative data on the demographics, social trends, and political, cultural and ethnic identities of the people they serve.

Environmental Data

Environmental data includes any lab, field or desktop data from any chemical, physical or biological discipline in the natural sciences. All data relating to Earth and biological disciplines, from theoretical particle physics to the applied science of agriculture, are forms of environmental data.

Non Exclusive Nature Of Data

Within these spheres data can be quantitative/qualitative, spatial/temporal, deterministic/stochastic or combinations thereof. The data may similarly be relevant to a few or to many, and have a lasting or fleeting influence; whilst most data conforms to the categories above, some straddles more than one, and all of it interacts with and influences the data in the others. So whilst we can compartmentalize data, we can only understand it in the context of the whole.

What Is A Database

A database is an application (program) into which data can be input and organised, providing an indexing system and statistical information on the data. A simple data set could be the membership list of a golf club, each entry containing a member’s name, age, address, joining/subscription date and details of their achievements (e.g. handicap, or records held). The database would allow the club to sort the details by any field (name, age, address, joining date, subscription renewal, handicap, etc.), compile simple statistics (e.g. average age, length of membership) or see who hadn’t paid their subs. A database might store values, charts, tables and files, or just the location of the data, as with BitTorrent file-sharing sites or search engines (e.g. Google).
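The golf club example can be sketched with Python’s built-in sqlite3 module; all the member names, fields and values below are invented purely for illustration:

```python
import sqlite3

# A hypothetical golf club membership table, held in memory.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE members (
    name TEXT, age INTEGER, joined TEXT, subs_paid INTEGER, handicap INTEGER)""")
con.executemany("INSERT INTO members VALUES (?, ?, ?, ?, ?)", [
    ("A. Palmer",    52, "2001-04-12", 1, 4),
    ("B. Hogan",     64, "1995-09-30", 0, 2),
    ("C. Sorenstam", 38, "2010-06-01", 1, 1),
])

# Sort by any field, e.g. handicap (best first).
best = con.execute("SELECT name FROM members ORDER BY handicap").fetchall()

# Compile a simple statistic: average age of the membership.
avg_age = con.execute("SELECT AVG(age) FROM members").fetchone()[0]

# See who hasn't paid their subs.
unpaid = con.execute("SELECT name FROM members WHERE subs_paid = 0").fetchall()
```

The same queries scale from three rows to three million; that re-sortability is what separates a database from a printed list.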

Types Of Database

All databases store information, ideally for easy retrieval. What differentiates one from another is the way the data is stored (within the database itself, or links to an external location), where the database is held (central or distributed), and how the data is subsequently accessed (public or private).

Traditional Database

Whilst limited and not generally regarded as a true database, a spreadsheet performs all the basic functions of one. MySQL, the database in the LAMP (Linux, Apache, MySQL, PHP) stack that drives much of the internet, is an example of a more complex database. A MySQL database stores a web site’s content and links to its media. This content is accessed through PHP scripts (i.e. a content management system like WordPress) and then served to the internet by an Apache server running on Linux.

Distributed Hash Table (DHT)

A Distributed Hash Table (DHT) is a database that stores only the location(s) of a file along with a hash value (a unique reference derived from the contents of the file). The hash value stored in the database can then be compared with that of the external file in order to verify the integrity of the external file. A DHT may also hold data on when the file was added, the last time it was accessed and the total number of calls made to the file. A DHT is the mechanism used for indexing and distributing files across a P2P network.
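The integrity check described above can be sketched in a few lines of Python. This is only an illustration of the hash-versus-content comparison, not any real DHT implementation (real DHTs such as Kademlia also partition the table itself across peers); the function names and the peer identifier are hypothetical:

```python
import hashlib

def content_hash(data: bytes) -> str:
    # The "unique reference" derived from the file's contents.
    return hashlib.sha256(data).hexdigest()

# The table stores only metadata: hash -> locations and access count,
# never the file itself.
table = {}

def announce(data: bytes, location: str) -> str:
    key = content_hash(data)
    entry = table.setdefault(key, {"locations": [], "calls": 0})
    entry["locations"].append(location)
    return key

def verify(key: str, retrieved: bytes) -> bool:
    # Re-hash what a peer actually sent and compare with the stored key.
    table[key]["calls"] += 1
    return content_hash(retrieved) == key

key = announce(b"survey results v1", "peer-17")
verify(key, b"survey results v1")  # intact copy: hashes match
verify(key, b"tampered results")   # altered copy: hashes differ
```

Because the key *is* the content hash, any peer can detect a corrupted or tampered copy without trusting the peer that served it.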


Blockchain

The Bitcoin blockchain solves trust issues for cryptocurrency, but burns a lot of fossil fuel in the process. Although the Bitcoin blockchain is referred to as a distributed database, it is more a duplicated ledger, with every node maintaining an identical copy of the entire database. All nodes compete to balance the ledger by guessing a hash value; a value that can’t be calculated easily and can only be discovered by brute force. Guessed correctly, it balances the entire system and creates a block. That, in a nutshell, is the proof-of-work concept that makes the Bitcoin blockchain secure: a very energy-hungry solution to an integrity issue with Homo sapiens.
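The brute-force guessing can be sketched as a toy proof of work in Python. This is only the principle, not Bitcoin’s actual algorithm (which double-SHA-256 hashes a structured block header against a much harder, adjustable target):

```python
import hashlib

def mine(block_data: str, difficulty: int = 4):
    """Brute-force a nonce until the hash starts with `difficulty` zero
    hex digits -- the 'lucky guess' that creates a block."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest  # guessed correctly: the block is sealed
        nonce += 1  # wrong guess, burn more energy and try again

nonce, digest = mine("ledger entries for block 1", difficulty=4)
```

Each extra zero of difficulty multiplies the expected number of guesses by sixteen; that exponential cost is exactly where the energy bill comes from.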

A Framework For Sustainability

In the previous post I summarised a recent technical report by the Open Data Institute (ODI) which raised the need for a “blockchain ecosystem to emerge that mirrored the common LAMP 7 web stack” and was “compatible with the Web we have already.”

Reliable and secure though the software that underpins the LAMP stack is, it is now nearly 20 years old and has arguably reached its peak. It has similarly evolved to be better at generating data than dealing with it. It’s good at serving files, not dealing with the information in them; so whilst a data stack needs to evolve alongside the existing web structure, it will likely be an evolution independent of it. One ‘promising’ data stack identified by the ODI team which met this criterion was “Ethereum as an application layer, BigchainDB as a database layer and the Interplanetary File System (IPFS) as a storage layer”.



Application Database Storage (ADS) Network

Unlike the LAMP stack, the data ecosystem is more likely to evolve as a weave of intertwined data streams that converge on nodes that use the data. Whereas with the LAMP stack exchanges between nodes occur at the server level, in an ADS network exchanges of data would occur in all three layers: Application, Database and Storage.

The Application Layer

What makes databases powerful are the scripts, applications, programs and content management systems that use them. Scripts are similarly responsible for entering data, and with the rapid growth in smart appliances and the IoT this data inputting is increasingly becoming automated. How useful all that data ultimately turns out to be will depend as much on the applications that can use the data effectively as on the databases that store and organize it. Once data no longer has a processing value it would be archived, an action that would also be performed by an application.


The Database and Storage Layers

Data with different economic, social and environmental relevance, much of it originating from the application layer, is indexed and organized through the database layer before finding its way into the storage layer. There is, to a degree, some blurring of the lines between these two layers, the database layer being dynamic whilst the storage layer is more for large files, legacy databases, and redundant or archived data.

Blockchain As Metronomes In An ADS Network

The main function of a blockchain is to provide an immutable ledger that can be trusted. It’s a property an ADS network can exploit in order to synchronize databases. In particular, supply chain auditing on a blockchain would provide a trusted data source for multiple users in a network, blockchain being the ideal tool with which to build an authentication and tracking system that shadows produce as it moves from farm to fork (strengthening the food chain with a blockchain).

A Manifest Of Global Agricultural Produce

Providing invaluable data to producers, importers, retailers and consumers alike, an authentication and tracking system on the blockchain would allow the origin of produce, and the route it took to market, to be verified.

Once established, a consumer would have access to an audit trail through which they could authenticate the origin, production standards or carbon footprint of food. Detailing the precise route that produce took from the field to the shelf would give importers and retailers insight into double handling, stalling and wastage en route, whilst national and supranational bodies would have precise data on the production, origin and consumption of agricultural produce. If data be the cargo in an ADS network, a supply chain authentication and tracking system is the ship that carries that data.
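Such an audit trail can be sketched as a hash-linked list of records, which is the core idea a blockchain generalises: each record carries the hash of the one before it, so any tampering breaks the chain. The stages and field names below are hypothetical, chosen only to echo the banana example:

```python
import hashlib
import json

def record_hash(record: dict) -> str:
    # Deterministic hash of a record's contents.
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append(chain: list, stage: str, detail: str) -> None:
    # Each new record commits to the hash of the previous one.
    prev = record_hash(chain[-1]) if chain else "0" * 64
    chain.append({"stage": stage, "detail": detail, "prev": prev})

def verify(chain: list) -> bool:
    # Re-derive every link and compare with the stored prev-hash.
    return all(chain[i]["prev"] == record_hash(chain[i - 1])
               for i in range(1, len(chain)))

trail = []
append(trail, "farm", "bananas, lot 42, St Lucia")
append(trail, "port", "loaded, Castries")
append(trail, "shop", "corner shop, London")
intact = verify(trail)          # the unaltered trail checks out

trail[1]["detail"] = "rerouted"  # tamper with the middle record...
tampered = verify(trail)         # ...and every later link fails
```

A real blockchain adds consensus and replication on top, but the auditability the paragraph describes rests on exactly this linking.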


Sowing The Seeds For Integrated Crop Production And Management Systems

With an authentication and tracking system in place, a farmer would be able to track in real time how much produce left the farm and reached the intended market. They would be able to see this relative to their neighbours, relative to the acreage of a given crop in a region, and relative to all the routes that crop took to market. Without having to communicate directly, all farmers in a publicly accessible authentication and tracking system would be exchanging data that would help them all plan and co-ordinate crop choices and market logistics.

It is a small step for that hub to widen, to encourage integrated crop production and management in farms across a region and improved logistics to tackle over- and under-production and transport wastage. One more step and farmers could begin to operate in their own regional networks, not only to produce and supply food but to create co-operatives to allocate resources more amicably or to develop integrated fertility programmes. My experiment with IRCC Cameroon was an attempt to remotely put such a structure in place.

Supporting The Development Of A Peer To Peer Economy

As well as farmers, retailers and consumers could build co-operatives around a supply chain. Orders could be automatically coordinated through logistics operators to find the optimum route, and then tracked to the delivery address. On arrival, the order could trigger a payment or payments. It’s a future that relies on the establishment of an authentication and tracking system, as well as marketplaces to promote and display the wares.

A good example of a blockchain authentication and tracking system is Deloitte’s ArtTracktive blockchain. Launched in May of this year to “prove the provenance and movements of artwork”, the same technology, despite the huge difference in the value of the goods, could be used to authenticate and track a hand of bananas from the Caribbean to the corner shop as easily as it can track a Basket of Fruit from Caravaggio to the Biblioteca Ambrosiana in Milan.

Widening the tracking remit are the London-based startups Blockverify and Provenance. A blockchain initiative on the Ethereum platform, Provenance currently provides authentication and traceability of bespoke goods, and is actively exploring retail supply chain tracking. Blockverify similarly claims to be able to provide blockchain authentication to the pharmaceutical, luxury goods, diamond and electronics industries.

Cropster, a company that creates software solutions for the speciality coffee industry, similarly provides provenance to coffee producers so they can “instantly connect to a centralized market where thousands of roasters are actively looking”; provenance that could be enhanced further by an authentication and tracking system following the beans’ entire journey from plantation to cup.

Undermining The Dark Web

OpenBazaar, a peer-to-peer marketplace now integrated with IPFS, is a decentralized Amazon/eBay that charges no fees and uses an escrow system with Bitcoin for payments. Although OpenBazaar discourages illicit trade, being a P2P network makes policing that policy difficult. Escrow brings in a new layer of authentication, a layer that would be enhanced and strengthened by an authentication and tracking system.

A decentralised marketplace using Bitcoin and supply chain tracking on a blockchain would represent the first completely decentralized marketplace to be created on the web. Whilst not completely ending the Dark Web, an authentication and tracking system would address many of the anonymity issues P2P networks and cryptocurrency create by authenticating sender, delivery and recipient. It is potentially a mechanism better suited to assisting the development of wholesale markets than a P2P reinvention of Yahoo Auctions.

Blockchains and global data infrastructure


Applying blockchain technology in global data infrastructure report by the ODI


Disclaimer: This 800-word summary of the 8,000-word ODI document Applying blockchain technology in global data infrastructure was created to provide an overview of, and some commentary on, what this author believes to be the most significant points in the report. It is the personal view of the author, shared here for those too lazy to read the whole document and come to their own informed opinion.


Most Promising Applications

The report identifies that Bitcoin and cryptocurrency applications dominate the blockchain space and advises against “being swept up by ‘blockchain hype’ and to remember to focus on solid user needs… Whilst there are promising applications, a great many of the ideas out there are ‘vapourware’, with no viable implementation or model. There are also many instances of old ideas that failed for good reasons and the addition of a blockchain will not change those reasons.”

Concentrating on “non-financial use cases” the authors identified four promising application areas for blockchain technology:

1. Document and intellectual property verification
2. Monitoring supply chains (prev post: strengthening the food chain)
3. Building a peer-to-peer economy
4. Governance

Principal Drawbacks

They also note that in “an append-only system” data, once added to a blockchain, can never be removed. This, as the authors highlight, has consequences for privacy and scale. However, the indelibility of the data (the fact that it is permanent and cannot be altered) is equally the concept’s main selling point.

The authors argue that there are “drivers for having a few blockchains maintained by a large number of nodes, and drivers for having many blockchains maintained by a small number of nodes. It is likely it will end up somewhere in the middle.”

However, based on the examples of Bitcoin and Ethereum, there is currently only one driver. Nodes are maintained by mining rigs employed specifically to generate a reward for creating blocks, a situation that has spawned voracious mining pools that gobble up huge amounts of energy in the process. If and when mining ceases, or the energy costs exceed the value of the reward, the nodes will stop maintaining the blockchain. So there must be other benefits to encourage the nodes to continue.
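The break-even logic in that argument can be sketched in a few lines. All the figures below are illustrative assumptions, not real network data; the point is only that a node keeps mining while reward value exceeds energy cost.

```python
# Hypothetical break-even check: a node keeps mining only while the
# block reward is worth more than the energy spent earning it.
# Every number here is an illustrative assumption.

def mining_is_viable(reward_coins: float, coin_price: float,
                     power_kw: float, hours_per_block: float,
                     electricity_price_per_kwh: float) -> bool:
    """Return True if the expected reward covers the energy bill."""
    revenue = reward_coins * coin_price
    energy_cost = power_kw * hours_per_block * electricity_price_per_kwh
    return revenue > energy_cost

# Example: a 5 kW rig that averages one block reward per 2000 hours.
print(mining_is_viable(reward_coins=6.25, coin_price=400.0,
                       power_kw=5.0, hours_per_block=2000.0,
                       electricity_price_per_kwh=0.15))  # True: 2500 > 1500
```

If the coin price falls far enough, the same call returns False and, on this model, the node has no reason to keep maintaining the chain.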

More Paper Clips!

Similarly, a blockchain that stores a lot of data for multiple applications will also store a lot of data that is irrelevant to most nodes. That irrelevant data will cause the blockchain to bloat in size and raise the difficulty. Nodes will simply use up large amounts of computational power to maintain a blockchain that is, from their perspective, full of ‘irrelevant’ data. Both the mining reward and the one-chain-for-everything model are thus flawed: one encourages nodes composed of server banks fed by a pool of specialist mining rigs, the other populates nodes hashing out blocks filled with data that they (and possibly no one else) have any use for. Both are behemoths churning out nonsense and relying on the energy output of a medium-sized country to do so.

Reward systems, like blockchains that carry too much data for too many applications, just encourage higher levels of hashing difficulty to be reached sooner rather than later. The immutability of blockchain data is in this respect both its purpose and its obstacle: it is forever doomed to trip over its own shoelaces. A suite of smaller, specialised blockchains that provide cost-efficient services to the nodes that maintain them can thus thrive for as long as those services remain cost-effective. If they do not, the nodes will simply discard that blockchain in favour of, and without impact on, ones that are.

Waiting on Superman

Much of this promise, however, relies on, as the authors point out, “a technology stack that has not yet fully emerged”. As the field evolves, the authors anticipate that a common technology stack, similar to the LAMP web stack, will emerge. The authors also posed the following questions with respect to data compatibility:

  1. How do we standardise storage in systems so that we get a single network of data, as opposed to having to use a different storage system every time we want a new type of information?
  2. What are the data protocols for distributed storage?
  3. How do we talk about, and perhaps enforce, ownership and licensing?

It’s Life Jim…


The authors conclude that distributed ledgers are “potentially important for enabling a shared data infrastructure” that could see “Blockchains used to build confidence in [private and] government services.” There was similarly “great potential for blockchains in collaborative maintenance of data for applications such as supply-chain information”. Smart contracts were also seen as having promise; however, the authors “uncovered many cases that were little more than attempts to bolt failed ideas onto the technology or reinvent things that work perfectly well” without blockchain technology.

Perhaps when it’s all said and done, there is only so much one can do with a ledger, distributed or not…

Strengthening The Food Chain with Blockchain

Strengthening The Food Chain with a Blockchain
A publicly accessible authentication and tracking system

Produce Authenticity Log [PAL]

The blockchain in this scenario is used as a simple global tracking system to facilitate the logistics and authentication of global produce. It is not an alternative to existing logistics and supply chain mechanisms, nor is it a smart contract or payment system, but a ledger in which the produce is the unit of currency and the blockchain the means to authenticate the origin and destination of the produce. In particular it would set out to achieve the following goals:

  • Authenticate and track the distribution of agricultural produce across the globe in real time.
  • Provide a secure, robust and publicly available record authenticating the origin, method of production and subsequent route from farm gate to shop shelf.
  • Provide a mechanism to prevent mislabelling of foods as organic, fair traded or originating from a country other than stated.
  • Facilitate the management of import licenses and the issuing of standards and certificates.

The Concept of PAL

A PAL is created by a node running the blockchain [and placed in a parent wallet].
An amount to reflect the value (qty) of the produce is allocated to a portion of the PAL.
This portion is moved into a consignment wallet.

The PAL is then ‘called’ by the wallets in the supply and retail chains, every movement being initiated by the recipient rather than the sender as the produce moves along the supply chain. The end customer will then be able to use the PAL to authenticate produce origin and transport history.

Once created, a portion of the PAL, equivalent to the produce volume, is sent to a ‘consignment wallet’, with any unused portion remaining in the parent wallet. The portion of the PAL now in the consignment wallet can be transferred out by trusted wallets in the supply chain. This is repeated as the PAL follows the produce along the supply chain. If the produce is split and sent along different supply routes, the PAL too can be split to follow the destination of everything that left the farm gate.

The PAL is ‘called’ by the recipient wallets in the Supply and Retail chains rather than sent by the Consignment wallet as both an anti-spam measure and a mechanism to aid the fluid ‘unhindered’ distribution of produce.
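The flow described above can be sketched as a small data structure. This is a minimal illustration of the idea, not a specification: the wallet names, the `call` method and the in-memory ledger are all my assumptions.

```python
# Minimal sketch of the PAL flow: a quantity is created in a parent
# wallet, moved to a consignment wallet, then 'called' onward by each
# recipient. All names and structures here are illustrative.

class PAL:
    """A produce authenticity log: quantities of one consignment,
    held across wallets, with every movement recorded."""

    def __init__(self, pal_id: str, quantity: int, parent_wallet: str):
        self.id = pal_id
        self.balances = {parent_wallet: quantity}  # created in the parent wallet
        self.history = [("created", None, parent_wallet, quantity)]

    def call(self, recipient: str, sender: str, quantity: int) -> None:
        """Recipient-initiated transfer: the next wallet in the chain
        'calls' the PAL rather than the sender pushing it."""
        if self.balances.get(sender, 0) < quantity:
            raise ValueError("sender does not hold that much of this PAL")
        self.balances[sender] -= quantity
        self.balances[recipient] = self.balances.get(recipient, 0) + quantity
        self.history.append(("called", sender, recipient, quantity))

# 10 units leave the farm gate; the consignment wallet calls them first.
pal = PAL("PAL-0001", 10, parent_wallet="farm")
pal.call("consignment", "farm", 10)
# The load is split between two supply routes, and the PAL splits too.
pal.call("exporter_a", "consignment", 6)
pal.call("exporter_b", "consignment", 4)
print(pal.balances)  # every portion that left the farm is still accounted for
```

Because every movement appends to `history`, the end customer (or a regulator) can replay the full route from farm gate to shop shelf, and the balances always sum to what originally left the farm.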

Benefits of PAL

Data linked to producer/regulatory organisations and to supply and retail chains would provide:

  • A full audit trail authenticating the produce’s origin and production standards.
  • Transnational, cross-operator tracking independent of all the operators who use it.
  • Global statistics on food production and consumption.
  • Identification of points of failure, such as unnecessary delays or excessive wastage and losses.
  • A precise log of when each transfer of goods occurred.
  • Help in tracing the origin and routes of a pest or disease outbreak (including human pandemics, e.g. Ebola).

Malicious attack

A node could create fake logs with the intention of spamming the system; however, it would need access to a supply chain of trusted wallets to call the log. Without such a route, any PAL created would remain in the parent wallet of the node. A rogue node could still set about producing millions of PALs, potentially threatening scalability and operability. Limiting the rate at which a node could create a PAL to one every 5 seconds would cap a node at roughly 6.3 million PAL creations in a year: a year in which the node would need to be maintaining the blockchain, consuming energy and securing the network in the process, in order to produce some six million innocuous entries.
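The rate-limit arithmetic above is worth making explicit:

```python
# One PAL creation every 5 seconds caps what a rogue node can
# produce in a non-leap year.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000
MIN_INTERVAL = 5                        # seconds between PAL creations

max_pals_per_year = SECONDS_PER_YEAR // MIN_INTERVAL
print(max_pals_per_year)  # 6307200, roughly 6.3 million
```

So even a node that does nothing but spam at the maximum rate produces only about 6.3 million entries a year, while paying the full cost of running a node the whole time.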

No smart contracts

A smart contract is an executable command triggered when conditions in the blockchain are met: it may trigger a payment or some other action following an event, or set of events, occurring in the blockchain. In more common parlance this would be called an ‘automatic’, rather than a smart, payment or action. Smart contracts are, however, inflexible: they do not allow for changing plans and, being hard coded, run the risk of becoming recursive monsters, triggering actions and payments long after the relevant entities have expired. Once written into the blockchain, it is impossible to change the action of a smart contract without changing the entire ledger. This does not preclude executable commands (smart contracts) working with blockchains, but the two can (and likely should) be distinct entities, with the blockchain providing the authentication mechanism for an application to read and then execute a command.
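The separation argued for here can be sketched as follows. The ledger entries, event names and the `release_payment` action are all hypothetical; the point is that the trigger logic lives in an ordinary application outside the chain, so it can be changed without rewriting the ledger.

```python
# Sketch of keeping executable commands outside the blockchain:
# the chain is only an authenticated record; a separate application
# reads it and decides what to execute. Changing the action means
# changing this program, not the ledger.

ledger = [  # hypothetical authenticated entries read from the chain
    {"pal": "PAL-0001", "event": "arrived", "wallet": "retailer"},
    {"pal": "PAL-0002", "event": "in_transit", "wallet": "exporter_a"},
]

def release_payment(entry: dict) -> str:
    # Placeholder for an off-chain action (a payment, a notification...).
    return f"payment released for {entry['pal']}"

def run_executor(entries: list) -> list:
    """Read authenticated entries and act on them off-chain."""
    actions = []
    for entry in entries:
        if entry["event"] == "arrived":             # the trigger condition
            actions.append(release_payment(entry))  # executed outside the chain
    return actions

print(run_executor(ledger))  # acts only on the entry that has arrived
```

Unlike a hard-coded on-chain contract, this executor can be stopped, amended or retired when plans change, while the ledger it reads remains immutable.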

Mapping UK Habitats

A soil carbon and land use database for the United Kingdom.
[Bradley et al 2005]


The above paper describes the compilation of a database to estimate soil carbon stocks and carbon dioxide emissions from UK soils by interpolating the analysis of 11,000 soil horizons and site data with legacy soil maps at 1:250,000 scale, in order to “derive high-resolution spatial data on soils and land-use data for use by a dynamic simulation model of carbon fluxes from soils resulting from land-use changes”.


Since the creation of the database, the European Space Agency’s (ESA) Copernicus missions have measured, mapped and observed the Earth’s surface at 10, 20 and 60 m resolutions. A vast library of images is now freely available, one that could be used to improve and update the habitat component of the soil carbon and land use database above. Furthermore, because an extended spectrum was used, the data provides imagery not only in the visible spectrum but also in the infrared and microwave, yielding data on the thermal, hydrological and gaseous properties of the Earth’s surface. It is a library that could significantly enhance and update the soil carbon and land use database, to the extent that it extends its use beyond CO2 simulation models.

The Crowd

In Africa, the African Soil Information Services (AfSIS) has been taking advantage of this imagery to develop land use maps. It is a first for Africa, which does not have underlying soil maps to build upon and covers over 30 million km2. Working at a resolution of 250 m, AfSIS is using simple yes/no questions to harness the power of the crowd to map Africa’s land cover. The questions, as with Bradley, set out to identify the basic land uses in Africa: Cultivated, Grassland, Woodland and Built, categories that correspond to Bradley’s categories of Cultivated-Arable, Woodland, and Semi-natural.

It’s worth noting that, as the UK covers only 240,000 km2 (under 2 million images at a resolution of 250 m), it would take two thousand volunteers less than a month to completely revise and improve both the resolution (by a factor of eight) and the accuracy of the land cover aspect of the soil carbon and land use database. As the Copernicus missions are ongoing, so too could be this revision process, so that actual land cover is always accurately reflected in the database.
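The workload estimate above can be checked with a little arithmetic. The image count and volunteer numbers are the article's figures; the per-day classification rate is my assumption.

```python
# Volunteer workload estimate, using the article's figures plus an
# assumed classification rate (the per-day rate is my assumption;
# simple yes/no questions on an image go quickly).

images = 2_000_000        # tiles covering the UK, per the text
volunteers = 2_000
images_per_volunteer = images // volunteers   # 1000 images each
rate_per_day = 40                             # assumed images per volunteer per day

days_needed = images_per_volunteer / rate_per_day
print(days_needed)  # 25.0 days, i.e. under a month
```

Even halving the assumed rate still brings the revision in at under two months, so the "less than a month" claim is not especially sensitive to the assumption.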

Farm/Field derived Data

Approximately 70% of the UK’s land is agricultural with half under cultivation.

In many cases these cultivated lands are used in precision farming operations where soil properties have been measured and mapped at a resolution of 1 ha or less. With as much as a third of the UK’s soil properties mapped at field scale, and the legacy soil maps used by Bradley at a resolution of 1 km, these field measurements represent a great opportunity to enhance the soils data in the soil carbon and land use database.
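To put that resolution gap in concrete terms, a single 1 km legacy grid cell contains one hundred 1 ha field cells:

```python
# Resolution gap between the 1 km legacy soil maps and 1 ha
# field-scale precision farming data.

legacy_cell_m2 = 1_000 * 1_000   # one 1 km x 1 km legacy grid cell
field_cell_m2 = 10_000           # one hectare

cells_per_legacy_cell = legacy_cell_m2 // field_cell_m2
print(cells_per_legacy_cell)  # 100 field cells per legacy cell
```

Every legacy map value could, in principle, be replaced by a hundred measured ones wherever field-scale data exists.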

The British Geological Survey (BGS) has similarly produced a mySoil app which allows the general public to assess their soil and add data to the BGS database. Unfortunately there is no GNU/Linux version of this app, and my request for more information has not yet been answered, so I can only assume that this app could also help to improve the soil carbon and land use database.

Citizen Science

The advent of the internet has brought about the opportunity for mass data exercises that were previously impossible, their datasets too extensive for a small team of researchers to analyse. Such studies relied on supporting datasets and statistical techniques designed to squeeze the biggest truth out of the least amount of effort. With the internet there is no need to lean on these methods to the same extent; instead, the statistics can be applied to qualify the integrity of the data and illuminate its implications. (link: The BGS Citizen Science home page)