A Shared Global Data Ecosystem for Agriculture and the Environment


The executive summary of GODAN’s recent discussion document ‘A Global Data Ecosystem for Agriculture and Food’ (the cover of which manages to capture something of the problem with the modern agricultural environment) calls for:

..a common data ecosystem, produced and used by diverse stake-holders, from smallholders to multinational conglomerates, a shared global data space..

The report identified stakeholder engagement, provenance in data sourcing and handling, sharing, and collaborative frameworks as key components in developing a global data ecosystem.

Stakeholder Engagement and Data Integrity

However, within the agricultural sector “many groups might not have obvious motivation to participate in data sharing and use…” and “..in order to get trust-worthy data, there has to be a direct reward to the data supplier.” The authors further state that “a large part of the motivation for data sharing has to do with how widely it will be shared, with whom and under what conditions.”

There is, justified or otherwise, suspicion that data may be misappropriated to the provider’s disadvantage or provide disproportionate advantage to others. The perceived risk of negative unforeseen consequences can outweigh any potential benefits of sharing data, particularly when those benefits cannot be readily quantified or realized in the short term.

Stakeholders may develop a ‘big brother’ mentality, responding by withholding data or deliberately providing inaccurate data in the belief that they are better served by doing so. This problem is amplified in the provenance of agricultural products, which “undergo a chain of transformations and pass through many hands on their way to the final consumer”. A single drop in the veracity of data at any point in the chain potentially undermines all the data in that chain. Sadly these concerns are not confined to small farmers and supply chain operators; they are just as prevalent among, and as strongly held by, many of the big data holders such as trans-national corporations, governments and academic institutions.

Informed Consent

Whilst the integrity of the source and the veracity of the data are important factors in building a global data ecosystem, the authors further identified ‘documentation, support and interaction’ as key to fostering trust. Data providers and users need to interact so as to serve each other’s needs better and ensure that stakeholders feel included, not just sampled. Stakeholders need to be confident that sharing data with the whole ecosystem brings no negative consequences to themselves and no disproportionate benefits to others.

Sharing Frameworks

Where the data is held, who maintains it, its veracity, accessibility and availability to the whole ecosystem, as well as who pays to deliver those services, are issues that also need to be addressed. A global data ecosystem cannot rely on single large repositories acting as data silos, or on individual data providers to maintain data crucial for network function. Data needs to be distributed and maintained across the system to prevent bottlenecks and failure points. The concept of the ADS (application database storage) network, which exploits the distributed network concept, could potentially offer resolutions to many if not all of these issues.

Data Conformity and Convention

Whilst stakeholders need an environment that is transparent, robust and secure, the data, like all the documentation and support in that environment, needs to conform to certain conventions. The ‘five star open data maturity model (available, structured, non-proprietary format, referenceable and linked)’ lays out a basic checklist, but these properties themselves need to further conform to taxonomies and naming conventions (controlled vocabularies) that are interdisciplinary and make it easy to relate data from different sources. These conventions must themselves be explained in, and applied to, any documentation and support.
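As a rough illustration of what a controlled vocabulary does in practice, the sketch below maps the different names two data sources might use for the same crop onto one preferred term, so their records can be related. The vocabulary, the synonyms and the function names are all hypothetical and are not taken from the GODAN report or from any published taxonomy.

```python
# Hypothetical controlled vocabulary: each preferred term lists the
# local synonyms that should be mapped onto it. The terms are
# illustrative only and are not taken from any published taxonomy.
CROP_VOCABULARY = {
    "maize": {"maize", "corn", "zea mays"},
    "cassava": {"cassava", "manioc", "yuca"},
}

# Invert the vocabulary so any synonym can be looked up directly.
SYNONYM_TO_TERM = {
    synonym: term
    for term, synonyms in CROP_VOCABULARY.items()
    for synonym in synonyms
}


def normalise_crop(name):
    """Return the preferred term for a crop name, or None if unknown."""
    return SYNONYM_TO_TERM.get(name.strip().lower())


def relate_records(*sources):
    """Group records from several sources under their preferred crop term."""
    grouped = {}
    for source in sources:
        for record in source:
            term = normalise_crop(record["crop"])
            if term is None:
                continue  # unknown terms are skipped rather than guessed
            grouped.setdefault(term, []).append(record)
    return grouped


farm_a = [{"crop": "Corn", "tonnes": 12}]
farm_b = [{"crop": "maize", "tonnes": 9}]
print(relate_records(farm_a, farm_b))
```

Unknown names are dropped rather than guessed at, which is one possible design choice; a real system might instead flag them for review.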


In order to get trust-worthy data, there has to be a direct reward to the data supplier

For large stakeholders, governments and corporations, that reward may come from the need to provide proof of meeting sustainable development goals and climate commitments, but with smaller stakeholders the same incentives may not apply. The question needs to be asked “what’s the data worth?” or more importantly “what is the cost of not having the data?” Can we achieve global sustainability goals and climate objectives without the majority of stakeholders taking part? If we can’t, is it worth weighting benefits in the short term to favour the smaller stakeholders to encourage them? Should that benefit even take the form of payment for engaging and, if so, can technologies such as blockchain be used to verify data and facilitate those payments? One possible use for such a mechanism would be for the annotation of data such as satellite imagery.

Collaborative Frameworks

The authors draw attention to the fact that sharing data is only the start: “It is one thing to share data, but to achieve the desired gains from a data ecosystem for agriculture, to draw conclusions across the globe to guide decision making, it is necessary to exploit synergy between datasets efficiently.”

Such synergies, however, arise out of a framework that extends beyond purely agricultural data to one that includes all environmental data. It is a framework that similarly needs to be able to integrate seamlessly with more mundane economic, sociopolitical and legal data and frameworks, an integration that will itself give rise to greater synergies between our economic activities and their environmental consequences. Di-Functional Modelling (DFM), to which most of this site is dedicated, is one such framework.

Di-Functional Modelling (DFM)

Designed around the concept of soil fertility DFM was created to model the processes and resources that contribute to the sustainable management of an environmental project. In the normal course these would be the soils of an agricultural unit, a group of units or a component in a unit such as a field, forest or grassland.

However DFM is not restricted to modelling soil fertility and can be used to model other mechanisms in the agricultural and wider environment. [Agriculture in a Zero Emissions Society]

DFM is not, though, a database, blockchain or application but a framework or ‘ecosystem’ within which the inter-dependencies of the whole system can be more easily visualized. DFM can thus assist in the development of databases, blockchains and applications that are inter-operable and can exchange and verify environmental and agricultural data [Data Databases and Distributed Networks].

DFM similarly models the processes and functions of an agricultural system relative to the whole. A whole that further extends to the interactions and exchanges that occur between natural systems and the socioeconomic systems they support. These sociopolitical, economic and legal systems are themselves nested within the model.

These inner mechanisms are connected to the environment by existing supply chain mechanisms, data from which can reveal the true sustainability or carbon footprint of agricultural goods [TRASE]. Further enhancement of these mechanisms with relevant data should make it possible to trace the ingredients of a chocolate bar from field to retail outlet, at every step and at any stage within a step, to give a grand total of the true cost of the indulgence in terms of carbon, habitat or social impact. Once calculated, the totals could be added to your own personal tally of GHG emissions, habitat loss and social deprivation. [strengthening the food chain with the block chain]
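The grand total described above is, at its simplest, a sum over every recorded step for every ingredient. The sketch below shows that arithmetic only. The ingredients, steps and figures are invented for illustration and are not measured data, and the function names are my own.

```python
from collections import defaultdict

# Each tuple is (ingredient, supply chain step, kg CO2e). All names and
# figures are invented for illustration; they are not measured data.
STEPS = [
    ("cocoa", "farm", 1.2),
    ("cocoa", "shipping", 0.4),
    ("sugar", "farm", 0.3),
    ("sugar", "refining", 0.2),
    ("chocolate bar", "retail", 0.1),
]


def footprint_by_ingredient(steps):
    """Sum the recorded footprint of every step, per ingredient."""
    totals = defaultdict(float)
    for ingredient, _step, kg_co2e in steps:
        totals[ingredient] += kg_co2e
    return dict(totals)


def total_footprint(steps):
    """Grand total across every ingredient and step."""
    return sum(kg_co2e for _ingredient, _step, kg_co2e in steps)


print(footprint_by_ingredient(STEPS))
print(round(total_footprint(STEPS), 2))
```

The same structure would serve for habitat or social impact, given a comparable unit for each step.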


DFM was, though, conceived for and is best used to help determine localized land use, crop choices and management strategies based on the available resources and the soil, habitat and hydrological properties. It was not envisaged as a top down tool but as a tool to be applied at the farm end: a means both to audit the farm and its resources and to structure that audit in a way that facilitates integrating scientific data. By repeating the process on successive farms and linking those farms through a content management system, each audit would contribute to a greater one, permitting each unit to enhance its own data with that of neighbouring farms. Extended over a region, the framework would help to manage and allocate resources, plan crop choices and integrate with the natural environment: A Shared Global Data Ecosystem that mirrors the Shared Global Ecosystem we call home.


Towards a Data Ready Farm




The Sustainable Farm

The sustainable farm and by extension a sustainable agricultural sector and planet, is one underpinned by knowledge and driven by data. Knowledge and data that can contribute to crop and livestock choices, resource management and ultimately reveal the sustainability, or not, of an enterprise.

The data ready farm is thus aware of its own resources, the resources of the surrounding environment and the relationship it has with those resources and the markets it supplies.




Local Knowledge: A Land Use Inventory

Whilst technology has a significant role to play, the data ready farm begins with knowledge of itself, the land use (woodland, cultivated, grassland), the inherent properties (soil and water resources), as well as the livestock and crops that depend on those resources. It is a simple inventory at the local scale; one which requires no equipment to perform.

Land Use                       Woodland, Cultivated, Grassland
Inherent Properties    Soil Texture and Water Resources
Land Dependants        Livestock and Crop Choices

The inventory should distinguish land use according to basic habitat criteria: woodland, grassland and cultivated. As this is a farm, the cultivated habitats further differentiate into arable (short rotation), permanent (orchards, vineyards, etc) or heterogeneous (covered crops, flowers, etc). The woodlands and grasslands similarly differentiate, but at this point only grassland connected with farming (pasture and rough grazing) needs to be differentiated. The boundaries between and within the habitats, along with any hedgerows, fences or banks on those boundaries, and the position of any wells, standing or running water within them should also be recorded and mapped. Even if the farm appears homogeneous, with only one land use, crop or livestock, it is still likely to be made up of several parcels of land with varying properties; properties that are not easily visible in themselves but can be revealed by the recording and analysis of simple data, such as soil texture.

hand textural chart by S Nortcliff and JS Lang from Rowell (1994)


Soil Texture

Soil texture, a property that arises out of the relative proportions of sand, silt and clay, strongly influences the hydrological and nutrient characteristics of the soil. Variations in soil texture across a field or farm can thus reveal changes in the hydrology or nutrient status of the soil.

Soil texture can be measured by taking a small sample of soil from just below the surface (10cm). Moistened with water or spit, the sample is then moulded with the hands into a ball. The ball is then deformed and its malleability noted and checked against a chart. Samples are usually taken along a ‘W’ transect positioned across the face of a field and the data bulked to provide a single textural class for that field/plot. All that now remains is to quantify the livestock and crop choices that depend on the land; at this point it is just a matter of listing the type, number and location of stock and crops. This basic reconnaissance map, which needs no equipment to create, can be drawn onto a piece of paper to identify the land use, crop choices, soil texture, location of water and number of livestock.
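Should the paper inventory later be moved onto a computer, one parcel might be recorded along the lines of the sketch below. The field names are hypothetical, and the bulking of the ‘W’ transect is approximated here by taking the most frequently recorded textural class, which is my assumption rather than an established procedure.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Parcel:
    """One parcel in the inventory; all field names are hypothetical."""
    name: str
    land_use: str  # "woodland", "cultivated" or "grassland"
    crops: list = field(default_factory=list)
    livestock: dict = field(default_factory=dict)  # stock type -> head count
    has_water: bool = False
    texture_samples: list = field(default_factory=list)  # one class per transect point

    def textural_class(self):
        """Most frequently recorded class along the transect.

        Assumption: bulking is approximated by the most common class.
        Ties resolve to the class that was recorded first.
        """
        if not self.texture_samples:
            return None
        return Counter(self.texture_samples).most_common(1)[0][0]


top_field = Parcel(
    name="Top field",
    land_use="cultivated",
    crops=["barley"],
    texture_samples=["clay loam", "sandy loam", "clay loam", "silt loam", "clay loam"],
)
print(top_field.textural_class())
```

A list of such parcels is the whole inventory: land use, inherent properties and land dependants, one record per plot.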


A Local Inventory in a Global Context

With remote sensing and mobile technology the inventory and soil data could be annotated directly onto a map from the field. Coupled with geostatistical strategies this could be further developed to create complex contour maps of textural variation across agricultural landscapes. With additional external scientific, environmental and economic data this local inventory could be qualified relative to a global economy.


Scientific Data

Into this inventory scientific data relevant to the sustainable management of resources and the husbandry of crops and livestock can be appended.

Meteorology           Quarterly precipitation figures.
Crop Data                Nutrition, culture, pest and disease.
Livestock Data       Nutrition, stocking numbers, general husbandry.
Soil Mineral data   345 nutrient model



Environmental Data

Integration of environmental data can help the farm be sympathetic to the needs of the natural environment and the species that inhabit it. Aware of the environments and species around it the data ready farm can identify synergies and conflicts and then use that data to find resolutions to conservation, pollution and emissions issues.

Conserve habitats and species
Prevent pollution from soil erosion and nutrient leaching
Reduce emissions from livestock and management practices



Market Data

To meet global sustainability goals the data ready farm must link and integrate with the ‘wider’ economic, sociopolitical and legal frameworks. Data from supply chain mechanisms, political policies, and legal and administrative bodies must integrate seamlessly with data from the agricultural and natural environments to meet SDGs and climate change objectives.

Supply Chains               TRASE and the blockchain
Legal Frameworks       COP22 Objective
Political policy              Paris Agreement

A Local Data Hub

A farm that is aware of itself, the environment and the markets it supplies has the means to measure its sustainability relative to the environment and the markets. However, a farm integrated with neighbouring farms can improve its sustainability. A locally connected farm has greater resilience and can better manage and share resources, integrate crop and livestock choices, and supply markets more efficiently. A local data hub can connect remote farmers, help to build trust and educate them in using and sharing data.

Applications and Databases

To move beyond a simple inventory and into a sustainable data driven future requires the development of applications and databases that complement the framework. Some, such as TRASE, already exist but local databases and applications to share data within a comprehensive and structured framework still need development. [Data Databases and Distributed Networks]

Data Databases Distributed Networks

Random Thoughts on Cargo, Ships and Oceans

(Data, Databases and Distributed Network)


We tend to regard data as if it were a thing with dimensions and boundaries. A product of the information age we live in, it travels like the cargo of a ship on the virtual ocean that is the information highway, when in fact the cargo, the ship and the information highway are all data; there is only ocean.

This ocean of data drives society, determines national budgets, aids decisions in industry and pigeonholes us into social and economic groups. From the global to the personal level data plays a significant role in all the decision processes of everyone’s life. Processes that, if based on poor, inaccurate, out-of-date or misleading data, risk producing decisions that are equally poor, misleading and out of date.

So, if we are to make good decisions, we need to know the outcomes, the benefits and consequences of our actions on ourselves, our neighbours and our environment. We need to understand the relationship between the macro and the micro, the local and the global and the only way to do that is through the data.

According to some reports we have generated more data in the last five years than in our entire history, and each year we generate more. With this explosion in data come opportunities for improving our decision processes and achieving global sustainability objectives. However, with those opportunities come challenges in handling, differentiating and working out just what is and is not useful. For no data is better than the wrong data. The right data however, despite what Mark Twain would aver, makes for good statistics and good statistics support good decision processes. But what is the ‘right data’ in an information age awash with the stuff?

What Is Data

The internet is data; everything on it and every piece of software on a computer is made up of data. However, in the context herein data has the more ‘narrow’ scientific definition of

“a set of values or measurements of qualitative or quantitative variables, records or information collected together for reference or analysis,” (Wikipedia)

it is the cargo on our ship…

The contents of a telephone book are an example of data collected for reference. It is data that can be, and is, put into databases for analysis. Once entered it can be re-organized and sorted so as to reveal how the names are distributed, measure their frequency and estimate ethnic or socio-economic distributions. The analysis might reveal odd correlations, trends and anomalies that would otherwise be missed, such as the frequency at which three sixes appear in the telephone numbers of people with double barrel names. Such anomalies can fuel conspiracies and are examples of statistics being used like a drunk uses a lamp post, more for support than illumination. In truth, though, there is little one can get from a telephone book other than a telephone number and an address. That’s not to say that data isn’t useful.

Types Of Data

Data categorisation is very much dependent on purpose; there is no single category structure applicable to all. With that in mind I propose four Data spheres to initially distinguish data types.

Personal Data

A telephone book is just one source of personal data, as is a mailing list, a club membership, a bank account or a tax office receipt. Individually these data sources provide limited information about an individual but contain fields (name, address, etc) that make it easy to link the data so that collectively it documents extensive details about an individual’s personal and financial life. Scary stuff, and whilst it is the most precious kind of data it makes up only an insignificant fraction of the total data currently held or being generated on the internet.

Economic Data

Knowledge of the state of the nation, the productivity of industry and the movement of goods and services within and between trading entities relies on the supply of good data. The budget, government policy and changes to or creation of new laws all rely on good relevant data. Without it there would be no means to balance the books, to calculate a nation’s GDP or to value its currency. However, data collection currently lags behind the policy that relies on it. At best the figures are for the previous quarter but more often than not they are estimates aggregated from different sources.

Sociopolitical Data

Domestic government policy on health and education as well as changes to and creation of new laws all rely on good data. At the regional level Data determines how policy will be implemented and budgets distributed between schools, policing, refuse collection, etc. National and local government therefore needs quantitative and qualitative data on the demographics, social trends, political, cultural and ethnic identities of the people it serves.

Environmental Data

Environmental data includes any lab, field and desktop data from any chemical, physical or biological discipline from the natural sciences. All data relating to Earth and biological disciplines from theoretical particle physics to the applied science of agriculture are forms of Environmental Data.

Non Exclusive Nature Of Data

Within these spheres data can be quantitative/qualitative, spatial/temporal, deterministic/stochastic or combinations thereof. The data may similarly be relevant to a few or to many, and have a lasting or fleeting influence. Whilst most data conforms to the categories above, some straddles more than one and all of it interacts with and influences the data in others. So whilst we can compartmentalize data we can only understand it in the context of the whole.

What Is A Database

A database is an application (program) into which data can be input and organised to provide an indexing system or display statistical information on the data. A simple data set could be the membership list of a golf club, each entry containing details of a member’s name, age, address, joining/subscription date and achievements (e.g. handicap, or records held). The database would allow the club to sort the details by any field (name, age, address, joining date, subscription renewal, handicap, etc) and compile simple statistics (e.g. average age, length of membership) or see who hadn’t paid their subs. A database might store values, charts, tables, files or just the location of the data, as with BitTorrent file sharing sites or search engines (e.g. Google).
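The golf club example can be tried out with the SQLite database that ships with Python. The sketch below is only an illustration: the table layout, the column names and the three members are invented, and a real membership list would carry more fields.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (name TEXT, age INTEGER, handicap INTEGER, subs_paid INTEGER)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?, ?)",
    [("Ann", 52, 12, 1), ("Bob", 34, 20, 0), ("Cy", 46, 8, 1)],
)

SORTABLE = {"name", "age", "handicap"}


def names_sorted_by(column):
    """Return member names ordered by one of the known columns."""
    if column not in SORTABLE:
        raise ValueError(f"unknown column: {column}")
    rows = conn.execute(f"SELECT name FROM members ORDER BY {column}")
    return [row[0] for row in rows]


def average_age():
    """Simple statistic compiled by the database itself."""
    return conn.execute("SELECT AVG(age) FROM members").fetchone()[0]


def unpaid_members():
    """Members whose subscription is still outstanding."""
    rows = conn.execute("SELECT name FROM members WHERE subs_paid = 0 ORDER BY name")
    return [row[0] for row in rows]


print(names_sorted_by("handicap"), average_age(), unpaid_members())
```

The column name is checked against a fixed set before it is placed in the query, since column names cannot be passed as ordinary query parameters.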

Types Of Database

All databases store information, ideally for easy retrieval. What differentiates one from another is the way the data is stored (within the database itself, or links to an external location), where the database is held (central or distributed), and how the data is subsequently accessed (public or private).

Traditional Database

Whilst limited and not generally regarded as a true database, a spreadsheet performs all the basic functions of one. MySQL, the database in the LAMP (Linux, Apache, MySQL, PHP) stack that drives much of the web, is an example of a more complex database. A MySQL database stores the content of a website and links to its media. This content is accessed through PHP scripts (e.g. a Content Management System like WordPress) and then served to the internet by an Apache server running on Linux.

Distributed Hash Table (DHT)

A Distributed Hash Table (DHT) is a database that stores only the location(s) of a file along with a hash value (a fixed-length fingerprint calculated from the contents of the file, so that any change to the file changes the hash). The hash value stored in the database can then be compared with that of the external file in order to verify the integrity of the external file. A DHT may also hold data on when the file(s) was added, the last time it was accessed and the total number of calls made to the file. A DHT is a mechanism used for indexing and distributing files across a P2P network.
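The integrity check described above can be sketched in a few lines. This is a single-process stand-in, not a working DHT: a real DHT spreads its entries across many peers, whereas the hypothetical `FileIndex` class below keeps everything in one dictionary. SHA-256 is my choice of hash for the illustration.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA-256 digest of a file's contents, as a hex string."""
    return hashlib.sha256(content).hexdigest()


class FileIndex:
    """Single-process stand-in for the index described above.

    A real DHT spreads its entries across many peers; this hypothetical
    class keeps them in one dictionary to show the integrity check only.
    """

    def __init__(self):
        self._entries = {}

    def add(self, content: bytes, location: str) -> str:
        """Record where a file is held, keyed by the hash of its contents."""
        key = digest(content)
        entry = self._entries.setdefault(key, {"locations": [], "calls": 0})
        if location not in entry["locations"]:
            entry["locations"].append(location)
        return key

    def locate(self, key: str) -> list:
        """Return the known locations and count the call."""
        entry = self._entries[key]
        entry["calls"] += 1
        return list(entry["locations"])

    def calls(self, key: str) -> int:
        return self._entries[key]["calls"]

    @staticmethod
    def verify(key: str, retrieved: bytes) -> bool:
        """True if the retrieved bytes still match the stored hash."""
        return digest(retrieved) == key


index = FileIndex()
key = index.add(b"soil survey 2016", "peer-a")
print(index.locate(key), FileIndex.verify(key, b"soil survey 2016"))
```

Note that the index never holds the file itself, only its fingerprint and where to find it.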


The Bitcoin blockchain solves trust issues for cryptocurrency, but burns a lot of fossil fuel in the process. Although the Bitcoin blockchain is referred to as a distributed database, it is more a duplicated ledger, with every node maintaining an identical copy of the entire database. Nodes compete to balance the ledger by guessing a hash value; a value that can’t be calculated easily and can only be discovered by brute force. Guessed correctly it balances the entire system and creates a block. That, in a nutshell, is the proof of work concept that makes the Bitcoin blockchain secure: a very energy hungry solution to an integrity issue with Homo sapiens.
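The brute force guessing can be seen in miniature in the sketch below. It is a toy, not Bitcoin’s actual scheme: it hashes a string once with SHA-256 and asks for a number of leading zero characters, where Bitcoin hashes a binary block header twice and compares the result against a numeric target. The function names and the difficulty of three are my own choices.

```python
import hashlib


def block_hash(block_data: str, nonce: int) -> str:
    """Hash of the block contents combined with a candidate nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty: int = 3):
    """Try nonces in turn until the hash starts with `difficulty` zeros.

    There is no shortcut: each candidate has to be hashed and checked.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = block_hash(block_data, nonce)
        if candidate.startswith(prefix):
            return nonce, candidate
        nonce += 1


def check(block_data: str, nonce: int, difficulty: int = 3) -> bool:
    """Verifying a claimed nonce takes a single hash."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)


nonce, found = mine("ledger entries for this block")
print(nonce, found)
```

Each extra zero demanded multiplies the expected number of guesses by sixteen, while checking a claimed answer still takes one hash. That asymmetry is what makes the work both expensive to do and cheap to verify.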

A Framework For Sustainability

In the previous post I summarised a recent technical report by the Open Data Institute (ODI) which raised the need for a “blockchain ecosystem to emerge that mirrored the common LAMP web stack” and was “compatible with the Web we have already.”

Reliable and secure though the software that underpins the LAMP stack is, it is now nearly 20 years old and has arguably reached its peak. It has similarly evolved to be better at generating data than dealing with it. It’s good at serving files, not dealing with the information in them, so whilst a data stack needs to evolve alongside the existing web structure it will likely be an evolution independent of it. One ‘promising’ data stack identified by the ODI team which met these criteria was “Ethereum as an application layer, BigchainDB as a database layer and the Interplanetary File System (IPFS) as a storage layer”.



Application Database Storage (ADS) Network

Unlike the LAMP stack, the data ecosystem is more likely to evolve as a weave of intertwined data streams that converge on nodes that use the data. Whereas with the LAMP stack exchanges between nodes occur at the server level, in an ADS network exchanges of data would occur in all layers: Application, Database and Storage.

The Application Layer

What makes databases powerful are the scripts, applications, programs and content management systems that use them. These scripts are similarly responsible for entering data, and with the rapid growth in smart appliances and the IoT this data input is increasingly becoming automated. How useful all that data ultimately turns out to be will depend as much on the applications that can use the data effectively as on the databases that store and organize it. Once data no longer has a processing value it would be archived, an action that would be performed by an application.


The Database and Storage Layers

Data with different economic, social and environmental relevance, much of it originating from the application layer, is indexed and organized through the database layer before finding its way into the storage layer. There is to a degree some blurring of the lines between these two layers with the database layer being dynamic whilst the storage layer is more for large files, legacy databases, redundant or archived data.

Blockchain As Metronomes In An ADS Network

The main function of a blockchain is to provide an immutable ledger that can be trusted. It’s a property an ADS network can exploit in order to synchronize databases. In particular, supply chain auditing on a blockchain would provide a trusted data source for multiple users in a network, blockchain being the ideal tool with which to build an authentication and tracking system that shadows produce as it moves from farm to fork (strengthening the food chain with a blockchain).

A Manifest Of Global Agricultural Produce

With an authentication and tracking system on the blockchain, the origin of produce and the route it took to market could be verified, providing invaluable data to producers, importers, retailers and consumers alike.

Once established, a consumer would have access to an audit trail where they would be able to authenticate origin, standards in production or the carbon footprint of food. Detailing the precise route that the produce took from the field to the shelf would give importers and retailers insight into double handling, stalling and wastage en route, whilst national and supranational bodies would have precise data on the production, origin and consumption of agricultural produce. If data be the cargo in an ADS network, the supply chain authentication and tracking system is the ship that carries that data.
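The audit trail idea rests on each record carrying the hash of the record before it, so that altering any earlier entry breaks every link after it. The sketch below shows that linking only. It has no network, no consensus and no signatures, so it is not a blockchain, and the batch, holder and location names are invented for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder 'previous hash' for the first event


def _hash(record: dict) -> str:
    """Hash a record via a stable JSON form (keys sorted)."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def append_event(trail: list, batch: str, holder: str, location: str) -> list:
    """Append a handling event linked to the hash of the previous one."""
    previous = trail[-1]["hash"] if trail else GENESIS
    body = {"batch": batch, "holder": holder, "location": location, "previous": previous}
    trail.append({**body, "hash": _hash(body)})
    return trail


def is_intact(trail: list) -> bool:
    """Recompute every hash and link; False if any entry was altered."""
    previous = GENESIS
    for event in trail:
        body = {k: v for k, v in event.items() if k != "hash"}
        if body["previous"] != previous or _hash(body) != event["hash"]:
            return False
        previous = event["hash"]
    return True


def route(trail: list) -> list:
    """The locations the batch passed through, in order."""
    return [event["location"] for event in trail]


trail = []
append_event(trail, "bananas-001", "grower", "farm")
append_event(trail, "bananas-001", "shipper", "port")
append_event(trail, "bananas-001", "retailer", "corner shop")
print(route(trail), is_intact(trail))
```

In a real system the trail would be held and checked by many parties rather than in one list, which is where the blockchain comes in.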


Sowing The Seeds For Integrated Crop Production And Management Systems

With an authentication and tracking system in place a farmer would be able to track in real time how much produce left the farm and reached the intended market. He would be able to see this relative to his neighbour, relative to the acreage of a given crop in a region and relative to all the routes that crop took to market. Without having to communicate directly, all farmers in a publicly accessible authentication and tracking system would be exchanging data that would help all of them plan and co-ordinate crop choices and market logistics.

It is a small step for that hub to widen, to encourage integrated crop production and management in farms across a region, and to improve logistics so as to tackle over and under production and transport wastage. One more step and farmers could begin to operate in their own regional network, not only to produce and supply food but to create co-operatives that allocate resources more amicably or develop integrated fertility programs. My experiment with IRCC Cameroon was an attempt to remotely put such a structure in place.

Supporting The Development Of A Peer To Peer Economy

As well as farmers, retailers and consumers could build co-operatives around a supply chain. Orders could be automatically coordinated through logistics operators to find the optimum route, and then tracked to the delivery address. On arrival the order could trigger payment or payments. It’s a future that relies on the establishment of an authentication and tracking system as well as the market places to promote and display the wares.

A good example of a blockchain authentication and tracking system is Deloitte’s ArtTracktive blockchain. Launched in May of this year to “prove the provenance and movements of artwork” the same technology, despite the huge difference in value of the goods, could be used to authenticate and track a hand of bananas from the Caribbean to the corner shop as easily as it can track a basket of fruit from Caravaggio to the Biblioteca Ambrosiana in Milan.

Widening the tracking remit are the London based startups Blockverify and Provenance. A blockchain initiative on the Ethereum platform, Provenance currently provides authentication and traceability of bespoke goods and is actively exploring retail supply chain tracking. Blockverify similarly claims to be able to provide blockchain authentication to the pharmaceutical, luxury goods, diamonds and electronics industries.

Cropster, a company that creates software solutions for the speciality coffee industry, similarly provides provenance to coffee producers so they can “instantly connect to a centralized market where thousands of roasters are actively looking.” It is provenance that could be enhanced further by an authentication and tracking system that follows the beans’ entire journey from plantation to cup.

Undermining The Dark Web

Openbazaar, a peer to peer market place, now integrated with IPFS, is a decentralized Amazon/Ebay that charges no fees and uses an escrow system with Bitcoin for payments. Although Openbazaar discourages illicit trade, being a P2P network makes policing that policy difficult. Escrow brings in a new layer of authentication, a layer that would be enhanced and strengthened by an authentication and tracking system.

A decentralised market place using Bitcoin and supply chain tracking on a blockchain would represent the first completely decentralized market place to be created on the web. Whilst not completely ending the Dark Web an authentication and tracking system would address many of the anonymity issues P2P networks and cryptocurrency create by authenticating sender, delivery and recipient. Potentially a mechanism that is better suited to assisting the development of wholesale markets than a P2P reinvention of Yahoo Auctions.

Blockchains and global data infrastructure


Applying blockchain technology in global data infrastructure report by the ODI


Disclaimer: This 800 word summary of the 8000 word ODI document Applying blockchain technology in global data infrastructure was created to provide an overview and some commentary on what this author believes to be the most significant points in the report. It is the personal view of the author shared here for those too lazy to read the whole document and come to their own informed opinion.


Most Promising Applications

The report identifies that Bitcoin and cryptocurrency applications dominate the blockchain space and advises against “being swept up by ‘blockchain hype’ and to remember to focus on solid user needs……Whilst there are promising applications, a great many of the ideas out there are ‘vapourware’, with no viable implementation or model. There are also many instances of old ideas that failed for good reasons and the addition of a blockchain will not change those reasons.”

Concentrating on “non-financial use cases” the authors identified four promising application areas for blockchain technology:

1. Document and intellectual property verification
2. Monitoring supply chains (prev post: strengthening the food chain)
3. Building a peer-to-peer economy
4. Governance

Principal Drawbacks

They also note that in “an append-only system” data, once added to a blockchain, can never be removed. This, as the authors highlight, has consequences for privacy and scale. However, the indelibility of the data, the fact that it is permanent and cannot be altered, is equally one of the main selling points of the concept, if not the main one.
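The append-only property can be illustrated with a minimal hash chain. This is a sketch for illustration only, not the design of Bitcoin, Ethereum or any system discussed in the report; the function names and the coffee records are hypothetical. Each block stores the hash of the block before it, so altering an earlier record invalidates everything that follows.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the hash of the previous block."""
    payload = json.dumps({"index": index, "data": data, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a new block to the end of the chain; existing blocks are left untouched."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check that each block points at its predecessor."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical supply chain records, used purely as sample data.
ledger = []
for record in ["beans harvested", "beans shipped", "beans roasted"]:
    append_block(ledger, record)

print(is_valid(ledger))            # True
ledger[1]["data"] = "beans lost"   # tamper with an earlier record
print(is_valid(ledger))            # False
```

In a real network the chain is also replicated across many nodes, which is what makes rewriting history impractical rather than merely detectable.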

The authors argue that there are “drivers for having a few blockchains maintained by a large number of nodes, and drivers for having many blockchains maintained by a small number of nodes. It is likely it will end up somewhere in the middle.”

However, based on the examples of Bitcoin and Ethereum, there is currently only one driver. Nodes are maintained by mining rigs specifically employed to generate a reward for creating blocks, a situation that has spawned voracious mining pools that gobble up huge amounts of energy in the process. If and when mining ceases, or the energy costs exceed the value of the reward, the nodes will stop maintaining the blockchain. So there must be other benefits to encourage the nodes to continue.
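Why mining consumes so much energy can be seen in a toy proof-of-work loop. This is a simplified sketch: it counts leading zeros in a SHA-256 hex digest, whereas Bitcoin compares a double SHA-256 hash against a numeric target, and the `mine` function and its inputs are hypothetical. The point is only that each extra zero multiplies the expected number of attempts by sixteen.

```python
import hashlib


def mine(data, difficulty):
    """Try nonces in order until the SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Print the difficulty, the number of hashes tried, and the start of the winning digest.
for difficulty in range(1, 5):
    nonce, digest = mine("example block", difficulty)
    print(difficulty, nonce + 1, digest[:12])
```

The work is deliberately wasteful: the only way to find a valid nonce is to keep guessing, so the cost of creating a block scales directly with the difficulty.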

More Paper Clips!

Similarly, a blockchain that stores a lot of data for multiple applications will also store a lot of data that is irrelevant to most nodes, data that will cause the blockchain to bloat in size and raise the difficulty. Nodes will just use up large amounts of computational power to maintain a blockchain that is, from their perspective, full of ‘irrelevant’ data. Both the mining reward and the one-chain-for-everything model are thus flawed: one encourages nodes composed of server banks fed by a pool of specialist mining rigs, while the other is populated with nodes hashing out blocks filled with data that they (and possibly no one else) have any use for. Both are behemoths churning out nonsense and relying on the energy output of a medium-sized country in order to do so.

Reward systems, like blockchains that carry too much data for too many applications, just encourage higher levels of hashing difficulty to be reached sooner rather than later. The immutability of blockchain data is in this respect both its purpose and its obstacle; it is forever doomed to trip over its own shoelaces. Thus a suite of smaller, specialized blockchains that rely on providing cost-efficient services to the nodes that maintain them can thrive as long as those services remain cost effective. If they are not, the nodes will simply discard the blockchain in favour of, and without impact on, ones that are.

Waiting on Superman

Much of this promise, however, relies on, as the authors point out, “a technology stack that has not yet fully emerged”. As the field evolves, the authors anticipate that a common technology stack, similar to the LAMP web stack, will emerge. The authors also posed the following questions with respect to data compatibility.

  1. How do we standardise storage in systems so that we get a single network of data, as opposed to having to use a different storage system every time we want a new type of information?
  2. What are the data protocols for distributed storage?
  3. How do we talk about, and perhaps enforce, ownership and licensing?

It’s Life Jim…


The authors conclude that distributed ledgers are “potentially important for enabling a shared data infrastructure” that could see “Blockchains used to build confidence in [private and] government services.” There was similarly “great potential for blockchains in collaborative maintenance of data for applications such as supply-chain information”. Smart contracts were also seen as having promise; however, the authors “uncovered many cases that were little more than attempts to bolt failed ideas onto the technology or reinvent things that work perfectly well” without blockchain technology.

Perhaps when it’s all said and done, there is only so much one can do with a ledger, distributed or not…

Mapping UK Habitats

A soil carbon and land use database for the United Kingdom.
[Bradley et al 2005]



The above paper describes the compilation of a database to estimate soil carbon stocks and carbon dioxide emissions from UK soils by interpolating the analysis of 11,000 soil horizons and site data with legacy soil maps at 1:250,000 scale in order to

“derive high-resolution spatial data on soils and land-use data for use by a dynamic simulation model of carbon fluxes from soils resulting from land-use changes”.


Since the creation of the database, the European Space Agency’s (ESA) Copernicus missions have measured, mapped and observed the Earth’s surface at 10, 20 and 60m resolutions. A vast library of images is now freely available, one that could be used to improve and update the habitat component of the soil carbon and land use database above. Furthermore, because the missions use an extended spectrum, the data provides imagery not only in the visible spectrum but also in the infrared and microwave, giving data on the thermal, hydrological and gaseous properties of the Earth’s surface. It’s a library that could significantly enhance and update the soil carbon and land use database, to the extent that it extends its use beyond CO2 simulation models.

The Crowd

In Africa the Africa Soil Information Service (AfSIS) has been taking advantage of this imagery to develop land use maps. It’s a first for Africa, which doesn’t have the underlying soil maps to build upon and covers over 30 million km2. Working at a resolution of 250m, AfSIS are using a simple yes/no question analysis to utilize the power of the crowd to map Africa’s land cover. The questions, as with Bradley, set out to identify the basic land uses in Africa: Cultivated, Grassland, Woodland and Built. These categories correspond to Bradley’s categories of Cultivated-Arable, Woodland, and Semi-natural.
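How yes/no answers from many volunteers might be combined into a single land cover label can be sketched as a simple majority vote. This is an assumption made for illustration, not a description of the method AfSIS actually uses; the `consensus` function, its thresholds and the sample answers are all hypothetical.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_agreement=0.7):
    """Return the majority answer for one tile, or None if there are
    too few votes or the volunteers do not agree strongly enough."""
    if len(answers) < min_votes:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= min_agreement:
        return answer
    return None


# Hypothetical answers to the question "Is this tile cultivated?"
print(consensus([True, True, True, False]))   # True (3 of 4 agree)
print(consensus([True, False, True, False]))  # None (volunteers are split)
print(consensus([False]))                     # None (too few votes)
```

Tiles that come back as None would simply be shown to more volunteers, which is one way the statistics can be used to qualify the integrity of crowd data.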

It’s worth noting that as the UK has only 240,000km2 (under 2 million images at a resolution of 250m), it would take two thousand volunteers less than a month to completely revise the land cover aspect of the soil carbon and land use database, improving both its accuracy and, by a factor of eight, its resolution. As the Copernicus missions are ongoing, so could this revision process be, so that the actual land cover is always accurately reflected in the database.

Farm/Field derived Data

Approximately 70% of the UK’s land is agricultural with half under cultivation.

In many cases these cultivated lands are utilized in precision farming operations where soil properties have been measured and mapped at a resolution of 1ha or less. With as much as a third of the UK’s soil properties mapped at field scale, and the legacy soil maps used by Bradley at a resolution of 1km, these field measurements represent a great opportunity to enhance the soils data in the soil carbon and land use database.

The British Geological Survey (BGS) have similarly produced a mySoil app which allows the general public to assess their soil and add data to the BGS database. Unfortunately there is no GNU/Linux version of this app and my request for more information has not yet received a reply, so I can only assume that this app could also help to improve the soil carbon and land use database.

Citizen Science

The advent of the internet has brought about the opportunity for mass data exercises that were previously too extensive for a small team of researchers to analyse. Studies therefore relied on supporting datasets, using statistical techniques designed to squeeze out the biggest truth from the least amount of effort. With the internet there is no need to use these support methods to the same extent, and instead the statistics can be applied to qualify the integrity of the data and illuminate its implications. (link: The BGS Citizen Science home page)


The Green Data Revolution

From Satellites to Termites…

Flying high above the Earth’s poles are the two satellites of the European Space Agency’s Sentinel-2 program. Part of the Copernicus initiative, they are mapping Africa at 10m, 20m and 60m resolutions so as to provide data to facilitate the creation of accurate maps for environmental management and to study the effects of climate change.

Root zone, Soil Moisture http://www.esa.int/

For more than 30 years the ESA’s Living Planet Programme (Explorer missions, Earth Watch and Copernicus Sentinel missions) has remotely sensed and created a data archive on the Earth’s  climates, biomes and the processes that operate within them.

Mapping Africa’s Habitats

These satellite archives are now being utilized by the Africa Soil Information Service (AfSIS) to describe Africa’s soil and landscape resources. It’s a project that will contribute to the development of sustainable agricultural systems that can feed the populace and co-exist with Africa’s wildlife.

Mapping 30 million km2 of the Earth’s terrestrial surface at a resolution of 250m is, though, a mammoth task that can only succeed with the effort of the crowd. So if you care about the elephants and lions, lend AfSIS your eyes, map a few km of Africa and help to make the World sustainable. Join the crowd and map Africa’s habitats.

Digitizing the Herbarium 

Sustainable agricultural strategies need to be comprehensive and include details of the plants and animals that inhabit the land. This is particularly so for agriculture, where poor crop choices and cultivation techniques can quickly and irreversibly rob the land of its soil and water. It is a situation that forces farmers to fell virgin forest and plough up natural grasslands in a repetition of the exercise. Many of the World’s deserts, including the largest, the Sahara, have been significantly expanded by this process of agricultural degradation.

Projects such as the International Plant Names Index, the World Flora Online and the Solanaceae Source are in the process of constructing comprehensive botanical databases for the 350,000 plant species that inhabit this planet.

Accessing Crop Data

With fewer than 100 plant species responsible for over 80% of the World’s raw material and food production, the digitization of, and access to, bibliographic and research data on crops is essential if we are to achieve any semblance of agricultural sustainability. To this end the ODI (Open Data Institute) and GODAN (Global Open Data for Agriculture and Nutrition) are both exploring what the challenges are and what the global priorities should be.

A Digital Zoo

The Natural History Museum are digitizing over 80 million specimens from their avian, entomological, and botanical collections. Many of these collections include additional physiological and habitat data, such as the Hostplants and Caterpillars and the Host-Parasite databases.

Animal Husbandry 

Just as we cultivate a fraction of the Earth’s plants, so we have domesticated fewer than 30 of its animals. The agriculturally significant have, though, been diversified into many and varied breeds, each with its own characteristics and habitat requirements. I know of no plans to digitize and make available details on the requirements, management, or environmental impact of different livestock breeds and husbandry methods, despite this information existing in abundance.

Crop Pests

CABI (Centre for Agriculture and Biosciences International) is the secretariat for GODAN and manages the Plantwise project. Through the Plantwise project CABI run clinics on pest and disease management and have similarly developed a searchable knowledge bank for pest identification.

Smart Data

Making use of this data and turning it into something practical and useful was the remit of the SmartOpenData project. However both the web site and the final report are littered with unexplained acronyms [wtf?] and composed in a language that could well have been uttered by a Lewis Carroll character: “Hereafter, using this central core, the pilots extended the SmartOpenData vocabulary to take into a account their own singularities.”

That said, a Singularity, as in a technological/digital singularity, is perhaps the correct term and conceptual framework to ‘think’ in when talking about smart data for Agriculture and the Environment.

Let the Data Make the Decisions?

The Singularity in this sense is not an AI (artificial intelligence) but a singular objective [sustainability], within a single integrated application environment. It is an application environment that extends beyond basic crop and animal husbandry advice into a fully integrated service that can calculate and allocate water resources, greenhouse gas emissions or the potential for erosion. These services can be further integrated with economic and socio-political objectives so that land and crop choices are efficiently made to meet market needs without over- or under-production. The same mechanism could be used to reward or compensate for mitigation strategies in the name of climate change and biodiversity objectives.

A digital singularity doesn’t have a persona, a corporate or a national identity. With no concept of self it seeks the optimum solution based on the data supplied. The more comprehensive and far-reaching that data, the better the decisions that can be made using the singularity. A singularity’s decisions are, however, only as good as the data supplied; inaccurate, incomplete or misleading data can, just as in real life, lead to a catastrophe.

The Need for Education

Supplied with comprehensive and accurate data, a digital singularity can encourage sustainable practices, but for them to be adopted correctly a farmer needs to understand the reasoning behind the action. If global targets for climate change and habitat preservation are to be met, then farmers need to be informed of the significance, understand the logic and gain tangible benefit from implementing a given strategy. These strategies similarly require feedback (data input) from the farmers and can only be achieved with the informed cooperation of those farmers, many if not most of whom are illiterate.


Next Part II