Analysing the blockchain is becoming increasingly relevant for detecting attacks and fraud on cryptocurrency exchanges and smart contract activations. However, this is a challenging task because the blockchain grows continuously. For example, in early 2017 Ethereum was estimated to contain approximately 300 GB of data [1], a number that keeps growing day after day. To analyse such an ever-growing amount of data, this paper argues that blockchain analysis should be treated as a novel type of application for Big Data platforms. We explore the application of parallelization techniques from the Big Data domain, in particular Map/Reduce, to extract and analyse information from the blockchain. We show that our approach speeds up index generation by a factor of 7.77 with a setup of 20 worker nodes, 1 Ethereum node and 1 database node. We also share our findings on the architecture and bottlenecks of our massively parallel setup for querying Ethereum. This should help researchers set up similar infrastructures for analysing the blockchain in the future.
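
As a rough illustration of the Map/Reduce idea mentioned in the abstract, the following minimal sketch builds an address-to-transaction index from a handful of blocks. It is not the authors' implementation: the block data is synthetic, the names `map_block`, `reduce_pairs`, `build_index` and `parallel_efficiency` are hypothetical, and a local thread pool stands in for the worker nodes. The efficiency helper applies the standard definition (speedup divided by worker count) to the reported figures, assuming the baseline is a single worker.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Synthetic stand-in for blocks fetched from an Ethereum node (hypothetical data).
BLOCKS = [
    {"number": 1, "transactions": [
        {"hash": "0xa1", "from": "alice", "to": "bob"},
        {"hash": "0xa2", "from": "bob", "to": "carol"},
    ]},
    {"number": 2, "transactions": [
        {"hash": "0xb1", "from": "alice", "to": "carol"},
    ]},
    {"number": 3, "transactions": []},
]


def map_block(block):
    """Map step: emit (address, tx_hash) pairs for a single block."""
    pairs = []
    for tx in block["transactions"]:
        pairs.append((tx["from"], tx["hash"]))
        pairs.append((tx["to"], tx["hash"]))
    return pairs


def reduce_pairs(mapped):
    """Reduce step: group transaction hashes by address."""
    index = defaultdict(list)
    for pairs in mapped:
        for address, tx_hash in pairs:
            index[address].append(tx_hash)
    return dict(index)


def build_index(blocks, workers=4):
    """Run the map step in parallel, then reduce the results into one index."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the index is deterministic.
        mapped = list(pool.map(map_block, blocks))
    return reduce_pairs(mapped)


def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear speedup achieved (assumes a one-worker baseline)."""
    return speedup / workers


INDEX = build_index(BLOCKS)
EFFICIENCY = parallel_efficiency(7.77, 20)
```

Under that single-worker assumption, a 7.77-fold speedup on 20 workers corresponds to a parallel efficiency of roughly 39%, which fits the abstract's remark that the setup has bottlenecks worth reporting.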

Original language: English
Title of host publication: Proceedings of the 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain
Publisher: Wiley / IEEE Press
Pages: 1-7
Number of pages: 7
ISBN (Print): 978-1-7281-2257-1
Publication status: Published - May 2019
Event: 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain - Montreal, Canada
Duration: 27 May 2019 to 27 May 2019

Workshop

Workshop: 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain
Abbreviated title: WETSEB '19
Country: Canada
City: Montreal
Period: 27/05/19 to 27/05/19

ID: 47451576