Difference between revisions of "SNF Blockchain"

From EU COST Fin-AI
[https://data.snf.ch/grants/grant/211195 Grant link]
the proposed work is that our methodology attempts to extend static anomaly detection methods via a hybrid approach, and also develop dynamic anomaly detection methods using extreme value theory. This research work will be able to contribute to improving the security relating to blockchain-based networks by providing more accurate and efficient methods for detecting anomalies and fraud and reducing the impact of losses resulting from these anomalies.
 
= Schedule =
Pre-start: Literature review of blockchain network graphs and anomaly detection in networks.

June 2022: Online kick-off meeting to discuss the aims and goals.

'''Activities:'''
----
;Jul-Sep 2022: Work on the project/disseminate the preliminary results.
;Sep 2022: Research Seminar on “Fraud detection on the Blockchain”, with invited speaker Prof. Akçora.
;Sep-Dec 2022: Further research on anomaly/fraud detection methods.
;Oct 2022: Review first academic draft paper.
;Jan-Feb 2023: Professor Osterrieder's academic visit to the American University of Sharjah. Activities include:
:*Collaborate, do research, and present a talk in the regular university seminar series on the joint work, emphasising anomaly and fraud detection in blockchain networks.
:*Discuss future strategic directions and priorities of the European COST Action 19130, Fintech and Artificial Intelligence in Finance (FinAI), for which Professor Osterrieder acts as Action Chair.
:*Further discuss joint applications for funding our research in Blockchain and strengthen our network with the Criminology and Blockchain sector.
:*Prepare the outline of our joint paper to be presented at “The Science of Blockchain Conference 2023”, Stanford University, Jan 24, 2023.
;Jan 2023: Disseminate draft results in the COST Fintech Action.
;Jan 2023: Present results at “The Science of Blockchain Conference 2023”, Stanford University.
;Feb-Jun 2023: Review of ML techniques in anomaly detection.
;Mar 2023: Research meeting (CH): Additional researchers invited via the EU COST FINAI network (1 STSM in Switzerland).
;Apr 2023: Submission of first paper.
;May 2023: Implement the methodology in the ZHAW: Big Data Analytics meeting.
;Jun 2023: Final workshop to discuss results and future research directions (co-organized/funded). Present results at the annual AI in Finance and Industry conference (supported by Innosuisse TFV conference series) at ZHAW. Submission of 2nd/3rd paper.

'''Work Plan:''' The proposed research is expected to be undertaken over a period of one year, which will allow enough time for the expected results to be achieved. The following outlines the proposed research plan, timeline, and tasks:
----
;Jun–Sep 2022: Mine blockchain-based network data. Obtain preliminary results.
;Oct 2022–Jan 2023: Build/test the dynamic anomaly detection method. First working paper available.
;Feb–Apr 2023: Optimise the dynamic method and finalise the software package/paper.
;May–Jun 2023: Disseminate the results in a seminar/the Innosuisse TFV conference series on AI in Finance. Paper published.

= Expected outcomes =
The proposed outcomes of the project are as follows:
*Publish three research papers in international journals on the topics: ML and AI techniques (ACM SIGKDD Conference on Knowledge Discovery & Data Mining); Statistical methodology (Computational Statistics & Data Analysis); Financial methodology (Review of Financial Studies).
*The preliminary research and goals will be presented in the UAE at the “Mathematics for Industry: Blockchain” meeting, Dec. 2022 (co-funded by the Co-I's affiliations). The meeting provides an opportunity to share research and developments in Blockchain.
*One conference titled “Data Science for Blockchain”, May 2023 in Switzerland (funded by Innosuisse TFV). The conference enables knowledge transfer and exchange between participating institutions.
*Final product and results will be presented at “The Science of Blockchain Conference 2023”, Stanford University, Jan 24, 2023.
*Training two Master's students, providing them with hands-on experience in a research project.
*New methodologies will be developed into R software packages/website dashboards.
*Final findings will be presented at a local seminar at all participants’ institutions, with the aim of inspiring new undergraduate/graduate research projects.
*The AUS Executive professional education course on “Decrypting Cryptocurrencies”, created by Prof. Chan, will disseminate the preliminary results and goals of this project to those in industry and academia in the UAE. Scheduled for Dec 2022 (16 hours).

=Partnership: added value for visiting fellow and host institute and potential for further collaboration=
This academic visit will make a positive contribution to both the host institute and the visiting fellow. Professor Osterrieder will provide the host institute with his expertise in the digitalization and industrialization of the finance industry and Fintech, from which faculty in interdisciplinary departments and graduate students will benefit through seminars, research collaboration, and projects. Professor Osterrieder also acts as Action Chair of the European COST Action 19130, Fintech and Artificial Intelligence in Finance (FinAI), and will discuss future strategic directions and priorities between both universities. In addition, the output of this academic visit will be pivotal in helping both universities to apply for and obtain larger external grants. These external grants will be in collaboration with existing academic and industrial partners and will involve new cross-disciplinary studies. This will be a vital part of supporting both universities' long-term goal of creating the first Blockchain research center connecting Switzerland and the MENA region. Professor Chan will provide the visiting fellow with his research expertise in blockchain, data science, and statistics, which will assist in their joint collaboration project and in achieving the stated deliverables (refer to the expected outcomes).

=References=
# Ahmed, M., Mahmood, A.N. and Islam, M.R., 2016. A survey of anomaly detection techniques in financial domain. ''Future Generation Computer Systems'', 55, pp. 278-288.
# Akoglu, L., Tong, H. and Koutra, D., 2015. Graph based anomaly detection and description: a survey. ''Data Mining and Knowledge Discovery'', 29, pp. 626-688.
# Baqer, K., Huang, D.Y., McCoy, D. and Weaver, N., 2016. Stressing Out: Bitcoin “Stress Testing”. In: Clark, J., Meiklejohn, S., Ryan, P., Wallach, D., Brenner, M. and Rohloff, K. (eds) ''Financial Cryptography and Data Security''. FC 2016. Lecture Notes in Computer Science, vol 9604. Springer, Berlin, Heidelberg.
# Boginski, V., Butenko, S. and Pardalos, P.M., 2005. Statistical analysis of financial networks. ''Computational Statistics and Data Analysis'', 48, pp. 431-443.
# Bunn, A.G., Urban, D.L. and Keitt, T.H., 2000. Landscape connectivity: A conservation application of graph theory. ''Journal of Environmental Management'', 59, pp. 265-278.
# Chandola, V., Banerjee, A. and Kumar, V., 2009. Anomaly detection: a survey. ''ACM Computing Surveys'', 41, 15, pp. 1-58.
# Chou, K.C., 1990. Applications of graph theory to enzyme kinetics and protein folding kinetics. Steady and non-steady-state systems. ''Biophysical Chemistry'', 35, pp. 1-24.
# Coinbase, 2021. What is a Blockchain? Available at: [https://www.coinbase.com/learn/crypto-basics/what-is-a-blockchain https://www.coinbase.com/learn/crypto-basics/what-is-a-blockchain].
# Dobrjanskyj, L. and Freudenstein, F., 1967. Some applications of graph theory to the structural analysis of mechanisms. ''Journal of Manufacturing Science and Engineering'', 89, pp. 153-158.
# Feder, A., Gandal, H., Hamrick, J.T., Moore, T., Mukherjee, A., Rouhi, F. and Vasek, M., 2018. The Economics of Cryptocurrency Pump and Dump Schemes. ''CEPR Discussion Papers'' 13404, C.E.P.R. Discussion Papers.
# Gai, P. and Kapadia, S., 2010. Contagion in financial networks. ''Proceedings of The Royal Society A'', 466, pp. 2401-2423.
# Gupta, M., 2018. ''Blockchain for Dummies'', 2nd IBM Limited Edition. John Wiley & Sons, New York: Hoboken.
# Han, J., Kamber, M. and Pei, J., 2012. ''Data Mining: Concepts and Techniques'', Third Edition. Elsevier.
# Hautsch, N., Schaumburg, J. and Schienle, M., 2015. Financial Network Systemic Risk Contributions. ''Review of Finance'', 19, pp. 685-738.
# IBM, 2021. What is Blockchain Technology? Available at: [https://www.ibm.com/topics/what-is-blockchain https://www.ibm.com/topics/what-is-blockchain].
# Kamps, J. and Kleinberg, B., 2018. To the moon: defining and detecting cryptocurrency pump-and-dumps. ''Crime Science'', 7, 18.
# Kim, J., Nakashima, M., Fan, W., Wuthier, S., Zhou, X., Kim, I. and Chang, S.-Y., 2021. Anomaly Detection based on Traffic Monitoring for Secure Blockchain Networking. 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), pp. 1-9.
# Li, Y., Islambekov, U., Akcora, C., Smirnova, E., Gel, Y.R. and Kantarcioglu, M., 2019. Dissecting Ethereum Blockchain Analytics: What We Learn from Topology and Geometry of Ethereum Graph. Available at: [https://arxiv.org/abs/1912.10105 https://arxiv.org/abs/1912.10105].
# Liang, W., Xiao, L., Zhang, K., Tang, M., He, D. and Li, K.-C., 2021. Data Fusion Approach for Collaborative Anomaly Intrusion Detection in Blockchain-based Systems. ''IEEE Internet of Things Journal'', doi: 10.1109/JIOT.2021.3053842.
# Maesa, D.F., Marino, A. and Ricci, L., 2016. Uncovering the Bitcoin Blockchain: An Analysis of the Full Users Graph. ''IEEE International Conference on Data Science and Advanced Analytics (DSAA)'', pp. 537-546.
# Maesa, D.F., Marino, A. and Ricci, L., 2018. Data-driven analysis of Bitcoin properties: exploiting the users graph. ''International Journal of Data Science and Analytics'', 6, pp. 63-80.
# Mansourifar, H., Chen, L. and Shi, W., 2020. Hybrid Cryptocurrency Pump and Dump Detection. Available at: [https://arxiv.org/abs/2003.06551 https://arxiv.org/abs/2003.06551].
# Monamo, P., Marivate, V. and Twala, B., 2016. Unsupervised learning for robust Bitcoin fraud detection. 2016 Information Security for South Africa (ISSA), pp. 129-134.
# Morishima, S. and Matsutani, H., 2018. Acceleration of Anomaly Detection in Blockchain Using In-GPU Cache. ''IEEE International Conference on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)'', pp. 244-251.
# New York Post, 2021. Protests in El Salvador after bitcoin made official currency. Available at: [https://nypost.com/2021/09/16/protests-in-el-salvador-after-bitcoin-made-official-currency/ https://nypost.com/2021/09/16/protests-in-el-salvador-after-bitcoin-made-official-currency/].
# Nier, E., Yang, J., Yorulmazer, T. and Alentorn, A., 2007. Network models and financial stability. ''Journal of Economic Dynamics & Control'', 31, pp. 2033-2060.
# Ober, M., Katzenbeisser, S. and Hamacher, K., 2013. Structure and Anonymity of the Bitcoin Transaction Graph. ''Future Internet'', 5, pp. 237-250.
# Pham, T.T. and Lee, S., 2016. Anomaly Detection in the Bitcoin System - A Network Perspective. Available at: [https://arxiv.org/abs/1611.03942 https://arxiv.org/abs/1611.03942].
# Pham, T.T. and Lee, S., 2017. Anomaly Detection in Bitcoin Network Using Unsupervised Learning Methods. Available at: [https://arxiv.org/abs/1611.03941 https://arxiv.org/abs/1611.03941].
# Ron, D. and Shamir, A., 2013. Quantitative Analysis of the Full Bitcoin Transaction Graph. In: Sadeghi, A.R. (eds) ''Financial Cryptography and Data Security''. FC 2013. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg.
# Sayadi, S., Rejeb, S.B. and Choukar, Z., 2019. Anomaly Detection Model Over Blockchain Electronic Transactions. 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), pp. 895-900.
# Shayegan, M.J. and Sabor, H.R., 2021. A Collective Anomaly Detection Method Over Bitcoin Network. Available at: [https://arxiv.org/abs/2107.00925 https://arxiv.org/abs/2107.00925].
# Signorini, M., Pontecorvi, M., Kanoun, A. and Di Pietro, R., 2020. BAD: Blockchain Anomaly Detection. Available at: [https://arxiv.org/abs/1807.03833 https://arxiv.org/abs/1807.03833].
# Yahoo, 2021. Bitcoin, ethereum rise as Venezuela launches digital currency. Available at: [https://finance.yahoo.com/news/bitcoin-ethereum-rise-venezuela-launches-digital-currency-081104851.html https://finance.yahoo.com/news/bitcoin-ethereum-rise-venezuela-launches-digital-currency-081104851.html].
# Zhang, R., Zhang, G., Liu, L., Wang, C. and Wan, S., 2020. Anomaly detection in bitcoin information networks with multi-constrained meta path. ''Journal of Systems Architecture'', 110, 101829.

Latest revision as of 15:26, 30 October 2023

Abstract

Blockchain networks are increasingly being implemented into healthcare, supply chain, and retail systems through smart contracts, smart devices, and smart identity management. Although this technology brings benefits, it can also cause problems. A particular problem derives from the immutability property, which means that fraudulent transactions or transfers of information cannot be reversed.

Rationale: Blockchains can be attacked via a deluge of requests or transactions within a short time span, resulting in the loss of connectivity to the blockchain for users, businesses, or even financial institutions. Therefore, the rapid detection of anomalies from such activities is critical in order to prevent damage from occurring, or to correct any damage as soon as possible and reduce the severity of its impact.

Overall objectives: This project will study the problem of anomaly and fraud detection from the perspective of blockchain-based networks. Anomaly and fraud detection in blockchain-based networks is more complex due to their unique properties such as decentralisation, global reach, anonymity, etc., which make them different from traditional networks.

Specific aims: To further the understanding of the sources and behaviours of anomalies and fraud in blockchain-based networks, and to develop new improved methods for both static and dynamic anomaly detection that can be used alongside blockchain-based systems for real-time fraud detection.

Methods: Developing and implementing static anomaly detection methods via a hybrid approach and developing dynamic anomaly detection methods using extreme value theory.

Expected results: This research work will contribute to improving the security of blockchain-based networks by providing more accurate and efficient methods for detecting anomalies and fraud and by reducing the impact of losses resulting from these anomalies.

Impact for the field: The project will be particularly beneficial alongside real-world blockchain-based networks, allowing for the fast detection of anomalous or fraudulent data, preventing damage or allowing damage to be corrected as soon as possible. For cryptocurrency networks, this will reduce the impact of market manipulation and fraud and, more widely, their effects on global financial markets, currencies, and trade. In addition, the project will be of interest to a broad range of cryptocurrency and blockchain stakeholders including (but not limited to) academics, financial institutions, policymakers, regulators, and cybercrime agencies.

Grant link

Aims and Relevance

This project aims to study the problem of anomaly and fraud detection from the perspective of blockchain-based networks. The major developments of blockchain technology and cryptocurrencies have brought benefits such as increased efficiency and transparency, but the immutability property means that fraudulent transactions or transfers of information cannot be reversed. Rapid detection of anomalies from such activities is critical in order to prevent damage from occurring, or to correct any damage as soon as possible and reduce the severity of its impact. Anomaly and fraud detection in blockchain-based networks is more complex due to their unique properties such as decentralization, global reach, anonymity, etc., which make them different from traditional networks.

The proposed research work comprises three main parts:

  1. Studying the evolution of blockchain-based networks over time.
  2. Investigating static anomaly detection methods for blockchain-based networks.
  3. Developing dynamic anomaly detection methods for blockchain-based networks.

This research aims to contribute to a better understanding of the sources and behaviors of anomalies and fraud in blockchain-based networks, as well as the development of new improved methods for anomaly detection, especially in reducing the false positive rate. Additionally, it will help to develop new methods that can be used alongside blockchain-based systems to detect anomalies and fraud in real time as new data is generated.

Methods

The proposed research work focuses on the problem of anomaly and fraud detection in blockchain-based and cryptocurrency networks. Due to the rising popularity of these systems in the financial sector and their potential benefits, it has become increasingly important to detect anomalies and outliers, which may arise from true errors or, more likely, from monetary or information fraud. Therefore, our goal is to extend and improve upon the accuracy of existing static anomaly detection methods in the literature on blockchain-based network graphs by combining methods from statistics and data mining. A further goal is to develop a new method for dynamic anomaly detection based on data streams and statistical extreme value theory. This methodology will be particularly beneficial alongside real-world blockchain-based networks, allowing for the fast detection of anomalous or fraudulent data, preventing damage or allowing damage to be corrected as soon as possible. For cryptocurrency networks, this will reduce the impact of market manipulation and fraud and, more widely, their effects on global financial markets, currencies, and trade. For blockchain-based networks in general, this will assist in reducing the impact of information loss. The proposed research design can be split into three main targets, as outlined below and illustrated in Figure 1.

Figure 1: Summary of the methodology


Analysis of the Evolution of Blockchain-Based Network Graphs and Their Properties

The initial goal involves studying and analyzing the key properties of blockchain-based network graphs and how they have evolved over time. The key difference between blockchain-based networks and other systems that can be represented as a network graph is that blockchain technology is relatively young, having existed for just over 10 years, and is still developing. Therefore, it is likely that the structures of blockchain-based networks have changed since they were first implemented and have continued to evolve. This analysis needs to be completed before we start extending existing methods and developing new methods for anomaly detection. The main reason is that many assumptions regarding anomalies in other types of networks may not be directly applicable. For example, in credit card transaction networks, anomalies may be classed as transactions where the value of the transaction is significantly higher, the number of transactions is significantly higher, or transactions occur in locations that are far away from the majority. However, in blockchain-based networks, the concepts of normal and anomalous data are not as clear-cut or well understood.

To address this problem, we propose to perform a comprehensive analysis of the network graphs of large blockchain-based networks. These will include the network graphs of large cryptocurrencies such as Bitcoin and Ethereum, in addition to other blockchains for which network data can be obtained. A starting point is to investigate the fundamental result that the network graphs of many real-world systems follow the power-law model. This states that in a network graph, the probability that a node has degree (number of edges) $k$ is given by the relationship $P(k) \propto k^{-\gamma}$, or equivalently $\log P \propto -\gamma \log k$, which forms a straight line on a logarithmic scale (Boginski et al., 2005). This indicates that a large number of nodes have a very small degree, while a small number of nodes have a very large degree. For example, in a blockchain transaction graph, this would suggest that a large number of accounts make very few transactions, while a small number of accounts make a large number of transactions. This would provide a general idea of whether the structure and behavior of the blockchain show any similarities to traditional networks. In addition, other common network graph statistics such as the clustering coefficient, cliques, and independent sets will also be computed.

Due to the lack of labeled data relating to anomalies and fraud, there do not appear to be any benchmarks for distinguishing between normal and anomalous data in blockchain-based networks. Therefore, we propose to split our network data into monthly and yearly subsamples and construct a large number of different network graphs from our datasets, covering transaction graphs, user graphs, and graphs based on other network variables. By analyzing the distribution of the network graph statistics for these graphs, we will be able to see how the distributions and their parameters have changed over time. This can then provide a benchmark time series of parameters and statistics that can be used as possible baselines and inputs in the anomaly detection methods.
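
As a rough illustration of the power-law check described above, the following Python sketch estimates the exponent (called gamma in the code) from the degrees of an undirected edge list by fitting a straight line to log P(k) against log k. The function names and the toy input are hypothetical and not part of the project; a least-squares fit on the log-log scale is only a simple diagnostic, and other estimators may be preferable on real data.

```python
import math
from collections import Counter


def degree_sequence(edges):
    """Return the degree of every node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return list(degree.values())


def fit_power_law_exponent(degrees):
    """Estimate gamma in P(k) ~ k^(-gamma) via a least-squares line on the log-log scale."""
    counts = Counter(degrees)
    n = len(degrees)
    xs = [math.log(k) for k in counts]                # log k
    ys = [math.log(c / n) for c in counts.values()]   # log of the empirical P(k)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct degree values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope of the line is -gamma


# Toy degree sequence whose counts follow k^(-2) exactly (assumed data).
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8]
gamma = fit_power_law_exponent(toy_degrees)
```

Applying the same fit to each monthly or yearly subgraph would give one possible time series of exponents to use as a baseline.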

Analysis of static anomaly detection methods

After obtaining a comprehensive overview of the structures of blockchain-based networks and cryptocurrency networks and how they have evolved over time, the second phase will focus on improving existing methods for anomaly detection in blockchain-based networks. We classify anomalies and outliers into three different groups as follows (Chandola et al., 2009): a) point anomalies: these are the simplest types of anomalies. Single data points are classed as anomalies if they are located far enough away from the centroid of the data set; b) collective anomalies: these are sets of point anomalies that are linked to each other; c) contextual anomalies: these anomalies are conditional and usually occur in time series data.
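
The point-anomaly definition above can be sketched in a few lines of Python. This is only an illustration under assumed choices: the cutoff (mean distance plus a multiple of the standard deviation of distances), the parameter name `num_std`, and the toy points are hypothetical and are not specified in the project description.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length numeric tuples."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def point_anomalies(points, num_std=2.0):
    """Return indices of points whose distance to the centroid exceeds the cutoff."""
    c = centroid(points)
    dists = [math.dist(p, c) for p in points]
    mean_d = sum(dists) / len(dists)
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in dists) / len(dists))
    cutoff = mean_d + num_std * std_d  # assumed rule for "far enough away"
    return [i for i, d in enumerate(dists) if d > cutoff]


# Nine points near the origin and one distant point (assumed toy data).
toy_points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
              (1, 0.5), (0.5, 1), (0, 0.5), (0.5, 0), (50, 50)]
flagged = point_anomalies(toy_points)
```

In this toy data only the last point lies far from the centroid, so it is the only index returned.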

Point anomalies and collective anomalies can be considered as part of the static anomaly and fraud detection problem. In theory, these types of anomalies will generally be more pronounced and easier to detect. Existing data on anomalies that have previously occurred in blockchain-based networks is limited. We propose to build our real data sample from a combination of publicly available data from two main sources: a) cryptocurrency networks, using data from previously reported anomalous events such as hacks, including date and time, type of anomaly, total loss, etc.; b) other blockchain-based networks, using data from previously reported anomalous events such as attacks on user wallets, smart contracts, double spending, distributed denial of service (DDoS) attacks, etc. Although this data will likely be very general, it can still provide an indication of approximate time periods that can be focused on for detecting anomalies.

The main part of our method will be based on a hybrid approach, where individual anomaly detection methods are combined and used in parallel, or consecutively. The motivation for this approach is provided in the current literature, which has found that anomalies detected in blockchain-based networks using existing methods do not show a significant overlap (Mansourifar et al., 2020). To overcome this, network graphs of the cryptocurrency and blockchain-based networks will be constructed from our real data using the undirected graph model defined in Section 1. These will correspond to the network graphs of the time periods when anomalous events actually occurred.

An existing method such as k-means clustering will be used to search for clusters in the network graphs, splitting the data into groups based on their similarity in order to determine which data points may be anomalies. In the case of k-means clustering, the goal is to partition the sample data corresponding to a particular network graph into $k$ distinct and non-overlapping clusters. In the simplest case, the number of clusters $k$ can be set manually to be equal to the number of types of anomalies we expect to see, or set to a value of two based on the premise of data being normal or anomalous. More formally, letting $C_1, C_2, \dots, C_k$ denote the sets containing the data points in each cluster, we aim to minimize the within-cluster variation across the $k$ clusters as follows:


Math formula.JPG

where $|C_k|$ denotes the number of observations within the $k$th cluster, and the within-cluster variation is denoted by the term in brackets (pairwise squared Euclidean distances between observations within the $k$th cluster). Depending on the graph type, these data points may correspond to transactions, accounts, etc., and may be clustered according to the frequency of transactions, values of transactions, or other attributes.
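The clustering step can be sketched in a few lines of Python. The formula itself is only available as an image in this revision, so the sketch assumes it is the standard k-means objective: minimise, over $C_1, \dots, C_k$, the sum over clusters of $\frac{1}{|C_k|}\sum_{i,i' \in C_k} \lVert x_i - x_{i'} \rVert^2$. The feature vectors, the choice of Lloyd's algorithm, and the rule of flagging the smallest cluster are illustrative assumptions, not the project's implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def within_cluster_variation(points, labels, k):
    """Sum over clusters of (1/|C|) times the pairwise squared distances."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            pairwise = sum(squared_distance(a, b) for a in members for b in members)
            total += pairwise / len(members)
    return total


def smallest_cluster(labels, k):
    """Index of the cluster with the fewest members (candidate anomalies)."""
    return min(range(k), key=lambda c: labels.count(c))
```

With $k = 2$, the points assigned to the cluster returned by `smallest_cluster` are candidate anomalies under the normal-versus-anomalous premise. They would still need to be compared against the reported events and the Part 1 benchmarks.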

Results will be obtained for a large number of graphs for each blockchain-based network. Point and collective anomalies can then be identified and compared against the true anomalies in the real data, and against the benchmark values computed from the analysis in Part 1. In addition, anomalous trends and patterns in network graph statistics may be revealed that are also indicative of an anomalous event and possible fraud. However, to improve the accuracy of the detection of anomalies, we propose to combine these methods with the use of extreme value theory (EVT). This is because anomalies resulting from fraud usually correspond with extreme data – for example, in cryptocurrency networks, a small number of transactions with large values, or a large number of transactions with small values.

One possibility is for the most extreme values in network statistics to be modelled using the generalised extreme value (GEV) distribution from extreme value theory. Suppose anomalies have been detected in a network graph representing the number of transactions between user accounts. Considering all data points exceeding some threshold as extreme values, we can model the distribution of these values, x, by the GEV distribution. This can provide a probabilistic interpretation of how likely the data, in this case very large or small numbers of transactions, are to occur. The anomalies detected by single methods such as k-means clustering can then be analysed using the fitted model to provide a further confirmation of how likely the true anomalies were to occur. Simulations of these extreme data can also be generated using this model, which can be used to analyse possible anomalies during periods when there were no confirmed anomalies.
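To make the probabilistic interpretation concrete, the following Python sketch evaluates the GEV distribution function and the probability of exceeding a given level. It assumes the extreme values are summarised as block maxima (for example, the largest daily transaction count), which is the setting in which the GEV distribution classically arises. For simplicity the fit is restricted to the Gumbel special case (shape parameter equal to zero) using the method of moments. Both choices, and all names, are assumptions for illustration; a full three-parameter fit would need a numerical optimiser.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma, shape xi."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))  # Gumbel limit (xi = 0)
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower end point when xi > 0,
        # above the upper end point when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def fit_gumbel_moments(block_maxima):
    """Method-of-moments estimates (mu, sigma) for the Gumbel case."""
    sigma = statistics.stdev(block_maxima) * math.sqrt(6) / math.pi
    mu = statistics.mean(block_maxima) - EULER_GAMMA * sigma
    return mu, sigma


def exceedance_probability(x, mu, sigma, xi=0.0):
    """Probability that a block maximum exceeds x under the fitted model."""
    return 1.0 - gev_cdf(x, mu, sigma, xi)
```

A value flagged by k-means clustering could then be passed to `exceedance_probability`; a very small result indicates that the observation is unlikely under the fitted model.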

Development of dynamic anomaly detection methods

The third phase of the proposed work will focus on developing dynamic methods for the detection of contextual anomalies in the network graphs of blockchain-based networks. This problem is more complex as these anomalies are dependent on the context or the conditions at the time when they occur. In addition, many static anomaly detection methods are not suitable as they require scanning of the network data multiple times. To solve this problem, we propose to develop our method by treating the data from blockchain-based networks as a data stream. As new data is continuously generated, the structures of the corresponding network graphs will change over time and so will the network graph statistics.

We suppose that data from blockchain-based networks can be represented as a streaming time series $X_t$, $t > 0$, of independent and identically distributed (i.i.d.) observations. To determine a level of “normality” with respect to network graph statistics, we use the results from the analysis in Part 1 as benchmark values. The focus will be on determining a threshold level $\alpha$ such that network graph statistics exceeding it are classed as possible anomalies and can then be analysed further. Instead of analysing the network graphs of the whole sample period of our data, we account for the time component by analysing network graphs and statistics corresponding to rolling windows of pre-defined lengths such as one hour, one day, or one week.
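A minimal Python sketch of the rolling-window idea follows. It assumes one observation of a network statistic per time step, so that the window length is a number of observations rather than a clock interval, and it sets the threshold to an upper empirical quantile of the current window. The function names and the default quantile are hypothetical.

```python
import math
from collections import deque


def empirical_quantile(values, q):
    """Smallest observed value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def flag_exceedances(stream, window_size, q=0.975):
    """Yield-style scan: compare each new value with the threshold of the preceding window."""
    window = deque(maxlen=window_size)  # old observations drop out automatically
    flagged = []
    for t, x in enumerate(stream):
        if len(window) == window_size:
            alpha = empirical_quantile(window, q)
            if x > alpha:
                flagged.append((t, x, alpha))
        window.append(x)
    return flagged
```

Each observation is read once and only the current window is kept in memory, which matches the single-pass requirement of a data stream.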

Inspired by extreme value theory, we can use the peaks over threshold method to determine which data points in a rolling window are defined as extreme. The simplest way is to define an initial threshold value so that within a rolling window a small percentage (e.g. 2.5%) of data points are above or below $\alpha$. These extreme data points, $x$, can then be modelled by the generalised Pareto distribution (GPD), which is closely related to the GEV distribution used in Part 2. This model can again provide a probabilistic interpretation of how likely these anomalous data points are to occur. To check whether the data points are true anomalies that should be investigated, we require additional conditions. One way is to define anomalies as data points which have a very small probability of occurring according to the fitted model, in addition to exceeding a percentage relating to the benchmark values obtained in Part 1 of the analysis. Another possibility is to also require data points to exceed a percentage relating to average values of the network statistics in the current rolling window, or a number of the most recent rolling windows. Selecting the most appropriate conditions will require testing on both real and simulated anomaly data to find the optimal detection conditions.
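The peaks over threshold step and the combined conditions can be sketched as follows in Python. The GPD is fitted to the excesses over the threshold by the method of moments, which is one simple choice among several and is only valid when the shape parameter is below 0.5. The cut-offs `p_max` and `ratio_min`, and all function names, are hypothetical placeholders for the conditions that the proposal says will be selected by testing.

```python
import math
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (xi, sigma) from excesses x - threshold > 0."""
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (ratio + 1.0)
    return xi, sigma


def gpd_survival(y, xi, sigma):
    """P(excess > y) under a GPD with shape xi and scale sigma."""
    if y <= 0:
        return 1.0
    if abs(xi) < 1e-12:
        return math.exp(-y / sigma)  # exponential limit (xi = 0)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:
        return 0.0  # beyond the upper end point when xi < 0
    return t ** (-1.0 / xi)


def tail_probability(x, threshold, exceed_fraction, xi, sigma):
    """P(X > x) = P(X > threshold) * P(excess > x - threshold)."""
    return exceed_fraction * gpd_survival(x - threshold, xi, sigma)


def is_anomaly(x, threshold, exceed_fraction, xi, sigma, benchmark,
               p_max=1e-3, ratio_min=2.0):
    """Flag x only if it is both improbable under the fit and far above the benchmark."""
    rare = tail_probability(x, threshold, exceed_fraction, xi, sigma) < p_max
    far = x > ratio_min * benchmark
    return rare and far
```

Here `exceed_fraction` would be the share of window observations above the threshold (0.025 in the example from the text) and `benchmark` a Part 1 baseline value or a recent rolling-window average.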

In summary, due to the rising popularity and use of blockchain-based and cryptocurrency networks, the risks from these networks are also growing due to anomalous data and events. Much of the current literature focuses only on static anomaly detection in blockchain-based networks. The uniqueness and innovation of the proposed work is that our methodology attempts to extend static anomaly detection methods via a hybrid approach, and also develop dynamic anomaly detection methods using extreme value theory. This research work will be able to contribute to improving the security relating to blockchain-based networks by providing more accurate and efficient methods for detecting anomalies and fraud and reducing the impact of losses resulting from these anomalies.

Schedule

Pre-start: Literature review of blockchain network graphs/anomaly detection in networks.
June 2022: Kick-off meeting online: discuss the aims/goals.

Activities:


Jul-Sep 2022: Work on the project/disseminate the preliminary results.
Sep 2022: Research seminar on “Fraud detection on the Blockchain”, invited speaker Prof. Akçora.
Sep-Dec 2022: Further research on anomaly/fraud detection methods.
Oct 2022: Review first academic draft paper.
Jan-Feb 2023: Professor Osterrieder academic visit to American University of Sharjah. Activities include:
  • Collaborate, do research and present a talk in our regular university seminar series on our joint work emphasising anomaly and fraud detection in blockchain networks.
  • Discuss future strategic directions and priorities of the European COST Action 19130, Fintech and Artificial Intelligence in Finance (FinAI) (Professor Osterrieder acts as Action Chair).
  • Further discuss joint applications for funding our research in Blockchain and strengthen our network with the Criminology and Blockchain sector.
  • Prepare the outline of our joint paper to be presented at “The Science of Blockchain Conference 2023”, Stanford University, Jan 24, 2023.
Jan 2023: Disseminate draft results in the COST Fintech Action.
Jan 2023: Present results at “The Science of Blockchain Conference 2023”, Stanford University.
Feb-Jun 2023: Review of ML techniques in anomaly detection.
Mar 2023: Research meeting (CH): Additional researchers invited via the EU COST FINAI network (1 STSM in Switzerland).
Apr 2023: Submission of first paper.
May 2023: Implement the methodology in the ZHAW: Big Data Analytics meeting.
Jun 2023: Final workshop to discuss results/future research directions (co-organized/funded). Present results at the annual AI in Finance and Industry conference (supported by Innosuisse TFV conference series) at ZHAW. Submission of 2nd/3rd paper.


Work Plan

The proposed research is expected to be undertaken over a period of one year, which will allow enough time for the expected results to be achieved. The following details outline the proposed research plan, timeline, and tasks:


Jun–Sep 2022: Mine blockchain-based network data. Obtain preliminary results.
Oct–Jan 2022/23: Build/test the dynamic anomaly detection method. First working paper available.
Feb–Apr 2023: Optimise the dynamic method and finalise the software package/paper.
May–Jun 2023: Disseminate the results in a seminar/the Innosuisse TFV conference series on AI in Finance. Paper published.

Expected outcomes

The proposed outcomes of the project are as follows:

  • Publish three research papers in international venues on the following topics: ML and AI techniques (ACM SIGKDD Conference on Knowledge Discovery & Data Mining); statistical methodology (Computational Statistics & Data Analysis); financial methodology (Review of Financial Studies).
  • The preliminary research and goals will be presented in the UAE at the “Mathematics for Industry: Blockchain” meeting, Dec. 2022 (co-funded by Co-I's affiliations). The meeting provides an opportunity to share research and developments in Blockchain.
  • One conference titled “Data Science for Blockchain”, May 2023 in Switzerland (funded by Innosuisse TFV). The conference will enable knowledge transfer and exchange between participating institutions.
  • Final product and results will be presented at “The Science of Blockchain Conference 2023”, Stanford University, Jan 24, 2023.
  • Training two Master's students, providing them with hands-on experience in a research project.
  • New methodologies will be developed into R software packages/website dashboards.
  • Final findings will be presented at a local seminar at each participant's institution, with the aim of inspiring new undergraduate/graduate research projects.
  • AUS Executive professional education course on “Decrypting Cryptocurrencies” that Prof. Chan created will disseminate the preliminary results and goals of this project to those in industry/academia in the UAE. Scheduled Dec 2022 (16 hours).


Partnership: added value for visiting fellow and host institute and potential for further collaboration

This academic visit will make a positive contribution to both the host institute and the visiting fellow. Professor Osterrieder will provide the host institute with his expertise in the area of digitalization and industrialization of the finance industry and Fintech, from which faculty members in interdisciplinary departments and graduate students will benefit through seminars, research collaboration and projects. Professor Osterrieder also acts as Action Chair of the European COST Action 19130, Fintech and Artificial Intelligence in Finance (FinAI), and will discuss future strategic directions and priorities between both universities. In addition, the output of this academic visit will be pivotal in helping both universities to apply for and secure larger external grants. These external grants will be in collaboration with existing academic and industrial partners and will involve new cross-disciplinary studies. This will be a vital part of supporting both universities' long-term goal of creating the first Blockchain research center connecting Switzerland and the MENA region. Professor Chan will provide the visiting fellow with his research expertise in blockchain, data science and statistics, which will assist in their joint collaboration project and in achieving the stated deliverables (refer to the expected outcomes).

References

  1. Ahmed, M., Mahmood, A.N. and Islam, M.R., 2016. A survey of anomaly detection techniques in financial domain. Future Generation Computer Systems, 55, pp. 278-288.
  2. Akoglu, L., Tong, H. and Koutra, D., 2015. Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery, 29, pp. 626-688.
  3. Baqer, K., Huang, D.Y., McCoy, D. and Weaver, N., 2016. Stressing Out: Bitcoin “Stress Testing”. In: Clark, J., Meiklejohn, S., Ryan, P., Wallach, D., Brenner, M. and Rohloff, K. (eds) Financial Cryptography and Data Security. FC 2016. Lecture Notes in Computer Science, vol 9604. Springer, Berlin, Heidelberg.
  4. Boginski, V., Butenko, S. and Pardalos, P.M., 2005. Statistical analysis of financial networks. Computational Statistics and Data Analysis, 48, pp. 431-443.
  5. Bunn, A.G., Urban, D.L. and Keitt, T.H., 2000. Landscape connectivity: A conservation application of graph theory. Journal of Environmental Management, 59, pp. 265-278.
  6. Chandola, V., Banerjee, A. and Kumar, V., 2009. Anomaly detection: a survey. ACM Computing Surveys, 41, 15, pp. 1-58.
  7. Chou, K.C., 1990. Applications of graph theory to enzyme kinetics and protein folding kinetics: steady and non-steady-state systems. Biophysical Chemistry, 35, pp. 1-24.
  8. Coinbase, 2021. What is a Blockchain? Available at: https://www.coinbase.com/learn/crypto-basics/what-is-a-blockchain.
  9. Dobrjanskyj, L. and Freudenstein, F., 1967. Some applications of graph theory to the structural analysis of mechanisms. Journal of Manufacturing Science and Engineering, 89, pp. 153-158.
  10. Feder, A., Gandal, H., Hamrick, J.T., Moore, T., Mukherjee, A., Rouhi, F. and Vasek, M., 2018. The Economics of Cryptocurrency Pump and Dump Schemes. CEPR Discussion Papers 13404, C.E.P.R. Discussion Papers.
  11. Gai, P. and Kapadia, S., 2010. Contagion in financial networks. Proceedings of The Royal Society A, 466, pp. 2401-2423.
  12. Gupta, M., 2018. Blockchain for Dummies, 2nd IBM Limited Edition. John Wiley & Sons, New York: Hoboken.
  13. Han, J., Kamber, M., and Pei, J., 2012. Data Mining: Concepts and Techniques, Third Edition. Elsevier.
  14. Hautsch, N., Schaumburg, J. and Schienle, M., 2015. Financial Network Systemic Risk Contributions. Review of Finance, 19, pp. 685-738.
  15. IBM, 2021. What is Blockchain Technology? Available at: https://www.ibm.com/topics/what-is-blockchain.
  16. Kamps, J. and Kleinberg, B., 2018. To the moon: defining and detecting cryptocurrency pump-and-dumps. Crime Science, 7, 18.
  17. Kim, J., Nakashima, M., Fan, W., Wuthier, S., Zhou, X., Kim, I. and Chang, S.-Y., 2021. Anomaly Detection based on Traffic Monitoring for Secure Blockchain Networking. 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), pp. 1-9.
  18. Li, Y., Islambekov, U., Akcora, C., Smirnova, E., Gel, Y.R. and Kantarcioglu, M., 2019. Dissecting Ethereum Blockchain Analytics: What We Learn from Topology and Geometry of Ethereum Graph. Available at: https://arxiv.org/abs/1912.10105.
  19. Liang, W., Xiao, L., Zhang, K., Tang, M., He, D. and Li, K.-C., 2021. Data Fusion Approach for Collaborative Anomaly Intrusion Detection in Blockchain-based Systems. IEEE Internet of Things Journal, doi: 10.1109/JIOT.2021.3053842.
  20. Maesa, D.F., Marino, A. and Ricci, L., 2016. Uncovering the Bitcoin Blockchain: An Analysis of the Full Users Graph. IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp. 537-546.
  21. Maesa, D.F., Marino, A. and Ricci, L., 2018. Data-driven analysis of Bitcoin properties: exploiting the users graph. International Journal of Data Science and Analytics, 6, pp. 63-80.
  22. Mansourifar, H., Chen, L. and Shi, W., 2020. Hybrid Cryptocurrency Pump and Dump Detection. Available at: https://arxiv.org/abs/2003.06551.
  23. Monamo, P., Marivate, V. and Twala, B., 2016. Unsupervised learning for robust Bitcoin fraud detection. 2016 Information Security for South Africa (ISSA), pp. 129-134.
  24. Morishima, S. and Matsutani, H., 2018. Acceleration of Anomaly Detection in Blockchain Using In-GPU Cache. IEEE International Conference on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp. 244-251.
  25. New York Post, 2021. Protests in El Salvador after bitcoin made official currency. Available at: https://nypost.com/2021/09/16/protests-in-el-salvador-after-bitcoin-made-official-currency/.
  26. Nier, E., Yang, J., Yorulmazer, T. and Alentorn, A., 2007. Network models and financial stability. Journal of Economic Dynamics & Control, 31, pp. 2033-2060.
  27. Ober, M., Katzenbeisser, S. and Hamacher, K., 2013. Structure and Anonymity of the Bitcoin Transaction Graph. Future Internet, 5, pp. 237-250.
  28. Pham, T.T. and Lee, S., 2016. Anomaly Detection in the Bitcoin System - A Network Perspective. Available at: https://arxiv.org/abs/1611.03942.
  29. Pham, T.T. and Lee, S., 2017. Anomaly Detection in Bitcoin Network Using Unsupervised Learning Methods. Available at: https://arxiv.org/abs/1611.03941.
  30. Ron, D. and Shamir, A., 2013. Quantitative Analysis of the Full Bitcoin Transaction Graph. In: Sadeghi, A.R. (ed) Financial Cryptography and Data Security. FC 2013. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg.
  31. Sayadi, S., Rejeb, S.B. and Choukar, Z., 2019. Anomaly Detection Model Over Blockchain Electronic Transactions. 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), pp. 895-900.
  32. Shayegan, M.J. and Sabor, H.R., 2021. A Collective Anomaly Detection Method Over Bitcoin Network. Available at: https://arxiv.org/abs/2107.00925.
  33. Signorini, M., Pontecorvi, M., Kanoun, W. and Di Pietro, R., 2020. BAD: Blockchain Anomaly Detection. Available at: https://arxiv.org/abs/1807.03833.
  34. Yahoo, 2021. Bitcoin, ethereum rise as Venezuela launches digital currency. Available at: https://finance.yahoo.com/news/bitcoin-ethereum-rise-venezuela-launches-digital-currency-081104851.html.
  35. Zhang, R., Zhang, G., Liu, L., Wang, C. and Wan, S., 2020. Anomaly detection in bitcoin information networks with multi-constrained meta path. Journal of Systems Architecture, 110, 101829.