Client Analysis - Data Collection
Objective
- Retrieve network data about nodes (such as client, IP, hosting provider, and peers).
- Analyze EL-CL pairings, client diversity, client effectiveness, and peering differences between clients.
- Set up a data pipeline to ingest the data into a data store for further analysis.
Challenges
- Without setting up a node or a crawler, there was no way to query the required information.
Steps taken
- Identified and experimented with the open-source network crawler implementations listed below:
  - node-crawler (Eth1): Although the setup documentation was good, it could only parse 100 mainnet Eth1 nodes/day, and there is no guide for configuring this.
  - node-watch (Eth2): Easy to set up. It crawled Eth2 nodes and stored them in MongoDB. We wrote scripts to extract the data from MongoDB and load it into an AWS RDS MySQL instance.
  - CrawlEth (Eth1): It appears to be a useful tool, but it was difficult to set up due to a lack of documentation.
- We set up nodewatch on an AWS instance and parsed the following information for Eth2 nodes:
  - Client
  - IP
  - Fork Digest
  - Sync Status
  - Code: nodewatch-to-db.py
- We were unable to retrieve some of the desired information, such as each node's peers. This is one of the action items in our roadmap.
- We were also unable to set up an Eth1 crawler.
- Instead, we crawled ethernodes to retrieve the required Eth1 information.
  - Code: ethernode-to-db.py
- To retrieve hosting provider information, we used MaxMind's free database. We had to manually filter out ISPs without hosting infrastructure to remove noise.
  - Code: geoip-to-db.py
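
The ingestion and hosting-filter steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of nodewatch-to-db.py or geoip-to-db.py. It assumes Python's built-in sqlite3 as a stand-in for the MySQL instance and a hard-coded dictionary in place of the MaxMind lookup. All field names, the table name, the organization names, and the sample records are hypothetical.

```python
import sqlite3

# Illustrative records only: the field names are assumptions, not the
# crawler's real schema, and the IPs come from documentation-reserved ranges.
CRAWLED_NODES = [
    {"ip": "192.0.2.10", "client": "lighthouse", "fork_digest": "0x00000001", "synced": True},
    {"ip": "198.51.100.7", "client": "prysm", "fork_digest": "0x00000001", "synced": False},
    {"ip": "203.0.113.5", "client": "teku", "fork_digest": "0x00000001", "synced": True},
]

# Stand-in for the organization name a GeoIP lookup would return per IP.
ORG_BY_IP = {
    "192.0.2.10": "ExampleCloud",
    "198.51.100.7": "Example Home Broadband",
    "203.0.113.5": "ExampleHost",
}

# Hand-maintained list of ISPs without hosting infrastructure (the manual filter).
NON_HOSTING_ISPS = {"example home broadband"}


def hosting_provider(ip):
    """Return the organization for an IP, or None if unknown or a plain ISP."""
    org = ORG_BY_IP.get(ip)
    if org is None or org.lower() in NON_HOSTING_ISPS:
        return None
    return org


def load_nodes(conn, nodes):
    """Create the table if needed and upsert one row per node IP."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS eth2_nodes (
               ip TEXT PRIMARY KEY,
               client TEXT,
               fork_digest TEXT,
               synced INTEGER,
               hosting TEXT
           )"""
    )
    rows = [
        (n["ip"], n["client"], n["fork_digest"], int(n["synced"]), hosting_provider(n["ip"]))
        for n in nodes
    ]
    # INSERT OR REPLACE keeps the load idempotent across repeated crawls.
    conn.executemany("INSERT OR REPLACE INTO eth2_nodes VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_nodes(conn, CRAWLED_NODES)
hosted = conn.execute(
    "SELECT ip, hosting FROM eth2_nodes WHERE hosting IS NOT NULL ORDER BY ip"
).fetchall()
print(loaded, hosted)
```

Keying the table on IP means a repeated crawl overwrites a node's row instead of duplicating it. `INSERT OR REPLACE` is SQLite syntax; against MySQL the equivalent would likely be `INSERT ... ON DUPLICATE KEY UPDATE`.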