

How to Monitor Your Eth2 Validator and Analyze Your P&L

by Coogan Brennan, January 15, 2021
My Journey to Becoming a Validator on Ethereum 2.0, Part 3

This is the third article in a four-part series on how to run your own Eth2 validator. If you are new to this series, be sure to check out Part 1: Getting Started, Part 2: Setting Up Your Client, and Part 4: Safely Migrating Your Eth2 Node.

Also, you all should be checking Ben Edgington's Eth2.News newsletter for essential updates, bug fixes, and news on the forthcoming roadmap. Our Eth2 Knowledge Base is helpful if you need more of a background on key terms, phases, and ConsenSys' Eth2 products.

Intro

It's been a month and a half since the Ethereum 2.0 Beacon Chain genesis kicked off. Already, 2,515,170 ETH has been staked (about $2.9 billion at current market rates) by 61,561 unique validators, with another 16,687 waiting in the queue. Despite the tremendous interest in staking, it's actually been a pretty uneventful month and a half: there have been no major disruptions, only a few slashings, and validator participation has hovered around 98% most of the time. Now's a good time to take a breath and take stock of what we've done so far.

In this blog post I will cover monitoring and financial analysis of your Eth2 validator. I provide an overview of how to access Teku metrics, set up Beaconcha.in notifications, and query the node. I also share my current P&L breakdown. In the final installment of this series, I will discuss how to safely and (hopefully) successfully migrate a Teku node from one server to another.

Monitoring

In this section, I'm going to walk through how to read your validator node's metrics. Running an Ethereum 2.0 validator means running infrastructure for a distributed system, and a crucial part of maintaining infrastructure is being able to see what's going on. Luckily, Teku ships with a great suite of monitoring tools, enabled with the --metrics-enabled flag in the start-up command shown below:

ExecStart=/home/ubuntu/teku-20.11.1/bin/teku --network=mainnet --eth1-endpoint=INFURA_ETH1_HTTP_ENDPOINT_GOES_HERE --validator-keys=/home/ubuntu/validator_key_info/KEYSTORE-M_123456_789_ABCD.json:/home/ubuntu/validator_key_info/validator_keys/KEYSTORE-M_123456_789_ABCD.txt --rest-api-enabled=true --rest-api-docs-enabled=true --metrics-enabled --validators-keystore-locking-enabled=false --data-base-path=/var/lib/teku
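
Before wiring up Prometheus, you can do a quick sanity check that the metrics endpoint is actually serving data (Teku exposes Prometheus-format metrics on port 8008 by default; adjust the port if you have overridden it):

# Should print a page of plain-text metric lines
curl -s http://localhost:8008/metrics | head -n 20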

We do have to follow a few steps before being able to read the data.

For those not running a Teku client: first, why? Second, you can see the minimum metrics provided by all clients in the Ethereum 2.0 specs here.

Installing Prometheus

First, we need to install Prometheus, an open-source monitoring program, and Grafana, an open-source analytics and interactive visualization web app. Prometheus pulls the data and Grafana displays it.

On your Ubuntu command line, download the latest stable Prometheus:

curl -JLO https://github.com/prometheus/prometheus/releases/download/v2.23.0/prometheus-2.23.0.linux-amd64.tar.gz
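
Optionally, you can verify the download against the checksums published on the same GitHub release page before unpacking (a quick integrity check):

# Compare this hash with the one listed for the file on the release page
sha256sum prometheus-2.23.0.linux-amd64.tar.gz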

Decompress the file like so:

tar -zxvf prometheus-2.23.0.linux-amd64.tar.gz

Move the binaries so they're available from the command line:

cd prometheus-2.23.0
sudo mv prometheus promtool /usr/local/bin/

Check to make sure they've been installed correctly:

prometheus --version
promtool --version

Create a prometheus.yml configuration file:

sudo nano prometheus.yml

Paste these parameters into the configuration file:

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "teku-dev"
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets: ["localhost:8008"]

This instructs Prometheus to scrape your Teku node on port 8008 (every 15 seconds per the global scrape_interval, with a 10-second timeout for the teku-dev job). Press Ctrl-X, then Y, to save the buffer.
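
Before moving the file into place, it's worth validating the syntax with promtool, which we installed alongside prometheus above:

# Should report SUCCESS if the YAML is well-formed
promtool check config prometheus.yml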

Now, let's create a directory to hold our Prometheus config file:

sudo mkdir /etc/prometheus
sudo mv prometheus.yml /etc/prometheus/prometheus.yml

We're going to make one more directory for other Prometheus files, and move the consoles and console_libraries modules to /etc/prometheus:

sudo mkdir /var/lib/prometheus
sudo mv consoles/ console_libraries/ /etc/prometheus/

We'll create a prometheus user to run a systemd service, like we did for Teku (read more here about how role-based user access is best practice for server security), and give it access to the appropriate files:

sudo useradd --no-create-home --shell /bin/false prometheus
sudo chown -R prometheus:prometheus /var/lib/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown -R prometheus:prometheus /usr/local/bin/

Last, create a systemd service that can run in the background and restart itself if it fails:

sudo nano /etc/systemd/system/prometheus.service

In this file (which should be empty), we're going to put a series of commands for systemd to execute when we start the service. Copy the following into the text editor:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=always
RestartSec=5
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target

Press Ctrl-X, then Y, to save your changes.

Reload the systemd daemon so it picks up the new service:

sudo systemctl daemon-reload

Start the service:

sudo systemctl start prometheus

Check to make sure it's running okay:

sudo systemctl status prometheus

If you see any errors, get more details by running:

sudo journalctl -f -u prometheus.service

You can stop the Prometheus service by running:

sudo systemctl stop prometheus
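
While the service is running, you can also confirm that Prometheus is actually scraping Teku by querying its own HTTP API (jq is optional here and only used to pretty-print the JSON):

# The teku-dev target should show "health": "up" once Teku is being scraped
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'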

Install Grafana

We're going to use the APT package manager for Linux to install Grafana. This will save us a good amount of work and give us what we need. We'll follow the steps from the Grafana installation page:

sudo apt-get install -y apt-transport-https
sudo apt-get install -y software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -

We add the stable Grafana repository for updates:

echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list

Then we run APT:

sudo apt-get update
sudo apt-get install grafana

The package sets up a systemd service for us (including a grafana user), so we just need to run:

sudo service grafana-server start
sudo service grafana-server status
sudo update-rc.d grafana-server defaults
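
Before setting up the SSH tunnel in the next section, you can confirm Grafana is listening on its default port 3000:

# A 302 redirect to /login in the first response line means Grafana is up
curl -sI http://localhost:3000 | head -n 1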

SSH Tunnelling

Grafana creates a very slick dashboard where we can view our metrics. That dashboard is typically available in the browser, but since we're running the server version of Ubuntu 20.04, it's all command-line. So how do we access Grafana?

Enter SSH tunnelling. It's the same protocol we use to access AWS from our command line, but we're going to set it up so that a mirror port on our local computer connects to a specific port on our AWS instance. That way, when we call up the port locally, say by opening the browser to http://localhost:3000, we are actually looking at port 3000 on our AWS instance.

To do this properly, you'll need your SSH key for AWS and the AWS IP information. You also need to know which port you'd like to connect to. In this case, we know our Grafana instance is running on port 3000, so the command-line instructions will have this generic structure:

ssh -N -L 3000:localhost:3000 -i "PATH_TO_AWS_KEYPAIR.pem" ubuntu@INSTANCE_IDENTIFIER.compute-ZONE.amazonaws.com

This allows us to go to http://localhost:3000 on our local machine and see our Grafana dashboard. But we don't have one yet, so we need to do the following:

Add Prometheus as a data source:

  • Go to "add new data source"
  • Click "Prometheus" from the drop-down
  • Click "Save and Test"
  • Click + on the left-hand menu and select "import dashboard"
  • Add the Teku Grafana ID: 13457
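
As an aside, if you ever want to automate this step instead of clicking through the UI, Grafana also supports file-based provisioning of data sources. A minimal sketch, assuming the default paths from the APT install (restart grafana-server after adding it):

# /etc/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true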

And, bada-bing! We have our dashboard, visible from the comfort of our own browser:

Beaconcha.in App

The Grafana dashboard is excellent and Prometheus is storing information for us. However, there are other options to check validator status.

I've been using the Beaconcha.in Dashboard mobile app for Android. It's a simple interface, which is fine because it's not my primary monitoring service. It lets me quickly glance at my phone to check validator status and sends notifications if something's wrong with the validator.

You enter the validator address you'd like to watch, and that's pretty much it! Again, it's not heavy-duty monitoring (that's what the Grafana Teku feed provides), but it's fine as a secondary service and a binary "is the validator functioning or not" check:

Querying the Node

Another way to "monitor" our Ethereum validator client is to query it! Like an Ethereum 1.0 client, our Ethereum validator client stores and maintains a world state. It's much smaller than Ethereum 1.0's, but it's still on-chain data stored and maintained by your validator client.

This is the same data consumed by the Prometheus / Grafana workflow. We are simply getting closer to the metal (virtually speaking) by querying the node ourselves. Here's a sample of the available data (full list here):

  • Beacon chain information (genesis block, block headers and root, etc.)
  • Validator information (list of validators, validator balance, validator responsibilities, etc.)
  • Node information (overall health, list of peers, etc.)

cURL

The first way to do this is from the command line. When we started Teku, we added the flag --rest-api-enabled=true. This opens up an API endpoint on the default port of 5051 (you can specify another port with the flag --rest-api-port=<PORT>). You can double-check the port is open by running sudo lsof -i -P -n | grep LISTEN.

Once you've confirmed port 5051 is open by Teku, we can use cURL to send REST calls to the Teku API endpoint at http://localhost:5051. For example, here is how we check the balance of the highest-performing validator (according to Beaconcha.in):

curl -X GET "http://localhost:5051/eth/v1/beacon/states/head/validator_balances id=0x8538bbc2bdd5310bcc71b1461d48704e36dacd106fa19bb15c918e69adbcc360e5bf98ebc3f558eb4daefe6d6c26dda5"

Here's the response I got back in mid-January 2021 (in Gwei):

{"data":[{"index":"4966","balance":"32607646851"}]}
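
To put that number in context, every validator starts out with its 32 ETH (32,000,000,000 Gwei) deposit, so a rough estimate of the accumulated rewards is just the difference:

# 32607646851 - 32000000000 = 607646851 Gwei, i.e. roughly 0.61 ETH earned so far
echo $((32607646851 - 32000000000))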

Try out any of the methods on the Teku API doc page using the format at the bottom of this page:

curl -X [REST_METHOD] "API_CALL_IN_QUOTES"
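
For instance, here are a couple of other read-only calls that make handy quick checks (these are standard Beacon API endpoints served on the same port; shown purely as illustrative examples):

# Returns HTTP 200 when the node is synced and healthy
curl -i -X GET "http://localhost:5051/eth/v1/node/health"
# Number of connected peers
curl -X GET "http://localhost:5051/eth/v1/node/peer_count"
# Header of the current chain head
curl -X GET "http://localhost:5051/eth/v1/beacon/headers/head"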

Swagger UI

Teku also provides a basic graphical UI for API calls when the flag --rest-api-docs-enabled=true is added to the start-up command. It's built on swagger-ui, lives on port 5051 by default, and we can use SSH tunnelling to access it. Follow the same SSH tunnelling steps from above, but with 5051 as the port:

ssh -N -L 5051:localhost:5051 -i "PATH_TO_AWS_KEYPAIR.pem" ubuntu@INSTANCE_IDENTIFIER.compute-ZONE.amazonaws.com

From the browser on our computer, we can then navigate to http://localhost:5051/swagger-ui, which looks like this on my machine:

World state and consensus are emergent properties of all public blockchains: Ethereum 2.0 reaches consensus by all validators storing and updating information. It's a bit nerdy, but to look into your local state is to peer into a single pane of a much larger structure, a subset of the fractal constantly updating and emerging into something new. Try it!

Financial Analysis

In my first post, I sketched out the basic material requirements:

  • A three-year commitment to staking 32 ETH and maintaining a validator node
  • 32 ETH (plus <1 ETH for gas costs)
  • $717.12 (three-year reserved instance pricing for an m5.xlarge instance) + $120 (one year's cost of 100 GB of storage, conservatively assuming nearly full storage capacity) = $837.12 paid over the course of the year to AWS
  • MetaMask Extension (free install)
  • Infura Account (free tier)

The AWS costs were for a three-year lock-in, but I mentioned later that I wasn't quite ready to do that. And I'm glad I didn't! You'll see why in a moment, but here's my basic cost breakdown for the month ending December 31st, 2020:

AWS Monthly Costs

  • Data Transfer: $8.52
  • Server: $142.85
  • Storage: $72.50
  • Total: $223.87

Eth2 Validator Rewards

  • Blocks: 5
  • Attestations: ~6,803
  • ETH Rewards: 0.420097728 ($485.83 USD)

As you can probably see, a profit of $261.96 ($485.83 in rewards minus $223.87 in AWS costs) isn't a great spread for one validator. There are a couple of options: this is a relatively stable cost, so I could stake another 32 ETH. The better option might be to change the VPS I'm using, which I actually mentioned in my first post:

Initially, I was confident AWS was the best virtual platform and it's the service I'll use for this post and the next. However, after going through the whole process, I realized AWS might be overkill for the individual developer. AWS' real strength seems to be its capacity to dynamically scale up to meet demand, which comes at a premium cost. This makes economic sense for a large-scale, enterprise-level project, but current Ethereum 2.0 client requirements for an individual do not require such rigor.

I'm going to continue with AWS, but I'm also entertaining the option of running an instance on Digital Ocean, which may be more appropriate for an individual developer.

I think I can get a much better profit margin running on Digital Ocean without taking a hit on my validator performance. A friend is running a validator instance on a much smaller VPS, which costs an order of magnitude less, and we see the same validator performance.

It's great to experiment with AWS, and I don't regret having the extra capacity in case something goes sideways on the Beacon Chain. However, I think it's really great that Eth2 devs are delivering on the promise of making validating available from home networks and setups!

Current price swings also make financial analysis hard, as server costs are fixed in USD while rewards fluctuate. Long-term, I'm very confident that my validator rewards will increase in value, but it does make the cost-benefit analysis tricky!

For the last installment of this series, I will discuss how to safely and (hopefully) successfully migrate a Teku node from one server to another. The major risk is getting slashed, of course; it seems the vast majority of slashings that have taken place are due to this very issue. We'll see how it goes…