Metrics Datasets

Overview

This document introduces Santiment's precomputed metrics datasets.

You can filter all metric tables as:

1
SHOW TABLES LIKE '%metrics%'

Test in Queries

Usually, tables storing pre-computed data have a common set of columns:

  • dt - A DateTime field storing the corresponding date and time.
  • asset_id - An UInt64 unique identifier for an asset. The data for that ID is stored in the asset_metadata table.
  • metric_id - An UInt64 unique identifier for a metric. The data for that ID is stored in the metric_metadata table.
  • value - A Float column holding the metric's value for the given asset/metric pair.
  • computed_at - A DateTime column storing the date and time when the given row was computed.

Metrics are located in different tables, depending on some different parameters. All tables are similar to each other and described below.

Tables


daily_metrics_v2

The daily_metrics_v2 table stores a single metric value for each asset for each day. I.e. in that tables stored metrics that have exactly 1 value per day.

Here's an example how to fetch daily_active_addresses for bitcoin using daily_metrics_v2

1
2
3
4
5
6
7
8
9
10
11
SELECT
    dt,
    get_asset_name(asset_id) AS asset,
    get_metric_name(metric_id) AS metric,
    value
FROM daily_metrics_v2 FINAL
WHERE
    asset_id = get_asset_id('bitcoin') AND
    metric_id = get_metric_id('daily_active_addresses') AND
    dt >= toDateTime('2020-01-01 00:00:00')
LIMIT 2

Test in Queries


intraday_metrics

Table intraday_metrics is very similar to daily_metrics_v2 but stores metrics with more than one value per day. In most cases, these metrics have a new value every 5 minutes.

Here's an example how to get 5-minute prices for ethereum:

1
2
3
4
5
6
7
8
9
10
11
12
SELECT
    dt,
    get_asset_name(asset_id) AS asset,
    get_metric_name(metric_id) AS metric,
    value
FROM intraday_metrics FINAL
WHERE
    asset_id = get_asset_id('ethereum') AND
    metric_id = get_metric_id('price_usd') AND
    dt >= toDateTime('2024-01-01 00:00:00')
ORDER BY dt DESC
LIMIT 5

Test in Queries


labeled_intraday_metrics_v2

Table labeled_intraday_metrics_v2 stores metrics for a given blockchain and label_id. Each label_id corresponds to a certain label FQN. The data for that ID is stored in the label_metadata table. Blockchain column contains the string for blockchain name, e.g. bictoin, ethereum, etc.

For instance, you can retrieve ethereum balance of all centralized exchange addresses with the query:

1
2
3
4
5
6
7
8
9
10
11
12
SELECT
    dt,
    value AS balance
FROM labeled_intraday_metrics FINAL
WHERE
    dt >= toDateTime('2024-01-01 00:00:00')
    AND label_id = dictGet('labels_by_fqn', 'label_id', 'santiment/centralized_exchange:v1')
    AND metric_id = dictGet('metrics_by_name', 'metric_id', 'combined_labeled_balance')
    AND asset_id = dictGet('assets_by_name', 'asset_id', 'ethereum')
    AND blockchain = 'ethereum'
ORDER BY dt DESC
LIMIT 5

Test in Queries


daily_label_based_metrics

Table intraday_metrics also has label and owner columns so that metrics stored in that table correspond to a particular label and owner.

Note that this table uses old way to filter labels (plain label/owner instead of label_fqn).

This table could be used to retrieve labelled exchange metrics under names:

  • labelled_active_deposits
  • labelled_deposit_transactions
  • labelled_active_withdrawals
  • labelled_withdrawal_transactions

or NFT trading volume and trades count metrics.

Here's an example to get Binance active deposits:

1
2
3
4
5
6
7
8
9
10
11
12
13
SELECT
    dt,
    get_asset_name(asset_id) AS asset,
    get_metric_name(metric_id) AS metric,
    value
FROM daily_label_based_metrics FINAL
WHERE
    dt >= toDateTime('2024-01-01 00:00:00')
    AND asset_id = get_asset_id('ethereum')
    AND metric_id = get_metric_id('labelled_active_deposits')
    AND label = 'centralized_exchange' AND owner = 'binance'
ORDER BY dt DESC
LIMIT 5

Test in Queries


ecosystem_aggregated_metrics

ecosystem_aggregated_metrics is a table similar to the daily_metrics_v2 table. The only distinction is that it doesn't contain asset_id column; instead, it includes the ecosystem column. This adjustment was made to accommodate the requirements for development and github activity metrics, allowing us to monitor the activity of the entire ecosystem.

ecosystem is a string, you can get available ecosystems as:

1
2
3
SELECT DISTINCT ecosystem
FROM ecosystem_aggregated_metrics
WHERE dt >= today() - 7

Test in Queries

In that table you can find dev activity metrics like

  • ecosystem_github_activity
  • ecosystem_dev_activity
  • ecosystem_github_activity_contributors_count
  • ecosystem_dev_activity_contributors_count, etc.

Here's an example how to get ecosystem_github_activity_contributors_count for ethereum:

1
2
3
4
5
6
7
8
9
10
11
SELECT
    dt,
    get_metric_name(metric_id) AS metric,
    value
FROM ecosystem_aggregated_metrics FINAL
WHERE
    dt >= toDateTime('2024-01-01 00:00:00')
    AND ecosystem = 'ethereum'
    AND metric_id = get_metric_id('ecosystem_github_activity_contributors_count')
ORDER BY dt DESC
LIMIT 5

Test in Queries


intraday_nft_metrics

A table intraday_nft_metrics was created to store intraday nft metrics which are related to whole nft collection or to specific tokens within nft collection. Here's an example how to get floor price for boredapeyachtclub collection:

1
2
3
4
5
6
7
8
9
SELECT
    dt, value
FROM intraday_nft_metrics
WHERE
    dt >= toDateTime('2024-01-01 00:00:00')
    AND asset_id = get_asset_id('ethereum') 
    AND metric_id = get_metric_id('nft_collection_min_price')
    AND collection_name = 'boredapeyachtclub'
LIMIT 5

Test in Queries


Use Cases and Other tables

Metadata tables

There are few metadata tables storing data about metrics, assets and label_fqns:

  • metric_metadata stores metric name, ID, metric version, etc.
  • asset_metadata stores asset name, ID, decimals, contract_addresses, ticker and slug, etc.
  • label_metadata stores label fqn, ID and other label related info.

Using these table you can navigate through a wide range of Santiment datasets. For instance, you're able to find all price-related metrics with a simple query:

1
2
3
SELECT DISTINCT name, metric_id
FROM metric_metadata
WHERE name LIKE '%price%'

Test in Queries

Also you're able to find deciamals and contract addresses for a particular asset, e.g. uniswap:

1
2
3
SELECT DISTINCT name, asset_id, decimals, contract_addresses
FROM asset_metadata
WHERE name = 'uniswap'
1
2
3
┌───────name─┬──asset_id─┬──deciamls─┬──────────contract_addresses───────────────────────────┐
│ 'uniswap'       2825         18   ['0x1f9840a85d5af5bf1d1762f925bdaddc4201f984']  │ 
└─────────────┴────────────┴─────────────┴───────────────────────────────────────────────────────────┘

Using precomputed metrics to build new metrics

Not all metrics are build from the raw data only. Some of the metrics are computed by combining a set of pre-computed metrics.

The MVRV is defined as the ratio between the Market Value and Realized Value. The total supply is part of the numerator and the denominator, so it can be eliminated. The result is that the numerator is just price_usd and the denominator is realized_price_usd. There are precomputed metrics for both, so we can use them to compute the MVRV (and that's how we do it for the official MVRV metric!). Depending on the load on the database, the query duration can vary. At the moment of writing this, running the query takes 0.13 seconds.

In the query anyIf is used as there is filtering by asset_id and metric_id, so there is only one value per metric for each dt. The example after that discusses how to handle more complex GROUP BY clauses.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
SELECT
  dt,
  get_asset_name(any(asset_id)) AS asset,
  anyIf(value, metric_id=get_metric_id('price_usd')) AS numerator,
  anyIf(value, metric_id=get_metric_id('mean_realized_price_usd_intraday_20y')) AS denominator,
  numerator / denominator AS mvrv_usd_ratio,
  floor((mvrv_usd_ratio - 1) * 100, 2) AS mvrv_usd_percent
FROM intraday_metrics FINAL
WHERE
  asset_id = get_asset_id('bitcoin') AND
  metric_id IN (get_metric_id('price_usd'), get_metric_id('mean_realized_price_usd_intraday_20y')) AND
  dt >= toDateTime('2022-01-01 00:00:00')
GROUP BY dt
ORDER BY dt ASC
LIMIT 10

Test in Queries

1
2
3
4
5
6
7
8
9
10
11
12
┌──────────────────dt─┬─asset───┬──────────numerator─┬────────denominator─┬─────mvrv_usd_ratio─┬─mvrv_usd_percent─┐
│ 2022-01-01 00:00:00 │ bitcoin │  46378.1577858292223002.9920484453832.0161793599786777101.61 │
│ 2022-01-01 00:05:00 │ bitcoin │  46394.9483873731223002.9920484453832.016909291176695101.69 │
│ 2022-01-01 00:10:00 │ bitcoin │  46376.9209928300323002.9973731905372.0161251266707207101.61 │
│ 2022-01-01 00:15:00 │ bitcoin │ 46342.62590584589623002.9973731905372.01463423022676101.46 │
│ 2022-01-01 00:20:00 │ bitcoin │ 46349.90844109937523002.951142462382.014954870531355101.49 │
│ 2022-01-01 00:25:00 │ bitcoin │ 46419.39120846100623002.9209055151442.0179781254358686101.79 │
│ 2022-01-01 00:30:00 │ bitcoin │ 46423.02414518549623002.930750485842.0181351954122198101.81 │
│ 2022-01-01 00:35:00 │ bitcoin │ 46499.00541072294423002.9485857220022.021436740487479102.14 │
│ 2022-01-01 00:40:00 │ bitcoin │ 46573.47460049367523002.956294984492.024673437763669102.46 │
│ 2022-01-01 00:45:00 │ bitcoin │  46647.5219339296623003.0391672759762.0278851674647504102.78 │
└─────────────────────┴─────────┴────────────────────┴────────────────────┴────────────────────┴──────────────────┘

To return only some of the columns, the query can be provided as a FROM subquery. This does not induce any performence degradation. This example also shows how the WITH Clause can be used to avoid string literals repetition.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
WITH
    get_metric_id('price_usd') AS price_usd_metric_id,
    get_metric_id('mean_realized_price_usd_intraday_20y') AS realized_price_usd_metric_id
SELECT
    dt, 
    price_usd / realized_price_usd AS mvrv_usd_ratio,
    floor((mvrv_usd_ratio - 1) * 100, 2) AS mvrv_usd_percent
FROM (
  SELECT
    dt,
    get_asset_name(any(asset_id)) AS asset,
    anyIf(value, metric_id=price_usd_metric_id) AS price_usd,
    anyIf(value, metric_id=realized_price_usd_metric_id) AS realized_price_usd
  FROM intraday_metrics FINAL
  WHERE
    asset_id = get_asset_id('bitcoin') AND
    metric_id IN (price_usd_metric_id, realized_price_usd_metric_id) AND
    dt >= toDateTime('2022-01-01 00:00:00')
  GROUP BY dt
)
ORDER BY dt ASC
LIMIT 10

Test in Queries

The next query demonstrates what needs to be done if there is a need to aggregate the datetime instead of getting a value for each dt:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
WITH
    get_metric_id('price_usd') AS price_usd_metric_id,
    get_metric_id('mean_realized_price_usd_intraday_20y') AS realized_price_usd_metric_id
SELECT
    month, 
    price_usd / realized_price_usd AS mvrv_usd_ratio,
    floor((mvrv_usd_ratio - 1) * 100, 2) AS mvrv_usd_percent
FROM (
  SELECT
    toStartOfMonth(dt) AS month,
    get_asset_name(any(asset_id)) AS asset,
    argMaxIf(value, dt, metric_id=price_usd_metric_id) AS price_usd,
    argMaxIf(value, dt, metric_id=realized_price_usd_metric_id) AS realized_price_usd
  FROM intraday_metrics FINAL
  WHERE
    asset_id = get_asset_id('bitcoin') AND
    metric_id IN (price_usd_metric_id, realized_price_usd_metric_id) AND
    dt >= toDateTime('2022-01-01 00:00:00')
  GROUP BY month
)
ORDER BY month ASC
LIMIT 10

Test in Queries

The following row needs some explanation:

1
argMaxIf(value, dt, metric_id=get_metric_id('price_usd')) AS price_usd,

This function call has three parameters:

  • value - This is the column that is returned
  • dt - This is the column that max is performed upon. Of all columns matching the filter, the one with the max dt is returned.
  • metric_id=get_metric_id('price_usd') - This a boolean expression. Look only at the rows for which the expression evaluates to true.

If the FINAL keyword is not used, taking the row with biggest computed_at among those with the same dt can be achieved by using a tuple as a second argument:

1
argMaxIf(value, (dt, computed_at), metric_id=get_metric_id('price_usd')) AS price_usd,
Help & Feedback

We are using cookies to improve your experience!

By clicking “Allow all”, you agree to use of all cookies. Visit our Cookies Policy to learn more.