Skip to main content

Databend Cloud

In this document we'll show how to set up a simple data pipeline to load data from indexed.xyz to Databend Cloud and perform analysis with SQL.

Currently, Databend Cloud is under beta access, and offers a trial with $200 credit on their homepage

Load data from indexed.xyz

Indexed.xyz file sare stored in cloudflare R2, and for this example, we'll be using the prefix 9d for the demo. We can set up a Databend worksheet to run all queries with an XSmall instance.

First, we need to add indexed.xyz R2 as a Databend stage.

CREATE STAGE r2_stage
URL = 's3://indexed-xyz-wnam/ethereum/raw/logs/v2.0.0/dt=2020-02-20/'
CONNECTION = (
ENDPOINT_URL = 'https://ed5d915e0259fcddb2ab1ce5592040c3.r2.cloudflarestorage.com/'
REGION = 'auto'
ACCESS_KEY_ID = '43c31ff797ec2387177cabab6d18f15a'
SECRET_ACCESS_KEY = 'afb354f05026f2512557922974e9dd2fdb21e5c2f5cbf929b35f0645fb284cf7'
);

You can take a look at the loaded files with the following command:

LIST @r2_stage;

This should return something like the following table:

namesizemd5last_modifiedcreator
dt=2015/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-0-4.parquet12616e273f913a5f75fea96f9fe7d2f086a6f2023-04-07 02:02:21.717 +0000NULL
dt=2015/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-0-42.parquet5470a29c8427372d02dece62924dd985ae512023-04-07 02:02:21.725 +0000NULL
dt=2016/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-37-26.parquet327082c21a9ffbec6d900f21fe4e8ff30d00f2023-04-07 02:02:22.131 +0000NULL
dt=2016/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-37-27.parquet2611128efed0a4490b52bb35a12eb1d2162d182023-04-07 02:02:21.724 +0000NULL
dt=2016/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-37-44.parquet10927d5f7c980ac9e1e1c3de8d42c4388a7b12023-04-07 02:02:21.794 +0000NULL
dt=2016/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-37-45.parquet273489e41f8268c73ff9e8062a16fd1bd3d8bf2023-04-07 02:02:22.180 +0000NULL
dt=2016/1680119600-25ee0299-28eb-47a4-b81e-f49f5399617b-37-9.parquet4220496f206d6bcb48483dc684ba8c67d5c67332023-04-07 02:02:22.145 +0000NULL

Create contract table

Build the target table using the following SQL:

CREATE DATABASE indexedxyz;

USE indexedxyz;

CREATE TABLE `contract` (
`block_time` BIGINT NULL,
`address` VARCHAR NULL,
`event_signature` VARCHAR NULL,
`event_params` ARRAY(STRING NULL) NULL,
`block_number` BIGINT NULL,
`block_hash` VARCHAR NULL,
`log_index` BIGINT NULL,
`transaction_hash` VARCHAR NULL,
`transaction_index` BIGINT NULL,
`data` VARCHAR NULL,
`topics` VARCHAR NULL,
`id` VARCHAR NULL
);

Load data from stage

COPY INTO contract FROM @r2_stage FILE_FORMAT = (type = PARQUET);

The above command is idempotent, once files are loaded into the contract table, redoing the copy would not load the file again. Once data is loaded into the contract table (typically takes 1-10 minutes), we can start to query it.

Counting the tokens

We could check on BAYC mints under the partiion 9d by querying the data:

SELECT COUNT(DISTINCT(CAST(event_params[3] AS INTEGER)))
FROM contract
WHERE LOWER(address) = '0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d'
AND event_signature = 'Transfer(address,address,uint256)';

| count() | | 5374 |

And if we want to check on the transaction volume by month, we can use the following command:

SELECT
date_trunc('month', to_timestamp(block_time)),
count(*)
FROM
contract
WHERE
lower(address) = '0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d'
AND event_signature = 'Transfer(address,address,uint256)'
GROUP BY
1;

The result should look something like:

date_trunc(month, to_timestamp(block_time))count(*)
2021-05-011742
2022-04-01352
2023-03-011622
2022-07-01244
2023-01-012556
2021-12-01262
2021-10-01190
2022-06-01257
2022-02-01205
2022-08-01285
2021-07-01357
2022-10-01225
2021-11-01219
2022-11-01334
2021-08-01563
2021-06-01727
2023-04-01121
2022-12-011566
2022-01-01272
2022-05-01426
2022-03-01415
2021-04-0148
2021-09-01257
2023-02-011950
2022-09-01245

Visualize results through Databend worksheet

Databend Cloud natively provides a visualization toolkits on worksheets.

Databend