BTC transactions table inputs missing spend_output_index

0

As the title suggests, I queried the BTC transactions table and found that (after querying the first 100_000_000 rows) every one of the transactions inputs has spend_output_index = null.

Is this an error or is there another way to determine a spends output index?

The query I used was:

SELECT 
    COUNT(*)
FROM (
    SELECT 1
    FROM 
        transactions tx,
        UNNEST(tx.inputs) as t_input(input)
    WHERE 
        input.spent_transaction_hash IS NOT NULL AND 
        input.spend_output_index IS NULL
    LIMIT 100000000
) AS subquery;

And the results...

#	_col0
1	100000000
charles
已提問 6 個月前檢視次數 256 次
2 個答案
1

Hello Charles,

Thank you very much for bringing this issue to Rajan and I's attention. The CloudFormation template and accompanying schema in the repository have been updated to reflect these changes.

Have a fantastic rest of your day.

profile pictureAWS
專家
已回答 5 個月前
0

I used the below doc to setup Athena to query the BTC transaction table

[+] https://aws.amazon.com/blogs/database/access-bitcoin-and-ethereum-open-datasets-for-cross-chain-analytics/

The above documentation contain a CloudFormation stack

[+] CF stack: https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/DBBLOG-2500/aws-public-blockchain.yaml

I then verified that that spend_output_index was, in fact, coming out to be null for all the rows in the BTC transaction table.

I then checked the source for the data. As per the above doc, the data is stored in a public S3 bucket with URI: s3://aws-public-blockchain/v1.0/btc/transactions/

Below is one of the sample records from this above S3 files:

{"hash":"77406ab7b318fd951a7eceeeb215b1ab6870f2b27d90c8642ec29e5a923200ce","version":"2","size":"304","block_hash":"00000000000000000001b120c6a88c9a3d52bd61b0cd6be3526077d34fae3dfc","block_number":"817878","index":"375","virtual_size":"147","lock_time":"0","input_count":"1","output_count":"1","is_coinbase":false,"output_value":0.00000331,"outputs":[{"address":"bc1pe727l62jdxgr3jk8qxruu4dh0k5lmqdzpapcxjr9htfc66rtt2ksgk20et","index":"0","required_signatures":"1","script_asm":"1 cf95efe952699038cac70187ce55b77da9fd81a20f43834865bad38d686b5aad","script_hex":"5120cf95efe952699038cac70187ce55b77da9fd81a20f43834865bad38d686b5aad","type":"witness_v1_taproot","value":0.00000331}],"block_timestamp":1700611323000,"date":"2023-11-22","last_modified":1700611367997.9,"fee":0.0000337,"input_value":0.00003701,"inputs":[{"address":"bc1pm8nr9cralwndxuap4g06ry0y6ptjur83uupxursdeuzumwrwr56qyudkl6","index":"0","required_signatures":"1","script_asm":"","script_hex":"","sequence":"4261412863","spent_output_index":"514","spent_transaction_hash":"64d83c205fef56d7772cb6b4189f8f3d55153fb1ce512391c488bd5d3965d07a","txinwitness":["e20ca15ddc137123056cef48433a2d8cb6ceb291c23fd4e5263f269fed48bceb6cc426b3930e09119c20218db174f0feef2d07ca38172cdd4e11a83c39ebf6f2","209cbb6921ba3ba7e3ed0f4a320c86cbb0352d6523fba1dc165f0c1f288ad639a6ac0063036f726401010a746578742f706c61696e00337b2270223a226272632d3230222c226f70223a226d696e74222c227469636b223a226d6d7373222c22616d74223a223230227d68","c19cbb6921ba3ba7e3ed0f4a320c86cbb0352d6523fba1dc165f0c1f288ad639a6"],"type":"witness_v1_taproot","value":0.00003701}]}

It can be observed that the above records contains the field in "inputs" column called "spent_output_index" (NOT "spend_output_index"). This means that due to this mismatch in the field name, the data in the Athena for spend_output_index is coming out to be null as there is no such field in the S3 source parquet files.

I deleted the CloudFormation stack and manually updated the below line in the CloudFormation template. I then restored this updated CloudFormation template and I can confirm that I was getting correct output in the Athena queries:

Original line

          Type: array<struct<index:bigint,spent_transaction_hash:string,spend_output_index:bigint,script_asm:string,script_hex:string,sequence:bigint,required_signatures:bigint,type:string,address:string,value:double>>

Updated line

          Type: array<struct<index:bigint,spent_transaction_hash:string,spent_output_index:bigint,script_asm:string,script_hex:string,sequence:bigint,required_signatures:bigint,type:string,address:string,value:double>>

Note, I have replaced "spend_output_index" with "spent_output_index" in the above line.

After making the above change the query was working fine and I was getting 0 rows with spent_output_index = null.

SELECT 
    COUNT(*)
FROM (
    SELECT 1
    FROM 
        transactions tx,
        UNNEST(tx.inputs) as t_input(input)
    WHERE 
        input.spent_transaction_hash IS NOT NULL AND 
        input.spent_output_index IS NULL
    LIMIT 100000000
) AS subquery;

output
=========

#	_col0
1	0
AWS
支援工程師
Rajan_L
已回答 6 個月前

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南