BTC transactions table inputs missing spend_output_index

0

As the title suggests, I queried the BTC transactions table and found that (after querying the first 100_000_000 rows) every one of the transactions inputs has spend_output_index = null.

Is this an error or is there another way to determine a spends output index?

The query I used was:

SELECT 
    COUNT(*)
FROM (
    SELECT 1
    FROM 
        transactions tx,
        UNNEST(tx.inputs) as t_input(input)
    WHERE 
        input.spent_transaction_hash IS NOT NULL AND 
        input.spend_output_index IS NULL
    LIMIT 100000000
) AS subquery;

And the results...

#	_col0
1	100000000
charles
질문됨 6달 전256회 조회
2개 답변
1

Hello Charles,

Thank you very much for bringing this issue to Rajan and I's attention. The CloudFormation template and accompanying schema in the repository have been updated to reflect these changes.

Have a fantastic rest of your day.

profile pictureAWS
전문가
답변함 5달 전
0

I used the below doc to setup Athena to query the BTC transaction table

[+] https://aws.amazon.com/blogs/database/access-bitcoin-and-ethereum-open-datasets-for-cross-chain-analytics/

The above documentation contain a CloudFormation stack

[+] CF stack: https://aws-blogs-artifacts-public.s3.amazonaws.com/artifacts/DBBLOG-2500/aws-public-blockchain.yaml

I then verified that that spend_output_index was, in fact, coming out to be null for all the rows in the BTC transaction table.

I then checked the source for the data. As per the above doc, the data is stored in a public S3 bucket with URI: s3://aws-public-blockchain/v1.0/btc/transactions/

Below is one of the sample records from this above S3 files:

{"hash":"77406ab7b318fd951a7eceeeb215b1ab6870f2b27d90c8642ec29e5a923200ce","version":"2","size":"304","block_hash":"00000000000000000001b120c6a88c9a3d52bd61b0cd6be3526077d34fae3dfc","block_number":"817878","index":"375","virtual_size":"147","lock_time":"0","input_count":"1","output_count":"1","is_coinbase":false,"output_value":0.00000331,"outputs":[{"address":"bc1pe727l62jdxgr3jk8qxruu4dh0k5lmqdzpapcxjr9htfc66rtt2ksgk20et","index":"0","required_signatures":"1","script_asm":"1 cf95efe952699038cac70187ce55b77da9fd81a20f43834865bad38d686b5aad","script_hex":"5120cf95efe952699038cac70187ce55b77da9fd81a20f43834865bad38d686b5aad","type":"witness_v1_taproot","value":0.00000331}],"block_timestamp":1700611323000,"date":"2023-11-22","last_modified":1700611367997.9,"fee":0.0000337,"input_value":0.00003701,"inputs":[{"address":"bc1pm8nr9cralwndxuap4g06ry0y6ptjur83uupxursdeuzumwrwr56qyudkl6","index":"0","required_signatures":"1","script_asm":"","script_hex":"","sequence":"4261412863","spent_output_index":"514","spent_transaction_hash":"64d83c205fef56d7772cb6b4189f8f3d55153fb1ce512391c488bd5d3965d07a","txinwitness":["e20ca15ddc137123056cef48433a2d8cb6ceb291c23fd4e5263f269fed48bceb6cc426b3930e09119c20218db174f0feef2d07ca38172cdd4e11a83c39ebf6f2","209cbb6921ba3ba7e3ed0f4a320c86cbb0352d6523fba1dc165f0c1f288ad639a6ac0063036f726401010a746578742f706c61696e00337b2270223a226272632d3230222c226f70223a226d696e74222c227469636b223a226d6d7373222c22616d74223a223230227d68","c19cbb6921ba3ba7e3ed0f4a320c86cbb0352d6523fba1dc165f0c1f288ad639a6"],"type":"witness_v1_taproot","value":0.00003701}]}

It can be observed that the above records contains the field in "inputs" column called "spent_output_index" (NOT "spend_output_index"). This means that due to this mismatch in the field name, the data in the Athena for spend_output_index is coming out to be null as there is no such field in the S3 source parquet files.

I deleted the CloudFormation stack and manually updated the below line in the CloudFormation template. I then restored this updated CloudFormation template and I can confirm that I was getting correct output in the Athena queries:

Original line

          Type: array<struct<index:bigint,spent_transaction_hash:string,spend_output_index:bigint,script_asm:string,script_hex:string,sequence:bigint,required_signatures:bigint,type:string,address:string,value:double>>

Updated line

          Type: array<struct<index:bigint,spent_transaction_hash:string,spent_output_index:bigint,script_asm:string,script_hex:string,sequence:bigint,required_signatures:bigint,type:string,address:string,value:double>>

Note, I have replaced "spend_output_index" with "spent_output_index" in the above line.

After making the above change the query was working fine and I was getting 0 rows with spent_output_index = null.

SELECT 
    COUNT(*)
FROM (
    SELECT 1
    FROM 
        transactions tx,
        UNNEST(tx.inputs) as t_input(input)
    WHERE 
        input.spent_transaction_hash IS NOT NULL AND 
        input.spent_output_index IS NULL
    LIMIT 100000000
) AS subquery;

output
=========

#	_col0
1	0
AWS
지원 엔지니어
Rajan_L
답변함 6달 전

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠