I am trying to create a Bitcoin address database using btcrecover and Google BigQuery data. While the Ethereum database works fine, my Bitcoin database fails to return any addresses, even though test addresses and random addresses from the blockchain exist in the dataset.
Here's what I tried:
1. Created the database using:
python create-address-db.py --inputlistfile C:\Users\test\Desktop\btc-addresses-db-20250816 --dbfilename btc-addresses-db-20250816.db --dblength 31
2. Checked addresses using:
python check-address-db.py --dbfilename btc-addresses-db-20250816.db --checkaddresslist ./addressdb-checklists/BTC.txt
python check-address-db.py --dbfilename btc-addresses-db-20250816.db --checkaddresses bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh
Observations:
- Ethereum DB works with --dblength 29.
- When I create the Bitcoin DB with --dblength 31 using the full dataset (~2000 BigQuery files, 16 GB DB), it finds no addresses.
- A smaller subset (4 BigQuery files out of 2000) works correctly, with both --dblength 27 (1 GB DB) and --dblength 31 (16 GB DB).
- Splitting the dataset into halves or subsets leads to inconsistent results: some address ranges are found, others are not, even within the same database.
- Tried several Python versions (3.9–3.13) and btcrecover versions (1.6.0, 1.12.0, master) with the same results.
- 64-bit system, all dependencies installed, Rust installed, virtual environments used.
- Official Bitcoin database from the author (addresses-BTC-2011-to-2021-03-31.zip) works correctly.
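For reference, the DB sizes quoted above are consistent with a table of 2^dblength slots at 8 bytes per slot. The 8-byte slot width is my assumption, inferred from the file sizes rather than verified against the btcrecover source, and `db_size_bytes` is a hypothetical helper name. A minimal sketch of the arithmetic:

```python
def db_size_bytes(dblength, bytes_per_slot=8):
    """Estimated DB size for a table of 2**dblength slots.

    bytes_per_slot=8 is an assumption inferred from the observed
    file sizes, not a value taken from the btcrecover source.
    """
    return (1 << dblength) * bytes_per_slot


# Compare the estimate with the sizes observed on disk.
for dblength in (27, 29, 31):
    size_gib = db_size_bytes(dblength) / 2**30
    print(f"--dblength {dblength}: {size_gib:g} GiB")
```

Under that assumption, --dblength 27 and --dblength 31 come out at 1 GiB and 16 GiB, which matches the files I got, so the DB files themselves at least have the expected size.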
Example of inconsistent results from subsets:
| File Range | Addresses Found? |
|---|---|
| 500–520 | Yes |
| 500–599 | No |
| 500–570 | No |
| 550–570 | No |
| 550–559 | Yes |
| 560–569 | Yes |
| 570–579 | Yes |
| 563–579 | Yes |
As you can see, some databases successfully returned addresses, while others didn't. Initially, I thought the problem might be caused by file 570. However, in a subsequent test, the database that included file 570 worked correctly.
In another test, I created a database from files 550–570 and checked a sample address from each file against this database. The results were as follows:
- Sample address from file 570: found
- Sample address from file 553: not found
- Sample address from file 563: found
So even within the same database, some addresses were successfully retrieved while others weren't. This pattern suggests that the issue isn't simply with a single file, but might be related to how certain addresses are indexed or stored in the database.
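To rule out the dataset itself, a check along these lines can confirm that a sample address literally appears in the input files before blaming the database. This is a minimal sketch, assuming the BigQuery exports are uncompressed text files with one address per line; `find_address` and the directory argument are hypothetical names, not part of btcrecover.

```python
from pathlib import Path


def find_address(input_dir, address):
    """Return the names of files in input_dir that contain address as a whole line.

    Assumes plain-text files with one address per line (an assumption
    about the export format, adjust if the files are compressed or CSV).
    """
    hits = []
    for path in sorted(Path(input_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open("r", encoding="utf-8", errors="replace") as fh:
            # Stream line by line so large export files are never fully loaded.
            if any(line.strip() == address for line in fh):
                hits.append(path.name)
    return hits


# Example call (the directory name is a placeholder):
# find_address("btc-addresses-db-20250816", "bc1qxy2kgdygjrsqtzq2n0yrf2493p83kkfjhx0wlh")
```

If a sample address shows up in the input file but check-address-db.py still reports it as missing, that points away from the dataset and toward how the database is built or queried.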
Query:
Has anyone experienced similar issues creating large Bitcoin databases with btcrecover? Could this be a dataset issue, a bug in btcrecover, or a problem with how addresses are indexed when --dblength is large?