Ethereum Bloom Filter
To reduce storage space (e.g. list of transactions and logs they generate). Each block generated has a bloom filter (
logsBloom) which contains the address of any logging contract and its indexed fields from the logs generated.
When an application wants to find all log entries for a given contract or some specific field it can have the node scan each block via the
logsBloom filter and then re-execute the transactions to regenerate the logs and return relevant ones.
A Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set. [it] is a probabilistic data structure: it tells us that the element either definitely is not in the set or may be in the set.
Attacking Ethereum’s Bloom Filter
we can fill the entire filter by generating only 683 (ceil of 2048/3) events! A smart contract can generate as many events as it wants, so this can definitely work.
we looked into how Solidity events actually work under the hood — they actually just execute a LOGX instruction! We realized that it wasn’t even necessary to actually define any events and trigger them manually, we could just log a bunch of junk. The whole contract took up only a few lines of code.
LOGX instructions cost 375 gas per instruction, plus 8 gas per byte of topic data. Calldata is another 68 gas per byte. With the base transaction fee of 21,000 gas, we’re looking at a theoretical limit of a little over 330,000 gas to fill the whole block. Our method usually lands at about 515,000 gas, so we’re far from the maximal limit (probably because of loop overhead).
At 2018 ETH and gas prices it’d cost 20 - 30 cents per block using their contract implementation.
As of 2018 bloom false positive rate was 0.5% to ~1.5%.