PoW upgrades by soft fork


#1

A few days ago I wondered if Aeternity could accept not only cuckoo30
solutions, but cycles in larger graphs as well. A cuckoo31 graph for
instance has twice as many nodes, and takes about twice as much effort
to search. But scaling the difficulty only by half provides little
incentive to work on larger graphs, since the longer runtime also
hurts progress freeness.

There is however a natural difficulty scaling factor that provides
proper incentives. Recall that a cycle passes the difficulty test if
the 64 most significant bits of the cyclehash are below a difficulty
target.

With graph size scaling, we first divide this 64 bit number by
2^(N-M)*(N-1), where M is the minimum allowed value of N.
This is the number of siphash output bits that define the cuckoo graph.
It also matches the amount of work performed by a radix sorting based solver.

As a result, a cuckoo34 solution is valued 8.8 times more than a
cuckoo31 solution; a 10% bonus on top of the node ratio.

So I propose to just set a minimum value, e.g. N >= 30, and allow
simultaneous mining by cuckoo30, cuckoo31, … , cuckoo64.

To increase the memory requirements, i.e. to upgrade the PoW,
we need only increment M, which is a mere soft fork!
(EDIT: For this to be a soft-fork the scaling factor above cannot contain M. It should instead just be the starting value of M, like 30).
We’ll probably never see cuckoo64 solutions, since it would require Petabytes of memory to solve (and take close to forever), but I’m looking forward to a cuckoo42 cycle some day.

Miner manufacturers are incentivized to produce hardware for larger
graphs, both because of the bonus in cycle valuation, and to make the
miner more future proof against PoW upgrades.

The large and growing memory requirements aim to deter single-chip
ASICs in favor of simpler memory-controller like ASICs that connect
commodity memory chips together. Such ASICs only need to run efficient
enough to saturate the limited memory bandwidth/latency. With power
consumption and cost dominated by that of the memory chips, and the
roles of ASIC design skills and fab access diminished, miner
manufacturers can hopefully compete on more equal terms.


#2

Thanks~

How about ARM cores+HBM memory miner?


#3

I don’'t know of any HBM equipped ARM platform. Do you?
I’m also not aware of anyone having ported my solvers to ARM+NEON…


#4

Does that means 1080TI is the most efficient in mining AE?


#5

Yes, among all available consumer GPUs…


#6

Let me re-iterate, to see if I understand correctly.

You provide neat way of making Cuckoo PoW more configurable and hence more resistant to more sophisticated (in design and manufacturing) ASICs. It’s done by both enabling multiple graph sizes and also extra incentivising larger graphs in difficulty.

On one hand I like it, on the other, I wonder how it will impact our consensus algorithm. Probably one solution of larger graph will take longer time, so key block candidate will be pointing to microblock further in the past.


#7

Not much. GPUs take under a quarter second for cuckoo30, and will be
able to do cuckoo31 in about a half second. It makes less sense to run
larger graphs if it takes much more than a second, because you will
suffer from lack of progress freedom (when someone else finds next
block, your current attempt is aborted, and you want to reduce the
total time in aborted runs).
But you might see ASICs that can run multiple cuckoo33 per second,
which is what I want to encourage.