[Solved] Mainnet mining - Ubuntu Multi GPU Tutorial


#1

I made a tutorial for everyone.

https://link.medium.com/PdkD4UrnfS

Good luck!


#2

Thanks! I was getting tired of waiting for the Windows guide. Looks like it's Ubuntu time.


#3

Hey, I tried installing with the procedure you described. When I try to validate the epoch.yaml file, I get a validation error:
[email protected]:~/epoch$ ls
bin docs erts-9.3.3.3 lib patches REVISION
data epoch.yaml generated_keys log releases VERSION
[email protected]:~/epoch$ ./bin/epoch check_config epoch.yaml
Validation failed
Position: autostart
Value : #{<<97,117,116,111,115,116,97,114,116>>=>true,<<98,101,110,101,102,105,99,105,97,114,121>>=><<97,107,95,50,98,57,82,112,65,54,53,112,114,72,115,49,115,103,101,68,71,114,85,71,115,84,76,74,55,104,68,120,51,65,102,52,112,78,82,68,86,105,103,106,97,105,84,121,88,105,104,81,104>>,<<99,104,97,105,110>>=>null,<<99,104,97,110,110,101,108>>=>null,<<99,117,99,107,111,111>>=>null,<<100,98,95,112,97,116,104>>=><<46,47,99,104,97,105,110>>,<<100,105,114>>=><<107,101,121,115>>,<<101,100,103,101,95,98,105,116,115>>=>29,<<101,120,101,99,117,116,97,98,108,101>>=><<99,117,100,97,50,57>>,<<101,120,116,101,114,110,97,108>>=>null,<<101,120,116,101,114,110,97,108,95,112,111,114,116>>=>3015,<<101,120,116,114,97,95,97,114,103,115>>=><<>>,<<102,111,114,107,95,109,97,110,97,103,101,109,101,110,116>>=>null,<<104,101,120,95,101,110,99,111,100,101,100,95,104,101,97,100,101,114>>=>true,<<104,116,116,112>>=>null,<<105,110,115,116,97,110,99,101,115>>=>1,<<105,110,116,101,114,110,97,108>>=>null,<<107,101,121,115>>=>null,<<109,105,110,101,114>>=>null,<<109,105,110,105,110,103>>=>null,<<110,101,116,119,111,114,107,95,105,100>>=><<97,101,95,109,97,105,110,110,101,116>>,<<112,101,101,114,95,112,97,115,115,119,111,114,100>>=><<67,97,114,101,50,48,48,55>>,<<112,101,114,115,105,115,116>>=>true,<<112,111,114,116>>=>3014,<<115,121,110,99>>=>null,<<119,101,98,115,111,99,107,101,116>>=>null}
Schema :
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "additionalProperties": false,
  "definitions": {
    "key_value_pattern": {
      "pattern": "^[a-zA-Z0-9\-\.]+\h*:\h*[0-9]+(\h*,\h*[a-zA-Z]+\h*:\h*[0-9]+)*"
    }
  },

}
Reason : No extra properties allowed
Configuration error (validation_failed)
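
The `Value` in that output is an Erlang map whose keys are printed as byte lists, which makes it hard to see what the node actually parsed. Here is a minimal Python sketch that turns each `<<...>>` back into text. The helper name `decode_erlang_binaries` is my own and not part of epoch, and the excerpt is just a few entries copied from the output above.

```python
import re

# Matches an Erlang binary printed as a byte list, e.g. <<97,117,116,111>> or <<>>
BINARY_RE = re.compile(r"<<((?:\d+(?:,\d+)*)?)>>")

def decode_erlang_binaries(text):
    """Replace every <<b1,b2,...>> in text with the quoted string it encodes."""
    def to_text(match):
        body = match.group(1)
        if not body:
            return '""'  # empty binary
        data = bytes(int(b) for b in body.split(","))
        return '"' + data.decode("utf-8", errors="replace") + '"'
    return BINARY_RE.sub(to_text, text)

# A short excerpt of the Value printed by check_config (hypothetical selection of entries).
excerpt = (
    "#{<<97,117,116,111,115,116,97,114,116>>=>true,"
    "<<99,104,97,105,110>>=>null,"
    "<<109,105,110,105,110,103>>=>null,"
    "<<101,120,116,114,97,95,97,114,103,115>>=><<>>}"
)
print(decode_erlang_binaries(excerpt))
```

Run on the full `Value`, this shows `mining`, `cuckoo` and `miner` as `null`, with `autostart`, `beneficiary` and `executable` sitting next to them at the same level. That suggests the nesting was lost in the file rather than a key being misspelled.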


#4

Here is my epoch.yaml


sync:
port: 3115
external_port: 3015

keys:
dir: keys
peer_password: "Care2007"

http:
external:
port: 3013
internal:
port: 3113

websocket:
channel:
port: 3014

mining:
beneficiary: "ak_2b9RpA65prHs1sgeDGrUGsTLJ7hDx3Af4pNRDVigjaiTyXihQh"
autostart: true
cuckoo:
miner:
executable: cuda29
extra_args: ""
instances: 5
edge_bits: 29
hex_encoded_header: true

chain:
persist: true
db_path: ./chain

fork_management:
network_id: ae_mainnet


#5

When I validate my YAML file, I get a validation error.


#6

I think some of the formatting got lost on Medium.
Try copying the content of this file and then changing the beneficiary.
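
If you want to check whether a copied file lost its indentation, here is a rough Python sketch. It is only a heuristic and not a YAML parser: it treats any line of the form `key:` with no value as a section and flags it when the next line is not indented deeper. The function name `find_flattened_sections` is my own, and the nesting in the `nested` example is an assumption based on the order of the keys, so check it against the configuration docs.

```python
def find_flattened_sections(yaml_text):
    """Return section keys whose following line is not indented deeper.

    Heuristic only: a "section" is a line like `key:` with no value. In a
    correctly nested file the next non-blank line is indented deeper than
    the section key itself. A key with an intentionally empty value would
    be flagged too.
    """
    lines = [ln.rstrip() for ln in yaml_text.splitlines()]
    # Ignore blank lines and full-line comments.
    lines = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    flattened = []
    for current, following in zip(lines, lines[1:]):
        stripped = current.strip()
        if not stripped.endswith(":"):
            continue  # has a value on the same line, so not a section header
        indent_current = len(current) - len(current.lstrip())
        indent_next = len(following) - len(following.lstrip())
        if indent_next <= indent_current:
            flattened.append(stripped[:-1])
    return flattened

# Indentation stripped, as happens when copying from a page that drops it.
flat = "mining:\nautostart: true\ncuckoo:\nminer:\nexecutable: cuda29\n"
# Assumed nesting, for illustration only.
nested = (
    "mining:\n"
    "    autostart: true\n"
    "    cuckoo:\n"
    "        miner:\n"
    "            executable: cuda29\n"
)

print(find_flattened_sections(flat))
print(find_flattened_sections(nested))
```

An empty list means every section header is followed by something indented under it. `./bin/epoch check_config epoch.yaml` remains the real test.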


#7

Thank you so much @TwenteMining for helping the community get into mining!
Building æternity together!!! :heart_eyes:

Best,
Albena


#8

I failed to launch my node.
This doesn't really help, for me at least.


#9

Give us more details on what went wrong, @MiniLemmings.


#10

I think it has to do with the epoch.yaml file.
Medium messes up the formatting.
The correct format is found here


#11

I think it’s working now. Thx!


#12

Hello, Chris! I followed your tutorial, and the sync of my epoch node looks normal, but the load on the GPUs is very low, only 15 W per card. After checking the log file, I found the content below. Where is the configuration wrong? Please help me, thank you.
My “epoch_pow_cuckoo.log” looks like this:

2018-11-30 21:11:40.983 [debug] <0.16751.6>@aec_pow_cuckoo:parse_generation_result:479 GeForce GTX 1060 6GB with 6078MB @ 192 bits x 4004MHz
2018-11-30 21:11:40.993 [debug] <0.16753.6>@aec_pow_cuckoo:parse_generation_result:479 GeForce GTX 1060 6GB with 6078MB @ 192 bits x 4004MHz
2018-11-30 21:11:40.993 [debug] <0.16751.6>@aec_pow_cuckoo:parse_generation_result:479 Looking for 42-cycle on cuckoo30(“EyNnYRJCRZURHqtZ1zqRPy3ky+z+0F+vMG1qZUAGfZs=hP5DqCMA0dA=”,0) with 50% edges, 64*64 buckets, 176 trims, and 64 thread blocks.
2018-11-30 21:11:40.993 [debug] <0.16753.6>@aec_pow_cuckoo:parse_generation_result:479 Looking for 42-cycle on cuckoo30(“EyNnYRJCRZURHqtZ1zqRPy3ky+z+0F+vMG1qZUAGfZs=hv5DqCMA0dA=”,0) with 50% edges, 64*64 buckets, 176 trims, and 64 thread blocks.
2018-11-30 21:11:41.143 [error] <0.16755.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.148 [error] <0.16750.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.148 [error] <0.16754.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.153 [error] <0.16752.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.153 [error] <0.16751.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.201 [error] <0.16753.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.230 [debug] <0.16763.6>@aec_pow_cuckoo:generate:79 Generating solution for data hash <<19,35,103,97,18,66,69,149,17,30,171,89,215,58,145,63,45,228,203,236,254,208,95,175,48,109,106,101,64,6,125,155>> and nonce 15046807983168421513 with target 520715587.
2018-11-30 21:11:41.231 [info] <0.16767.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D6A50354471434D413064413D -d 3”
2018-11-30 21:11:41.231 [info] <0.16764.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D6966354471434D413064413D -d 0”
2018-11-30 21:11:41.231 [info] <0.16765.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D6976354471434D413064413D -d 1”
2018-11-30 21:11:41.231 [info] <0.16768.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D6A66354471434D413064413D -d 4”
2018-11-30 21:11:41.231 [info] <0.16766.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D692F354471434D413064413D -d 2”
2018-11-30 21:11:41.231 [info] <0.16769.6>@aec_pow_cuckoo:generate_int_:214 Executing cmd: “./cuda29 -h 45794E6E59524A43525A55524871745A317A71525079336B792B7A2B30462B764D4731715A554147665A733D6A76354471434D413064413D -d 5”
2018-11-30 21:11:42.563 [error] <0.16768.6>@aec_pow_cuckoo:wait_for_result:406 ERROR: GPUassert: unknown error mean.cu 369

2018-11-30 21:11:42.573 [debug] <0.16768.6>@aec_pow_cuckoo:parse_generation_result:479 GeForce GTX 1060 6GB with 6078MB @ 192 bits x 4004MHz
2018-11-30 21:11:42.574 [debug] <0.16768.6>@aec_pow_cuckoo:parse_generation_result:479 Looking for 42-cycle on cuckoo30(“EyNnYRJCRZURHqtZ1zqRPy3ky+z+0F+vMG1qZUAGfZs=jf5DqCMA0dA=”,0) with 50% edges, 64*64 buckets, 176 trims, and 64 thread blocks.
2018-11-30 21:11:42.609 [error] <0.16767.6>@aec_pow_cuckoo:wait_for_result:406 ERROR: GPUassert: unknown error mean.cu 369
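
A log like this is easier to read once the repeated lines are grouped. Here is a minimal Python sketch, assuming every entry follows the `date time [level] <pid>@module:function:line message` layout seen above. The function name `summarize_log` is my own, and the sample is four lines copied from this log.

```python
import re
from collections import Counter

# One log entry: date, time, [level], <pid>@module:function:line, then the message.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+) "
    r"\[(?P<level>\w+)\] "
    r"<(?P<pid>[\d.]+)>@(?P<where>\S+) "
    r"(?P<message>.*)$"
)

def summarize_log(text, level="error"):
    """Count how often each distinct message appears at the given log level."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("level") == level:
            counts[match.group("message")] += 1
    return counts

sample = """\
2018-11-30 21:11:41.143 [error] <0.16755.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:41.148 [error] <0.16750.6>@aec_pow_cuckoo:wait_for_result:421 OS process died: {status,30}
2018-11-30 21:11:42.563 [error] <0.16768.6>@aec_pow_cuckoo:wait_for_result:406 ERROR: GPUassert: unknown error mean.cu 369
2018-11-30 21:11:42.573 [debug] <0.16768.6>@aec_pow_cuckoo:parse_generation_result:479 GeForce GTX 1060 6GB with 6078MB @ 192 bits x 4004MHz
"""

for message, count in summarize_log(sample).most_common():
    print(count, message)
```

To run it on the real file, read `epoch_pow_cuckoo.log` into a string and pass it to `summarize_log`. Lines that do not match the layout (wrapped or blank lines) are skipped.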


#13

Hi,

Have you fixed the issue? I have a similar one, but I am using a 1060 3GB. If you find a solution, please let me know.


#14

I tried updating to CUDA 10. The power consumption still seems low, but there are no errors in the log. God knows if it is working properly.


#15

22:56:37.762 [info] Initializing keys manager
{"Kernel pid terminated",application_controller,"{application_start_failure,aecore,{{shutdown,{failed_to_start_child,aec_conductor_sup,{shutdown,{failed_to_start_child,aec_conductor,{{badmatch,{error,beneficiary_not_configured}},[{aec_conductor,init,1,[{file,"/home/albanes/multi_gpu/_build/prod/lib/aecore/src/aec_conductor.erl"},{line,176}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,365}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,333}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}}}}},{aecore_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,aecore,{{shutdown,{failed_to_start_child,aec_conductor_sup,{shutdown,{failed_to_start_child,aec_conductor,{{badmatch,{error,b

Crash dump is being written to: erl_crash.dump…done


#16

Just to clarify, the following is clearly an error, which is kind of hinted at by the string ERROR in it :wink:


#17

Yes, this error disappeared when I upgraded CUDA. The rig now runs without errors, but the power draw is unstable: one second it is at 100%, and the next it is at 1%. Nothing in 24 hours :frowning:


#18

Your node must have a beneficiary configured. See: https://github.com/aeternity/epoch/blob/master/docs/configuration.md


#20

Yes, there is a beneficiary in my configuration. I am just confused about one thing: nvidia-smi shows the GPU at 100% utilization for a few seconds and then at 0% for a few seconds. This is a huge fluctuation, isn't it? Is this normal? Is there a better solution? Thank you.
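
To put numbers on the fluctuation instead of eyeballing it, one option is to poll `nvidia-smi` once a second and average the readings per card. This is a rough Python sketch, not something from the tutorial: the function names `parse_samples` and `watch` are my own, and it assumes `nvidia-smi` is on the PATH and that the driver reports `utilization.gpu` and `power.draw` for these cards.

```python
import subprocess
import time

# Ask nvidia-smi for one CSV row per GPU: index, utilization (%), power draw (W).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,power.draw",
    "--format=csv,noheader,nounits",
]

def to_float(field):
    """Convert a CSV field to float, or None if the driver reports no value."""
    try:
        return float(field)
    except ValueError:
        return None

def parse_samples(csv_text):
    """Parse nvidia-smi CSV rows into (gpu_index, utilization, power) tuples."""
    samples = []
    for row in csv_text.strip().splitlines():
        index, util, power = [field.strip() for field in row.split(",")]
        samples.append((int(index), to_float(util), to_float(power)))
    return samples

def watch(seconds=30, interval=1.0):
    """Poll nvidia-smi and print the average utilization per GPU."""
    totals, counts = {}, {}
    end = time.time() + seconds
    while time.time() < end:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        for index, util, _power in parse_samples(out):
            if util is None:
                continue  # no reading for this GPU in this sample
            totals[index] = totals.get(index, 0.0) + util
            counts[index] = counts.get(index, 0) + 1
        time.sleep(interval)
    for index in sorted(totals):
        print(f"GPU {index}: average utilization {totals[index] / counts[index]:.1f}%")
```

Calling `watch()` while the node is mining prints one average per card. An average far below 100% would match the on/off pattern described above, though it does not say what causes it.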


#21

As has been noted in other threads here, the current GPU/multi-GPU setup is too simplistic and thus inefficient. There is too much overhead, and I guess that is why you see the fluctuation in GPU utilization.
Disclaimer: I am not a GPU expert.