[Completed] Aeternity node maintenance - iris hard fork release candidate

lydia · September 2, 2020, 12:59pm

@hanssv.chain and @marco.chain Thank you for the reply!
@uwiger Please assign this issue.
Who can take over the important issue "Sync: cleanup dead peers" #3290?

uwiger · September 2, 2020, 1:09pm

Done, and Hans and Radek removed as assignees.

botanicalcarebot.chain · September 4, 2020, 9:39am

I also suggest not to move onto OTP 23 now, unless any new language feature from OTP 23 is a hard requirement, which doesn’t seem to be the case. Migrating to OTP 22.3 is a lot of work already, considering all dependencies need to be updated and potentially fixed too. It is a stable target to work against, whereas OTP 23 is still fresh and there will be bugs and incompatibilities ahead.

vishwas_hypermine · September 4, 2020, 3:35pm

Hi guys,

I have been working on state channels for quite some time, and have been noticing a few issues here and there (like the socket sometimes getting disconnected), but could not really figure out the root cause, hence I did not post. All I can say is that it does not seem to be a node issue.

I am in touch with a company in India with whom I am working in parallel on using the state channel protocol for a use case. We have also added a couple of RPCs and are preparing a demo call with you guys so that I can explain the need for those RPCs; after that we will raise the PR if it makes sense.

AE state channels are currently in the spotlight, at least within my network here in India, since I have been promoting them for quite some time.

dimitar.chain · September 7, 2020, 1:46pm

Our progress for the past week:

@uwiger worked 36.5 hours
PR #3292 Pluggable core functionality
This feature is a cornerstone of the Hyperchains work, and has now been merged into master (cooperation between the Hyperchains team and the maintenance project)
PR #3294 Use parse_transform w -pluggable() attrs
This PR is a prerequisite for #3292 above, and has now been merged into master (cooperation between the Hyperchains team and the maintenance project)
PR #3341 Update deps and CircleCI for OTP 22
CI is now up and running for Aeternity on OTP 22. We disabled a job for OTP 23, since there are still some build issues there.
Issue #3283 Expose chain “transactions” in contract calls
This is progressing, but not yet ready.

@dimitar.chain worked 38.75 hours:
ForceProgress transaction has no info on-chain #3229
Finalised it
Missing test SUITE: aesc_utils #3285
I’ve added a few dozen tests, but a few dozen are yet to be added. I’ve found some small improvement points in the code and addressed them accordingly.

@dincho.chain spent 15 hours on modifying the CI to make it run OTP22 by default. Docker builds were adjusted as well.

dimitar.chain · September 9, 2020, 10:21am

Since this is our last week on this proposal, on Monday we will share our last progress.

In the past month we’ve accomplished a lot and we are happy with our progress there, especially given it was Ulf and me doing the coding and Dincho the DevOps. On this basis we would like to share with you our proposal for the next 2 months. We propose a bigger timeframe so we can tackle some bigger tasks. What is more, Hans can help us out as well. At the moment he can not dedicate more than 16h a week, hopefully this would change for the better.

Below you can find our horizon of tasks for the next 2 months. Please note that we don’t commit that we would do all of those in the timeframe but rather this is the order we would tackle tasks.

So this is our proposal. It is up to the foundation to decide whether they would like to support it or not. cc @Lydia and @Tina

Ulf Wiger @uwiger

Update rocksdb to 6.4.6

The latest version of erlang-rocksdb supports Rocksdb 6.5.2. Our system currently uses erlang-rocksdb 0.24.0, which uses Rocksdb 5.15.10. A new release should be forthcoming, also adapting the Erlang part to OTP 23. We want to move to a newer Rocksdb not least because Rocksdb takes up a large part of the Aeternity build time. Also, lots of bugfixes and performance improvements have been introduced in later Rocksdb versions.

When syncing from backup, accept previous states in DB if they don’t differ

This would improve things for the Middleware, avoiding unnecessary problems during database import.

Rest API endpoints version prefix

github.com/aeternity/aeternity

opened 07:15AM - 15 Jun 20 UTC

closed 08:43AM - 20 May 21 UTC

dincho

breaking/api kind/improvement

Rest API endpoints version prefix should be bumped with the node major. Currently the version endpoint prefix is hardcoded to `v2` in the URL, this is plainly wrong and currently useless. The prefix should reflect major (backward incompatible) API changes to signal users and machines about the fact. An example use case is caching layers, e.g. block 1337 can be cached "forever" until its API structure changes for some reason, and changing the prefix will technically invalidate the cache.

This is regular technical debt, and should be fixed.
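
The caching use case from the issue can be sketched roughly as follows. This is only an illustration in Python, not node code: the helper names `api_prefix` and `cache_key`, the example path, and the rule "prefix = `v` + node major version" are assumptions made for the example.

```python
def api_prefix(node_version: str) -> str:
    """Derive a hypothetical API prefix from the node major version ('5.2.0' -> 'v5')."""
    major = node_version.split(".")[0]
    return f"v{major}"


def cache_key(node_version: str, path: str) -> str:
    """Build a cache key that changes whenever the major version (and so the prefix) changes."""
    return f"/{api_prefix(node_version)}/{path.lstrip('/')}"


cache = {}
# Illustrative path only; block 1337 is the example used in the issue.
cache[cache_key("5.2.0", "key-blocks/height/1337")] = {"height": 1337}

# Same major version: the cached entry is found again.
hit = cache.get(cache_key("5.9.1", "key-blocks/height/1337"))
# New major version: the prefix differs, so the old entry is no longer looked up.
miss = cache.get(cache_key("6.0.0", "key-blocks/height/1337"))
```

With a hardcoded `v2` prefix, both lookups would hit the same key, which is why the prefix currently carries no information for a caching layer.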

Dev mode

Supporting “dev mode” (fake) mining instead of running light cuckoo cycle mining. A prototype for this can be said to exist in the test suites, where this is achieved through mocking.

Data and log locations should be configurable from other location

opened 03:29PM - 30 Jan 20 UTC

uwiger

kind/improvement area/core status/approved

## Expected Behavior

Data and log locations should be configurable from the outside.

## Actual Behavior

Some data and log paths are relative, so end up in the CWD

A reasonable solution would be to ensure that if `setup:home()` is set to an absolute path (and by extension also `setup:data_dir()` and `setup:log_dir()`), then all data and log files should end up there. Esp `lager` settings will need to be tweaked for this.

## Steps to Reproduce the Problem

1. See the [ae_plugin](https://github.com/aeternity/ae_plugin)

The bootstrap logic performs some path rewriting that should be unnecessary, in order to get the files in the right place (and some still don't)

## Logs, error output, etc.

## Specifications

This would be helpful for plugin applications, and should not be too hard to implement.

Unhandled error in aec_chain_metrics_probe

opened 12:48PM - 23 Dec 19 UTC

closed 02:45PM - 23 Mar 21 UTC

dincho

kind/bug need/input-requested

## Expected Behavior

Process not crashing

## Actual Behavior

Crash

## Steps to Reproduce the Problem

Probably inconsistent database. Restart does not help.

## Logs, error output, etc.

```
2019-12-23 12:17:17.022 [error] emulator Error in process <0.5158.196> on node aeternity@localhost with exit value: {{try_clause,{error,not_rooted}},[{aec_chain_metrics_probe,total_difficulty,0,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,130}]},{aec_chain_metrics_probe,sample_,2,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,125}]},{aec_chain_metrics_probe,'-probe_sample/1-fun-0-',1,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,79}]}]}
```

Might be related as well:

```
2019-12-23 11:50:27.872 [error] <0.32556.195> CRASH REPORT Process <0.32556.195> with 0 neighbours exited with reason: {{{badmatch,{error,not_rooted}},[{aec_peer_connection,local_ping_obj,1,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,738}]},{aec_peer_connection,prepare_request_data,3,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,592}]},{aec_peer_connection,handle_request,4,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,587}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,636}]},{gen_server,handle_msg,...},...]},...} in gen_server:call/3 line 214
```

## Specifications

- Virtualization: AWS
- Hardware specs: t3.large
- OS: Ubuntu 16.04
- Node Version: 5.3.0

Probably a rare error, but should be easy to fix. Though the origin of the error is unknown, so testing may be a bit tricky, and addressing the root cause even more so. What we can begin to do is to make the metric probe more robust.

More flexible/file-less configuration

This would simplify testing and deployment of closed systems, and should be easy to implement (testing may take a little bit more time).

Allow configuration by OS environment variables

opened 07:04AM - 15 Jun 20 UTC

closed 01:44PM - 27 Sep 21 UTC

dincho

community kind/improvement

Currently the node can be configured by command line parameters and a configuration file, where the configuration file itself can be changed by a command line parameter or the `AETERNITY_CONFIG` OS environment var; see https://github.com/aeternity/aeternity/blob/master/docs/configuration.md#user-provided-configuration In the world of containerisation it is much more "natural" to use OS environment variables to fully configure a given piece of software, which would ease deployment in such environments, e.g. `AETERNITY_NETWORK_ID`. The exact structure/format is TBD.

This would simplify test setup and development environments. The best way to address it may be to refactor some of the legacy code which checks configuration data. The methods of handling config data evolved over time, and the code reflects this.
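
As a rough illustration of the idea, environment variables could be layered on top of the file-based configuration. The sketch below is Python rather than node code, and everything in it is an assumption: the issue leaves the exact structure/format open, so the `ENV_TO_PATH` table, the `apply_env_overrides` helper and the config path chosen for `AETERNITY_NETWORK_ID` are hypothetical.

```python
import copy
from typing import Dict, Mapping, Tuple

# Hypothetical mapping from an environment variable to a path in the config tree.
ENV_TO_PATH: Dict[str, Tuple[str, ...]] = {
    "AETERNITY_NETWORK_ID": ("fork_management", "network_id"),
}


def apply_env_overrides(config: dict, environ: Mapping[str, str]) -> dict:
    """Return a copy of `config` with values from `environ` applied on top."""
    result = copy.deepcopy(config)
    for var, path in ENV_TO_PATH.items():
        if var not in environ:
            continue
        node = result
        # Walk (and create, if needed) the intermediate sections.
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = environ[var]
    return result


file_config = {"fork_management": {"network_id": "ae_uat"}}
merged = apply_env_overrides(file_config, {"AETERNITY_NETWORK_ID": "ae_mainnet"})
```

In a real deployment `environ` would be `os.environ`; passing it in explicitly keeps the sketch easy to test.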

aehttp_sc_SUITE failure: timeout waiting for channel `open` messages

opened 05:57PM - 25 Aug 20 UTC

uwiger

kind/bug area/tests

The `aehttp_sc_SUITE:sc_ws_min_depth_is_modifiable/1` test case fails with a timeout - at least in some runs.

```
=== Reason: {timeout,{messages,[{<0.9168.0>,websocket_event,channel, update, #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.update">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"state">> => <<"tx_+QENCwH4hLhAdU5TGcrejxQkGW36Bb7mfyY/N6FwCG5qJxSDCpUmcIv0ie2oy0TzPWq9TpTNeby7zYVcnO/hjIUY1S+KiDTrBLhAn0NBWw3RzA4FS9tujscTinOUTK4jm5RuG7eRMyrMycUVRl4olmgxkDHIPLi3YMMzJ+sdHgK3wHvkxUETOYCDA7iD+IEyAaEBnvLdFTEfyWE16WLId900E+O0wsvSmKqkynYPhUodScSGP6olImAAoQG5u4uTbiAM4+qz6hnyjAQZJnfxP/hQFvbjGc7kagT/PoYkYTnKgAACCgCGEAZ510gAwKCBwESIXtJYQs/2KrdjFqVbmEKLcMJyZ1sylrOYFH8zrQJifO77">>}}, <<"version">> => 1}}, {<0.9163.0>,websocket_event,channel, update, #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.update">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"state">> => <<"tx_+QENCwH4hLhAdU5TGcrejxQkGW36Bb7mfyY/N6FwCG5qJxSDCpUmcIv0ie2oy0TzPWq9TpTNeby7zYVcnO/hjIUY1S+KiDTrBLhAn0NBWw3RzA4FS9tujscTinOUTK4jm5RuG7eRMyrMycUVRl4olmgxkDHIPLi3YMMzJ+sdHgK3wHvkxUETOYCDA7iD+IEyAaEBnvLdFTEfyWE16WLId900E+O0wsvSmKqkynYPhUodScSGP6olImAAoQG5u4uTbiAM4+qz6hnyjAQZJnfxP/hQFvbjGc7kagT/PoYkYTnKgAACCgCGEAZ510gAwKCBwESIXtJYQs/2KrdjFqVbmEKLcMJyZ1sylrOYFH8zrQJifO77">>}}, <<"version">> => 1}}]}}
in function aehttp_ws_test_utils:wait_for_msg/5 (/home/builder/aeternity/apps/aehttp/test/aehttp_ws_test_utils.erl, line 324)
in call from aehttp_sc_SUITE:wait_for_channel_event_/3 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 4294)
in call from aehttp_sc_SUITE:wait_for_channel_event_match/4 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 4268)
in call from aehttp_sc_SUITE:channel_send_chan_open_infos/3 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 894)
in call from aehttp_sc_SUITE:finish_sc_ws_open/2 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 842)
in call from aehttp_sc_SUITE:sc_ws_open_/4 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 775)
in call from aehttp_sc_SUITE:sc_ws_min_depth_is_modifiable/1 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 3012)
in call from test_server:ts_tc/3 (test_server.erl, line 1755)
```

From some log analysis, it seems as if the problem is that the channel is opened with `minimum_depth => 0`. This confuses the generic channel setup code, which has a finishing phase where, optionally, blocks are mined to ensure that the `create_tx` is actually included in a block, and minimum depth is reached. This is triggered for the test case in question, but the tx has already been included, and since `minimum_depth == 0`, minimum depth has also been reached and the associated info reports already delivered.

In a failing run, the following could be seen from the test case output:

```
*** User 2020-08-25 07:23:33.929 *** aec_conductor:start_mining(#{}) (aeternity_dev1@localhost) -> ok
*** User 2020-08-25 07:23:33.973 *** aec_conductor:stop_mining() (aeternity_dev1@localhost) -> ok
*** CT Error Notification 2020-08-25 07:23:45.980 *** aehttp_ws_test_utils:wait_for_msg failed on line 324 Reason: timeout
```

From the stacktrace above, we can see that the test core is waiting for an `open` info msg (line 842). But scrolling up, we find those messages already delivered, although the test case code wasn't ready for them then.

```
*** User 2020-08-25 07:23:33.908 *** No test registered for this event (Msg = #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.info">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"event">> => <<"open">>}}, <<"version">> => 1})
*** User 2020-08-25 07:23:33.909 *** [initiator] Received msg #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.info">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"event">> => <<"open">>}}, <<"version">> => 1}
```

This bug was detected during the maintenance project, and causes intermittent failures in the CI. It should be fixed, and should not take more than 1-2 man-days.
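
The race described in the issue (the `open` message arrives before the test is ready to wait for it) can be illustrated with a small Python sketch. It is not the actual test utility, and buffering is just one possible way to handle early messages; the `EventInbox` class and its methods are hypothetical names.

```python
from collections import deque
from typing import Deque, Optional


class EventInbox:
    """Keeps events that arrive before anyone waits for them (hypothetical helper)."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()

    def deliver(self, event: str) -> None:
        # Buffer the event instead of dropping it when no waiter is registered yet.
        self._pending.append(event)

    def take(self, expected: str) -> Optional[str]:
        # Return the first buffered event matching `expected`, if any.
        for event in list(self._pending):
            if event == expected:
                self._pending.remove(event)
                return event
        return None  # a real test helper would block here until a timeout


inbox = EventInbox()
inbox.deliver("open")        # arrives early, as in the failing run
early = inbox.take("open")   # the later wait still finds it
```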

The following issues are broken-down tasks from the already approved issue #3194 (Relax restriction that channel cannot be used before min_depth · Issue #3194 · aeternity/aeternity · GitHub)

State Channels: Inactivity timer in chain watcher

opened 03:53PM - 01 Sep 20 UTC

uwiger

area/statechannels

High-level issue: #3194 [High-level discussion](https://github.com/aeternity/aeternity/wiki/Making-the-State-Channel-FSM-responsive-before-minimum-depth-confirmation) The idea is to be able to order a timer which triggers if a given event (e.g. any, or specific, channel change, for a given channel ID) doesn't occur within a given number of key-/microblocks. As a first step, this type of event could be requested by the client (perhaps also other types of chain watcher events).
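
A block-counted inactivity timer of the kind described above might look roughly like this. The sketch is in Python and makes no claim about how the chain watcher is or will be implemented; `InactivityTimer`, `on_event` and `on_block` are hypothetical names.

```python
class InactivityTimer:
    """Fires when no watched event was seen for `max_blocks` consecutive blocks."""

    def __init__(self, max_blocks: int) -> None:
        self.max_blocks = max_blocks
        self.blocks_since_event = 0

    def on_event(self) -> None:
        # A watched channel event resets the countdown.
        self.blocks_since_event = 0

    def on_block(self) -> bool:
        # Called for every new key-/microblock; True means the timer fired.
        self.blocks_since_event += 1
        return self.blocks_since_event >= self.max_blocks


timer = InactivityTimer(max_blocks=3)
fired = [timer.on_block(), timer.on_block()]  # two quiet blocks
timer.on_event()                              # a channel change is seen
fired += [timer.on_block(), timer.on_block(), timer.on_block()]
```

Counting blocks rather than wall-clock time keeps the trigger deterministic with respect to the chain, which matches the "within a given number of key-/microblocks" wording.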

State Channels: Client can ask FSM to quit waiting for minimum depth

State Channels: modifiable `minimum_depth` default

Hans Svenson @hanssv.chain

FATE cannot get blockhash of current generation

opened 07:49AM - 11 Nov 19 UTC

closed 08:48AM - 08 Oct 20 UTC

ThomasArts

breaking/consensus kind/improvement area/fate

## Expected Behavior

For Sophia contract:

```
entrypoint my_hash() = Chain.block_hash(Chain.block_height)
```

One expects a hash back, but instead `None` is returned.

The cause is in: https://github.com/aeternity/aeternity/blob/afef1aa92a1cd6a75f4037d8e2be540ed4114612/apps/aefate/src/aefa_fate_op.erl#L858

There is an off-by-one error `>=` should be `>`.

This is consensus breaking and has to be conditional w.r.t. next hard fork.

If we change that, a request to relax 256 blocks in the past to something that reflects 24 hours or so.

Some discussion needed on how to deal with contracts that change semantics. Clearly, contracts created after the hardfork will have to respect the new semantics, but should we support different call outcomes for contracts created before hardfork, but called after hardfork? Do we need a general policy for this or is this a case-by-case discussion?

This is an outright bug that should be fixed.
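
The effect of the off-by-one can be shown with a simplified model. The Python sketch below is not the code in `aefa_fate_op.erl`; it only assumes that the check rejects heights that are "too new", and it leaves out the 256-block look-back limit. The `block_hash` helper and the sample hashes are made up for the illustration.

```python
from typing import Dict, Optional


def block_hash(requested: int, current: int, hashes: Dict[int, str],
               *, fixed: bool) -> Optional[str]:
    """Model of the height check: `>=` (current behaviour) versus `>` (proposed fix)."""
    too_new = requested > current if fixed else requested >= current
    if too_new:
        return None
    return hashes.get(requested)


hashes = {9: "kh_previous", 10: "kh_current"}  # made-up values
before = block_hash(10, 10, hashes, fixed=False)  # current generation is rejected
after = block_hash(10, 10, hashes, fixed=True)    # current generation is allowed
```

Since the two variants return different results for the same call, the change has to be tied to a protocol version, as the issue notes.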

AENS: Review and simplify pointers

Currently name pointers allow too much freedom for the user to be creative. This should be revisited.

Make inner transaction of PayingForTx non-valid

opened 09:37AM - 03 Dec 19 UTC

closed 10:33AM - 05 Jan 21 UTC

hanssv

breaking/consensus kind/improvement area/core

The initial implementation has a fully fledged normally signed inner transaction. This means that it is possible to unwrap the PayingForTx and post the inner transaction on-chain. Since the inner transaction isn't intended to be used like this it would be nice to disallow it. The idea is to change what is signed for the inner transaction - the obvious idea is to drop the network-id from the signing schema but perhaps there are other ways as well.

This is a bug in the PayingForTx that would render it useless. The attack vector is described in the GitHub issue. This must be done before Iris release.

AENS: Increase the name expiry time

This is something that came up a few times in the forum already: name expiration was never decided by the public. The idea here is to allow the community to vote on when names should expire.

AENS: Fix bug in AENS.update signature check

This is a bug, it must be fixed.

Deprecate AEVM properly for Iris

This one is a technical debt, it should be resolved ASAP.

Dincho Todorov @dincho.chain

Dincho would be providing us with his DevOps skills so he is needed all over the tasks, really. When he is not overloaded with work, he will be cleaning the issues assigned to him:

Dimitar Ivanov @dimitar.chain

Sync: cleanup dead peers

This bug has been around for a long time now. This would be my priority task. There have been a few attempts to expose the bug; so far all of those exposed some issues but didn’t solve it. It is a black box issue and we do not know how much time and effort it would require to fix. It might take 2 weeks or over a month, and how long it takes will determine my availability for the rest of the tasks. A few more issues might be created from this one. I will need Dincho’s help here as well.

HTTP Websockets upgrade regression

opened 11:19AM - 27 Jan 20 UTC

dincho

kind/bug area/statechannels area/api status/approved

## Expected Behavior

```
$ curl localhost:3014/channel -v
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 3014 (#0)
> GET /channel HTTP/1.1
> Host: localhost:3014
> User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
> Accept: */*
> Referer:
>
< HTTP/1.1 426 Upgrade Required
< connection: upgrade
< content-length: 0
< date: Mon, 27 Jan 2020 11:08:39 GMT
< server: Cowboy
< upgrade: websocket
<
* Connection #0 to host localhost left intact
* Closing connection 0
```

## Actual Behavior

```
$ curl localhost:3014/channel -v
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 3014 (#0)
> GET /channel HTTP/1.1
> Host: localhost:3014
> User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
> Accept: */*
> Referer:
>
< HTTP/1.1 500 Internal Server Error
< content-length: 0
<
* Connection #0 to host localhost left intact
* Closing connection 0
```

## Steps to Reproduce the Problem

Install any 5.* version of the node (except rc1) and use the above commands.

## Logs, error output, etc.

*`aeternity.yaml` configuration file (formerly named `epoch.yaml`)*

```
websocket:
  channel:
    listen_address: 0.0.0.0
```

## Specifications

- Node Version: 5.*

This is working as expected in node versions 4.* and v5.0.0-rc.1

First appears in v5.0.0-rc.2

This bug is breaking some of the tools used by SRE and should be a low hanging fruit.

Out of sync /status endpoint data

opened 04:34PM - 18 Dec 19 UTC

closed 09:54AM - 05 Jan 21 UTC

dincho

kind/bug

## Expected Behavior

```
$ curl -s http://35.166.231.86:3013/v2/status | jq
{
  "difficulty": 21527826519820,
  "genesis_key_block_hash": "kh_pbtwgLrNu23k9PA6XCZnUbtsvEFeQGgavY4FS2do3QP8kcp2z",
  "listening": true,
  "network_id": "ae_mainnet",
  "node_revision": "c6c12b039971ebe9a367d76826c6acbbd966fa0d",
  "node_version": "5.2.0",
  "peer_connections": {
    "inbound": 110,
    "outbound": 20
  },
  "peer_count": 25238,
  "peer_pubkey": "pp_21DNLkjdBuoN7EajkK3ePfRMHbyMkhcuW5rJYBQsXNPDtu3v9n",
  "pending_transactions_count": 104,
  "protocols": [
    {
      "effective_at_height": 161150,
      "version": 4
    },
    {
      "effective_at_height": 90800,
      "version": 3
    },
    {
      "effective_at_height": 47800,
      "version": 2
    },
    {
      "effective_at_height": 0,
      "version": 1
    }
  ],
  "solutions": 0,
  "sync_progress": 100,
  "syncing": false,
  "top_block_height": 184607,
  "top_key_block_hash": "kh_2hTa446BBHBYKoodtnQxJmXzmophGjmF5P8gtwdeUx8Ji3aZvN"
}
```

## Actual Behavior

```
$ curl -s http://35.166.231.86:3013/v2/status | jq
{
  "difficulty": 21527826519820,
  "genesis_key_block_hash": "kh_pbtwgLrNu23k9PA6XCZnUbtsvEFeQGgavY4FS2do3QP8kcp2z",
  "listening": true,
  "network_id": "ae_mainnet",
  "node_revision": "c6c12b039971ebe9a367d76826c6acbbd966fa0d",
  "node_version": "5.2.0",
  "peer_connections": {
    "inbound": 110,
    "outbound": 20
  },
  "peer_count": 25238,
  "peer_pubkey": "pp_21DNLkjdBuoN7EajkK3ePfRMHbyMkhcuW5rJYBQsXNPDtu3v9n",
  "pending_transactions_count": 104,
  "protocols": [
    {
      "effective_at_height": 161150,
      "version": 4
    },
    {
      "effective_at_height": 90800,
      "version": 3
    },
    {
      "effective_at_height": 47800,
      "version": 2
    },
    {
      "effective_at_height": 0,
      "version": 1
    }
  ],
  "solutions": 0,
  "sync_progress": 100,
  "syncing": true,
  "top_block_height": 184607,
  "top_key_block_hash": "kh_2hTa446BBHBYKoodtnQxJmXzmophGjmF5P8gtwdeUx8Ji3aZvN"
}
```

Note that `sync_progress` and `syncing` fields are out of sync.

## Steps to Reproduce the Problem

No, looks like the sync processes are sometimes stuck

## Logs, error output, etc.

Nope

## Specifications

See the status output above.
Hardware not related probably.

This is a curious bug that points to a race condition in the code. The result is a confusing API that is hard to reason about.
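
For monitoring purposes, the inconsistency reported in the issue can be detected on the client side. This is a small Python sketch, assuming only the two field names shown in the `/status` output; the `status_inconsistent` helper is hypothetical and flags just the observed case (progress at 100 while `syncing` is still true).

```python
def status_inconsistent(status: dict) -> bool:
    """Flag the case from the issue: sync_progress is 100 but syncing is still true."""
    return status.get("sync_progress") == 100 and status.get("syncing") is True


expected = {"sync_progress": 100, "syncing": False}
actual = {"sync_progress": 100, "syncing": True}

expected_flag = status_inconsistent(expected)
actual_flag = status_inconsistent(actual)
```

Such a check only works around the symptom; the race condition in the node would still need to be found.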

aec_chain_state infinity restarts and crashes

opened 09:54AM - 11 Dec 19 UTC

closed 11:27AM - 18 Dec 20 UTC

dincho

## Expected Behavior

Recover or crash the node

## Actual Behavior

The node starts spitting exponential number of errors (e.g. 10k/4h) without trying to recover. If the recovery it's not possible it should stop then as it's not operational at all.

## Steps to Reproduce the Problem

Unknown

## Logs, error output, etc.

```
2019-12-09 12:44:26.482 [error] <0.12513.30> Supervisor aec_conductor_sup had child aec_conductor started with aec_conductor:start_link() at <0.12610.30> exit with reason {aborted,{{found_already_calculated_state,<<104,122,94,58,45,227,152,23,188,69,0,106,35,191,113,115,133,113,37,25,201,170,116,99,65,78,0,151,68,239,164,19>>},[{aec_chain_state,update_state_tree,4,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_state.erl"},{line,702}]},{aec_chain_state,update_state_tree,2,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_state.erl"},{line,693}]},{aec_chain_state,internal_insert_transaction,3,[{file,"/home/builder/aeternity/apps/aecore/src/a..."},...]},...]}} in context child_terminated
```

## Specifications

- Virtualization: AWS
- Hardware specs: t3.large
- OS: Ubuntu 16.04.5
- Node Version: 5.2.0
- Instance ID: i-0dc2fe355c1e42ab7

The error recovery mechanism seems to be broken. The issue is not marked as a bug, but it clearly is one. This could result in filling one’s HDD with garbage logs.

meta_tx’s TTL

opened 11:14AM - 03 Dec 19 UTC

closed 01:20PM - 02 Dec 20 UTC

velzevur

kind/bug breaking/consensus area/generalized_accounts

`aetx:ttl/1` specializes the inner tx and calls its callback's `ttl/1`. In the case of a channel co-authenticated transaction when both participants are GAs, that would result in two embedded meta transactions. Calling `aetx:ttl/1` that would result in the outermost meta_tx's ttl. Instead `aetx:ttl/1` should return the innermost transaction's `ttl/1`.

This bug could lead to unexpected results when using generalised accounts: the TTL being used is that of the outermost meta transaction (the one authenticating the inner transaction), whereas it should be that of the innermost transaction.
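
The intended behaviour, returning the TTL of the innermost transaction, can be sketched as follows. This is a Python model and not the `aetx` implementation; the `Tx` class, its fields and the TTL values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tx:
    """Simplified transaction: a meta transaction wraps another one in `inner`."""
    ttl: int
    inner: Optional["Tx"] = None


def innermost_ttl(tx: Tx) -> int:
    """Unwrap nested meta transactions and return the TTL of the innermost one."""
    while tx.inner is not None:
        tx = tx.inner
    return tx.ttl


# Two embedded meta transactions, as with a channel tx co-authenticated by two GAs.
channel_tx = Tx(ttl=500)
nested = Tx(ttl=0, inner=Tx(ttl=0, inner=channel_tx))

outermost = nested.ttl            # what the issue says is returned today
intended = innermost_ttl(nested)  # what the issue asks for
```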

Test suite bugs

aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED badmatch

opened 12:50PM - 26 Nov 19 UTC

tolbrino

kind/bug area/statechannels area/tests

## Expected Behavior

Tests pass.

## Actual Behavior

```
%%% aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED
%%% aest_channels_SUITE ==> {{badmatch,{ok,#{<<"info">> => <<"close_mutual">>, <<"tx">> => <<"tx_+OkLAfiEuEBlz0bOgvRRn2S4RBOecxdlkIUQFjzgA8MBVYURh8aXIo0siqCiUmslCJkGDq1DrxRf5w79kPXtPhpSPFAmlKcDuECsGIAKMPPE3i0gLwhTE91HQZzILSU+IGPpR9kWNIgncSBH0tPmRpnpUKU8SvzXondpx57nLVU91ORxqz5YNQUCuF/4XTUBoQbFNveodt+B570IC8UcdMjjekwZxecIXKZESd7ONTxTy6EBZxxVRkZJRXWytJT2UWghcQZj2EiTzdLSNgN6VMM+7oSGJFyRsrf+hiRVlY8MAgCGEjCc5UAAA27NY1I=">>, <<"type">> => <<"channel_close_mutual_tx">>}}}, [{aest_api,sc_close_mutual,2, [{file,"/home/circleci/aeternity/_build/system_test+test/extras/system_test/common/helpers/aest_api.erl"}, {line,176}]}, {aest_channels_SUITE,simple_channel_test,4, [{file,"/home/circleci/aeternity/_build/system_test+test/extras/system_test/common/aest_channels_SUITE.erl"}, {line,205}]}, {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1755}]}, {test_server,run_test_case_eval1,6,[{file,"test_server.erl"},{line,1262}]}, {test_server,run_test_case_eval,9,[{file,"test_server.erl"},{line,1194}]}]} .
```

## Steps to Reproduce the Problem

None atm.

## Logs, error output, etc.

https://circleci.com/gh/aeternity/aeternity/100969

aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED timeout

opened 09:23AM - 31 Oct 19 UTC

tolbrino

kind/bug area/tests

## Expected Behavior

Tests pass.

## Actual Behavior

```
%%% aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED
%%% aehttp_sc_SUITE ==> {{timeout,{messages,[{<0.14644.0>,websocket_event,channel,conflict, #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.conflict">>, <<"params">> => #{<<"channel_id">> => <<"ch_2mYFVhAbGMgwQPyPuHS9prC7WmKLMFyy89cqqtRR6CZ9kjcYDA">>, <<"data">> => #{<<"channel_id">> => <<"ch_2mYFVhAbGMgwQPyPuHS9prC7WmKLMFyy89cqqtRR6CZ9kjcYDA">>, <<"error_code">> => 2, <<"error_msg">> => <<"conflict">>, <<"round">> => 5}}, <<"version">> => 1}}]}}, [{aehttp_ws_test_utils,wait_for_msg,5, [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_ws_test_utils.erl"}, {line,316}]}, {aehttp_sc_SUITE,wait_for_channel_event_,3, [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"}, {line,3676}]}, {aehttp_sc_SUITE,wait_for_channel_event_match,4, [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"}, {line,3641}]}, {aehttp_sc_SUITE,channel_abort_sign_tx,4, [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"}, {line,509}]}, {aehttp_sc_SUITE,sc_ws_update_abort,1, [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"}, {line,3096}]}, {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1755}]}, {test_server,run_test_case_eval1,6,[{file,"test_server.erl"},{line,1262}]}, {test_server,run_test_case_eval,9,[{file,"test_server.erl"},{line,1194}]}]}
```

## Steps to Reproduce the Problem

Can't be reliably reproduced yet.

## Logs, error output, etc.

https://circleci.com/gh/aeternity/aeternity/95397#tests/containers/2

Those are bugs in the test setup.

Drop “native” windows support

Raise the discussion in the forum on whether the community needs the Windows build and, if not, deprecate it.

gorbak25 · September 10, 2020, 10:46am

Hi!

I’m speaking as the current lead of the Hyperchain project. I want to emphasize the importance and priority of the maintenance project. It’s not about introducing new features but about keeping the AE ecosystem alive. Currently Aeternity is not only developing new cutting edge products like Hyperchains or Superhero but is also a service provider - SDK, Middleware, Seed Nodes, DB snapshots, Monitoring etc… This proposal is in simple terms “Hey, we need to keep our Core Infrastructure Alive, have someone ready who can fix something in case of an emergency and fix existing bugs”
CC: @Lydia @Tina @YaniUnchained

If the 2 month extension is not approved (possibly THIS week; a simple “Hey, please work on this while we handle the bureaucracy” will be enough) then my team will need to do a lot of those tasks in the scope of Hyperchains in order to release a finished product, which will extend the ETA for releasing Hyperchains by possibly months. What I would really like to see done (which can be labelled as General Node Maintenance) before releasing HC is:

Rocksdb upgrade -> performance will increase and the Q/A process will be sped up, which will save us a lot of time
Drop windows support -> I don’t think anybody is using that, will speed up Q/A
Transient failures in the SC test suite -> those tests slow us down due to the possibility of rerunning the entire Q/A process
Sync: cleanup dead peers -> This needs to be done because currently we practically never evict dead peers from the peer pool and only 1% of the peers here are active -> this essentially makes the AE network centralized and unsafe…
Sync: fast sync -> Sync can take weeks… We can speed up things by compromising security slightly - this would allow us to drop the centralized DB backup service…
Sync: peer persistence -> If you restart the node then you need to sync the peer pool again, which essentially opens you up to eclipse attacks; on the other hand, because only 1% of the peers in the pool are actually active, this essentially would mean that after a restart it would be impossible to sync…
Deprecate AEVM -> it clogs up the codebase and should never be used in HC as we have the FATE VM
Make inner transaction of PayingForTx non-valid -> This needs to be fixed as this bug will propagate to all Hyperchains
FATE cannot get blockhash of current generation -> This decreases usefulness of Sophia smart contracts
Crash in aec_chain_metrics_probe
Dev mode -> Actually we started implementing more or less this because otherwise we are unable to test HC properly - currently @radrow.chain is refactoring the SC chain simulator to allow it to be used in the scope of HC

There are other issues which the HC team could tackle, but they can be postponed for later (not necessary for the MVP or HC). Keep in mind that any bug in the Node will propagate to Hyperchains, and it will be hard to fix them later in Hyperchains as we have no control over each individual hyperchain.

Best Regards,
Grzegorz

dimitar.chain · September 10, 2020, 11:04am

Although this had been discussed many times already, those are not even tracked as issues.

radrow.chain · September 10, 2020, 11:11am

I totally support all the proposed tasks. They are all very valuable, and some of them are completely necessary to me (like dev mode (however I am working on something similar at this moment), rocksdb update, fast sync, not even mentioning bugfixes).

A healthy ecosystem is crucial for all of the development we are doing here, not only Hyperchains or Superhero. Writing more serious things requires more serious testing and a more flexible (and bug-free) environment. While I was working on the staking contract I really felt some of these issues being a chain on my feet, especially the testing part. We get really distracted by situations when something fails in the network and requires discussing what the maintenance team is allowed to fix and what it is not. In my opinion, some emergency maintenance budget should be set as well. It is very important to speed up the approval process, as it is mostly work that is required to do other tasks. The HC team has its own things to do and won’t be able to handle all the issues mentioned here while keeping a reasonable delivery time. And especially, we can’t just ignore them because we don’t want them to propagate into HCs (like for example AEVM support).

From my side as an iris target I would also add Create contracts from other contracts · Issue #197 · aeternity/aesophia · GitHub – this would have a huge impact on aepps development and would drastically increase reliability of the repetitive smart contract models (like bonding curve tokens or hyperchains staking contracts).

This is not an iris target (cause it doesn’t need a hard fork), but will be priceless during further smart contract development: FATE debugger · Issue #201 · aeternity/aesophia · GitHub.

marco.chain · September 10, 2020, 12:39pm

I’d love to see those tasks being approved. very nice to see increasing activity of the core team in the forum!

we (kryptokrauts) need the iris hard fork as soon as possible to be able to introduce cool features in regards to the naming system (e.g. name extender, name bazaar)

hanssv.chain · September 10, 2020, 2:41pm

Just so you don’t misunderstand the “Deprecate AEVM” task, for Hyperchains you can remove AEVM fully, but the Aeternity core node has to keep it. But there won’t be any new AEVM contracts allowed on chain.

uwiger · September 11, 2020, 8:22pm

I have pushed a WIP (Work In Progress) PR for exposing chain events from contract calls.
There are still some issues, e.g. when returning events to the HTTP client.

There may also be some event needed at contract setup (see @hanssv.chain comments).
If the maintenance project is extended, I can continue next week.

dimitar.chain · September 14, 2020, 10:04am

Our progress for the past week:

Ulf Wiger @uwiger

Worked 39 hours, mainly on Issue #3283 - Expose chain events in contract calls. A Work-In-Progress PR has been pushed. Events are collected and can be subscribed to as internal events. Some work is still required in order to debug the HTTP endpoint for dry-run, where chain events can now be optionally reported. Also, some review and discussion is needed regarding the format of events, and whether some additional event types should be reported.

Dincho Todorov @dincho.chain

Worked 25 hours. He prepared the infrastructure for a release and healed the nodes by increasing their disk size. He also investigated the infrastructure alerts and spent some time on GitHub issue 3301 - HTTP cache tests.

Dimitar Ivanov @dimitar.chain

Worked 44 hours. Those were mostly spent on adding more and more tests to the aesc_utils_tests suite. I’ve identified some improvement points and added missing function specs. I’ve also prepared the 5.5.5 release PR.

gorbak25 · September 14, 2020, 10:34am

Hi!

Regarding maintenance the priorities of the Hyperchain projects are as follows. Maintenance tasks are grouped in 3 categories:

Required for hyperchains (in order of decreasing priority):

Onchain protocol fixes:
If an issue/bug is discovered which affects the onchain protocol, then it should be fixed ASAP. This includes bugs in the FATE VM etc…
Fast synchronization:
Right now all nodes operate as “archive” nodes. Most people want to quickly get the latest state, so fast synchronization algorithms need to be implemented and provided as an option for users/node operators. After fast sync is done, it would be nice to optionally sync older states (configurable policies as in geth).
Dead peer eviction:
Currently only 1% of the peers in our peer pool are alive. We need to evict dead peers more quickly and validate newly gossiped peers; a queue of unchecked peers to be validated seems like a good idea.
Peer pool persistence:
Currently, after the node restarts, we start retrieving the peer pool from scratch. This is bad: we need to persist the peer pool and possibly revalidate the peers after restarting.
Client endpoint for retrieving the peer pool status:
Currently there is no easy way for a node maintainer to retrieve the list of peers in the peer pool besides attaching a console to the erlang node and writing some code… This essentially means that it is hard to analyze the status of the network and provide people with seed nodes not affiliated with the Anstalt.
Decrease the reliance on centralized seed nodes for bootstrapping the node:
Maintain and provide the community with the peer list (possibly by posting the list in the forum, in a smart contract etc…)
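
To make the dead peer eviction and peer pool persistence points a bit more concrete, here is a rough Python sketch of the idea. It is only an illustration under assumptions: the `PeerPool` class, the `probe` callback and the JSON file format are all made up for this example and are not how the Erlang node is implemented. Gossiped peers go into an unchecked queue, only peers that answer a probe enter the pool, stale peers are re-probed and evicted, and persisted peers are revalidated after a restart.

```python
import json
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class PeerPool:
    """Toy peer pool: gossiped peers wait in a queue until a probe verifies them."""

    probe: Callable[[str], bool]  # hypothetical liveness check, e.g. a ping
    max_age: float = 3600.0       # seconds a verified peer may go without a check
    unchecked: Deque[str] = field(default_factory=deque)
    verified: Dict[str, float] = field(default_factory=dict)  # address -> last seen

    def gossip(self, address: str) -> None:
        """Queue a newly gossiped peer instead of trusting it immediately."""
        if address not in self.verified and address not in self.unchecked:
            self.unchecked.append(address)

    def validate_pending(self, now: float) -> None:
        """Probe every queued peer; only responsive ones enter the pool."""
        while self.unchecked:
            address = self.unchecked.popleft()
            if self.probe(address):
                self.verified[address] = now

    def evict_dead(self, now: float) -> List[str]:
        """Re-probe stale peers and drop those that no longer answer."""
        evicted = []
        for address, last_seen in list(self.verified.items()):
            if now - last_seen <= self.max_age:
                continue  # checked recently enough, leave it alone
            if self.probe(address):
                self.verified[address] = now
            else:
                del self.verified[address]
                evicted.append(address)
        return evicted

    def save(self, path: str) -> None:
        """Persist the verified peers so a restart does not start from scratch."""
        with open(path, "w") as fh:
            json.dump(self.verified, fh)

    def load(self, path: str) -> None:
        """Restore persisted peers as unchecked so they get revalidated."""
        with open(path) as fh:
            for address in json.load(fh):
                self.gossip(address)
```

Loading persisted peers back into the unchecked queue, rather than straight into the pool, is one way to get the "revalidate after restart" behaviour; a real node would also need rate limiting and some handling for peers that are only temporarily unreachable.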

Nice to have

Regulating the naming system
Deprecate AEVM (maybe remove existing aevm contracts after a public governance vote?)
Querying remote nodes for their version

Not necessary

All bugs in the client software bundled inside the node - stratum, SC FSM, etc…

dimitar.chain · September 14, 2020, 11:28am

Thank you @gorbak25 for the feedback! The idea of this proposal is to help the AE Ecosystem, and since the Hyperchain project is one of the most promising ones out there, we will give your set of required issues the highest priority. Once we clear those, we will proceed with the rest of the tasks in the proposal. I’ve created issues accordingly and included them in the maintenance project scope. Please note that the dead peers one is already included.

@Lydia and @Tina please consider those tasks as part of the proposal above as well.

Tina · September 14, 2020, 2:01pm

Hi @dimitar.chain @gorbak25
Thank you for the proposal and for sharing the ideas here. The foundation can prolong the maintenance contract for the next month. Please discuss and finalize the proposal so that the Hyperchains project can benefit from the maintenance work.

Best
Tina

dimitar.chain · September 14, 2020, 2:44pm

@gorbak25 for completeness regarding your bullet points from your required section:

Yes, those must be fixed ASAP, please consult the bigger chunk of the proposal above.

Yes, this would indeed be great to have. Since this is a non-technical issue, I cannot track it in GitHub. Note that this depends heavily on the other peer tasks: the dead peers one and the endpoint one. I consider this to be dangerous before those are done, and we will not start doing those forum posts with seed peers before they are in place.

@Tina from the post above, the proposal is for 2 months:

As stated, I am afraid we won’t be able to solve some bigger issues, esp. regarding dead peers and the fast sync. I think the dead peers is currently the most important issue out there and it could hurt the network. We didn’t touch it in the previous grant period because I am concerned that it could easily spill over one month.

gorbak25 · September 14, 2020, 2:53pm

Really looking forward to getting fast sync implemented - otherwise, without hosting a trustful and centralized DB backup service, HC stakers will need to wait weeks for sync to complete
Yup, onchain bugs (with the exception of deprecated features such as AEVM) need to be fixed ASAP.
Decentralizing the seed nodes can indeed only be done after we fix all sync related issues - I wonder how the central part of our node ended up in such a bad state

One more thing I forgot to add to the above list of tasks (it is “a nice to have” but not required):

upgrade RocksDB -> This is low-hanging fruit which slows down Q/A significantly.

uwiger · September 14, 2020, 2:53pm

This will be especially true now, since we have some other non-trivial high-priority issues to tackle as well. It’s also not just a question of man-hours: we will want to utilize the (considerable) expertise of @hanssv.chain, and he has limited availability, so this will add lead time.

dimitar.chain · September 14, 2020, 2:57pm

Well, it is working suboptimally, but it is still working, and there have been just a handful of issues so far. The key is the “so far” part.

…and yes, a RocksDB upgrade is on the list, please consult the first part of the proposal above.

Ulf Wiger @uwiger

Update rocksdb to 6.4.6

When syncing from backup, accept previous states in DB if they don’t differ

Rest API endpoints version prefix

Dev mode

Data and log locations should be configurable from other location

Unhandled error in aec_chain_metrics_probe

More flexible/file-less configuration

Allow configuration by OS environment variables

aehttp_sc_SUITE failure: timeout waiting for channel open messages

State Channels: Inactivity timer in chain watcher

State Channels: Client can ask FSM to quit waiting for minimum depth

State Channels: modifiable minimum_depth default

Hans Svensson @hanssv.chain

FATE cannot get blockhash of current generation

AENS: Review and simplify pointers

Make inner transaction of PayingForTx non-valid

AENS: Increase the name expiry time

AENS: Fix bug in AENS.update signature check

Deprecate AEVM properly for Iris

Dincho Todorov @dincho.chain

Dimitar Ivanov @dimitar.chain

Sync: cleanup dead peers

HTTP Websockets upgrade regression

Out of sync /status endpoint data

aec_chain_state infinity restarts and crashes

meta_tx’s TTL

Test suite bugs

aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED badmatch

aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED timeout

Drop “native” windows support
