[Completed] Aeternity node maintenance - iris hard fork release candidate

uwiger · August 26, 2020, 10:44am

With CircleCI back up and running, we have identified an intermittent failure in the aehttp_sc_SUITE (State Channels websocket API test suite).

An issue has been created: Issue #3336. I think it would make sense to include that in the maintenance project, so that we can fix it right away. I estimate ca 1-2 man-day(s) of work.

As far as we can tell, it is a bug in the test suite, and not an underlying problem with the FSM implementation. But intermittently failing test runs become a drag on overall productivity.

dimitar.chain · August 31, 2020, 3:01pm

Our progress for the past week:

@uwiger worked 32.5 hours:
Merged PRs:
Break out chain simulator from chain watcher suite (Pull Request #3329 by uwiger, fixes #3322)

Reviewed PRs:

Identified new issue affecting CI, mentioned in the forum thread suggesting that we solve it as part of the maintenance project.

Started wiki page describing the challenges of issue #3194

@dimitar.chain worked 36.25 hours:

the PRs from the last week were polished a bit and merged
SC: websocket is not closed on “invalid fsm id” #3163
This was not a bug. I ran different tests and was never able to reproduce it. On the contrary, both the code and the test coverage look good.
SC: Feature addition, allow merchant to subscribe to incoming client requests #3069
As Ulf commented, this had been implemented some time ago. There weren’t any WebSocket tests related to this feature, so I wrote some. This would result in log examples as well.
ForceProgress transaction has no info on-chain #3229
This again is already implemented and test coverage is good. I ran some additional tests to make sure that the tests really do what they claim to. (Actually I finished with this today, so it is partially in the next week as well.)

@dincho.chain worked 4 hours:

Add simplified CircleCI release workflow #3339 (it is not yet reviewed)

uwiger · September 1, 2020, 4:19pm

I have created a number of issues, breaking down issue #3194:

github.com/aeternity/aeternity

State Channels: Inactivity timer in chain watcher

opened 03:53PM - 01 Sep 20 UTC

uwiger

area/statechannels

High-level issue: #3194 [High-level discussion](https://github.com/aeternity/aeternity/wiki/Making-the-State-Channel-FSM-responsive-before-minimum-depth-confirmation) The idea is to be able to order a timer which triggers if a given event (e.g. any, or specific, channel change, for a given channel ID) doesn't occur within a given number of key-/microblocks. As a first step, this type of event could be requested by the client (perhaps also other types of chain watcher events).

I would like acknowledgement from the Foundation that I’m cleared to add these issues to the project scope (strictly speaking, they are a refinement of the scope), and to start working on them. Cc @lydia

lydia · September 2, 2020, 8:25am

The priority in the maintenance project is to fix all the bugs and technical issues. It seems that the issues list is increasing due to the state channels. Presently there is no application using them. Therefore this can be done at a later stage and not in the present project.

AF Board welcomes applications on the state channels! Maybe Dimitar can reconsider his decision and make the app. We will be happy to support any state channels application.

@uwiger please tell us why we are updating to OTP 22. Can we not move to OTP 23? What are the problems with OTP 23? There are still unsolved bugs in OTP 22 too that can create problems. What criteria of stability did you use? Can we also update deps and CircleCI configuration to OTP 23? Do we have tests to compare with the future releases of OTP 23? Please report your test experience to all of us.

uwiger · September 2, 2020, 9:23am

The issue list is only increasing here because I broke down the already approved issue into smaller tasks, which individually can help improve the situation. It would allow for a more limited effort while still adding benefit (i.e. not all those tasks need to be completed in order to deliver improvement).

Having said this, I welcome concrete suggestions on issues that should get higher priority. I intend to do some digging myself and try to come up with some.

And we do have active State Channel users here on the forum. The proposal [withdrawn for other reasons] to create a State Channel Mobile Client was well received, and Hypermine (e.g. @vishwas_hypermine) has actively posted on the forum about their State Channel work, as well as talked about their use of State Channels.

In terms of OTP 22, I have pushed a PR, but this ran into an issue with CircleCI. I am waiting for @dincho.chain to take a look at it.

Regarding OTP 23, I suggest that we wait at least until 23.1 comes out. In the initial tests, the build failed due to a compiler bug. This was quickly fixed by OTP, but this sort of thing is not especially unexpected when trying out an X.0 version of OTP. We are also waiting for updates to erlang-rocksdb that would make it easier to build cleanly on OTP 23.

Even so, our intention is to get CI up and running also using OTP 23. The priority for now should be OTP 22, since it is more mature, and has all the things we currently need.

uwiger · September 2, 2020, 9:50am

Here is one issue I’ve been thinking about picking up:

github.com/aeternity/aeternity

Expose chain "transactions" in contract calls

opened 07:58AM - 09 Jun 20 UTC

closed 06:32PM - 12 Nov 20 UTC

hanssv

kind/feature breaking/api kind/improvement area/fate

This information should be collected during contract execution, and then served somehow. It is obvious where to serve this when doing a `dry-run` but for normal chain operation/verification we also should make it visible - the middleware will be the main consumer I guess. The simplest thing would be to write to an append only file, but maybe we can come up with something more creative?!

@hanssv.chain and I had a discussion about it before the holidays, and agreed on an approach.

hanssv.chain · September 2, 2020, 10:01am

The “Expose chain transactions in contract calls” is definitely a good candidate. It is one of those embarrassing technical issues/debt we have around. (Not only saying so since I wrote the issue.)

marco.chain · September 2, 2020, 12:48pm

this is a really important issue!

I absolutely support this and also the new middleware should be able to handle that information as soon as possible @karol.chain

lydia · September 2, 2020, 12:59pm

@hanssv.chain and @marco.chain Thank you for the reply!
@uwiger Please assign this issue.
Who can take over the important Sync: cleanup dead peers #3290 ?

uwiger · September 2, 2020, 1:09pm

Done, and Hans and Radek removed as assignees.

botanicalcarebot.chain · September 4, 2020, 9:39am

I also suggest not to move onto OTP 23 now, unless any new language feature from OTP 23 is a hard requirement, which doesn’t seem to be the case. Migrating to OTP 22.3 is a lot of work already, considering all dependencies need to be updated and potentially fixed too. It is a stable target to work against, whereas OTP 23 is still fresh and there will be bugs and incompatibilities ahead.

vishwas_hypermine · September 4, 2020, 3:35pm

Hi guys,

I have been working on State Channels for quite some time, and have been noticing a few issues here and there (like the socket sometimes getting disconnected, etc.) but could not really figure out the root cause, hence did not post. All I can say is that it does not seem to be a node issue.

I am in touch with a company in India with whom I am working in parallel on using the state channel protocol for a use case. We have also added a couple of RPCs and are preparing a demo call with you guys so that I can explain the need for those RPCs, and after that we will raise the PR if it makes sense.

AE state channels are currently in the spotlight, at least within my network here in India, since I have been promoting them for quite some time.

dimitar.chain · September 7, 2020, 1:46pm

Our progress for the past week:

@uwiger worked 36.5 hours
PR #3292 Pluggable core functionality
This feature is a cornerstone of the Hyperchains work, and has now been merged into master (cooperation between the Hyperchains team and the maintenance project)
PR #3294 Use parse_transform w -pluggable() attrs
This PR is a prerequisite for #3292 above, and has now been merged into master (cooperation between the Hyperchains team and the maintenance project)
PR #3341 Update deps and CircleCI for OTP 22
CI is now up and running for Aeternity on OTP 22. We disabled a job for OTP 23, since there are still some build issues there.
Issue #3283 Expose chain “transactions” in contract calls
This is progressing, but not yet ready.

@dimitar.chain worked 38.75 hours:
ForceProgress transaction has no info on-chain #3229
Finalised it
Missing test SUITE: aesc_utils #3285
I’ve added a few dozen tests, but a few dozen are yet to be added. I’ve found some small improvement points in the code and I’ve addressed those accordingly.

@dincho.chain spent 15 hours on modifying the CI to make it run OTP22 by default. Docker builds were adjusted as well.

dimitar.chain · September 9, 2020, 10:21am

Since this is our last week on this proposal, on Monday we will share our last progress.

In the past month we’ve accomplished a lot and we are happy with our progress there, especially given it was Ulf and me doing the coding and Dincho the DevOps. On this basis we would like to share with you our proposal for the next 2 months. We propose a bigger timeframe so we can tackle some bigger tasks. What is more, Hans can help us out as well. At the moment he cannot dedicate more than 16h a week; hopefully this will change for the better.

Below you can find our horizon of tasks for the next 2 months. Please note that we don’t commit that we would do all of those in the timeframe but rather this is the order we would tackle tasks.

So this is our proposal. It is up to the foundation to decide if they would like to support it or not. cc @Lydia and @Tina

Ulf Wiger @uwiger

Update rocksdb to 6.4.6

The latest version of erlang-rocksdb supports Rocksdb 6.5.2. Our system currently uses erlang-rocksdb 0.24.0, which uses Rocksdb 5.15.10. A new release should be forthcoming, also adapting the Erlang part to OTP 23. We want to move to a newer Rocksdb not least because Rocksdb takes up a large part of the Aeternity build time. Also, lots of bugfixes and performance improvements have been introduced in later Rocksdb versions.

When syncing from backup, accept previous states in DB if they don’t differ

This would improve things for the Middleware, avoiding unnecessary problems during database import.

Rest API endpoints version prefix

github.com/aeternity/aeternity

Rest API endpoints version prefix

opened 07:15AM - 15 Jun 20 UTC

closed 08:43AM - 20 May 21 UTC

dincho

breaking/api kind/improvement

Rest API endpoints version prefix should be bumped with the node major. Currently the version endpoint prefix is hardcoded to `v2` in the URL, this is plainly wrong and currently useless. The prefix should reflect major (backward incompatible) API changes to signal users and machines about the fact. An example use case is caching layers, e.g. block 1337 can be cached "forever" until its API structure changes for some reason, and changing the prefix will technically invalidate the cache.

This is regular technical debt, and should be fixed.
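To illustrate the caching argument in the issue, here is a minimal sketch in Python (the node itself is written in Erlang). The names `api_prefix` and `cache_key` and the path `blocks/1337` are hypothetical, chosen only for illustration; the one assumption is that the prefix is derived from the major component of the node version.

```python
def api_prefix(node_version: str) -> str:
    """Derive the URL prefix from the major component of a version string."""
    major = node_version.split(".", 1)[0]
    if not major.isdigit():
        raise ValueError(f"unexpected version string: {node_version!r}")
    return f"v{int(major)}"


def cache_key(node_version: str, path: str) -> str:
    """Build a cache key that changes whenever the major version changes."""
    return f"/{api_prefix(node_version)}/{path.lstrip('/')}"
```

With such a scheme, a cached entry for a block stays valid across minor and patch releases, and a major bump naturally produces a new key instead of serving a stale response shape.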

Dev mode

Supporting “dev mode” (fake) mining instead of running light cuckoo cycle mining. A prototype for this can be said to exist in the test suites, where this is achieved through mocking.

Data and log locations should be configurable from other location

github.com/aeternity/aeternity

data and log locations should be configurable from other location

opened 03:29PM - 30 Jan 20 UTC

uwiger

kind/improvement area/core status/approved

## Expected Behavior

Data and log locations should be configurable from the outside.

## Actual Behavior

Some data and log paths are relative, so end up in the CWD.

A reasonable solution would be to ensure that if `setup:home()` is set to an absolute path (and by extension also `setup:data_dir()` and `setup:log_dir()`), then all data and log files should end up there. Esp `lager` settings will need to be tweaked for this.

## Steps to Reproduce the Problem

1. See the [ae_plugin](https://github.com/aeternity/ae_plugin)

The bootstrap logic performs some path rewriting that should be unnecessary, in order to get the files in the right place (and some still don't)

## Logs, error output, etc.

## Specifications

This would be helpful for plugin applications, and should not be too hard to implement.
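The rule proposed in the issue (relative paths are anchored at an absolute home directory) can be sketched as follows. This is a Python illustration only; `resolve_path` is a hypothetical helper and does not correspond to any function in the node or in the `setup` application.

```python
from pathlib import PurePosixPath


def resolve_path(configured: str, home: str) -> PurePosixPath:
    """Anchor a relative data/log path at `home`; leave absolute paths alone."""
    home_dir = PurePosixPath(home)
    if not home_dir.is_absolute():
        # The issue only discusses the case where the home dir is absolute.
        raise ValueError("home must be an absolute path")
    path = PurePosixPath(configured)
    return path if path.is_absolute() else home_dir / path
```

The point of the sketch is that no path ends up depending on the current working directory once the home directory is absolute.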

Unhandled error in aec_chain_metrics_probe

github.com/aeternity/aeternity

Unhandled error in aec_chain_metrics_probe

opened 12:48PM - 23 Dec 19 UTC

closed 02:45PM - 23 Mar 21 UTC

dincho

kind/bug need/input-requested

## Expected Behavior

Process not crashing

## Actual Behavior

Crash

## Steps to Reproduce the Problem

Probably inconsistent database. Restart does not help.

## Logs, error output, etc.

```
2019-12-23 12:17:17.022 [error] emulator Error in process <0.5158.196> on node aeternity@localhost with exit value: {{try_clause,{error,not_rooted}},[{aec_chain_metrics_probe,total_difficulty,0,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,130}]},{aec_chain_metrics_probe,sample_,2,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,125}]},{aec_chain_metrics_probe,'-probe_sample/1-fun-0-',1,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_metrics_probe.erl"},{line,79}]}]}
```

Might be related as well:

```
2019-12-23 11:50:27.872 [error] <0.32556.195> CRASH REPORT Process <0.32556.195> with 0 neighbours exited with reason: {{{badmatch,{error,not_rooted}},[{aec_peer_connection,local_ping_obj,1,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,738}]},{aec_peer_connection,prepare_request_data,3,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,592}]},{aec_peer_connection,handle_request,4,[{file,"/home/builder/aeternity/apps/aecore/src/aec_peer_connection.erl"},{line,587}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,636}]},{gen_server,handle_msg,...},...]},...} in gen_server:call/3 line 214
```

## Specifications

- Virtualization: AWS
- Hardware specs: t3.large
- OS: Ubuntu 16.04
- Node Version: 5.3.0

Probably a rare error, but should be easy to fix. Though the origin of the error is unknown, so testing may be a bit tricky, and addressing the root cause even more so. What we can begin to do is to make the metric probe more robust.

More flexible/file-less configuration

This would simplify testing and deployment of closed systems, and should be easy to implement (testing may take a little bit more time).

Allow configuration by OS environment variables

github.com/aeternity/aeternity

Allow configuration by OS environment variables

opened 07:04AM - 15 Jun 20 UTC

closed 01:44PM - 27 Sep 21 UTC

dincho

community kind/improvement

Currently the node can be configured by command line parameters and a configuration file, where the configuration file itself can be changed by a command line parameter or the `AETERNITY_CONFIG` OS environment var. See https://github.com/aeternity/aeternity/blob/master/docs/configuration.md#user-provided-configuration In the world of containerisation it is much more "natural" to use OS environment variables to fully configure a given piece of software, which would ease the deployment in such environments, e.g. `AETERNITY_NETWORK_ID`. The exact structure/format is TBD.

This would simplify test setup and development environments. The best way to address it may be to refactor some of the legacy code which checks configuration data. The methods of handling config data evolved over time, and the code reflects this.
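Since the issue leaves the exact structure and format open, the following is only one possible shape, sketched in Python. The flat key mapping (`AETERNITY_NETWORK_ID` to `network_id`), the helper name `apply_env_overrides`, and the example values are all assumptions made for illustration; only the `AETERNITY_CONFIG` and `AETERNITY_NETWORK_ID` variable names come from the issue text.

```python
import os

PREFIX = "AETERNITY_"
# AETERNITY_CONFIG already points at the config file, so it is not an override.
RESERVED = {"AETERNITY_CONFIG"}


def apply_env_overrides(config: dict, environ=None) -> dict:
    """Return a copy of `config` with AETERNITY_* variables layered on top."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for name, value in environ.items():
        if name.startswith(PREFIX) and name not in RESERVED:
            # Assumed mapping: strip the prefix and lower-case the rest.
            merged[name[len(PREFIX):].lower()] = value
    return merged
```

A container could then set `AETERNITY_NETWORK_ID` and nothing else, while the file-based configuration continues to supply every other value.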

aehttp_sc_SUITE failure: timeout waiting for channel `open` messages

github.com/aeternity/aeternity

aehttp_sc_SUITE failure: timeout waiting for channel `open` messages

opened 05:57PM - 25 Aug 20 UTC

uwiger

kind/bug area/tests

The `aehttp_sc_SUITE:sc_ws_min_depth_is_modifiable/1` test case fails with a tim…eout - at least in some runs. ``` === Reason: {timeout,{messages,[{<0.9168.0>,websocket_event,channel, update, #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.update">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"state">> => <<"tx_+QENCwH4hLhAdU5TGcrejxQkGW36Bb7mfyY/N6FwCG5qJxSDCpUmcIv0ie2oy0TzPWq9TpTNeby7zYVcnO/hjIUY1S+KiDTrBLhAn0NBWw3RzA4FS9tujscTinOUTK4jm5RuG7eRMyrMycUVRl4olmgxkDHIPLi3YMMzJ+sdHgK3wHvkxUETOYCDA7iD+IEyAaEBnvLdFTEfyWE16WLId900E+O0wsvSmKqkynYPhUodScSGP6olImAAoQG5u4uTbiAM4+qz6hnyjAQZJnfxP/hQFvbjGc7kagT/PoYkYTnKgAACCgCGEAZ510gAwKCBwESIXtJYQs/2KrdjFqVbmEKLcMJyZ1sylrOYFH8zrQJifO77">>}}, <<"version">> => 1}}, {<0.9163.0>,websocket_event,channel, update, #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.update">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"state">> => <<"tx_+QENCwH4hLhAdU5TGcrejxQkGW36Bb7mfyY/N6FwCG5qJxSDCpUmcIv0ie2oy0TzPWq9TpTNeby7zYVcnO/hjIUY1S+KiDTrBLhAn0NBWw3RzA4FS9tujscTinOUTK4jm5RuG7eRMyrMycUVRl4olmgxkDHIPLi3YMMzJ+sdHgK3wHvkxUETOYCDA7iD+IEyAaEBnvLdFTEfyWE16WLId900E+O0wsvSmKqkynYPhUodScSGP6olImAAoQG5u4uTbiAM4+qz6hnyjAQZJnfxP/hQFvbjGc7kagT/PoYkYTnKgAACCgCGEAZ510gAwKCBwESIXtJYQs/2KrdjFqVbmEKLcMJyZ1sylrOYFH8zrQJifO77">>}}, <<"version">> => 1}}]}} in function aehttp_ws_test_utils:wait_for_msg/5 (/home/builder/aeternity/apps/aehttp/test/aehttp_ws_test_utils.erl, line 324) in call from aehttp_sc_SUITE:wait_for_channel_event_/3 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 4294) in call from aehttp_sc_SUITE:wait_for_channel_event_match/4 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 4268) in call from aehttp_sc_SUITE:channel_send_chan_open_infos/3 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 894) in call from 
aehttp_sc_SUITE:finish_sc_ws_open/2 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 842) in call from aehttp_sc_SUITE:sc_ws_open_/4 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 775) in call from aehttp_sc_SUITE:sc_ws_min_depth_is_modifiable/1 (/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl, line 3012) in call from test_server:ts_tc/3 (test_server.erl, line 1755) ``` From some log analysis, it seems as if the problem is that the channel is opened with `minimum_depth => 0`. This confuses the generic channel setup code, which has a finishing phase where, optionally, blocks are mined to ensure that the `create_tx` is actually included in a block, and minimum depth is reached. This is triggered for the test case in question, but the tx has already been included, and since `minimum_depth == 0`, minimum depth has also been reached and the associated info reports already delivered. In a failing run, the following could be seen from the test case output: ``` *** User 2020-08-25 07:23:33.929 *** aec_conductor:start_mining(#{}) (aeternity_dev1@localhost) -> ok *** User 2020-08-25 07:23:33.973 *** aec_conductor:stop_mining() (aeternity_dev1@localhost) -> ok *** CT Error Notification 2020-08-25 07:23:45.980 *** aehttp_ws_test_utils:wait_for_msg failed on line 324 Reason: timeout ``` From the stacktrace above, we can see that the test core is waiting for an `open` info msg (line 842). But scrolling up, we find those messages already delivered, although the test case code wasn't ready for them then. 
``` *** User 2020-08-25 07:23:33.908 *** No test registered for this event (Msg = #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.info">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"event">> => <<"open">>}}, <<"version">> => 1}) *** User 2020-08-25 07:23:33.909 *** [initiator] Received msg #{<<"jsonrpc">> => <<"2.0">>, <<"method">> => <<"channels.info">>, <<"params">> => #{<<"channel_id">> => <<"ch_21woyLUNVapgSZrHrbdTKohDG5Yad9xNw4wFsrLzf8sKFspZim">>, <<"data">> => #{<<"event">> => <<"open">>}}, <<"version">> => 1} ```

This bug was detected during the maintenance project, and causes intermittent failures in the CI. It should be fixed, and should not take more than 1-2 man-days.
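The failure mode described in the issue (an `open` message arrives before the test case has registered interest in it, so the later wait times out) can be illustrated with a small Python sketch. This is not how the suite is implemented, nor necessarily how the bug was fixed; `EventInbox` is a hypothetical class showing one common remedy, namely buffering early events so that a late waiter can still find them.

```python
class EventInbox:
    """Buffers incoming events so that waiting can start after delivery."""

    def __init__(self):
        self._buffer = []

    def deliver(self, event: dict) -> None:
        # Keep every event, even if nobody is waiting for it yet.
        self._buffer.append(event)

    def take(self, predicate):
        """Remove and return the first buffered event matching `predicate`.

        Returns None when nothing matches; a real test helper would block
        here with a timeout instead.
        """
        for index, event in enumerate(self._buffer):
            if predicate(event):
                return self._buffer.pop(index)
        return None
```

Without the buffer, an event that is delivered at 07:23:33.908 and first asked for a few milliseconds later is simply lost, which matches the "No test registered for this event" line in the log above.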

The following issues are broken-down tasks from the already approved issue #3194 (Relax restriction that channel cannot be used before min_depth):

State Channels: Inactivity timer in chain watcher

github.com/aeternity/aeternity

State Channels: Inactivity timer in chain watcher

opened 03:53PM - 01 Sep 20 UTC

uwiger

area/statechannels

High-level issue: #3194 [High-level discussion](https://github.com/aeternity/aeternity/wiki/Making-the-State-Channel-FSM-responsive-before-minimum-depth-confirmation) The idea is to be able to order a timer which triggers if a given event (e.g. any, or specific, channel change, for a given channel ID) doesn't occur within a given number of key-/microblocks. As a first step, this type of event could be requested by the client (perhaps also other types of chain watcher events).
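The timer idea above counts blocks rather than wall-clock time. A minimal Python sketch of that behaviour, assuming a one-shot timer and hypothetical callback names (`on_block`, `on_event`) that are not taken from the chain watcher code:

```python
class InactivityTimer:
    """Fires once if no watched event is seen for `max_blocks` blocks in a row."""

    def __init__(self, max_blocks: int):
        if max_blocks <= 0:
            raise ValueError("max_blocks must be positive")
        self.max_blocks = max_blocks
        self.blocks_since_event = 0
        self.fired = False

    def on_event(self) -> None:
        """A watched channel event occurred: restart the count."""
        self.blocks_since_event = 0

    def on_block(self) -> bool:
        """Count one new block; return True exactly when the timer fires."""
        if self.fired:
            return False
        self.blocks_since_event += 1
        if self.blocks_since_event >= self.max_blocks:
            self.fired = True
            return True
        return False
```

Whether key blocks, microblocks, or both are counted is left open here, as it is in the issue.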

State Channels: Client can ask FSM to quit waiting for minimum depth

State Channels: modifiable `minimum_depth` default

Hans Svensson @hanssv.chain

FATE cannot get blockhash of current generation

github.com/aeternity/aeternity

FATE cannot get blockhash of current generation

opened 07:49AM - 11 Nov 19 UTC

closed 08:48AM - 08 Oct 20 UTC

ThomasArts

breaking/consensus kind/improvement area/fate

## Expected Behavior

For Sophia contract:

```
entrypoint my_hash() = Chain.block_hash(Chain.block_height)
```

One expects a hash back, but instead `None` is returned.

The cause is in: https://github.com/aeternity/aeternity/blob/afef1aa92a1cd6a75f4037d8e2be540ed4114612/apps/aefate/src/aefa_fate_op.erl#L858

There is an off-by-one error `>=` should be `>`.

This is consensus breaking and has to be conditional w.r.t. next hard fork. If we change that, a request to relax 256 blocks in the past to something that reflects 24 hours or so.

Some discussion needed on how to deal with contracts that change semantics. Clearly, contracts created after the hardfork will have to respect the new semantics, but should we support different call outcomes for contracts created before hardfork, but called after hardfork? Do we need a general policy for this or is this a case-by-case discussion?

This is an outright bug that should be fixed.
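The off-by-one can be shown with a small Python model. This does not reproduce the Erlang code in `aefa_fate_op.erl`; the function `block_hash`, its `fixed` flag, and the sample hashes are hypothetical, and the look-back limit mentioned in the issue is deliberately left out.

```python
from typing import Optional


def block_hash(requested: int, current: int, hashes: dict, *, fixed: bool) -> Optional[str]:
    """Look up a block hash, rejecting heights that are considered too new.

    With `fixed=False` the comparison is `>=`, so the current height is
    rejected; with `fixed=True` it is `>`, so the current height is allowed.
    """
    too_new = requested > current if fixed else requested >= current
    if too_new or requested < 0:
        return None
    return hashes.get(requested)
```

Because the result of the same call changes between the two variants, the corrected comparison has to be gated on the hard fork, as the issue notes.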

AENS: Review and simplify pointers

Currently name pointers allow too much freedom for the user to be creative. This should be revisited.

Make inner transaction of PayingForTx non-valid

github.com/aeternity/aeternity

Make inner transaction of PayingForTx non-valid

opened 09:37AM - 03 Dec 19 UTC

closed 10:33AM - 05 Jan 21 UTC

hanssv

breaking/consensus kind/improvement area/core

The initial implementation has a fully fledged normally signed inner transaction. This means that it is possible to unwrap the PayingForTx and post the inner transaction on-chain. Since the inner transaction isn't intended to be used like this it would be nice to disallow it. The idea is to change what is signed for the inner transaction - the obvious idea is to drop the network-id from the signing schema but perhaps there are other ways as well.

This is a bug in the PayingForTx that would render it useless. The attack vector is described in the GitHub issue. This must be done before Iris release.

AENS: Increase the name expiry time

This is something that came up a few times in the forum already: name expiration was never decided by the public. The idea here is to allow the community to vote on when names should expire.

AENS: Fix bug in AENS.update signature check

This is a bug, it must be fixed.

Deprecate AEVM properly for Iris

This one is a technical debt, it should be resolved ASAP.

Dincho Todorov @dincho.chain

Dincho would be providing us with his DevOps skills so he is needed all over the tasks, really. When he is not overloaded with work, he will be cleaning the issues assigned to him:

Dimitar Ivanov @dimitar.chain

Sync: cleanup dead peers

This bug has been around for a long time now. This would be my priority task. There have been a few attempts to expose the bug; so far all of those exposed some issues but didn’t solve it. It is a black box issue and we do not know how much time and effort it would require to fix. It might take 2 weeks or over a month, and how long it takes will determine my availability for the rest of the tasks. A few more issues might be created from this one. I will need Dincho’s help here as well.

HTTP Websockets upgrade regression

github.com/aeternity/aeternity

HTTP Websockets upgrade regression

opened 11:19AM - 27 Jan 20 UTC

dincho

kind/bug area/statechannels area/api status/approved

## Expected Behavior

```
$ curl localhost:3014/channel -v
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 3014 (#0)
> GET /channel HTTP/1.1
> Host: localhost:3014
> User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
> Accept: */*
> Referer:
>
< HTTP/1.1 426 Upgrade Required
< connection: upgrade
< content-length: 0
< date: Mon, 27 Jan 2020 11:08:39 GMT
< server: Cowboy
< upgrade: websocket
<
* Connection #0 to host localhost left intact
* Closing connection 0
```

## Actual Behavior

```
$ curl localhost:3014/channel -v
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 3014 (#0)
> GET /channel HTTP/1.1
> Host: localhost:3014
> User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)
> Accept: */*
> Referer:
>
< HTTP/1.1 500 Internal Server Error
< content-length: 0
<
* Connection #0 to host localhost left intact
* Closing connection 0
```

## Steps to Reproduce the Problem

Install any 5.* version of the node (except rc1) and use the above commands.

## Logs, error output, etc.

*`aeternity.yaml` configuration file (formerly named `epoch.yaml`)*

```
websocket:
  channel:
    listen_address: 0.0.0.0
```

## Specifications

- Node Version: 5.*

This is working as expected in node versions 4.* and v5.0.0-rc.1

First appears in v5.0.0-rc.2

This bug is breaking some of the tools used by SRE and should be a low hanging fruit.
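The expected behaviour in the issue is that a plain HTTP request to the channel endpoint is answered with `426 Upgrade Required` rather than a `500`. A simplified Python sketch of that decision follows; `channel_endpoint_response` is a hypothetical function, header names are assumed to be lower-cased already, and a real WebSocket handshake involves more headers than shown here.

```python
def channel_endpoint_response(request_headers: dict):
    """Return (status, headers) for a request hitting the channel endpoint."""
    wants_websocket = request_headers.get("upgrade", "").lower() == "websocket"
    if wants_websocket:
        # Handshake details (key/accept headers) are omitted in this sketch.
        return 101, {"connection": "upgrade", "upgrade": "websocket"}
    # A plain HTTP client is told which protocol it must switch to.
    return 426, {
        "connection": "upgrade",
        "upgrade": "websocket",
        "content-length": "0",
    }
```

Tools that probe the endpoint with a plain GET can then treat `426` as "the listener is up", which is presumably what the SRE tooling relied on.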

Out of sync /status endpoint data

github.com/aeternity/aeternity

Out of sync /status endpoint data

opened 04:34PM - 18 Dec 19 UTC

closed 09:54AM - 05 Jan 21 UTC

dincho

kind/bug

## Expected Behavior ``` $ curl -s http://35.166.231.86:3013/v2/status | jq … { "difficulty": 21527826519820, "genesis_key_block_hash": "kh_pbtwgLrNu23k9PA6XCZnUbtsvEFeQGgavY4FS2do3QP8kcp2z", "listening": true, "network_id": "ae_mainnet", "node_revision": "c6c12b039971ebe9a367d76826c6acbbd966fa0d", "node_version": "5.2.0", "peer_connections": { "inbound": 110, "outbound": 20 }, "peer_count": 25238, "peer_pubkey": "pp_21DNLkjdBuoN7EajkK3ePfRMHbyMkhcuW5rJYBQsXNPDtu3v9n", "pending_transactions_count": 104, "protocols": [ { "effective_at_height": 161150, "version": 4 }, { "effective_at_height": 90800, "version": 3 }, { "effective_at_height": 47800, "version": 2 }, { "effective_at_height": 0, "version": 1 } ], "solutions": 0, "sync_progress": 100, "syncing": false, "top_block_height": 184607, "top_key_block_hash": "kh_2hTa446BBHBYKoodtnQxJmXzmophGjmF5P8gtwdeUx8Ji3aZvN" } ``` ## Actual Behavior ``` $ curl -s http://35.166.231.86:3013/v2/status | jq { "difficulty": 21527826519820, "genesis_key_block_hash": "kh_pbtwgLrNu23k9PA6XCZnUbtsvEFeQGgavY4FS2do3QP8kcp2z", "listening": true, "network_id": "ae_mainnet", "node_revision": "c6c12b039971ebe9a367d76826c6acbbd966fa0d", "node_version": "5.2.0", "peer_connections": { "inbound": 110, "outbound": 20 }, "peer_count": 25238, "peer_pubkey": "pp_21DNLkjdBuoN7EajkK3ePfRMHbyMkhcuW5rJYBQsXNPDtu3v9n", "pending_transactions_count": 104, "protocols": [ { "effective_at_height": 161150, "version": 4 }, { "effective_at_height": 90800, "version": 3 }, { "effective_at_height": 47800, "version": 2 }, { "effective_at_height": 0, "version": 1 } ], "solutions": 0, "sync_progress": 100, "syncing": true, "top_block_height": 184607, "top_key_block_hash": "kh_2hTa446BBHBYKoodtnQxJmXzmophGjmF5P8gtwdeUx8Ji3aZvN" } ``` Note that `sync_progress` and `syncing` fields are out of sync. ## Steps to Reproduce the Problem No, looks like the sync processes are sometimes stuck ## Logs, error output, etc. Nope ## Specifications See the status output above. 
Hardware not related probably.

This is a curious bug that points to a race condition in the code. The result is a confusing API that is hard to reason about.
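From the consumer side, the inconsistency reported in the issue can be detected with a simple check. The Python sketch below uses the field names from the `/status` output above; the function name `status_is_consistent` is hypothetical, and only the reported combination (100% progress while still `syncing`) is treated as inconsistent.

```python
def status_is_consistent(status: dict) -> bool:
    """Return False for the reported case: fully synced yet still `syncing`."""
    fully_synced = status["sync_progress"] >= 100
    return not (fully_synced and status["syncing"])
```

A monitoring script could use such a check to flag nodes whose sync processes appear stuck, without having to know the root cause.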

aec_chain_state infinity restarts and crashes

github.com/aeternity/aeternity

aec_chain_state infinity restarts and crashes

opened 09:54AM - 11 Dec 19 UTC

closed 11:27AM - 18 Dec 20 UTC

dincho

## Expected Behavior

Recover or crash the node

## Actual Behavior

The node starts spitting out an exponential number of errors (e.g. 10k/4h) without trying to recover. If recovery is not possible, it should then stop, as it is not operational at all.

## Steps to Reproduce the Problem

Unknown

## Logs, error output, etc.

```
2019-12-09 12:44:26.482 [error] <0.12513.30> Supervisor aec_conductor_sup had child aec_conductor started with aec_conductor:start_link() at <0.12610.30> exit with reason {aborted,{{found_already_calculated_state,<<104,122,94,58,45,227,152,23,188,69,0,106,35,191,113,115,133,113,37,25,201,170,116,99,65,78,0,151,68,239,164,19>>},[{aec_chain_state,update_state_tree,4,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_state.erl"},{line,702}]},{aec_chain_state,update_state_tree,2,[{file,"/home/builder/aeternity/apps/aecore/src/aec_chain_state.erl"},{line,693}]},{aec_chain_state,internal_insert_transaction,3,[{file,"/home/builder/aeternity/apps/aecore/src/a..."},...]},...]}} in context child_terminated
```

## Specifications

- Virtualization: AWS
- Hardware specs: t3.large
- OS: Ubuntu 16.04.5
- Node Version: 5.2.0
- Instance ID: i-0dc2fe355c1e42ab7

The error recovery mechanism seems to be broken. The issue is not marked as a bug, but it clearly is one. This could result in filling one’s HDD with garbage logs.

meta_tx’s TTL

github.com/aeternity/aeternity

meta_tx's TTL

opened 11:14AM - 03 Dec 19 UTC

closed 01:20PM - 02 Dec 20 UTC

velzevur

kind/bug breaking/consensus area/generalized_accounts

`aetx:ttl/1` specializes the inner tx and calls its callback's `ttl/1`. In the case of a channel co-authenticated transaction when both participants are GAs, that would result in two embedded meta transactions. Calling `aetx:ttl/1` on that would result in the outermost meta_tx's ttl. Instead `aetx:ttl/1` should return the innermost transaction's `ttl/1`.

This bug could produce unexpected results when using generalised accounts: the TTL being used is that of the outermost meta transaction authenticating the inner transaction, whereas it should be the TTL of the innermost transaction.
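The intended behaviour (unwrap every meta transaction and use the innermost TTL) can be sketched in Python. The `Tx` and `MetaTx` classes and the function `innermost_ttl` are hypothetical stand-ins for illustration and do not mirror the node's transaction records.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Tx:
    """A plain transaction carrying its own TTL."""
    ttl: int


@dataclass
class MetaTx:
    """A meta transaction wrapping another (possibly meta) transaction."""
    ttl: int
    inner: Union["MetaTx", Tx]


def innermost_ttl(tx: Union[MetaTx, Tx]) -> int:
    """Unwrap nested meta transactions and return the innermost TTL."""
    while isinstance(tx, MetaTx):
        tx = tx.inner
    return tx.ttl
```

For a channel transaction co-authenticated by two generalised accounts, there are two layers of wrapping, so reading the TTL from the outermost layer and from the innermost one can give different answers.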

Test suite bugs

aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED badmatch

github.com/aeternity/aeternity

aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED badmatch

opened 12:50PM - 26 Nov 19 UTC

tolbrino

kind/bug area/statechannels area/tests

## Expected Behavior Tests pass. ## Actual Behavior ``` %%% aest_chann…els_SUITE ==> test_simple_different_nodes_channel: FAILED %%% aest_channels_SUITE ==> {{badmatch,{ok,#{<<"info">> => <<"close_mutual">>, <<"tx">> => <<"tx_+OkLAfiEuEBlz0bOgvRRn2S4RBOecxdlkIUQFjzgA8MBVYURh8aXIo0siqCiUmslCJkGDq1DrxRf5w79kPXtPhpSPFAmlKcDuECsGIAKMPPE3i0gLwhTE91HQZzILSU+IGPpR9kWNIgncSBH0tPmRpnpUKU8SvzXondpx57nLVU91ORxqz5YNQUCuF/4XTUBoQbFNveodt+B570IC8UcdMjjekwZxecIXKZESd7ONTxTy6EBZxxVRkZJRXWytJT2UWghcQZj2EiTzdLSNgN6VMM+7oSGJFyRsrf+hiRVlY8MAgCGEjCc5UAAA27NY1I=">>, <<"type">> => <<"channel_close_mutual_tx">>}}}, [{aest_api,sc_close_mutual,2, [{file,"/home/circleci/aeternity/_build/system_test+test/extras/system_test/common/helpers/aest_api.erl"}, {line,176}]}, {aest_channels_SUITE,simple_channel_test,4, [{file,"/home/circleci/aeternity/_build/system_test+test/extras/system_test/common/aest_channels_SUITE.erl"}, {line,205}]}, {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1755}]}, {test_server,run_test_case_eval1,6,[{file,"test_server.erl"},{line,1262}]}, {test_server,run_test_case_eval,9,[{file,"test_server.erl"},{line,1194}]}]} . ``` ## Steps to Reproduce the Problem None atm. ## Logs, error output, etc. https://circleci.com/gh/aeternity/aeternity/100969

aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED timeout

github.com/aeternity/aeternity

opened 09:23AM - 31 Oct 19 UTC

tolbrino

kind/bug area/tests

## Expected Behavior

Tests pass.

## Actual Behavior

```
%%% aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED
%%% aehttp_sc_SUITE ==> {{timeout,{messages,[{<0.14644.0>,websocket_event,channel,conflict,
 #{<<"jsonrpc">> => <<"2.0">>,
   <<"method">> => <<"channels.conflict">>,
   <<"params">> => #{<<"channel_id">> => <<"ch_2mYFVhAbGMgwQPyPuHS9prC7WmKLMFyy89cqqtRR6CZ9kjcYDA">>,
     <<"data">> => #{<<"channel_id">> => <<"ch_2mYFVhAbGMgwQPyPuHS9prC7WmKLMFyy89cqqtRR6CZ9kjcYDA">>,
       <<"error_code">> => 2,
       <<"error_msg">> => <<"conflict">>,
       <<"round">> => 5}},
   <<"version">> => 1}}]}},
 [{aehttp_ws_test_utils,wait_for_msg,5,
   [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_ws_test_utils.erl"},
    {line,316}]},
  {aehttp_sc_SUITE,wait_for_channel_event_,3,
   [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"},
    {line,3676}]},
  {aehttp_sc_SUITE,wait_for_channel_event_match,4,
   [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"},
    {line,3641}]},
  {aehttp_sc_SUITE,channel_abort_sign_tx,4,
   [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"},
    {line,509}]},
  {aehttp_sc_SUITE,sc_ws_update_abort,1,
   [{file,"/home/builder/aeternity/apps/aehttp/test/aehttp_sc_SUITE.erl"},
    {line,3096}]},
  {test_server,ts_tc,3,[{file,"test_server.erl"},{line,1755}]},
  {test_server,run_test_case_eval1,6,[{file,"test_server.erl"},{line,1262}]},
  {test_server,run_test_case_eval,9,[{file,"test_server.erl"},{line,1194}]}]}
```

## Steps to Reproduce the Problem

Can't be reliably reproduced yet.

## Logs, error output, etc.

https://circleci.com/gh/aeternity/aeternity/95397#tests/containers/2

Those are bugs in the test setup.
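The second trace has the shape `{timeout, {messages, [...]}}`, which suggests the wait helper gave up while a `channels.conflict` event was the only thing it had received. The following Python sketch illustrates that general pattern with an in-memory queue. It is an assumption-laden illustration, not the code of `aehttp_ws_test_utils`: `wait_for_event` and `WaitTimeout` are hypothetical names.

```python
import queue
import time


class WaitTimeout(Exception):
    """Raised when the expected event does not arrive; carries what did arrive."""

    def __init__(self, seen):
        super().__init__(f"timeout, messages seen: {seen}")
        self.seen = seen


def wait_for_event(inbox, expected, timeout=0.1):
    """Block until `expected` shows up in `inbox`, or raise WaitTimeout."""
    seen = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise WaitTimeout(seen)
        try:
            msg = inbox.get(timeout=remaining)
        except queue.Empty:
            raise WaitTimeout(seen) from None
        if msg == expected:
            return msg
        seen.append(msg)  # keep unexpected messages for the error report
```

Reporting the unexpected messages alongside the timeout is what makes a failure like the one above diagnosable: the test was waiting for one event while a different one had arrived.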

Drop “native” windows support

Bring the discussion to the forum: ask whether the community needs the Windows build and, if not, deprecate it.

gorbak25 · September 10, 2020, 10:46am

Hi!

I’m speaking as the current lead of the Hyperchain project. I want to emphasize the importance and priority of the maintenance project. It’s not about introducing new features but about keeping the AE ecosystem alive. Currently Aeternity is not only developing new cutting edge products like Hyperchains or Superhero but is also a service provider - SDK, Middleware, Seed Nodes, DB snapshots, Monitoring etc… This proposal is in simple terms “Hey, we need to keep our Core Infrastructure Alive, have someone ready who can fix something in case of an emergency and fix existing bugs”
CC: @Lydia @Tina @YaniUnchained

If the 2 month extension is not approved (possibly THIS week; a simple “Hey, please work on this while we handle the bureaucracy” will be enough), then my team will need to do a lot of those tasks in the scope of Hyperchains in order to release a finished product, which will extend the ETA for releasing Hyperchains by possibly months. What I would really like to see done (which can be labelled as General Node Maintenance) before releasing HC is:

Rocksdb upgrade -> performance will increase and the Q/A process will be sped up, which will save us a lot of time
Drop windows support -> I don’t think anybody is using that, will speed up Q/A
Transient failures in the SC test suite -> those tests slow us down due to the possibility of rerunning the entire Q/A process
Sync: cleanup dead peers -> This needs to be done because currently we practically never evict dead peers from the peer pool and we only have 1% of active peers here -> this essentially makes the AE network centralized and unsafe…
Sync: fast sync -> Sync can take weeks… We can speed up things by compromising security slightly - this would allow us to drop the centralized DB backup service…
Sync: peer persistence -> If you restart the node then you need to sync the peer pool again, which essentially opens you up to eclipse attacks; on the other hand, because only 1% of the peers in the pool are actually active, this would essentially mean that after a restart it would be impossible to sync…
Deprecate AEVM -> it clogs up the codebase and should never be used in HC as we have the FATE VM
Make inner transaction of PayingForTx non-valid -> This needs to be fixed as this bug will propagate to all Hyperchains
FATE cannot get blockhash of current generation -> This decreases usefulness of Sophia smart contracts
Crash in aec_chain_metrics_probe
Dev mode -> Actually we started implementing more or less this because otherwise we are unable to test HC properly - currently @radrow.chain is refactoring the SC chain simulator to allow it to be used in the scope of HC
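As an illustration of the "cleanup dead peers" item in the list above, here is a toy Python model of a peer pool that evicts a peer after a number of consecutive failed connection attempts. This is only a sketch of one possible eviction policy: `PeerPool`, its methods and the `max_failures` threshold are hypothetical and do not describe the node's actual sync code or settings.

```python
from dataclasses import dataclass, field


@dataclass
class PeerPool:
    """Toy peer pool that evicts peers after repeated failed connections."""
    max_failures: int = 3  # assumed threshold, not a node setting
    failures: dict = field(default_factory=dict)  # peer id -> consecutive failures

    def add(self, peer):
        self.failures.setdefault(peer, 0)

    def report_success(self, peer):
        # A successful connection resets the failure counter.
        if peer in self.failures:
            self.failures[peer] = 0

    def report_failure(self, peer):
        if peer not in self.failures:
            return
        self.failures[peer] += 1
        if self.failures[peer] >= self.max_failures:
            del self.failures[peer]  # evict the dead peer

    def peers(self):
        return sorted(self.failures)
```

Counting consecutive failures rather than total failures keeps a flaky but live peer in the pool, while a peer that never answers is eventually dropped.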

There are other issues which the HC team could tackle, but they can be postponed for later (not necessary for the MVP or HC). Keep in mind that any bug in the Node will propagate to Hyperchains and it will be hard to fix them later in Hyperchains, as we have no control over each individual hyperchain.

Best Regards,
Grzegorz

dimitar.chain · September 10, 2020, 11:04am

Although these have been discussed many times already, they are not even tracked as issues.

radrow.chain · September 10, 2020, 11:11am

I totally support all the proposed tasks. They are all very valuable, and some of them are completely necessary to me (like dev mode, although I am working on something similar at the moment, rocksdb update and fast sync, not to mention bugfixes).

A healthy ecosystem is crucial for all of the development we are doing here – not only limited to Hyperchains or Superhero. Writing more serious things requires more serious testing and a more flexible (and bug-free) environment. While I was working on the staking contract I really felt some of these issues being a chain on my feet – especially the testing part. We get really distracted by situations when something fails in the network and we have to discuss what the maintenance team is allowed to fix and what it is not. In my opinion, some emergency maintenance budget should be set as well. It is very important to speed up the approval process, as it is mostly work that is required to do other tasks. The HC team has its own things to do and won’t be able to handle all the issues mentioned here while keeping a reasonable delivery time. And especially, we can’t just ignore them because we don’t want them to propagate into HCs (like for example AEVM support).

From my side as an iris target I would also add Create contracts from other contracts · Issue #197 · aeternity/aesophia · GitHub – this would have a huge impact on aepps development and would drastically increase reliability of the repetitive smart contract models (like bonding curve tokens or hyperchains staking contracts).

This is not an iris target (because it doesn’t need a hard fork), but will be priceless during further smart contract development: FATE debugger · Issue #201 · aeternity/aesophia · GitHub.

marco.chain · September 10, 2020, 12:39pm

I’d love to see those tasks being approved. very nice to see increasing activity of the core team in the forum!

we (kryptokrauts) need the iris hard fork as soon as possible to be able to introduce cool features in regards to the naming system (e.g. name extender, name bazaar)

hanssv.chain · September 10, 2020, 2:41pm

Just so you don’t misunderstand the “Deprecate AEVM” task, for Hyperchains you can remove AEVM fully, but the Aeternity core node has to keep it. But there won’t be any new AEVM contracts allowed on chain.

uwiger · September 11, 2020, 8:22pm

I have pushed a WIP (Work In Progress) PR for exposing chain events from contract calls.
There are still some issues, e.g. when returning events to the HTTP client.

There may also be some event needed at contract setup (see @hanssv.chain comments).
If the maintenance project is extended, I can continue next week.

Ulf Wiger @uwiger

Update rocksdb to 6.4.6

When syncing from backup, accept previous states in DB if they don’t differ

Rest API endpoints version prefix

Dev mode

Data and log locations should be configurable from other location

Unhandled error in aec_chain_metrics_probe

More flexible/file-less configuration

Allow configuration by OS environment variables

aehttp_sc_SUITE failure: timeout waiting for channel open messages

State Channels: Inactivity timer in chain watcher

State Channels: Client can ask FSM to quit waiting for minimum depth

State Channels: modifiable minimum_depth default

Hans Svenson @hanssv.chain

FATE cannot get blockhash of current generation

AENS: Review and simplify pointers

Make inner transaction of PayingForTx non-valid

AENS: Increase the name expiry time

AENS: Fix bug in AENS.update signature check

Deprecate AEVM properly for Iris

Dincho Todorov @dincho.chain

Dimitar Ivanov @dimitar.chain

Sync: cleanup dead peers

HTTP Websockets upgrade regression

Out of sync /status endpoint data

aec_chain_state infinity restarts and crashes

meta_tx’s TTL

Test suite bugs

aest_channels_SUITE ==> test_simple_different_nodes_channel: FAILED badmatch

aehttp_sc_SUITE ==> plain.with_open_channel.sc_ws_update_abort: FAILED timeout

Drop “native” windows support

aehttp_sc_SUITE failure: timeout waiting for channel `open` messages

State Channels: modifiable `minimum_depth` default