[Completed] Aeternity node maintenance - iris hard fork release candidate

Our progress for the last week:

Ulf Wiger @uwiger

Ulf did some more work on a database rollback script (and improved extension script support), then paused it for the time being. He clarified the documentation on the new fork resistance and then resumed work on an improved rocksdb mnesia backend, which should significantly increase the robustness (and perhaps speed) of the database. Time spent: 38.2 hours

Dimitar Ivanov @dimitar.chain

Dimitar was providing the infrastructure for the new OAS3. This included a bunch of PRs to our swagger generator in order to teach it how to parse the new specs. Time spent: 39 hours

Dincho Todorov @dincho.chain

Dimitar hit a rebar plugin issue, and Dincho helped him resolve it. Time spent: 3.75 hours.

Hans did not have time to contribute to the project.


Our progress for the past week:

Ulf Wiger @uwiger

Ulf spent some time moving the refactored rocksdb backend along, then switched over to implementing a way to break out of waiting for minimum_depth confirmation when opening a state channel. This would allow for a state channel to become usable immediately, while still letting you keep track of confirmations. While the solution seems to work at first glance, adapting the test suite for it is a bit of a challenge. This work is still underway. Time spent: 37 hours

Dimitar Ivanov @dimitar.chain

Dimitar improved the swagger generator further. Since it was decided that we will support both the old and the new APIs, to give the community all the time they need to upgrade their SDKs and software, Dimitar provided infrastructure for supporting 2 different APIs. Then he started improving the specification to take advantage of OAS3's perks. Time spent: 42.5 hours

Hans and Dincho didn’t contribute last week.


Our progress for the last week:

Ulf Wiger @uwiger

Ulf worked on a way to short-cut the wait for minimum depth confirmation in state channels. This feature is now in a PR awaiting review. One of the main challenges was to find a way to avoid having the new functionality wreck the FSM test suite, which explicitly verifies the message sequences - the new feature allows State Channel clients to tell the FSM to proceed without waiting for the confirmation message, but the message will eventually arrive, and will then be reported. Time spent: 30 hours.
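To make the test suite challenge concrete, here is a toy Python model of the behaviour described above. It is only a sketch: the class, state names and message names are hypothetical and are not taken from the actual State Channel FSM. It shows how letting the client proceed early changes the order of reported messages, which is exactly what a sequence-verifying test suite is sensitive to.

```python
class ToyChannelFsm:
    """Toy model only; state and message names are hypothetical."""

    def __init__(self):
        self.state = "awaiting_min_depth"
        self.reports = []  # messages reported to the client, in order

    def proceed_without_confirmation(self):
        # Client opts in: the channel becomes usable right away.
        if self.state == "awaiting_min_depth":
            self.state = "open"
            self.reports.append("open")

    def on_min_depth_achieved(self):
        # The confirmation is always reported, even if it arrives late.
        self.reports.append("min_depth_achieved")
        if self.state == "awaiting_min_depth":
            self.state = "open"
            self.reports.append("open")


# Default flow: confirmation first, then the channel opens.
default = ToyChannelFsm()
default.on_min_depth_achieved()

# Short-cut flow: the channel opens first, the confirmation is reported later.
fast = ToyChannelFsm()
fast.proceed_without_confirmation()
fast.on_min_depth_achieved()

print(default.reports)  # ['min_depth_achieved', 'open']
print(fast.reports)     # ['open', 'min_depth_achieved']
```

Both flows end in the same state, but the reported sequences differ, so tests that assert on exact message order need to know which flow they are exercising.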

Dimitar Ivanov @dimitar.chain

Dimitar implemented the new OAS3 specification that lives under /v3/ of the external HTTP API. He also implemented the functionality of providing all integers as strings when a specific flag is provided. This is to solve precision issues on the JS end. This functionality is available only in the new HTTP API. Time spent: 41.5 hours.
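As an illustration of the precision issue the integers-as-strings flag addresses: JavaScript numbers are IEEE 754 doubles, which represent integers exactly only up to 2^53 - 1. The Python sketch below simulates a double-based client with `float`. The amount and the field name `balance` are made up for illustration and are not taken from the node's API.

```python
import json

# Largest integer a JavaScript Number (IEEE 754 double) represents exactly.
MAX_SAFE_INTEGER = 2**53 - 1


def as_js_number(n):
    """Simulate a JS client that stores an integer in a double."""
    return int(float(n))


# Hypothetical amount, well above 2**53.
balance = 10**18 + 1

# Integer sent as a JSON number: a double-based client silently rounds it.
as_number = json.loads(json.dumps({"balance": balance}))
lossy = as_js_number(as_number["balance"])

# Integer sent as a JSON string: the client can parse it with full precision.
as_string = json.loads(json.dumps({"balance": str(balance)}))
exact = int(as_string["balance"])

print(balance > MAX_SAFE_INTEGER)  # True
print(lossy == balance)            # False
print(exact == balance)            # True
```

On the JS side the string form would typically be parsed into a big-integer type rather than a plain number.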

Hans Svensson @hanssv.chain

Hans published a PR for AEVM deprecation. Time spent: 0 hours :slight_smile:

Dincho Todorov @dincho.chain

Dincho updated the docker builder in CI. While there, he also sped up the docker builds. Time spent: 7 hours


Our progress for the previous week:

Ulf Wiger @uwiger

Ulf worked on bug fixing, the refactoring of the database backend, and finishing the State Channel support for deferring minimum depth confirmation. Time spent: 35 hours

Hans Svensson @hanssv.chain

Tomas, Ulf Norell and Hans joined a call brainstorming a fix for the FATE v1 and FATE v2 interoperability. Time spent: 1.5 hours each

Dincho Todorov @dincho.chain

Dincho was working on 5.9.1 release and deployment. Time spent: 7.5 hours

Dimitar Ivanov

Dimitar finalised the all-integers-as-strings HTTP API functionality. He also helped produce the release and fixed a couple of intermittent test failures that were blocking it. He finalised the functionality allowing delegates to slash, and another one allowing delegates to force progress (this was merged only this week, though). He picked up the task for the new transaction type for setting delegates. Time spent: 55 hours


OK, a bit late, so here is our progress for the past 2 weeks:

Week 10 (8-14 March)

Ulf Wiger (@uwiger)

Ulf worked on the database rollback script (a companion to the whitelisting script), but found that for it to work really well, we need support for putting the node in “maintenance mode”, where external interfaces and mining are disabled. He then started working on this. Time spent: 40 hours

Dincho Todorov (@dincho.chain)

Dincho healed broken nodes. He also fixed the DB backup. Time spent: ~5 hours

Dimitar Ivanov (@dimitar.chain)

Dimitar was working on a new transaction type: set delegates. Since there were reports of a potential attack, he did some investigations there. No attack was identified, but then reports of missing transactions started coming in, which were properly investigated as well. No bug was identified. Time spent: 36 hours

Week 11 (15-21 March)

Ulf Wiger (@uwiger)

Ulf finished implementation and documentation of the State Channel support for deferring minimum-depth waits, so that channels can be used for off-chain transactions sooner. He also continued working on the improved database backend and on the ability to start (or transition) the node into “maintenance mode”. Time spent: 34 hours

Dincho Todorov (@dincho.chain)

Dincho produced 5.10.0, but this was a bumpy ride. The battle-tested release production script failed, so Dincho had to debug and fix it. Eventually the problem turned out to be an actual bug in GitHub's ecosystem, but this was confirmed only this week. Time spent: 13 hours

Dimitar Ivanov (@dimitar.chain)

Some further analysis of the missing transactions was done. Dimitar also improved the release process in order to make it more automated. He also fixed some issues in the OAS3 specification. Time spent: 25.5 hours


Hi all,

A lot of stuff is happening! We’re in the last phases of preparations for the iris release. There has been a lot of testing and improvements, but most importantly Hans and Ulf Norell are back!

Here are our reports:

Week 12 (22 March - 28 March)

Ulf Wiger (@uwiger)

Ulf has been working interchangeably on maintenance mode support, database plugin refactoring and inactivity timers for State Channels. While working on the maintenance mode support, he identified a group of bugs in the OTP application controller (luckily not affecting normally used features, although they preclude an elegant solution for maintenance mode). These are still being discussed. Ulf also fixed a few intermittent CI test failures. Time spent: 38 hrs.

Dimitar Ivanov (@dimitar.chain)

Dimitar finished the OAS3 specification fixes. He was also debugging some inconsistencies in the transaction pool and eventually started working on a new feature to allow GCed transactions to re-enter the pool. Until now there was a restriction that those txs cannot enter the pool again once they are garbage collected. He also spent some time debugging an issue detected by the JS SDK tests: certain APIs were behaving strangely, which turned out to be due to missing setup in the JS SDK tests themselves. Time spent: 41h

Dincho Todorov (@dincho.chain)

Dincho prepared the 5.10.1 release, improved the snapshot playbook and integrated Slack notifications to Datadog. He helped debug a sync-related issue, improved the compression of snapshots and split backup nodes. Time spent: 39 hours

Week 13 (29 March - 4 April)

Ulf Wiger (@uwiger)

Ulf has been working interchangeably on maintenance mode support, database plugin refactoring and inactivity timers for State Channels. While working on the database plugin, Ulf identified a bug in mnesia, for which he submitted a fix which is now merged into OTP master. Time spent: 40 hrs.

Hans Svensson (@hanssv.chain)

Hans was profiling the sync speeds. He was also modelling FATE1 and FATE2 interoperability, worked on AEVM deprecation and participated in the stuck nodes discussions. Time spent: 16 hours

Dimitar Ivanov (@dimitar.chain)

Dimitar finished the re-entry of GCed transactions in the mempool. He fixed the swagger2 definition and provided the missing API specification for paying_for_tx. He worked on a newly reported issue of nodes not providing contract_call objects at already GCed heights. Then he spent some time investigating the stuck nodes and how to resume them, especially the MDW node. Time spent: 45.75 hours

Dincho Todorov (@dincho.chain)

Dincho was debugging sync speeds and how they can be improved. He also improved the infrastructure, migrated some nodes to new backups, and participated in some calls regarding stuck nodes. Time spent: 35 hours

Week 14 (5 April - 11 April)

Ulf Wiger (@uwiger)

Ulf primarily worked on refactoring the database plugin. This work should perhaps rather be characterized as a rewrite. Even so, a new structure is in place and debugging aided by property-based testing is underway. Ulf also assisted in analyzing issues that were uncovered in other areas, and preparing for (and participating in) the Iris AMA. Time spent: 39 hours.

Hans Svensson (@hanssv.chain)

Hans was working on AEVM deprecation, fixing some newly found FATE issues that we need for iris. He also participated in the AMA. Time spent: 22 hours

Ulf Norell (@ulfnorell)

Ulf was working on the same FATE improvements as Hans did. Time spent: 9 hours

Dimitar Ivanov (@dimitar.chain)

Dimitar resolved the issue where contract call data cannot be fetched at an already GCed height when GC is enabled. He also spent some more time on the issue of the stuck nodes and double-checked that a DB snapshot indeed solves the issue locally. He also prepared a presentation for a Superhero talk (the talk itself was 2 hours long, people are interested in state channels). He also prepared a presentation and participated in the AMA session. Time spent: 49 hours

Dincho Todorov (@dincho.chain)

Dincho updated the downloads page. He completed the backup nodes migration. He also researched apt and set up a repo. He cleaned up old backups and improved the terraform setup. Time spent: 32.5 hours


Excited to know the GODS are back :blush: from Hypersign team

$AE to the moon now for sure :slight_smile:


I don’t know how it fits into the plan. but I think we should have a guidance how to build a production release for node including plugins.

currently there only exists the middleware plugin but I guess in the future we will see more plugins and to build a docker image I use the following Dockerfile:

I don’t like following things:

  • folder structure (I don’t 100% understand why this is done like this and didn’t have time to figure out how to organize the plugin stuff better)
  • image size
  • using root user

unfortunately I am not that familiar with building Elixir/Erlang applications. it would be nice to have a general approach for including plugins in a production build.

probably @dincho.chain is the right guy to tackle this


The simplest answer to why the plugin works like it does is that it was by far the easiest way to get off the ground.

I presented it on 30 Jan 2020 as “a first, rough prototype”, and as you can tell from the list of closed PRs, it required only minor extra work in order to support the MDW.
https://twitter.com/uwiger/status/1222845169785065473

One reason for accepting this level initially was that the MDW was essentially an “in-house” project, and we figured that the experience from writing the MDW as a “plugin” could inform us on how to make a more ambitious plugin architecture.


Hello,

in general I’m not really involved into the elixir middleware.

Few notes that could shed some light and are my personal opinion if I would need to operate such a service:

  • even though it’s called a plugin, IMO it’s not. It’s more like an enabler for other projects to source the node as a library/plugin.
  • the elixir middleware does not build & release packages, which additionally complicates the deployment
  • I see that the middleware has a Dockerfile (ae_mdw/Dockerfile at master · aeternity/ae_mdw · GitHub); not sure if it works though, and unfortunately I don’t see automatic builds of that image
  • I’m not sure what your issue with the root user is; the node definitely can run without the root user, it was even a requirement for some time. This is how the official image is configured: aeternity/Dockerfile at master · aeternity/aeternity · GitHub

this is also my understanding after looking a bit deeper into that topic. but it would be great if we had a generalized approach that allows to build independent projects like the middleware combined with the node. at least if this makes sense - personally I would welcome it :smiley:

who could tackle this? that would be related to the “generalized approach” to build projects that extend the node (mentioned above), right?

it works, I modified it a bit in the project mentioned above where a script downloads the node and middleware with a given that and builds it according to this (slightly modified) Dockerfile.

I had some problems with “mix” (or some dependencies) trying to run without a root user in the Docker setup mentioned above when I remember right. as I am not really familiar with mix I decided not to spend too much of my time into that topic :smiley:


@dimitar.chain Please report on the work and the time spent by the core team starting from the Week 15, 2021.


Ok, a long due update :slight_smile: I will post some more tomorrow

Week 15 (12 April - 18 April)

Ulf Wiger (@uwiger)

Ulf is doing a major refactoring in the RocksDB mnesia plugin. A total of 33 hours spent.

Hans Svensson (@hanssv.chain)

Hans measured the costs for big terms in FATE, he calibrated profiles and measured unfold method. He also measured gas traversal. Total time spent: 21 hours

Dimitar Ivanov (@dimitar.chain)

Dimitar was fixing intermittent test failures. He also produced an RC for iris and planted the versioning for the next hard fork, ceres. He fixed some failing system tests as well. A total of 34 hours.

Dincho Todorov (@dincho.chain)

Dincho was improving the brew process. Total of 27.5 hours

Ulf Norell (@ulfnorell)

Ulf was working on gas cost and measurements and gas charging refactoring. A total of 7 hours.

Week 16 (19 April - 25 April)

Ulf Wiger (@uwiger)

Ulf is doing a major refactoring in the RocksDB mnesia plugin. A total of 41 hours spent.

Hans Svensson (@hanssv.chain)

Hans fixed some FATEv2 internals. A total of 19 hours.

Ulf Norell (@ulfnorell)

Ulf was working on gas cost and measurements and gas charging refactoring. A total of 1 hour.

Dimitar Ivanov (@dimitar.chain)

Dimitar started working on AeCanary: inited the project, implemented authentication and authorisation. A pool of HTTP workers to talk to the MDW is also implemented. A total of 44 hours.

Dincho Todorov (@dincho.chain)

Dincho improved the automatic package publishing. Total of 16 hours.

Week 17 (26 April - 2 May)

Ulf Wiger (@uwiger)

Ulf is doing a major refactoring in the RocksDB mnesia plugin. A total of 41 hours spent.

Hans Svensson (@hanssv.chain)

Hans fixed some more FATEv2 internals. Recalculated gas costs for BLOCKHASH, GAS, CALL_R primops. A total of 27 hours.

Dimitar Ivanov (@dimitar.chain)

Dimitar worked on AeCanary: exchanges exposure. This allows us to trace if an exchange is at risk of an attack or not. Some work on tainted accounts was done as well. Dimitar had done some preparations for the hard fork and another RC for iris. A total of 35 hours.

Dincho Todorov (@dincho.chain)

Dincho improved the HTTP caches tests, improved docs and CI

Ulf Norell (@ulfnorell)

Ulf was working on gas cost and measurements and gas charging refactoring. A total of 4 hours.

Week 18 (3 May - 9 May)

Ulf Wiger (@uwiger)

Ulf is doing a major refactoring in the RocksDB mnesia plugin. A total of 40.5 hours spent.

Dimitar Ivanov (@dimitar.chain)

Dimitar worked on AeCanary: some work on tainted accounts. The transactions are now cached so requests are much faster. A total of 23 hours.

Hans Svensson (@hanssv.chain)

Hans released compiler version 5.0.0. A total of 2 hours.

Week 19 (10 May - 16 May)

Ulf Wiger (@uwiger)

Ulf is doing a major refactoring in the RocksDB mnesia plugin. A total of 41 hours spent.

Dimitar Ivanov (@dimitar.chain)

Dimitar worked on AeCanary: exchanges data structures are revisited. A new UI is added. Dimitar fixed a GetTransactionByHash in the RC. A total of 43.5 hours.

Dincho Todorov (@dincho.chain)

Dincho updated testnet nodes. He resolved fork issues there. He configured V3 of the HTTP API, released and deployed 6.0.0-rc2. A total of 31.5 hours spent.

Ulf Norell (@ulfnorell)

Ulf was reviewing contract create PRs. A total of 8 hours.

Week 20 (17 May - 23 May)

Ulf Wiger (@uwiger)

Ulf worked on the mnesia_rocksdb plugin and started looking into building the Aeternity source code on the upcoming OTP 24. He also addressed some issues with minimum-depth parameters in State Channels and documented some existing, but previously undocumented, behavior. Ulf also participated in a fireside chat on code design at the Code BEAM V 2021 conference, together with Elixir guru Eric Meadows-Jönsson. A total of 43 hours.

Dimitar Ivanov (@dimitar.chain)

Dimitar implemented a new HTTP endpoint for external dry-running of contract calls. This comes with some limits to keep the node safe. Dimitar implemented some alerts for exchanges. He introduced docker builds and improved the docs and the public dashboard. Dimitar produced RC3 for iris. A total of 39.5 hours.

Dincho Todorov (@dincho.chain)

Dincho updated main net nodes, fixed the HTTP cache tests and updated secrets. A total of 22 hours.

Week 21 (24 May - 30 May)

Ulf Wiger (@uwiger)

Ulf worked on the mnesia_rocksdb plugin. A total of 37 hours.

Dimitar Ivanov (@dimitar.chain)

Dimitar started the week by improving the public dashboard of the AeCanary tool. Then he added custom settings for obscuring the alerts. Then he implemented statistical outlier detection. A total of 44 hours.
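The report does not say which outlier detection method AeCanary uses. As a rough illustration of the general idea only, here is a sketch of one common approach, the modified z-score based on the median absolute deviation (MAD). The function name, the threshold of 3.5 and the sample volumes are assumptions made for this example, not details of AeCanary.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Illustrative only: this is one common technique, not necessarily
    the one implemented in AeCanary.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    # 0.6745 scales the MAD so it is comparable to a standard deviation.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical daily transfer volumes with one suspicious spike.
volumes = [10, 12, 11, 13, 12, 11, 500]
print(mad_outliers(volumes))  # [500]
```

The median-based score is used here instead of a plain mean and standard deviation because a single large spike inflates the standard deviation and can hide itself.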

Dincho Todorov (@dincho.chain)

Dincho updated infrastructure deps. A total of 14 hours.

Week 22 (31 May - 6 June)

Ulf Wiger (@uwiger)

As part of preparations to build on OTP 24, Ulf got some changes merged into the upstream lager repos, and we now run on the latest lager. Some minor bugs in OTP 23 and 24 were reported, have already been fixed by the OTP team, and will soon be made available. The Ae source is now mostly adapted to the new mnesia_rocksdb backend, but a new rocksdb release is needed for testing to proceed. A total of 35 hours.

Dimitar Ivanov (@dimitar.chain)

Dimitar was investigating the tx pool and started to work on an improvement to fix it. A total of 24 hours.

Dincho Todorov (@dincho.chain)

Dincho updated TF modules. A total of 7 hours spent.


A follow-up update can be found below. Note that we have 2 new team members in the core team.

Week 23 (7 June - 13 June)

Dimitar Ivanov (dimitar.chain)

Dimitar was tracing an issue in the transaction pool. Fixing it required some refactoring. A total of 44 hours.

Ulf Wiger (uwigeroferlang.chain)

Ulf worked on automating updates of the State Channel message sequence logs in the documentation repository. He also worked on the database backend rewrite. Time spent: 35 hours

earlyriser99

earlyriser99 worked on rewriting the TypeScript fork detector in Elixir in preparation for adding it to AeCanary, and performed some initial work on email sending from AeCanary. Time spent: 25 hours

zxq9

Craig worked on bootstrapping, configuration of nodes, and Rosetta standards. Time spent: 17 hours

Dincho Todorov (@dincho.chain)

Dincho added TLS support for AeCanary. He also did some onboarding of new members. Time spent: 6 hours.

Week 24 (14 June - 20 June)

Hans Svensson (@hanssv.chain)

Hans fixed a bug in the FATE VM as part of an emergency we had. He also fixed some issues in the Sophia compiler. A total of 9 hours.

Dimitar Ivanov (dimitar.chain)

Dimitar was still refactoring the transaction pool to fix the empty generations issue. The issue was eventually reproduced on main net, which led to some emergency fixing and the release of 6.1.0. After some talks with external dev teams, the need for a new endpoint was identified: one that returns the next nonce for an account. Work in that direction has started. A total of 50 hours.

Ulf Wiger (uwigeroferlang.chain)

Ulf supported the above mentioned emergency fix, continued working on the automatic log updates, and restarted work on the support for maintenance mode. Time spent: 34.5 hours

earlyriser99

earlyriser99 worked on adding email sending from AeCanary and setting up and configuring an online email service (chosen supplier Mailgun). A total of 28 hours

zxq9

Craig worked on more Rosetta implementation requirements research (we need runtime modes within the nodes – oh no!), a JSON schema checker, a rudimentary GUI builder/launcher for nodes, and wallet evaluation. Time spent: 24 hours

Dincho Todorov (@dincho.chain)

Dincho provided some mail support for AeCanary. He also deployed 6.1.0 version of the node. Time spent: 4 hours

Week 25 (21 June - 27 June)

Dimitar Ivanov (dimitar.chain)

We got some feedback and the endpoint for getting the next nonce was revisited and extended. A new debug endpoint was introduced: it allows users to check if their transaction that is in the mempool is blocked by something (e.g. a missing nonce). An effort started to add integration tests for the transaction pool. A total of 31.5 hours.
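To illustrate what "next nonce" and "blocked by a missing nonce" mean, here is a small Python sketch. It is a toy model under stated assumptions: the function names are hypothetical, and the rule used (the next nonce is the first gap above the account's on-chain nonce) is just one possible strategy, not necessarily what the node's endpoints implement.

```python
def next_nonce(account_nonce, pending_nonces):
    """First nonce above the on-chain nonce not used by a pending tx.

    Toy model: one possible strategy, assumed for illustration.
    """
    used = set(pending_nonces)
    candidate = account_nonce + 1
    while candidate in used:
        candidate += 1
    return candidate


def missing_nonces(account_nonce, pending_nonces, tx_nonce):
    """Nonces that must be filled before the tx with tx_nonce can be included."""
    used = set(pending_nonces)
    return [n for n in range(account_nonce + 1, tx_nonce) if n not in used]


# Hypothetical account: on-chain nonce 5, pending txs with nonces 6, 7 and 9.
pending = [6, 7, 9]
print(next_nonce(5, pending))         # 8
print(missing_nonces(5, pending, 9))  # [8] -> the tx with nonce 9 is blocked
print(missing_nonces(5, pending, 7))  # []  -> the tx with nonce 7 is not blocked
```

In this model a transaction stays in the pool until every lower nonce for the same account is either on chain or pending, which is why a single gap can hold back all later transactions.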

Ulf Wiger (uwigeroferlang.chain)

Ulf worked on maintenance mode support. A prototype advanced application controller for OTP was pushed, and a strategy for also supporting load and start of plugin applications (to be detailed later) is taking shape. Time spent: 31 hours

earlyriser99

earlyriser99 completed the development of email sending from AeCanary and created a UAT environment for canary pre testing. A total of 23 hours

zxq9

Craig worked on a JSON Schema based configuration utility interface (which while it did not result in a user-facing configurator anyone would want to use, did certainly reveal what would be useful to expose and what shouldn’t be), and discovering aecore startup quirks. Time spent: 26 hours

Dincho Todorov (@dincho.chain)

Dincho was interviewing new DevOps for the team. Time spent: 2 hours

Week 26 (28 June - 4 July)

Dimitar Ivanov (dimitar.chain)

Dimitar finished the transaction pool integration suite. Work started on making the swagger generator support HTTP methods other than GET and POST. Total of 32.5 hours.

Ulf Wiger (uwigeroferlang.chain)

Ulf did some work on inactivity timers for State Channels (issue #3344) and continued working on maintenance mode. Time spent: 24 hours

earlyriser99

earlyriser99 finished putting the UAT AeCanary system live, conducted a number of pull request reviews and started work on getting the aeternity node running on the M1 Mac. A total of 17 hours

zxq9

Craig studied up on protocols, revisited the launcher code and configurator (deciding how the interface should actually present itself to users is not as easy as originally imagined!), checked the status of exchanges and various lore surrounding them, and wrote project internal documentation regarding the formulation of incident response procedures. Time spent: 20 hours


checked the status of exchanges and various lore surrounding them, and wrote project internal documentation regarding the formulation of incident response procedures.

what does this mean?

Dear All in the Aeternity Foundation,

Please let me share the update on the AeonNet release. Sorry for the delays in sharing the progress; I was held up by an array of domestic challenges due to the COVID-19 disease surveillance in India.

Documentation in Progress >>

Added System Context
https://www.notion.so/System-Context-a9ba663f3eea4467804b9d1847e2bb8b

Added Smart Contracts
https://www.notion.so/Smart-Contracts-83f755edd5de41c590c6a2edb098faed

Development in Progress >>

  • Adapted Security Module using ECDH by Milan Radkov
  • Adapted Exchange Module using WeiDex by Milan Radkov
  • Removed Redundant Contracts in the latest update

Next Steps >>

  • Refinement of DEX
  • Refinement of ECDH
  • Refinement of TimeLocks
  • Refinement of AssetList
  • Deployment to Integration Environment
  • Development of Web Application

AF has an ongoing communication with the exchanges. We hope the deposit/withdrawal issues will be solved soon. If you have a constructive suggestion on how to make it faster, please contact us.


I hope it’s true. Most of the people who still hold AE are Chinese, and I hope the team won’t hurt your hearts


AF: For investors, money has a network effect. Money that no one accepts has no value. People who are willing to accept and use it, and the currency with greater circulation, the greater its value. Can you announce more communication details! Let us see that Aeternity has a bright future!


Would the project party please stop keeping silent? Don’t just have ideas, but not actual actions! There is no place to see the progress of the project in the news reports? Is this for everyone to guess?