fix: Improved multinode proxy #249
Would be really good to have a way to expose backing node health in the webui somehow, even just the ones on the node serving the webui
That would be a good idea. Let me do that.
The ErrorIsIn is a little sketchy, though I remember writing / reviewing it very deeply, so it's prooobably fine, but would be amazing to confirm.
The multi node switching is working now.
Testing with 3 nodes showed the same results as with 2 nodes. As long as there is at least one healthy node, Curio continues to work fine. A log from a run where 2 of the 3 nodes were stopped shows that Curio kept working after successfully switching to the last healthy chain node.
Tested with the latest UI fix, and all chain nodes now show up in the UI as expected. The health of each of them also tracks sync status and reachability. Curio switches nicely between the available ones. One situation that might be confusing is when a new API is added to the base layer: the UI starts tracking its health, but Curio in the background will not use that node until Curio is restarted. That can leave what the user sees out of sync with what Curio is actually using. After a restart of Curio, everything comes into sync again and the UI agrees with what Curio uses.
Fixes #243 and #200
This PR brings in the changes from filecoin-project/lotus#11470 along with addressing some comments.
This PR also removes the forced check that ensured at least 2 chain nodes are reachable. We don't require this check as there are no network segregation or split-brain issues here. Any node will work as long as it is in sync. This check also has the unintended effect of rendering a Curio cluster useless if one of 2 nodes goes down.