Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: use separate goroutines for sotw bidi streams (#530) #531

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rueian
Copy link

@rueian rueian commented Dec 21, 2021

To address the #530, I follow the #451 and separate goroutines for both bidi streams.

I also reuse the same chan cache.Response for all watches and remove the reflection.

@alecholmez
Copy link
Contributor

@rueian can you rebase your branch of what was merged? The PR this was based off of is now in main

@rueian
Copy link
Author

rueian commented Jan 7, 2022

@rueian can you rebase your branch of what was merged? The PR this was based off of is now in main

Sure, I have rebased it. Thanks.

@rueian
Copy link
Author

rueian commented Jan 10, 2022

Hi @alecholmez, could you review this if you have free time?

@alecholmez
Copy link
Contributor

alecholmez commented Jan 11, 2022

Hi @alecholmez, could you review this if you have free time?

Yes sorry will review this now

Copy link
Contributor

@alecholmez alecholmez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So before I continue to review this I'd like to spark a conversation with @snowp, @jpeach, and @sunjayBhatia .

We've been down this path before, I'm not entirely sure we want to add another goroutine since that's just more overhead. We went with the dynamic watch selection to cut down on goroutines actually.

Is there any other way we can approach a fix for the deadlock in linear cache without adding goroutines?

pkg/server/sotw/v3/server.go Outdated Show resolved Hide resolved
pkg/server/sotw/v3/server.go Outdated Show resolved Hide resolved
pkg/server/sotw/v3/server.go Outdated Show resolved Hide resolved
pkg/server/sotw/v3/server.go Outdated Show resolved Hide resolved
@rueian rueian force-pushed the sotw-bidi branch 3 times, most recently from bfbc3b0 to 8e247d6 Compare January 12, 2022 17:26
@sunjayBhatia
Copy link
Member

Is there any other way we can approach a fix for the deadlock in linear cache without adding goroutines?

currently every resource update causes a response to be sent via sending on the responder channels, we could batch and coalesce updates with a timer to ensure multiple updates wont be blocked etc. before a response is actually sent

@sunjayBhatia
Copy link
Member

sunjayBhatia commented Jan 12, 2022

also would be nice to have a test that demonstrates this issue, with the new refactor as well, having a harder time piecing together the sequence of events, honestly havent used/looked at the linearcache much myself

@rueian
Copy link
Author

rueian commented Jan 13, 2022

Hi @sunjayBhatia,

Demonstrating the issue is not easy and it is kind of flaky. However, I have still tried to make a test case to do that.

The TestSOTWLinearCacheIntegrationDeadLock test case just tries to simulate what could happen to the LinearCache and SOTW server concurrently with many iterations. And I never pass the test with previous version of SOTW server.

Please let me know what you think about the test.

@sunjayBhatia
Copy link
Member

taking a look 👍🏽

@rueian rueian requested a review from alecholmez January 14, 2022 13:58
@rueian
Copy link
Author

rueian commented Jan 20, 2022

Hi @alecholmez, @sunjayBhatia

Is there any thing I should do to let this PR be merged?

@alecholmez
Copy link
Contributor

@snowp can you actually give this a look too? I'd like more eyes on this

@rueian
Copy link
Author

rueian commented Feb 14, 2022

I was wondering why the delta server won't be deadlocked when I wrote the TestSOTWLinearCacheIntegrationDeadLock test case. And it turns out that the delta server also uses separated goroutine: https://github.com/envoyproxy/go-control-plane/blob/main/pkg/server/delta/v3/server.go#L176-L181

@alecholmez
Copy link
Contributor

ping @snowp

@github-actions
Copy link

github-actions bot commented Apr 3, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label Apr 3, 2022
@rueian
Copy link
Author

rueian commented Apr 3, 2022

not stale

@github-actions github-actions bot removed the stale label Apr 3, 2022
@github-actions
Copy link

github-actions bot commented May 3, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label May 3, 2022
@rueian
Copy link
Author

rueian commented May 4, 2022

not stale

@github-actions github-actions bot removed the stale label May 4, 2022
@github-actions
Copy link

github-actions bot commented Jun 3, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jun 3, 2022
@rueian
Copy link
Author

rueian commented Jun 3, 2022

Not staled

@github-actions github-actions bot removed the stale label Jun 3, 2022
@github-actions
Copy link

github-actions bot commented Jul 3, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale label Jul 3, 2022
@rueian
Copy link
Author

rueian commented Jul 3, 2022

Not staled

@github-actions
Copy link

github-actions bot commented Aug 3, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants