Enhancing the delay in the listener when there is exception in the connection #5448
base: master
Conversation
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
rajmishra1997 left a comment
Suggestion: for a higher number of error responses from the server, can we add a log/warning to surface that continuous error responses are being received?
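A minimal, hypothetical sketch of what such a warning could look like, assuming a `continuousError` counter like the one in the quoted change; the class name, threshold, and `Console.WriteLine` call are illustrative stand-ins (the real listener would route this through the agent's own tracing/logging facility):

```csharp
using System;

static class ListenerDiagnostics
{
    // Threshold chosen for illustration only.
    private const int WarnThreshold = 5;

    public static void ReportContinuousErrors(int continuousError, Exception lastException)
    {
        if (continuousError >= WarnThreshold)
        {
            // The real listener would use its tracing/logging facility instead of Console.
            Console.WriteLine(
                $"Warning: {continuousError} consecutive error responses from the server. " +
                $"Last error: {lastException.Message}. Backing off before the next retry.");
        }
    }
}
```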
else if (continuousError > 2 && continuousError <= 5)
{
    // random backoff [60, 90]
    _getNextMessageRetryInterval = BackoffTimerHelper.GetRandomBackoff(TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(90), _getNextMessageRetryInterval);
Why do we have a custom implementation of backoff based on the retry count? Why not use the standard backoff?
We increase the backoff as the number of retries grows while the server is still unavailable, so that the load on the server is reduced the longer it stays down.
In the ICM related to these changes, the ask was that the agent should not keep making requests when the server is unavailable. So, to reduce the request frequency, we increased the delay based on the retry count.
Yes, we could add a simple exponential backoff whose delay increases with the attempt number, but since this retry-count-based custom backoff logic was already there, we decided to just increase the delay based on the number of continuous errors.
Earlier it was segregated into <=5 attempts and greater than that; now we have divided that further (a rough sketch follows below).
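For illustration, here is a self-contained sketch of a tiered, continuous-error-based backoff like the one described above. Only the 3-5 error bucket (random backoff in [60, 90] seconds) comes from the quoted diff; the other bucket boundaries and delay ranges are assumptions, and `BackoffTimerHelper.GetRandomBackoff` is replaced with a plain random helper so the snippet stands alone:

```csharp
using System;

static class TieredBackoffSketch
{
    private static readonly Random _random = new Random();

    public static TimeSpan GetNextDelay(int continuousError)
    {
        if (continuousError <= 2)
        {
            // First couple of failures: retry fairly quickly (illustrative range).
            return RandomBetween(TimeSpan.FromSeconds(15), TimeSpan.FromSeconds(30));
        }
        else if (continuousError <= 5)
        {
            // Matches the quoted change: random backoff in [60, 90] seconds.
            return RandomBetween(TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(90));
        }
        else
        {
            // Server still unavailable after many attempts: back off much further (illustrative range).
            return RandomBetween(TimeSpan.FromMinutes(2), TimeSpan.FromMinutes(5));
        }
    }

    private static TimeSpan RandomBetween(TimeSpan min, TimeSpan max)
    {
        double seconds = min.TotalSeconds + _random.NextDouble() * (max.TotalSeconds - min.TotalSeconds);
        return TimeSpan.FromSeconds(seconds);
    }
}
```

The listener would call GetNextDelay with its consecutive-error counter and wait for the returned interval before the next attempt; the jitter from the random range avoids all agents retrying in lockstep.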
Context
Adding a delay in the agent listener code before retrying a request when a retriable exception is thrown from the server side.
Related work-item: AB#2338439
Description
When the server is unavailable (not an authentication error), the server returns a different exception (e.g., VssServiceResponseException with status code 404). The agent continuously retries the request indefinitely. Each retry invokes the OAuth token provisioning mechanism on the server. This behavior significantly increases load on an already unavailable server.
Hence we have increased the backoff delay in the agent code before retrying the requests.
Risk Assessment (Low / Medium / High)
Low
Unit Tests Added or Updated (Yes / No)
NA
Additional Testing Performed
Manually tested by connecting the agent to devfabric and then stopping the TFS devfabric web service. The agent delayed its requests based on the continuous error count.
Change Behind Feature Flag (Yes / No)
No
Tech Design / Approach
This is done to reduce the load on the server when the agent continuously makes requests to it.
Documentation Changes Required (Yes/No)
No
Logging Added/Updated (Yes/No)
NA
Telemetry Added/Updated (Yes/No)
NA
Rollback Scenario and Process (Yes/No)
NA
Dependency Impact Assessed and Regression Tested (Yes/No)
NA