Core Platform Dashboard and Metrics
Dashboard for Singapore Region:- (The same explanation applies to Mumbai, US North Virginia and Ireland)
Dashlet Description:-
The V1 API is an internal API that handles most application-related and platform requests. The Dashlets are described below.
1. SGP Application Server:- This is the Core Platform Dashlet for the Singapore region. It shows all requests and responses that reach the web platform.
2. SGP WCF V1 Server:- This Dashlet is mainly for public requests and contains information about modules such as connectors, Converse requests and third-party requests.
3. SGP WCF V1 Server(New LB):- This is the major Dashlet and gives the details of all platform backend requests and responses. The request count in this Dashlet can be higher than in the SGP Application Server Dashlet because it includes the backend calls that support the frontend (rendering requires multiple backend calls).
For example, please find the screenshot below:-
4. SGP V2 Server:- This Dashlet shows the combined requests of all V2 APIs running in the background.
Dashboard for Hyderabad Region:- (The same applies to Canada)
The only difference is that it does not have the WCF V1 Server(New LB) Dashlet, so all the major requests can be viewed in the HYD Application Server Dashlet and the HYD WCF V1 Server Dashlet.
The HYD V2 Server Dashlet shows all V2 API requests and responses on the server.
How to figure out Abnormalities:-
Response time:- For any Dashlet, TargetResponseTime should ideally not exceed 1 second.
If it stays above 1 second for an extended period, it is considered an abnormality.
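The response time rule above can be sketched in Python as a rough illustration. This is not part of the dashboard tooling: the function name and the one-sample-per-minute input are hypothetical, and the 3-minute cut-off is an assumption, since the document only says spikes of 1 to 2 minutes can be ignored.

```python
from typing import List

THRESHOLD_SECONDS = 1.0    # guideline from this document: ideally not above 1 second
MIN_SUSTAINED_MINUTES = 3  # assumption: anything longer than a 1 to 2 minute spike


def has_sustained_breach(
    samples: List[float],
    threshold: float = THRESHOLD_SECONDS,
    min_minutes: int = MIN_SUSTAINED_MINUTES,
) -> bool:
    """Return True if TargetResponseTime stays above the threshold
    for at least `min_minutes` consecutive samples.

    `samples` is assumed to hold one value per minute, in seconds.
    """
    run = 0  # length of the current run of consecutive breaches
    for value in samples:
        if value > threshold:
            run += 1
            if run >= min_minutes:
                return True
        else:
            run = 0  # the breach ended, start counting again
    return False
```

With this sketch, a two-minute spike such as [0.4, 1.2, 1.3, 0.5] is not flagged, while three or more consecutive minutes above 1 second are.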
To check the Dashboard in detail, click on its name as shown below, and the Dashboard will open in a full page. Then apply the filter and select the time range.
Exceptions:-
1. Always check the trend before raising an issue with SRE. Compare day by day: for example, if the spike is on a Monday, check the previous 3 Mondays. If they show a slight or similar spike, it is a normal case and autoscaling has worked.
2. If the spike lasted only 1 to 2 minutes, it can also be ignored because it will auto-resolve. Report it only when it persists longer than that.
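The trend check in point 1 can be sketched roughly as follows. The function names are hypothetical, and the 20% tolerance used for "slight or similar" is an assumption, not a value defined by this document. The peak values would be read manually from the Dashlet for each date.

```python
from datetime import date, timedelta
from typing import List


def previous_same_weekdays(day: date, count: int = 3) -> List[date]:
    """Return the same weekday from each of the previous `count` weeks,
    most recent first (for a Monday, the previous 3 Mondays)."""
    return [day - timedelta(weeks=k) for k in range(1, count + 1)]


def is_within_trend(
    current_peak: float,
    past_peaks: List[float],
    tolerance: float = 0.2,  # assumption: up to 20% above past peaks counts as "similar"
) -> bool:
    """Return True if the current peak is similar to, or only slightly
    above, the highest peak seen on the earlier comparable days."""
    if not past_peaks:
        return False  # no history to compare against, so do not treat it as normal
    return current_peak <= max(past_peaks) * (1 + tolerance)
```

For example, previous_same_weekdays(date(2024, 1, 29)) gives the dates to open in the Dashboard, and is_within_trend is then called with the peaks observed on those dates.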
Request Abnormality:-
A request abnormality depends on the trend. If you see a higher TargetResponseTime together with a higher number of requests, and this does not match the previous trend, it can be considered an abnormality.
For example, check the 2 screenshots below, the first for yesterday and the second for the day before yesterday. The request count and TargetResponseTime are almost the same on both days and are in line with the trend.
When debugging case by case, follow the same approach: check the trend first, and only then report it or flag it as abnormal.
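As a minimal sketch of the request abnormality rule, the comparison can be written as below. The DaySummary type, the function name and the 20% tolerance are all hypothetical and chosen only for illustration. The check flags a day only when both the request count and the response time are above the recent trend.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DaySummary:
    request_count: int        # total requests seen in the Dashlet for the day
    avg_response_time: float  # average TargetResponseTime in seconds


def is_request_abnormal(
    today: DaySummary,
    history: List[DaySummary],
    tolerance: float = 0.2,  # assumption: more than 20% above the trend is "higher"
) -> bool:
    """Return True only if both request count and response time are
    clearly above the average of the previous comparable days."""
    if not history:
        return False  # nothing to compare with, so check the trend manually
    base_requests = sum(d.request_count for d in history) / len(history)
    base_time = sum(d.avg_response_time for d in history) / len(history)
    more_requests = today.request_count > base_requests * (1 + tolerance)
    slower = today.avg_response_time > base_time * (1 + tolerance)
    return more_requests and slower
```

A day whose values are close to the previous days, as in the screenshots described above, returns False and does not need to be reported.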