We were seeing High memory utilization on the xConnect server and it coincides with when our monitoring service reports the site being down. Logs were particularly showing these two errors related to the xConnect –
Error #1:
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Filters.AuthorizationFilterAttribute.<ExecuteAuthorizationFilterAsyncCore>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.AuthenticationFilterResult.<ExecuteAsync>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__0.MoveNext()
Error #2:
Sitecore.XConnect.Operations.SetFacetOperation`1[Sitecore.XConnect.Facet]: Sitecore.XConnect.Operations.FacetOperationException: Operation #0, AlreadyExists, Contact {32971b60-1245-0000-0000-0611bab4b38e}, Classification
[Error] ["XdbContextLoggingPlugin"] XdbContext Batch Execution Exception
Sitecore.XConnect.Operations.FacetOperationException: Operation #0, AlreadyExists, Contact {32971b60-1245-0000-0000-0611bab4b38e}, Classification
[Error] XConnect Exception Filter OnException(), url - "https://prod9-xconnect.everence.com/odata/Contacts?%24filter=Identifiers%2fany(id:id%2fIdentifier+eq+'a5642a9a87ad41e1ab65a624b7d1b1f6'+and+id%2fSource+eq+'xDB.Tracker')&%24expand=Identifiers,MergeInfo,ConsentInformation,Classification,EngagementMeasures,ContactBehaviorProfile,Personal,KeyBehaviorCache,ListSubscriptions,AutomationPlanEnrollmentCache,AutomationPlanExit,TestCombinations"
System.OperationCanceledException: The operation was canceled.
at System.Threading.CancellationToken.ThrowOperationCanceledException()
at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Filters.AuthorizationFilterAttribute.<ExecuteAuthorizationFilterAsyncCore>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.AuthenticationFilterResult.<ExecuteAsync>d__0.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__0.MoveNext()
Investigation:
Created a dump using the Procdump tool. Here is the command I used for creating 18GB of the dump file
procdump -m 18000 -ma [Name or PID]
From reviewing the memory dump, we could see two SQL queries that have consumed 8GB and 6GB of memory in your XConnect instance.
Both queries were running the [xdb_collection].[GetInteractionsByContactIds] stored procedure. This stored procedure retrieves interaction data for a contact.
The queries are attempting to retrieve the interactions for a specific Contact ID. After reviewing some of the retrieved interactions from this contact, we could see the User-Agent for this contact was
Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 Safari/534.34
The above agent is Pingdom (It is a website monitoring and availability service) to have many interactions stored in XConnect. Although it’s a bot, it was considered a valid user and the interactions were being tracked.
Resolution:
In order to mitigate the interactions from Pingdom, we added the User-Agent of Pingdom to the <excludedUserAgents> configuration (Sitecore.Analytics.ExcludeRobots.config) so that future visits from Pingdom are treated as a bot and not saved to XConnect.
For more details, please see this article here: https://doc.sitecore.com/developers/90/sitecore-experience-platform/en/configure-robot-detection-functionality.html#idp23361
Thank you Sitecore Support for helping us figure out the issue.
Happy Sitecoring!