Monitoring and Debugging Interaction Processing in Sitecore 9 on Azure PaaS

When configuring a new instance of Sitecore XP or maintaining an existing one, you may encounter a situation where your interactions report shows far fewer interactions than expected.
[Image: low-interactions. Caption: “Where are my interactions?”]
One possible cause is that interaction processing hasn’t kept up with the interactions being logged on your website. In some cases processing can be so slow that it appears collection, processing, and reporting aren’t working at all. Here are a few things you can look at to help you diagnose the issue.

 

Are interactions being recorded?

SELECT TOP 10 * FROM xdb_collection.Interactions ORDER BY StartDateTime DESC
Run this command in each of your shard databases to see the most recent interactions that have been recorded. Compare the interactions being logged with the expected number and frequency of interactions in the environment you’re looking at.

 

How many interactions are waiting to be processed?

SELECT COUNT(*) FROM xdb_processing_pools.InteractionLiveProcessingPool
This command returns the number of interactions waiting to be processed. Running it at regular intervals shows how quickly new interactions are being queued and whether processing is keeping up.
If the number of records is steadily building up, either processing isn’t working or it’s working too slowly to handle the workload.
If you’re collecting interactions but not seeing the size of the live interaction processing pool change at all, there might be an issue with aggregation.
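
The two rules above can be expressed as a small check over successive pool counts. This is only a minimal sketch, assuming you sample the COUNT(*) query yourself at regular intervals; the names PoolState and PoolTrend are hypothetical and are not part of Sitecore.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical states for the live processing pool, based on sampled row counts.
public enum PoolState { NotEnoughData, Unchanged, Growing, Shrinking, Fluctuating }

public static class PoolTrend
{
    // samples: successive results of the COUNT(*) query, oldest first.
    public static PoolState Classify(IReadOnlyList<long> samples)
    {
        if (samples == null || samples.Count < 2)
        {
            return PoolState.NotEnoughData;
        }

        // Check whether every sample is identical to the first one.
        bool allEqual = true;
        for (int i = 1; i < samples.Count; i++)
        {
            if (samples[i] != samples[0])
            {
                allEqual = false;
                break;
            }
        }

        if (allEqual)
        {
            // No change at all: possibly an aggregation issue.
            return PoolState.Unchanged;
        }

        long first = samples[0];
        long last = samples[samples.Count - 1];
        if (last > first) return PoolState.Growing;   // processing is not keeping up
        if (last < first) return PoolState.Shrinking; // the backlog is being worked off
        return PoolState.Fluctuating;                 // moved, but ended where it started
    }
}
```

For example, PoolTrend.Classify(new long[] { 100, 250, 900 }) reports Growing, which corresponds to the “steadily building up” case.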

If Analytics reports don’t look quite right, there are some things you can try:

Disable device detection

We encountered an issue with slow processing on a recent project. When we logged it with Sitecore support, they advised:
“Device detection has been known to cause the slowness in rebuilding reporting DB.”
Try disabling device detection to determine whether it has been affecting the speed of processing.

 

Check the CPU usage on your processing role

If you’re consistently seeing a high level of activity, you may need to scale your processing instances up or out.

[Image: high-average-cpu. Caption: “Time for more instances…”]

Check connection strings

Use the Server Role Configuration Reference to ensure you have the correct settings on each of your servers.

Check Application Insights errors

Check in Application Insights for any repeated error messages that might indicate misconfiguration.

 

[Image: millions-of-interactions. Caption: “That’s more like it!”]


Don’t Forget your Sitecore Caching Strategy

Releasing a scalable Sitecore instance requires an in-depth knowledge of Sitecore’s multi-layered caching architecture. Here is a run-through of what you will need to pull your project’s Sitecore caching strategy together, including tips, tricks and pitfalls.

HTML/Rendering Cache Settings

HTML caching settings have been part of the core Sitecore product for many versions now. It’s worth chatting about these every now and again as they are critically important to the performance of your Sitecore instance.

[Image: CacheSettings]

Indeed, one of the first things we check when reviewing a project that has performance complaints is whether the Sitecore HTML cache settings have been configured at all. The difference that properly set up cache settings can make (compared to a site without any) really shouldn’t be underestimated.

There are a lot of blog posts that define the above settings. Here is a good one to get you up to speed. We have also put some information on the various other layers of Sitecore cache at the bottom of this page.

Sample Caching Strategy Document

For the projects I run, I find it useful to have an overall caching strategy page that summarises the settings for every single rendering. This gives us a nice reference point whenever these settings need adjusting to see what might be affected.

[Image: cachingstrategy]

 

Failure to Cache

In our experience, performance problems are usually reported by clients who have no caching settings turned on at all. This can cause the website to react very slowly or even bring the site down in times of heavy traffic.

Sitecore does have other layers of cache that will kick in (data, item and prefetch cache) if you fail to enable HTML caching. However, HTML caching is the first line of defence: when properly configured, it takes the pressure off all these other layers and prevents the database from getting hit.

Imagine the following scenario for our made up Sitecore client “Bikes R Us”:

  • A page that has a large extended navigation displaying links to 50 other sub-pages across the site.
  • The content of the page contains several rendering components that also contain links to a number of products across various categories.
  • The code to construct this page traverses not only the tree to build the navigation but also numerous product sub-categories to gather all the links.
  • Developer A – has had no proper exposure to caching strategies before and marks the page as done without any HTML caching settings enabled.
  • The site goes live a month later.
  • “Bikes R Us” marketing team starts advertising via EDMs a month later and things go really well. The campaign also goes viral on Social Media with a bike offer too good to refuse.
  • The page that developer A built experiences more traffic than ever expected.
  • Unfortunately, with no caching the code to construct the page is hit again and again.
  • Data layer and item caching do assist to a point; however, Developer A never increased the default cache limits, so calls to other pages reduce the effectiveness of these layers overall.
  • After a few hours, traffic to the website increases to the point that the server runs out of CPU capacity and starts sending back 500 errors instead of serving up pages.

The scenario above is entirely avoidable when a proper caching strategy is completed as part of the development. Ideally, the caching strategy should be completed as each component of the website is developed and then tested to save on double handling. The caching strategy should then be reviewed, double checked and fully in place before a full performance test is done on the website.

Unfortunately, what often happens on big site builds is that the deadline looms and the caching strategy, which should be verified before go-live, gets forgotten. Skipping it causes severe performance issues and leads to the client asking questions a few weeks or months later.

Incorrectly Configured Cache

On the opposite side of the coin, an incorrectly tuned cache can also cause havoc in some areas of the site. Examples of this include Web Forms for Marketers and member portals. Caching forms or components that contain data related to your members can:

  • Cause forms to behave unexpectedly
  • Potentially show sensitive data belonging to one user to many other users
  • Cause xDB personalised components to behave unexpectedly

 

xDB and Caching

In general, it’s fairly difficult to turn on caching for those components that need to react to personalisation on a per-user basis. The problem is that if your entire homepage makes use of personalisation, you may not be able to cache certain components on that page at all. The inability to cache those components properly means the specifications of the server will need to be ramped up to deal with the additional processing that occurs with each page hit.

The “Vary By User” rendering setting is probably going to help you on personalised components up to a point.

Caching and Performance Testing

Caching is closely related to performance testing, and your overall caching strategy will affect the outcome of these tests. The aim of the performance test is to benchmark how much traffic your production environment can handle.

If you’re hosting in the cloud, why not set up your servers to autoscale when needed?

An often-forgotten point is that performance testing should be complemented by stress testing above and beyond your expected traffic requirements. The main aim of this stress test is to identify the breaking point of your production environments so that you have this knowledge for the future. This will help your team to prepare for those extraordinary traffic events.

When it comes to performance/stress testing, there is little point running the test from a single source or development computer. You will be limited by a single network connection’s capacity, and this is not a true test, particularly for those making use of cloud hosting.

We always recommend using a service like BlazeMeter or Azure load tests.

** Thanks to Derek, Aceik’s resident DevOps extraordinaire, for helping me with the above recommendations.

An additional cache setting

It’s worth getting to know each of the HTML/Rendering cache settings well, as you will need a detailed knowledge of each of them when looking at your strategy overall. One setting we use regularly but found was missing is the ability to vary the cache by URL only (“Vary By URL”). A member of the team (Jose D) was kind enough to hook this up for us on a recent project. We are happy to share this with the wider community in the hope that you also find it useful for your projects.
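
To illustrate the idea only, here is a minimal sketch of how a URL-based variation could be appended to a rendering’s cache key. It is not the implementation shared by the team: the class name VaryByUrlKey, the “_#url:” fragment and the normalisation rules (lower-casing, trimming the trailing slash, optionally dropping the query string) are all assumptions made for this example.

```csharp
using System;

// Hypothetical helper: builds a cache key that varies by the requested URL.
public static class VaryByUrlKey
{
    // baseKey: the cache key the rendering would otherwise use.
    // rawUrl: the requested path and query string, e.g. "/bikes/mountain?page=2".
    public static string Append(string baseKey, string rawUrl, bool includeQueryString)
    {
        if (baseKey == null) throw new ArgumentNullException(nameof(baseKey));

        string url = rawUrl ?? string.Empty;

        if (!includeQueryString)
        {
            // Drop everything from the first '?' so tracking parameters
            // do not create a separate cache entry per visitor.
            int queryStart = url.IndexOf('?');
            if (queryStart >= 0)
            {
                url = url.Substring(0, queryStart);
            }
        }

        // Assumption: treat "/Bikes/" and "/bikes" as the same page.
        url = url.TrimEnd('/').ToLowerInvariant();
        if (url.Length == 0)
        {
            url = "/";
        }

        return baseKey + "_#url:" + url;
    }
}
```

In a real project a helper like this would be called from wherever the rendering cache key is generated, and only for renderings that have the new setting ticked.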

Increase the default cache limits

Outrageously, this is also an often-overlooked part of getting your Sitecore project onto production. The performance tuning guide pretty much spells this out for you: you need to increase the default cache sizes that come out of the box with a vanilla Sitecore install. The caching limits provided are appropriate for developer machines but grossly inadequate for production environments, which really need a healthy cache size to be responsive. For instance, out of the box, the HTML cache size is 50MB, while on a reasonable production server this should start at 100MB as a baseline. That’s double the default.

Take a look at Section 4.1 of Sitecore’s performance tuning document in order to get these settings correct.

Fine Tuning

Configuring the cache correctly for your production server can take some time to get right. You will need to monitor the /sitecore/admin/cache.aspx page.

In order to get these settings right, have a good look at Sitecore’s performance tuning document. Section 4.2 is very important and gives you a guide as to how cache tuning should be performed.

Prefetch Cache

Remember that fine-tuning your site will involve adjusting the items that Sitecore prefetches on startup. Once again the performance tuning document has all the details on how to do this. It’s another important step to get things running smoothly. See the references at the bottom of this article to see how the Pre-fetch cache fits into the overall caching architecture.

Sitecore.Caching.CustomCache

By implementing caching within your code to wrap complex logic, you can save your server a lot of processing effort. This is particularly true of I/O-intensive code where a lot of data has to be shifted, filtered or searched. It really is a great idea and worth adding to your Sitecore coding arsenal.

To get up to speed on how to build a custom cache we recommend reading this document.

The main way to achieve your custom cache is to write an implementation of Sitecore.Caching.CustomCache. You can then wrap your logic with the custom cache to prevent the same code being hit every time.

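// Build a key that varies by the context language and the filter parameter.
var cacheKey = string.Concat(
    string.Format("MyCustomKey-{0}", Sitecore.Context.Language.Name), ":", filterParam);

// The delegate holds the expensive logic; its result is cached under cacheKey.
var result = this.sitecoreCacheService.GetOrAddToCache(cacheKey, () =>
{
    ...
    return "MyDataResult";
});

return result;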
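
The sitecoreCacheService used above is not shown in this post. As a rough illustration of the get-or-add pattern it relies on, here is a minimal stand-alone sketch. It uses a ConcurrentDictionary as a stand-in for a Sitecore.Caching.CustomCache implementation, and the class name SimpleCacheService is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a cache service built on Sitecore.Caching.CustomCache.
// It only demonstrates the get-or-add pattern: no size limit, no eviction.
public class SimpleCacheService
{
    private readonly ConcurrentDictionary<string, string> store =
        new ConcurrentDictionary<string, string>();

    // Returns the cached value for key, or runs factory and caches its result.
    public string GetOrAddToCache(string key, Func<string> factory)
    {
        if (key == null) throw new ArgumentNullException(nameof(key));
        if (factory == null) throw new ArgumentNullException(nameof(factory));

        // Note: under heavy concurrency the factory may run more than once
        // for the same key, but only one result is stored.
        return store.GetOrAdd(key, _ => factory());
    }

    // Empties the cache, e.g. after content has changed.
    public void Clear()
    {
        store.Clear();
    }
}
```

A real implementation would sit on top of a Sitecore cache so that it respects a maximum size and shows up on the /sitecore/admin/cache.aspx page.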

 

Cloudflare / Akamai considerations

Many sites rely on a third-party service provider to sit in front of their website to add an additional layer of caching. This is great and helps sites scale to meet demand. It shouldn’t be used as an excuse not to do a caching strategy at all on the Sitecore side.

Remember that pages are likely to sit in the third-party cache only for a certain period of time. So, if your site has 1000s of content pages that are each only accessed semi-regularly, the user will bypass the third-party cache altogether. In these cases, the Sitecore cache becomes the next line of defence.

With regard to caching and Cloudflare, the cache will only kick in on the media library and your Web API endpoints if the Cache-Control header is set to public and given a valid max-age.

  1. For your Web API endpoints, we found it handy to use the attribute mentioned in this Stack Overflow page. See CacheControlAttribute.cs
  2. For media library URLs you need to enable:
<!--  MEDIA RESPONSE - CACHEABILITY
      The HttpCacheability is used to set media response headers.
      Possible values: NoCache, Private, Public, Server, ServerAndNoCache, ServerAndPrivate
      Default value: public
-->
<setting name="MediaResponse.Cacheability" value="public" />
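
For the Web API side, the header itself is simple to build. The following is a minimal sketch, not the attribute from the linked page: it only shows how a public Cache-Control value with a max-age can be created using the .NET CacheControlHeaderValue class. The class name PublicCacheHeader and the idea of passing the lifetime in seconds are assumptions for this example.

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper that builds a "public" Cache-Control value with a max-age.
public static class PublicCacheHeader
{
    public static CacheControlHeaderValue Create(int maxAgeSeconds)
    {
        if (maxAgeSeconds <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAgeSeconds));
        }

        return new CacheControlHeaderValue
        {
            Public = true,                               // allow shared caches such as a CDN
            MaxAge = TimeSpan.FromSeconds(maxAgeSeconds) // how long the response may be reused
        };
    }
}
```

Inside a Web API action filter, the result would typically be assigned to the Headers.CacheControl property of the response, for example with PublicCacheHeader.Create(3600) for a one-hour lifetime.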

 

Disable Caching on CM, Enable on CD

Remember to disable HTML caching on CM environments as it may cause issues with the Experience Explorer and Preview modes.

  • Set cacheHtml="false" on your CM server’s <site> node.

You can also disable the media cache on CM servers so that content editors never get cached images:

<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
  <sitecore role:require="Standalone OR ContentDelivery OR ContentManagement OR Processing">
    <settings>
      <!--
        CACHING ENABLED
        Determines if caching should be enabled at all
        Specify 'true' to enable caching and 'false' to disable all caching
      -->
      <setting patch:instead="*[@name='Media.CachingEnabled']" role:require="Standalone OR ContentManagement" name="Media.CachingEnabled" value="false" />
      <setting patch:instead="*[@name='Media.CachingEnabled']" role:require="ContentDelivery" name="Media.CachingEnabled" value="true" />
    </settings>
  </sitecore>
</configuration>

 

Note: 

  • Don’t change the setting called “Caching.Enabled” on CM servers.

Reference Material:

Understanding the cache layers

The following is taken from http://learnsitecore.cmsuniverse.net/Developers/Articles/2009/07/CachingOverview.aspx

[Image: diagram of the Sitecore cache layers]

Definitions:

These definitions are described in the following Stack Overflow post:

Prefetch cache

This is item data pulled out from the database when the site starts up – from the Sitecore docs:

“Each database prefetch cache entry represents an item in a database. Database prefetch cache entries include all field values for all versions of that item, and information about the parent and children of the item.

Populating the prefetch cache results in smoother user experiences immediately after application restarts. Excessive use of prefetch caches can affect the time required for application initialization.”

Data cache

This cache exists to minimise round trips to the database. It again pulls item information from Sitecore, but the difference is that it does so when the item is requested (rather than at start-up of the site). It will pull the data from the prefetch cache if it’s there, or go back to the database if not.

Item cache

This cache holds objects of type Sitecore.Data.Items.Item, which are what you use in code. When an item is requested in code, Sitecore looks in the item cache first, then falls back to the data cache, then to the prefetch cache, and finally to the database.

HTML cache

This caches the HTML output from sublayouts and renderings. There is a nice level of configuration that lets you vary the cached HTML by query string, data source and so on.