Don’t Forget your Sitecore Caching Strategy

Releasing a scalable Sitecore instance requires in-depth knowledge of Sitecore’s multi-layered caching architecture. Here is a run-through of what you will need to pull your project’s Sitecore caching strategy together, including tips, tricks and pitfalls.

HTML/Rendering Cache Settings

HTML caching settings have been part of the core Sitecore product for many versions now. It’s worth chatting about these every now and again as they are critically important to the performance of your Sitecore instance.

[Image: the Sitecore rendering cache settings]

Indeed, one of the first things we look for when reviewing a project that has performance complaints is whether the Sitecore HTML cache settings have been configured at all. The difference that properly set up cache settings can make (compared to a site without any) really shouldn’t be underestimated.

There are a lot of blog posts that define the above settings. Here is a good one to get you up to speed. We have also put some information on the various other layers of Sitecore cache at the bottom of this page.

Sample Caching Strategy Document

For the projects I run, I find it useful to have an overall caching strategy page that summarises the settings for every single rendering. This gives us a nice reference point whenever these settings need adjusting to see what might be affected.

[Image: sample caching strategy document]

 

Failure to Cache

In our experience, performance problems are usually reported by clients who have no caching settings turned on at all. This can cause the website to react very slowly or even bring the site down in times of heavy traffic.

Sitecore does have other layers of cache that will kick in (data, item and prefetch cache) if you fail to enable HTML caching. However, HTML caching is the first line of defence; when properly configured, it takes the pressure off these other layers of caching and prevents the database from getting hit.

Imagine the following scenario for our made up Sitecore client “Bikes R Us”:

  • A page that has a large extended navigation displaying links to 50 other sub-pages across the site.
  • The content of the page contains several rendering components that also contain links to a number of products across various categories.
  • The code to construct this page traverses not only the tree to build the navigation but also numerous product sub-categories to gather all the links.
  • Developer A has had no proper exposure to caching strategies before and marks the page as done without any HTML caching settings enabled.
  • The site goes live a month later.
  • “Bikes R Us” marketing team starts advertising via EDMs a month later and things go really well. The campaign also goes viral on Social Media with a bike offer too good to refuse.
  • The page that developer A built experiences more traffic than ever expected.
  • Unfortunately, with no HTML caching, the code to construct the page runs again and again on every request.
  • Data and item caching do assist to a point. However, Developer A never increased the default cache limits, so requests to other pages reduce the effectiveness of these layers overall.
  • After a few hours, traffic to the website increases to the point that the server runs out of CPU capacity and starts sending back 500 errors instead of serving up pages.

The scenario above is entirely avoidable when a proper caching strategy is completed as part of the development. Ideally, the caching strategy should be completed as each component of the website is developed and then tested to save on double handling. The caching strategy should then be reviewed, double checked and fully in place before a full performance test is done on the website.

Unfortunately, what often happens on big site builds is that the deadline looms and the caching strategy, which should be verified before go-live, gets forgotten about. Skipping that verification causes severe performance issues and leads to the client asking questions a few weeks or months later.

Incorrectly Configured Cache

On the opposite side of the coin, an incorrectly tuned cache can also cause havoc in some areas of the site. Examples of this include Web Forms for Marketers and member portals. Caching forms or components that contain data related to your members can:

  • Cause forms to behave unexpectedly
  • Potentially show sensitive data belonging to one user to many other users
  • Cause xDB personalised components to behave in an unexpected manner

 

XDB and Caching

In general, it’s fairly difficult to turn on caching for those components that need to react to personalisation on a per-user basis.  The problem is if your entire homepage is making use of personalisation you may not be able to cache certain components on that page at all. The inability to cache those components properly means the specifications of the server will need to be ramped up to deal with the additional processing that occurs with each page hit.

The “Vary By User” rendering setting is probably going to help you on personalised components up to a point.
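To see why the "Vary By" options matter, it can help to picture the HTML cache as a dictionary whose key gains a segment for each option you tick. The snippet below is only a rough sketch of that idea and is not Sitecore's implementation; the helper name, the key format and the rendering path are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds a cache key for a rendering's HTML output.
// Each enabled "vary by" option appends a segment, so each distinct value
// ends up with its own cached copy of the HTML.
string BuildHtmlCacheKey(string renderingPath, string language, string userName,
                         bool varyByUser)
{
    var parts = new List<string> { renderingPath, "lang:" + language };
    if (varyByUser)
    {
        parts.Add("user:" + userName);
    }
    return string.Join("_#", parts);
}

// Without Vary By User, both members resolve to the same cache entry,
// which is how one member's data can be shown to another.
Console.WriteLine(BuildHtmlCacheKey("/views/account.cshtml", "en", "alice", false));
Console.WriteLine(BuildHtmlCacheKey("/views/account.cshtml", "en", "bob", false));

// With Vary By User, each member gets a separate entry.
Console.WriteLine(BuildHtmlCacheKey("/views/account.cshtml", "en", "alice", true));
Console.WriteLine(BuildHtmlCacheKey("/views/account.cshtml", "en", "bob", true));
```

The trade-off is visible in the keys: every extra segment multiplies the number of entries the cache has to hold, which is why per-user caching only helps up to a point.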

Caching and Performance Testing

Caching is closely related to performance testing, and your overall caching strategy will affect the outcome of these tests. The aim of the performance test is to benchmark the amount of traffic your production environment can handle.

If you’re hosting in the cloud, why not set up your servers to autoscale when needed?

An often-forgotten point is that performance testing should be complemented by stress testing above and beyond your expected traffic requirements. The main aim of this stress test is to identify the breaking point of your production environment so that you have this knowledge for the future. This will help your team to prepare for those extraordinary traffic events.

When it comes to performance/stress testing, there is little point running the test from a single source or development computer. You will be limited by the capacity of a single network connection, and this is not a true test, particularly for those making use of cloud hosting.

We always recommend using a service like BlazeMeter or Azure load tests.

** Thanks to Derek, Aceik’s resident DevOps extraordinaire, for helping me with the above recommendations.

An additional cache setting

It’s worth getting to know each of the HTML/Rendering cache settings well as you will need to have a detailed knowledge of each of these when looking at your strategy overall. One particular setting we found was missing that we tend to use regularly was the ability to only have a variation based on “Vary By URL”. A member of the team (Jose D) was kind enough to hook this up for us on a recent project. We are happy to share this with the wider community in hope that you also find it useful for your projects.

Increase the default cache limits

Outrageously, this is also an often-overlooked part of getting your Sitecore project onto production. The performance tuning guide pretty much spells this out for you. You need to increase the default cache sizes that come out of the box with a vanilla Sitecore install. The cache limits provided are appropriate for developer machines but grossly inadequate for production environments, which really need a healthy cache size to be responsive. For instance, out of the box, the HTML cache size is 50MB, while on a reasonable production server this should start at 1GB as a baseline. That’s a 20-fold increase.

Take a look at Section 4.1 of Sitecore’s performance tuning document in order to get these settings correct.

Fine Tuning

Configuring the cache correctly for your production server can take some time to get right. You will need to monitor the /sitecore/admin/cache.aspx page.

In order to get these settings right, have a good look at Sitecore’s performance tuning document. Section 4.2 is very important and gives you a guide as to how cache tuning should be performed.

Prefetch Cache

Remember that fine-tuning your site will involve adjusting the items that Sitecore prefetches on startup. Once again the performance tuning document has all the details on how to do this. It’s another important step to get things running smoothly. See the references at the bottom of this article to see how the Pre-fetch cache fits into the overall caching architecture.

Sitecore.Caching.CustomCache

By implementing caching within your code to wrap complex logic, you can save your server a lot of processing effort. It is particularly worthwhile around I/O-intensive code where a lot of data has to be sifted, filtered or searched, and it is well worth adding to your Sitecore coding arsenal.

To get up to speed on how to build a custom cache we recommend reading this document.

The main way to achieve your custom cache is to write an implementation of Sitecore.Caching.CustomCache. You can then wrap your logic with the custom cache to prevent the same code being hit every time.

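// Build a cache key that varies by language and by the filter being applied.
var cacheKey = string.Format("MyCustomKey-{0}:{1}", Sitecore.Context.Language.Name, filterParam);

// The delegate only runs on a cache miss; its result is stored under cacheKey.
var result = this.sitecoreCacheService.GetOrAddToCache(cacheKey, () =>
{
    // ... expensive lookup / filtering logic goes here ...
    return "MyDataResult";
});

return result;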
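The GetOrAddToCache call above belongs to our own service wrapper, so its internals are not shown here. As a rough idea of the pattern, here is a minimal stand-alone sketch that uses a plain ConcurrentDictionary in place of Sitecore.Caching.CustomCache. The function names and the sample data are illustrative assumptions, and a real implementation would also need size limits and clearing on publish.

```csharp
using System;
using System.Collections.Concurrent;

// Stand-in for the real cache store (assumption: not Sitecore's CustomCache).
var store = new ConcurrentDictionary<string, object>();
var factoryCalls = 0;

// Returns the cached value for the key, or runs the factory and stores its result.
T GetOrAddToCache<T>(string key, Func<T> factory) where T : class
{
    return (T)store.GetOrAdd(key, _ => factory());
}

// Hypothetical expensive lookup wrapped by the cache.
string LoadProducts(string language, string filter)
{
    var cacheKey = $"MyCustomKey-{language}:{filter}";
    return GetOrAddToCache(cacheKey, () =>
    {
        factoryCalls++; // counts how often the expensive work actually runs
        return $"products for {filter} ({language})";
    });
}

Console.WriteLine(LoadProducts("en", "road-bikes"));
Console.WriteLine(LoadProducts("en", "road-bikes")); // second call is served from the cache
Console.WriteLine($"Expensive work ran {factoryCalls} time(s)");
```

Note that the key includes everything the result depends on (language and filter here). Leave one out and callers will start receiving each other's results.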

 

Cloudflare / Akamai considerations

Many sites rely on a third-party service provider to sit in front of their website to add an additional layer of caching. This is great and helps sites scale to meet demand. It shouldn’t be used as an excuse not to do a caching strategy at all on the Sitecore side.

Remember that pages are likely to sit in the third-party cache only for a certain period of time. So, if your site has thousands of content pages that are each only accessed semi-regularly, many requests will bypass the third-party cache altogether. In these cases, the Sitecore cache becomes the next line of defence.

With regard to caching and Cloudflare, the cache will only kick in on the media library and your Web API endpoints if the Cache-Control header is set to public and given a valid max-age.

  1. For your Web API endpoints, we found it handy to use the attribute mentioned in this Stack Overflow page. See CacheControlAttribute.cs
  2. For media library URLs you need to enable:
<!--
  MEDIA RESPONSE - CACHEABILITY
  The HttpCacheability is used to set media response headers.
  Possible values: NoCache, Private, Public, Server, ServerAndNoCache, ServerAndPrivate
  Default value: private
-->
<setting name="MediaResponse.Cacheability" value="public" />
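For the Web API side, the important part is simply the header value that ends up on the response. The sketch below is not the CacheControlAttribute referred to above; it only shows, using the standard System.Net.Http.Headers types, what a "public with max-age" value looks like. The helper name and the ten-minute lifetime are assumptions for illustration.

```csharp
using System;
using System.Net.Http.Headers;

// Builds the Cache-Control value a CDN needs to see before it will cache a response.
string BuildPublicCacheControl(TimeSpan maxAge)
{
    var header = new CacheControlHeaderValue
    {
        Public = true,   // allow shared caches (such as a CDN) to store the response
        MaxAge = maxAge  // how long the cached copy may be served
    };
    return header.ToString();
}

// Ten minutes is an arbitrary example lifetime.
Console.WriteLine(BuildPublicCacheControl(TimeSpan.FromMinutes(10)));
```

In a Web API action you would assign the same kind of value to response.Headers.CacheControl; pick a max-age that matches how stale you can tolerate the endpoint's data being.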

 

Disable Caching on CM, Enable on CD

Remember to disable HTML caching on CM environments as it may cause issues with the Experience Explorer and Preview modes.

  • Set cacheHtml="false" on your CM server’s <site> node.

You can also disable the media cache on CM servers so that content editors never get cached images:

<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/" xmlns:role="http://www.sitecore.net/xmlconfig/role/">
 <sitecore role:require="Standalone OR ContentDelivery OR ContentManagement OR Processing">
 <settings>
 <!--
 CACHING ENABLED
 Determines if caching should be enabled at all
 Specify 'true' to enable caching and 'false' to disable all caching
 -->
 <setting patch:instead="*[@name='Media.CachingEnabled']" role:require="Standalone OR ContentManagement" name="Media.CachingEnabled" value="false" />
 <setting patch:instead="*[@name='Media.CachingEnabled']" role:require="ContentDelivery" name="Media.CachingEnabled" value="true" />
 </settings>
 </sitecore>
</configuration>

 

Note: 

  • Don’t change the setting called “Caching.Enabled” on CM servers.

Reference Material:

Understanding the cache layers

The following is taken from http://learnsitecore.cmsuniverse.net/Developers/Articles/2009/07/CachingOverview.aspx

[Diagram: the Sitecore cache layers]

Definitions:

These definitions are described in the following Stack Overflow post:

Prefetch cache

This is item data pulled out from the database when the site starts up – from the Sitecore docs:

“Each database prefetch cache entry represents an item in a database. Database prefetch cache entries include all field values for all versions of that item, and information about the parent and children of the item.

Populating the prefetch cache results in smoother user experiences immediately after application restarts. Excessive use of prefetch caches can affect the time required for application initialization.”

Data cache

This cache minimises the round trips to the database. It also holds item information from Sitecore, but the difference is that it is populated when an item is requested (rather than at start-up of the site). It will pull the data from the prefetch cache if it’s there, or go back to the database if not.

Item cache

This cache holds objects of type Sitecore.Data.Items.Item, which are what you work with in code. When an item is requested in code, Sitecore looks in the item cache first, then falls back to the data cache, then to the prefetch cache and finally to the database.

HTML cache

This caches the HTML output from sublayouts and renderings. There is a nice level of configuration available so that the HTML is cached per query string, per data source and so on.

 

JS & CSS Minification an Alternative Helix Approach

On a recent project, we were looking at a way to meet a number of criteria to improve page load times.

With regard to CSS and JS requests these criteria were:

  • Reduce the number of CSS and JS HTTP requests on page load.
  • Reduce the size of CSS and JS files (minification)

At the same time, we were running a Sitecore Helix project and we needed this to seamlessly fit into our project and CI builds.

In the past, I had also done some Umbraco development and was familiar with a nice little package called “ClientDependency Framework (CDF) by Shannon Deminick”  that comes out of the box with that particular CMS.

CDF will cut down your server requests in a few ways:

  • Combining (it combines multiple JS files into a single server call on the fly)
  • Compressing
  • Minifying output
  • Composition
  • Caching (the processing of files into a composite file is cached)
  • Persisting the combined composite files for increased performance when applications restart or when the cache expires

It also has a development mode where asset includes are not touched so that you can debug as usual.

The problem we had to solve was how to make the above package plug and play with the Helix way of doing things.

The Habitat (Helix example) project provides a way to include asset files via a global theme and also on an individual rendering level.

Global themes should be used to load assets that are to be applied across a whole website.

Rendering level assets should be used to load CSS and JavaScript that only apply directly to that particular rendering.

The current Habitat solution contains a module called “Assets” that lives in the foundation layer. This module will cleverly round up all the assets that need to be rendered on the page by the Helix architecture.

In order to use ClientDependency Framework (CDF) and seamlessly plug it into the Assets module, we have provided two new layout tags for use:

Usage in Layouts
- @CompositeAssetFileService.Current.RenderStyles()
- @CompositeAssetFileService.Current.RenderScript(ScriptLocation.Body)

The above integration is currently in a pull request (awaiting approval) on the main Helix Habitat GitHub repository:

See: https://github.com/Sitecore/Habitat/pull/349

It will remain available on Aceik’s fork of the habitat project. 

Setup Steps: 

To use this service you will need to rename the following files and run the gulp build:

  • “App_Config/ClientDependency.config.disabled” to “App_Config/ClientDependency.config”
  • “\src\Foundation\Assets\code\Web.config.transform.ClientDependency.minification.example” to “Web.config.transform”

Once running (not in debug mode) you will see the following asset includes in the HTML source.

/DependencyHandler.axd/bdc200f5bb6df7e817066f4b98499322/12/js

A note about Cloudflare and minification: 

If you’re using Cloudflare in front of your website, JS and CSS minification can be turned on as a feature in Cloudflare. You can also reduce the number of server requests made by using a feature called Rocket Loader.

Understanding Form Submission Tracking (WFFM and xDB)

This blog is targeted at a marketing audience that may be wondering how to interpret the WFFM tracking metrics, as shown in the form reports.

Definitions of metrics as proposed by Sitecore:

  • Visits – the total number of visitors who visited the page containing the web form.
  • Submission attempts – the total number of times that visitors clicked the submit button.
  • Dropouts – the total number of times that visitors filled in form fields but did not submit the form.
  • Successful submissions – the total number of times that visitors successfully submitted the web form and data was collected.

[Image: WFFM form report showing the tracking metrics]

If you’re trying to test the above values by doing form submissions yourself and you don’t fully understand what is happening in the background, it will get confusing very quickly. Indeed some clients have asked me to investigate the tallies above as they believed the report was broken when some of the numbers started declining.

You really need to understand that just closing your browser when trying to affect the above metrics doesn’t fool xDB into thinking you’re a different user.

Thanks to xDB cookies, the values shown next to the metrics above can actually be reassigned. What I mean by this is that if a user is recorded as a dropout and then returns a few hours later to re-submit the form, the tally next to dropouts will actually reduce by one and that value will be added to successful submissions. If you’re trying to test the tally for correctness, a drop in some of these counts will leave you scratching your head.

The only guaranteed way for the user to be treated as a unique visitor is to clear your cookies and browser temp data.

A good way to perform these tests is to use Incognito mode in Chrome, as cookies from an incognito session are discarded when the incognito window is closed.


This provides us with an explanation as to why you might see some of the tallies go down, which makes sense when you realise that dropouts can be converted to successful submissions. How then do we explain a drop in the number of submission attempts?

Running a few tests on this reveals that when a user is converted from a drop-out to a successful submission, the submission attempts recorded against this user are also adjusted down.


The main takeaway from this is that xDB is clever and knows when the same user returns to a form.  It adjusts the tallies in the form report accordingly.

If your marketing department doesn’t care about how well a user is tracked and these tallies confuse them, there is an alternative. Let’s say they want to track the exact number of times a form was submitted (same user or not). You could achieve this independently just by tracking the number of times the thank-you page is loaded.

 

Advanced Scheduled Publishing

Why do we want to schedule publishing?

  1. We want to schedule publishing to ensure that content scheduled to go live will go live when expected. By default, if a scheduled publish isn’t set up, any content scheduled to go live will have to wait until the next manual publish after the scheduled date/time.
  2. We want to remove the publish option from editors so that publishing isn’t overused.
  3. Reducing the number of publishes in a day reduces the number of times the Sitecore HTML cache is cleared and therefore increases site performance.

The out-of-the-box scheduled publishing agent provided by Sitecore solves points 1 and 2 above, but it also publishes at times when a publish is not required. The default publishing agent will publish at every interval, all day. This is inefficient because it publishes even when no changes have occurred, and each publish still causes the HTML cache to clear.

Our Advanced Scheduled Publishing Module provides the option to schedule publishing between a start and finish time and to set the interval, for example between 9am and 5pm, every 60 minutes.

It also provides the option to set up one-off scheduled publishes at a specific time. So, for example, one publish at 2pm and one at 2am.

These two options can be used individually or in combination. In combination you can allow for a common scenario where we want to publish every 60mins from 9am – 6pm and then a one-off publish at 12am so that any content scheduled to be published will do so ready for the new day.

The configuration of these intervals is managed via a configuration item you create and manage in Sitecore. The documentation can be found here. The last publish time is also updated here, giving editors more visibility of when the last publish occurred.

Our module is built on top of Sitecore scheduled tasks, so it checks for a time and interval within the range of the scheduled task frequency. A publish could therefore occur slightly earlier or later depending on your frequency value. This is a well-documented limitation, as Sitecore scheduled tasks run within the context of a web application, which could go down or be taken down at any time.
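To make the window, interval and one-off rules a little more concrete, here is a simplified stand-alone sketch of the kind of check that has to happen each time the scheduled task wakes up. It is not the module's actual code; the variable names, the IsPublishDue function and the sample dates are assumptions, and the settings mirror the 9am to 6pm hourly example with a one-off publish at midnight.

```csharp
using System;
using System.Linq;

// Hypothetical settings: hourly between 9am and 6pm, plus a one-off at midnight.
var windowStart = TimeSpan.FromHours(9);
var windowEnd = TimeSpan.FromHours(18);
var interval = TimeSpan.FromMinutes(60);
var oneOffTimes = new[] { TimeSpan.Zero };

// Decides whether a publish is due when the scheduled task wakes up.
// 'taskFrequency' is how often the task runs; a one-off time counts as due
// if it fell within the last run period and has not been handled yet.
bool IsPublishDue(DateTime now, DateTime lastPublish, TimeSpan taskFrequency)
{
    var timeOfDay = now.TimeOfDay;

    var inWindow = timeOfDay >= windowStart && timeOfDay <= windowEnd;
    if (inWindow && now - lastPublish >= interval)
    {
        return true;
    }

    return oneOffTimes.Any(t =>
        timeOfDay >= t && timeOfDay - t < taskFrequency && lastPublish < now.Date + t);
}

var frequency = TimeSpan.FromMinutes(5);
var day = new DateTime(2017, 6, 1);

// 10am, last publish 9am: inside the window and the interval has elapsed.
Console.WriteLine(IsPublishDue(day.AddHours(10), day.AddHours(9), frequency));
// 8pm, last publish 6pm: outside the window and no one-off is due.
Console.WriteLine(IsPublishDue(day.AddHours(20), day.AddHours(18), frequency));
// 12:03am the next day: the midnight one-off is picked up by the first run after it.
Console.WriteLine(IsPublishDue(day.AddDays(1).AddMinutes(3), day.AddHours(18), frequency));
```

The last case shows the limitation mentioned above: with a five-minute task frequency, the "midnight" publish actually fires whenever the first task run after midnight happens.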

Links

Sitecore Marketplace

Github repo

 

Sitecore Helix: Lets Talk Layers

Here are some notes from the decision-making process our team uses with regard to what goes where and in which layer. Of course, the helix documentation does go over the guidelines but it’s not until you start working with the architecture that things begin to become clear.

Project Layer

Definition: The Project layer provides the context of the solution. This means the actual cohesive website or channel output from the implementation, such as the page types, layout and graphical design. It is on this layer that all the features of the solution are stitched together into a cohesive solution that fits the requirements.

Comment: The project layer is probably the most straightforward layer to understand. In our project, the modules in this layer remained lightweight and mostly contained Razor view files that allow content editors to build up the HTML structure of the pages.

The website content (under home), page templates and component templates are serialized by Unicorn and also live in this layer.

It’s also worth mentioning one particular gotcha you may hit in development to do with template inheritance and you can read more in this blog post.

Feature or Foundation?

Feature Definition: The Feature layer contains concrete features of the solution as understood by the business owners and editors of the solution, for example news, articles, promotions, website search, etc. The features are expressed as seen in the business domain of the solution and not by technology, which means that the responsibility of a Feature layer module is defined by the intent of the module as seen by a business user and not by the underlying technology. Therefore, the module’s responsibility and naming should never be decided by specific technologies but rather by the module’s business value or business responsibility.

Discussion: For our feature modules, we aimed for single concrete features that are independent of each other. They may contain views, templates, controllers, renderings, configuration changes and related business logic code to tie it all together. The point is to always stick to the rule: “Classes that change together are packaged together”.

When building feature modules, it’s also very handy to think about the module’s removal as you build it. Keep asking yourself how easy it would be to roll back this module and what you would need to do. Doing so will help you to keep those dependencies under control.

Foundation Definition: The lowest level layer in Helix is the Foundation layer, which as the name suggests forms the foundation of your solution. When a change occurs in one of these modules it can impact many other modules in the solution. This means that these modules should be the most stable in your solution in terms of the Stable Dependencies Principle.

 

Discussion: We found that our foundation modules usually consist of frameworks or code that provide a structural functionality to support the web application as a whole. Each foundation module may be used by multiple feature modules to provide them with the support they need to run properly. Our foundation modules contain API calls, configuration, ORM structures (Glass Mapper), initialisation code, interfaces and abstract base classes.

An important point is that, unlike feature layer modules, foundation layer modules can have dependencies on other foundation layer modules. If this were not the case, it would be very difficult to construct the foundation layer in the first place.

For the most part, the team can make some fairly quick decisions about what goes where in the initial project planning. And what goes where is fairly obvious after you get familiar with the habitat example project. The main dilemma you’re going to encounter is around where your repositories and services (key business logic) might need to sit.

Business Logic: What goes where! Help!

Consider the definitions above. They seem straightforward enough. However, in agile projects where things may change rapidly or requirements are not immediately clear (which happens a lot), you’re inevitably going to need to make some judgment calls.

What do I mean by that? Let’s say one developer codes up a feature module at the beginning of the project. At first, it seems like that particular portion of code is only required by that particular feature. Down the track, a requirement surfaces whereby the same business logic needs to be used in another feature module. Helix rules dictate:

  • Classes that change together are packaged together.
  • No dependencies between feature projects should exist.

A lesser developer may be tempted at this point to duplicate the code in both feature modules to get the job done quickly. This, however, breaks some fairly important fundamental coding standards that many of us try to stick to. Step back and weigh the technical debt that duplicate code leaves behind against introducing dependencies between your Helix feature modules.

The solution to this dilemma is to refactor that feature logic and move it into the foundation layer. After that, any feature module that needs the same code can reference it in the foundation layer.

Remember “with great power comes great responsibility”, and this is especially true when touching code in the foundation layer. The code you touch in the foundation layer should be highly stable and well tested, as the Helix guidelines suggest.

Was the original decision a mistake?

On the flip side of the coin, it’s worth considering that it wasn’t a mistake to put that piece of code in the feature layer to start with. Technically, if no one else needed to use the code at the time and there was no reasonable way to foresee that anyone else would need it, then it probably was the correct call.

Accept that things may change

The team members on your helix project will need to be flexible and accepting of change. I think it’s worth being prepared for some open discussions within your team about what goes in the foundation layer and what goes in the feature layer. It’s certainly going to be open to interpretation and a topic of debate. A team that can work together and be open to a change of direction within their code structure will help the code base stay within the helix guidelines as the project evolves.

Good luck!

 

Helix Template Inheritance

During Sitecore Helix development it’s important to understand the role that template inheritance plays in keeping your dependencies in check.

The current convention shown in the Habitat example site is to:

  • Create your template fields in modules within the foundation and feature layers, with template names starting with a ‘_’.
  • Inside the project layer, create appropriate templates that inherit from the feature and foundation layer templates.

Each project layer template should inherit from one or more feature / foundation layer templates.

We believe the benefits of sticking to this approach are as follows:

  • Your project layer templates can be composed of template fields from multiple modules in other layers, so page templates can contain the functionality from multiple modules if need be.
  • Your content tree does not directly rely on feature and foundation layer templates making their removal down the track easier if need be.

Data Source Locations

Helix introduces the concept of Data Source Locations which are documented on this page.

Data source locations are very useful for supporting multi-tenant, multi-site solutions. You don’t have to use them; however, using them does future-proof your Helix solution by making it possible to add additional tenants down the track.

The datasource location and template resolution have been extended in the Habitat project. This means that it is also possible to define datasource templates and locations for each site, in addition to on the rendering itself. This is done through an extension of the getRenderingDatasource pipeline and the addition of a site: prefix to the Datasource Location field.

Site:prefix

[Screenshot: an IFrame rendering with its Datasource Location field set to site:iframe]

The above IFrame rendering uses the syntax site:iframe which via the getRenderingDatasource pipeline will look for a data source location within the site called ‘iframe’.

This is great for multi-tenant situations but it also means that the helix dependency rules are not broken. Pointing the data source location field of your rendering to a folder within a website technically breaks the Helix dependency rules. This is a very important point and one that might be missed on a first pass through the Helix documentation.
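As a rough mental model of what the site: prefix does, the sketch below resolves a Datasource Location value against a per-site lookup. This is not Habitat's pipeline processor; the dictionary, the function name and the content path are hypothetical and exist only to show the idea that the rendering stores a name while each site decides which folder that name points to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-site settings: setting name -> content folder for that site.
var siteDatasourceLocations = new Dictionary<string, string>
{
    ["iframe"] = "/sitecore/content/Bikes R Us/Global/IFrames"
};

// Resolves a Datasource Location value. Values starting with "site:" are
// looked up against the current site's settings; anything else is returned as-is.
string ResolveDatasourceLocation(string fieldValue)
{
    const string prefix = "site:";
    if (!fieldValue.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
    {
        return fieldValue;
    }

    var name = fieldValue.Substring(prefix.Length);
    return siteDatasourceLocations.TryGetValue(name, out var location)
        ? location
        : string.Empty; // no location configured for this site
}

Console.WriteLine(ResolveDatasourceLocation("site:iframe"));
```

Because the rendering only holds the name, nothing in the feature layer points at a folder inside a particular website, which is what keeps the dependency direction intact.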

The trap of using Feature Templates

Developers who are new to Helix may not realise that any feature layer template should have a matching template in the project layer that inherits from it.

A follow-on effect from this is that you may skip the creation of a project layer template altogether and simply use the feature layer template in the “Datasource Template” field.

Technically, the above dependency is not incorrect. Habitat has one more surprise in store for you, however. Once you start to add content blocks to your page based on the data source locations, your insert options for the Local data sources folder start to get automatically populated.

So for each unique component you add to the page via the Experience Editor, you’re going to get the template of that component in the insert options.

Once again, this dependency is not technically incorrect, although it will probably have your testers asking why strangely named templates are showing up in the insert options. The major drawback we could think of (as mentioned earlier in the article) is that your content items will be highly dependent on that particular feature module.

  • You will have multiple dependencies between content items and the feature modules.
  • Attempting to remove the feature layer will disrupt all of those content items directly.

On the flip side, using the project template instead means we have a single (or reduced) point of removal for that feature module within the project templates.

The main takeaway from running into the above mistake (using the feature template in our data source locations) was that we should be pointing the data source locations at a project template instead.

  • Always create a project layer template that will inherit from a feature/foundation module template.
  • Ideally content items shouldn’t reference feature layer templates directly, even though this doesn’t break the helix dependency rules. 

 


Getting settings from Sitecore config that aren’t in the settings section

Recently I had to write some code that got the value of the scheduling frequency in Sitecore config.  Normally, to get a value from Sitecore config, I’d simply use:

Sitecore.Configuration.Settings.GetSetting("yoursettingsname")

However, in this case, the value that I want is not in the <settings> node and the above code won’t work.  The value I want is in the <scheduling> node:

 <!-- SCHEDULING -->
 <scheduling>
 <!-- Time between checking for scheduled tasks waiting to execute -->
 <frequency>00:05:00</frequency>

To get around this, I used:

Sitecore.Configuration.Factory.GetConfigNode("scheduling/frequency")?.InnerXml

The above code should work for any Sitecore config node, just by specifying the correct XPath location for the value you require.


Sitecore Geo-Spatial Results with Azure Search

This is the second blog post in this series. In the first blog post, I spoke about setting up Azure Search.

In this post, I am going to talk about using the Geo-Spatial lookup query that Azure Search has built in, and the associated index data type called Edm.GeographyPoint.

The Azure Search overview can be found here and the configuration document here.

If you’re simply looking for a working example jump over to our GitHub repository that contains the full working helix site.

https://github.com/Aceik/Sitecore-Azure-Search-Spatial

You can also find the latest version of the Sitecore package here:

https://github.com/Aceik/Sitecore-Azure-Search-Spatial/tree/master/lib/Package


Notice that the configuration document lists the supported EDM data types and references the Microsoft document containing the same list.

A notable exception is that Edm.GeographyPoint is mentioned in the Microsoft document but not in the Sitecore EDM list. So if you need to implement a geo-spatial lookup and get back results based on a latitude and longitude, that isn’t possible with the Sitecore Azure Search provider out of the box.

If you dig a little deeper you can see that Edm.GeographyPoint is not in the search code. Have a look at the source code of CloudSearchTypeMapper.GetNativeType() in Sitecore.ContentSearch.Azure.Schema.dll.

We asked Sitecore support when and if Edm.GeographyPoint would be supported by the Sitecore provider, and the answer was that there are no plans in the immediate future.

The solution we came up with to solve this problem essentially involved two steps:

  • Getting data of type Edm.GeographyPoint into the Azure Search Index.
  • Performing a Spatial Query on the Azure Search Index (using geo.distance) in a timely manner.

To support both of the above operations we had to do a little digging into the core Sitecore Azure search code. Read on to find out about the solution we came up with or jump on over to the GitHub repository to look over the code.

Getting data into the index

As noted above, the Edm.GeographyPoint data type simply isn’t coded into the Sitecore DLLs.  You can see this by looking at CloudSearchTypeMapper.GetNativeType() inside the DLL Sitecore.ContentSearch.Azure.Schema.dll.

If you’re still not convinced, have a look in Sitecore.ContentSearch.Azure.Http.dll:

EdmTypes

The next thing we tried was adding Edm.GeographyPoint as a field converter.

Once again we hit a brick wall with this solution as that doesn’t change the fact that Edm.GeographyPoint is not actually a known cloud type in the core DLLs.

If you look at the error message you get when attempting to add Edm.GeographyPoint as a Converter it tells us that CloudIndexFieldStorageValueFormatter.cs is attempting to look up the native EDM type.

Exception: System.NotSupportedException
Message: Not supported type: ‘Edm.GeographyPoint’
Source: Sitecore.ContentSearch.Azure
   at Sitecore.ContentSearch.Azure.Schema.CloudSearchTypeMapper.GetNativeType(String edmTypeName)
   at Sitecore.ContentSearch.Azure.Converters.CloudIndexFieldStorageValueFormatter.FormatValueForIndexStorage(Object value, String fieldName)
   at Sitecore.ContentSearch.Azure.CloudSearchDocumentBuilder.AddField(String fieldName, Object fieldValue, Boolean append)
….

Working around this limitation took a bit of trial and error but eventually, we nailed it down to the following steps:

  1. Create a new search configuration called spatialAzureIndexConfiguration.
  2. Create a new index configuration based on the new configuration in step 1.
  3. In the search configuration from step 1:
    • Add the following computed field.
    • <fields hint="raw:AddComputedIndexField">
       <field fieldName="geo_location" type="Aceik.Foundation.CloudSpatialSearch.IndexWrite.Computed.GeoLocationField, Aceik.Foundation.CloudSpatialSearch" />
       </fields>
    • Add a new cloud type mapper to map to Edm.GeographyPoint
    • <cloudTypeMapper ref="contentSearch/indexConfigurations/defaultCloudIndexConfiguration/cloudTypeMapper">
       <maps hint="raw:AddMap">
       <map type="Aceik.Foundation.CloudSpatialSearch.Models.GeoJson, Aceik.Foundation.CloudSpatialSearch" cloudType="Edm.GeographyPoint"/>
       </maps>
       </cloudTypeMapper>
    • Replace the standard index field storage value formatter with the following class.
    • <indexFieldStorageValueFormatter type="Aceik.Foundation.CloudSpatialSearch.IndexWrite.Formatter.GeoCloudIndexFieldStorageValueFormatter, Aceik.Foundation.CloudSpatialSearch">
       <converters hint="raw:AddConverter">
         ...
       </converters>
       </indexFieldStorageValueFormatter>
  4. Our new formatter, GeoCloudIndexFieldStorageValueFormatter.cs, inherits from the formatter in the core DLLs and overrides the method FormatValueForIndexStorage. It detects whether the field name contains “_geo” and, if so, prevents CloudSearchTypeMapper.GetNativeType from ever running and falling over.

 

The computed field GeoLocationField returns a GeoJson POCO, which is serialised to the GeoJSON format (http://geojson.org/) required for storage in the Azure Search index.
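To make the stored shape concrete, here is a minimal JavaScript sketch of a GeoJSON Point of the kind an Edm.GeographyPoint field holds. The helper name toGeoJsonPoint and the sample coordinates are assumptions for illustration only; the repository's actual implementation is the C# GeoJson POCO mentioned above. The detail that trips most people up is that GeoJSON puts longitude before latitude.

```javascript
// Hypothetical helper (not taken from the repository): builds a GeoJSON Point.
// GeoJSON orders coordinates as [longitude, latitude], not [latitude, longitude].
function toGeoJsonPoint(latitude, longitude) {
  if (!Number.isFinite(latitude) || Math.abs(latitude) > 90) {
    throw new RangeError("latitude must be a number between -90 and 90");
  }
  if (!Number.isFinite(longitude) || Math.abs(longitude) > 180) {
    throw new RangeError("longitude must be a number between -180 and 180");
  }
  return { type: "Point", coordinates: [longitude, latitude] };
}

// Example: an illustrative point in Melbourne, Australia.
var point = toGeoJsonPoint(-37.8136, 144.9631);
console.log(JSON.stringify(point));
// prints {"type":"Point","coordinates":[144.9631,-37.8136]}
```

Validating the ranges up front is a design choice for the sketch: a swapped latitude and longitude often falls outside the valid latitude range, so it can fail loudly at index time rather than silently returning odd search results.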

Getting data out of the index

The GitHub repository demonstrates two approaches.

Both approaches ultimately allow you to perform an OData expression query against your Azure Search indexes. Using the “geo.distance” filter allows us to perform spatial queries using the computing power of the cloud.
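As a rough illustration of what such an OData expression looks like, the JavaScript sketch below builds the filter and ordering strings for a radius search. The helper names are hypothetical, the field name geo_location is borrowed from the computed field configured earlier, and the radius is in kilometres, which is the unit geo.distance works in. This is not the repository's code, just the string format the query relies on.

```javascript
// Hypothetical helpers: build the OData strings for an Azure Search spatial query.
function geoDistanceExpression(fieldName, latitude, longitude) {
  // The POINT literal takes longitude first, then latitude.
  return "geo.distance(" + fieldName + ", geography'POINT(" + longitude + " " + latitude + ")')";
}

function buildSpatialQuery(fieldName, latitude, longitude, radiusKm) {
  var expression = geoDistanceExpression(fieldName, latitude, longitude);
  return {
    filter: expression + " le " + radiusKm, // only documents within radiusKm kilometres
    orderby: expression + " asc"            // nearest documents first
  };
}

var query = buildSpatialQuery("geo_location", -37.8136, 144.9631, 10);
console.log(query.filter);
// prints geo.distance(geo_location, geography'POINT(144.9631 -37.8136)') le 10
```

The two strings would be passed as the $filter and $orderby parameters of the search request.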

The only downside is that the search results returned don’t actually include the distance as a value. They are correctly filtered and ordered by distance, yet the distance value itself is not returned. We suggest voting for this change on this ticket :). For now, we suggest using System.Device.Location.GeoCoordinate to work out the distance between two coordinates.
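If you need the distance somewhere other than .NET (for example in client-side script that renders the results), a great-circle calculation is easy to sketch. The JavaScript below uses the haversine formula with an assumed mean Earth radius of 6371 km; the function names are hypothetical and the result is an approximation on a sphere, so expect small differences from other implementations.

```javascript
// Assumed mean Earth radius; the result is an approximation on a sphere.
var EARTH_RADIUS_KM = 6371;

function toRadians(degrees) {
  return degrees * Math.PI / 180;
}

// Hypothetical helper: great-circle distance in kilometres (haversine formula).
function haversineKm(lat1, lon1, lat2, lon2) {
  var dLat = toRadians(lat2 - lat1);
  var dLon = toRadians(lon2 - lon1);
  var a = Math.pow(Math.sin(dLat / 2), 2) +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.pow(Math.sin(dLon / 2), 2);
  // Clamp to 1 to guard against floating point drift for antipodal points.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(a)));
}

// One degree of longitude along the equator is roughly 111.19 km.
console.log(haversineKm(0, 0, 0, 1).toFixed(2));
```

Because the index has already filtered and ordered the results, this only needs to run once per returned result to display the distance.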


References:

In researching for possible solutions already out there we had a look at the following implementations for Lucene and SOLR:

  1. Sitecore Solr Spatial: https://github.com/ehabelgindy/sitecore-solr-spatial
  2. Sitecore Lucene Spatial: https://github.com/aokour/Sitecore.ContentSearch.Spatial

Special Thanks:

  • Ed Khoze –  For the research he did and his help with the initial prototype, plus Google address lookup advice.
  • Jose Dominguez – For getting the OData filtering query working.

 


WFFM 170518 Tips & Tricks

Below you will find some tips and tricks that we found when implementing WFFM for Sitecore 8.2 Update 4.

Version: Web Forms for Marketers 8.2 rev. 170518

  1. You can’t mix Web Forms and the MVC Form rendering
    • 49048 16:14:46 ERROR Rendering control {F2CCA16D-7524-4E99-8EE0-78FF6394A3B3} not found for '/sitecore/content/Home'. Item id: {110D559F-DEA5-42EA-9C1C-8A5DF7E70EF9}, database: master.
    • I hit this problem on a vanilla install of 8.2.4 with WFFM installed. If you try to add a form to the homepage item via the supplied MVC form rendering, expect this problem.
    • Cause: The sample layout provided for the default homepage uses a .aspx page.
    • Solution: You need to create a layout using MVC and assign it to the page you want to add the MVC form rendering to.
  2. SIM (Sitecore Instance Manager) hangs when installing Sitecore 8.2.4 and Web Forms for Marketers 8.2 rev. 170518
    • Cause: We are not sure of the exact cause but installing this package with SIM results in the process crashing and burning.
    • Solution:  Install Sitecore 8.2.4 with SIM and, once it is up and running, install WFFM via the package manager in Sitecore instead.
  3. If your site does not have xDB available, you may want to use an SQL data store for your forms.
    • Not every site requires xDB to be turned on, and the additional hosting and license cost may force your hand on this one.
    • To revert to the SQL database, locate the DB files installed in the Data folder in the root of your website and attach the databases to your SQL Server.
      • The database should contain two tables.
    • Add the database connection string to the file ConnectionStrings.config in the folder: App_Config
      • If you’re building with Helix, create a transform file in your Foundation.Forms project.
      • foundationforms
      • <?xml version="1.0" encoding="utf-8"?>
        <connectionStrings xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
         <add name="wfm"
         connectionString="Data Source=WINDEV;Initial Catalog=Sitecore_Wffm;Integrated Security=False;User ID=sa;Password=itsasecret"
         xdt:Transform="Insert"/>
        </connectionStrings>

         

    • Create a patch config file to turn on sqlFormsDataProvider:
      <configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
       <sitecore>
       <wffm>
       <analytics>
       <formsDataProvider patch:instead="*[@ref='/sitecore/wffm/analytics/analyticsFormsDataProvider']" ref="/sitecore/wffm/analytics/sqlFormsDataProvider"/>
       <sqlFormsDataProvider type="Sitecore.WFFM.Analytics.Providers.SqlFormsDataProvider, Sitecore.WFFM.Analytics">
       <param name="connectionStringName">wfm</param>
       </sqlFormsDataProvider>
       </analytics>
       </wffm>
       </sitecore>
      </configuration>

       

    • Note that when using the SQL data provider, the “Detailed reports” tab will not work, as it requires xDB.
  4. If you’re using the Send Email Message action, located here: /sitecore/system/Modules/Web Forms for Marketers/Settings/Actions/Save Actions/Send Email Message
    • You’re going to want to override the default mail server settings before deploying.
    • Create a patch config file for these settings
    • <configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
       <sitecore>
       <settings>
       <setting name="MailServer" value="smtp.sendgrid.net" />
       <!-- MAIL SERVER USER
       If the SMTP server requires login, enter the user name in this setting
       -->
       <setting name="MailServerUserName" value="apikey" />
       <!-- MAIL SERVER PASSWORD
       If the SMTP server requires login, enter the password in this setting
       -->
       <setting name="MailServerPassword" value="SG.sdsdadXXCASFDa3423dfaXXXXXXXXXXXXXXXXXXXX" />
      
      <setting name="MailServerPort" value="25"/>
       </settings>
       </sitecore>
      </configuration>
    • This is handy in Helix setups.
    • If you’re using Azure, why not take advantage of the SendGrid free tier? You can send a bunch of emails for free.
    • Also note that on the Send Email Message action you will need to set the host as well.
    • hostparameter

 



Setting up Azure Search in Sitecore Helix

This applies to those setting up Azure Search to work with a Sitecore Helix architecture based on the Habitat example and using Sitecore 8.2 Update 3.

Depending on your use case, Azure Search is a smart option if you’re hosting in the cloud. The main upside it likely has over SOLR is that it’s run as PaaS and you don’t need to delve into the SOLR settings files in order to configure scaling and replication. Nor does it require a virtual machine running Tomcat that you may at times need to know how to administer. I’m not suggesting that SOLR isn’t a great project (it is) but it does have a learning curve if you need to tweak things under the hood. Azure Search also has its limitations as listed on this page.

If you’re following the setup instructions (like I was) there is a fairly lengthy step that involves you switching over configuration files. Basically, on a default installation, you need to disable/delete all Lucene file system based configuration files and enable a number of Azure configuration files for Azure Search PaaS.

Using Helix and gulp in Visual Studio all members of your team are probably going to need to replicate these setup steps (without doing anything manually in the running IIS developer instance).  So here are two tricks I used:

ConnectionStrings

Under your Foundation.Indexing module I placed a file called:

CaptureConnectionStrings

ConnectionString.config.transform

<?xml version="1.0" encoding="utf-8"?>

<connectionStrings xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
 <add name="cloud.search"
 connectionString="serviceUrl=https://mysearchinstance.search.windows.net;apiVersion=2015-02-28-Preview;apiKey=7CA4...5"
 xdt:Transform="Insert"/>
</connectionStrings>

 

Gulp Config Rename

In order to automate the deletion of all the Lucene config files I wrote the following Gulp task:

/*****************************
 https://doc.sitecore.net/sitecore_experience_platform/setting_up_and_maintaining/search_and_indexing/configure_azure_search#_Create_a_Search
 If running on Azure Search we need to disable the default Lucene configs.
 Assumes these are already declared in your gulpfile:
   var gulp = require("gulp");
   var rename = require("gulp-rename");
   var del = require("del");
 and that config.websiteRoot points at the website folder.
*****************************/

// Copies each X.config file to X.config.disabled, then deletes the original.
// The delete only starts once the renamed copies have been written.
function disableConfigs(files, folder) {
  return new Promise(function (resolve, reject) {
    gulp.src(files)
      .pipe(rename(function (path) { path.extname = ".config.disabled"; }))
      .pipe(gulp.dest(folder))
      .on("finish", resolve)
      .on("error", reject);
  }).then(function () {
    return del(files, { force: true });
  });
}

gulp.task('Z-Search-Disable-Lucene-Configs', function () {

  var root = config.websiteRoot + "/App_Config/Include/";
  var socialRoot = config.websiteRoot + "/App_Config/Include/social/";
  var listManagementRoot = config.websiteRoot + "/App_Config/Include/ListManagement/";
  var fxmRoot = config.websiteRoot + "/App_Config/Include/FXM/";
  var testingRoot = config.websiteRoot + "/App_Config/Include/ContentTesting/";

  // Main root config folder
  var roots = [
    root + "Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config",
    root + "Sitecore.ContentSearch.Lucene.Index.Analytics.config",
    root + "Sitecore.ContentSearch.Lucene.Index.Core.config",
    root + "Sitecore.ContentSearch.Lucene.Index.Master.config",
    root + "Sitecore.ContentSearch.Lucene.Index.Web.config",
    root + "Sitecore.Marketing.Lucene.Index.Master.config",
    root + "Sitecore.Marketing.Lucene.Index.Web.config",
    root + "Sitecore.Marketing.Lucene.IndexConfiguration.config",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Master.config",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.Index.Web.config",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Lucene.IndexConfiguration.config",
    root + "Sitecore.Speak.ContentSearch.Lucene.config",
    root + "Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.Xdb.config"
  ];

  var fxmFiles = [
    fxmRoot + "Sitecore.FXM.Lucene.DomainsSearch.DefaultIndexConfiguration.config",
    fxmRoot + "Sitecore.FXM.Lucene.DomainsSearch.Index.Master.config",
    fxmRoot + "Sitecore.FXM.Lucene.DomainsSearch.Index.Web.config"
  ];

  var listManFiles = [
    listManagementRoot + "Sitecore.ListManagement.Lucene.Index.List.config",
    listManagementRoot + "Sitecore.ListManagement.Lucene.IndexConfiguration.config"
  ];

  var socialsFiles = [
    socialRoot + "Sitecore.Social.Lucene.Index.Master.config",
    socialRoot + "Sitecore.Social.Lucene.Index.Web.config",
    socialRoot + "Sitecore.Social.Lucene.IndexConfiguration.config",
    socialRoot + "Sitecore.Social.Lucene.Index.Analytics.Facebook.config"
  ];

  var testingFiles = [
    testingRoot + "Sitecore.ContentTesting.Lucene.IndexConfiguration.config"
  ];

  // Returning the promise lets gulp know when the task has really finished.
  return Promise.all([
    disableConfigs(roots, root),
    disableConfigs(fxmFiles, fxmRoot),
    disableConfigs(listManFiles, listManagementRoot),
    disableConfigs(socialsFiles, socialRoot),
    disableConfigs(testingFiles, testingRoot)
  ]);
});

 

In order to automate the renaming of all the Azure config files I wrote the following Gulp task:

 

gulp.task('Z-Search-Enable-Azure-Configs', function () {

  var root = config.websiteRoot + "/App_Config/Include/";
  var socialRoot = config.websiteRoot + "/App_Config/Include/social/";
  var listManagementRoot = config.websiteRoot + "/App_Config/Include/ListManagement/";
  var fxmRoot = config.websiteRoot + "/App_Config/Include/FXM/";
  var testingRoot = config.websiteRoot + "/App_Config/Include/ContentTesting/";

  // Writes a copy of each X.config.disabled file as X.config in the same folder.
  // gulp-rename sees basename "X.config" and extname ".disabled", so clearing
  // extname is all that is needed. The .disabled originals are left in place.
  function enableConfigs(files, folder) {
    return new Promise(function (resolve, reject) {
      gulp.src(files)
        .pipe(rename(function (path) { path.extname = ""; }))
        .pipe(gulp.dest(folder))
        .on("finish", resolve)
        .on("error", reject);
    });
  }

  // Main root config folder
  var roots = [
    root + "Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config.disabled",
    root + "Sitecore.ContentSearch.Azure.Index.Analytics.config.disabled",
    root + "Sitecore.ContentSearch.Azure.Index.Core.config.disabled",
    root + "Sitecore.ContentSearch.Azure.Index.Master.config.disabled",
    root + "Sitecore.ContentSearch.Azure.Index.Web.config.disabled",
    root + "Sitecore.Marketing.Azure.Index.Master.config.disabled",
    root + "Sitecore.Marketing.Azure.Index.Web.config.disabled",
    root + "Sitecore.Marketing.Azure.IndexConfiguration.config.disabled",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.Index.Master.config.disabled",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.Index.Web.config.disabled",
    root + "Sitecore.Marketing.Definitions.MarketingAssets.Repositories.Azure.IndexConfiguration.config.disabled"
  ];

  // Testing files
  var testingFiles = [
    testingRoot + "Sitecore.ContentTesting.Azure.IndexConfiguration.config.disabled"
  ];

  // FXM files
  var fxmFiles = [
    fxmRoot + "Sitecore.FXM.Azure.DomainsSearch.DefaultIndexConfiguration.config.disabled",
    fxmRoot + "Sitecore.FXM.Azure.DomainsSearch.Index.Master.config.disabled",
    fxmRoot + "Sitecore.FXM.Azure.DomainsSearch.Index.Web.config.disabled"
  ];

  var listManFiles = [
    listManagementRoot + "Sitecore.ListManagement.Azure.Index.List.config.disabled",
    listManagementRoot + "Sitecore.ListManagement.Azure.IndexConfiguration.config.disabled"
  ];

  var socialsFiles = [
    socialRoot + "Sitecore.Social.Azure.Index.Master.config.disabled",
    socialRoot + "Sitecore.Social.Azure.Index.Web.config.disabled",
    socialRoot + "Sitecore.Social.Azure.IndexConfiguration.config.disabled"
  ];

  // Returning the promise lets gulp know when the task has really finished.
  return Promise.all([
    enableConfigs(roots, root),
    enableConfigs(testingFiles, testingRoot),
    enableConfigs(fxmFiles, fxmRoot),
    enableConfigs(listManFiles, listManagementRoot),
    enableConfigs(socialsFiles, socialRoot)
  ]);
});
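Both tasks hinge on how gulp-rename splits a file name into basename and extname, which matches what Node's path module does. The sketch below mirrors that split with two hypothetical helpers (enabledName and disabledName are illustrative names, not part of gulp-rename) so you can check what a given config file would be renamed to before running the tasks.

```javascript
var path = require("path");

// Splits a file name the same way the rename callback sees it:
// everything after the last dot is the extname, the rest is the basename.
function splitName(fileName) {
  var parsed = path.parse(fileName);
  return { basename: parsed.name, extname: parsed.ext };
}

// "X.config.disabled" -> "X.config" (clearing the extname is enough).
function enabledName(fileName) {
  var parts = splitName(fileName);
  return parts.extname === ".disabled" ? parts.basename : fileName;
}

// "X.config" -> "X.config.disabled" (replace the extname with a longer one).
function disabledName(fileName) {
  var parts = splitName(fileName);
  return parts.extname === ".config" ? parts.basename + ".config.disabled" : fileName;
}

console.log(enabledName("Sitecore.ContentSearch.Azure.Index.Web.config.disabled"));
// prints Sitecore.ContentSearch.Azure.Index.Web.config
console.log(disabledName("Sitecore.ContentSearch.Lucene.Index.Web.config"));
// prints Sitecore.ContentSearch.Lucene.Index.Web.config.disabled
```

Because only the final ".disabled" segment counts as the extension, the basename of a disabled file already ends in ".config", so no extra string replacement on the basename is required when enabling it.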

 

I hope this saves you and your team some time when switching over to Azure Search on your Sitecore Helix project. Please leave us some comments if you have some more tips along these lines.