Workflow Manager 1.0 Refresh Disaster Recovery (further) Explained

With the release of SharePoint 2013, Microsoft released a new platform for workflows called Workflow Manager (WFM). As of this writing the current version is 1.0 Cumulative Update 3. Unfortunately, disaster recovery (DR) for this product is not as straightforward as just setting up database replication.

The following is a list of resources I’ve used to implement disaster recovery:

I found that each of the above references holds vital clues to making DR for WFM work, but none of them covered the details on which I was stumbling. There are two basic concepts where I needed to do additional research:

    • Certificates (which ones to use where and how to restore effectively)
    • Changing service accounts and admin groups upon a failover

As pointed out, there are plenty of TechNet articles and blogs that talk about how to do WFM disaster recovery, so I am not going into detail on the individual steps. Instead, I decided to document my discoveries in hopes that others can benefit from my experiences.

So, at a high level, the basic operation is as follows. I’ll have sections below describing each of the areas where I had concerns:

    • Install production WFM and configure
    • Configure your backup/replication strategy for the WF/SB databases
    • Install WFM in DR
    • Execute the failover process
    • Re-connect SharePoint 2013
    • (Optional) Changing RunAsAccount and AdminGroup

Install Production WFM and Configure

Certificates – AutoGenerate or custom Certs?

Installing WFM 1.0 CU3 is fairly well documented in several places, but the one piece that I feel needs to be called out is certificate configuration. You can auto-generate your certificates (self-signed), use your own domain certificates, or use certificates acquired from a third-party certificate authority. Some businesses have no restrictions against self-signed certificates, but that choice affects how you restore service in the DR environment. As noted in Spencer’s blog, there are a total of six or seven possible certificates, and auto-generating the WFM certificates will dictate your restoration process in a failover scenario. One reason for this is that the WorkflowOutbound certificate is created with private keys, but they are non-exportable.

Configure Your Backup/Replication Strategy for the WF/SB Databases

The key to disaster recovery with WFM (as with many products) is the data store. In this case we are referring to the SQL Server databases. Again, this information is in the related links, but there are two things to keep in mind:

  1. You can use pretty much any replication method (backup/restore, mirroring, log shipping) except for SQL Server 2012 AlwaysOn, which is unsupported at this time. It is also crucially important to back up the WF/SB databases as close in time as possible to the content databases in order to preserve workflow instance integrity.

UPDATE: With the release of Workflow Manager CU 4, SQL AlwaysOn is now supported and should be considered as the High Availability/Disaster Recovery solution. You can find information on CU4 here. And you can find installation information here.

  2. You do not need to back up the management databases, WFManagementDb and SBManagementDb, as they will be re-created during the recovery process.

Install WFM in DR

Depending on whether you want a cold or warm standby WFM farm, you will either have already installed the servers or will perform this as part of your recovery process. NOTE: WFM does *not* support a hot standby configuration. There are a couple of keys to your DR installation:

  • You will install the bits on the DR app servers, but you will *not* configure the product at this time.
  • If you are choosing to do a warm standby, then you may also import the necessary certificates ahead of time.
    • If you are using:
      • Auto-generated certificates, then it’s important to know that you need to export/import the Service Bus certificates from Prod to DR and for the Workflow Manager certificates you can auto-generate them in DR (remember you cannot import/export the WF certificates because the private keys are marked as non-exportable)
      • Custom domain certificates, then you will export/import all of them from Prod to DR
  • The Service Bus root certificate should be imported into the LocalMachine\Trusted Root Certification Authorities store.
  • The other Service Bus certs should be imported into the LocalMachine\Personal store.
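
As a rough illustration of the two import targets above, here is a minimal sketch using the PKI module cmdlets (Import-Certificate and Import-PfxCertificate), which ship with Windows Server 2012 and later; on older servers you would use the Certificates MMC snap-in instead. The file paths and the idea of exporting the root as a .cer and the others as .pfx files are my assumptions, not part of the original procedure.

```powershell
# Hypothetical export locations; substitute the files you exported from Prod.
$rootCerPath = 'C:\certs\ServiceBusRoot.cer'
$pfxFolder   = 'C:\certs\servicebus'

# Password that protected the .pfx files at export time.
$pfxPassword = Read-Host -Prompt 'PFX password' -AsSecureString

# Service Bus root certificate -> Trusted Root Certification Authorities.
Import-Certificate -FilePath $rootCerPath -CertStoreLocation 'Cert:\LocalMachine\Root'

# Remaining Service Bus certificates (with private keys) -> Personal store.
Get-ChildItem -Path $pfxFolder -Filter '*.pfx' | ForEach-Object {
    Import-PfxCertificate -FilePath $_.FullName `
                          -CertStoreLocation 'Cert:\LocalMachine\My' `
                          -Password $pfxPassword
}
```

Run this from an elevated PowerShell session on each DR node, since both stores belong to the local machine.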

Executing the Failover Process

In the event of a disaster (or just a need to failover), the following process is required.

  1. Restore the 4+ SQL databases (WFResourceManagementDb, WFInstanceManagementDb, SBGatewayDatabase, SBMessageContainer01 – n) from prod_SQL to dr_SQL.
  2. Assuming the steps above have been followed to install WFM in DR, you then need to use PowerShell to restore the SB farm. If you were doing a true ‘cold standby’, then you first need to install (but not configure) the SB/WF bits from Web Platform Installer.
  3. Restore the SBFarm, SBGateway, and MessageContainer databases and settings (do this on only one WFM node)
      • The SBManagementDB will be created in DR during this ‘restore’ process
      • The RunAsAccount *must* be the same as the credentials used in production
  4. Again, using PowerShell, run Add-SBHost on each node of the farm.
  5. If you used auto-generated certificates for the WFFarm in prod, then when you restore the WFFarm you will auto-generate new ones. However, this also means that you may need to restore the PrimarySymmetricKey to the new SBNamespace.
  6. At this point, restore the WFFarm using PowerShell (do this on only one WFM node).
  7. Run Add-WFHost on each node of the farm.
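
To make the ordering of the steps above easier to follow, here is a hedged sketch of the cmdlet sequence. Every server, database, account, thumbprint, and path value is a placeholder I made up, and the parameters shown are only the subset I believe is typical; confirm the exact parameter sets with Get-Help (for example, Get-Help Restore-WFFarm -Full) before running anything. It also skips the PrimarySymmetricKey step, which only applies in the auto-generated certificate case.

```powershell
# --- Placeholders (assumptions): replace with values from your environment ---
$sqlServer = 'dr-sql'
$runAs     = 'CONTOSO\svc-wfm'        # must match the production RunAsAccount
$sbThumb   = '<SB certificate thumbprint>'
$runAsPwd  = Read-Host -Prompt 'RunAs password' -AsSecureString

function Get-ConnString([string]$Database) {
    # Builds a SQL connection string for a database on the DR SQL server.
    "Data Source=$sqlServer;Initial Catalog=$Database;Integrated Security=True;Encrypt=False"
}

$sbFarmDb    = Get-ConnString 'SbManagementDB'
$gatewayDb   = Get-ConnString 'SbGatewayDatabase'
$containerDb = Get-ConnString 'SBMessageContainer01'
$wfFarmDb    = Get-ConnString 'WFManagementDB'
$wfInstDb    = Get-ConnString 'WFInstanceManagementDB'
$wfResDb     = Get-ConnString 'WFResourceManagementDB'

# --- On ONE node: restore the Service Bus farm, gateway, and container(s) ---
Restore-SBFarm -RunAsAccount $runAs `
               -SBFarmDBConnectionString $sbFarmDb `
               -GatewayDBConnectionString $gatewayDb `
               -FarmCertificateThumbprint $sbThumb `
               -EncryptionCertificateThumbprint $sbThumb
Restore-SBGateway -SBFarmDBConnectionString $sbFarmDb -GatewayDBConnectionString $gatewayDb
Restore-SBMessageContainer -Id 1 -SBFarmDBConnectionString $sbFarmDb `
                           -ContainerDBConnectionString $containerDb

# --- On EVERY node: join the Service Bus farm ---
Add-SBHost -SBFarmDBConnectionString $sbFarmDb -RunAsPassword $runAsPwd -EnableFirewallRules $true

# --- On ONE node: restore the Workflow farm ---
Restore-WFFarm -RunAsAccount $runAs `
               -WFFarmDBConnectionString $wfFarmDb `
               -InstanceDBConnectionString $wfInstDb `
               -ResourceDBConnectionString $wfResDb `
               -InstanceStateSyncTime (Get-Date) `
               -ConsistencyVerifierLogPath 'C:\temp\wf-restore.log' `
               -Verbose

# --- On EVERY node: join the Workflow farm ---
$sbClient = Get-SBClientConfiguration -Namespaces 'WorkflowDefaultNamespace'
Add-WFHost -WFFarmDBConnectionString $wfFarmDb -RunAsPassword $runAsPwd `
           -SBClientConfiguration $sbClient -EnableFirewallRules $true
```

If you use custom certificates for the Workflow farm, Restore-WFFarm also needs the relevant thumbprints; if you auto-generate them, it needs a certificate auto-generation key instead.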

At this point, the new WF Farm should be in a working state. You can test this by navigating to the endpoint in a browser and you should receive output similar to the image below:

[Image: browser output returned by the Workflow Manager endpoint]

Re-connect SharePoint 2013

If WF certificates were re-generated in DR, then you will need to recreate the SharePoint Trusted Root Authority. Export the WF SSL certificate and add it to the SharePoint farm using New-SPTrustedRootAuthority.

Create a new registration to the Workflow farm using Register-SPWorkflowService.
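
For reference, a minimal sketch of those two SharePoint steps might look like the following. The certificate path, trust name, site URL, and Workflow Manager URL are all hypothetical; 12290 is the default HTTPS port, so adjust it if you chose a different one.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical path to the WF SSL certificate exported from the DR farm (public key only).
$cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 'C:\certs\WorkflowSsl.cer'

# Trust the regenerated certificate in the SharePoint farm.
New-SPTrustedRootAuthority -Name 'Workflow Manager DR' -Certificate $cert

# Point SharePoint at the DR workflow farm; -Force replaces the existing registration.
Register-SPWorkflowService -SPSite 'https://sharepoint.contoso.lab' `
                           -WorkflowHostUri 'https://wfm-dr.contoso.lab:12290' `
                           -Force
```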

There is a cache of security trusts, so in order to see the change more immediately you will likely need to execute the timer job “Refresh Trusted Security Token Services Metadata feed” with the following PowerShell:

Start-SPTimerJob -Identity 'RefreshMetadataFeed'

(Optional) Changing RunAsAccount and AdminGroup

Summary

The process above should work in most (if not all) scenarios, but I welcome any comments if you encounter problems or challenges. I’ve spent many hours on this, off and on, over the past 6 months, and it’s very possible that I’ve missed something.

I’ll add the last section about changing service accounts once I have the complete set of steps for WF accounts. Service Bus added PowerShell cmdlets, which makes this easier, but Workflow Manager has not yet done so.

UPDATE: With the release of Workflow Manager CU 4, one can now change the credentials for the Workflow Manager Service with the Set-WFCredentials PowerShell cmdlet. You can find information on CU4 here. And you can find installation information here.


Configuring UserPhotoExpiration for User Profile Photo Sync between Exchange 2013 and SharePoint 2013

Following on my previous post about different user profile photo options for SharePoint 2013, I wanted to expand on some research that I had done for one of my customers in this area regarding the expiration values. There are a couple of scarcely documented* properties that will also affect when a user’s photo is re-synchronized from Exchange 2013 instead of just using the cached photo in SharePoint 2013.

To cover the basics, I used this blog to configure the integration between SharePoint 2013 and Exchange 2013 for user photos. There are a couple of SPWebApplication properties that are set here:

  • UserPhotoImportEnabled – this property defines if SharePoint should import photos from Exchange
  • UserPhotoExpiration – this property defines (in hours) how long the photo in the user photo library of the MySite host should be considered valid before attempting to synchronize a potentially updated photo
  • UserPhotoErrorExpiration – this property tells SharePoint that if it encountered an error attempting to retrieve a new photo less than ‘this many’ hours ago, then it should not attempt again

These are fairly well-known properties, but there are a couple of others that affect how often or *if* your user photo sync will happen. These additional properties are contained in the web application property bag:

  • DisableEnhancedBrowserCachingForUserPhotos – (default: not present) If this property is set to ‘false’ or is not present, then SharePoint will bypass the blob cache. If the property is set to ‘true’, then SharePoint will check the timestamp and will bypass the blob cache if the timestamp passed in is within 60 seconds of the current time
  • AllowAllPhotoThumbnailsTriggerExchangeSync – (default: not present) If this property is set to ‘false’ or is not present, then SharePoint will only trigger a sync with Exchange if the thumbnail being requested is the Large thumbnail.

Below is a series of steps that I took in my lab to illustrate how these properties work.

Anne accesses her profile page to change her photo and it properly redirects her to Outlook Web App (OWA) where she can upload her latest professionally taken headshot (photo1.jpg). Upon completion, she returns to her profile page and sees the new photo. She also navigates to OWA and sees the photo there as well.

Adam navigates to Anne’s profile page and sees the recently uploaded photo.

Anne decides that she wants to upload a different photo (photo2.jpg) and does so through OWA instead of through her profile page. In this case photo2.jpg does not show up immediately in her profile page and she and other users are still seeing photo1.jpg; however OWA is showing photo2.jpg.

Why is SharePoint not updating the photo?

Basically, it depends on the settings above combined with the method used to change the photo. In the above scenario, Anne changed her photo the second time via OWA. SharePoint has no way to know that the photo was changed until its cached photo expiration (UserPhotoExpiration) value has passed since the first time the photo was changed. Even after the photo expiration has passed, there still has to be some action that triggers SharePoint to check. In this case, Anne navigating to her profile page (since it uses the large thumbnail) should trigger SharePoint to evaluate the expiration values and, if needed, re-sync Anne’s photo.
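
To make that decision easier to reason about, here is a small model of the logic as I understand it from my lab testing. This is not SharePoint’s actual code: the function name Test-PhotoSyncTriggered and its parameters are hypothetical, it ignores UserPhotoErrorExpiration, and whether the comparison at the exact expiration boundary is inclusive is my guess.

```powershell
function Test-PhotoSyncTriggered {
    param(
        [datetime]$LastSync,             # when SharePoint last cached the photo
        [datetime]$Now,                  # time of the current thumbnail request
        [double]$ExpirationHours,        # UserPhotoExpiration
        [ValidateSet('Small', 'Medium', 'Large')]
        [string]$Thumbnail,              # size of the thumbnail being requested
        [bool]$AllowAllThumbnails = $false  # AllowAllPhotoThumbnailsTriggerExchangeSync
    )

    # Only the Large thumbnail triggers a sync unless all sizes are allowed.
    $sizeCanTrigger = $AllowAllThumbnails -or ($Thumbnail -eq 'Large')

    # The cached photo must be older than the expiration window.
    $cacheExpired = ($Now - $LastSync).TotalHours -ge $ExpirationHours

    return ($sizeCanTrigger -and $cacheExpired)
}

$synced = Get-Date '2015-01-01T08:00:00'

# 2 hours after the last sync, 6-hour expiration: cache still valid.
Test-PhotoSyncTriggered -LastSync $synced -Now ($synced.AddHours(2)) -ExpirationHours 6 -Thumbnail Large   # False

# 7 hours later, but only a small thumbnail was requested.
Test-PhotoSyncTriggered -LastSync $synced -Now ($synced.AddHours(7)) -ExpirationHours 6 -Thumbnail Small   # False

# 7 hours later and the large thumbnail (profile page) was requested.
Test-PhotoSyncTriggered -LastSync $synced -Now ($synced.AddHours(7)) -ExpirationHours 6 -Thumbnail Large   # True
```

The second call is the case that surprised me most: an expired cache alone is not enough if nobody requests the large thumbnail.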

Why wouldn’t I reduce the UserPhotoExpiration value to 0 hours?

I’m sure that in some installations this would not be a problem, but the point of a cache (in this case SharePoint’s user photo library) is to reduce round-trips to the authoritative source of data. You likely do *not* want SharePoint to be contacting Exchange every time someone accesses their profile photo.

How do I set these properties?

In PowerShell, of course! For the first three above there are examples within this blog, but I’ll duplicate them here along with the other two. To reiterate, these values are for the sake of example and you should do your own testing to find out what works for your environment:

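# Example values only; test to find what works for your environment.
$wa = Get-SPWebApplication "https://my.contoso.lab"

# SPWebApplication properties
$wa.UserPhotoImportEnabled = $true
$wa.UserPhotoExpiration = 6        # hours
$wa.UserPhotoErrorExpiration = 1   # hours

# Property bag entries; indexer assignment avoids the error Add() throws if the key already exists
$wa.Properties["DisableEnhancedBrowserCachingForUserPhotos"] = "true"
$wa.Properties["AllowAllPhotoThumbnailsTriggerExchangeSync"] = "true"

$wa.Update()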

  

* – I say scarcely documented as I could find very few references to DisableEnhancedBrowserCachingForUserPhotos or AllowAllPhotoThumbnailsTriggerExchangeSync that were related to the topic. I did, however, find one that was helpful in directing my research: http://sharepoint.sigicom.ch/Blog/Beitrag/2/Profile-Picture-Cache. And since I am not fluent in German, I’m thankful for the translation tools we have available on the Internet.