Chasing the new feature

Chasing new Feature

 
As FIM 2010 product wanders into the wild, inevitably more and more Identity Management professionals are working on the projects where they have to make design decisions regarding FIM’s implementation. With an addition of an "App Store" (The Web Portal) programmers are presented with much wider variety of available options. In the last few months I have been seeing, in my opinion, dangerous trend of "chasing the new feature" of the product in favor of more traditional design. Let’s step back and take a look at options that have been available to IdM professionals before FIM 2010 timeframe. I think that clear understanding of features and definition of goals should help an IT professional with identification of appropriate tool(s) to do the job.

Old Times

Since 2003 to 2010 customization of MIIS/ILM-based solution would entail you to write specialized .NET code. There were few predefined places in the product where you could plug your custom code. Most commonly used interface exposed by sync engine was and still is Microsoft.MetadirectoryServices.IMASynchronization;
That interface implementation would provide a programmer with an array of options to control behavior of objects that are coming from a connected directory (including advanced filtering, joining and projection), as well as similar functionality during the export; you could also express your business logic to create new entries in data-sources in the provisioning module(s). So from the year of 2003 to the year of 2010 life of MIIS/ILM professional was relatively easy (or at least well defined and outlined). Once you’ve fully understood the sequence of events and places you can affect the data, you are in the good place. At that point design of the right solution was a matter of properly aligning business requirements and the Sync Engine capabilities: synchronization and provisioning.

Standing by established methodology

Since its conception the Sync Engine was and still is a "state-based" solution, meaning that when we are working in the framework of Sync Engine we are not looking at transactions of objects in each connected data-source, but rather operating with the last-known-state of an entire directory. That rule is an important one to learn and fully embrace. I would argue that most computer systems are not "state-based" and most programming models are based on the assumption that you are working with a transaction. I have seen several projects that I was hired to fix, that were written by seasoned programmers. There were no errors in the code, as such, but rather design/architecture errors deriving from the fact that programmers didn’t take into account the fact that they have been working with state-based system. When "transaction" model is applied to the Sync Engine, it produces rather adverse results. State-base system relies on all of the data being present in the system at the moment when each individual attribute of an individual object is processed. When data is not available a programmer would, naturally, reach-out to an external data-source and access the data suitable for that object. And THAT is a design problem from Sync Engine stand point.
So why the external lookup is so bad, one would ask?

Don’t think outside of the box

Unlike your job interview in the early to mid-90s, when you practically had to say "I am thinking outside of the box" in your resume to be hired (that and an ability to spell word ‘Java’), designing well performing Sync Engine solution was and still is – making sure that you are fitting all data you want to access inside of the box. The Sync Engine box that is. The key word to remember here is "convergence". Data should ‘converge’ when it is fully synchronized and satisfies all business rules/requirements plugged-in into the system. Meaning that all business rules are satisfied and system has no pending changes for the object in question. Therefore having all data "in the box" will make your solution perform better, to be less complex and more manageable.
However unachievable, and "against the grain" some design decision might appear to be, you still should consider every possible option to avoid breaking the rule of "no external calls". So YOU should think outside of the box to make sure that your BOX left to think inside of itself.
Allow me to provide an example. Several years ago I was working on the project where we had to use Unique ID system. This unique ID system was responsible for distributing uniformly formatted IDs to ANY system in the enterprise, regardless of the platform, ownership or any other factor. The ID could be distributed to a system account, an employee, a contractor or an intern. Certain subset of users might never qualify to have an ID in Sync-Engine-managed system, yet they could have an ID in other systems and therefore would have a record in the Unique ID system. The only attributes that system have had exposed to us are ID, isAvaialbe, DateOfAssignment
When I came aboard, on that project, the solution for this "problem" was an external call to the Unique ID system out from Import Attribute Flow in HR Management Agent to get next available ID and mark it as "reserved".
The first problem we have encountered with that design is "orphaned" IDs. Somehow we have "reserved" lots of IDs that have not been actually used by anybody. Troubleshooting revealed that when and if the object would fail to be provisioned, for whichever reason, Sync Engine would faithfully roll-back the transaction, however it would never release the ID that was used during Import Attribute Flow, therefore all consecutive runs will request yet another ID and another and another; as you can guess we have had plenty of "orphaned" IDs at that moment.
I have also seen number of "let’s call out and check for uniqueness" blocks of code within provisioning logic. That kind of practice generally slows down the system to the crawl due to the fact that every synchronization cycle would require system to call-out for every object that passing through the pipe.
If you are still not convinced the most commonly used "trick" of external call is a creation of home directories for roaming profiles. Sync Engine doesn’t come with management agent that would make a file system calls to physically create a "directory" object on the file system. I am not sure why that is, but I suspect that it have something to do with the fact that Microsoft doesn’t use roaming profiles internally. So every time your client asks you to create a directory – you make an external call during export operation. What is the harm is that? Consider following questions: a) Are you creating very first directory on that share? B) What happens with that directory during user de-provisioning? C) What happens if you’ll delete connector space and have to re-provision objects?
As you can see — if you are not managing (really managing) the directory, the record, the row, the ID, or whatever that is you are calling out for – you can’t guarantee convergence of the data, and therefore your solution have greater chance to fail or perform poorly under stress or during disaster (exactly at the time when you would want system to perform as reliably and as fast as possible)

Applying existing patterns on FIM portal

In my IdM 101 presentations to the clients I was often calling the ability to use code in Sync Engine "product’s greatest strength and product’s greatest weakness". With introduction of ‘The Portal’ that statement is more true than ever.
By contrast with the Sync Engine The Portal is a transaction-based system. It is "married" with the state-based Sync Engine by means of "special" management agent that is not quite the same as other management agents. Sync Engine is a delivery vehicle for The Portal and integral part of the product.
In the past few month I have observed a trend of using Portal for operations that is should not be used, in my honest opinion. I was talking with one of consultants in Europe and have heard that instead of trying to create an object with the Sync Engine (as it should be done for all managed objects), "we had one of our guys to write us a Workflow that would call Power Shell that would just create the object on the system for us". Frankly, that particular conversation generated the blog entry you are reading.
I believe that the problem comes with perception that "Hey! I am transaction-based! I can do whatever I want to do. And by the way – look, I can stick my code in this new place called "workflow". Exciting!"… And that is true statement. Portal provides more places to "stick" your custom code than Sync Engine dreamed of. You have several types of workflows, UI, etc. So what is the problem with that?
The problem is that we need to keep in mind that we are implementing an Identity Management solution(s), and not the chasing the most adventures way of creating new software. I am sure it’s cool to write a Workflow in .NET 4.0 with Visual Studio 2010 which will call PowerShell 2.0 which will user WinRM 2 to perform some wonderful operation right after user clicked the submit button. In fact it could be very well justified thing to do, but one should not forget about the data convergence paradigm.
Making external calls form portal is no deferent than making an external calls form the Sync Engine. Yes you can do it, but should you? Discarding previously accumulated knowledge and experience from your MIIS/ILM days is careless. Yes, your toolbox has expanded; your ability to execute some tasks right after your user clicks the submit button doesn’t change your goal to achieve seamless management of identity; the best way to do it is to make sure that your data is fully managed. Your design patterns should follow that rule and find best possible solution(s); even if it is not using trendiest technologies of this month.
Sync Engine is the most mature part of the product; it is a delivery system for your managed objects. There is no shame in using it to the fullest possible extent. Don’t flirt with your data – own it. That might culminate in you writing an extensible management agent, creation of new Metaverse object type, analyzing expected rule entry object of FIM, or configuration of an additional out-of-the-box management agent instance(s) to bring an object/attribute into the realm of managed identity. The overhead in time that you might spend upfront in doing so will pay-off during update of the system, during disaster recovery scenarios or troubleshooting.

How to decide which tool/method to use

The rules that I’ve discerned for myself in the last two years of working with FIM are simple. They are based on two assertions:
a) Portal is a customer-facing workflow-driven application.
b) Sync Engine is the delivery vehicle.
Do I need to persistently manage the object at all times (disaster recovery included) —  it’s a Sync Engine’s job
Do I need to make a decision on whether to allow or deny access to a particular user request — it’s a Portal’s job
And ‘yes’, there are plenty of "gray areas" and ‘no’ there is no definitive answer for every solution, nevertheless these rules helped me in navigating through the architectural decision making process.
I hope this "speaking out loud" entry will help you too
 
Advertisements
  1. DMitry – I hear you mate. I have been trying to put this argument into words for a long time, but I think you put the most compelling case yet for how NOT to do external calls with FIM. I love it :). Now all we need to do is improve the performance overhead of utils.findmventries 🙂

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: