Dominic Cronin's weblog

Showing blog entries tagged as: .NET

Stripping namespace declarations from XML

Posted by Dominic Cronin at Nov 19, 2017 12:30 PM |

I've recently been working on an application that will allow members of our content management teams to search within a chosen folder in Tridion for specific content. You might think that's well enough covered by the built-in search functionality, but we're heading towards a search and replace feature, so we pretty much have to process the content ourselves. In the end users' view of the world, a Rich Text field in a component has... well... a rich text view, and, for the power-users, a Source tab where you can see the underlying HTML. That's all fine, but once you get to the technical implementation, it's a bit more complicated, and we'll end up replicating some of Tridion's own smoke and mirrors to present a view to the users that's consistent with what they are used to. This means not only that we need to be able to translate from text to HTML, but also from "XML in the XHTML namespace" to HTML. One of the building blocks we need to do this is the ability to take XML with namespace declarations, and get rid of them so that the result isn't in a namespace.

A purist (such as myself) might say that the only correct way to parse XML is with an XML parser, and just in case you've never ended up there, I heartily recommend that you read this answer on Stack Exchange before proceeding further. Still - in this case, what I want to do is amenable to RegExes, and yes, I know: now I have two problems. Anyway - FWIW - I started this at the office, thinking I'd just quickly Google for a namespace-stripping regex and I'd be on my way. Suffice it to say that the Internet is rubbish at this. I ended up with a page of links to rubbish regexes that just weren't going to float my boat. So I mailed the problem to myself at home, and today, in the quiet of a Sunday morning, it didn't seem quite so daunting. Actually, I'm still considering whether an XML-parser approach, or an XSLT, might not be better, and I may end up there if my needs turn out to be more complex, but for now, here's the namespace stripper.

static Regex namespaceRegex = new Regex(@"
    xmlns              # literal
    (:[^\s=]+)?        # optionally, : followed by one or more non-whitespace, non-equals chars
    \s*                # optional whitespace
    =                  # literal
    \s*                # optional whitespace
    (?<quote>['""])    # Either a single or double quote - giving it the name 'quote' for back-reference
    .+?                # Non-greedily match anything
    \k<quote>          # The end-quote to match the one we found earlier
    ", RegexOptions.Singleline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

public static string RemoveNamespacesFromDocument(string xml)
{
    return namespaceRegex.Replace(xml, string.Empty);
}

Of course, this is written in C#, and I'm taking advantage of the IgnorePatternWhitespace feature in .NET regexes, which allows for the copious comments that might well be necessary if I ever have to actually read this code instead of just writing it. 

But just in case you are hardcore, and all that named matches and commenting fuss is for wusses, here's the TL;DR...

@"(?is)xmlns(:[^\s=]+)?\s*=\s*(['""]).+?\2"

What's not to like? :-) 
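To see what the regex does and does not do, here is a minimal sketch that runs the compact form against a small XHTML fragment. The sample markup, the class name NamespaceStripDemo and the method names are made up for illustration. For comparison it also includes a LINQ to XML variant. That is only one possible shape for the XML-parser approach mentioned above, not something taken from the post itself.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class NamespaceStripDemo
{
    // The compact form of the regex from the post.
    static readonly Regex NamespaceRegex =
        new Regex(@"(?is)xmlns(:[^\s=]+)?\s*=\s*(['""]).+?\2");

    // Removes the xmlns declarations as text; element and attribute
    // names (including any prefixes) are left exactly as they were.
    public static string StripWithRegex(string xml)
    {
        return NamespaceRegex.Replace(xml, string.Empty);
    }

    // Hypothetical parser-based alternative (an assumption, not the
    // author's code): rebuild the tree using local names only.
    public static XElement StripWithParser(XElement element)
    {
        return new XElement(
            element.Name.LocalName,
            element.Attributes()
                   .Where(a => !a.IsNamespaceDeclaration)
                   .Select(a => new XAttribute(a.Name.LocalName, a.Value)),
            element.Nodes().Select(n =>
                n is XElement child ? (XNode)StripWithParser(child) : n));
    }

    public static void Main()
    {
        string xhtml =
            "<p xmlns=\"http://www.w3.org/1999/xhtml\">Hello <b>world</b></p>";

        Console.WriteLine(StripWithRegex(xhtml));
        Console.WriteLine(
            StripWithParser(XElement.Parse(xhtml))
                .ToString(SaveOptions.DisableFormatting));
    }
}
```

Two things to watch for: the regex leaves the whitespace that preceded the declaration, and it does nothing about prefixes on element names, so markup like x:p stays prefixed after its xmlns:x declaration has gone. The parser variant drops prefixes too, but will throw if two attributes on one element end up with the same local name.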

Why the "new" Tridion events system is a game-changer

Posted by Dominic Cronin at Sep 14, 2013 07:45 PM |

When SDL released Tridion 2011, a lot had changed. So much so, that the introduction of a new Events system was almost unremarkable. After all, they had to replace the old one, so there was a new one. Nothing to see here, move along now please. Most of the effort in those days went into a flurry of upgrades and ports of old-style events systems to the new architecture. So you might be forgiven if you hadn't ever stopped to think just how much of a difference the new architecture makes. Specifically - we now subscribe to events using a mechanism based  on .NET multicast delegates. This has a couple of consequences.

Firstly, we are freed from the need to write dispatchers. To implement an events system with the old "COM+"-based system, you would implement an interface containing all the event handler methods, and register your implementation with a specific COM ProgID. Tridion would ask COM+ to instantiate an object of that ProgID, and merrily call into whichever of the interface methods were configured to be called. This meant there could only be one implementation. All your functionality had to be in that implementation, even if different parts of your system had different requirements.  So if, for example, you were using Tridion for your Internet site and for your intranet, or for whatever other reason you were running diverse sites, then you'd need a dispatcher. This would be a simple events system implementation that did nothing more than pass on the calls to one of several different implementations, usually depending on configuration. So calls coming from your Internet publications would go to one DLL, and the ones from your intranet would go to another, but Tridion itself would only see one interface: that of your dispatcher. This was quite a pain. You could separate out different concerns this way, but you wouldn't want to do more than carving it up into very big chunks. Like I said - Internet and intranet, or maybe different customers or departments. Nothing more fine-grained than that anyway. The new events system meant we didn't need to have a dispatcher any more, and the "configuration" could mostly be baked into the code itself.

For myself, (and I suspect for others), this was such a relief that it was enough. It wasn't until some time later that I realised that it was just a beginning. We'd got so used to limiting ourselves to big chunks that it didn't really sink in that we could really start slicing things up. The game-changer I referred to in the title of this piece is exactly that. We can slice it up as small as we want. OK - big deal, you might say - but if we can slice it up arbitrarily, then we can write an events system implementation for a single concern. And that means [ta-da!!] that we can start making re-usable modules that can just be "dropped in" on whatever project needs them. I recently wrote a Component Save event handler that enforces height and width constraints on multimedia components. It does one thing - that's all, so I can use it whenever I have that need. When I went to configure it, I noticed that on my research system I already have three other events handlers registered. These are all from Tridion, and belong to Audience Manager, UGC, and External Content Library respectively. Without looking, I don't know or care whether any of them subscribes to the Initiated phase of a Component Save. They can all co-exist.
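The co-existence described above comes straight from how multicast delegates behave in .NET. The following is a plain C# sketch of that mechanism only: it does not use the Tridion API, and the ComponentSaving event, the RaiseSave method and the two handlers are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MulticastDemo
{
    // Hypothetical stand-in for an event raised by a host system.
    // An event is backed by a multicast delegate, so it can hold
    // any number of handlers.
    public static event Action<string, List<string>> ComponentSaving;

    // Raises the event and returns whatever the handlers logged.
    public static List<string> RaiseSave(string title)
    {
        var log = new List<string>();
        var handlers = ComponentSaving;
        if (handlers != null)
        {
            handlers(title, log); // calls every subscriber in turn
        }
        return log;
    }

    public static void Main()
    {
        // Two unrelated "modules" subscribe without knowing about
        // each other; no dispatcher is needed to route the call.
        ComponentSaving += (title, log) => log.Add("size check: " + title);
        ComponentSaving += (title, log) => log.Add("audit trail: " + title);

        foreach (string line in RaiseSave("Logo"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Each handler deals with a single concern and can be added or removed without touching the other, which is the property that makes small drop-in event handlers practical.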

So now I'm looking forward to seeing a lot more (small and useful) events systems made available in the community - the days are gone when an events system only made sense for a single implementation.

Getting IIS Express to run in a 64 bit process, and other fun Tridion content delivery configurations

Posted by Dominic Cronin at Jul 24, 2013 07:55 PM |

In the last couple of days, I've spent far more time than I'd like figuring out how to get a Tridion-based web application to run correctly under Visual Studio. There are three basic choices:

  1. Run it directly using Visual Studio
  2. Run it using IIS Express
  3. Run it using IIS (non-Express version)

As the application is intended to run on a 64 bit architecture, there are some challenges. Visual Studio runs in 32 bit mode, so the first option is out. Using full-on IIS is an attractive thought; you can manually configure the application pool to run in 64 bit mode. Unfortunately, getting a debug session up and running takes more configuration than that. You have to set up the web site correctly, and it was just too fiddly. I ran out of time, or steam or whatever. (Somebody will probably tell me it's easy, and I dare say it is when you know how, and aren't spending time you really should be spending on something else. Any hints are always welcome.)

Of course, with a Tridion site, half the game is making sure you have the correct DLLs in place for the processor architecture you are using. Along the way, I discovered that the quick and dirty way to tell if you have a 32 or 64 bit version of xmogrt.dll (Juggernet's "native" layer) is the size. The 64 bit version comes in at 1600KB and the 32 bit version is about half that at 800KB or so. This varies from version to version, so on a 2013 system, it's 1200ish/900ish KB, but once you get the hang of it, you can tell them apart at sight, which is pretty useful.  The other DLLs are also important, although as far as I can tell, only Tridion.ContentDelivery.AmbientData.dll is hard-compiled for 64 bit architecture, at least on the 2011 system I was working on. The rest of the .NET assemblies are compiled to MSIL, which of course, will run on either architecture.
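If judging by file size feels too rough, the machine type of a native DLL can also be read from its PE header. The sketch below is a suggested alternative, not part of the original investigation; the class name PeMachine and the method Describe are made up. It relies only on the documented PE layout: the offset of the PE header is stored at 0x3C, and the two bytes after the "PE\0\0" signature hold the machine type (0x014c for x86, 0x8664 for x64).

```csharp
using System;
using System.IO;

public static class PeMachine
{
    // Returns a short description of the machine type in a PE file.
    public static string Describe(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (var reader = new BinaryReader(stream))
        {
            // DOS header starts with "MZ" (0x5A4D, little-endian).
            if (reader.ReadUInt16() != 0x5A4D)
            {
                throw new InvalidDataException("Not an MZ executable.");
            }

            // Offset 0x3C holds the file offset of the PE header.
            stream.Seek(0x3C, SeekOrigin.Begin);
            int peOffset = reader.ReadInt32();

            // PE signature is "PE\0\0" (0x00004550, little-endian).
            stream.Seek(peOffset, SeekOrigin.Begin);
            if (reader.ReadUInt32() != 0x00004550)
            {
                throw new InvalidDataException("PE signature not found.");
            }

            // The machine type follows the signature directly.
            ushort machine = reader.ReadUInt16();
            switch (machine)
            {
                case 0x014c: return "x86 (32 bit)";
                case 0x8664: return "x64 (64 bit)";
                default: return "other (0x" + machine.ToString("X4") + ")";
            }
        }
    }

    public static void Main(string[] args)
    {
        // Pass one or more DLL paths on the command line.
        foreach (string path in args)
        {
            Console.WriteLine(path + ": " + Describe(path));
        }
    }
}
```

This is meant for native DLLs such as xmogrt.dll. Managed assemblies compiled to MSIL typically report the x86 machine type in this header even though they run on either architecture, so for those the assembly's ProcessorArchitecture is the thing to check.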

But I digress. The thing I wanted to blog (and this will definitely be tagged note-to-self) was how to get IIS Express to run in 64 bit mode. By default it runs on 32 bits, but if you follow this link:

http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/3254745-allow-for-iis-express-64-bit-to-run-from-visual-st

... you will find the following nugget of goodness:

You can configure Visual Studio 2012 to use IIS Express 64-bit by setting the following registry key:

reg add HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\11.0\WebProjects /v Use64BitIISExpress /t REG_DWORD /d 1

However, this feature is not supported and has not been fully tested by Microsoft. Improved support for IIS Express 64-bit is under consideration for the next release of Visual Studio.

Very handy indeed. Running under IIS Express is just one click of the button. Just works.

And by way of a PS. (Post Script that is, not PowerShell) here's how you find the processor architecture of a DLL (This time on my 2013 image.)

PS C:\inetpub\www.visitorsweb.local\bin> [reflection.assemblyname]::GetAssemblyName((resolve-path '.\Tridion.ContentDelivery.AmbientData.dll')).ProcessorArchitecture
MSIL

Well anyway - it's no fun scratching your head over stuff like this. Maybe this helps.

Debugging 64 bit Tridion content delivery on IIS 7.5

I'm currently developing a web application which will run on Windows 2008 R2 and which is intended to run in a 64bit Application pool. This means that I'm running IIS 7.5, and that the web application is installed with the 64 bit versions of the Tridion content delivery assemblies. As you'll know if you've tried to run this kind of web application in a 32 bit process, you pretty soon get exceptions telling you that you have an invalid format. This gets a little inconvenient if you just start to debug your web application in Visual Studio. By default, if you have a page selected, and hit the big green Run triangle, the page will launch in IIS Express. If you have IIS 7.5, then IIS Express runs a 32 bit process, so the default setup just isn't going to work for you.

So - what to do? I had two options:

  1. Configure the properties of the web application to debug using IIS rather than IIS Express
  2. Launch the web page directly from the browser, and attach the debugger to the correct w3wp.exe process.

 

To be honest, the second of these was the choice that most matched my usual debugging approaches. Having said that, I did try the first approach, but so far without success. Visual Studio 2012 has frozen on me a few times while trying this. I'm interested if anyone has any tips on getting this working, but right now, I'm happy enough that I was able to succeed in attaching a debugger to w3wp.exe.

My biggest challenge was to figure out which process I wanted to attach to. On my development server, I have quite a few web sites running, and it's not altogether obvious which w3wp.exe to attach to. Attaching to them all might work in a trivial case, but realistically, it takes quite a while to load all the dlls, and adding any more processes than necessary is just going to hurt too much. So - how do you find out which process it is?

The first step is to ensure you have the IIS PowerShell provider installed on your server. These days, this is shipped as a module, so if it's available on your system, you should be able to open a PowerShell prompt and type:

Get-Module -ListAvailable

If the response includes "WebAdministration" you are good to go. Just import the module as follows:

Import-Module WebAdministration

If this succeeds, you should be able to "change directory" into the IIS provider. (Although a PowerShell purist might prefer set-location... whatever floats your boat!)

cd IIS: 

If you can't find the module, then go into the Server Manager, and check that you have the relevant role services for IIS installed. On other platforms, you might find that you can install it from the WebInstaller from the MSDN web site.

Now you're ready to find the process that you want to attach to. Assuming that your application pool is called "MyApplicationPool", you can list its worker processes like this (or use "dir" or "ls"; like "gci", both are aliases for Get-ChildItem):

> gci IIS:\AppPools\MyApplicationPool\WorkerProcesses

Your output should look something like this:

Process  State      Handles  Start Time
Id
-------- -----      -------  ----------
2608     Running    776      1/2/2013 6:55:33 PM

This assumes, of course, that your app pool is actually running, but you'd have made sure it was before trying to debug it, right? Anyway - as you can see, the process id is there just to read off, and you can get straight on with your debugging session.

A poor man's Component synchroniser - or using the .NET framework to run XSLT from the PowerShell

Posted by Dominic Cronin at Aug 12, 2012 07:10 PM |

Just lately, I've been doing some work on porting the old Component Synchroniser power tool to the current version of Tridion. If you are familiar with the original implementation, you might know that it is based on a pretty advanced XSLT transformation (thankfully, that's not the part that needs porting), that pulls in data about the fields specified by the schema (including recursive evaluation of embedded schemas), and ensures that the component data is valid in terms of the schema. Quite often on an upgrade or migration project, any schema changes can be dealt with well enough by this approach, but sometimes you need to write a custom transformation to get your old component data to match the changes you've made in your schema. For example, the generic component synchroniser will remove any data that no longer has a field, but if you add a new field that needs to be populated on the basis of one of the old fields, you'll be reaching for your favourite XSLT editor and knocking up a migration transform.

This might sound like a lot of work, but very often, it isn't that painful. In any case, the XSLT itself is the hard part. The rest is just about having some boilerplate code to execute the transform. In the past, I've used various approaches, including quick-and-dirty console apps written in C#. As you probably know, in recent times, I've been a big fan of using Windows PowerShell to work with Tridion servers, and when I had to fix up some component migrations last week, of course, I looked to see whether it could be done with PowerShell. A quick Google led me (as often happens!) to Scott Hanselman's site where he describes a technique using NXSLT2. Sadly, NXSLT2 now seems to be defunct, and anyway it struck me as perhaps inelegant, or at the least less PowerShell-ish, to have to install another executable when I already have the .NET framework, with System.Xml.Xsl.XslCompiledTransform, available to me.

I've looked at doing XSLT transforms this way before, but there are so many overloads (of everything) that sometimes you end up being seduced by memory streams and 19 flavours of readers and writers. This time, I remembered System.IO.StringWriter, and the resulting execution of the transform took about four lines of code. The rest of what you see below is Tridion code that executes the transform against all the components based on a given schema. Sharp-eyed observers will note that in spite of a recent post here to the effect that I'm trying to wean myself from the TOM to the core service, this is TOM code. Yup - I was working on a Tridion 2009 server, so that was my only option. The good news is that the same PowerShell/XSLT technique will work just as well with the core service.

# Tridion 2009 TOM, via COM
$tdse = new-object -com TDS.TDSE

# Load and compile the migration transform once
$xslt = new-object System.Xml.XmlDocument
$xslt.Load("c:\Somewhere\TransformFooComponent.xslt")
$transform = new-object System.Xml.Xsl.XslCompiledTransform
$transform.Load($xslt)

# The StringWriter writes into the StringBuilder, which is cleared for each component
$sb = new-object System.Text.StringBuilder
$writer = new-object System.IO.StringWriter $sb

filter FixFooComponent(){
    $sb.Length = 0                          # discard the previous result
    $component = $tdse.GetObject($_, 2)     # 2 = open for editing
    $xml = [xml]$component.GetXml(1919)     # 1919 = XMLReadAll
    $transform.Transform($xml, $null, $writer)
    $component.UpdateXml($sb.ToString())
    $component.Save($true)                  # done editing, so check in
}

# Find every component (item type 16) that uses the schema, and fix each one
$schema = $tdse.GetObject("/webdav/SomePub/Building%20Blocks/System/Schemas/Foo.xsd",1)
([xml]$schema.Info.GetListUsingItems()).ListUsingItems.Item | ? {$_.Type -eq 16}| %{$_.ID} | FixFooComponent
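For anyone who prefers the console-app route, the same StringWriter technique looks roughly like this in C#. It is a sketch with no Tridion code at all: the stylesheet, the Content/OldField/NewField element names and the XsltDemo class are hypothetical, chosen only to show a field being renamed.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltDemo
{
    // Runs the stylesheet against the input and returns the result
    // as a string, using a StringWriter as the output target.
    public static string Transform(string xsltText, string xmlText)
    {
        var xslt = new XmlDocument();
        xslt.LoadXml(xsltText);

        var transform = new XslCompiledTransform();
        transform.Load(xslt);

        var input = new XmlDocument();
        input.LoadXml(xmlText);

        using (var writer = new StringWriter())
        {
            transform.Transform(input, null, writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical migration: move OldField's value into NewField.
        string xslt = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" omit-xml-declaration=""yes""/>
  <xsl:template match=""/Content"">
    <Content><NewField><xsl:value-of select=""OldField""/></NewField></Content>
  </xsl:template>
</xsl:stylesheet>";

        string xml = "<Content><OldField>abc</OldField></Content>";

        Console.WriteLine(Transform(xslt, xml));
    }
}
```

The stylesheet sets omit-xml-declaration because a StringWriter reports its encoding as UTF-16, and a declaration saying so is rarely what you want in XML that is about to be handed on as a string.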

Why should your Tridion GUI extension 'model' have its own service layer on top of the core service?

Posted by Dominic Cronin at Aug 08, 2012 06:49 PM |

I've spent some time lately looking at the architecture for the next phase of implementing the Component Synchroniser for the Tridion Power Tools project. This meant looking through most of the other power tools, because, of course, they are a great resource for anyone wanting to build a Tridion GUI extension. The down side is that sometimes, reading the code, you can observe a pattern being used, but it can be hard to tell why this would be a good or bad design. I'd noticed that the model of pretty much every power tool is implemented as a WCF service, often acting as a very thin wrapper around the core service client. As I was wondering about this, I posed the following question in the private chat channel used by the Tridion MVPs and community builders:

So if you're doing a gui extension, is it reckoned to be bad form to access the core service directly from your aspx. Or is it just coincidence that most (all?) of the power tools have an additional service layer?

This was enough to spark quite an informative debate, and in keeping with the spirit of the thing, I promised to write it up for general consumption. The contributors were Frank van Puffelen, Nuno Linhares, Peter Kjaer and Jeremy Grand-Scrutton.

The general feeling was that you ought to stick to the pattern I had observed in the power tools. The reasons were as follows:

  • Ease of coding - The Anguilla framework can automatically generate a JavaScript proxy for your service.
  • Maintainability - if you talk directly from JavaScript to the core service, you will not get any compile-time checks, whereas your own service layer would be built in .NET and would therefore have some defences against future (likely) changes in the core service client.
  • Consistency with the rest of the CME - In the CME, views are typically considered fully client-side. Where the CME does use Aspx, this is only to generate some HTML on the server, and typically not for implementation logic.
  • Known issues -  ASP.NET postbacks in Anguilla views have been known to cause problems for some people, since e.g. popups won't keep their state through a postback (or an F5 press for that matter).

 

According to these criteria, the actual design I was looking at could use the core service directly, as my idea was to generate some HTML. In practice, it turns out that there are other reasons to stick with the extra service layer. Even so, I'm very glad I asked the question, and that the answers I got were so informative. Thanks guys!

How to set up the location of your .NET .config file when you're doing COM+ interop

Posted by Dominic Cronin at Jul 31, 2006 10:00 PM |

Today I was struggling with trying to get my .NET application to pick up its config file. I'd tried creating a .config in the same directory as the application dll, but it didn't work. This was because the application in question is activated via COM+ interop using the /codebase setting. (In other words, the COM+ settings in the registry tell the COM+ loader to use the CLR, and additionally specify the location of the assembly DLL for the benefit of the .NET loader.) This means that as far as .NET is concerned, the directory isn't the "base" of the application.

 

This was a tough problem, and I might have ended up doing one of two ugly things to solve it:

  • Creating a %windir%/system32/dllhost.exe.config file
  • Adding my configuration settings to machine.config

Either of these would have polluted a much wider area than I'd have liked.

Then I came across a blog entry by Rinat Shagisultanov which gave me sufficient of a clue to find a much better solution. The basic idea is this:

  • Configure the application in COM+ (if necessary adding sufficient System.EnterpriseServices goo to make this work)
  • Create a directory and point to it from the ApplicationRootDirectory setting of the COM+ application
  • Create a .manifest file

Now the CLR will regard that directory as the base of your application, and look there for the config file.

 

Thanks Rinat.

Pulling website information out of the IIS metabase

Posted by Dominic Cronin at May 24, 2006 10:00 PM |

Someone recently asked me how to find the URL where the Tridion user interface is running. The idea was to automate the set-up for Tridion Site-Edit. The following snippet of code doesn't solve their problem, but for me it was a bit of fun exploring how to pull this data out of the IIS metabase using C#. Although this blog entry is categorised as "Tridion", this code isn't particularly Tridion-specific. It's sufficient to show that it's not particularly painful to work with IIS programmatically. You can also write to the metabase using similar techniques.

 

using System;
using System.Collections.Generic;
using System.Text;
using System.DirectoryServices;

namespace Hinttech.Dotnet.Samples.WebSiteDumper
{
    class Program
    {
        static void Main( string[] args )
        {
            using (DirectoryEntry webServers = new DirectoryEntry("IIS://localhost/W3SVC"))
            {
                foreach (DirectoryEntry server in webServers.Children)
                {
                    // Each child is a DirectoryEntry in its own right, so dispose it
                    // when we're done, whether or not it turns out to be the Tridion site
                    using (server)
                    {
                        PropertyValueCollection serverComment = server.Properties["ServerComment"];
                        if (serverComment.Value != null && serverComment[0].ToString() == "Tridion Content Manager")
                        {
                            Console.WriteLine("The Tridion web site is running on: ");
                            Console.WriteLine("===================================");
                            // If you want the https sites too, you need to do the same thing for "SecureBindings"
                            foreach (string serverBinding in server.Properties["ServerBindings"])
                            {
                                // A binding has the form "ipAddress:port:hostHeader"
                                string[] serverBindingParts = serverBinding.Split(':');
                                string ipAddress = serverBindingParts[0];
                                string port = serverBindingParts[1];
                                string hostHeader = serverBindingParts[2];
                                if (string.IsNullOrEmpty(ipAddress))
                                {
                                    ipAddress = "Default";
                                }
                                Console.WriteLine(
                                    "\tIP Address = {0}\n\tPort= {1}\n\tHostHeader= {2}",
                                    ipAddress, port, hostHeader);
                                Console.WriteLine("===================================");
                            }
                            break;
                        }
                    }
                }
            }

#if DEBUG
            Console.ReadKey();
#endif
        }
    }
}