Skip to content. | Skip to navigation

Personal tools

Navigation

You are here: Home / weblog

Dominic Cronin's weblog

Showing blog entries tagged as: poetry

Stripping namespace declarations from XML

Posted by Dominic Cronin at Nov 19, 2017 12:30 PM |

I've recently been working on an application that will allow members of our content management teams to search within a chosen folder in Tridion for specific content. You might think that's well enough covered by the built-in search functionality, but we're heading towards a search and replace feature, so we pretty much have to process the content ourselves. In the end users' view of the world, a Rich Text field in a component has... well...  a rich text view, and, for the power-users, a Source tab where you can see the underlying HTML. That's all fine, but once you get to the technical implementation, it's a bit more complicated, and we'll end up replicating some of Tridion's own smoke and mirrors to present a view to the users that's consistent with what they are used to. This means not only that we need to be able to translate from text to HTML, but also from "XML in the XHTML namespace" to HTML. One of the bulding blocks we need to do this is the ability to take XML with namespace declarations, and get rid of them so that the result isn't in a namespace. 

A purist (such as myself) might say that the only correct way to parse XML is with an XML parser, and just in case you've never ended up there, I heartily recommend that you read this answer on Stack Exchange before proceding further. Still - in this case, what I want to do is amenable to RegExes, and yes, I know: now I have two problems. Anyway - FWIW - I started this at the office, thinking I'd just quickly Google for a namespace-stripping regex and I'd be on my way. Suffice it to say that the Internet is rubbish at this. I ended up with a page of links to rubbish regexes that just weren't going to float my boat. So I mailed the problem to myself at home, and today, in the quiet of a Sunday morning, it didn't seem quite so daunting. Actually, I'm still considering whether an XML-parser approach, or an XSLT might not be better, and I may end up there if my needs turn out to be more complex, but for now, here's the namespace stripper. 

static Regex namespaceRegex = new Regex(@"    
xmlns # literal (:[^\s=]+)? # : followed by one or more non-whitespace, non-equals chars \s* # optional whitespace = # literal \s* # optional whitespace (?<quote>['""]) # Either a single or double quote - giving it the name 'quote' for back-reference .+? # Non-greedily match anything \k<quote> # The end-quote to match the one we found earlier ", RegexOptions.Singleline | RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
public static string RemoveNamespacesFromDocument(string xml) { return namespaceRegex.Replace(xml, string.Empty); }

Of course, this is written in C#, and I'm taking advantage of the IgnorePatternWhitespace feature in .NET regexes, which allows for the copious comments that might well be necessary if I ever have to actually read this code instead of just writing it. 

But just in case you are hardcore, and all that named matches and commenting fuss is for wusses, here's the TL;DR...

@"(?is)xmlns(:[^\s=]+)?\s*=\s*(['""]).+?\2"

What's not to like? :-) 

One-nine availability

Posted by Dominic Cronin at Aug 16, 2014 09:43 AM |

A couple of weeks ago, this site went down. That happens from time to time. It went down just as I left the country to go on holiday, and it could only be fixed via physical access, so it was down for a week. At least one person has commented that maybe I should stop with this silliness of running my own server on an old Gentoo box in the meter cupboard, and get some proper hosting.

The thing is, that when I started this blog, some years ago now, I went through a detailed requirements analysis, and a full MoSCoWMeh matrix. If you aren't familiar with MoSCoWMeh, this is an enhanced variant of the well-known MoSCoW technique, which also accommodates the needs of private and hobby-run systems.

The requirement for reliable hosting and 5-nines up-time was classified as Meh, and has remained so since. So now you know.

Using helpers in Tridion Razor templating

Today, for the first time, I used a helper in a Razor Tridion template. I'd made a fairly standard 'generic link' embedded schema, so that I could combine the possibility of a component link and an external link in a link list, and allow for custom link text. (Nothing to see here, move along now please.)  However, when I came to template the output, I wanted to have a function that would process an individual link. A feature of Razor templating is that you can define a @helper, which is a bit like a function, except that instead of a return value, the body is an exemplar of the required output. There is also support for functions, so to lift Alex Klock's own example:

@functions {
    public string HelloWorld(string name) {
        return "Hello " + name;
    }
}

and

@helper HelloWorld(string name) {
    <div>Hello <em>@name</em>!</div>
}

will serve fairly similar purposes.

What I wanted to do today, however was slightly different; I didn't want to pass in a string, but a reference to my embedded field. All the examples on the web so far are about strings, and getting the types right proved interesting. I started out with some code like this:

@foreach(var link in @Fields.links){
  @RenderLink(link);
}

So I needed a helper called RenderLink (OK - this might be a very trivial use-case, but a real problem all the same.). But what was the type of the argument? In theory, "links" is an EmbeddedSchemaField (or to give it it's full Sunday name: Tridion.ContentManager.ContentManagement.Fields.EmbeddedSchemaField) but what you get in practice is an object of type "Tridion.Extensions.Mediators.Razor.Models.DynamicItemFields". I'd already guessed this by poking around in the Razor Mediator sources, but after a few of my first experiments went astray, I ended up confirming that with @link.GetType().FullName

Well I tried writing a helper like this:

@using Tridion.Extensions.Mediators.Razor.Models 
@helper RenderLink(DynamicItemFields link){
... implementation
}

but that didn't work, because when you try to call the methods on 'link' they don't exist.

And then, just for fun, of course, I tried

@using Tridion.ContentManager.ContentManagement.Fields 
@helper RenderLink(EmbeddedSchemaField link){
... implementation
}

but that was just going off in an even worse direction. Yeah, sure, that type would have had the methods, but what I actually had hold of was a DynamicItemFields. Eventually, I remembered some hints in the mediator's documentation and tried using the 'dynamic' keyword. This, it turns out, is what you need. The 'dynamic' type lets you invoke methods at run-time without the compiler needing to know about them. (At last, I was starting to understand some of the details of the mediator's implementation!)

@helper RenderLink(dynamic link){
... implementation
}

This may be obvious with hindsight (as the old engineers' joke has it ... for some value of 'obvious') . For now, I'm writing another blog post tagged #babysteps and #notetoself, and enjoying my tendency to take the road less travelled.

TWO roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood

And looked down one as far as I could


To where it bent in the undergrowth;
Then took the other, as just as fair,
And having perhaps the better claim,

Because it was grassy and wanted wear;


Though as for that the passing there
Had worn them really about the same,
And both that morning equally lay

In leaves no step had trodden black.


Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh


Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

-- Robert Frost