Encoded umlauts in file share search results.

If the SharePoint Search Crawler is indexing files on a file share whose names contain German umlauts (e.g. überprüfungsbedürftig.docx), then you will have problems with the links it produces, because all of the umlauts in them are percent-encoded.

Encoding special characters in a link makes perfect sense if you are dealing with a web link. The problem comes with indexing a file share, where the “links” are really only file paths. Those paths make no sense when encoded and therefore produce a 404 error.
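To illustrate what such an encoded name looks like, here is a minimal sketch using the standard encodeURIComponent and decodeURIComponent functions. The file name is the example from above; the variable names are only for illustration, and the crawler's own encoding step is assumed to produce the same UTF-8 percent-encoding.

```javascript
// Illustration only: percent-encoding a file name that contains umlauts.
var fileName = "überprüfungsbedürftig.docx";

// Each "ü" becomes the two UTF-8 bytes C3 BC, written as %C3%BC.
var encoded = encodeURIComponent(fileName);

// Decoding restores the original characters.
var decoded = decodeURIComponent(encoded);

console.log(encoded);
console.log(decoded);
```

A web server decodes such a name again before looking up the resource, which is why the encoding is harmless for web links.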

To fix this you can simply write some JavaScript code that intercepts any clicked link that references a file share and replaces the percent-encoded umlauts with the actual characters.

// Set on every click; not used by the code shown here.
var linkClick = false;
document.onclick = function (e)
{
    linkClick = true;
    var element = e.target;
    var elementTagName = element.tagName;
    var targetLink = null;

    if (elementTagName == 'A')
    {
        targetLink = element.getAttribute("href");
    }
    else if (elementTagName == 'STRONG' && element.parentElement)
    {
        // The click landed on highlighted text inside the link, so read the href from the parent.
        targetLink = element.parentElement.getAttribute("href");
    }

    if (targetLink && targetLink.substr(0, 7) === "file://")
    {
        //%C3%BC - ü
        //%C3%9C - Ü
        //%C3%A4 - ä
        //%C3%84 - Ä
        //%C3%B6 - ö
        //%C3%96 - Ö
        // Global regular expressions replace every occurrence, not only the first one.
        targetLink = targetLink.replace(/%C3%BC/g, "ü").replace(/%C3%9C/g, "Ü").replace(/%C3%A4/g, "ä").replace(/%C3%84/g, "Ä").replace(/%C3%B6/g, "ö").replace(/%C3%96/g, "Ö");

        window.location.href = targetLink;

        // Cancel the default navigation only for the file share links handled here,
        // so that all other links keep working as usual.
        return false;
    }
};
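As an alternative to chaining one replace call per umlaut, the mapping can be kept in a lookup table and applied with a single global regular expression. The following is a minimal sketch, not a tested SharePoint solution; the names UMLAUTS and decodeUmlauts and the server and share names in the usage line are made up for illustration.

```javascript
// Hypothetical lookup table: percent-encoded UTF-8 sequence -> umlaut.
var UMLAUTS = {
    "%C3%BC": "ü",
    "%C3%9C": "Ü",
    "%C3%A4": "ä",
    "%C3%84": "Ä",
    "%C3%B6": "ö",
    "%C3%96": "Ö"
};

// Hypothetical helper: replaces every encoded umlaut in the given link.
// The "g" flag replaces all occurrences; the "i" flag also accepts
// lowercase hex digits such as %c3%bc.
function decodeUmlauts(link)
{
    return link.replace(/%C3%(?:BC|9C|A4|84|B6|96)/gi, function (match)
    {
        return UMLAUTS[match.toUpperCase()];
    });
}

// Usage with a made-up server and share name:
var fixedLink = decodeUmlauts("file://server/share/%C3%BCberpr%C3%BCfungsbed%C3%BCrftig.docx");
console.log(fixedLink);
```

Other encoded sequences such as %20 are left untouched by this helper. If further characters need the same treatment (for example ß, which is encoded as %C3%9F), they can be added to both the table and the regular expression.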
