Saturday, June 16, 2007

If your SharePoint search catalog crawls case-sensitive URLs, such as those on an Apache-hosted intranet, you'll run into some strange errors. In your crawl log, you might see "Content for this URL is excluded by the server because a no-index attribute.". This implies that the content carries a noindex or nofollow robots meta tag, although it probably doesn't. Further, the log will show the URLs in all lower case, even though the real URLs are mixed case...
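Before chasing the hotfix, it can help to confirm both symptoms yourself. The following is only a rough Python sketch, not anything that ships with SharePoint: the helper names (has_noindex, case_mismatches) and the intranet URLs are made up for illustration, and it assumes you already have the page HTML and the URL lists to hand.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Hypothetical helper: True if the page really carries a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def case_mismatches(logged_urls, real_urls):
    """Hypothetical helper: pair each logged URL with the real URL
    that differs from it only by letter case."""
    by_lower = {url.lower(): url for url in real_urls}
    pairs = []
    for logged in logged_urls:
        real = by_lower.get(logged.lower())
        if real is not None and real != logged:
            pairs.append((logged, real))
    return pairs


if __name__ == "__main__":
    # Made-up example data: one page without a robots tag, one lower-cased log entry.
    page = "<html><head><title>Docs</title></head><body>Hello</body></html>"
    print(has_noindex(page))
    print(case_mismatches(
        ["http://intranet/docs/readme.html"],
        ["http://intranet/Docs/ReadMe.html"],
    ))
```

If has_noindex comes back False for a page the crawl log claims is excluded, and case_mismatches reports the logged URL as a lower-cased variant of the real one, you are most likely looking at the case-sensitivity problem described here rather than a genuine noindex tag.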

There is a Microsoft hotfix for this (SharePoint Hotfix, April 12 2007). The KB article is at http://support.microsoft.com/kb/932619. The hotfix isn't a public download: you have to call Microsoft and ask for it, and they will send you a download link.

Once you've installed the hotfix, go to the Search Settings of each Shared Services Provider. Reset all crawled content, then run a full crawl on your content sources. After that, the SharePoint crawler will request the URLs with their original casing and crawl them correctly.
