How to Link SES with Oracle Content Server

How to link Oracle® Secure Enterprise Search (SES 10.1.8.2) with Oracle Content Server (CS 10gR3)

 

By W.ZH, Jan 2009

 

 

This looks easy, but it is in fact a very tricky task. First, you need to read the Oracle® Secure Enterprise Search Administrator's Guide, especially the chapters "Overview of XML Connector Framework" and "Setting Up Oracle Content Server Sources". Of course, you also need to know how to run the Oracle Content Server and manage it from its admin UI.


If you simply try to follow "Setting Up Oracle Content Server Sources", you will be totally lost at this stage and, guaranteed, you will not manage to link to the content server. Please note that there is a very nasty bug in the Oracle® Secure Enterprise Search Administrator's Guide: every occurrence of "RSSCrawlerExport" must be replaced by "SESCrawlerExport", because the component name RSSCrawlerExport is wrong, or it was changed on the CS side. (The SES team and the content server team must hate each other deeply.) Unfortunately, you cannot find any document in the content server documentation library that explains how to link to SES.

 

Yes, there is not a single piece of text about SESCrawlerExport (or RSSCrawlerExport ;-)) in the content server documentation library. So you need to Google a white paper named "Searching Oracle Content Server (Stellent) with Oracle Secure Enterprise Search 10.1.8". Read it and you will know how to install the component and link SES with CS. It does not tell you where to get SESCrawlerExport.zip, though: it is in the extras folder of your content server installation package, so just search it out on your Windows or Linux box, copy it to your PC, and install it as the white paper describes.
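If you are not sure where that extras folder ended up, a small script can hunt the archive down for you. This is only a minimal sketch, and the installation-package path in it is an assumption you have to adjust:

import os

# Assumption: the content server installation package was unpacked here.
install_pkg_root = "/opt/oracle/ucm_install_package"

# Walk the package looking for the component archive. On CS 10gR3 it normally
# sits under the "extras" folder, but walking the whole tree is safer.
for dirpath, _dirnames, filenames in os.walk(install_pkg_root):
    for name in filenames:
        if name.lower() == "sescrawlerexport.zip":
            print(os.path.join(dirpath, name))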

Even after running through this white paper, there is about a 50% chance you will fail somewhere in the middle:

1. Refer to my article "How to install and config the SESCrawlerExport in CS"; there are some mismatches between the config info in the white paper and the actual SESCrawlerExport config UI. For example, FeedType is not in the 10gR3 UI for you to configure.

2. FeedLoc is a folder name and cannot start with \ or /.

3. Then you can finally try http://my.host.com/stellent/idcplg?IdcService=SES_CRAWLER_DOWNLOAD_CONFIG&source=<sourceName> to get the configFile.xml file (a small command-line check for this URL is sketched after these steps). But again, please note that you cannot use the RSS_CRAWLER_DOWNLOAD service named in the Oracle® Secure Enterprise Search Administrator's Guide (wrong name again). This URL verifies that SES can fetch the feed configuration file from CS. Before you start any SES crawl, however, you need to let CS generate the items' feed files and the feed control files for SES; this step is called taking a snapshot in CS. So you must take the snapshot before you can get the feeds and the feed configuration.

4. To run a snapshot, you can enter the URL directly, or you can go to the UCM admin UI, where there is a menu item under "Administration" called "SES Crawler Export". Does the snapshot run from the page at http://my.host.com/stellent/idcplg?IdcService=SES_CRAWLER_EXPORT? Of course not! Remember, this is the default SESCrawlerExport package that comes with CS, and it simply does not run. Three hours of research turned up the reason: the Yahoo User Interface Library (YUI) files are not all deployed to the admin web server, so these JavaScript files are missing:

/idc/resources/yui/yahoo/yahoo-min.js
/idc/resources/yui/event/event-min.js

What you need to do is:

Go to your CS server and copy the yahoo and event folders from /$CSHOME/ibr/weblayout/resources/yui/ to the CS admin UI web folder /$CSHOME/contentserver/weblayout/resources/yui/.
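If you would rather script the copy, here is a minimal sketch; CS_HOME is an assumption standing in for your $CSHOME, and the two weblayout paths are just the ones given above:

import shutil

# Assumption: adjust CS_HOME to your content server install root ($CSHOME).
CS_HOME = "/opt/oracle/ucm"

src_base = CS_HOME + "/ibr/weblayout/resources/yui"
dst_base = CS_HOME + "/contentserver/weblayout/resources/yui"

# Copy the two YUI folders the export page needs
# (yahoo-min.js and event-min.js live under them).
for folder in ("yahoo", "event"):
    shutil.copytree(src_base + "/" + folder,
                    dst_base + "/" + folder,
                    dirs_exist_ok=True)  # needs Python 3.8+; drop if the target folder does not exist yet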

After that the export page can load the JavaScript it needs. Restart your CS and try again: invoke the snapshot, and this time it runs!
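Once the snapshot has run, you can also check from the command line that CS really serves the feed configuration from step 3. This is only a sketch: the hostname, web root, source name and the basic-authentication credentials are all assumptions to replace with your own, and your CS login scheme may differ:

import urllib.request

# Assumptions: hostname, relative web root and source name are placeholders,
# and the content server accepts basic authentication for this service.
base = "http://my.host.com/stellent/idcplg"
source = "mySource"
url = base + "?IdcService=SES_CRAWLER_DOWNLOAD_CONFIG&source=" + source

password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, base, "sysadmin", "password")
opener = urllib.request.build_opener(
    urllib.request.HTTPBasicAuthHandler(password_mgr))

# If the component is installed and the snapshot has run, this should return
# the configFile.xml that SES uses as its feed configuration.
with opener.open(url) as response:
    print(response.read().decode("utf-8"))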

 

So by now, if you follow "Searching Oracle Content Server (Stellent) with Oracle Secure Enterprise Search 10.1.8" step by step, you should be able to link SES with CS and then use the SES search UI to find the resources in your CS. Please note that some file types, such as pictures, cannot be indexed by the SES crawler by default, so you might see nothing come back if you only have pictures in CS; you can research separately how to enable that. Reviewing the scheduler log in SES will help you check the crawler's working status.

 
