Question
Answer
When you configure a Web Content Management (WCM) site to be searchable, a Managed Web Content site content source (in the specified search collection) is created for you with the following parameters:
Portal user id: <copied_from_wcm_site_configuration>
Portal user password: <copied_from_wcm_site_configuration>
Stop collecting after (min): 30
Stop fetching a document after (sec): 60
Links expire after (days): 7
Remove broken links after (days): 1
Schedulers tab: Scheduled Update every 4 hours

You might want to change the default parameter values. For example, if you do not get the expected search results, you might try increasing the "Stop collecting after (min)" value because the default 30 minutes might not be long enough for the crawler to get all the WCM content in the site. You may also increase the "Stop fetching a document after (sec)" value.

The time interval between the crawler runs must be more than the maximum crawler execution time. The reason is that a crawler cannot be executed if it is currently running. If a crawler job is started while the crawler is running, this execution is ignored and the crawler is only executed at the next scheduled time, provided that it is not running already.

Some of the default parameter values for new content sources are configurable in <wp_root>/wcm/shared/app/config/wcmservices/SearchService.properties:
SearchService.RecrawlInterval=4
# Remove broken links after (days)
SearchService.BrokenLinksExpirationAge=1
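
The scheduling rule above (the interval between crawler runs must exceed the maximum crawler execution time) can be sanity-checked before you raise "Stop collecting after (min)". The following is only a minimal sketch, not part of the product: the helper names parse_properties and schedule_is_safe are hypothetical, the sample text simply mirrors the two properties shown above, and it assumes that SearchService.RecrawlInterval is expressed in hours (matching the 4-hour default schedule) and that "Stop collecting after (min)" bounds the crawler's run time.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def schedule_is_safe(recrawl_interval_hours, stop_collecting_after_min):
    """Return True if the interval between runs exceeds the maximum run time.

    Assumption: the recrawl interval is given in hours and the
    "Stop collecting after" limit in minutes.
    """
    return recrawl_interval_hours * 60 > stop_collecting_after_min


# Sample content mirroring the properties listed above.
SAMPLE = """\
SearchService.RecrawlInterval=4
# Remove broken links after (days)
SearchService.BrokenLinksExpirationAge=1
"""

props = parse_properties(SAMPLE)
interval_hours = int(props["SearchService.RecrawlInterval"])

print(schedule_is_safe(interval_hours, 30))   # default 30-minute limit: True
print(schedule_is_safe(interval_hours, 300))  # raised 300-minute limit: False
```

With the defaults, a 4-hour interval comfortably exceeds the 30-minute collection limit. If you raise the limit to 300 minutes, the check fails, which suggests lengthening the scheduled update interval as well.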