Thursday, April 2, 2009

Steps to recover a corrupted WebSphere Portal Server

What would you do if you encountered any of these issues in your Portal environment?

Your portal is working fine, but the next day, suddenly nobody can log in

You manage to fix the login, but another issue arises: your WCM content cannot be seen by normal users, by the WPSADMINS group, or even by WPSADMIN himself

PDM content cannot be seen either

All applications show "Application Error, please contact your system administrator for more details"

All the pages that you've created are magically gone

To make matters worse, the site is due to launch that same week

However, the database still contains the information you need.
This happened to me and my team, and we happily recovered the system. Well, after hard work and a combination of advanced techniques.

This is the summary of the events. I will tell you in the next post how we managed to recover the system.

Portal Version : Non-clustered WebSphere Portal 6.0 Express with fix pack, integrated with DB2 and Active Directory. SSO to external systems via Credential Vault.

We investigated and found that the error we were receiving was an Account Locked error. We tested the accounts using Softerra and all of them worked. In Portal, however, all accounts were locked.

We decided to redo the security by running WPSConfig disable-security followed by WPSConfig enable-security-ldap

After redoing the security, all accounts worked but, lo and behold, data and content were missing. Furthermore, all applications were inaccessible (error) and, most painful of all, no pages were accessible, not even the Portal Administration page

We tried to access the pages via URL mappings, but none were accessible.

We checked the databases, and all of them were available.

We decided to redo the enabling of security, but it didn't work. (Silly, yes, but when you're in this state, you will do whatever it takes to recover anything.)

Scenario : The data in the database is intact, but Portal is behaving weirdly. First of all, no pages are showing or accessible, and WPSAdmin cannot even access the Administration page. Secondly, content and documents are not shown. Thirdly, it's nearing the launch date.

What we did :

The first thing that came to mind was how to make Portal work again with all the data inside. We did some troubleshooting but gave up, as there was no point digging through the database to find out whether some mapping had been done incorrectly. So we decided that we needed to dump all the data.

Based on experience, we should be able to dump the portal using the same approach as a migration, with the following tools :

XMLAccess to dump the pages and configuration of portal
WPSconfig to dump the Contents
WPMigrate to dump the documents.

Step 1 : I created a file called ExportRelease.xml based on this link :

Step 2: Run xmlaccess to download the files. I ran :

xmlaccess.bat -in ExportRelease.xml -user wpsadmin -password password -url http://intranet:10038/wps/config -out D:\Backup\PortalConfig.xml

Note that when dumping, use the Portal port (10038 in the command above) rather than the HTTP port. Going through the HTTP port produces a timeout that disconnects the command from WAS, resulting in an error.
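Since the same arguments come back later for the import, the call in Step 2 can be assembled by a small helper. This is only a sketch in Python and was not part of our original procedure: the function name build_xmlaccess_command and the port check are assumptions made for illustration, and only the flags shown in the command above (-in, -user, -password, -url, -out) are used.

```python
from urllib.parse import urlsplit


def build_xmlaccess_command(xml_in, user, password, url, out=None):
    """Return the xmlaccess.bat argument list for an export or import.

    Hypothetical helper: it only assembles the flags used in this post
    and does not run anything.
    """
    port = urlsplit(url).port
    # Assumption: a URL with no explicit port, or with port 80, goes through
    # the HTTP server instead of straight to the Portal port.
    if port is None or port == 80:
        raise ValueError("use the Portal port in the URL, not the HTTP port")

    cmd = ["xmlaccess.bat", "-in", xml_in, "-user", user,
           "-password", password, "-url", url]
    if out is not None:
        cmd += ["-out", out]  # leave -out off when no result file is wanted
    return cmd
```

Calling it with ExportRelease.xml, the wpsadmin credentials, http://intranet:10038/wps/config and D:\Backup\PortalConfig.xml gives back the same arguments as the command in Step 2, ready to hand to subprocess.run.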

For more information, check this out :

Step 3: Back up the files located at [PortalServer]\deployed\archive. This folder contains all the portlets that were deployed. Note, however, that portlet updates are not written back to this folder, so you need to re-deploy or update any portlets that have recent changes.

Step 4: Backup the following :

1. [PortalServer]\Installable
2. Themes and Skins
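The file copies in Steps 3 and 4 are easy to get wrong by hand when you are under pressure, so here is a minimal sketch of how they could be scripted. It assumes Python is available on the server; the function name backup_folders and the timestamped folder layout are my own choices for illustration, not something Portal provides.

```python
import shutil
import time
from pathlib import Path


def backup_folders(sources, backup_root):
    """Copy each source folder into a new timestamped folder under backup_root.

    Returns the path of the timestamped folder. Source folders that do not
    exist are reported and skipped instead of aborting the whole backup.
    """
    target = Path(backup_root) / time.strftime("%Y%m%d-%H%M%S")
    # exist_ok=False: never mix two backup runs in the same folder.
    target.mkdir(parents=True, exist_ok=False)
    for src in map(Path, sources):
        if not src.is_dir():
            print(f"skipped (not found): {src}")
            continue
        # Each folder keeps its own name inside the backup folder.
        shutil.copytree(src, target / src.name)
    return target
```

You would pass it the archive folder from Step 3, the Installable folder, and your theme and skin folders, plus a backup location such as D:\Backup. Two source folders with the same name would collide, so give them separate backup roots if that applies to you.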

Step 5: In our case, since I'm using my own desktop SSO module, I wanted the security settings for reference. It's not required, but I backed up security.xml and the other security-related XML files. If in doubt, just back up wp_profile

Step 6 : Remember to back up your application database. This is not the Portal database but your own application's database. We missed this part, so we had to recover it manually. We will have another post on how we did that.

Step 7 : We dumped the WCM content (thanks to my WCM analysts, Mei mei Oen and Julius Soestrisno). Since we have more than 10 libraries, we had to connect to each of them. This is quite a long process, so please refer to :

Basically the command to dump is : WPSconfig.bat export-wcm-data

If you have multiple libraries, you need to do this for all the libraries. If you can't remember your WCM Library name, you can refer to the dumped file PortalConfig.xml and search for all instances of :


Step 8 : Dump the Documents by running this command :

WPmigrate.bat staging-to-production-pdm60-export

Step 9 : For us, since we're very KIASU, we backed up the whole IBM folder and the database

Step 10 : Uninstall Portal and DB2

Step 11 : Re-install Portal and DB2 with the same fix pack level.

Step 12: Transfer the database to DB2

Step 13 : Re-configure Portal Security with AD.

Step 14 : Test, then back up everything so that there's a checkpoint in case the next steps fail

This is it. The next steps restore the settings :

Step 15 : run WPSConfig action-empty-portal to empty the portal configuration

Step 16 : Copy back the /archive folder

Step 17 : Copy back the Installable apps folder and the rest of the files (including themes and skins)

Step 18 : Import the configuration by running this command :

xmlaccess.bat -in PortalConfig.xml -user wpsadmin -password password -url http://intranet:10038/wps/config

Step 19 : Once done, log in and test. Voila!! We can now see everything. However, the web content is still not there, so :

Step 20 : Re-import the web content by running the command : WPSconfig.bat import-wcm-data. Of course, follow the instructions on this link :

Step 21: Re-import Documents by running the command : WPmigrate.bat staging-to-production-pdm60-import
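The command-line part of the restore (Steps 15, 18, 20 and 21) has to run in that order, and there is no point continuing once one of them fails. The sketch below shows one way to enforce that from Python. It is an illustration, not what we actually ran: run_steps and RESTORE_STEPS are hypothetical names, the commands are copied from the steps above, and it assumes the .bat files can be found from the current directory. It also leaves out the folder copies in Steps 16 and 17, the per-library setup for the WCM import, and the manual login test in Step 19.

```python
import subprocess

# Commands as quoted in Steps 15, 18, 20 and 21 of this post.
RESTORE_STEPS = [
    ["WPSconfig.bat", "action-empty-portal"],
    ["xmlaccess.bat", "-in", "PortalConfig.xml", "-user", "wpsadmin",
     "-password", "password", "-url", "http://intranet:10038/wps/config"],
    ["WPSconfig.bat", "import-wcm-data"],
    ["WPmigrate.bat", "staging-to-production-pdm60-import"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns the list of commands that completed successfully. The runner
    argument exists so the sequence can be dry-run with a stand-in function.
    """
    done = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            print("stopped at:", " ".join(cmd))
            break
        done.append(cmd)
    return done
```

In practice you would probably run the first two commands, log in and check the pages as in Step 19, and only then run the last two.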

Voila!!! Done. We managed to restore everything. Of course, we did some manual things like checking the security, updating the portlets, etc., but that's it.

Stay tuned next time for this scenario : You forgot to back up your DB2 database and DB2 was uninstalled, but the data files are still there. How do you recover these data files?