Friday, February 29, 2008

GAPP performance approach

Yesterday evening I attended an Amis query about performance. The speaker this time was Gerwin Hendriksem from Amis, a real performance freak. He designed his own approach for solving performance problems, and it was a good session. Gerwin used Oracle data mining to spot performance issues. He gathered system data for more than a year and used data mining to analyze and predict the bottlenecks. He has successfully used this approach with at least two customers.
Next week he will be giving the same presentation in the US at the Hotsos symposium in Dallas. He also gave his approach a name, the GAPP approach.
So, beware of the GAPP....

Thursday, February 21, 2008

Changing IP and hostname in Application Server 10g

Because we moved our new OEM10gr2 server to the datacenter, the IP address and the hostname changed. Once the server was in the datacenter I had to start all the services again, but first I had to change the opmn configuration. Instead of changing all the config files by hand, I noticed a shell script in $OMS_HOME/chgip/scripts, called

This script changes all the config files and replaces the ip and hostname. It will ask you for the source and target ip/hostname...

Saved me some time...

Friday, February 15, 2008

Discovering windows host in linux oms...

Yesterday I decided to add a windows host to my Oracle 10gr2 Grid. The 10gr2 Grid control is running on Oracle Enterprise Linux release 4.5
I downloaded the windows agent and installed this agent on the windows machine. After installing I checked my OEM site to see if the target was already discovered....
I did not see the windows host, so I decided to check the windows agent log files. The following error was found :

Thread-7244 ERROR pingManager: nmepm_pingReposURL: Cannot connect to https://oemserver.local:1159/em/upload: retStatus=-1
2008-02-14 12:45:17 Thread-928 ERROR upload: Error in uploadXMLFiles. Trying again in 900.00 seconds or earlier.
2008-02-14 12:45:32 Thread-2924 ERROR upload: Error in uploadXMLFiles

It seems the agent cannot connect to the oms, but why ? I tried uploading the data manually...

D:\oracle\agent\10203\agent10g\BIN>emctl upload
Oracle Enterprise Manager 10g Release 3 Grid Control (c) 1996, 2007 Oracle Corporation. All rights reserved.
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet..

Checking Metalink brought me the solution. I had to unsecure the agent...

To unlock the oms, run:
OMS_HOME/bin/emctl secure oms unlock

Then restart the agent
AGENT_HOME/bin/emctl start agent

and unsecure it using:
AGENT_HOME/bin/emctl unsecure agent

then issue a clearstate for the agent:
AGENT_HOME/bin/emctl clearstate agent

then attempt the upload
AGENT_HOME/bin/emctl upload agent
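
The steps above can be collected into a small script. This is only a sketch under a few assumptions: the emctl paths are placeholders, the function names are made up for illustration, and the commands are copied as written in this post (check them against your own emctl version). In my setup the OMS step runs on the Linux server and the agent steps on the Windows host, so they are kept separate. By default it only prints the commands.

```python
import subprocess


def oms_steps(oms_emctl):
    """Command to run on the OMS host, as written in the post."""
    return [[oms_emctl, "secure", "oms", "unlock"]]


def agent_steps(agent_emctl):
    """Commands to run on the agent host, in the order given in the post."""
    return [
        [agent_emctl, "start", "agent"],
        [agent_emctl, "unsecure", "agent"],
        [agent_emctl, "clearstate", "agent"],
        [agent_emctl, "upload", "agent"],
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    With dry_run=False the loop stops at the first failing command,
    because subprocess.run(..., check=True) raises CalledProcessError.
    """
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder path: replace with the real emctl location of the agent.
    run_steps(agent_steps("AGENT_HOME/bin/emctl"))
```

Keeping the dry run as the default means you can read the exact command list before anything touches the agent.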

And this time the upload was successful. Checking the status of the agent today showed me the following...

D:\oracle\agent\10203\agent10g\BIN>emctl status agent
Oracle Enterprise Manager 10g Release 3 Grid Control (c) 1996, 2007 Oracle Corporation. All rights reserved.
Agent Version :
OMS Version :
Protocol Version :
Agent Home : D:\oracle\agent\10203\agent10g
Agent binaries : D:\oracle\agent\10203\agent10g
Agent Process ID : 6828
Agent URL : http://agenthost.local:3872/emd/main/
Repository URL : http://oemhost.local:4889/em/upload/
Started at : 2008-02-14 14:36:10
Started by user : SYSTEM
Last Reload : 2008-02-14 14:36:10
Last successful upload : 2008-02-15 10:39:37
Total Megabytes of XML files uploaded so far : 20.17
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0.00
Available disk space on upload filesystem : 95.40%
Last successful heartbeat to OMS : 2008-02-15 10:41:37
Agent is Running and Ready
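
If you want to check this output from a script instead of reading it by eye, something like the following could work. It is a rough sketch: the function names are my own, the sample text is a few lines taken from the output above, and it assumes the "Key : value" layout shown there stays the same.

```python
SAMPLE_STATUS = """\
Agent URL : http://agenthost.local:3872/emd/main/
Repository URL : http://oemhost.local:4889/em/upload/
Last successful upload : 2008-02-15 10:39:37
Number of XML files pending upload : 0
Agent is Running and Ready
"""


def parse_agent_status(text):
    """Split 'Key : value' lines of `emctl status agent` output into a dict.

    Lines without a ' : ' separator (banner, final status line) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def agent_looks_healthy(text):
    """True when the agent reports ready and no XML files are pending upload."""
    fields = parse_agent_status(text)
    pending = int(fields.get("Number of XML files pending upload", "-1"))
    return "Agent is Running and Ready" in text and pending == 0


if __name__ == "__main__":
    print(parse_agent_status(SAMPLE_STATUS)["Last successful upload"])
    print(agent_looks_healthy(SAMPLE_STATUS))
```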

Tuesday, February 05, 2008

Concurrent Manager recovery

This morning I faced an issue with the concurrent managers in a production environment.

After bouncing the database because of a memory fault, I decided to start up the concurrent managers again as well.
But after starting them, using the start script in $COMMON_TOP/admin/scripts, the application showed me no concurrent managers.
Checking the logfile of the internal manager for errors...

APP-FND-01564: ORACLE error 1000 in afpsmrsc
Cause: afpsmrsc failed due to ORA-01000: maximum open cursors exceeded.
The SQL statement being executed at the time of the error was: &SQLSTMT and was executed from the file &ERRFILE.

The parameter open_cursors seemed high enough, so I looked for another reason. I also checked the v$open_cursor, which showed me a lot of open cursors. I already bounced the database, so that did not solve anything.
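
Since open_cursors is a per-session limit, it helps to see which sessions hold the most cursors rather than looking at the total. Below is a minimal sketch of that idea. The query only uses the SID and USER_NAME columns of v$open_cursor; how you fetch the rows depends on your Oracle client, so that part is left out, and the helper names are my own.

```python
from collections import Counter

# Query to run with any Oracle client; needs SELECT privilege on v$open_cursor.
OPEN_CURSOR_SQL = "SELECT sid, user_name FROM v$open_cursor"


def cursors_per_session(rows):
    """Count open cursors per (sid, user_name) from rows of OPEN_CURSOR_SQL."""
    return Counter((sid, user) for sid, user in rows)


def top_sessions(rows, limit=5):
    """Return the sessions holding the most open cursors, largest first."""
    return cursors_per_session(rows).most_common(limit)


if __name__ == "__main__":
    # Made-up rows for illustration only, not data from this environment.
    example_rows = [(101, "APPS"), (101, "APPS"), (202, "APPS")]
    for (sid, user), count in top_sessions(example_rows):
        print(sid, user, count)
```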

I decided to use the 'Concurrent Manager recovery' in Oracle Application Manager.

This recovery process showed me concurrent managers with an active status in the database, but with no active process on the OS and no active connection to the database. Maybe that's the reason for the open cursor error?
After cleaning them up using this recovery tool, and going through all the steps, I started the concurrent managers again. And this time, they started!

Friday, February 01, 2008

Grid control 10g

The last two days I have been busy installing and upgrading Grid control 10g.
Because of problems with an older version of OEM 10g, we decided to install a new version (10gr2).
On a new Dell server I installed Oracle Enterprise Linux release 4 (update 6). After installing Linux, I started the Grid install.
Next step was the upgrade of the oms and agent. I also decided to upgrade the database itself, which is default (even after the upgrade...)
Before upgrading the database, I tested the Grid control. Everything worked fine. I stopped and started the oms/agents a few times to check stability. No problems so far.
I started the upgrade, which means I first had to install the Oracle software in a new $ORACLE_HOME. After this fresh install, upgrade the new ORACLE_HOME to 10.2.0.3. When the software is patched to the desired level, use the dbua from your new home to upgrade the existing database. At this point, some problems occurred.
At first the dbua could not perform any OEM configuration. Checking the logfile only told me to try this I did. OK, the configuration ended without errors, so let's try to start things up.
I started up the oms, no problems. Next, I started the agent....
It seems the agent would not start because its http server could not be started. Does this have anything to do with my database upgrade? I can't see how.
The logfiles showed me there was a port conflict. Suddenly the agent http server wants to use the same port as the oms http server. Before the upgrade they could work together without a problem, but suddenly they are fighting over the same 6102 port....
The problem was easily fixed: I just changed the local port in the /oem/oemprd01/oms10g/opmn/conf/opmn.xml file.
After this change, the startup succeeded.
But I still don't understand why this port conflict appeared after the database upgrade.
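
A quick way to confirm a conflict like this before starting the agent is to test whether something is already listening on the port. The snippet below is a small sketch of that check, not what I actually ran at the time. It assumes the port is bound on the loopback address, and the function name is my own.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 6102 is the port from the conflict described above.
    print("6102 in use:", port_in_use(6102))
```

Run it on the server with the oms started and the agent stopped: if it reports the port as in use, the agent will hit the same conflict when it starts.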