
SAP Note

 

  98051 - Database Reconnect: Architecture and function
Version   8     Validity: 22.04.2008 - active  

 

Language   English

Header Data
Released On 22.04.2008 20:25:32
Release Status Released for Customer
Component BC-DB-DBI DB Independent Database Interface
Priority Recommendations / Additional Info
Category Help for error analysis

Symptom
This note describes the database reconnect mechanism and provides an overview of its functions. It
does not deal with technical details, such as profile parameter settings or release
dependencies. Note 24806 in particular covers a great deal of these technical details and
contains the latest technical reference.

Other Terms
Reconnect, switch over, high availability, offline backup

Reason and Prerequisites
Contents:
1) Introduction
2) What are the prerequisites for the reconnect mechanism?
3) Functions
4) Profile parameters
5) Restrictions
6) Offline backup
7) Switch over
8) Problem analysis
9) FAQ (Frequently asked questions)
1. Introduction
Every ABAP dialog, batch, update and spool work process sets up its own (private) connection to
the database at the start. If this database connection is interrupted, the work process attempts
to set up a new connection. This process is called "database reconnect".
The advantage offered by the database reconnect without restarting the work process is that a
work process survives, even if reconnecting to the database is unsuccessful. The work process
remains in a special status (reconnect status) in which it always attempts to set up the
database connection again whenever a request is made to the database, for example, the request
from a user who calls a new transaction.
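
The behavior described above can be pictured with a minimal sketch. It is not SAP kernel code: the class WorkProcess, its method names and the returned strings are hypothetical and only model the idea that the process stays alive in reconnect status and retries the connection when the next request arrives.

```python
class WorkProcess:
    """Hypothetical model of a work process with a private database connection."""

    def __init__(self, connect):
        # 'connect' is an assumed callable that returns a connection object
        # or raises ConnectionError if the database cannot be reached.
        self._connect = connect
        self.connection = connect()      # connection setup at process start
        self.in_reconnect_state = False

    def connection_lost(self):
        """Record that the database connection was interrupted."""
        self.connection = None
        self.in_reconnect_state = True

    def handle_request(self, request):
        """Try to reconnect first if needed, then process the request."""
        if self.in_reconnect_state:
            try:
                self.connection = self._connect()
            except ConnectionError:
                # The process is not restarted; it stays in reconnect status.
                return "rejected: no database connection"
            self.in_reconnect_state = False
        return f"processed {request}"
```

In this model a failed reconnect only rejects the current request; the same process object serves the next request and tries again.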
2. What are the prerequisites for the reconnect mechanism?
The reconnect mechanism is activated by default (rsdb/reco_trials > 0) for all kernel releases
as of 3.0D. For details about which default applies as of which release, see Note 24806.
The work process uses the database error codes to recognize a situation which requires the
database connection to be set up again. In this case, the work process must analyze whether the
error is a "normal" SQL error or whether the error code signifies a collapsed connection (for
example, because of connection problems to the database, such as network instability, or
because the function of the database itself is impaired). The reconnect-relevant return codes
are, on the one hand, predefined in the kernel and, on the other hand, can be activated and
deactivated dynamically. If a return code is neither in the static nor in the dynamic reconnect
list, no reconnect occurs. The dynamic list is determined by the profile parameter
rsdb/reco_add_error_codes. Note 24806 provides information about which return codes fall into
the reconnect class (release-dependent). Note 41678 describes the profile parameter
rsdb/reco_add_error_codes.
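
The decision described in this section can be sketched as follows. This is an illustration under assumptions, not the kernel implementation: the function names are hypothetical, the code values are placeholders rather than the real reconnect class, and the syntax of rsdb/reco_add_error_codes (Note 41678) is not modelled. The dynamic list is simply represented as two sets, and the rule that a dynamic deactivation overrides the static list is an assumption.

```python
# Placeholder values only; the real static list is release-dependent (Note 24806).
STATIC_RECONNECT_CODES = frozenset({1001, 1002})


def reconnect_active(reco_trials):
    """The mechanism is active when rsdb/reco_trials is greater than 0."""
    return reco_trials > 0


def is_reconnect_error(code, static_codes=STATIC_RECONNECT_CODES,
                       activated=frozenset(), deactivated=frozenset()):
    """Return True if the return code belongs to the reconnect class.

    'activated' and 'deactivated' stand for the dynamic list; giving the
    deactivated set priority over the static list is an assumption.
    """
    if code in deactivated:
        return False
    return code in static_codes or code in activated


def needs_reconnect(code, reco_trials, **lists):
    """A reconnect is attempted only for reconnect-class codes while active."""
    return reconnect_active(reco_trials) and is_reconnect_error(code, **lists)
```

A code that appears in neither list is treated as a "normal" SQL error, so `needs_reconnect` returns False for it regardless of the rsdb/reco_trials setting.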
3. Functions
a) The reconnect scenario is triggered by the breakdown of the database when the R/3 System
is in operation. Those work processes that already started an R/3 transaction before the
database breakdown receive a database (SQL) error.
b) To keep the R/3 System in a consistent status, a rollback is carried out within R/3 and
on the database, which resets the cancelled transaction. This is usually also documented
in a short dump. If the work process finds the return code in the reconnect class, the
system first stores a message of the following structure in the developer trace dev_w*:
B ***LOG BYM=> severe db error 12571    work process is in the reconnect state [dbsh    0649]
A similar message (with BYM) is then entered in the SYSLOG.
c) After the R/3 rollback, the work process attempts several times to set up the database
connection again. Here, the profile parameters rsdb/reco_trials and rsdb/reco_sleep_time
determine the number of attempts and the pauses in between. In the case of short-term
problems (for example, when the network or database capacity is overloaded), the work
process will be able to set up the connection again when the appropriate settings are made.

d) If the connection setup fails, the user involved receives the following dialog box:
"Connection to database lost, session terminated". The user's session is terminated with
this message. Note: The work process does not end but is ready to receive the next user
request.
e) In addition, the system informs all work processes of the application server about the
database problem. The informed WPs write the following messages in their developer traces:
B  ***LOG BV4=> reconnect state is set for the work process
f) From there on, all work processes are in the reconnect status. Whenever they are assigned
a new user request in this status, they attempt to set up the database connection again
before they start processing the request. If the database is still not available after the
attempts determined in rsdb/reco_trials, the user receives the dialog box "Connection to
database lost, session terminated" and the user's session terminates. The WP remains in the
reconnect status and waits for the next request. In the developer trace, the futile
reconnect attempts have the following effect:
M  *** ERROR => ThHdlReconnect: db_reconnect failed (-1) [thxxhead 1304]
g) If the database (connection) is available again, the reconnect attempt is also successful
and the above-mentioned loop is broken. This is documented in the developer trace:
B  ***LOG BYY=> work process left reconnect state [dblnk    1467]
An equivalent message also appears in the SYSLOG.
h) A work process can restore only one database connection and write a SYSLOG or developer
trace message when it executes a request. If the system load is very low and a work process
edits a user request hours after the database loss, the reconnect of this work process and
the respective message also only occur at this later time. This should not be misinterpreted
as a malfunction. However, it is highly probable that internal, uncritical and repetitive
database accesses determine the missing database connection.
4. Profile parameters
The default values of the profile parameters should (as of Kernel Release 3.0D) ensure correct
reconnect behavior for all databases supported. Exception:
a) The database error code is not within the reconnect class and must be supplemented with
the profile parameter rsdb/reco_add_error_codes (Notes 24806 and 41678)
b) You are using a parallel database (for example, Oracle Parallel Server). Note 24874
addresses this topic.
You should therefore change reconnect profile parameters only for well-founded reasons.
5. Restrictions
a) At the start or restart of a work process, NO reconnect occurs but an initial connection
setup does occur. If an initial connection setup is unsuccessful after three attempts, the
work process stops. It is not in reconnect status and you must manually restart it once you
can access the database again (for example, in transaction SM50).
b) If, for whatever reason, you want to restart a work process in productive operation, it
carries out an initial connection setup, that is, if the database is not available, the WP
stops.
c) The reconnect mechanism is implemented in the database interface. With the exception of
cleaning up (rollback) in R/3 and in the database, no further steps are triggered during a
connection termination. The database interface therefore cannot determine whether you
should/must reschedule a batch job or whether you must repeat an update. You must implement
these actions manually or using a corresponding application logic.
d) If a work process sets up several database connections to databases that may be different
(multiconnect), the system performs a reconnect only for the default connections set up
automatically at the start of the work process. At present, a reconnect is not carried out
for all the other connections.
e) The name of the reconnect host must differ from the primary host name.
6. Offline backup
For an offline backup, you must note that, from the viewpoint of R/3, an offline backup of the
database when the R/3 System is in productive operation represents the same "disruption" as an
unplanned nonavailability of the database. However, through the database reconnect without
restarting the work processes, you do not have to shut down the application servers. The buffer
contents of the R/3 System are retained on the application server. As a result, the loss in
performance that follows a restart of the application server (due to the empty, yet to be
filled buffers) does not occur. After the database is opened again, the application server work
processes reconnect themselves to the database on request. If you perform the offline backup
when R/3 activity is at its lowest, it therefore does not affect any high priority processes.
Use the 31H kernel patch level 84 or higher (Note 92412) if you want to perform offline backups
when R/3 is in operation, and check that the following parameters are set to their (default)
value:
rsdb/reco_sosw_for_db = OFF (important!)
rsdb/reco_trials      = 3
rsdb/reco_sleep_time  = 5
You should also choose the setting rsdb/reco_sosw_for_db=OFF together with a high availability
solution for this offline backup variant (when R/3 is in use).
7. Switch over
Switch over or high availability database solutions require a second database host, which will
replace the primary database host if this fails (IP address switch over). A high availability
solution does not require you to change the profile parameters. The flow of operation for the
default setting of the parameter is described in the "Functions" section with the exception
that the standby host is active when the work processes set up the connection again.
8. Problem analysis
Reconnect scenarios should basically run as described above. In particular, the entries
described in the "Functions" section (B ***LOG BYM=> ...) should appear in the dev_w* developer
traces. If problems occur, it is useful to analyze the trace files. You should pay particular
attention to the following points:
a) The work process fails to set up the reconnect time and time again, despite the fact that
the database (connection) is working: Does the system continue to write entries concerning
the unsuccessful reconnect attempts to the developer trace files of the affected work
processes, even though the database (connection) is available again? Using another tool (for
example, R3trans -d), try to determine whether important connect information is perhaps not
accessible (anymore) (for example, a configuration file). Can the work process restore the
database connection after restart (for example, using transaction SM50)?
b) The affected work process was (automatically) restarted: This can be seen in the developer
traces and SYSLOG. When you restart, the case that is described in the section
'Restrictions' occurs and no reconnect occurs. It is therefore important to find the reason
for the restart. Is there a core file in the "work" directory? This may contain information
about the cause of the process termination. The developer traces may also enable such an
interpretation, which should trigger the corresponding action. One cause may be the fact
that the database return code does not (yet) belong to the reconnect class (Note 24806). You
can use the profile parameter rsdb/reco_add_error_codes (Note 41678) to correct this.
c) One or all work processes hang: If problems occur or also when the database shuts down,
you may find that the following situation arises in the interaction of database software,
communication software and operating system: An SQL statement or a connect call that was
issued by the work process hangs. This means that the subroutine call of the database
library no longer returns. In this situation, the work process is unable to break out of
this status and must eventually be terminated. You can use the R/3 tool dpmon to track down
signs for this.
It is most likely that you will find the background and cause of the errors only with SAP
Support. If you contact SAP concerning your problem, ensure that you describe the error
scenario as accurately as possible. So that SAP and the database manufacturer can process the
error, you must describe your configuration in this case as accurately as possible. For example:
HW:  COMPAQ
OS:  NT 3.51 Service Pack 4
     R/3 3.0F     Kernel 3.1G Version 84
DB:  SQL*NET V2    without additional patches
     Oracle 7.3.3  without additional patches
9. FAQ (Frequently asked questions)

Other Attributes
Transaction codes
HIER
SM50

Validity
This document is not restricted to a software component or software component version

References
This document refers to:
SAP Notes
1633292   Suspending security session cleanup during offline backups
864267   Code page conversion error after WP has left reconnect state
592393   FAQ: Oracle
459268   Database reconnect: EXEC SQL and work process messages
437362   Composite note ORA-12500
393768   DB2/390: VIPA exploitation is supported.
103754   Database Reconnect: Information strategy of WPs
92412   Reconnect after offline backup
81499   No reconnect after offline backup
65663   Batch/Spool: Database reconnect problem in 30E/30F
41678   Database reconnect: Additive error return codes
34703   3.0: Offline Backup with R/3 System running
29849   30A, 30B: sql error 0 performing CON
26455   NT 3.51 Oracle: Work procs. hang in Reconnect
24874   Database Reconnect for Oracle parallel server
24806   Database Reconnect: technical details and settings

This document is referenced by:
SAP Notes (16)
592393   FAQ: Oracle
864267   Code page conversion error after WP has left reconnect state
1633292   Suspending security session cleanup during offline backups
459268   Database reconnect: EXEC SQL and work process messages
437362   Composite note ORA-12500
393768   DB2/390: VIPA exploitation is supported.
65663   Batch/Spool: Database reconnect problem in 30E/30F
81499   No reconnect after offline backup
92412   Reconnect after offline backup
103754   Database Reconnect: Information strategy of WPs
24806   Database Reconnect: technical details and settings
24874   Database Reconnect for Oracle parallel server
26455   NT 3.51 Oracle: Work procs. hang in Reconnect
29849   30A, 30B: sql error 0 performing CON
34703   3.0: Offline Backup with R/3 System running
41678   Database reconnect: Additive error return codes