Search

Sunday, June 19, 2016

Resolve DBA_2PC_PENDING Entries : Distibuted Transactions

Resolve DBA_2PC_PENDING Entries issues in case of distributed transactions in Oracle Database


Disscusion points:

1. Problem Description

2. Solutions

   2.1 Dba_2pc entries without a corresponding transaction

   2.2 Distributed transaction without corresponding dba_2pc entries

   2.3 How to PURGE the DISTRIBUTED transaction in PREPARED state, when COMMIT
       or ROLLBACK FORCE hangs ?, where we have entries for both Distributed transaction and dba_2pc entries.

1. Problem Description:

As a result of a failed commit of a distributed transaction, some entries can be left in dba_2pc views, i.e.
dba_2pc_pending and dba_2pc_neighbors. The RECO process checks these views to recover the failed txn. However, in some cases RECO cannot perform the recovery. One cause is that all sites involved in the
 transaction not being  accessible at the same  time. Another cause is dba_2pc views being  inconsistent with the  transaction table, which  is the topic of this article. This cause can further be classified as follows:


   1. dba_2pc views have entries for a non-existent distributed transaction
   2. There is a distributed transaction for which there are no entries in dba_2pc views
   3. How to PURGE the DISTRIBUTED transaction in PREPARED state, when COMMIT
       or ROLLBACK FORCE hangs ?, where we have entries for both Distributed transaction and dba_2pc entries.

Solutions to each subclass is provided in the rest of the article.

2. Solutions:
2.1 Dba_2pc entries without a corresponding transaction. In this case dba_2pc views show distributed transactions but there are no txns in reality. If the state of the transaction is committed,  rollback forced or
commit forced then this is normal and it can be cleaned up using dbms_transaction.purge_lost_db_entry

However, if the state of the transaction is PREPARED and there is no entry in the transaction table for it then this entry can be cleaned up manually as follows:

SQL> set transaction use rollback segment SYSTEM;
SQL> delete from sys.pending_trans$ where local_tran_id = ;
SQL> delete from sys.pending_sessions$ where local_tran_id = ;
SQL> delete from sys.pending_sub_sessions$ where local_tran_id = ;

SQL> commit;

Example: The following query reports a dist. txn. in prepared state

SQL> select local_tran_id, state from dba_2pc_pending;

           LOCAL_TRAN_ID          STATE
           ---------------------- ----------------
           29.32.93725             prepared

Given that a transaction id is composed of  triple, '29.32.93725' is located in rollback segment# 1. To find out the list of  active transactions in that rollback segment, use:

SQL>  SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
             KTUXESTA Status, KTUXECFL Flags
      FROM x$ktuxe
      WHERE ktuxesta!='INACTIVE'
      AND ktuxeusn= 1; <== this is the rollback segment#

     no rows selected

It is not possible to rollback force or commit force this transaction.

SQL>  rollback force '29.32.93725';

ORA-02058: no prepared transaction found with ID 29.32.93725

Hence, we have to manually cleanup that transaction:

set transaction use rollback segment SYSTEM;

SQL>  delete from sys.pending_trans$
 where local_tran_id = '29.32.93725';
SQL>  delete from sys.pending_sessions$ where local_tran_id = '29.32.93725';
SQL>  delete from sys.pending_sub_sessions$ where local_tran_id = '29.32.93725';
SQL>  commit;

2.2 Distributed transaction without corresponding dba_2pc entries

In this case dba_2pc views are empty but users are receiving distributed txn related errors, e.g. ORA-2054, ORA-1591. Normally such a case should not appear and if it is reproducible a bug should be filed. Here is the list of several alternative solutions that can be used in this case:

     a. Perform incomplete recovery
     b. Truncate the objects referenced by that transaction and import them
     c. Use _corrupted_rollback_segments parameter to drop that rollback segment
     d. Insert dummy entries into dba_2pc views and either commit or rollback force the distributed transaction.

The first three solutions are discussed in Backup and Recovery manuals and in  the notes referred above. In the 4th solution a dummy entry is inserted into  the dictionary so that the transaction can be manually committed or rolled back.

Note that RECO will not be able to process this txn and distributed txn recovery should be disabled before using this method. Furthermore, please take a BACKUP of your database before using this method.

The following example describes how to diagnose and resolve this case. Suppose that users are receiving

ORA-1591: lock held by in-doubt distributed transaction 29.32.93725 and the following query returns no rows:

SQL> select local_tran_id, state from dba_2pc_pending where local_tran_id='29.32.93725';
     no rows selected

Furthermore querying the rollback segment shows that 29.32.93725 remains in prepared state

SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
           KTUXESTA Status, KTUXECFL Flags
    FROM x$ktuxe
    WHERE ktuxesta!='INACTIVE'
     AND ktuxeusn= 1;  /* <== Replace this value with your txn undo seg#

Which is displayed in the first part of the transaction ID */

     KTUXEUSN   KTUXESLT   KTUXESQN STATUS           FLAGS
    ---------- ---------- ---------- ---------------- ------------------------
             1         92      66874 PREPARED         SCO|COL|REV|DEAD

Trying to manually commit or rollback this transaction
SQL> commit force '29.32.93725';

ORA-02058: no prepared transaction found with ID 29.32.93725

raises ORA-02058 since dba_2pc views are empty. In order to use commit force or  rollback force a dummy record should be inserted into pending_trans$ as follows:

SQL> alter system disable distributed recovery;
    insert into pending_trans$ ( LOCAL_TRAN_ID, GLOBAL_TRAN_FMT,
        GLOBAL_ORACLE_ID, STATE, STATUS, SESSION_VECTOR,
        RECO_VECTOR, TYPE#, FAIL_TIME, RECO_TIME)
    values( '29.32.93725', /* <== Replace this with your local tran id */
        306206,                  /*                                         */
        'XXXXXXX.12345.1.2.3',   /*  These values can be used without any    */
        'prepared','P',          /*  modification. Most of the values are   */
        hextoraw( '00000001' ),  /*  constant.                              */
        hextoraw( '00000000' ),  /*                                         */
        0, sysdate, sysdate );

    insert into pending_sessions$
    values( '29.32.93725',/* <==Replace only this with your local tran id */
        1, hextoraw('05004F003A1500000104'),
        'C', 0, 30258592, '',
        146
      );

   commit;
   commit force '29.32.93725';

 If commit force raises an error then note the errormessage and execute the following:

SQL> delete from pending_trans$ where local_tran_id='29.32.93725';
SQL> delete from pending_sessions$ where local_tran_id='29.32.93725';
SQL>   commit;
SQL> alter system enable distributed recovery;

Otherwise run purge the dummy entry from the dictionary, using  
SQL> alter system enable distributed recovery;
SQL> connect / as sysdba
SQL> COMMIT;

Use following query to retrieve the value for such _smu_debug_mod parameter:
col Parameter for a20
col "Session Value" for a20
col "Instance Value" for a20

SQL> SELECT a.ksppinm "Parameter",b.ksppstvl "Session Value",c.ksppstvl "Instance Value"
FROM x$ksppi a, x$ksppcv b, x$ksppsv c
WHERE a.indx = b.indx AND a.indx = c.indx AND a.ksppinm = '_smu_debug_mode'
/

-- set it temporarily to 4:
SQL> alter system set "_smu_debug_mode" = 4;  /* if automatic undo management
is being used */

2.3 How to PURGE the DISTRIBUTED transaction in PREPARED state, when COMMIT
 or ROLLBACK FORCE hangs ?, where we have entries for both Distributed
 transaction and dba_2pc entries.

ORA-01591: lock held by in-doubt distributed transaction 19.8.406254

The row exist from dba_2pc_pending & Rollback segment

SQL> SELECT LOCAL_TRAN_ID,STATE FROM DBA_2PC_PENDING;
LOCAL_TRAN_ID STATE
----------------- -----------
19.8.406254 prepared

SQL> SELECT KTUXEUSN, KTUXESLT, KTUXESQN, /* Transaction ID */
KTUXESTA Status, KTUXECFL Flags
FROM x$ktuxe
WHERE ktuxesta!='INACTIVE'
AND ktuxeusn= 44; /* <== Replace this value with your txn undo seg#

Which is displayed in the first part of the transaction ID */

KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
44 88 85589 PREPARED SCO|COL|REV|DEAD

SQL> Commit force 19.8.406254;
SQL> rollback force 19.8.406254;
Executing COMMIT or ROLLBACK FORCE hangs :
The wait event is ""free global transaction table entry"
Purging the transaction should fail with below error:

SQL> EXEC DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('19.8.406254');

BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('19.8.406254'); END;
*
ERROR at line 1:
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at "SYS.DBMS_TRANSACTION", line 94
ORA-06512: at line 1

Solution:
--------
You have to implement both the solution :
2.1 Dba_2pc entries without a corresponding transaction
2.2 Distributed transaction without corresponding dba_2pc entries
1.
SQL> delete from sys.pending_trans$ where local_tran_id = '19.8.406254';
SQL> delete from sys.pending_sessions$ where local_tran_id = '19.8.406254';
SQL> delete from sys.pending_sub_sessions$ where local_tran_id ='19.8.406254';
SQL>    commit;

2. Now insert the dummy record as explained in section:
2.2 Distributed transaction without corresponding dba_2pc entries
       commit;
3. Commit force '19.8.406254'
4. Purge the transaction:

SQL>   exec dbms_transaction.purge_lost_db_entry('19.8.406254');
You can prepare the dynamic query to clean if you have more no. of force commited transactions / rollbacked transactions.

SQL> select 'execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('''||local_tran_id||''');' ||chr(10)||'commit;' from dba_2pc_pending where state='forced commit';

Take the output to a .sql file and run it in sql window.