Nigel Noble's Oracle Blog

23/07/2010

10gR2 – 11gR2, ASSM Space Management “Bug”

Filed under: 10gR2, 11gR1, 11gR2, assm — Nigel Noble @ 10:59 am

ASSM (Automatic Segment Space Management) has an issue when trying to re-use “deleted” space created by another session.  There is a very specific set of circumstances which must occur for this issue to show itself, but will result in tables (and I suspect indexes) growing significantly larger than they need to be. I am aware that the problem exists in versions 10.2.0.3 through to the current 11gR2 inclusive although I don’t know which Oracle release first introduced the problem.

The conditions required to cause the issue

My site has a number of daemon style jobs running permanently on the database loading data into a message table. The daemon is managed by dbms_job and is re-started when the database is first started and can run for many days, weeks (or in some cases months) before the job is ever stopped. We only need to keep the messages for a short time, so we have another daemon job whose role is to delete the messages from the table as soon as the expiry time is reached. In one example we only need to retain the data for a few minutes (after which time we no longer need it) and we also wanted to keep the table as small as possible so it remained cached in the buffer cache (helped by a KEEP pool). When we developed the code, we expected the message table to remain at a fairly constant size of 50 – 100MB in size. What we found was the table continued to grow at a consistent rate to many gigabytes in size until we stopped the test. If we copied the table (eg using “create table <new table> as select * from <current table>”) the new table would again be 50MB so most of the space in the table was empty “deleted” space. The INSERT statements were never re-using the space made free by the delete statement (run in another session).

(more…)

Advertisements

05/07/2010

10.2.0.5, KEEP pool / Serial Direct Read

Filed under: 10.2.0.5, 11gR1, Performance — Nigel Noble @ 11:36 am

Jonathan Lewis made reference to a 11g bug related to using a KEEP POOL in his note Not KEEPing.  Oracle 11g introduced a new feature called adaptive serial direct path reads which allows large “Full Scan” disk reads  to be read using “direct path reads” rather than through the buffer cache. In many cases “direct IO” can give significant increase in performance when reading data for first time (from disk), however can be significantly slower if your subsequent queries could have been serviced from the buffer cache. The bug Jonathan references (Bug 8897574) causes problems if you assign any large object to a KEEP POOL because by default, 11g would read large objects using the new direct IO feature and avoid  ever placing the object in the KEEP POOL. The whole point of using the KEEP POOL is to identify objects you do want to protect and keep in a cache. 

The 10.2.0.5 patchset has also back-ported the same direct read feature which is new to 11g although I don’t know if the rules are the same as 11g. The site where I work makes significant use of KEEP pools and also has spent some time investigating aspects relating to serial direct IO vs. buffer cache IO. 

 I want to use this blog entry to explore a number of related issues but also demonstrate that the 11g bug Jonathan identified seems to also exists in 10.2.0.5 patchset (and 11gR2). This blog item will cover:

  • Brief reference relating to the 11g “adaptive serial direct path read”
  • The 10.2.0.5 implementation and how to switch it on
  • 10.2.0.5 demonstration showing the relative difference for different types of IO vs read from cache
  • 10.2.0.5 demonstration which shows the KEEP pool bug also exists (but not by default)
  • Some real life comparison figures of disk reads via “direct path read” and via “buffer cache” to show why the “adaptive serial direct path read” feature is worth exploring in more detail.   

 

(more…)

11/05/2010

11.1.0.7 poor plsql array performance

Filed under: 11gR1, Performance — Nigel Noble @ 2:05 pm

The PLSQL application at my site is dependent on the usage of very large PLSQL arrays (several GB in size). During the testing on 11.1.0.7 we found a severe performance issue when the application first loaded the plsql arrays with data. The application used to load the array data in tens of seconds but was now taking 6 or 7 minutes to do the same work. Further investigation showed that this problem did not exist on any other version of Oracle (We tested 10.2.0.3, 10.2.0.4, 11.1.0.6 and 11.2.0.1).

A colleague wrote a test case to show the issue for Oracle support so they could identify a fix and we could request a patch:

Source of the script:
Package Header (has no body!)

CREATE OR REPLACE PACKAGE tst_pkg_array IS

   TYPE typ_rec_1 IS RECORD(
       attr1  NUMBER(12)
      ,attr2  NUMBER(12)
      ,attr3  NUMBER(12)
      ,attr4  NUMBER(12)
      ,attr5  NUMBER(12)
      ,attr6  VARCHAR2(100)
      ,attr7  VARCHAR2(100)
      ,attr8  VARCHAR2(100)
      ,attr9  VARCHAR2(100)
      ,attr10 VARCHAR2(100)
      ,attr11 DATE
      ,attr12 DATE
      ,attr13 DATE
      ,attr14 DATE
      ,attr15 DATE
      ,attr16 NUMBER(12)
      ,attr17 NUMBER(12)
      ,attr18 NUMBER(12)
      ,attr19 NUMBER(12)
      ,attr20 NUMBER(12)
      ,attr21 VARCHAR2(100)
      ,attr22 VARCHAR2(100)
      ,attr23 VARCHAR2(100)
      ,attr24 VARCHAR2(100)
      ,attr25 VARCHAR2(100));

   TYPE typ_tab_1 IS TABLE OF typ_rec_1 INDEX BY PLS_INTEGER;

   tab_test_simple typ_tab_1;

END tst_pkg_array;
/

Test script to generate timings

set timing on

spool run.log append

set time on

select * from v$version;

set serveroutput on size 100000

DECLARE

   --
   PROCEDURE pr_put_elements_into_array(i_num_elements IN NUMBER) IS

      l_rec_test tst_pkg_array.typ_rec_1;

      tab_test_local tst_pkg_array.typ_tab_1;

   BEGIN
      --
      -- start by clearing down
      tst_pkg_array.tab_test_simple.delete;
      tst_pkg_array.tab_test_simple := tab_test_local;
      --
      l_rec_test.attr1  := 10000000;
      l_rec_test.attr2  := 10000000;
      l_rec_test.attr3  := 10000000;
      l_rec_test.attr4  := 10000000;
      l_rec_test.attr5  := 10000000;
      l_rec_test.attr6  := 'ABCDEFGH';
      l_rec_test.attr7  := 'ABCDEFGH';
      l_rec_test.attr8  := 'ABCDEFGH';
      l_rec_test.attr9  := 'ABCDEFGH';
      l_rec_test.attr10 := 'ABCDEFGH';
      l_rec_test.attr11 := SYSDATE;
      l_rec_test.attr12 := SYSDATE;
      l_rec_test.attr13 := SYSDATE;
      l_rec_test.attr14 := SYSDATE;
      l_rec_test.attr15 := SYSDATE;
      l_rec_test.attr16 := 10000000;
      l_rec_test.attr17 := 10000000;
      l_rec_test.attr18 := 10000000;
      l_rec_test.attr19 := 10000000;
      l_rec_test.attr20 := 10000000;
      l_rec_test.attr21 := 'ABCDEFGH';
      l_rec_test.attr22 := 'ABCDEFGH';
      l_rec_test.attr23 := 'ABCDEFGH';
      l_rec_test.attr24 := 'ABCDEFGH';
      l_rec_test.attr25 := 'ABCDEFGH';
      --
      FOR n IN 1 .. i_num_elements LOOP
         tst_pkg_array.tab_test_simple(n) := l_rec_test;
--       Using the line below instead of the one above (ie using a local variable rather than a package state variable) resolves the issue on 11.1.0.7
--       To test that, comment out the line above, and "comment in" the line below
--         tab_test_local(n) := l_rec_test;
      END LOOP;

   END pr_put_elements_into_array;

   PROCEDURE pr_run_test(i_num_elements IN NUMBER) IS
      l_duration_secs NUMBER;
      l_start_time    NUMBER;

   BEGIN

      l_start_time := DBMS_UTILITY.get_time;
      pr_put_elements_into_array(i_num_elements => i_num_elements);
      l_duration_secs := (DBMS_UTILITY.get_time - l_start_time) / 100;
      dbms_output.put_line(i_num_elements || ' elements took ' || (l_duration_secs) || ' secs or ' ||
                           (l_duration_secs / (i_num_elements / 1000)) || ' secs per thousand');
   END pr_run_test;
BEGIN
   pr_run_test(i_num_elements => 1000);
   pr_run_test(i_num_elements => 10000);
   pr_run_test(i_num_elements => 100000);
   pr_run_test(i_num_elements => 200000);
   pr_run_test(i_num_elements => 300000);
END;
/

exit

Output on 11.1.0.7

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
PL/SQL Release 11.1.0.7.0 - Production
CORE    11.1.0.7.0      Production
TNS for Linux: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production

Elapsed: 00:00:00.40
1000 elements took .01 secs or .01 secs per thousand
10000 elements took .09 secs or .009 secs per thousand
100000 elements took 6.1 secs or .061 secs per thousand
200000 elements took 21.65 secs or .10825 secs per thousand
300000 elements took 46.55 secs or .1551666666666666666666666666666666666667
secs per thousand

PL/SQL procedure successfully completed.

Elapsed: 00:01:14.40

The above example shows the script took over 1 minute to complete and you can see the result degreades exponentially (per 1000 requests) as more entries are added to the array.

Output on 10.2.0.4


BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

Elapsed: 00:00:00.02
1000 elements took .01 secs or .01 secs per thousand
10000 elements took .05 secs or .005 secs per thousand
100000 elements took .51 secs or .0051 secs per thousand
200000 elements took 1.02 secs or .0051 secs per thousand
300000 elements took 1.17 secs or .0039 secs per thousand

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.89

Patch details:

Patch 7671793: EXCESSIVE MEMORY USAGE WHEN USING KGHU
patch is avaiable for Linux x86-64)

Output once patch was applied:

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
PL/SQL Release 11.1.0.7.0 - Production
CORE    11.1.0.7.0      Production
TNS for Linux: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production

Elapsed: 00:00:00.06
1000 elements took 0 secs or 0 secs per thousand
10000 elements took .05 secs or .005 secs per thousand
100000 elements took .55 secs or .0055 secs per thousand
200000 elements took .88 secs or .0044 secs per thousand
300000 elements took 1.25 secs or .004166666666666666666666666666666666666667
secs per thousand

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.84

You can now see with the patch applied, the above test takes less than 3 seconds to complete.

Although our application uses plsql arrays in a very extreme way, it may be worth reviewing this patch if your application has some unexplained slow down and is using large plsql arrays AND 11.1.0.7.

I have tested this issue on 11.1.0.7.3 (latest PSU patch 9352179) and the problem still exists. It’s not clear if this is a generic bug or one specific to our platform (Linux x86-64). Anoyingly I have found that the available patch ( 7671793) clashes with all the avaiable  11.1.0.7 psu patches, so a merge patch would be needed or you could try a non 11.1.0.7 version of Oracle.

Create a free website or blog at WordPress.com.