We had a piece of PL/SQL code that copies data from remote view to a local table. This code executes once a day and took about 4 hours. As if this is not bad enough, we received a request to run this piece of code 3-4 times daily which will total up to about 16 hours per day.
The original code to copy data from a remote view consists of 3 steps:
- Copy over employee numbers (one column only) from remote view to local temp table. This step takes few seconds only.
- Use employee numbers copied locally to do a lookup serially (lookup row by row) from remote view. This step takes about 4 hours to copy 20,000 rows. The reason it takes that long is one column from the view is an actual function. Most of the time is spent executing this function and for so many reasons (not all are technical) we can’t get to function to tune it.
- Commit one time at the end.
Here how the original code looks like:
– Step 1:
– truncate local temp table so we can copy employee numbers
– copy over employees numbers to local temp table from the remote view
1 | execute immediate 'truncate table temp_emp_no' ; |
2 | execute immediate 'truncate table employees' ; |
4 | insert into temp_emp_no select distinct EMPLOYEE_NUMBER from remote_employee_view@remote_link; |
– Step 2:
– Loop through copied over employee numbers and do a remote lookup 1 row at a time
02 | select emp_no from temp_emp_no; |
05 | exit when c 1% notfound; |
07 | select EMPLOYEE_NUMBER, first_name, last_name, RATE, PERIOD, total_rate |
08 | from remote_employee_view@remote_link; |
09 | where EMPLOYEE_NUMBER = i.emp_no; |
At this point we were left with one option only which is run the PL/SQL code in parallel (multithreading), but as you might now, PL/SQL doesn’t support multithreading natively,
To simulate multithreading we need to accomplish 2 steps:
- Break the job in multiple pieces (threads).
- Schedule every thread to run concurrently.
Remember first step which is coping employee numbers only from remote view to local table. We hash-partitioned this local table into 4 partitions. Each of these partitions will translate into a thread as you will see.
3 | PARTITION BY HASH (EMP_NO) |
This is the thread code (stored procedure) which takes in a partition name as a parameter and copies employee data from remote view for that partition only.
01 | CREATE procedure refresh_employee_data_part (p_name in varchar 2 ) |
04 | TYPE EmpCurTyp IS REF CURSOR; |
05 | v_emp_cursor EmpCurTyp; |
06 | sql_stmt varchar 2 ( 2048 ); |
07 | v_emp_no varchar 2 ( 30 ); |
10 | sql_stmt := 'select emp_no from vb_emp_no partition(' || p_name || ')' ; |
11 | open v_emp_cursor for sql_stmt ; |
14 | fetch v_emp_cursor into v_emp_no; |
15 | EXIT WHEN v_emp_cursor%NOTFOUND; |
19 | RATE, PERIOD, TOTAL_RATE) |
20 | select distinct employee_number, vacation_balance, rate, period, total_rate |
21 | from remote_employee_view@remote_link; |
22 | where EMPLOYEE_NUMBER = v_emp_no; |
This procedure glues all parts together. We schedule the previous procedure 4 times and achieve concurrency.
01 | CREATE procedure refresh_employees |
05 | execute immediate 'truncate table vb_emp_no' ; |
07 | select distinct employee_number from remote_view@db_link; |
09 | execute immediate 'truncate table employees' ; |
11 | dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name( 'VB_P1_' ), |
12 | job_type => 'PLSQL_BLOCK' , |
13 | job_action => 'begin refresh_employee_data_part(' 'VB_EMP_NO_P1' '); end;' , |
14 | comments => 'Thread 1 to refresh employees' , |
18 | dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name( 'VB_P2_' ), |
19 | job_type => 'PLSQL_BLOCK' , |
20 | job_action => 'begin refresh_employee_data_part(' 'VB_EMP_NO_P2' '); end;' , |
21 | comments => 'Thread 2 to refresh employees' , |
25 | dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name( 'VB_P3_' ), |
26 | job_type => 'PLSQL_BLOCK' , |
27 | job_action => 'begin refresh_employee_data_part(' 'VB_EMP_NO_P3' '); end;' , |
28 | comments => 'Thread 3 to refresh employees' , |
32 | dbms_scheduler.create_job(job_name => dbms_scheduler.generate_job_name( 'VB_P4_' ), |
33 | job_type => 'PLSQL_BLOCK' , |
34 | job_action => 'begin refresh_employee_data_part(' 'VB_EMP_NO_P4' '); end;' , |
35 | comments => 'Thread 4 to refresh employees, |
The procedure would finish immediately even if there are errors. To check the execution status look under DBA_SCHEDULER_JOB_RUN_DETAILS to look for errors and running time.
End Result the job finished in 50 minutes more than 4x faster.
Hazem Ameen
Senior Oracle DBA
0 comments:
Post a Comment