Tuesday, 10 December 2019

Where would we be if we just believe?

As a science aficionado, there are certain phrases that ... catch the eye.

Recently on twitter there was an interesting thread that continued from Michelle Skamene's post on Top 15 Tuning Tips for APEX.  Michelle provided a wonderful follow-up post summarising the outcomes of the thread.

Point 9 suggests we avoid HTML in our queries, and use HTML expressions. This is undeniably good practice, but there was a question regarding how much performance is gained. Patrick Wolf summarises it well (sorting, XSS, context switching).

I've been curious about this for a while myself, and what better way to verify the truth than do some science ;p

I thought about the recent experimentation I did with interpreted code, and recent client work I've done with bind variables, so I thought I'd not only compare timings with a simple test harness I use, but see what v$sqlarea had to say.

When considering what SQL to compare, I decided to use the simple bit of SQL I used for my AskTOM Office Hours demo app. I saw how quickly the embedded HTML expanded to fit repeated business rules, and why not use something simple. If any difference was to be seen, let's see if it shows up with something basic, just like Juergen suggested.

The first query is just a simple query on scott.emp, with an extra expression to determine if the row is 'special'. This information is used declaratively within APEX, so these calculations are absent from this testing. The complexity has been transferred, but simplified. That's the reason it's best practice.
cursor c_1 is
select /*+ qb_name(c1) */ empno, ename, job, hiredate
  ,case
   when hiredate < date '2000-01-01'
     then 'special'
   end last_cent
from scott.emp
So, let's just concentrate on the differences in the SQL we can measure.

The second query has the HTML tags embedded, repeated on columns where I substituted the column into the class attribute in the previous example. The code required expands quickly, hence the argument to keep the SQL and presentation layer separate.
select /*+ qb_name(c2) */empno
  ,'<b>'||ename||'<b>' ename
  ,case when hiredate < date '2000-01-01'
    then '<span style="color:purple;font-weight:bold;">'
     ||'<span title="'||job||'">'||to_char(hiredate,'fmddth Mon YYYY')||'</span>'
   end ||'</span>'
   as hireDate
  ,case when hiredate < date '2000-01-01'
     then '<span style="color:purple;font-weight:bold;">'||job||'</span>'
   end job
from scott.emp
The third query introduces context switching to PL/SQL due to the inclusion of htf.bold and apex_escape.html. This protects the output we'd otherwise have escaped using the relevant property. Or you could mitigate usage by sanitising data on the way in.
cursor c_3 is
select /*+ qb_name(c3) */empno
  ,htf.bold(apex_escape.html(ename)) ename
  ,case when hiredate < date '2000-01-01'
    then '<span style="color:purple;font-weight:bold;">'
     ||'<span title="'||apex_escape.html(job)||'">'||to_char(hiredate,'fmddth Mon YYYY')||'</span>'
   end ||'</span>'
   as hireDate
  ,case when hiredate < date '2000-01-01'
     then '<span style="color:purple;font-weight:bold;">'||apex_escape.html(job)||'</span>'
   end job
from scott.emp;
If we just compare throughput, there is a clear loser. Context switching between SQL & PL/SQL is just too much, though I would like to think in recent versions of the database, those functions could be enhanced with such devices as the UDF pragma?
iterations:50000
     3.98 secs (.0000796 secs per iteration)
     5.16 secs (.0001032 secs per iteration)
    33.09 secs (.0006618 secs per iteration)
I just run these queries lots of times to measure throughput. Tom Kyte wrote a more enhanced test bed called runstats, if you want juicier details.
-- and start timing...
    l_start := dbms_utility.get_time;

    FOR i IN 1 ..c_iterations
    LOOP
     for x in c_1
     loop
        null;
     end loop;

    END LOOP;
    l_end := dbms_utility.get_time-l_start;
    dbms_output.put_line( TO_CHAR(l_end/100, '99990.00') 
                       || ' secs ('||(l_end/c_iterations/100) || ' secs per iteration)' );
What about v$sqlarea?
select substr(sql_text, 1, 22)sql, fetches, parse_calls, buffer_gets, sharable_mem
   ,persistent_mem, runtime_mem, user_io_wait_time
   ,plsql_exec_time, cpu_time, elapsed_time, physical_read_bytes 
from v$sqlarea  
where sql_text like 'SELECT%SCOTT.EMP%'
order by 1

v$sqlarea results

These statistics validate the timings, and I think also validates this as a performance tip that belongs in Michelle's list.
While performance difference with/without HTML may be marginal in this case, the supplementary benefits make it clear best practice.

No comments: