After a couple of replies I decided to enter my reply, in full nerd mode.
select * from people_coming_to_lunch; People -------- Kate Scott Karolina 3 rows selected.And of course one of the other SQL geeks (name redacted) replied to extend the data set.
select * from people_coming_to_lunch union select 'Shanequa' from dual;And I couldn't help myself. I had to play the performance card and suggest that UNION ALL would be the appropriate usage, and should be the default you type out. Always. Until you decide otherwise.
That's because sorts are expensive. And a UNION will need sorting to check for duplicates.
That sort of all the rows isn't necessary |
create table people_coming_to_lunch (people varchar2(30)); insert into people_coming_to_lunch values ('Scott'); insert into people_coming_to_lunch values ('Kate'); insert into people_coming_to_lunch values ('Karolina'); create unique index lunch_people on people_coming_to_lunch(people); select * from people_coming_to_lunch union all select 'Shanequa' from dualBy using UNION ALL instead of UNION, you're telling the database not to even bother sorting the set to eliminate any potential duplicates, since your advanced human brain knows there will be no duplicates.
With only a few rows, the timing of sheer throughput is barely noticable.
iterations:1000 0.30 secs (.0003 secs per iteration) -- UNION 0.25 secs (.00025 secs per iteration) -- UNION ALL iterations:10000 1.72 secs (.000172 secs per iteration) 1.09 secs (.000109 secs per iteration) iterations:50000 10.94 secs (.0002188 secs per iteration) 8.48 secs (.0001696 secs per iteration)So I turned it up a notch and added about 5000 rows to the table.
insert into people_coming_to_lunch select table_name from all_tables; 5000 rows insertedHere's the explain plan without the sort.
That's one less chunk of 5000 rows to process |
Now the differences in performance stand out.
iterations:1000 6.79 secs (.00679 secs per iteration) -- UNION 2.85 secs (.00285 secs per iteration) -- UNION ALL iterations:5000 42.91 secs (.008582 secs per iteration) 19.89 secs (.003978 secs per iteration) iterations:5000 31.70 secs (.00634 secs per iteration) 22.83 secs (.004566 secs per iteration) iterations:5000 30.75 secs (.00615 secs per iteration) 16.76 secs (.003352 secs per iteration)Upto twice as long for the same statement?
No thanks, not when I could just type 4 extra characters to get an easy performance win.
Turns out this topic formed my first technical post. Back in 2009, after almost 10 years of using SQL, that was the first thing I blogged about. How about that.
No comments:
Post a Comment