Wednesday, 19 September 2018

Remove duplicate from APEX collection

One of my favourite SQL analytic functions is row_number(), and I've used it in the past to identify, then remove duplicates.

In this case, I have an APEX collection that represents a session based view history of products/people/events, or whatever your users might be browsing.

I've created an option to consolidate that view history, and remove any record you might have opened more than once.

Collections are a little hard to play with outside of APEX, so I use the create session procedure in the OraOpenSource libraries to simulate an APEX session within SQL Developer.

exec oos_util_apex.create_session(120,'WESLEYS')

For my test case I simulate adding a few entries in my collection, varying a date column slightly.
begin
apex_collection.create_or_truncate_collection('TEST');
apex_collection.add_member('TEST', 'Scott',  p_d001 => sysdate - 5);
apex_collection.add_member('TEST', 'Dmitri', p_d001 => sysdate - 2);
apex_collection.add_member('TEST', 'Lino',   p_d001 => sysdate - 4);
apex_collection.add_member('TEST', 'Scott',  p_d001 => sysdate - 3);
apex_collection.add_member('TEST', 'Joel',   p_d001 => sysdate - 2);
apex_collection.add_member('TEST', 'Sabine', p_d001 => sysdate - 4);
apex_collection.add_member('TEST', 'Penny',  p_d001 => sysdate - 5);
apex_collection.add_member('TEST', 'Scott',  p_d001 => sysdate - 6);
apex_collection.add_member('TEST', 'Jackie', p_d001 => sysdate - 3);
apex_collection.add_member('TEST', 'Joel',   p_d001 => sysdate - 1);
end;
/

Here is a query that will use an analytical function to add a computed column that identifies the most recent entry for any name, and assign it a 1. Any subsequent entries for that name will get a 2, 3, 4 etc.
select seq_id, c001 name
  ,row_number() over 
    (partition by c001 -- look for duplicates in this set of columns
     order by d001 desc -- put records I want to keep first
     ) rn
     ,d001 dt
from apex_collections
where collection_name = 'TEST'
order by name;

In this case, seq_id 1, 5 & 6 represent the records I want to trim from the list.

Duplicate entries highlighted

Turn the query into an in-line view to filter out any records where the row_number() is not 1, as we are not allowed to have window functions in the where clause (ORA-30483).
select seq_id, rn from
 (select seq_id, row_number() over (partition by c001 order by d001 desc) rn
  from apex_collections
  where collection_name = 'TEST'
) where rn != 1;

    SEQ_ID         RN
---------- ----------
         6          2
         1          2
         5          3

Reverse the order by, and I'll keep the first entry instead.
Got more columns that signify uniqueness? Expand the partition by clause.

A standard aggregate query would be able to identify the names of those duplicates, plus how many you have, but no set of records with the unique identifiers to remove. Using min(seq_id) would be insufficient once you more than two entries.
select count(*), c001 name, min(seq_id)
from apex_collections
where collection_name = 'TEST'
group by c001
having count(*) > 1

  C NAME        MN
--- ---------- ---
  2 Joel         6
  3 Scott        1

We can fold the analytical query into a PL/SQL loop and remove the duplicates from my collection.
begin
  << remove_duplicates >>
  for r_rec in (
    select seq_id, rn from
     (select seq_id, row_number() over (partition by c001 order by d001 desc) rn
      from apex_collections
      where collection_name = 'TEST'
    ) where rn != 1
  ) loop
    APEX_COLLECTION.DELETE_MEMBER
      (p_collection_name =>  'TEST'
      ,p_seq             => r_rec.seq_id);
  end loop remove_duplicates;
end anon;
/

Where the resulting collection is minus the older duplicate.
SEQ_ID     NAME        RN DT        
---------- ---------- --- ----------
         2 Dmitri       1 2018-09-16
         9 Jackie       1 2018-09-15
        10 Joel         1 2018-09-17
         3 Lino         1 2018-09-14
         8 Penny        1 2018-09-13
         7 Sabine       1 2018-09-14
         4 Scott        1 2018-09-15

Of course, this is not limited to APEX collections, but it gave me a chance to play with more toys.

No comments: