row_number()
, and I've used it in the past to identify, then remove duplicates.In this case, I have an APEX collection that represents a session based view history of products/people/events, or whatever your users might be browsing.
I've created an option to consolidate that view history, and remove any record you might have opened more than once.
Collections are a little hard to play with outside of APEX, so I use the create session procedure in the OraOpenSource libraries to simulate an APEX session within SQL Developer.
exec oos_util_apex.create_session(120,'WESLEYS')
For my test case I simulate adding a few entries in my collection, varying a date column slightly.
begin apex_collection.create_or_truncate_collection('TEST'); apex_collection.add_member('TEST', 'Scott', p_d001 => sysdate - 5); apex_collection.add_member('TEST', 'Dmitri', p_d001 => sysdate - 2); apex_collection.add_member('TEST', 'Lino', p_d001 => sysdate - 4); apex_collection.add_member('TEST', 'Scott', p_d001 => sysdate - 3); apex_collection.add_member('TEST', 'Joel', p_d001 => sysdate - 2); apex_collection.add_member('TEST', 'Sabine', p_d001 => sysdate - 4); apex_collection.add_member('TEST', 'Penny', p_d001 => sysdate - 5); apex_collection.add_member('TEST', 'Scott', p_d001 => sysdate - 6); apex_collection.add_member('TEST', 'Jackie', p_d001 => sysdate - 3); apex_collection.add_member('TEST', 'Joel', p_d001 => sysdate - 1); end; /
Here is a query that will use an analytical function to add a computed column that identifies the most recent entry for any name, and assign it a 1. Any subsequent entries for that name will get a 2, 3, 4 etc.
select seq_id, c001 name ,row_number() over (partition by c001 -- look for duplicates in this set of columns order by d001 desc -- put records I want to keep first ) rn ,d001 dt from apex_collections where collection_name = 'TEST' order by name;
In this case, seq_id 1, 5 & 6 represent the records I want to trim from the list.
Duplicate entries highlighted |
select seq_id, rn from (select seq_id, row_number() over (partition by c001 order by d001 desc) rn from apex_collections where collection_name = 'TEST' ) where rn != 1; SEQ_ID RN ---------- ---------- 6 2 1 2 5 3
Reverse the order by, and I'll keep the first entry instead.
Got more columns that signify uniqueness? Expand the partition by clause.
A standard aggregate query would be able to identify the names of those duplicates, plus how many you have, but no set of records with the unique identifiers to remove. Using
min(seq_id)
would be insufficient once you more than two entries.select count(*), c001 name, min(seq_id) from apex_collections where collection_name = 'TEST' group by c001 having count(*) > 1 C NAME MN --- ---------- --- 2 Joel 6 3 Scott 1
We can fold the analytical query into a PL/SQL loop and remove the duplicates from my collection.
begin << remove_duplicates >> for r_rec in ( select seq_id, rn from (select seq_id, row_number() over (partition by c001 order by d001 desc) rn from apex_collections where collection_name = 'TEST' ) where rn != 1 ) loop APEX_COLLECTION.DELETE_MEMBER (p_collection_name => 'TEST' ,p_seq => r_rec.seq_id); end loop remove_duplicates; end anon; /
Where the resulting collection is minus the older duplicate.
SEQ_ID NAME RN DT ---------- ---------- --- ---------- 2 Dmitri 1 2018-09-16 9 Jackie 1 2018-09-15 10 Joel 1 2018-09-17 3 Lino 1 2018-09-14 8 Penny 1 2018-09-13 7 Sabine 1 2018-09-14 4 Scott 1 2018-09-15
Of course, this is not limited to APEX collections, but it gave me a chance to play with more toys.
No comments:
Post a Comment