Remove Postgres / Redshift duplicates without unique key

Sometimes you get duplicates in a database and have to deal with them. There are two possibilities: either the table has an auto-generated unique id or it does not. The first case is very easy.

Let's say we have a table feedback where we collect all the feedback given by our users. This table has a field feedback_id which represents the unique id from the external system that posted the data.

id    feedback_id  type      date
0001  MX-001       livechat  2020-01-01
0002  MX-001       livechat  2020-01-01
0003  MX-001       livechat  2020-01-01
0004  MX-002       pendo     2020-01-03
0005  MX-002       pendo     2020-01-03

If the table has an auto-generated id, we are safe:

delete from feedback
where id in (
    select id
    from (
        select id,
            row_number() over (partition by feedback_id order by date desc) rn
        from feedback
    ) dups
    where rn > 1
);

Using a derived table instead of a WITH clause inside the subquery keeps this working on both Postgres and Redshift.
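The pattern is easy to sanity-check locally before running it against production. Here is a sketch using Python's built-in sqlite3 module (SQLite 3.25+ for window functions), with the sample rows from the table above; the column types and values are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table feedback (id text, feedback_id text, type text, date text);
    insert into feedback values
        ('0001', 'MX-001', 'livechat', '2020-01-01'),
        ('0002', 'MX-001', 'livechat', '2020-01-01'),
        ('0003', 'MX-001', 'livechat', '2020-01-01'),
        ('0004', 'MX-002', 'pendo',    '2020-01-03'),
        ('0005', 'MX-002', 'pendo',    '2020-01-03');
""")

# Delete every row except one per feedback_id; row_number() marks
# all but the first row in each partition with rn > 1, and the
# unique id lets us target exactly those rows.
conn.execute("""
    delete from feedback
    where id in (
        select id
        from (
            select id,
                row_number() over (partition by feedback_id order by date desc) rn
            from feedback
        ) dups
        where rn > 1
    )
""")

rows = conn.execute("select id, feedback_id from feedback order by feedback_id").fetchall()
print(rows)  # two rows remain, one per feedback_id
```

Note that with tied dates, which specific id survives within a group is arbitrary; add the id to the order by if you need a deterministic winner.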

But what to do if there is no unique id? One approach is a temporary table: first we create feedback_temp, holding a single record from each set of duplicates.

CREATE TABLE feedback_temp as (
    with dups as (
        select feedback_id
        from feedback
        group by 1
        having count(*) > 1
    )
    , sorted as (
        select f.*,
            row_number() over (partition by f.feedback_id order by f.date desc) rn
        from feedback f
            join dups d on d.feedback_id = f.feedback_id
    )
    select feedback_id, type, date from sorted where rn = 1
);

The final select lists the table's columns explicitly so the helper rn column does not end up in feedback_temp; otherwise the insert back into feedback at the end would fail on a column mismatch.

The next step is to remove the duplicated rows from the initial table:

delete from feedback
where feedback_id in (
    select feedback_id
    from feedback
    group by 1
    having count(*) > 1
);

Then we just need to insert the data back and drop the temp table:

BEGIN TRANSACTION;

insert into feedback
(select * from feedback_temp);

drop table feedback_temp;

END TRANSACTION;
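The whole no-unique-id flow can be rehearsed the same way with sqlite3. The sketch below condenses the delete step slightly (it reuses feedback_temp's ids instead of re-running the group by), and the table layout and sample rows are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table feedback (feedback_id text, type text, date text);
    insert into feedback values
        ('MX-001', 'livechat', '2020-01-01'),
        ('MX-001', 'livechat', '2020-01-01'),
        ('MX-001', 'livechat', '2020-01-01'),
        ('MX-002', 'pendo',    '2020-01-03'),
        ('MX-002', 'pendo',    '2020-01-03'),
        ('MX-003', 'email',    '2020-01-05');

    -- 1. keep one row per duplicated feedback_id in a temp table
    create table feedback_temp as
        with dups as (
            select feedback_id
            from feedback
            group by 1
            having count(*) > 1
        ),
        sorted as (
            select f.*,
                row_number() over (partition by f.feedback_id order by f.date desc) rn
            from feedback f
                join dups d on d.feedback_id = f.feedback_id
        )
        select feedback_id, type, date from sorted where rn = 1;

    -- 2. remove every duplicated row from the original table
    delete from feedback
    where feedback_id in (select feedback_id from feedback_temp);

    -- 3. put the surviving copies back and clean up
    insert into feedback select * from feedback_temp;
    drop table feedback_temp;
""")

rows = conn.execute(
    "select feedback_id, count(*) from feedback group by 1 order by 1"
).fetchall()
print(rows)  # → [('MX-001', 1), ('MX-002', 1), ('MX-003', 1)]
```

MX-003 was never duplicated, so it passes through untouched, while MX-001 and MX-002 each end up with exactly one row.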
[aws][redshift][sql][postgres][duplicates]