DQ blank data cleanup/investigation
Assign

Data Source
Priority
Status
Description
Estimated Completion
Tags
Property
DQ Blank Data assumptions
- content column in discord_messages table shouldn’t be blank
- proposal_id column in stg_bankless_snapshot_1 table shouldn’t be blank
Current results:

Additional checks to see users attached sending messages without content:
select d.username, count(*)
from discord_messages m
join discord_user d on m.author_user_id = d.discord_user_id
where coalesce(content, '') = ''
group by 1
order by 2 desc;
top results - mostly bots (with some exceptions):

There are records sent by some bots with content though:
select d.username, count(*)
from discord_messages m
join discord_user d on m.author_user_id = d.discord_user_id
where coalesce(content, '') != ''
and lower(d.username) like '%bot%'
group by 1
order by 2 desc;
top results:

Suggested actions
- review discord pipeline to see if there is a way to pull content for bot messages?
- review discord pipeline to possibly back-populate blank content?
- delete records with content=NULL?