Create Data Tests for BANK Subgraph pipeline

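Created SQL scripts to test for duplicate rows. The query below returns every row whose (id, graph_id, amount_display) combination occurs more than once:
-- Find each key combination that occurs more than once, then join back to list the full rows.
SELECT
  a.*
FROM subgraph_bank_transactions a
JOIN (SELECT id, graph_id, amount_display, COUNT(*) AS duplicate_count
      FROM subgraph_bank_transactions
      GROUP BY id, graph_id, amount_display
      HAVING COUNT(*) > 1) b
  ON a.id = b.id
 AND a.graph_id = b.graph_id
 AND a.amount_display = b.amount_display
ORDER BY a.id;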
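A minimal sketch of how the duplicate check could be run as an automated test. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the real database. The column types, the sample rows, and the function names find_duplicates and demo_connection are assumptions for illustration, not part of the pipeline.

```python
import sqlite3

# Same grouping as the duplicate query above, but returning one row per
# duplicated key together with how many times it occurs.
DUPLICATE_SQL = """
SELECT id, graph_id, amount_display, COUNT(*) AS duplicate_count
FROM subgraph_bank_transactions
GROUP BY id, graph_id, amount_display
HAVING COUNT(*) > 1
ORDER BY id
"""


def find_duplicates(conn):
    """Return one tuple per duplicated (id, graph_id, amount_display) key."""
    return conn.execute(DUPLICATE_SQL).fetchall()


def demo_connection():
    """Build an in-memory table with assumed column types and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subgraph_bank_transactions "
        "(id TEXT, graph_id TEXT, amount_display REAL)"
    )
    rows = [
        ("tx1", "g1", 10.0),
        ("tx1", "g1", 10.0),  # deliberate duplicate of the row above
        ("tx2", "g1", 5.0),
    ]
    conn.executemany(
        "INSERT INTO subgraph_bank_transactions VALUES (?, ?, ?)", rows
    )
    return conn


duplicates = find_duplicates(demo_connection())
print(duplicates)
```

A test would pass when find_duplicates returns an empty list. Against the real database, only the connection would need to change.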
Notes from DaoDash meeting (March 4th, 2022):
  1. Pipeline update
    1. Getting to production is a larger task
    2. To-dos:
      1. Automation
      2. Testing: unit tests and data tests (duplicate, sanity, and integrity testing)
        1. poor man's way: our own custom scripts (SQL scripts)
        2. dbt, an ELT tool with built-in checks: https://www.getdbt.com/pricing/
        3. data integrity tools: greatexpectations.io
      3. Continuous integration, integration testing
      4. Dockerization, logging
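The custom-script option from the notes could look roughly like the sketch below: each data test is a SQL query that returns offending rows, and a test passes when its query returns nothing. The specific rules (non-negative amount_display, non-null id and graph_id), the check names, and the sample data are assumptions for illustration. The real sanity and integrity rules would come from the pipeline's requirements.

```python
import sqlite3

TABLE = "subgraph_bank_transactions"

# Each check is a query that selects the rows violating a rule.
# The rules themselves are hypothetical examples of each test type.
CHECKS = {
    # Duplicate test: the same key combination appears more than once.
    "no_duplicates": f"""
        SELECT id, graph_id, amount_display
        FROM {TABLE}
        GROUP BY id, graph_id, amount_display
        HAVING COUNT(*) > 1
    """,
    # Sanity test: amounts are present and not negative (assumed rule).
    "amount_is_sane": f"""
        SELECT * FROM {TABLE}
        WHERE amount_display IS NULL OR amount_display < 0
    """,
    # Integrity test: identifying columns are never NULL (assumed rule).
    "keys_present": f"""
        SELECT * FROM {TABLE}
        WHERE id IS NULL OR graph_id IS NULL
    """,
}


def run_checks(conn, checks=CHECKS):
    """Map each check name to the number of offending rows it found."""
    return {
        name: len(conn.execute(sql).fetchall())
        for name, sql in checks.items()
    }


# In-memory stand-in for the real table, with assumed column types.
conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE {TABLE} (id TEXT, graph_id TEXT, amount_display REAL)")
conn.executemany(
    f"INSERT INTO {TABLE} VALUES (?, ?, ?)",
    [
        ("tx1", "g1", 10.0),  # clean row
        ("tx2", "g1", -3.0),  # fails the sanity check
        (None, "g1", 1.0),    # fails the integrity check
    ],
)

results = run_checks(conn)
for name, offending in results.items():
    print(f"{name}: {'PASS' if offending == 0 else 'FAIL'} ({offending} rows)")
```

Keeping every check as a query that returns violations means a new test is just one more dictionary entry, and the same queries could later be ported to a tool such as dbt or Great Expectations.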
