How to do manual data reconciliation using python and SQL

Data reconciliation is an important part of the data migration. After the completion of data migration we have to compare data from the source data to destination data and here we are going to compare the record count of data which is one of the part of data reconciliation
Whenever we are doing data migration we have to building manifesto file which will have records of when, where and how much data got transferred. Here In this blog we are going to do manual data reconciliation using this manifesto file with these steps before starting reconciliation you have to migrate the data to BigQuery
- We will create manifesto table in BigQuery using terraform,
- Load manifesto CSV data to BigQuery using python script
- Compare Manifesto record count to BigQuery record count data
1. Create manifesto table using terraform
Here we are going to use the same dataset which we have in created previous blog Create BigQuery dataset and table using terraform we will deploying our manifesto table in the same dataset by using dataset id
main.tf
variable.tf
terraform.tfvars
2. Load manifesto data from CSV file to BigQuery using python
Here python script we can use to load data from CSV file in GCP bucket
In this python script need to add your project id and your bucket name
3. Compare BigQuery table record count to manifesto record count
We need to run this script on BigQuery
Output after running the SQL query

Conclusion
In this blog we have created BigQuery manifesto table, load CSV data to table and then SQL query to compare manifesto record to BigQuery table. Here you can get all the script from GitHub account and Excel sheet from google sheet.
Links
Reference
Load data using python