A data tracking report needed to be updated. Here is what I learned from the task.
Requirement
- Update the report cache workflow and structure
- Clean up deprecated code
  - Including some old, long MySQL queries
- Show the expired data in the report when the new data is not ready
  - Current behaviour: if the new data is not ready, the report page is blank
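The last requirement can be sketched as a small read function. This is only an illustration: the entry shape (`data`, `expiresAt`) and the function name `readReport` are assumptions, not the actual cache structure of the report.

```javascript
// Hypothetical cache entry shape: { data, expiresAt } where expiresAt is a timestamp.
// Returns what the page should render, plus flags for the caller.
function readReport(entry, now) {
  if (!entry) {
    // Nothing was ever cached: the page has no data to show yet.
    return { data: null, stale: false, needsRefresh: true };
  }
  // An expired entry is still returned, so the page is not blank
  // while the new data is being prepared.
  const stale = entry.expiresAt <= now;
  return { data: entry.data, stale: stale, needsRefresh: stale };
}
```

The caller can render `data` as usual, show a hint when `stale` is true, and start rebuilding the cache when `needsRefresh` is true.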
Context
- The remote read-only database is slow, which drags down development speed
- There are two data sources, and I need to merge data from both databases to produce the report data
  - One database stores business data and the other stores analytic data
Preparing data for the new report, and the mistake I made
Notation
Suppose we have 3 items: a, b, c
Business data set B has {a, b}
Analytic data set A has {b, c}
Requirement
We want to have all items a, b, c in the report. If the business data or the analytic data is missing for an item, just leave the field empty.
First attempt (failed)
I implemented the code like this:
    for each item ia in A:
        for each item ib in B:
            if ia == ib:
                merge them
The outer loop only iterates over data set A, which has items b and c only. Item a is missing from the result.
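The problem is easy to reproduce with the three items from the notation. The sketch below is an assumption about how the first attempt behaves, not the original code: the field names (`key`, `price`, `views`) are made up, and the inner loop is written with the native `Array.prototype.find`.

```javascript
// Hypothetical rows for the two data sets from the notation above.
const business = [{ key: "a", price: 10 }, { key: "b", price: 20 }]; // set B
const analytic = [{ key: "b", views: 5 }, { key: "c", views: 7 }];   // set A

// Merge driven by the analytic set only, like the first attempt.
function mergeDrivenByAnalytic(analyticRows, businessRows) {
  const merged = [];
  for (const ia of analyticRows) {
    // Look for the matching business row; it may not exist.
    const ib = businessRows.find((row) => row.key === ia.key);
    merged.push({ ...(ib || {}), ...ia });
  }
  return merged;
}

const firstAttempt = mergeDrivenByAnalytic(analytic, business);
// The result holds rows for b and c; a never appears because
// nothing ever iterates over the business set.
console.log(firstAttempt.map((row) => row.key));
```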
Hotfix (works but ugly)
I patched the code to add the missing item back:
    for each item ib in B:
Let's name the data returned from the first attempt R1 and the data from the hotfix R2. As item b appears in both R1 and R2, I had to remove the duplicated item with extra code:
    R3 = R1 union R2
    R3 = R3 - (R1 intersect R2)
The code became very complex, and I had to implement some set operations myself.
A better solution
    extract item key from B
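One possible reading of this approach, based on the operations the post lists later (extract keys, union the keys, find each item by key), is sketched below. It is not the original code: the function name `mergeByKey` and the field names are assumptions, and it uses the native `Map` and `Set` rather than a library.

```javascript
// Hypothetical field names (key, price, views) are for illustration only.
function mergeByKey(businessRows, analyticRows) {
  // Index each data set by key so lookups do not need a nested loop.
  const businessByKey = new Map(businessRows.map((row) => [row.key, row]));
  const analyticByKey = new Map(analyticRows.map((row) => [row.key, row]));

  // Union of the keys from both sets; a Set drops the duplicates.
  const allKeys = new Set([...businessByKey.keys(), ...analyticByKey.keys()]);

  // One output row per key; a missing side leaves its field as null.
  return [...allKeys].map((key) => ({
    key: key,
    business: businessByKey.get(key) || null,
    analytic: analyticByKey.get(key) || null,
  }));
}

const report = mergeByKey(
  [{ key: "a", price: 10 }, { key: "b", price: 20 }],
  [{ key: "b", views: 5 }, { key: "c", views: 7 }]
);
```

Because the loop runs over the union of the keys, every item shows up exactly once, and no deduplication step is needed afterwards.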
Learning
- Create mock data for slow database queries to speed up development
- Think before coding
  - Clarify what data needs to be output
  - Think twice about the implementation before coding
- Set theory can be used when solving data set problems
- Use functional utility functions
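For the first point, a thin wrapper around the query function is one way to swap in mock data during development. Everything named here (`createQueryRunner`, `fixtures`, the query names) is hypothetical, and the query function is assumed to be synchronous for brevity; a real database driver is likely to be asynchronous.

```javascript
// realQuery: the function that talks to the slow database (hypothetical).
// fixtures: canned rows keyed by query name, captured once and reused.
function createQueryRunner(realQuery, fixtures, useMock) {
  return function runQuery(name) {
    // Serve canned rows when mocking is on and a fixture exists.
    if (useMock && Object.prototype.hasOwnProperty.call(fixtures, name)) {
      return fixtures[name];
    }
    // Otherwise fall through to the real, slow query.
    return realQuery(name);
  };
}

// Usage sketch with a stand-in for the slow database.
const slowQuery = (name) => [{ key: "from-db", query: name }];
const runQuery = createQueryRunner(
  slowQuery,
  { businessItems: [{ key: "a" }, { key: "b" }] },
  true
);
const rows = runQuery("businessItems"); // served from the fixture, no database call
```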
How do underscore-like functional utility functions help in this case?
Save development time
We can identify some common operations in the “better solution”:
- Extract the item keys from a set: _.pluck(): https://underscorejs.org/#pluck
- Find an item in a data set by key: _.find(): https://underscorejs.org/#find
- Union two sets of keys: _.union(): https://underscorejs.org/#union
These functions can be found in underscore, and they are well tested. We can save development and debugging time by using a library instead of writing them ourselves.
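To show what each building block does without installing anything, the sketch below defines small local stand-ins in plain JavaScript. They are not underscore's implementations, only simplified versions of the same three operations; `findByKey` in particular is a made-up helper that fixes the predicate, whereas `_.find` takes the predicate as an argument.

```javascript
// Local stand-ins, written for illustration only.
const pluck = (list, prop) => list.map((item) => item[prop]);
const union = (...arrays) => [...new Set(arrays.flat())];
const findByKey = (list, key) => list.find((item) => item.key === key);

// Hypothetical rows for the two data sets.
const business = [{ key: "a", price: 10 }, { key: "b", price: 20 }];
const analytic = [{ key: "b", views: 5 }, { key: "c", views: 7 }];

const businessKeys = pluck(business, "key");    // keys of set B
const analyticKeys = pluck(analytic, "key");    // keys of set A
const allKeys = union(businessKeys, analyticKeys); // each key once, first-seen order
const itemB = findByKey(business, "b");         // the business row for b
const missing = findByKey(analytic, "a");       // undefined: a has no analytic row
```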
Easy to change
Underscore supports a chain pattern. We can change the logic flow by inserting, deleting and reordering lines of code. We don’t need to deal with nested scopes and nested function calls when updating the logic.
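The same idea can be illustrated with native array methods, used here as a stand-in for underscore's chain style. The data and the three steps are made up; the point is only that each step sits on its own line.

```javascript
// Hypothetical merged rows; views is null when analytic data is missing.
const rows = [
  { key: "a", views: null },
  { key: "b", views: 5 },
  { key: "c", views: 7 },
];

const topKeys = rows
  .filter((row) => row.views !== null) // step 1: drop rows without analytic data
  .sort((x, y) => y.views - x.views)   // step 2: most viewed first
  .map((row) => row.key);              // step 3: keep only the key
```

Removing step 1 or swapping in a different sort is a one-line change, and no other line has to be re-indented or re-nested.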
Easy to read
The functions defined in underscore are common operations with well-defined, widely used names. This makes the code easier to understand. Reading them is also a good way to learn how to name functions.
Change log
Aug 27, 2018
Restructured the post and removed some irrelevant ideas and sentences.