Functional utility library is good

A data tracking report needed to be updated. Here is what I learned from the task.

Requirement

  1. Update report cache workflow and structure
  2. Clean up deprecated code
    • Some old and long MySQL queries
  3. Show the expired data in the report when the new data is not ready
    • Current behaviour: If new data is not ready, the report page will be blank

Context

  • The remote read-only database is slow, which drags down development speed
  • There are two data sources, so I need to merge data from two databases to produce the report data
    • One database stores business data and the other stores analytic data

Preparing data for the new report and the mistake I made

Notation

Suppose we have 3 items: a, b, c
Business data set B has {a, b}
Analytic data set A has {b, c}

Requirement

We want all items a, b, c in the report. If the business data or analytic data is missing for an item, just leave that field empty.

First attempt (fail)

I implemented the code like this:

for each item ia in A:
    for each item ib in B:
        if ia == ib:
            merge them

The outer loop only iterates over data set A, which contains only items b and c, so item a never reaches the report.
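To make the failure concrete, here is a minimal JavaScript sketch of that first attempt. The sample data and the field names `id`, `name` and `views` are assumptions for illustration, not the real report schema.

```javascript
// Hypothetical sample data: `id`, `name` and `views` are illustrative field names.
const business = [{ id: "a", name: "Item A" }, { id: "b", name: "Item B" }];
const analytic = [{ id: "b", views: 10 }, { id: "c", views: 5 }];

// First attempt: the outer loop walks the analytic set only.
function mergeFromAnalytic(analyticItems, businessItems) {
  const report = [];
  for (const ia of analyticItems) {
    const row = { id: ia.id, name: null, views: ia.views };
    for (const ib of businessItems) {
      if (ia.id === ib.id) {
        row.name = ib.name; // merge the matching business fields
      }
    }
    report.push(row);
  }
  return report;
}

const r1 = mergeFromAnalytic(analytic, business);
console.log(r1.map((row) => row.id)); // [ 'b', 'c' ] (item a never appears)
```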

Hotfix (works but ugly)

I patched the code to add the missing item back:

for each item ib in B:
    for each item ia in A:
        if ib == ia:
            merge them

Let’s name the data returned from the first attempt R1 and the hotfix data R2.
Item b appears in both R1 and R2,
so I have to remove the duplicated item with extra code:

R3 = R1 union R2               // plain concatenation, so b appears twice
R3 = R3 - (R1 intersect R2)    // remove one copy of each duplicated item

The code became very complex, and I had to implement some set operations myself.
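A rough JavaScript sketch of how the hotfix ends up looking. The data and field names are assumed for illustration, and the de-duplication shown here (keeping the first row seen per `id`) is one possible way to do it, not necessarily what the original code did.

```javascript
// Hypothetical sample data: `id`, `name` and `views` are illustrative field names.
const business = [{ id: "a", name: "Item A" }, { id: "b", name: "Item B" }];
const analytic = [{ id: "b", views: 10 }, { id: "c", views: 5 }];

// Merge with `outer` driving the loop; `inner` only contributes matching fields.
function mergeDrivenBy(outer, inner) {
  return outer.map((o) => {
    const match = inner.find((i) => i.id === o.id);
    // Spreading `undefined` is a no-op, so unmatched rows keep the null defaults.
    return { name: null, views: null, ...match, ...o };
  });
}

const r1 = mergeDrivenBy(analytic, business); // rows for b, c
const r2 = mergeDrivenBy(business, analytic); // rows for a, b

// Concatenating keeps b twice, so the duplicate has to be removed by hand.
const seen = new Set();
const r3 = [...r1, ...r2].filter((row) => {
  if (seen.has(row.id)) return false;
  seen.add(row.id);
  return true;
});

console.log(r3.map((row) => row.id)); // [ 'b', 'c', 'a' ]
```

Two passes plus a clean-up step for a three-item example is a hint that the loop is driven by the wrong thing.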

A better solution

extract item keys from B
extract item keys from A
union the two sets of keys
for each key:
    // find the data and populate the item
    for each item ib in B:
        if key == ib:
            merge
    for each item ia in A:
        if key == ia:
            merge
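The steps above could look roughly like this in JavaScript. This is a sketch under assumptions: the field names `id`, `name` and `views` are made up, and it swaps the two inner loops for `Map` lookups, which gives the same result with less code.

```javascript
// Hypothetical sample data: `id`, `name` and `views` are illustrative field names.
const business = [{ id: "a", name: "Item A" }, { id: "b", name: "Item B" }];
const analytic = [{ id: "b", views: 10 }, { id: "c", views: 5 }];

function buildReport(businessItems, analyticItems) {
  // Index each data set by key so each lookup replaces an inner loop.
  const businessById = new Map(businessItems.map((item) => [item.id, item]));
  const analyticById = new Map(analyticItems.map((item) => [item.id, item]));

  // Union of keys: a Set drops duplicates and keeps first-seen order.
  const keys = new Set([...businessById.keys(), ...analyticById.keys()]);

  // Drive the loop by key, leaving a field empty (null) when a source has no data.
  return [...keys].map((id) => ({
    id,
    name: businessById.get(id)?.name ?? null,
    views: analyticById.get(id)?.views ?? null,
  }));
}

const report = buildReport(business, analytic);
console.log(report.map((row) => row.id)); // [ 'a', 'b', 'c' ]
```

Because the loop is driven by the union of keys, every item shows up exactly once and no clean-up pass is needed.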

Learning

  1. Create mock data for slow database queries to speed up development
  2. Think before coding
    • Clarify what data needs to be output
    • Think twice about the implementation before coding
  3. Set theory can be used when solving data set problems
  4. Use functional utility functions
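For the first point, one way to mock a slow query is to put the data source behind a small factory function. Everything here is hypothetical: `createAnalyticSource`, the `useMock` flag, `runQuery` and the fixture rows are illustrative names, not part of the real report code.

```javascript
// Hypothetical fixture that mirrors the shape of the real query result.
const ANALYTIC_FIXTURE = [{ id: "b", views: 10 }, { id: "c", views: 5 }];

// Returns a loader function. `runQuery` stands in for the real (slow) database call.
function createAnalyticSource({ useMock, runQuery }) {
  if (useMock) {
    // Resolve immediately with copies so callers cannot mutate the fixture.
    return async () => ANALYTIC_FIXTURE.map((row) => ({ ...row }));
  }
  return async () => runQuery();
}

const loadAnalytic = createAnalyticSource({
  useMock: true,
  runQuery: async () => {
    throw new Error("remote database is not available in this sketch");
  },
});

loadAnalytic().then((rows) => console.log(rows.length)); // 2
```

The rest of the code only sees `loadAnalytic()`, so switching back to the real database is a one-line change.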

How do underscore-like functional utility functions help in this case?

Save development time

We can identify some common operations in the “better solution”:

  1. Extract item keys from a set
  2. Find an item in a data set by key
  3. Union two sets of keys
    • _.union(): https://underscorejs.org/#union

These functions can be found in underscore, and they are well tested. Using a library saves development and debugging time.
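To show how small these three operations are, here is a sketch of them in plain JavaScript. The helper names `pluckIds`, `findById` and `unionKeys` and the sample data are made up for illustration. Underscore's `_.pluck`, `_.findWhere` and `_.union` cover the same ground.

```javascript
// 1. Extract item keys from a set (underscore: _.pluck(list, "id")).
const pluckIds = (items) => items.map((item) => item.id);

// 2. Find an item in a data set by key (underscore: _.findWhere(list, { id: id })).
const findById = (items, id) => items.find((item) => item.id === id);

// 3. Union two sets of keys (underscore: _.union(a, b)).
const unionKeys = (a, b) => [...new Set([...a, ...b])];

// Hypothetical sample data: `id`, `name` and `views` are illustrative field names.
const business = [{ id: "a", name: "Item A" }, { id: "b", name: "Item B" }];
const analytic = [{ id: "b", views: 10 }, { id: "c", views: 5 }];

const keys = unionKeys(pluckIds(business), pluckIds(analytic));
console.log(keys); // [ 'a', 'b', 'c' ]
console.log(findById(analytic, "a")); // undefined
```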

Easy to change

Underscore supports a chain pattern. We can change the logic flow by inserting, deleting, and reordering lines of code, without having to untangle nested scopes and function calls when updating the logic.
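A small sketch of the idea, using native array methods, which chain the same way. With underscore the pipeline would be wrapped in `_.chain(rows)` and ended with `.value()`. The report rows are assumed sample data.

```javascript
// Hypothetical report rows: `id`, `name` and `views` are illustrative field names.
const rows = [
  { id: "a", name: "Item A", views: null },
  { id: "b", name: "Item B", views: 10 },
  { id: "c", name: null, views: 5 },
];

// Each step is one line; steps can be added, removed or reordered independently.
const labels = rows
  .filter((row) => row.views !== null)      // drop rows without analytic data
  .sort((x, y) => y.views - x.views)        // highest views first
  .map((row) => `${row.id}: ${row.views}`); // format for display

console.log(labels); // [ 'b: 10', 'c: 5' ]
```

Removing the `.sort(...)` line, for example, changes the behaviour without touching any other line.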

Easy to read

The functions defined in underscore are common and have well-defined, widely used names. This makes the code easier to understand. Reading them is also a good way to learn how to name functions.

Change log

Aug 27, 2018

Restructured the post and removed some irrelevant ideas and sentences.