ARC-VA-Modeling — PIRATES

2.0 Data Modeling

Data Modeling.jpg

2.1 Key Data Objects

Data Modeling 2.jpg

Once the data is collected, it has to be modeled for efficient storage and retrieval under the current problem statement. For this use case, it is probably most beneficial to make “Association” the primary data object. Each association record links a volunteer to a previous disaster they were assigned to. This data object needs two supporting data objects, “Volunteer” and “Disaster”. Each volunteer record, for instance, stores values for the features picked to represent volunteers.
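The relationship between the three data objects can be sketched as follows. This is only an illustration, not a proposed schema: the feature fields (`years_of_service`, `certifications`, `severity`, `people_affected`) and the helper `disasters_for` are hypothetical placeholders, since the actual features have not been picked yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volunteer:
    volunteer_id: str
    years_of_service: float  # hypothetical feature
    certifications: int      # hypothetical feature


@dataclass(frozen=True)
class Disaster:
    disaster_id: str
    severity: int            # hypothetical feature
    people_affected: int     # hypothetical feature


@dataclass(frozen=True)
class Association:
    """Primary data object: one volunteer assigned to one past disaster."""
    volunteer_id: str
    disaster_id: str


def disasters_for(volunteer_id, associations, disasters_by_id):
    """Return the Disaster records a given volunteer has been assigned to."""
    return [
        disasters_by_id[a.disaster_id]
        for a in associations
        if a.volunteer_id == volunteer_id
    ]
```

Keeping “Association” as a thin record of two identifiers means the volunteer and disaster features are stored once and looked up when needed.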

By this stage, the anonymization scripts will have run, and sensitive volunteer information such as phone numbers or email IDs will have been replaced by unintelligible one-way hashes. Each of the three data objects can then be represented as a vector, and these three kinds of vectors are the ones most useful to our use case. The number of dimensions each vector has depends on the number of features picked to represent it. From this point on, we can treat the problem purely as vectors living in some higher-dimensional space and work on understanding how these vectors behave with respect to each other.
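A minimal sketch of the anonymization step is shown below. It assumes volunteer records arrive as dictionaries with `phone` and `email` keys, which is a guess about the input format. It also uses a keyed hash (HMAC-SHA256) rather than a bare hash, an assumption made here because phone numbers have a small value space and unkeyed hashes of them can be guessed by brute force; `secret_key` is a hypothetical parameter.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the real volunteer schema.
SENSITIVE_FIELDS = ("phone", "email")


def anonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with sensitive fields replaced by one-way hashes."""
    out = dict(record)  # leave the caller's record untouched
    for field in SENSITIVE_FIELDS:
        if out.get(field) is not None:
            # Normalize so that trivially different spellings hash identically.
            normalized = str(out[field]).strip().lower().encode("utf-8")
            out[field] = hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()
    return out
```

Because the same input and key always give the same hash, the hashed value can still be used to recognize the same volunteer across records without revealing the original contact details.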

2.2 Gold Standard

Supervised Learning

Supervised Learning.jpg

The “Association” vectors are the bare minimum for a use case of this sort to learn from. If we can additionally add another dimension to them and manually mark each association as “Optimal” or “Sub-optimal”, we can use mature, well-optimized supervised learning algorithms downstream. Supervised learning will generally lead to better forecasts, and it also lets us evaluate what our model learns against the gold-standard judgments of a human expert.
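To make the idea concrete, the sketch below builds an association vector by concatenating volunteer and disaster features, trains on expert labels, and scores predictions against the gold standard. The nearest-centroid classifier is only a small stand-in chosen for brevity, not the algorithm this project would necessarily use, and all function names are hypothetical.

```python
from math import dist


def association_vector(volunteer_features, disaster_features):
    """Concatenate the two feature lists into one association vector."""
    return list(volunteer_features) + list(disaster_features)


def fit_centroids(vectors, labels):
    """Compute the mean vector of each label ("Optimal" / "Sub-optimal")."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


def accuracy(centroids, vectors, gold_labels):
    """Fraction of associations where the model agrees with the expert label."""
    hits = sum(predict(centroids, v) == g for v, g in zip(vectors, gold_labels))
    return hits / len(gold_labels)
```

In practice, `accuracy` would be computed on labeled associations held out from training, so that the score reflects agreement with the expert on cases the model has not seen.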