Team Mate Building

Quick View

Goal: Develop a way to recommend teammates for future events, based on applicants’ questionnaire answers (e.g. school, location, major, etc.).

Metrics: None; this was an unsupervised learning project.

Results:

  • K-modes is a clustering method designed for categorical data.
  • Some columns contained too many distinct values, so cleaning techniques were applied to simplify them into categorical features.
  • The number of clusters was chosen by comparing the cost across different cluster counts; the optimal point is where the rate of decrease in cost starts to slow.
  • The results show that 5 clusters is optimal.
  • To maintain some diversity, I gave people in small groups (e.g. Art or Business majors) a small probability of joining a major group (e.g. CS majors).
  • Cosine similarity is, of course, another way to approach this.
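
The cost comparison described above can be illustrated with a minimal pure-Python sketch of k-modes. This is not the project's code: the applicant records, the function names (hamming, column_modes, kmodes) and the range of cluster counts are all made up for illustration, and a real project would more likely rely on an existing library.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Most frequent value in each column of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def kmodes(records, k, n_iter=20, seed=0):
    """Tiny k-modes sketch: returns (labels, modes, cost)."""
    rng = random.Random(seed)
    # Start from k distinct records chosen at random (k must not exceed
    # the number of distinct records).
    modes = rng.sample(sorted(set(records)), k)
    labels = []
    for _ in range(n_iter):
        # Assign every record to its nearest mode by Hamming distance.
        new_labels = [min(range(k), key=lambda j: hamming(r, modes[j]))
                      for r in records]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each mode; keep the old mode if a cluster is empty.
        for j in range(k):
            members = [r for r, l in zip(records, labels) if l == j]
            if members:
                modes[j] = column_modes(members)
    # Cost: total mismatches between each record and its cluster's mode.
    cost = sum(hamming(r, modes[l]) for r, l in zip(records, labels))
    return labels, modes, cost

# Hypothetical applicants as (school, location, major); not real data.
applicants = [
    ("State U", "West", "CS"),
    ("State U", "West", "CS"),
    ("State U", "West", "Math"),
    ("City College", "East", "Art"),
    ("City College", "East", "Art"),
    ("City College", "East", "Business"),
    ("Tech Institute", "South", "CS"),
    ("Tech Institute", "South", "CS"),
]

# Compare the cost for several cluster counts and look for the point
# where the decrease starts to slow.
costs = {k: kmodes(applicants, k)[2] for k in range(1, 5)}
for k, cost in costs.items():
    print(k, cost)
```

The result can depend on the random starting modes, so in practice it is common to run each cluster count with several seeds and keep the lowest cost before comparing.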

Github