Mechanisms of Action (MoA) Prediction
Quick View
Goal:
- Predict MoA labels (binary, 0 or 1) from gene expression and cell viability data by building multiple binary classifiers, one per MoA label
Metrics:
- Log loss (lower is better)
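As a rough illustration of the metric, the sketch below computes a log loss averaged over every sample and every label column. The averaging scheme, the function name `mean_columnwise_log_loss`, and the clipping constant `eps` are assumptions for illustration; the source only states that log loss is the metric.

```python
import numpy as np


def mean_columnwise_log_loss(y_true, y_pred, eps=1e-15):
    """Binary log loss averaged over all samples and all label columns.

    Assumption: every (sample, label) entry is weighted equally.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) can never occur.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    per_entry = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return per_entry.mean()


# Tiny made-up example: 2 samples, 2 label columns.
y_true = np.array([[1, 0], [0, 0]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.1]])
score = mean_columnwise_log_loss(y_true, y_pred)
```

Because the labels are so sparse, a model that predicts a small constant probability everywhere already gets a low score, so improvements tend to show up in the third or fourth decimal place.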
Procedure:
- MoA labels are extremely imbalanced, with an average of 89 positive labels per column out of 21K entries, so a multilabel-stratified k-fold splitter is used:
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
- Label smoothing is applied to help improve accuracy in this multi-output setting. A great explanation is here
- Use a Pipeline to apply the column transforms automatically
  - Numerical data
    - Quantile transform
    - PCA
  - Categorical data
    - One-hot encoding
- Build a DNN model with 3 neural-network layers and appropriate regularization.
- Use Keras Tuner for hyperparameter tuning; the code is in the notebook
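The label-smoothing step in the procedure can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's code: the helper name `smooth_labels` and the value of `alpha` are assumptions. It uses the same formula that the `label_smoothing` argument of Keras's `BinaryCrossentropy` loss applies, pulling hard 0/1 targets slightly toward 0.5.

```python
import numpy as np


def smooth_labels(y, alpha=0.001):
    """Soften hard 0/1 targets: 0 -> alpha / 2, 1 -> 1 - alpha / 2.

    `alpha` is a hypothetical smoothing strength, not a value from the source.
    """
    y = np.asarray(y, dtype=float)
    return y * (1.0 - alpha) + 0.5 * alpha


# Made-up targets: 2 samples, 2 label columns.
hard = np.array([[0, 1], [0, 0]])
soft = smooth_labels(hard, alpha=0.1)
```

With smoothed targets the network is never pushed toward exactly 0 or 1, which limits the very large log-loss penalties that confident wrong predictions would otherwise incur.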
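The preprocessing step in the procedure (quantile transform and PCA for numerical columns, one-hot encoding for categorical columns) could look roughly like the scikit-learn sketch below. The data is synthetic, and the column names, `n_quantiles`, and `n_components` are hypothetical placeholders rather than the settings used in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, QuantileTransformer

# Synthetic stand-in data; all column names are hypothetical.
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame(
    rng.normal(size=(n_rows, 6)),
    columns=[f"num_{i}" for i in range(6)],
)
df["cat_a"] = rng.choice(["x", "y"], size=n_rows)
df["cat_b"] = rng.choice(["low", "mid", "high"], size=n_rows)

numeric_cols = [c for c in df.columns if c.startswith("num_")]
categorical_cols = ["cat_a", "cat_b"]

# Numerical branch: map each feature to a normal shape, then reduce with PCA.
numeric_steps = Pipeline([
    ("quantile", QuantileTransformer(
        n_quantiles=100, output_distribution="normal", random_state=0)),
    ("pca", PCA(n_components=3, random_state=0)),
])

# Route each column group through its own transform and concatenate the results.
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array
)

features = preprocess.fit_transform(df)
```

Fitting the transformer on the training fold only and calling `preprocess.transform` on the validation fold keeps the quantile and PCA statistics from leaking across folds.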