Text this: Deep-learning-based video spatio-temporal modeling for emotion recognition