SenticNet (Temasek Laboratories)
My first foray into NLP research
The SenticNet team is a multidisciplinary research group based at the School of Computer Science and Engineering at NTU, led by Dr Erik Cambria. Sentic computing focuses on polarity detection and emotion recognition in natural language by leveraging semantics and linguistics in combination with statistical methods like machine learning.
In the summer of 2016, I worked closely with Soujanya Poria on the deep learning aspect of sentic computing, trying out new ideas for combining word embeddings with deep neural networks for text classification.
First steps in research
I shall be forever grateful for the opportunity to learn from the incredible people at the SenticNet team and to be involved with research so early in my university life. It all started in my second semester at NTU when I audited the Machine Learning course (CZ4041) out of pure fascination and eventually emailed Erik about my interests. I got no academic credit for sitting in that class but I’m glad I pursued it nonetheless!
After getting a general introduction to machine learning, I diligently studied Richard Socher’s CS224d - Deep Learning for Natural Language Processing course and learnt TensorFlow along the way. Although following the course was difficult for me, I was too fascinated by deep learning to let my lack of background knowledge stop me. Eventually, I understood the mathematics behind the intuitions, became comfortable with reading papers, and was in a position to start working on my own ideas and to assist the team with ongoing research projects.
Work on word embeddings
I released an open-source project on techniques to augment pre-trained word embeddings (like word2vec) with the surrounding sentence context using bidirectional Recurrent Neural Networks. In the repository, I analyzed the effect of applying such augmentations in the context of text analysis.
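To give a rough feel for the idea, here is a minimal sketch in plain Python. It is not the code from the repository: the function names (rnn_pass, augment_with_context), the toy 4-dimensional vectors, and the hidden size are all made up for illustration, and the RNN weights are random rather than trained. It assumes a simple tanh RNN and concatenates each word's original embedding with the forward and backward hidden states at that position.

```python
import math
import random


def rnn_pass(inputs, w_in, w_rec):
    """Run a simple tanh RNN over `inputs`; return one hidden state per step."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size  # initial hidden state is all zeros
    states = []
    for x in inputs:
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
        states.append(h)
    return states


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these during training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def augment_with_context(embeddings, hidden_size=3, seed=0):
    """Concatenate each embedding with forward and backward RNN states."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    fwd_in = random_matrix(hidden_size, dim, rng)
    fwd_rec = random_matrix(hidden_size, hidden_size, rng)
    bwd_in = random_matrix(hidden_size, dim, rng)
    bwd_rec = random_matrix(hidden_size, hidden_size, rng)

    # Forward pass reads left to right; backward pass reads right to left,
    # and its states are reversed again so they line up with word positions.
    forward = rnn_pass(embeddings, fwd_in, fwd_rec)
    backward = rnn_pass(embeddings[::-1], bwd_in, bwd_rec)[::-1]

    # List concatenation: [original embedding | forward state | backward state]
    return [e + f + b for e, f, b in zip(embeddings, forward, backward)]


# Hypothetical 4-dimensional "pre-trained" vectors, invented for this example.
toy_vectors = {
    "river": [0.9, 0.1, 0.0, 0.2],
    "money": [0.0, 0.8, 0.3, 0.1],
    "bank": [0.4, 0.4, 0.5, 0.0],
}

sentence_a = [toy_vectors[w] for w in ["river", "bank"]]
sentence_b = [toy_vectors[w] for w in ["money", "bank"]]

bank_a = augment_with_context(sentence_a)[-1]
bank_b = augment_with_context(sentence_b)[-1]
```

Here bank_a and bank_b start with the same four pre-trained values, but the appended forward states differ because the preceding word differs, so the same word ends up with a different vector in each sentence. In practice the RNN weights would be learned as part of the downstream text classification model.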