This is epic cool stuff: we wanted to predict election results before they were declared, and we threw some serious NLP at voter comments to do it. We were right, too! The work was done with an RBI chair professor.

And of course, with StanfordNLP you can do it for Hindi too. Why Hindi? Because the analysis was done on voter comments on policies, and those come in Hindi as well as English. After tokenization we first classified the voter comments by issue, then ran sentiment analysis, and then applied deep NLP over gigabytes of review data.

Finally, we weighted the results by seats. Each state sends a fixed number of members to the Lok Sabha (the Indian legislative body that forms the government at the center), so we weighted the states in the ratio of the seats they contribute. This gives people's opinions on an issue vis-a-vis their voting tendency.

The tendency is measured in two steps. First, each voter is assigned a score from 0 to 1 on a left vs right agenda. A score of, say, 0.1 means the voter has a left-leaning stance and would likely vote for a leftist party. Second, the voter is scored on their tendency towards the current government's rule, which is decided by whether they actively express positive or negative comments. Then, based on small-scale experiments on regional ward-level elections, a prediction is made on the test data set, that is, the one for the general elections.

The sampling was random, since comments were collected from newspapers and from feedback on news channels. Source bias was countered by balancing an equal number of left-leaning and right-leaning sources, and an equal number of incumbent-favouring and opposition-favouring sources. The results were of course still prone to error, because the random sampling within the channels could not be controlled by the experimenters.

Then the funny thing happened: can I use this on transcripts of news readers too? That would highlight a news channel's power to sway its audience. Can it be used for that? Haha, let us see. Again, since this is a proprietary project, only the method can be discussed in detail, and some snippets of code transferred to a local machine are all that is left.
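To make the left vs right score and the seat weighting described above concrete, here is a minimal sketch. It is not the project's code (that is proprietary). The 0.5 cut-off, the state names, the seat counts and the sentiment scores are all made-up assumptions for illustration.

```python
def lean_label(score, threshold=0.5):
    """Map a 0-to-1 left/right score to a label.

    The 0.5 threshold is an assumption; the write-up only says
    that a score such as 0.1 means left-leaning.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    return "left" if score < threshold else "right"


def seat_weighted_score(state_scores, state_seats):
    """Average per-state scores, weighted by each state's seat count."""
    total_seats = sum(state_seats[state] for state in state_scores)
    if total_seats == 0:
        raise ValueError("no seats for the given states")
    weighted_sum = sum(
        state_scores[state] * state_seats[state] for state in state_scores
    )
    return weighted_sum / total_seats


# Hypothetical states, seat counts and scores (not real data).
state_seats = {"state_a": 80, "state_b": 40, "state_c": 20}
state_scores = {"state_a": 0.25, "state_b": 0.5, "state_c": 0.75}

national_score = seat_weighted_score(state_scores, state_seats)
print(lean_label(0.1), round(national_score, 3))
```

A state with more seats pulls the national figure towards its own score, which is the whole point of weighting by seats rather than averaging the states equally.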

Sample Hindi news that was input
Topic detection from voter review. Policy topics and number of test voter reviews
Policy classification accuracy with and without random topics
Accuracy across topics
Schema for training and detecting polarity of voter reviews for both Hindi and English comments
SVM architecture
Naive Bayes classifier (simplified diagram of the steps to be taken for supervised NLP) for health policies
System architecture of the heterogeneous ensemble method finally used to generate positive, neutral or negative sentiment
Debiasing training examples and detecting polarity on a scale of 0 to 1
Hindi sentiment analysis on voter reviews of policies
Voter sentiment on major national parties using the dictionary-based approach, for education policies
Voter sentiment on major national parties using the Naive Bayes classifier
Voter sentiment on major national parties using SVMs, for economic policies. Note that voters are most positive about the BJP, the party that won, and largely negative about the largest party in opposition

We introduced a recurrent convolutional neural network (RCNN) for text classification to get the policy topic from each review. In this model, a recurrent structure is applied when learning word representations to capture contextual information, which considerably reduces noise. The max-pooling layer then judges which words play key roles in the classification, so the key components of the text are captured automatically. Unlike the traditional CNN method, it uses a bi-directional recurrent structure instead of a fixed window size for filtering. Decision Tree, Naïve Bayes, KNN, SVM with a linear kernel, and boosting were also investigated. The SVM-based classifier with the normalized term frequency feature representation performed best on bill titles, with 75% accuracy.
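For readers unfamiliar with the normalized term frequency representation mentioned above, here is a minimal sketch. It assumes "normalized" means dividing each term count by the document length; the original pipeline may have used a different normalization (for example L2). The example title is made up.

```python
from collections import Counter


def normalized_tf(tokens):
    """Return term -> count / document length for one tokenized text.

    Length normalization is an assumption about what "normalized
    term frequency" meant in the project.
    """
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {term: count / total for term, count in counts.items()}


# Hypothetical bill title, whitespace-tokenized for simplicity.
title_tokens = "health bill for rural health centres".split()
features = normalized_tf(title_tokens)
print(features["health"])
```

Vectors like this, one per title, are what a linear-kernel SVM would be trained on. Normalizing by length keeps long and short titles comparable.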

SVM, CNN and BERT were the best performers across the board, and an ensemble was used to predict voter sentiment on election-related policies for the major parties.
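The write-up does not say how the ensemble combined its members, so the following is only a sketch of one common option: plain majority voting over the three models' labels. The tie-break to "neutral" and the example predictions are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model sentiment labels by majority vote.

    Ties fall back to "neutral"; that tie-break is an assumption,
    not something stated in the write-up.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    top_count = max(counts.values())
    winners = [label for label, n in counts.items() if n == top_count]
    return winners[0] if len(winners) == 1 else "neutral"


# Hypothetical outputs of the SVM, CNN and BERT models for one comment.
model_outputs = {"svm": "positive", "cnn": "positive", "bert": "negative"}
final_label = majority_vote(list(model_outputs.values()))
print(final_label)
```

Weighted voting or stacking a meta-classifier on the model scores would be the usual next step if a simple vote turns out too coarse.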