Twitter Sentiment Analysis using Data Integration Platform
Introduction
This article explains how to perform sentiment analysis on tweets from Twitter and move the processed data into SQL Server.
Steps Involved
Using the Syncfusion Data Integration Platform:
- Bring in real-time tweets with the hashtags #syncfusion and #dashboardcloud.
- Clean and process the JSON data into the required flat schema.
- Perform sentiment analysis on the tweeted text using Stanford CoreNLP.
- Move the final processed data along with the sentiment score into a SQL Database.
Using the Syncfusion Dashboard
- Create a dashboard to showcase the real-time Twitter sentiment analysis.
For steps 1 through 4, we will define a data flow in the Data Integration Platform as shown in the following image.
Step 1: Use the GetTwitter component to bring in real-time tweets with the hashtags #syncfusion and #dashboardcloud.
Ensure that you obtain the consumer key, consumer secret, access token, and access token secret from the Twitter developer site by referring to this guide. Before creating the data flow, review the following configuration, where you provide your hashtags under the Terms to Filter On property.
Step 2: Prepare the data by cleaning and processing the JSON into the required fields (attributes) using the Evaluate JSON Path and Update Attribute processors.
Evaluate JSON Path is used to extract fields such as user details, tweeted text, created date, retweet details, language, friends count, followers count, favorites count, and location.
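As an illustration of what this extraction step produces, the following standalone Python sketch pulls the same attributes from a trimmed, hypothetical tweet payload. The sample values are invented for demonstration; a real Twitter JSON document carries many more fields.

```python
import json

# A hypothetical, trimmed tweet payload (real Twitter JSON is much larger).
sample = json.loads("""
{
  "created_at": "Mon Oct 15 10:20:30 +0000 2018",
  "text": "Loving the new release! #syncfusion",
  "lang": "en",
  "user": {
    "id": 12345,
    "name": "Jane Doe",
    "screen_name": "janedoe",
    "friends_count": 150,
    "followers_count": 320,
    "location": "Seattle"
  },
  "retweet_count": 2,
  "favorite_count": 5
}
""")

# Pull out the same attributes the Evaluate JSON Path processor extracts.
row = {
    "userid": sample["user"]["id"],
    "username": sample["user"]["name"],
    "screenname": sample["user"]["screen_name"],
    "tweetedtext": sample["text"],
    "created_at": sample["created_at"],
    "language": sample["lang"],
    "friends_count": sample["user"]["friends_count"],
    "followers_count": sample["user"]["followers_count"],
    "favourite_count": sample["favorite_count"],
    "retweet_count": sample["retweet_count"],
    "location": sample["user"]["location"],
}
print(row["screenname"])  # janedoe
```

In the data flow itself, each of these values becomes a flow-file attribute rather than a dictionary entry, but the JSON paths involved are the same.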
These attributes and property names will be used as column names in the SQL table.
In the Update Attribute processor
- We use a conditional expression to fetch the exact tweeted text based on the following conditions, and then add the hashtags as an additional attribute. To learn more about tweet payloads, refer to the Twitter documentation.
- We also cleanse the data to extract created dates in a format suitable for building the dashboard.
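The conditional selection of the tweet text can be sketched in plain Python. This assumes the standard streaming-API payload shape, where a retweet nests the original tweet under retweeted_status and a tweet longer than 140 characters carries its full text under extended_tweet.full_text; the sample payload below is invented for illustration.

```python
def tweet_text(t):
    # For retweets, read the text from the original tweet.
    if "retweeted_status" in t:
        t = t["retweeted_status"]
    # Long tweets put the untruncated text under extended_tweet.full_text.
    if "extended_tweet" in t:
        return t["extended_tweet"]["full_text"]
    return t.get("text", "")

def hashtags(t):
    # Collect hashtags from the entities section into one attribute value.
    tags = t.get("entities", {}).get("hashtags", [])
    return ",".join("#" + h["text"] for h in tags)

tweet = {
    "text": "Truncated text...",
    "extended_tweet": {"full_text": "The complete tweet text #syncfusion"},
    "entities": {"hashtags": [{"text": "syncfusion"}]},
}
print(tweet_text(tweet))  # The complete tweet text #syncfusion
print(hashtags(tweet))    # #syncfusion
```

In the Update Attribute processor, the same decisions are written as NiFi expression-language conditions over the attributes extracted in the previous step.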
Step 3: Perform sentiment analysis on the processed tweet using Stanford CoreNLP.
We use the Execute Stream Command processor to run a Python script against the Stanford CoreNLP service to process the tweeted text and evaluate its sentiment (mood and score).
To configure the environment to run Python sentiment analysis script within DIP, follow these steps:
- Install the Stanford NLP package from this location.
- Start the server using the following command.
C:\<installed location>\stanford-corenlp-full-2018-10-05> java -mx5g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -timeout 10000
Here, java refers to the Java executable on the JAVA_PATH. If JAVA_PATH is already set, run the command as shown; otherwise, replace java in the command with the full path to the Java executable.
- Install the pycorenlp Python package using the following command.
pip install pycorenlp
- We use the following Python script to perform sentiment analysis. It is a basic script that uses CoreNLP's default settings to gauge sentiment.
from pycorenlp import StanfordCoreNLP
import sys
import re

nlp = StanfordCoreNLP('http://localhost:9000')

# Handle invalid characters in the input data.
text = re.sub(r'[^a-zA-Z0-9 \n.]', '', sys.argv[1])

res = nlp.annotate(text, properties={
    'annotators': 'sentiment',
    'outputFormat': 'json',
    'timeout': 1000,
})

for s in res["sentences"]:
    print("'%s', %s, %s" % (
        " ".join(t["word"] for t in s["tokens"]),
        s["sentimentValue"],
        s["sentiment"],
    ))
Save the previous code in a file named sentiment.py and provide this file's location as a command-line argument for the Execute Stream Command processor, as depicted in the following screenshot.
Command Arguments: <python script file location> <tweeted text>
Command Path: Python exe (installed location)
Step 4: Move the processed data along with sentiment score into a SQL table.
We use the following processors in this step:
- Extract Text — Extract the sentiment results from the Python script into an attribute.
- Update Attribute — Update sentiment and sentiment score attributes from the sentiment results.
- Attributes to JSON — Create JSON out of the attributes (fields) we want to track in the dashboard.
- ConvertJSONToSQL — Convert the JSON string into SQL INSERT statements.
- PutSQL — Execute the generated INSERT statements.
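Outside of DIP, the last three processors roughly amount to turning an attribute dictionary into a parameterized INSERT statement. The following Python sketch shows the idea; the column names follow the tweetssentiment table created in this step, the sample values are invented, and the pyodbc call in the comment is only one example of how the statement could be executed.

```python
# Sample attributes, standing in for the flow-file attributes in DIP.
attributes = {
    "userid": 12345,
    "screenname": "janedoe",
    "tweetedtext": "Loving the new release! #syncfusion",
    "sentiment": "Positive",
}

# Build a parameterized INSERT so values are bound, not string-concatenated.
columns = ", ".join("[%s]" % c for c in attributes)
placeholders = ", ".join("?" for _ in attributes)
sql = "INSERT INTO [dbo].[tweetssentiment] (%s) VALUES (%s)" % (
    columns, placeholders)
params = tuple(attributes.values())

print(sql)
# With pyodbc, for example: cursor.execute(sql, params)
```

In the data flow, ConvertJSONToSQL produces an equivalent parameterized statement automatically, and PutSQL executes it against the controller service's connection.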
Make sure you have created the SQL table (tweetssentiment) using the following query, and create a controller service for it in the Data Integration Platform. For more details on controller service settings, refer to our documentation.
CREATE TABLE [dbo].[tweetssentiment](
    [tweet id] [bigint] NULL,
    [userid] [bigint] NULL,
    [username] [varchar](500) NULL,
    [screenname] [varchar](500) NULL,
    [tweetedtext] [varchar](500) NULL,
    [language] [varchar](500) NULL,
    [location] [varchar](500) NULL,
    [created_at] [varchar](500) NULL,
    [hashtag] [varchar](500) NULL,
    [retweet_count] [int] NULL,
    [favourite_count] [int] NULL,
    [friends_count] [int] NULL,
    [followers_count] [int] NULL,
    [sentiment] [varchar](500) NULL,
    [sentiment score] [varchar](500) NULL
) ON [PRIMARY]
GO
The data integration workflow can run continuously by setting its schedule to "0 sec", so that it constantly looks for incoming tweets, or it can be scheduled to run at intervals.
Step 5: The final step is to create a Twitter sentiment analysis dashboard like the following in the Syncfusion Dashboard Cloud.
To learn the basics of creating a dashboard, refer to these links:
- Create business dashboards online
- Creating a sales dashboard with SQL Server and Syncfusion dashboards
You can follow the steps covered in these links to create your dashboards easily.