Many of us are familiar with Gnip, Inc., which provides data from dozens of social media websites via a single API. It is also known as the Grand Central Station for the social media web. One of its popular APIs is PowerTrack, which delivers Tweets from Twitter in real time and lets customers filter Twitter’s full firehose so that they receive only what they are interested in.
This filtering ability of Gnip’s PowerTrack has a number of applications, and one of them is Spark Streaming. Yes! We can integrate Gnip’s PowerTrack with Apache Spark’s streaming library and build a powerful utility that delivers Tweets from Twitter in real time. We can also apply the available features of Apache Spark to the PowerTrack data to do real-time analysis.
In this post, we will see a utility that helps us pull Tweets from Gnip using Spark Streaming and have better handling of…