Getting Twitter keyword data in table format using Python

The goal of this project is to search for tweets that contain a particular keyword. Say you need the tweets that contain "#tableau", along with any link in each tweet. This code collects those tweets into a table that is ready to use. The table will contain:

  • Date of the tweet
  • Twitter ID of the person who posted the tweet
  • User name of the person who posted the tweet
  • Number of followers that person has
  • Tweet in text format
  • Link in the tweet (if any)
  • Number of times the tweet was marked as a favourite by other users
  • Number of times the tweet was retweeted

The process will be as follows:

  1. Install and import the necessary packages.
  2. Enter the API keys, tokens, and secrets in the code. You can find a link after this post which will help you to generate them.
  3. Enter the keyword and the date from which we want to search.
  4. Define a function that builds a dataframe from the search results.
  5. Call the function to get the final output.

I hope you enjoy this.

Installing packages

In [ ]:
! pip install pandas
! pip install tweepy

Importing packages

In [2]:
import pandas as pd
import tweepy as tw

Enter your API keys below

In [3]:
consumer_key= '****************'
consumer_secret= '*************************'
access_token= '************************'
access_token_secret= '********************'
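
Pasting keys straight into a notebook makes them easy to share by accident. One common alternative is to read them from environment variables. The sketch below is only an illustration: the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are assumptions made for this example, not part of Tweepy or of the original notebook.

```python
import os

# Hypothetical environment variable names; use whichever names you export.
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=None):
    """Return the four credentials as a dict, or raise KeyError if any is missing."""
    # Fall back to the real process environment when no mapping is passed in.
    environ = os.environ if environ is None else environ
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in ENV_NAMES.items()}
```

With the variables set, `creds = load_credentials()` returns a dict whose values can be assigned to `consumer_key`, `consumer_secret`, `access_token` and `access_token_secret` before connecting.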

Connecting to Twitter API service

In [4]:
auth = tw.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tw.API(auth, wait_on_rate_limit=True)

Setting the keyword and start date

In [5]:
keyword = "#tableau" + " -filter:retweets"
from_date = "2019-08-12"  #format "yyyy-mm-dd"
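
If you plan to run several searches, it may help to build the query string and check the date format in one place. The following is a minimal sketch; `build_query` and `validate_date` are hypothetical helper names introduced here, not functions from Tweepy or pandas.

```python
from datetime import datetime


def build_query(keyword, exclude_retweets=True):
    """Return the search string, optionally adding the -filter:retweets operator."""
    keyword = keyword.strip()
    if exclude_retweets:
        return keyword + " -filter:retweets"
    return keyword


def validate_date(text):
    """Return the date as yyyy-mm-dd, or raise ValueError if it cannot be parsed."""
    parsed = datetime.strptime(text, "%Y-%m-%d")
    return parsed.strftime("%Y-%m-%d")
```

For example, `keyword = build_query("#tableau")` and `from_date = validate_date("2019-08-12")` give the same values as the cell above, while a date such as "12/08/2019" raises a `ValueError` before any request is sent.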

Function to convert Twitter data into a table

In [6]:
def twitter_table(keyword, from_date):

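    # Page through the search results for the keyword.
    # Note: Tweepy 4.0 renamed api.search to api.search_tweets.
    all_tweets = tw.Cursor(api.search,
                           q=keyword,
                           lang="en",
                           since=from_date).items()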

    # Collect a list of tweets
    tweets_list= [[tweet.created_at, tweet.user.screen_name,
                   tweet.user.name, tweet.user.followers_count, 
                   tweet.text, tweet.favorite_count, 
                   tweet.retweet_count] for tweet in all_tweets]

    tweet_table = pd.DataFrame(data=tweets_list, 
                        columns=['Date', 'User Twitter ID',
                                 'User Name', 'Followers', 
                                 'Tweet','Favourites', 'Retweet'])
    
    # Split the last link (if any) off the tweet text into its own column
    tweet_table[['Tweet', 'Link']] = tweet_table['Tweet'].str.rsplit(" https:", n=1, expand=True)
    tweet_table['Link'] = 'https:' + tweet_table['Link']
    
    tweet_table = tweet_table[['Date', 'User Twitter ID', 'User Name',
                               'Followers', 'Tweet', 'Link',
                               'Favourites', 'Retweet']]
    
    return tweet_table
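
The link-splitting step can be tried without calling the Twitter API. The sketch below uses two made-up tweets (not real data) and a different technique from the function above: a regular expression with `str.extract` instead of `str.rsplit`. It assumes the link sits at the end of the tweet text, and it still produces a `Link` column when none of the tweets contains a link.

```python
import pandas as pd

# Made-up sample tweets for illustration only.
sample = pd.DataFrame({"Tweet": [
    "Hello #tableau world https://t.co/abc123",
    "No link in this one",
]})

# Matches a trailing https link, plus any whitespace around it.
link_pattern = r"\s*(https://\S+)\s*$"

# Capture the link into its own column (NaN when the tweet has none).
sample["Link"] = sample["Tweet"].str.extract(link_pattern, expand=False)

# Remove the trailing link from the tweet text.
sample["Tweet"] = sample["Tweet"].str.replace(link_pattern, "", regex=True)

print(sample)
```

The first row ends up with the link in `Link` and the shortened text in `Tweet`; the second row keeps its text and gets `NaN` as the link.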

Calling our function to get the data from Twitter

In [7]:
twitter_table(keyword, from_date).head(10)
Out[7]:
|   | Date | User Twitter ID | User Name | Followers | Tweet | Link | Favourites | Retweet |
|---|------|-----------------|-----------|-----------|-------|------|------------|---------|
| 0 | 2019-08-12 15:13:00 | sarahlovesdata | Sarah Bartlett | 5999 | @LearnVizWithMe @J_JCookie Welcome @J_JCookie!... | NaN | 2 | 0 |
| 1 | 2019-08-12 15:00:43 | learnmet | Learnmet.com | 8044 | 10 Things You Must Do After You Sign Up on #Le... | https://t.co/kwWBMoRHNr | 0 | 1 |
| 2 | 2019-08-12 14:54:46 | couponfree01 | Coupon Free | 1914 | Udemy Free Discount - Business Analysis: Essen... | https://t.co/EPMHL6P6Sv | 4 | 4 |
| 3 | 2019-08-12 14:44:17 | BlaiseDenton | Blaise Denton | 3 | 2nd Makeover Monday: Clinical Trials. The numb... | https://t.co/Owyg7HkSaq | 1 | 0 |
| 4 | 2019-08-12 14:41:06 | Poulincogsci | Christina P. Gorga | 1067 | Hey #Datafam. Who has experience integrating #... | https://t.co/r223NAfB2m | 4 | 2 |
| 5 | 2019-08-12 14:33:51 | SegunOworu | Segun Oworu | 717 | Storytelling with data in beautiful visual: Th... | https://t.co/JKtCLKd4jX | 0 | 1 |
| 6 | 2019-08-12 14:14:16 | itylergarrett | tyler garrett | 971 | Early birds gets the worm, and coffee.\nDownto... | https://t.co/O1UxLjgtEZ | 4 | 1 |
| 7 | 2019-08-12 14:04:01 | EmmaWhyte | Emma Whyte | 3239 | Hey #tableau #datafam can we talk about on-boa... | https://t.co/3GsLb5XTrL | 4 | 4 |
| 8 | 2019-08-12 12:13:19 | stackArmor | stackArmor | 2147 | Readout @stackArmor #AWSSecuritySolutions Prov... | https://t.co/73CX5Gzl4P | 3 | 3 |
| 9 | 2019-08-12 12:04:54 | infolabNL | The Information Lab | 510 | Fan of table calculations, set actions and the... | https://t.co/dYm8eRr1Kd | 1 | 1 |

Author: Amandeep Saluja