Use the Twitter API to get tweets and output to a CSV file using Python


I have code that scrapes tweets using the Twitter API and outputs them to a CSV file. The problems are that:

  1. I cannot add a header row of column names to the CSV file.
  2. Each row is separated by a blank line, and I don't know why; I don't want that.

Sorry, I cannot post an image...

My output looks like:

first row:  Thu May 14 00:24:55 +0000 2015  tweets  0 ...
second row: (blank)
third row:  Thu May 14 00:24:59 +0000 2015  tweets  0 ...

Here is the code:

    # aim of the program: scrape tweets from Twitter based on keywords
    # output: a file that contains the tweets
    from tweepy import Stream
    from tweepy import OAuthHandler
    from tweepy.streaming import StreamListener
    import time
    from HTMLParser import HTMLParser
    import json
    import csv

    ckey = "xxxxxxxxxx"
    csecret = "xxxxxxxxx"
    atoken = "xxxxxxxxxxxxxxxxx"
    asecret = "xxxxxxxxxxxxxxx"

    class Listener(StreamListener):
        def on_data(self, data):
            try:
                data_json = json.loads(HTMLParser().unescape(data))
                tweet_time = data_json["created_at"]
                tweet_text = data_json["text"]
                retweet_count = data_json["retweet_count"]
                tweet_name = data_json["entities"]["user_mentions"][0]["name"]
                tweet_screenname = data_json["entities"]["user_mentions"][0]["screen_name"]
                tweet_fw = data_json["user"]["followers_count"]
                tweet_fd = data_json["user"]["friends_count"]
                tweet_st = data_json["user"]["statuses_count"]
                print retweet_count
                f = csv.writer(open("tweets.csv", "a+"))
                f.writerow([tweet_time, tweet_text, retweet_count, tweet_name,
                            tweet_screenname, tweet_fw, tweet_fd, tweet_st])
                return True

            except BaseException, e:
                print "failed on data, " + str(e)
                time.sleep(5)

        def on_error(self, status):
            print status


    # authorize the api
    auth = OAuthHandler(ckey, csecret)
    auth.set_access_token(atoken, asecret)

    # call the main function
    twitterstream = Stream(auth, Listener())

    # search based on the keywords: "avengers", "captain"
    twitterstream.filter(track=["avengers", "captain"])
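Both issues usually come down to how the CSV file is opened. A minimal sketch of the writing step (Python 3 syntax here; the `append_tweet_row` helper and the column names are just placeholders, not part of the original code): write the header only when the file is new or empty, and open with `newline=""` so the `csv` module does not insert a blank line between rows (on Python 2, the equivalent fix is opening in binary append mode, `"ab"`).

```python
import csv
import os

def append_tweet_row(path, row, header):
    # write the header only once, when the file does not exist yet or is empty
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" stops csv.writer from producing blank lines between rows
    # (most visible on Windows); on Python 2, open the file with mode "ab" instead
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(header)
        writer.writerow(row)

# placeholder column names for illustration
header = ["created_at", "text", "retweet_count"]
append_tweet_row("tweets.csv", ["Thu May 14 00:24:55 +0000 2015", "tweets", 0], header)
append_tweet_row("tweets.csv", ["Thu May 14 00:24:59 +0000 2015", "tweets", 0], header)
```

Opening the file once per tweet (as the original code does) works with this pattern, but opening it once outside `on_data` and reusing the writer would avoid the repeated open/size checks.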

Any help is highly appreciated!!

Best, Morpheus

