Python is a great language for all sorts of things. Its very active developer community creates many libraries that extend the language and make it easier to use various services. One of those libraries is tweepy. Tweepy is open source, hosted on GitHub, and enables Python to communicate with the Twitter platform and use its API. For an introduction to the Twython library, check out this article.
At the time of writing, the current version of tweepy is 1.13. It was released on January 17 and offers various bug fixes and new functionality compared to the previous version. The 2.x version is in development but is currently unstable, so the vast majority of users should use the stable version.
Installing tweepy is easy. It can be cloned from the GitHub repository and installed from source:
git clone https://github.com/tweepy/tweepy.git
cd tweepy
python setup.py install
Or using pip:
pip install tweepy
Either way provides you with the latest version.
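To confirm which version ended up installed, the standard library can be queried without importing tweepy itself. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("tweepy")
if version is None:
    print("tweepy is not installed")
else:
    print("tweepy version: " + version)
```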
Using tweepy
Tweepy supports accessing Twitter via Basic Authentication and the newer method, OAuth. Twitter has stopped accepting Basic Authentication so OAuth is now the only way to use the Twitter API.
Here is a sample of how to access the Twitter API using tweepy with OAuth:
import tweepy

# Consumer keys and access tokens, used for OAuth
consumer_key = '7EyzTcAkINVS3T2pb165'
consumer_secret = 'a44R7WvbMW7L8I656Y4l'
access_token = 'z00Xy9AkHwp8vSTJ04L0'
access_token_secret = 'A1cK98w2NXXaCWMqMW6p'

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Creation of the actual interface, using authentication
api = tweepy.API(auth)

# Sample method, used to update a status
api.update_status('Hello Python Central!')
Running this code posts the status "Hello Python Central!" to the timeline of the authenticated account.
The main difference between Basic and OAuth authentication is the use of consumer and access keys. With Basic Authentication, it was possible to access the API with just a username and password, but since 2010, when Twitter started requiring OAuth, the process has been a bit more complicated: an app has to be created at dev.twitter.com.
OAuth takes more effort to set up than Basic Authentication, but the benefits it offers are well worth it:
- Tweets can be customized to have a string which identifies the app which was used.
- It doesn’t reveal the user’s password, making it more secure.
- It’s easier to manage permissions. For example, a set of tokens and keys can be generated that only allows reading from timelines, so if someone obtains those credentials, they won’t be able to write or send direct messages, minimizing the risk.
- The application doesn’t rely on a password, so even if the user changes it, the application will still work.
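The point about never revealing the password can be made concrete: with OAuth 1.0a, each request carries a signature computed from the consumer secret and the token secret, not from the account password. The following is a simplified sketch of the HMAC-SHA1 signing step using only the standard library. tweepy handles this internally, so this is not its code; the function names, the endpoint URL and the parameter values are illustrative assumptions, and a real request would also need a timestamp and the other oauth_* parameters.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    # OAuth 1.0a uses RFC 3986 encoding: only unreserved characters stay as-is.
    return quote(str(value), safe="-._~")


def sign_request(method, url, params, consumer_secret, token_secret):
    """Return a base64 HMAC-SHA1 signature over the method, URL and parameters."""
    # Encode, sort and join the parameters into one normalized string.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(k + "=" + v for k, v in encoded)
    base_string = "&".join(
        [method.upper(), percent_encode(url), percent_encode(param_string)]
    )
    # The signing key is built from the two secrets; no password is involved.
    signing_key = percent_encode(consumer_secret) + "&" + percent_encode(token_secret)
    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Illustrative values only; none of these are real credentials.
example_url = "https://api.twitter.com/1.1/statuses/update.json"
example_params = {"status": "Hello Python Central!", "oauth_nonce": "abc123"}
signature = sign_request("POST", example_url, example_params, "consumer-secret", "token-secret")
print(signature)
```

Changing either secret changes the signature, which is why a leaked read-only token pair can be revoked without touching the account password.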
After logging in to the portal and going to “Applications”, a new application can be created, which provides the data needed for communicating with the Twitter API.
The application’s page lists all of the data needed to talk to Twitter. It is important to note that by default the app has no access to direct messages; changing the appropriate option in the settings to “Read, write and direct messages” gives your app access to every Twitter feature.
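Since the four values from the application page act as credentials, one option is to keep them out of the source file and read them from the environment. The following is a minimal sketch of that idea; the environment variable names and the helpers load_credentials and build_api are assumptions made for this example, while the tweepy calls are the same ones used in the earlier sample.

```python
import os

# Hypothetical variable names; pick whatever fits your deployment.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(environ=os.environ):
    """Collect the four OAuth values, failing loudly if any is missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


def build_api(credentials):
    """Create the tweepy interface the same way as the earlier sample."""
    import tweepy  # imported here so load_credentials works without tweepy

    auth = tweepy.OAuthHandler(
        credentials["TWITTER_CONSUMER_KEY"],
        credentials["TWITTER_CONSUMER_SECRET"],
    )
    auth.set_access_token(
        credentials["TWITTER_ACCESS_TOKEN"],
        credentials["TWITTER_ACCESS_TOKEN_SECRET"],
    )
    return tweepy.API(auth)
```

With the variables exported in the shell, `api = build_api(load_credentials())` would yield the same kind of interface object as before.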
Twitter API
Tweepy provides access to the well-documented Twitter API. With tweepy, it’s possible to get any object and use any method that the official Twitter API offers. For example, a User object has its documentation at https://dev.twitter.com/docs/platform-objects/users and, following those guidelines, tweepy can get the appropriate information.
The main Model classes in the Twitter API are Tweets, Users, Entities and Places. Access to each returns a JSON-formatted response, and traversing through the information is very easy in Python.
# Creates the user object. The me() method returns the user
# whose authentication keys were used.
user = api.me()

print('Name: ' + user.name)
print('Location: ' + user.location)
print('Friends: ' + str(user.friends_count))
Name: Ahmet Novalic
Location: Gradacac,Bih
Friends: 59
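Because the underlying responses are JSON, the same traversal can be tried on a plain dictionary without any network access. The following is a small sketch under that assumption; the payload is a hypothetical, heavily trimmed user object that reuses the values printed above, and summarize_user is a helper invented for this example.

```python
import json

# Hypothetical, trimmed payload; real user objects carry many more fields.
raw_response = """
{
  "name": "Ahmet Novalic",
  "location": "Gradacac,Bih",
  "friends_count": 59
}
"""


def summarize_user(user):
    """Build the same three lines as the tweepy example, from a plain dict."""
    return [
        "Name: " + user["name"],
        # location may be absent or null, so fall back to an empty string
        "Location: " + (user.get("location") or ""),
        "Friends: " + str(user["friends_count"]),
    ]


user_data = json.loads(raw_response)
for line in summarize_user(user_data):
    print(line)
```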
All of the API methods are documented here: http://packages.python.org/tweepy/html/api.html
Tweepy StreamingAPI
One of the main use cases of tweepy is monitoring for tweets and performing actions when some event happens. A key component of that is the StreamListener object, which monitors tweets in real time and catches them.
StreamListener has several methods, with on_data() and on_status() being the most useful ones. Here is a sample program which implements this behavior:
import tweepy
from tweepy.streaming import Stream, StreamListener

# consumer_key, consumer_secret, access_token and access_token_secret
# are defined as in the OAuth sample above.


class StdOutListener(StreamListener):
    ''' Handles data received from the stream. '''

    def on_status(self, status):
        # Prints the text of the tweet
        print('Tweet text: ' + status.text)

        # There are many options in the status object,
        # hashtags can be very easily accessed.
        for hashtag in status.entities['hashtags']:
            print(hashtag['text'])

        return True

    def on_error(self, status_code):
        print('Got an error with status code: ' + str(status_code))
        return True  # To continue listening

    def on_timeout(self):
        print('Timeout...')
        return True  # To continue listening


if __name__ == '__main__':
    listener = StdOutListener()
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)

    stream = Stream(auth, listener)
    # User IDs are passed as strings
    stream.filter(follow=['38744894'], track=['#pythoncentral'])
So, this program implements a StreamListener, and the code is set up to use OAuth. A Stream object is created, which uses that listener as output. Stream, another important object in tweepy, also has many methods; in this case filter() is used with two parameters. “follow” is a list of user IDs whose tweets are monitored, and “track” is a list of keywords (hashtags, in this case) which will trigger the StreamListener.
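Twitter's streaming documentation describes the follow and track parameters as being combined with OR, so a tweet is delivered when either condition holds. The following is a rough local approximation of that rule, useful for reasoning about which tweets a filter would catch. It is not Twitter's actual matching logic (which tokenizes text rather than searching substrings); the function name matches_filter and the sample tweet dictionaries are assumptions made for this sketch.

```python
def matches_filter(tweet, follow=(), track=()):
    """Return True if the author is followed OR a tracked term appears in the text."""
    followed_ids = {str(user_id) for user_id in follow}
    author_matches = str(tweet["user_id"]) in followed_ids

    # Simplification: case-insensitive substring search on the tweet text.
    text = tweet["text"].lower()
    term_matches = any(term.lower() in text for term in track)

    return author_matches or term_matches


# Hypothetical sample tweets, reduced to the two fields the sketch needs.
samples = [
    {"user_id": 38744894, "text": "Nothing about Python here"},
    {"user_id": 1, "text": "Hello Again! #pythoncentral"},
    {"user_id": 1, "text": "Unrelated tweet"},
]

for sample in samples:
    caught = matches_filter(sample, follow=["38744894"], track=["#pythoncentral"])
    print(sample["text"] + " -> " + str(caught))
```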
In this example, we have used my user ID to follow and the #pythoncentral hashtag as a condition. After running the program and tweeting the status "Hello Again! #pythoncentral", the program almost instantly catches the tweet and calls the on_status() method, which produces the following output in the console:
Tweet text: Hello Again! #pythoncentral
pythoncentral
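How a raw stream message ends up in on_status() can be sketched without tweepy. The class below is a simplified stand-in, not tweepy's actual StreamListener: it assumes on_data() receives one JSON document as text and treats anything with "text" and "user" keys as a tweet. The class name, the received attribute and the sample payloads are all made up for illustration.

```python
import json


class SketchListener:
    """Simplified stand-in for a stream listener; not tweepy's actual class."""

    def __init__(self):
        self.received = []  # texts of the tweets seen so far

    def on_data(self, raw_data):
        # Raw messages arrive as JSON text; decode before deciding what they are.
        message = json.loads(raw_data)
        if "text" in message and "user" in message:
            return self.on_status(message)
        # Anything else (for example a deletion notice) is ignored in this sketch.
        return True

    def on_status(self, status):
        self.received.append(status["text"])
        print("Tweet text: " + status["text"])
        return True  # To continue listening


sketch = SketchListener()
sketch.on_data(json.dumps({"text": "Hello Again! #pythoncentral", "user": {"id": 38744894}}))
sketch.on_data(json.dumps({"delete": {"status": {"id": 1}}}))
```

Overriding on_data() gives access to every raw message, while overriding only on_status() is enough when tweets are all that matter.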
Besides printing the tweet, the on_status() method contains a few extra lines that illustrate what can be done with the tweet data:
# There are many options in the status object,
# hashtags can be very easily accessed.
for hashtag in status.entities['hashtags']:
    print(hashtag['text'])
This code traverses through entities, picks the “hashtags” one and for each hashtag the tweet contains, it prints its value. This is just a sample; a complete list of tweet entities is located here: https://dev.twitter.com/docs/tweet-entities.
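Each hashtag entity also carries an "indices" pair marking where it sits in the tweet text, which makes it possible to slice the original string. The following is a short sketch on a plain dictionary; the payload is a hypothetical, trimmed tweet modelled on the example tweet from this article, and hashtag_spans is a helper invented for this example.

```python
# Hypothetical, trimmed tweet payload; real tweets carry many more fields.
tweet = {
    "text": "Hello Again! #pythoncentral",
    "entities": {
        "hashtags": [{"text": "pythoncentral", "indices": [13, 27]}],
        "urls": [],
        "user_mentions": [],
    },
}


def hashtag_spans(tweet):
    """Yield (hashtag text, matching slice of the tweet text) for each hashtag."""
    text = tweet["text"]
    for hashtag in tweet["entities"].get("hashtags", []):
        start, end = hashtag["indices"]
        yield hashtag["text"], text[start:end]


for name, span in hashtag_spans(tweet):
    print(name + " -> " + span)
```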
Conclusion
To sum up, tweepy is a great open-source library which provides access to the Twitter API for Python. Although the documentation for tweepy is a bit scarce and doesn’t have many examples, the fact that it relies heavily on the Twitter API, which has excellent documentation, makes it probably the best Twitter library for Python, especially when considering the Streaming API support, which is where tweepy excels. Other libraries like python-twitter provide many functions too, but tweepy has the most active community and the most commits to the code in the last year.