Twitter and YouTube data

Gaining Access to Twitter and YouTube data.

 

Unlike Reddit, which does not require access rights, both Twitter and YouTube require you to take steps before you can access data. We outline what you need to do here, but the steps may change, so it is a good idea to seek out the latest information from the vosonSML website:

https://cran.r-project.org/web/packages/vosonSML/vignettes/Intro-to-vosonSML.html

Twitter Data

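Twitter's owner, Elon Musk, has recently decided to close down free access to Twitter's application programming interface (API), which gives users access to tweet data. The only remaining route is therefore to pay for data and use the firehose. The methods described below were written before this change and may no longer work without paid access.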


There are in essence two ways to collect Twitter data that do not use the firehose. The first method only requires the user to have an active Twitter account and uses the rtweet library. The following simple R commands search for up to 500 tweets containing the term “sunbelt”.

library(rtweet)

twitterdata <- search_tweets(q = "sunbelt", n = 500)

 

This request needs to be authenticated, so the user will be taken to a screen to sign in to their Twitter account. Close this screen once you have signed in and the download will commence. Note that this step may not occur if the user is still logged in to Twitter and has previously authenticated on the same machine.

Users who want to take advantage of other information available from Twitter may prefer the more sophisticated approach of using API keys and tokens. To do this you will need to set up a developer account with Twitter and generate your own API keys on the Twitter developer website, https://developer.twitter.com. Once you have applied and been granted access, you will need to generate API tokens associated with an app. This can be either an app that you are constructing or a third-party app that you have access to. Once you have your apiKey, your apiSecret and your two access tokens, accessToken and accessTokenSecret, associated with the app (called MyApp here), the following R code, which uses vosonSML, will again download 500 tweets containing “sunbelt”.

 

library(magrittr)  # provides the %>% pipe

library(vosonSML)  # provides Authenticate() and Collect()

myDevKeys <- list(appName = "MyApp", apiKey = "xxxxxxxxxxxxx", apiSecret = "xxxxxxxxxxxxxxx", accessToken = "xxxxxxxxxxx", accessTokenSecret = "xxxxxxxxxxx")

twitterAuth <- Authenticate("twitter", appName = myDevKeys$appName, apiKey = myDevKeys$apiKey, apiSecret = myDevKeys$apiSecret, accessToken = myDevKeys$accessToken, accessTokenSecret = myDevKeys$accessTokenSecret)

twitterData <- twitterAuth %>% Collect(searchTerm = "sunbelt", numTweets = 500)
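Hard-coding keys in a script makes it easy to share them by accident. The following is a minimal sketch of one alternative, assuming the four credentials have been stored in environment variables (for example in a user-level .Renviron file). The helper readTwitterKeys and the variable names are hypothetical and are not part of vosonSML.

```r
# Hypothetical helper: read Twitter credentials from environment variables.
# The environment variable names are assumptions chosen for this sketch.
readTwitterKeys <- function(appName = "MyApp") {
  vars <- c(apiKey = "TWITTER_API_KEY",
            apiSecret = "TWITTER_API_SECRET",
            accessToken = "TWITTER_ACCESS_TOKEN",
            accessTokenSecret = "TWITTER_ACCESS_TOKEN_SECRET")
  # unset = NA lets us tell a missing variable apart from an empty one
  values <- Sys.getenv(vars, unset = NA)
  absent <- vars[is.na(values)]
  if (length(absent) > 0) {
    stop("Missing environment variables: ", paste(absent, collapse = ", "))
  }
  keys <- as.list(values)
  names(keys) <- names(vars)  # use the argument names expected above
  c(list(appName = appName), keys)
}
```

Calling myDevKeys <- readTwitterKeys() should then give a list with the same element names as the hard-coded myDevKeys list, so it could be passed to Authenticate in the same way.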

YouTube

Obtaining YouTube data is far easier than obtaining Twitter data, as you only need an API key, which is generated via the Google developers console. You do not need a developer account. Go to https://console.cloud.google.com/apis/dashboard, click on Credentials, then Create credentials, and finally API key. The created key then appears in the dialogue box. Note that if you have created a key before, it is also stored on the cloud console. Once you have the key, the following R code will collect comments associated with the YouTube video https://www.youtube.com/watch?v=TxBj8R7XKe4.

 

myAPIKey <- "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

youtubeAuth <- Authenticate("youtube", apiKey = myAPIKey)

videoIDs <- GetYoutubeVideoIDs("https://www.youtube.com/watch?v=TxBj8R7XKe4")

youtubeData <- youtubeAuth %>%
  Collect(videoIDs = videoIDs, maxComments = 500, writeToFile = TRUE)
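The video ID is the value of the v parameter in the video's URL. As an illustration only, the sketch below pulls that ID out of a standard watch URL using base R. The helper extractVideoID is hypothetical and is not a vosonSML function; it assumes URLs of the form shown above and does not handle other formats such as shortened links.

```r
# Hypothetical helper: extract the video ID from a standard watch URL.
extractVideoID <- function(url) {
  # Capture the value of the "v" query parameter
  # (letters, digits, underscores and hyphens).
  found <- regmatches(url, regexec("[?&]v=([A-Za-z0-9_-]+)", url))[[1]]
  # found holds the full match and the captured ID, or nothing if no match
  if (length(found) < 2) NA_character_ else found[2]
}

extractVideoID("https://www.youtube.com/watch?v=TxBj8R7XKe4")
# returns "TxBj8R7XKe4"
```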