Note: it’s somewhat ironic to me that the tweet this post talks about has since been deleted. The technique described below doesn’t work for deleted tweets, but it should still for public tweets that are still published.
It can be difficult to have a conversation in Twitter but people somehow seem to manage. You can reply to someone’s tweet, and other people can reply to your replies, which forms a conversation thread of sorts. But the display of the thread is difficult to interpret.
What’s worse is that there is no Twitter API call to get the replies to a given tweet. If you have the JSON for a tweet in hand you can use the
in_reply_to_status_id property to fetch the tweet that it is responding to. But the converse is not true: there is no straightforward way to get the tweets that are in response to a given tweet. If I’m wrong about that please let me know. For a much more thorough discussion and analysis of these constraints see Alexander Nwala’s Tweet Visibility Dynamics in a Tweet Conversation Graph.
It’s a bit of a hack but you can use Twitter’s Search API to programmatically scan through tweets directed at a given user (e.g.
to:barackobama), and inspect them to see if any are in response to a given tweet. You can also stop scanning when you arrive at tweets that are older than the tweet you are looking for responses to, since to my knowledge it’s impossible to reply to a tweet from the future. Yeah, that was my dry attempt at a joke. The big caveat here is that Twitter’s Search API only allows you to retrieve tweets from the last week. So this technique will only work for fetching conversation threads from the last week.
In the Documenting the Now project we are building tools to help researchers study Twitter. We’ve added a command to twarc that performs this heuristic to rebuild a given reply thread for a given tweet identifier. So to get the replies to this tweet:
let’s make this shit huge https://t.co/iP8IOY3CqB— laura olin (@lauraolin) January 25, 2017
you can run this command:
% twarc replies 82407791092769177 > replies.json
This will only get the initial set of replies to the tweet. If you want to get the entire conversation thread you can use the
% twarc replies 82407791092769177 --recursive > replies.json
That will get the replies to the replies, and will also walk up the conversation chain if the supplied tweet identifier is itself a reply to another tweet. In addition it will follow tweets that are quotes.
To demonstrate that it’s working we’ve added a little utility called network.py that will read a set of tweets and write out the network of conversation as a GEXF for loading into Gephi, or DOT for use with Graphviz or as a standalone HTML file that uses D3 to visualize the conversation in your browser. Here’s how you run it:
% ./network.py replies.json replies.html
and here’s what the D3 visualization looks like for that tweet above. Try clicking on the nodes in the graph to see the tweets that the node represents. You can see the quote is colored yellow, and the original tweet (the one with no parent) is colored red.
Paul Butler also recently added the ability to drag and drop a file of tweets generated with the twarc replies command in his Treeverse. Treeverse is a Chrome plugin which provides a much more usable display of a conversation thread. Here’s a screenshot of looking at that same set of replies.
The nice thing about the D3 vidualization is that it’s possible to restyle the presentation using CSS. You can also use it to visualize the network of tweets that were not acquired using the
replies command. For example here is a visualization that was generated from a search for the #datarefuge hashtag a few days ago. I recorded it as a video on a large screen because there were so many nodes.
If you get a chance to try any of this or have any thoughts about it I’d love to hear from you.