Note: it’s somewhat ironic to me that the tweet this post talks about has since been deleted. The technique described below doesn’t work for deleted tweets, but it should still for public tweets that are still published.
It can be difficult to have a conversation in Twitter but people somehow seem to manage. You can reply to someone’s tweet, and other people can reply to your replies, which forms a conversation thread of sorts. But the display of the thread is difficult to interpret.
What’s worse is that there is no Twitter API call to get the replies
to a given tweet. If you have the JSON for a tweet in hand you can use
in_reply_to_status_id property to fetch the tweet that
it is responding to. But the converse is not true: there is no
straightforward way to get the tweets that are in response to a given
tweet. If I’m wrong about that please let
me know. For a much more thorough discussion and analysis of these
constraints see Alexander
Visibility Dynamics in a Tweet Conversation Graph.
It’s a bit of a hack but you can use Twitter’s Search
API to programmatically scan through tweets directed at a given user
to:barackobama), and inspect them to see if any are
in response to a given tweet. You can also stop scanning when you arrive
at tweets that are older than the tweet you are looking for responses
to, since to my knowledge it’s impossible to reply to a tweet from the
future. Yeah, that was my dry attempt at a joke. The big caveat here is
that Twitter’s Search API only allows you to retrieve tweets from the
last week. So this technique will only work for fetching conversation
threads from the last week.
In the Documenting the Now project we are building tools to help researchers study Twitter. We’ve added a command to twarc that performs this heuristic to rebuild a given reply thread for a given tweet identifier. So to get the replies to this tweet:
let’s make this shit huge https://t.co/iP8IOY3CqB— laura olin (@lauraolin) January 25, 2017
you can run this command:
% twarc replies 82407791092769177 > replies.json
This will only get the initial set of replies to the tweet. If you
want to get the entire conversation thread you can use the
% twarc replies 82407791092769177 --recursive > replies.json
That will get the replies to the replies, and will also walk up the conversation chain if the supplied tweet identifier is itself a reply to another tweet. In addition it will follow tweets that are quotes.
To demonstrate that it’s working we’ve added a little utility called network.py that will read a set of tweets and write out the network of conversation as a GEXF for loading into Gephi, or DOT for use with Graphviz or as a standalone HTML file that uses D3 to visualize the conversation in your browser. Here’s how you run it:
% ./network.py replies.json replies.html
and here’s what the D3 visualization looks like for that tweet above. Try clicking on the nodes in the graph to see the tweets that the node represents. You can see the quote is colored yellow, and the original tweet (the one with no parent) is colored red.
Paul Butler also recently added the ability to drag and drop a file of tweets generated with the twarc replies command in his Treeverse. Treeverse is a Chrome plugin which provides a much more usable display of a conversation thread. Here’s a screenshot of looking at that same set of replies.
The nice thing about the D3 vidualization is that it’s possible to
restyle the presentation using CSS. You can also use it to visualize the
network of tweets that were not acquired using the
command. For example here is a visualization that was generated from a
search for the #datarefuge hashtag a
few days ago. I recorded it as a video on a large screen because there
were so many nodes.
If you get a chance to try any of this or have any thoughts about it I’d love to hear from you.