Dear Twitter: Here Is How to Fix Your API

This is a guest post by Orian Marx, a serial entrepreneur obsessed with tackling information overload. He has a computer science background with expertise in Flash and Flex front-end development. He’s a born-and-raised New Yorker who likes to swing dance when he’s not working.

I want to make something clear from the start: I love Twitter, though sometimes I wonder if I’m suffering from Stockholm syndrome. I have devoted much of the last three years of my life to working with the Twitter API and continue to pursue building the world’s best Twitter client for professionals in the form of Siftee. Recently Twitter staff have been reaching out to developers with renewed vigor in the hopes of recapturing some of the goodwill and enthusiasm that have been squandered in the past two years. I applaud them for that. The new developer discussion site and documentation portal are significant improvements. Jack Dorsey has reached out for feedback. I recently spent time on the phone with Jason Costa, Twitter’s developer relations manager, at his request. I think these are all good signs for the ecosystem.

With that said, there is a lot of feedback to give. This post is a technical one focused on the API itself, not on Twitter’s relationship with developers. Although it’s technical, I’ve tried really hard to make it readable to “normals.” 🙂

Without further ado, here’s what’s wrong with the Twitter API …

Users can’t fetch all their old content.

Twitter imposes a variety of artificial limitations on how far back you can access tweets. Users can only access the most recent 800 tweets in their home timeline, their most recent 800 mentions and the most recent 3,200 tweets they’ve sent. Additionally a user may have favorited tweets that have become “too old” to be retrieved. What this effectively means is most long-term users of Twitter will never be able to access all of their old content.

Twitter imposed these constraints early on, due to their limited infrastructure capacity and a need to focus on reliable accessibility for recent content. While I can understand an ongoing need to prevent third parties from crawling all of Twitter’s old content, I believe that individual users should be able to access all old content directly relating to their account, which means all their Mentions, Direct Messages, Favorites and sent Tweets, without restriction.

With Siftee we attempt to archive as much of this content as possible for our users so they can search over it and see old conversations.

You can’t fetch all the replies to a tweet.

Twitter’s Issue 142 is one of the oldest and most infamous shortcomings of the API: there is no way to retrieve all the replies to a particular tweet. Issue 142 is about to celebrate its third birthday, has the dubious honor of being assigned to a programmer who no longer works for Twitter, and has been given a status of “WontFix.” I have long felt this is the most obvious shortcoming of the entire API, and addressing it has more potential than any other item in this post to fundamentally change the nature of the Twitter experience. Consider that Twitter allows you to see everyone who has retweeted a tweet. Allowing you to see all the replies to a tweet would revitalize Twitter as a medium for conversation rather than just broadcasting.

Unfortunately Twitter sees itself as an information delivery system rather than a social network so this is likely to continue to go unresolved.

You can’t see who favorited a tweet.

Similar to not being able to fetch all the replies to a particular tweet, you cannot fetch a list of users who have favorited a particular tweet. While Twitter’s User Streams API supports notifying users when their tweets are favorited in real-time, there are no methods for finding out who favorited your tweets (or anyone else’s) in the past. This seems like a missed opportunity for user discovery.

Native retweets don’t allow for comments.

When Twitter takes a phenomenon that is naturally emerging from the ecosystem (in this case, the retweet) and attempts to formalize it, a good rule of thumb would be that the result should keep all the existing functionality of the phenomenon and ideally make some or all of it better. When Twitter rolled out native retweeting, they solved a number of “problems” that most users probably didn’t care about (making tweets look like they came from the original author; making sure the tweet text wasn’t tampered with) while eliminating a very significant element of the original phenomenon (adding your own commentary to someone else’s).

We now have a situation where most Twitter clients support both the original “RT” approach to retweeting and the native retweet functionality. This sort of potential UI confusion is cited as one of the main reasons why Twitter wants to stop third parties from developing new clients. Of course this completely fails to acknowledge the reality that if Twitter’s own solution to retweeting were actually better across the board than the original behavior, it would have been near universally adopted. I don’t mean to be harsh, but I feel like I’m in pretty good company when Twitter’s own co-founder and Executive Chairman Jack Dorsey said he doesn’t use the native retweet functionality because it doesn’t fit how he retweets, by which he means he likes to include his own comments. I’m pretty sure he’s not the only one.

Think of it this way: the current system only allows for implicit agreement. There is no way to natively retweet someone while stating “I totally disagree with this.” Imagine only being allowed to quote political candidates you agree with. This is similar to Facebook’s problematic decision to create a Like button but no corresponding Dislike. It artificially skews potential activity in the system. This doesn’t make sense for a social network, but as I noted earlier, Twitter doesn’t consider itself a social network.

At this point it might be very difficult for Twitter to improve the situation. One idea is that Twitter clients could implement a behavior where a user could follow up a retweet with a separate reply to the original tweet and have the two be visually linked.

DMs can’t be marked as replies.

Direct messages do not have the in_reply_to property that tweets have; in other words, there is no way to explicitly link one direct message as being a reply to another. This means there is no way to break up direct message conversations between two users except perhaps by how much time has passed between messages, which is a very unreliable way of breaking up conversations. All this requires as a fix is implementing the exact same functionality that already exists for tweets to mark one as in reply to another.
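To show why the time-gap workaround is so fragile, here is a minimal Python sketch of it. The message dictionaries, the `created_at` field holding a `datetime`, and the six-hour threshold are all assumptions made for illustration. They are not the actual shape of Twitter’s direct message objects.

```python
from datetime import datetime, timedelta


def split_by_time_gap(messages, max_gap=timedelta(hours=6)):
    """Guess conversation boundaries purely from the time between messages.

    `messages` is a list of dicts with a `created_at` datetime (hypothetical
    shape). A new conversation starts whenever the gap exceeds `max_gap`.
    """
    conversations = []
    for message in sorted(messages, key=lambda m: m["created_at"]):
        last = conversations[-1][-1] if conversations else None
        if last is not None and message["created_at"] - last["created_at"] <= max_gap:
            conversations[-1].append(message)
        else:
            conversations.append([message])
    return conversations


# Example data (made up): two messages close together, one the next day.
example_messages = [
    {"text": "Lunch?", "created_at": datetime(2011, 7, 1, 9, 0)},
    {"text": "Sure, noon.", "created_at": datetime(2011, 7, 1, 9, 5)},
    {"text": "Noon worked well, thanks!", "created_at": datetime(2011, 7, 2, 10, 0)},
]
example_conversations = split_by_time_gap(example_messages)
```

In this example the third message is clearly part of the same exchange, yet the heuristic puts it in a new conversation. No choice of threshold fixes that in general, which is why an explicit reply link is needed.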

DMs aren’t threaded as conversations.

Developers access direct messages via two endpoints: direct_messages and direct_messages/sent. The first represents a user’s inbound direct messages and the second represents all their outbound direct messages. This is, unfortunately, entirely the wrong model for how private messaging should be represented. This model makes it extremely difficult to surface old conversations between the user and another specific person because it requires the developer to go back in time by loading all old direct messages the user has sent and received just to find the ones sent to or received from a specific account. This is why Twitter clients that show DMs as per-user conversations don’t let you go as far back in time as you would like to go. It simply becomes unwieldy and requires too many potentially extraneous calls to the API.

The right model would allow developers to fetch a list of accounts the user has had DM conversations with in reverse chronological order (the same way your phone shows you who you’ve been texting) and then fetch just the messages between the user and another specific account (again, the same way your phone does it).
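Until something like that exists, a client has to rebuild conversations itself from the two flat lists. The following Python sketch shows the idea under stated assumptions: each message is a dict with hypothetical `sender_id`, `recipient_id` and `created_at` keys, and both lists have already been fetched in full, which is exactly the expensive part.

```python
from collections import defaultdict


def thread_direct_messages(received, sent):
    """Group inbound and outbound messages into per-account conversations.

    Returns a list of (other_account_id, messages) pairs, with the most
    recently active conversation first and each conversation in
    chronological order. Field names are assumptions for illustration.
    """
    threads = defaultdict(list)
    for message in received:
        threads[message["sender_id"]].append(message)
    for message in sent:
        threads[message["recipient_id"]].append(message)
    for messages in threads.values():
        messages.sort(key=lambda m: m["created_at"])
    return sorted(
        threads.items(),
        key=lambda item: item[1][-1]["created_at"],
        reverse=True,
    )


# Example data (made up); `created_at` only needs to be comparable.
example_received = [
    {"sender_id": 2, "recipient_id": 1, "created_at": 1, "text": "hi"},
    {"sender_id": 3, "recipient_id": 1, "created_at": 5, "text": "ping"},
]
example_sent = [
    {"sender_id": 1, "recipient_id": 2, "created_at": 2, "text": "hello"},
]
example_threads = thread_direct_messages(example_received, example_sent)
```

The grouping itself is trivial. The cost is that every message ever sent or received has to be downloaded before a single conversation can be shown in full.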

Search results aren’t tweets.

Twitter search does something very weird. It doesn’t return tweets! It returns information that roughly resembles tweets but leaves out many of the standard fields and totally changes one very important one: the user ID of the sender. This is noted in a warning on the Search API page, which says that the issue (Issue 214) is being “tracked.” Unfortunately this issue has been “tracked” since 2008, when Twitter acquired Summize to form the core of its search capabilities. The last comment on Issue 214 sums things up pretty well: it’s “more or less obvious that they are not ever going to fix it.”

Twitter developers could pretty easily fix the issues of missing and incorrect fields in search results by taking all the tweet IDs returned in a search and looking up the original tweets, but unfortunately …

Tweets can’t be looked up in bulk.

Twitter provides some nifty bulk lookup tools such as users/lookup, which lets a developer send up to 100 user IDs or screen names to Twitter and get back a full representation of each user (their real name, bio, friends and follower counts, etc). Unfortunately there is no equivalent for tweets. If you have a list of tweet IDs (unique numbers linked to each tweet) and you want to look up the actual corresponding tweets, you have to do it one at a time.

This wasn’t really such a big deal for a long time, as there was rarely a situation where you would need to look up lots of tweets using IDs. Twitter makes lots of different services available for getting tweets in bulk such as fetching a user’s home timeline or their mentions. But what if a developer wanted to look up something that Twitter didn’t provide a service for, say the top most-favorited tweets? At one point in time they might have turned to a third party such as Favstar. Unfortunately (I seem to be using that word a lot) Twitter made what I consider to be one of their greatest strategic errors by changing their Terms of Service to prevent third parties from making tweets available via their own APIs. Instead, “If you provide an API that returns Twitter data, you may only return IDs (including tweet IDs and user IDs)” (section 4.A. of the ToS). Twitter allows developers to get up to 200 tweets at a time for things like a user’s home timeline, counting it as a single request against your rate limit (the number of times you can request certain things from Twitter per hour). But getting a similar 200 tweets from a third party service requires getting 200 tweet IDs and then requesting each tweet individually from Twitter, using up over half the current rate limit of 350 requests per hour.
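The arithmetic is worth spelling out. This short Python sketch uses the figures quoted above (200 tweets, 350 requests per hour). The bulk endpoint with a batch size of 100 is purely hypothetical, modeled on what users/lookup already does for users.

```python
import math

RATE_LIMIT_PER_HOUR = 350  # rate limit quoted in this post


def requests_needed(item_count, batch_size):
    """Number of API calls needed to look up `item_count` items."""
    return math.ceil(item_count / batch_size)


# Today: one request per tweet ID.
one_at_a_time = requests_needed(200, 1)

# Hypothetical bulk tweet lookup accepting 100 IDs per call.
with_bulk_lookup = requests_needed(200, 100)

# Fraction of the hourly budget spent on those 200 tweets today.
share_of_limit = one_at_a_time / RATE_LIMIT_PER_HOUR

print(one_at_a_time, with_bulk_lookup, round(share_of_limit, 2))
```

Looking up 200 tweets individually costs 200 requests, about 57% of the hourly budget. A bulk endpoint of the assumed size would do the same work in 2 requests.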

Talking about Twitter’s recent terms of service changes and their implications is fodder for a different post. The takeaway here is that Twitter should never have forced third parties to make only tweet IDs available through their own APIs if Twitter wasn’t ready to release a corresponding bulk tweet lookup service.

Lists don’t show @replies to people you follow who are not members of the list.

There are a number of characteristics of Twitter lists that can be very confusing for users. One thing many people don’t realize is you do not need to follow the accounts you put on a list. In fact, lists can be a great way to keep track of things people are saying without following them and cluttering up your home timeline. Another thing many people don’t realize about lists is that they will never show tweets that are replies to accounts that are not on the list. This, in my opinion, is not a good thing.

This is not a good thing because many people think of lists as a way to organize the people they are already following into more manageable groups. If I follow someone and I put them on a list, such as my “Twitter Developers” list, I would expect any tweets of theirs that I see in my home timeline to also appear on my Twitter Developers timeline. But that is not the case. For example, if I didn’t put myself on my Twitter Developers list, I would not see any @replies to me from anyone on that list when looking at it, even though I would see them in my home timeline. This is confusing.

There are good reasons from an infrastructure perspective as to why Twitter may have had to build lists this way. It would be great if Twitter would enable an option to request list tweets as either filtered or unfiltered for @replies to accounts you follow whether or not they are on the list. This is not too likely to happen, but the missing @replies could be pretty easily restored on the client side by merging the appropriate tweets from a user’s home timeline into their applicable list timelines.
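A rough Python sketch of that client-side merge follows. It assumes tweets are plain dicts with hypothetical `id`, `user_id` and `in_reply_to_user_id` keys, that both timelines are already loaded, and that larger tweet IDs mean newer tweets.

```python
def restore_list_replies(list_timeline, home_timeline, list_member_ids):
    """Add @replies by list members that the list timeline left out.

    Any reply in the home timeline written by a list member and not already
    in the list timeline is merged in. The result is sorted newest first,
    assuming IDs grow over time. Field names are illustrative assumptions.
    """
    seen_ids = {tweet["id"] for tweet in list_timeline}
    merged = list(list_timeline)
    for tweet in home_timeline:
        is_from_member = tweet["user_id"] in list_member_ids
        is_reply = tweet.get("in_reply_to_user_id") is not None
        if is_from_member and is_reply and tweet["id"] not in seen_ids:
            merged.append(tweet)
            seen_ids.add(tweet["id"])
    return sorted(merged, key=lambda t: t["id"], reverse=True)


# Example data (made up): account 1 is on the list, account 5 is not.
example_list_timeline = [
    {"id": 10, "user_id": 1, "in_reply_to_user_id": None},
]
example_home_timeline = [
    {"id": 12, "user_id": 1, "in_reply_to_user_id": 99},  # missing reply
    {"id": 11, "user_id": 5, "in_reply_to_user_id": 99},  # not a list member
    {"id": 10, "user_id": 1, "in_reply_to_user_id": None},  # already present
]
example_merged = restore_list_replies(
    example_list_timeline, example_home_timeline, {1}
)
```

This only recovers replies that are still within reach of the home timeline, so it is a patch on the client rather than a substitute for an unfiltered option on the server.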

Lists should not be capped.

Twitter imposes two limitations on lists: a user cannot create more than 20 of them, and each of them cannot have more than 500 members. Neither of these restrictions makes much sense, and my guess is that they do more harm toward user adoption of lists than they do good in terms of preventing Twitter’s infrastructure from being overloaded (which is the only justification for these restrictions in the first place).

Capping lists at 500 members doesn’t make sense given that Twitter can clearly already handle generating timelines for accounts that are following tens of thousands of people. Most users will not build such large lists for personal organization anyway, but consider cases such as building a list of conference attendees (Twitter arguably had its first major spurt of adoption at the 2007 SXSW conference as a way for attendees to communicate) or maintaining a company directory (Twitter maintains a list of all its employees – a list which apparently has special privileges as it currently has more than 500 members).

As for limiting accounts to 20 lists, again this seems very arbitrary and unnecessary. Most users will never create more than 20 lists, but those that would want to would likely have good reason to. What if a university wanted to create a list of all the students in each of its courses? What if a large company wanted to create a list for each of its departments? What if the US government wanted to create a list of government officials on Twitter in each of the 50 states? As we continue to generate more and more information, curation becomes ever more necessary. Twitter should be trying to get ahead of the curve on this, especially with new services like Google+ entering the fray.

Errors aren’t consistent.

This is purely anecdotal but from my extensive experience in building Siftee I can say that the Twitter API throws a lot of random errors that frequently don’t accurately reflect whatever the problem may be. Sometimes Twitter is having a capacity issue but the API tells you you’re requesting some information that doesn’t exist (when it does). Sometimes it spits HTML at you rather than a properly formed error response. Experience indicates there are a wide variety of possible error messages one can receive from the Twitter API but unfortunately these messages aren’t documented anywhere. Yes, there is the error codes and responses documentation page, but this seems to just scratch the surface of actual error messages you may receive.
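In practice this means a client cannot trust either the status code or the body on its own. Here is a defensive Python sketch of the kind of check a client ends up writing. It assumes responses are requested as JSON, and the function name and message format are my own inventions rather than anything Twitter documents.

```python
import json


def parse_api_response(status_code, body):
    """Return (ok, payload_or_message) without assuming the body is JSON.

    Handles the two cases described above: a body that is HTML (or otherwise
    not JSON), and a well-formed JSON error with a failing status code.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers HTML error pages and truncated or empty bodies.
        return False, f"HTTP {status_code}: non-JSON response body"
    if status_code >= 400:
        return False, f"HTTP {status_code}: {payload}"
    return True, payload


# Example inputs (made up).
html_result = parse_api_response(502, "<html><body>Over capacity</body></html>")
json_error_result = parse_api_response(404, '{"error": "Not found"}')
success_result = parse_api_response(200, '{"id": 1}')
```

A check like this cannot tell a real “not found” from a capacity problem that is reported as one. For that a client can only retry and hope, which is the core of the complaint.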

The crossdomain.xml for non-search APIs is too restrictive.

I saved this one for last as this is a pet peeve that only impacts developers using Flash (and possibly Silverlight). Twitter hates Flash (along with everybody else these days). Why do I say that? Because Twitter has since day one made it impossible for Flash to directly access any Twitter services except the Search API. This is very simply due to the lack of a sufficiently open crossdomain.xml file on Twitter servers, which Flash needs to satisfy security constraints. I’ve been bringing this issue up for three years. Siftee is currently built in Flash using the Flex framework, and every request it makes has to be routed through a PHP proxy rather than going directly to Twitter, due to this extremely stupid constraint that no other major web service imposes. Regardless of how you feel about Flash, it makes no sense for Twitter to alienate a whole industry of developers simply because it can’t get around to reviewing this issue.


This post is not meant to be exhaustive. There are lots of other things Twitter could be doing. In fact we’re doing lots of great stuff with Siftee that can’t currently be done with the Twitter API. My goal was to cover what I see as some of the long-standing issues that should have been addressed a long time ago. I don’t mean to suggest that all of these things are easy to fix. I don’t work for Twitter and I am no expert in building large-scale web services. However, after several years and more than a billion dollars raised, I would have expected many of these issues to be non-existent. I look forward to seeing what comes of Twitter’s renewed interest in gathering feedback from developers.

A version of this post first appeared on Mr. Marx’s blog.