Thoughts on web development, tech, and life.


Leaders in Tech podcast appearance

I was recently a guest on the Leaders in Tech podcast. We covered a lot of ground, from my childhood and how I got interested in tech and AI, to many lessons I learned working in startups (first Plaxo, now Triller) and inside big companies (Google+, Google Photos, Google Assistant). In particular, the conversation focused on the advantages startups have when it comes to driving innovation, and why, despite their advantages in terms of resources and distribution, it’s hard to get the same results inside larger organizations. We finished with a discussion of how AI is likely to impact the careers of software engineers (my bet is it will remain more of an amplifier than a replacement for years to come).

I think this is one of the best summaries of my experience and thoughts on Silicon Valley and entrepreneurship that I’ve managed to capture. I hope you’ll find it useful and would love to hear your feedback!

Turns out we still need Plaxo (or something like it)


Despite having helped build a startup 12 (!) years ago focused on keeping your address book up to date, despite the massive intervening adoption of smartphones and social networks, and despite being connected to an unusually tech-savvy cohort of friends and relatives, my address book is still a disastrous mess. Sound familiar? This thought always lingered in my head, but it confronted me full-force this weekend while my wife and I struggled in vain to produce an updated mailing list to announce the birth of our new son. We’d done this for our last child (sadly using a shared Google spreadsheet in the absence of any more purpose-built tool), so we were only 2-3 years out-of-date, yet an astonishing 45% of the addresses of our friends and family needed updating, and the task was far from easy.

This led me to ponder why this seemingly tractable problem remains so stubbornly unsolved, and what, if any, hope lies ahead.

The false saviors

It’s hard to remember that when we started Plaxo in 2002, there was no iPhone, no Facebook, not even Friendster or Flickr. Our biggest opportunity was writing a Microsoft Outlook plugin. Since then, many people have said that the problem of out-of-date contact info would soon be a relic of the past. Smartphones would keep us seamlessly connected. Social network profiles would always contain our latest info. Number portability, Gmail, and mainstream messaging services would create long-lasting identifiers. The death of snail mail would obviate the need for physical addresses. And so on. We often worried about these trends, and in some sense they have each helped us get more connected, yet they clearly haven’t solved the core problem. Why not? A few reasons…

Contact info still changes frequently (esp. in aggregate). As mentioned above, nearly half of the people we sent birth announcements to last time physically moved in the past 2-3 years. Now in our age range that’s probably more than usual (buying a first house, getting a new job, etc.), but it’s still staggeringly high. And it’s not just physical addresses. I’m constantly wondering “is this the right email address or cell phone number to use for so-and-so, or is it dead / rarely checked these days?” Ditto for “who do I know who still works at [insert company here]?” Even using circles in Google+, I’m often wondering “gee, is my Googlers circle still a safe place for Google-only discussions?” Even when the info hasn’t changed, I’m often unsure if it’s still current. And of course I keep meeting new people, many of whom I haven’t ever collected the relevant info for (even if they wouldn’t mind me having it).

Social networks suck at contact info. As surprising as the staleness of our address list was, I was even more surprised how few of my contacts had their latest address in Google+, Facebook, or elsewhere online. Many of these people used to work at Plaxo, and all are online using social networks daily. Yet the info is either missing, stale, or not shared with me. Why? In theory, social networks subsumed Plaxo-like functionality, but in practice there are deep reasons why they fall short.

  • Contact info is buried. I have to click to someone’s profile, click to their about/contact tab, scroll down, and then hope their info is shared with me. When’s the last time you viewed your own “about” tab (let alone someone else’s contact info section)? It’s out of sight, out of mind. In fact, even my own home address was out of date on Google+ and Facebook until long after I moved. It’s just not something you naturally think about while checking your news feed.
  • You don’t want everyone to have your info. Even though most social networks provide a way for you to share personal contact info, most users don’t want all of their “friends” to have all of their personal details. I recall hearing back in the day as Facebook grew that more people were actually deleting their contact info because they were making more and more loose-tie friends who they didn’t feel comfortable sharing that info with. On Google+ you can (of course) use circles to finely control who sees your home info, and Facebook has since followed suit, but as my wife put it when I asked her today, “eew, I don’t want to put my home address on Facebook”. You have to trust the site itself, your ability to navigate their privacy controls (and keep them up-to-date as your life changes), and the site’s ability to honor your choices before you’ll use social networks to share sensitive info. For most people, that bar has not yet been met.
  • Oh, and that whole “data portability” thing. When I want to see if I have someone’s latest home address, where should I look? My address book? Only if I can pull in the info shared with me from social networks. Not surprisingly, Google+ syncs to Google Contacts, but everything else is still a walled garden. You have to go knock on all the doors. For every person you care about. Every time. Even though they chose to share it with you. Sound broken? I agree. But I guess we weren’t loud enough.

Smartphones aren’t smart about contact info. Your smartphone address book does a good job of following you around from device to device and desktop to mobile. The only problem is that it is syncing garbage data. You’d think that using it to send and receive phone calls, text messages, and emails, and to look up driving directions would make your address book fresher and more complete. But you’d be wrong. Chalk it up to app fragmentation and no one really trying hard to solve the problem in the first place, esp. at the OS level. Even at Google, the Gmail Contacts team is separate from the Android “People app” team, and most OEMs bundle their own separate address book. Good luck.

No one’s helping you scrounge. Another fascinating if infuriating aspect of my recent Labor Day labor to update our mailing list was how often I could find the addresses by scrounging through my email archives, text messages, and manually entered address book info. I’d even been to many of the homes I lacked addresses for! In the absence of a good contact info syncing solution, most people still solve the problem “out of band” via existing communication channels (“leaving now, btw what’s your address in SF?” “it’s 123 Fake St., see you soon!”). Yet nothing is helping you extract, save, and aggregate that info for the next time you need it. It’s still a painful, manual process that you need to “get good at”. And this is for close ties–not random acquaintances–so it’s surely just an “entropy” problem, not a “stalking” problem.

The wisdom of crowds? Another trend I was sure would take off in the age of the cloud was some kind of solution for pooling contact info between families, friends, and organizations. At Plaxo we used to always have the problem, “Who’s got the cell phone for that new guy we just hired?” and the answer was always “someone in the room” (you just don’t know who in advance, and you have to ask them first). Many families still have a designated aunt who dutifully maintains the conglomerated birthday and holiday card list. And “hey Garret, remind me what Pete’s new address is?” still gets the job done in a pinch without offending. So why does each address book still start from scratch as if it were the only record of knowledge in the universe?

A new hope?

In the years since leaving Plaxo to help start Google+, I’ve talked to nearly every “address book 2.0” startup that’s reared its head. Apparently I’ve got a reputation in the Valley for not being “over” this contact info problem. Many have offered clever twists, but none have fundamentally addressed the challenges above. And perhaps unsurprisingly, as a result, none have saved the world from its frustrating fragmentation. So why am I still eternally optimistic? Because it’s a real, mainstream problem, there’s no good reason people want fragmentation to persist, and increasingly smartphones do participate in the events that collect or verify contact info. Plus there’s still the cloud opportunity for making each other smarter. So how might such a solution emerge?

When trying to solve any complex social problem, one good question to ask is, “What does the ideal solution state look like?” In the case of up-to-date contact info, I’d argue we still don’t know. You could say it’s everyone being on a single social network and using it perfectly, but is that ever going to be realistic? I’d say it’s more likely a mix of assisted sync and scrounge. In other words: help me collect whatever’s been shared with me via social networks or communication tools. And the place to do that is logically a smartphone (backed by a cloud-hosted account). Google or Apple are, in theory, in a great position to make progress on this, but I suspect it will be a startup that gets the job done, since it can be their sole focus and brand identity.

Such a startup would have to embrace the messy reality I’ve outlined above and turn it into a strength. Use all the available APIs and other tricks to help me find the contact info that has been shared with me. Keep track of when it was last updated (don’t make me guess). Parse through all my emails and texts for stuff that looks like contact info. Use my phone’s location history to ask me whose house I just visited when it doesn’t look like a business. Remind me what email or phone number each contact last used, and let me easily ping them back if I need some updated info. Help me build custom lists for things like holiday or birth announcements, and use that as an opportunity to ask for updated info. And partner with sites like TinyPrints not only to send those cards but also to send change-of-address cards when I myself move (something you should also be able to detect using my phone). Once you start gaining traction helping individuals keep their address books up-to-date, add a layer to pool it with family, friends, and colleagues in a privacy-preserving way (e.g. an easy way to see who knows someone’s phone number, but you still have to ask them to share it with you).

Is there enough here to build a successful business around? You be the judge. But is this still a real problem that real people still wish someone would solve? Abso-f*cking-lutely.

Implementing PubSubHubbub subscriber support: A step-by-step guide

One of the last things I did before leaving Plaxo was to implement PubSubHubbub (PuSH) subscriber support, so that any blog that pings a PuSH hub will show up almost instantly in Plaxo Pulse after being published. It’s easy to do (you don’t even need a library!), and it significantly improves the user experience while simultaneously reducing server load on your site and the sites whose feeds you’re crawling. At the time, I couldn’t find any good tutorials for how to implement PuSH subscriber support (add a comment if you know of any), so here’s how I did it. (Note: depending on your needs, you might find it useful instead to use a third-party service like Gnip to do this.)

My assumption here is that you’ve already got a database of feeds you’re subscribing to, but that you’re currently just polling them all periodically to look for new content. This tutorial will help you “gracefully upgrade” to support PuSH-enabled blogs without rewriting your fundamental polling infrastructure. At the end, I’ll suggest a more radical approach that is probably better overall if you can afford a bigger rewrite of your crawling engine.

The steps to add PuSH subscriber support are as follows:

  1. Identify PuSH-enabled blogs and extract their hub and topic
  2. Lazily subscribe to PuSH-enabled blogs as you discover them
  3. Verify subscription requests from the hub as you make them
  4. Write an endpoint to receive pings from the hub as new content is published
  5. Get the latest content from updated blogs as you receive pings
  6. Unsubscribe from feeds when they’re deleted from your system

1. Identify PuSH-enabled blogs and extract their hub and topic

When crawling a feed normally, you can look for some extra metadata in the XML that tells you this blog is PuSH-enabled. Specifically, you want to look for two links: the “hub” (the URL of the hub that the blog pings every time it has new content, which you in turn communicate with to subscribe and receive pings when new content is published), and the “self” (the canonical URL of the blog you’re subscribing to, which is referred to as the “topic” you’re going to subscribe to from the hub).

A useful test blog to use while building PuSH subscriber support is http://pubsubhubbub-example-app.appspot.com/, since it lets anyone publish new content. If you view source on that page, you’ll notice the standard RSS auto-discovery tag that tells you where to find the blog’s feed:

<link title="PubSubHubbub example app" type="application/atom+xml" rel="alternate" href="http://pubsubhubbub-example-app.appspot.com/feed" />

And if you view source on http://pubsubhubbub-example-app.appspot.com/feed, you’ll see the two PuSH links advertised underneath the root feed tag:

<link type="application/atom+xml" title="PubSubHubbub example app" rel="self" href="http://pubsubhubbub-example-app.appspot.com/feed" />
<link rel="hub" href="http://pubsubhubbub.appspot.com/" />

You can see that the “self” link is the same as the URL of the feed that you’re already using, and the “hub” link is to the free hub being hosted on AppEngine at http://pubsubhubbub.appspot.com/. In both cases, you want to look for a link tag under the root feed tag, match the appropriate rel-value (keeping in mind that rel-attributes can have multiple, space-separated values, e.g. rel="self somethingelse", so split the rel-value on spaces and then look for the specific matching rel-value), and then extract the corresponding href-value from that link tag. Note that the example above is an ATOM feed; in RSS feeds, you generally have to look for atom:link tags under the channel tag under the root rss tag, but the rest is the same.
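
To make the discovery step concrete, here’s a minimal sketch in Python (standard library only) that pulls the hub and self links out of a feed. It covers the Atom case shown above and the RSS case described in this paragraph; the function name and structure are just illustrative.

import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def discover_push_links(feed_url):
    """Return (hub_url, topic_url) for a feed; either value may be None."""
    with urllib.request.urlopen(feed_url) as resp:
        root = ET.fromstring(resp.read())
    if root.tag == ATOM + "feed":
        links = root.findall(ATOM + "link")               # Atom: link tags under the root feed tag
    else:
        links = root.findall("channel/" + ATOM + "link")  # RSS: atom:link tags under channel
    hub = topic = None
    for link in links:
        rels = (link.get("rel") or "").split()            # rel can hold multiple space-separated values
        if "hub" in rels:
            hub = link.get("href")
        if "self" in rels:
            topic = link.get("href")
    return hub, topic

# e.g. discover_push_links("http://pubsubhubbub-example-app.appspot.com/feed") should return
# ("http://pubsubhubbub.appspot.com/", "http://pubsubhubbub-example-app.appspot.com/feed")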

Once you have the hub and self links for this blog (assuming the blog is PuSH-enabled), you’ll want to store the self-href (aka the “topic”) with that feed in your database so you’ll know whether you’ve subscribed to it, and, if so, whether the topic has changed since you last subscribed.

2. Lazily subscribe to PuSH-enabled blogs as you discover them

When you’re crawling a feed and you notice it’s PuSH-enabled, check your feed database to see if you’ve got a stored PuSH-topic for that feed, and if so, whether the current topic is the same as your stored value. If you don’t have any stored topic, or if the current topic is different, you’ll want to talk to that blog’s PuSH hub and initiate a subscription so that you can receive real-time updates when new content is published to that blog. By storing the PuSH-topic per-feed, you can effectively “lazily subscribe” to all PuSH-enabled blogs by continuing to regularly poll and crawl them as you currently do, and adding PuSH subscriptions as you find them. This means you don’t have to do any large one-time migration over to PuSH, and you can automatically keep up as more blogs become PuSH-enabled or change their topics over time. (Depending on your crawling infrastructure, you can either initiate subscriptions as soon as you find the relevant tags, or you can insert an asynchronous job to initiate the subscription so that some other part of your system can handle that later without slowing down your crawlers.)
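
Here’s a rough sketch of that lazy-subscription check in Python; the Feed record and the enqueue_subscription hook are hypothetical stand-ins for your own feed database and job queue.

from dataclasses import dataclass

@dataclass
class Feed:
    id: int
    push_topic: str = ""   # the self/topic URL we last subscribed with ("" if never subscribed)

def maybe_subscribe(feed, hub_url, topic_url, enqueue_subscription):
    # Lazily (re)subscribe: only act when the feed advertises PuSH links and the
    # topic is new or has changed since our last subscription.
    if not hub_url or not topic_url:
        return
    if feed.push_topic == topic_url:
        return
    enqueue_subscription(feed.id, hub_url, topic_url)  # or call the hub inline (see step 2)
    feed.push_topic = topic_url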

To subscribe to a PuSH-enabled blog, just send an HTTP POST to its hub URL and provide the following POST parameters:

  • hub.callback = [the URL of your endpoint for receiving pings, which we’ll build in step 4]
  • hub.mode = subscribe
  • hub.topic = [the self-link / topic of the feed you’re subscribing to, which you extracted in step 1]
  • hub.verify = async [means the hub will separately call you back to verify this subscription]
  • hub.verify_token = [a hard-to-guess token associated with this feed, which the hub will echo back to you to prove it’s a real subscription verification]

For the hub.callback URL, it’s probably best to include the internal database ID of the feed you’re subscribing to, so it’s easy to look up that feed when you receive future update pings. Depending on your setup, this might be something like http://yoursite.com/push/update?feed_id=123 or http://yoursite.com/push/update/123. Another advantage of this technique is that it makes it relatively hard to guess what the update URL is for an arbitrary blog, in case an evil site wanted to send you fake updates. If you want even more security, you could put some extra token in the URL that’s different per-feed, or you could use the hub.secret mechanism when subscribing, which will cause the hub to send you a signed verification header with every ping, but that’s beyond the scope of this tutorial.

For the hub.verify_token, the simplest thing would just be to pick a secret word (e.g. “MySekritVerifyToken”) and always use that, but an evil blog could use its own hub and quickly discover that secret. So a better idea is to do something like take the HMAC-SHA1 of the topic URL along with some secret salt you keep internally. This way, the hub.verify_token value is feed-specific, but it’s easy to recompute when you receive the verification.

If your subscription request is successful, the hub will respond with an HTTP 202 “Accepted” code, and will then proceed to send you a verification request for this subscription at your specified callback URL.
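
A minimal sketch of that subscription request in Python (standard library only) might look like the following. The callback URL pattern and the salted HMAC-SHA1 verify token follow the suggestions above; both are illustrative choices rather than requirements of the protocol.

import hashlib
import hmac
import urllib.error
import urllib.parse
import urllib.request

SECRET_SALT = b"some-long-random-secret"   # internal salt; never shared with hubs

def verify_token_for(topic_url):
    # Feed-specific token (HMAC-SHA1 of the topic URL) that the hub echoes back during verification.
    return hmac.new(SECRET_SALT, topic_url.encode("utf-8"), hashlib.sha1).hexdigest()

def subscribe(hub_url, topic_url, feed_id):
    params = {
        "hub.callback": "http://yoursite.com/push/update?feed_id=%d" % feed_id,  # your endpoint from step 4
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.verify": "async",
        "hub.verify_token": verify_token_for(topic_url),
    }
    req = urllib.request.Request(hub_url, data=urllib.parse.urlencode(params).encode("utf-8"))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status == 202   # hub accepted; expect an asynchronous verification callback
    except urllib.error.HTTPError:
        return False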

3. Verify subscription requests from the hub as you make them

Shortly after you send your subscription request to the hub, it will call you back at the hub.callback URL you specified with an HTTP GET request containing the following query parameters:

  • hub.mode = subscribe
  • hub.topic = [the self-link / topic of the URL you requested a subscription for]
  • hub.challenge = [a random string that you must echo back in your response body to confirm the subscription]
  • hub.verify_token = [the value you sent in hub.verify_token during your subscription request]

Since the endpoint on which you receive this verification request is the same one you’ll receive future update pings on, your logic has to first look for hub.mode=subscribe, and if so, verify that the hub sent the proper hub.verify_token back to you, and then just dump out the hub.challenge value as the response body of your page (with a standard HTTP 200 response code). Now you’re officially subscribed to this feed, and will receive update pings when the blog publishes new content.

Note that hubs may periodically re-verify that you still want a subscription to this feed. So you should make sure that if the hub makes a similar verification request out-of-the-blue in the future, you respond the same way you did the first time, providing you indeed are still interested in that feed. A good way to do this is just to look up the feed every time you get a verification request (remember, you build the feed’s ID into your callback URL), and if you’ve since deleted or otherwise stopped caring about that feed, return an HTTP 404 response instead so the hub will know to stop pinging you with updates.
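
Putting this together with the verify token from step 2, the verification handling might look roughly like this framework-agnostic Python sketch, where query is the parsed query string of the GET request and feed is your database record for the feed ID embedded in the callback URL (or None if you’ve since deleted it):

import hashlib
import hmac

SECRET_SALT = b"some-long-random-secret"   # the same salt used when subscribing in step 2

def expected_verify_token(topic_url):
    return hmac.new(SECRET_SALT, topic_url.encode("utf-8"), hashlib.sha1).hexdigest()

def handle_verification(query, feed):
    """Return (http_status, response_body) for a hub verification request."""
    if feed is None:
        return 404, ""   # feed deleted on our side; tells the hub to drop the subscription
    if query.get("hub.mode") not in ("subscribe", "unsubscribe"):
        return 400, ""
    if query.get("hub.verify_token") != expected_verify_token(query.get("hub.topic", "")):
        return 404, ""   # not a verification we initiated
    return 200, query.get("hub.challenge", "")   # echo the challenge to confirm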

4. Write an endpoint to receive pings from the hub as new content is published

Now you’re ready for the pay-out–magically receiving pings from the ether every time the blog you’ve subscribed to has new content! You’ll receive inbound requests to your specified callback URL without any additional query parameters added (i.e. you’ll know it’s a ping and not a verification because there won’t be any hub.mode parameter included). Instead, the new entries of the subscribed feed will be included directly in the POST body of the request, with a request Content-Type of application/atom+xml for ATOM feeds and application/rss+xml for RSS feeds. Depending on your programming language of choice, you’ll need to figure out how to extract the raw POST body contents. For instance, in PHP you would fopen the special filename php://input to read it.
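
As a rough Python analogue of that php://input trick, here’s a bare-bones http.server sketch that reads the raw POST body of an incoming ping (verification GETs from step 3 are omitted, and in a real deployment this logic would live inside whatever framework already serves your callback URL):

from http.server import BaseHTTPRequestHandler, HTTPServer

class PushCallback(BaseHTTPRequestHandler):
    def do_POST(self):
        # Pings carry no hub.mode parameter; the new entries are the raw POST body itself.
        length = int(self.headers.get("Content-Length", 0))
        raw_feed = self.rfile.read(length)
        content_type = self.headers.get("Content-Type", "")  # application/atom+xml or application/rss+xml
        print("ping received: %s, %d bytes" % (content_type, len(raw_feed)))
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), PushCallback).serve_forever()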

5. Get the latest content from updated blogs as you receive pings

The ping is really telling you two things: 1) this blog has updated content, and 2) here it is. The advantage of providing the content directly in the ping (a so-called “fat ping”) is so that the subscriber doesn’t have to go re-crawl the feed to get the updated content. Not only is this a performance savings (especially when you consider that lots of subscribers may get pings for a new blog post at roughly the same time, and they might otherwise all crawl that blog at the same time for the new contents; the so-called “thundering herd” problem), it’s also a form of robustness since some blogging systems take a little while to update their feeds when a new post is published (especially for large blogging systems that have to propagate changes across multiple data-centers or update caching tiers), so it’s possible you’ll receive a ping before the content is available to crawl directly. For these reasons and more, it’s definitely a best-practice to consume the fat ping directly, rather than just using it as a hint to go crawl the blog again (i.e. treating it as a “light ping”).

That being said, most crawling systems are designed just to poll URLs and look for new data, so it may be easier to start out by taking the “light ping” route. In other words, when you receive a PuSH ping, look up the feed ID from the URL of the request you’re handling, and assuming that feed is still valid, just schedule it to crawl ASAP. That way, you don’t have to change the rest of your crawling infrastructure; you just treat the ping as a hint to crawl now instead of waiting for the next regular polling interval. While sub-optimal, in my experience this works pretty well and is very easy to implement. (It’s certainly a major improvement over just polling with no PuSH support!) If you’re worried about crawling before the new content is in the feed, and you don’t mind giving up a bit of speed, you can schedule your crawler for “in N seconds” instead of ASAP, which in practice will allow a lot of slow-to-update feeds to catch up before you crawl them.

Once you’re ready to handle the fat pings directly, extract the updated feed entries from the POST body of the ping (the payload is essentially an exact version of the full feed you’d normally fetch, except it only contains entries for the new content), and ingest it however you normally ingest new blog content. In fact, you can go even further and make PuSH the default way to ingest blog content–change your polling code to act as a “fake PuSH proxy” and emit PuSH-style updates whenever it finds new entries. Then your core feed-ingesting code can just process all your updated entries in the same way, whether they came from a hub or your polling crawlers.
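
A sketch of that fat-ping handling for an Atom payload might look like this; handle_new_entry is a placeholder for whatever ingestion path your polling crawlers already use.

import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def handle_fat_ping(feed_id, raw_body):
    # The payload mirrors the feed you'd normally fetch, but contains only the new entries.
    root = ET.fromstring(raw_body)
    for entry in root.findall(ATOM + "entry"):
        handle_new_entry(feed_id, {
            "id": entry.findtext(ATOM + "id"),
            "title": entry.findtext(ATOM + "title"),
            "updated": entry.findtext(ATOM + "updated"),
        })

def handle_new_entry(feed_id, entry):
    # Placeholder: hand off to the same code your polling crawlers use for new content.
    print("new entry for feed", feed_id, entry["id"], entry["title"])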

However you handle the pings, once you find that things are working reliably, you can change the polling interval for PuSH-enabled blogs to be much slower, or even turn it off completely, if you’re not worried about ever missing a ping. In practice, slow polling (e.g. once a day) is probably still a good hedge against the inevitable clogs in the internet’s tubes.

6. Unsubscribe from feeds when they’re deleted from your system

Sometimes users will delete their account on your system or unhook one of their feeds from their account. To be a good citizen, rather than just waiting for the next time the hub sends a subscription verification request to tell it you no longer care about this feed, you should send the hub an unsubscribe request when you know the feed is no longer important to you. The process is identical to subscribing to a feed (as described in steps 2 and 3), except you use “unsubscribe” instead of “subscribe” for the hub.mode values in all cases.

Testing your implementation

Now that you know all the steps needed to implement PuSH subscriber support, it’s time to test your code in the wild. Probably the easiest way is to hook up that http://pubsubhubbub-example-app.appspot.com/ feed, since you can easily add content to it to test pings, and it’s known to have valid hub-discovery metadata. But you can also practice with any blog that is PuSH-enabled (perhaps your shiny new Google Buzz public posts feed?). In any case, schedule it to be crawled normally, and verify that it correctly extracts the hub-link and self-link and adds the self-link to your feed database.

The first time it finds these links, it should trigger a subscription request. (On subsequent crawls, it shouldn’t try to subscribe again, since the topic URL hasn’t changed.) Verify that you’re sending a request to the hub that includes all the necessary parameters, and verify that it’s sending you back a 202 response. If it’s not working, carefully check that you’re sending all the right parameters.

Next, verify that upon sending a subscription request, you’ll soon get an inbound verification request from the hub. Make sure you detect requests to your callback URL with hub.mode=subscribe, and that you are checking the hub.verify_token value against the value you sent in the subscription request, and then that you’re sending the hub.challenge value as your response body. Unfortunately, it’s usually not easy to inspect the hub directly to confirm that it has properly verified your subscription, but hopefully some hubs will start providing site-specific dashboards to make this process more transparent. In the meantime, the best way to verify that things worked properly is to try making test posts to the blog and looking for incoming pings.

So add a new post on the example blog, or write a real entry on your PuSH-enabled blog of choice, and look in your server logs to make sure a ping came in. Depending on the hub, the ping may come nearly instantaneously or after a few seconds. If you don’t see it after several seconds, something is probably wrong, but try a few posts to make sure you didn’t just miss it. Look at the specific URL that the hub is calling on your site, and verify that it has your feed ID in the URL, and that it does indeed match the feed that just published new content. If you’re using the “light ping” model, check that you scheduled your feed to crawl ASAP. If you’re using the “fat ping” model, check that you correctly ingested the new content that was in the POST body of the ping.

Once everything appears to be working, try un-hooking your test feed (and/or deleting your account) and verify that it triggers you to send an unsubscribe request to the hub, and that you properly handle the subsequent unsubscribe verification request from the hub.

If you’ve gotten this far, congratulations! You are now part of the real-time-web! Your users will thank you for making their content show up more quickly on your site, and the sites that publish those feeds will thank you for not crawling them as often, now that you can just sit back and wait for updates to be PuSH-ed to you. And I and the rest of the community will thank you for supporting open standards for a decentralized social web!

(Thanks to Brett Slatkin for providing feedback on a draft of this post!)

HipChat is consumer-meets-enterprise done right — check it out!

Three of Plaxo’s best engineers and designers left almost a year ago to start a new company (much as they’d done a few years ago with HipCal, which Plaxo acquired in 2006). After a brief private beta, today they are launching to the public.

Meet HipChat. It’s a new (and, IMO, very clever and promising) approach to group collaboration within companies and teams–essentially group chat plus file-sharing done with the simplicity and polish of a great consumer app, but targeted at the enterprise. And it’s meant to spread organically and bottom-up by attracting enthusiastic team members who really find it useful, rather than top-down through long sales-cycles to CIOs–in other words, winning by actually being better for the people that use it every day. You’ll be able to tell this from the moment you start using it–it’s distinctly “un-enterprise-y” in all the right ways, yet every enterprise needs something like this to be more productive and organized.


[ More HipChat screenshots ]

I’m excited about HipChat for several reasons:

First, the founders (Pete Curley, Garret Heaton, and Chris Rivers) are all rockstar talents and super nice guys; the best of the young web 2.0 “bootstrap from nothing and build something genuinely good that grows because people are using and loving it” approach that’s only become feasible recently. Whatever they work on, I know it’ll be well thought through and well executed, and it’ll keep getting better over time. These are good guys to know and watch, and they’re just getting started.

Second, group collaboration is a space that everyone knows is important, yet one that nothing today does a good job of solving. At Plaxo we’ve tried tons of wikis, internal blogs, mailing lists, document depots, dashboards, you name it. They’re always too complicated and cumbersome and never have streamlined workflows that work the way you need. One of my early surprises coming to Google is that for all their efforts and internal tools, the situation is ultimately not much better. Information is still spread everywhere across a variety of systems, is too hard to find and curate, and too often forces you to just ask the person next to you and hope for the best. Maybe new tools like Google Wave will make a difference here, but of course the more flexible and general-purpose a tool like that is, the greater the risk that it will do too many things and none of them just the way you want. HipChat may not be the magic solution to this complex problem either, but it’s refreshing to see the team apply a consumer-app eye and discipline to the problem–focusing on specific task arcs to really nail, and an end-to-end polish and friendliness that’s so clearly lacking from most other groupware tools.

This last point deserves its own slot: in my experience, the only way to really advance the state of technology that makes a real difference in the lives of real people is to subject it to the harsh Darwinian landscape of consumer software and devices, where if it doesn’t “just work” and provide a compelling and enjoyable experience, it doesn’t get used. This is the sharpening steel that’s honed all the best apps we have today, from gmail to facebook to the iPhone to boxee, and so on. And if you think about it, it’s the missing piece that makes most enterprise software so terrible–your company buys it, and you’re stuck with it, like it or not. The typical enterprise “fitness function” yields a much slower and sloppier rate of evolution than the consumer one, and that I believe is the main reason the quality of the two classes of apps differs so much. So it’s great to see an increase in companies willing to try and swim upstream to gain corporate adoption with a consumer mindset, whether it’s Google Apps, box.net, Yammer, or now HipChat.

If you work on a team, if you’re dissatisfied with the state of collaboration tools, or if you just want to see a really well done new app, I encourage you to check out HipChat. We used several early betas inside Plaxo, and while any new communications tool faces an uphill battle to gain critical mass of adoption and change old habits, enough of us had enough “eureka moments” using HipChat to see its strong potential and to wish that we could fast-forward time until more people are using it and it’s had even more love put into it. The next best thing we can do is to spread the good word and give our support to the team, so consider this post a down payment!

Joseph Smarr has new work info…

High on my to-do list for 2010 will be to update my contact info in Plaxo, because I’ll be starting a new job in late January. After nearly 8 amazing years at Plaxo, I’m joining Google to help drive a new company-wide focus on the future of the Social Web. I’m incredibly excited about this unique opportunity to turbo-charge my passionate pursuit of a Social Web that is more open, interoperable, decentralized, and firmly in the control of users.

I’ve worked closely with Google as a partner in opening up the social web for several years, and they’ve always impressed me with their speed and quality of execution, and more importantly, their unwavering commitment to do what’s right for users and for the health of the web at large. Google has made a habit of investing heavily and openly in areas important to the evolution of the web–think Chrome, Android, HTML5, SPDY, PublicDNS, etc. Getting the future of the Social Web right–including identity, privacy, data portability, messaging, real-time data, and a distributed social graph–is just as important, and the industry is at a critical phase where the next few years may well determine the platform we live with for decades to come. So when Google approached me recently to help coordinate and accelerate their innovation in this area, I could tell by their ideas and enthusiasm that this was an opportunity I couldn’t afford to pass up.

Now, anyone who knows me should immediately realize two things about this decision–first, it in no way reflects a lack of love or confidence from me in Plaxo, and second, I wouldn’t have taken this position if I hadn’t convinced myself that I could have the greatest possible impact at Google. For those that don’t know me as well personally, let me briefly elaborate on both points:

I joined Plaxo back in March of 2002 as their first non-founder employee, before they had even raised their first round of investment. I hadn’t yet finished my Bachelor’s Degree at Stanford, and I’d already been accepted into a research-intensive co-terminal Masters program there, but I was captivated by Plaxo’s founders and their ideas, and I knew I wanted to be a part of their core team. So I spent the first 15 months doing essentially two more-than-full-time jobs simultaneously (and pretty much nothing else). Since that time, I’ve done a lot of different things for Plaxo–from web development to natural language processing to stats collection and analysis to platform architecture, and most recently, serving as Plaxo’s Chief Technology Officer. Along the way, I’ve had to deal with hiring, firing, growth, lack of growth, good press, bad press, partnerships with companies large and small, acquisitions–both as the acquirer and the acquiree–and rapidly changing market conditions (think about it: we started Plaxo before users had ever heard of flickr, LinkedIn, friendster, Gmail, Facebook, Xobni, Twitter, the iPhone, or any number of other companies, services, and products that radically altered what it means to “stay in touch with the people you know and care about across all the tools and services that you and they use”). When I joined Plaxo, there were four of us. Now we have over 60 employees, and that’s not counting our many alumni. All of this is to make the following plain: Plaxo has been my life, my identity, my passion, and my family for longer than I’ve known my wife, longer than I was at Stanford, and longer than I’ve done just about anything before. Even at a year-and-a-half since our acquisition by Comcast, Plaxo has the same magic and mojo that’s made it a joy and an honor to work for all these years. And with our current team and strategic focus, 2010 promises to be one of the best years yet. So I hope this makes it clear that I was not looking to leave Plaxo anytime soon, and that the decision to do so is one that I did not make lightly.

Of all the things I’ve done at Plaxo over the years, my focus on opening up the Social Web over the past 3+ years is the work I’m proudest of, and the work that I think has had the biggest positive impact–both for Plaxo and the web itself. Actually, it really started way back in 2004, when I first read about FOAF and wrote a paper about its challenges from Plaxo’s perspective, for which I was then selected to speak at my first industry conference, the FOAF Workshop in Galway, Ireland. Since that time, I realized what a special community of people there were that cared about these issues in a web-wide way, and I tried to participate on the side and in my free time whenever possible. After leading Plaxo’s web development team to build a rich and complex new AJAX address book and calendar (something that also reinforced to me the value of community participation and public speaking, albeit on the topic of high-performance JavaScript), I knew I wanted to work on the Social Web full-time, and luckily it coincided perfectly with Plaxo’s realization that fulfilling our mission required focusing on more than just Outlook, webmail, and IM as important sources of “people data”. So we crafted a new role for me as Chief Platform Architect, and off I went, turning Plaxo into the first large-scale OpenID Relying Party, the first live OpenSocial container, co-creator of the Portable Contacts spec, co-creator and first successful deployment of hybrid onboarding combining OpenID and OAuth, and so on. Along the way I co-authored the Bill of Rights for Users of the Social Web, coined the term Open Stack, was elected to the Boards of both the OpenID Foundation and OpenSocial Foundation, and worked closely with members of the grass-roots community as well as with people at Google, Yahoo, Microsoft, AOL, Facebook, MySpace, Twitter, LinkedIn, Netflix, The New York Times, and others, often as a launch partner or early adopter of their respective forays into supporting these same open standards. And collectively, I think it’s fair to say that our efforts greatly accelerated the arrival, quality, and ubiquity of a Social Web ecosystem that has the potential to be open, decentralized, and interoperable, and that may define the next wave of innovation in this space, much as the birth of the web itself did nearly 20 years ago.

But we’re not done yet. Not by a long shot. And the future is never certain.

At the recent OpenID Summit hosted by Yahoo!, I gave a talk in which I outlined the current technical and user-experience challenges standing in the way of OpenID becoming truly successful and a “no-brainer” for any service large or small to implement. Despite all the progress that we’ve made over the past few years, and that I’ve proudly contributed to myself, there is no shortage of important challenges left to meet before we can reach our aspirations for the Social Web. There is also no shortage of people committed to “fighting the good fight”, but as with any investment for the future with a return that will be widely shared, most people and companies are forced to make tough trade-offs about whether to focus on what already works today or what may work better tomorrow. There are a lot of good people in a lot of places working on the future of the Social Web, and we need them all and more. But in my experience, Google is unmatched in its commitment to doing what’s right for the future of the web and its willingness to think long-term. One need only look at the current crop of Social Web “building blocks” being actively worked on and deployed by Google–including OpenID, OAuth, Portable Contacts, OpenSocial, PubSubHubbub, Webfinger, Salmon, and more–to see how serious they are. And yet they came to me because they want to turn up the intensity and focus and coordination and boldness even more.

I talked to a lot of Googlers before deciding to join, and from the top to the bottom they really impressed me with how genuinely they believe in this cause that I’m so passionate about, and how strong a mandate I feel throughout the company to do something great here. I also heard over and over how surprisingly easy it still is to get things built and shipped — both new products, tools, and specs, as well as integrating functionality into Google’s existing services. And, of course, there are so many brilliant and talented people at Google, and so much infrastructure to build on, that I know I’ll have more opportunity to learn and have an impact than I could ever hope to do anywhere else. So while there are other companies large and small (or perhaps not yet in existence) where I could also have some form of positive impact on the future of the Social Web, after looking closely at my options and doing some serious soul searching, I feel confident that Google is the right place for me, and now is the right time.

Let me end by sincerely thanking everyone that has supported me and worked with me not just during this transition process but throughout my career. I consider myself incredibly fortunate to be surrounded by so many amazing people that genuinely want to have a positive impact on the world and want to empower me to do the best that I can to contribute, even if it means doing so from inside (or outside) a different company. It’s never easy to make big decisions involving lots of factors and rapidly changing conditions, let alone one with such deep personal and professional relationships at its core. Yet everyone has treated me with such respect, honesty, and good faith, that it fills me with a deep sense of gratitude, and reminds me why I so love living and working in Silicon Valley.

2010 will be an exciting and tumultuous year for the Social Web, and so will it be for me personally. Wish us both luck, and here’s to the great opportunities that lie ahead!

What an RP Wants, Part 2 (OpenID Summit 2009)

What an RP Wants, Part 2
OpenID Summit 2009 (Hosted by Yahoo!)
Mountain View, CA
November 2, 2009

Download PPT (2.1 MB)

I was invited to give a talk at the OpenID Summit as a follow-up to my talk “What an RP Wants”, which I gave in February at the OpenID Design Summit. In both cases, I shared my experiences from Plaxo’s perspective as a web site that is trying to succeed at letting users sign up using accounts they already have on Google, Yahoo, and other OpenID Provider sites. This talk reviewed the progress we’ve made as a community since February, and laid out the major remaining challenges to making it a truly successful end-to-end experience to be an OpenID Relying Party (RP).

My basic message was this: we’ve made a lot of progress, but we’ve still got a lot left to do. So let’s re-double our efforts and commit ourselves once again to working together and solving these remaining problems. As much success as OpenID has had to date, its continued relevance is by no means guaranteed. But I remain optimistic because the same group of people that have brought us this far are still engaged, and none of the remaining challenges are beyond our collective ability to solve.

See more coverage of the OpenID Summit, including my talk, at The Real McCrea.

And here are a couple of video excerpts from my talk:

The Social Web: An Implementer’s Guide (Google I/O 2009)

The Social Web: An Implementer’s Guide
Google I/O 2009
San Francisco, CA
May 28, 2009

Download PPT (7.3 MB)

Google invited me back for a second year in a row to speak at their developer conference about the state-of-the-art of opening up the social web. While my talk last year laid out the promise and vision of an interoperable social web ecosystem, this year I wanted to show all the concrete progress we’ve made as an industry in achieving that goal. So my talk was full of demos–signing up for Plaxo with an existing Gmail account in just two clicks, using MySpaceID to jump into a niche music site without a separate sign-up step, ending “re-friend madness” by honoring Facebook friend connections on Plaxo (via Facebook Connect), killing the “password anti-pattern” with user-friendly contact importers from a variety of large sites (demonstrated with FriendFeed), and sharing activity across sites using Google FriendConnect and Plaxo. Doing live demos is always a risky proposition, especially when they involve cross-site interop, but happily all the demos worked fine and the talk was a big success!

I began my talk by observing that the events of the last year have made it clear: The web is going social, and the social web is going open. By the end of my talk, having shown so many mainstream sites with deep, user-friendly interoperability, I decided to go a step further and declare: The web is now social, and the social web is now open. You don’t have to wait any longer to start reaping the benefits. It’s time to dive in.

Test-Driving the New Hybrid

The quest to open up the Social Web is quickly shifting from a vision of the future to a vision of the present. Last week we reached an important milestone in delivering concrete benefits to mainstream users from the Open Stack. Together with Google, we released a new way to join Plaxo–without having to create yet-another-password or give away your existing password to import an address book. We’re using a newly developed “hybrid protocol” that blends OpenID and OAuth so Gmail users (or any users of a service supporting these open standards) can, in a single act of consent, create a Plaxo account (using OpenID) and grant access to the data they wish to share with Plaxo (using OAuth).

We’re testing this new flow on a subset of Plaxo invites sent to @gmail.com users, which means we can send those users through this flow without having to show them a long list of possible Identity Provider choices, and without them having to know their own OpenID URL. The result is a seamless and intuitive experience for users (“Hey Plaxo, I already use Gmail, use that and don’t make me start from scratch”) and an opportunity for both Plaxo and Google to make our services more interoperable while reducing friction and increasing security. I’m particularly excited about this release because it’s a great example of “putting the pieces together” (combining multiple Open Stack technologies in such a way that the whole is greater than the sum of the parts), and it enables an experience that makes a lot of sense for mainstream users, who tend to think of “using an existing account” as a combination of identity (“I already have a gmail password”) and data (“I already have a gmail address book”). And, of course, because this integration is based on open standards, it will be easy for both Google and Plaxo to turn around and do similar integrations with other sites, not to mention that the lessons we learn from these experiments will be helpful to any sites that want to build a similar experience.

To learn more about this integration, you can read Plaxo’s blog post and the coverage on TechCrunch, VentureBeat, or ReadWriteWeb (which got syndicated to The New York Times, neat!), and of course TheSocialWeb.tv. But I thought I’d take a minute to explain a bit more about how this integration works “under the hood”, and also share a bit of the backstory on how it came to be.

Under the hood

For those interested in the details of how the OpenID+OAuth hybrid works, and how we’re using it at Plaxo, here’s the meat: it’s technically an “OAuth Extension” for OpenID (using the standard OpenID extension mechanism already used by simple registration and attribute exchange) where the Relying Party asks the Identity Provider for an OAuth Request Token (optionally limited to a specific scope, e.g. “your address book but not your calendar data”) as part of the OpenID login process. The OP recognizes this extension and informs the user of the data the RP is requesting as part of the OpenID consent page. If the user consents, the OP sends back a pre-authorized OAuth Request Token in the OpenID response, which the RP can then exchange for a long-lived Access Token following the normal OAuth mechanism.
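
To make the flow concrete, here’s roughly what the extra parameters look like, written out as Python dicts and using the parameter names from the OpenID OAuth Extension draft; treat the exact names and values as illustrative assumptions and check the current spec before relying on them.

# Added by the RP to its normal OpenID authentication request
# (parameter names per the OpenID OAuth Extension draft; verify against the current spec):
oauth_extension_request = {
    "openid.ns.oauth": "http://specs.openid.net/extensions/oauth/1.0",
    "openid.oauth.consumer": "plaxo.com",                      # the RP's pre-registered OAuth consumer key
    "openid.oauth.scope": "http://www.google.com/m8/feeds/",   # e.g. address book but not calendar data
}

# If the user consents, the OP's positive assertion comes back with a pre-authorized
# request token, which the RP then exchanges server-to-server (signed with its consumer
# secret) for a long-lived access token via the normal OAuth access token endpoint:
oauth_extension_response = {
    "openid.oauth.request_token": "<pre-authorized request token>",
    "openid.oauth.scope": "http://www.google.com/m8/feeds/",
}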

Note that RPs still need to obtain an OAuth Consumer Key and Secret offline beforehand (we’ve worked on ways to support unregistered consumers, but they didn’t make it into the spec yet), but they *don’t* have to get an Unauthorized Request Token before initiating OpenID login. The point of obtaining a Request Token separately is mainly to enable desktop and mobile OAuth flows, where popping open a web browser and receiving a response isn’t feasible. But since OpenID login is always happening in a web browser anyway, it makes sense for the OP to generate and pre-authorize the Request Token and return it via OpenID. This also frees the RP from the burden of having to deal with fetching and storing request tokens–given especially the rise in prominence of “directed identity” logins with OpenID (e.g. the RP just shows a “sign in with your Yahoo! account” button, which sends the OpenID URL  “yahoo.com” and relies on the OP to figure out which user is logging in and return a user-specific OpenID URL in the response), the RP often can’t tell in advance which user is trying to log in and whether they’ve logged in before, and thus in the worst case they might otherwise have to generate a Request Token before every OpenID login, even though the majority of such logins won’t end up doing anything with that token. Furthermore, the OP can feel confident that they’re not inadvertently giving away access to the user’s private data to an attacker, because a) they’re sending the request token back to the openid.return_to URL, which has to match the openid.realm which is displayed to the user (e.g. if the OP says “Do you trust plaxo.com to see your data”, they know they’ll only send the token back to a plaxo.com URL), and b) they’re only sending the Request Token in the “front channel” of the web browser, and the RP still has to exchange it for an Access Token on the “back channel” of direct server-to-server communication, which also requires signing with a Consumer Secret. In summary, the hybrid protocol is an elegant blend of OpenID and OAuth that is relatively efficient on the wire and at least as secure as each protocol on their own, if not more so.

Once a user signs up for Plaxo using the hybrid protocol, we can create an account for them that’s tied to their OpenID (using the standard recipe) and then attach the OAuth Access Token/Secret to the new user’s account. Then instead of having to ask the user to choose from a list of webmail providers to import their address book from, we see that they already have a valid Gmail OAuth token and we can initiate an automatic import for them–no passwords required! (We’re currently using Google’s GData Contacts API for the import, but as I demoed in December at the Open Stack Meetup, soon we will be able to use Portable Contacts instead, completing a pure Open Stack implementation.) Finally, when the user has finished setting up their Plaxo account, we show them a one-time “education page” that tells them to click “Sign in with your Google account” next time they return to Plaxo, rather than typing in a Plaxo-specific email and password (since they don’t have one).

However, because Google’s OP supports checkid_immediate, and because Plaxo sets a cookie when a user logs in via OpenID, in most cases we can invisibly and automatically keep the user logged into Plaxo as long as they’re still logged into Gmail. Specifically, if the user is not currently logged into Plaxo, but they previously logged in via OpenID, we attempt a checkid_immediate login (meaning we redirect to the OP and ask them if the user is currently logged in, and the OP immediately redirects back to us and tells us one way or the other). If we get a positive response, we log the user into Plaxo again, and as far as the user can tell, they were never signed out. If we get a negative response, we set a second cookie to remember that checkid_immediate failed, so we don’t try it again until the user successfully signs in. But the net result is that even though the concept of logging into Plaxo using your Google account may take some getting used to for mainstream users, most users will just stay automatically logged into Plaxo (as long as they stay logged into Gmail, which for most Gmail users is nearly always).
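
Here’s a small sketch of that decision logic in Python; the cookie names and return values are hypothetical placeholders rather than Plaxo’s actual implementation.

def decide_login_action(cookies):
    """Decide whether to attempt a silent checkid_immediate login on a page view."""
    if cookies.get("session"):
        return "already_logged_in"        # a normal Plaxo session cookie is still valid
    if not cookies.get("logged_in_via_openid"):
        return "do_nothing"               # this browser never signed in with OpenID
    if cookies.get("checkid_immediate_failed"):
        return "do_nothing"               # a silent attempt already failed; wait for a manual sign-in
    # Otherwise redirect to the OP with openid.mode=checkid_immediate; a positive response
    # logs the user back in invisibly, and a negative response sets the
    # "checkid_immediate_failed" cookie so we don't keep retrying on every page view.
    return "try_checkid_immediate"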

The backstory

The concept of combining OpenID and OAuth has been around for over a year. After all, they share a similar user flow (bounce over to provider, consent, bounce back to consumer with data), and they’re both technologies for empowering users to connect the websites they use (providing the complementary capabilities of Authentication and Authorization, respectively). David Recordon and I took a first stab at specifying an OpenID OAuth Extension many months ago, but the problem was there were no OpenID Providers that also supported OAuth-protected APIs yet, so it wasn’t clear who could implement the spec and help work out the details. (After all, OAuth itself was only finalized as a spec in December 07!). But then Google started supporting OAuth for its GData APIs, and they subsequently became an OpenID provider. Yahoo! also became hybrid-eligible (actually, they became an OpenID provider before Google, and added OAuth support later as part of Y!OS), and MySpace adopted OAuth for its APIs and shared their plans to become an OpenID provider as part of their MySpaceID initiative. Suddenly, there was renewed interest in finishing up the hybrid spec, and this time it was from people in a position to get the details right and then ship it.

The Google engineers had a bunch of ideas about clever ways to squeeze out extra efficiency when combining the two protocols (e.g. piggybacking the OAuth Request Token call on the OpenID associate call, or piggybacking the OAuth Access Token call on the OpenID check_authentication call). They also pointed out that given their geographically distributed set of data centers, their “front channel” and “back channel” servers might be on separate continents, so assuming that data could instantly be passed between them (e.g. generating a request token in one data center and then immediately showing it on an authorization page from another data center) wouldn’t be trivial (the solution is to use a deterministically encrypted version of the token as its own secret, rather than storing that in a database or distributed cache). As we considered these various proposals, the tension was always between optimizing for efficiency vs. “composability”–there were already a decent number of standalone OpenID and OAuth implementations in the wild, and ideally combining them shouldn’t require drastic modifications to either one. In practice, that meant giving up on a few extra optimizations to decrease overall complexity and increase the ease of adoption–a theme that’s guided many of the Open Stack technologies. As a proof of the progress we made on that front, the hybrid implementation we just rolled out used our existing OpenID implementation as is, and our existing OAuth implementation as is (e.g. that we also used for our recent Netflix integration), with no modifications required to either library. All we did was add the new OAuth extension to the OpenID login, as well as some simple logic to determine when to ask for an OAuth token and when to attach the token to the newly created user. Hurrah!

A few more drafts of the hybrid spec were floated around for a couple months, but there were always a few nagging issues that kept us from feeling that we’d nailed it. Then came the Internet Identity Workshop in November, where we held a session on the state of the hybrid protocol to get feedback from the larger community. There was consensus that we were on the right track, and that this was indeed worth pursuing, but the nagging issues remained. Until that night, when as in IIW tradition we all went to the nearby Monte Carlo bar and restaurant for dinner and drinks. Somehow I ended up at a booth with the OpenID guys from Google, Yahoo, and Microsoft, and we started rehashing those remaining issues and thinking out loud together about what to do. Somehow everything started falling into place, and one by one we started finding great solutions to our problems, in a cascade that kept re-energizing us to keep working and keep pushing. Before I knew it, it was after midnight and I’d forgotten to ever eat any dinner, but by George we’d done it! I drove home and frantically wrote up as many notes from the evening as I could remember. I wasn’t sure what to fear more–that I would forget the breakthroughs that we’d made that night, or that I would wake up the next morning and realize that what we’d come up with in our late night frenzy was in fact totally broken. 🙂 Thankfully, neither of those things happened, and we ended up with the spec we’ve got today (plus a few extra juicy insights that have yet to materialize).

It just goes to show that there’s still no substitute for locking a bunch of people in a room for hours at a time to focus on a problem. (Though in this case, we weren’t so much locked in as enticed to stay with additional drink tickets, heh.) And it also shows the power of collaborating across company lines by developing open community standards that everyone can benefit from (and thus everyone is incentivized to contribute to). It was one of those amazing nights that makes me so proud and grateful to work in this community of passionate folks trying to put users in control of their data and truly open up the Social Web.

What’s next?

Now that we’ve released this first hybrid experiment, it’s time to analyze the results and iterate to perfection (or as close as we can get). While it’s too early to report on our findings, let me just say that I’m *very* encouraged by the early results we’re seeing. 😉 Stay tuned, because we’ll eagerly share what we’ve learned as soon as we’ve had time to do a careful analysis. I foresee this type of onboarding becoming the norm soon–not just for Plaxo, but for a large number of sites that want to streamline their signup process. And of course the best part of doing this all with open standards is that everyone can learn along with us and benefit from each other’s progress. Things are really heating up and I couldn’t be more excited to keep forging ahead here!

Portable Contacts: The (Half) Year in Review

I’m excited and humbled by the amazing progress we’ve made this year on Portable Contacts, which started out as little more than a few conversations and an aspirational PowerPoint deck this summer. We’ve now got a great community engaged around solving this problem (from companies large and small as well as from the grass-roots), we had a successful Portable Contacts Summit together, we’ve got a draft spec that’s getting pretty solid, we’ve got several implementations in the wild (with many more in the works), we’ve achieved wire-alignment with OpenSocial’s RESTful people API, and we’ve seen how Portable Contacts when combined with other “open building blocks” like OpenID, OAuth, and XRD creates a compelling “Open Stack” that is more than the sum of its parts.

At the recent Open Stack Meetup hosted by Digg, I gave a presentation on the state of Portable Contacts, along with several demos of Portable Contacts in action (and our crew from thesocialweb.tv was on hand to film the entire set of talks). In addition to showing Portable Contacts working with Plaxo, MySpace, OpenSocial, and Twitter (via hCard->vCard->PoCo transformers), I was thrilled to be able to give the first public demo of Portable Contacts working live with Gmail. Better still, I was able to demo Google’s hybrid OpenID+OAuth onboarding combined with its OAuth-protected Portable Contacts API for Gmail. In other words, in one fell swoop I was able to sign up for a Plaxo account using my existing Google account, and I was able to bring over my Google credentials, my pre-validated gmail.com e-mail address, and my Gmail address book–all at once, and all in an open, secure, and vendor-neutral way. Now that’s progress worth celebrating!
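If you’re curious what that last piece looks like from the consumer side, here’s a rough sketch of a Portable Contacts fetch using the OAuth token obtained during onboarding. The base URL is a placeholder (each provider publishes its own endpoint), and I’ve left OAuth request signing out of scope by passing in a pre-built Authorization header; the response shape (paging fields plus an "entry" array of contact objects) follows the draft spec.

    # Rough sketch: fetch the authorized user's contacts from a Portable
    # Contacts endpoint. The base URL below is an assumed placeholder.
    import json
    import urllib.request

    POCO_BASE = "https://provider.example.com/portablecontacts/v1"  # assumed endpoint

    def fetch_contacts(oauth_authorization_header):
        # "@me/@all" is the standard Portable Contacts path for "all of my contacts".
        req = urllib.request.Request(
            POCO_BASE + "/@me/@all",
            headers={"Authorization": oauth_authorization_header},
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        # Each entry carries fields like id, displayName, and emails.
        for person in data.get("entry", []):
            emails = [e.get("value") for e in person.get("emails", [])]
            print(person.get("displayName"), emails)
        return data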

I have no doubt that we’re on the cusp of what will become the default way to interact with most new websites going forward. The idea that you had to re-create an account, password, profile, and friends-list on every site that you wanted to check out, and that none of that data or activity flowed with you across the tools you used, will soon seem archaic and quaint. And if you think we came a long way in 2008, you ain’t seen nothing yet! There has never been more momentum, more understanding, and more consolidated effort behind opening up the social web, and the critical pieces–yes, including Portable Contacts–are all but in place. 2009 is going to be a very exciting year indeed!

So let me close out this amazing year by saying Thank You to everyone who’s contributed to this movement. Your passion is infectious and your efforts are all having a major and positive impact on the web. I feel incredibly fortunate to be a part of it, and I know our best days are still ahead of us. Happy New Year!

Web 2.0/Web 3.0 Mashup (EmTech08)

Web 2.0/Web 3.0 Mashup
Emerging Technologies Conference at MIT (EmTech08)
Boston, MA
September 24, 2008

(Photo: Valleywag)

I was invited to speak on a panel about the future of the web at EmTech, the annual conference on emerging technologies put on by MIT’s TechnologyReview Magazine. The conference spans many disciplines (alternative energy, cloud computing, biotech, mobile, etc.), and we were the representatives of the consumer internet, which was quite a humbling task! Robert Scoble moderated the panel, which featured me, David Recordon, Dave Morin, and Nova Spivak.

It was a loose and lively back-and-forth discussion of the major trends we see on the web today: it’s going social, it’s going open, it’s going real-time, and it’s going ubiquitous. These trends are all working together: it’s now common (at least in Silicon Valley) to use your iPhone on the go to see what articles/restaurants/etc. your friends have recommended from a variety of distributed tools, aggregated via FriendFeed, Plaxo Pulse, or Facebook. A lot of the vision behind the Semantic Web (structured data enabling machine-to-machine communication on a user’s behalf) is now happening, but it’s doing so bottom-up, with open standards that let users easily create content online and share it with people they know. As the audience could clearly tell from our passionate and rapid-fire remarks, this is an exciting and important time for the web.

We got lots of positive feedback on our panel from attendees (and also via twitter, of course), as well as from the TR staff. We even received the distinct honor of attracting snarky posts from both Valleywag and Fake Steve Jobs (if you don’t know the valley, trust me: that’s a good thing). You can watch a video of the entire panel on TechnologyReview’s website.

I must say I’m quite impressed with TechnologyReview and EmTech. They do a good job of pulling together interesting people and research from a variety of technical frontiers and making it generally accessible without dumbing it down. The piece they wrote recently on opening up the social web (which featured a full-page photo of yours truly diving into a large bean bag) was perhaps the most insightful mainstream coverage to date of our space. They gave me a free one-year subscription to TR for speaking at EmTech, and I’ll definitely enjoy reading it. Here’s looking forward to EmTech09!
