Joseph Smarr

Thoughts on web development, tech, and life.

Leaders in Tech podcast appearance (part 2)

After my initial conversation on the Leaders in Tech podcast, the host asked me to come back and follow up with more of my thoughts on AI, the future of work, and what we humans will or won’t still want or need to do. We discussed what we can learn from the study of the human brain, and in particular how the pattern-matching cerebral cortex is distinct from the goal-oriented “old brain”, the latter of which is still largely missing from the AI models we’re building. While a lot of knowledge work will undoubtedly be augmented if not replaced by AI over time, we reflected on how much of being an effective leader in tech (or in most professions) still comes down to innately human characteristics like passion, empathy, and group coordination, as well as how we will continue to be driven by work that affords us autonomy, mastery, and purpose, even if it becomes disconnected from how we provide for our basic needs.

Throughout the interview, you will hear why I am still fundamentally optimistic about “team human” and our potential to thrive in a world of technological abundance, which AI can help us usher in (if we don’t mess things up in the meantime, of course!).

Leaders in Tech podcast appearance

I was recently a guest on the Leaders in Tech podcast. We covered a lot of ground, from my childhood and how I got interested in tech and AI, to many lessons I learned working in startups (first Plaxo, now Triller) and inside big companies (Google+, Google Photos, Google Assistant). In particular, the conversation focused on the advantages startups have when it comes to driving innovation, and why, despite larger organizations’ advantages in resources and distribution, it’s hard to get the same results inside them. We finished with a discussion of how AI is likely to impact the careers of software engineers (my bet is it will remain more of an amplifier than a replacement for years to come).

I think this is one of the best summaries of my experience and thoughts on Silicon Valley and entrepreneurship that I’ve managed to capture. I hope you’ll find it useful and would love to hear your feedback!

Returning to my startup roots

After nearly 12 years at Google, the last 5 of which I’ve spent leading core conversational technology for Google Assistant, I’m excited to share that I’m joining TrillerNet as Chief Technology Officer. I will always love Google and remain super bullish on Assistant’s future, but here’s why I found this opportunity so enticing.

The internet and social networking were supposed to enable us to connect in meaningful ways with our friends and the artists, thinkers, and brands we care about. But it’s nearly 20 years since my last startup, Plaxo, helped usher in this “social web” phase, and it’s clear by now that this is not the future we were promised. Sure, the people you want to follow are now on social media, but we mostly receive undifferentiated, one-way broadcasts from them. They don’t really know who we are, what we’ve seen and done, or what we’re interested in, and what personalization does exist comes more from “surveillance capitalism” than real two-way connection.

It doesn’t need to be this way: the data we need to differentiate ourselves as consumers (what we read, watch, listen to, purchase; the places we visit, etc.) all leave “digital breadcrumbs” now (in addition to our social media comments/likes). Recent advances in AI (including Natural Language Processing, unsupervised clustering, large language models, and more) have given us the tools to understand all of that data well enough to enable a new level of two-way personal engagement at scale. But since this is all happening across multiple sites and services, a neutral arbiter is needed to tie it all together–a company that deeply understands the needs of both creators and consumers and can develop the technology to help them connect like never before.

Enter TrillerNet, the unlikely but oddly ideally-positioned rocket ship, built from a remarkable convergence of multiple startups spanning technology, entertainment, and the burgeoning creator economy. It combines (1) the “AI-driven conversational superpowers” of Amplify.ai (which originally caught my eye), (2) deep experience and credibility with the creator/influencer community from Triller and Proxima’s backing, and (3) the ability to repeatedly create marquee cultural moments with Verzuz and FITE TV. Bold new initiatives like CLIQZ hint at the massive potential at the intersection of those three core differentiators. And it’s still early days.

I’ve been excited about Amplify.ai’s technology and success in both the commercial and political arenas since their CMO (and my longtime friend and collaborator from Plaxo), John McCrea, joined them several years back. Earlier this year, after their acquisition by TrillerNet, Amplify.ai CEO Mahi de Silva became CEO of the overall TrillerNet conglomerate. He sensed the coming “1 + 1 = 11” opportunity to pair the strong creator relationships and cultural engine of Triller with the breakthrough conversational AI capabilities of Amplify.ai and recognized that, with my two decades in Silicon Valley focusing on social networking, identity, and data portability on the one hand and NLP and conversational AI on the other hand, I was the “unlikely but oddly ideally positioned” CTO to lead the company’s next phase of transformational growth.

It’s surely going to be a wild ride. I’m excited to get back to my startup roots and see how the world has changed and what I’ve learned from my time at Google that does and doesn’t carry over. I welcome any advice or support and will have lots more to say as things unfold!

Why I’m all in for Warren

Note: This is the first time I’ve written about politics on my personal blog. It’s also the first time in my political memory where a lot of smart people who live near me don’t already share most of my political views. Given the unprecedented stakes of the 2020 election, and before the voting starts, I feel compelled to make this pitch and hope you’ll see it as a good faith attempt to help us all reach a better outcome.


I feel strongly that Elizabeth Warren has the best chance of beating Donald Trump in 2020 and also the best chance of delivering non-incremental change to the health and well-being of America and its citizens once elected. She’s the candidate this moment in our history calls for, and we should do everything we can to answer that call.

If this doesn’t strike you as obviously true, yet you know me well enough to respect my intellect and opinions on other matters, please read on and get in touch with me to discuss / debate further until I can convince you or we can figure out in detail where our values or views of the world differ. 🙂

Warren is the most electable candidate

All savvy political observers believe that, to win in 2020, the Democratic candidate has to excite and turn out the base in record numbers and hold/persuade enough moderate voters, esp. in battleground states. I worry Biden won’t be able to sufficiently excite the base (I’m having flashbacks to 2016). I worry that Bernie will turn off too many moderates (due to flaunting his “democratic socialist” brand, his “revolutionary” policy proposals, and, more importantly, his general lack of specifics on what his policies would look like in detail and how he’d get them done).

In contrast, I believe Elizabeth Warren can excite the base as much as Bernie without scaring off everyone else. After all, she’s a Harvard Professor who grew up as Betsy, the Republican girl from Oklahoma. She knows both worlds. She wants to restore Capitalism to its proper working function (better regulation and balance of power between capital and labor), not blow it up and replace it with a different system. She has detailed plans for what she intends to do, how she intends to pay for it, and how she intends to get it passed and enacted. There’s no uncertainty about what you’d get with a Warren presidency. And despite being a wonk’s wonk, she’s incredibly relatable and charismatic in person.

Beyond policy and character, she’s shown she can win elections against Republican incumbents, even when starting far behind in the polls. She’s running on a message of anti-corruption against the most corrupt President in history. Everyone (including her opponents) agrees she’s running the best political organization in the field. Put it all together, and there’s no one I’d rather take my chances on competing against Trump in the fall.

There’s (justifiably) so much fear about Trump getting re-elected that many people are feeling risk averse and clinging to what looks like the safest choice (usually: an old white man who won’t rock the boat too much). I believe this is a fundamental misreading of the dynamics at play this year. We should be playing to win, not playing not-to-lose, and that means backing the candidate that we truly believe in. Warren is that candidate for me, hands down, and if you haven’t taken a good look at her yet, please be open to the possibility that she’ll become yours too once you do.

Warren will be the most effective President

As important as beating Trump will be, it’s only the start of the job of fixing what’s broken in America. I’ve spent hundreds of hours over the past many years reading and discussing and trying to understand the root causes of our political and economic dysfunction. I believe Warren has the best understanding of what’s really going on and how to actually begin to fix it. Political capture by moneyed interests and out of control wealth and income inequality have been growing problems for my entire lifetime, yet Warren is the only candidate focused on tackling them head on and with plans bold enough to meet the scale of the challenge. As she points out, you can’t win on any other policy areas we care about (climate change, expanded safety net, etc.) until you can solve the root problem that’s been preventing progress on them (“corruption” writ large).

She’s also the savviest when it comes to how to make real change in Washington. Her ability to create the Consumer Financial Protection Bureau is justifiably legendary, and she’s the most clear-eyed about what she can accomplish through the Executive Branch and how she can get the rest through Congress even with a narrow majority (e.g. eliminating the filibuster and passing what she can via budget reconciliation), something Biden and Bernie still refuse to touch, thus essentially neutering their policy agenda. She understands that “personnel is policy” and has been spending considerable time identifying the right cabinet and other positions to best achieve her goals. In short, she’s got a plan for her plans.

Finally, if she can start enacting her plans, I believe real Americans will, for the first time in a while, really feel a noticeable positive impact on their lives (access to childcare and education, student loan forgiveness, etc.). This in turn, I believe, will strengthen her mandate to do more, since it will shift people’s perspective on politics from essentially just a “tribal team sport” to something where the outcomes actually matter to them. This is important not just for momentum but also to counter the degradation of our political institutions, which is possible today partly because people don’t feel sufficiently personally invested in protecting and strengthening them. But once the government starts actually delivering in noticeable, critical new ways, which I believe Warren is best positioned to accomplish, I think we have a chance to finally turn the tide.

Don’t just take my word for it:

PS: I wanted to keep this post short, but there’s so much more I could say or share, so again please either help spread the good word or let’s talk more if you’re at all open to the fact that I might not be crazy here. 🙂 This election is too important to just watch from the sidelines. We all need to sacrifice and get involved to the best of our abilities. I’ve started donating and volunteering and speaking publicly. This is new and uncomfortable for me, but I hope by sharing this, it will help some of you feel empowered to take your own next steps.

Starting a new conversation…

I’m excited to share that I’ve recently joined the Google Assistant team! Like a lot of people (including our CEO), I’ve become convinced that natural language conversation represents the future of how we’ll primarily interact with computers and the myriad smart devices that will soon proliferate around us. This new “UI” will be personalized based on both knowledge of you and your history of interactions. It will also be proactive (reaching out to you with pertinent questions and updates) as well as responsive. And it will execute tasks across multiple, interconnected services on your behalf.

Which is to say: it’ll be a lot different than how we work with computers today. And it promises to be a lot better, too — if we can get it right.

I’ve been fascinated by interacting with my Google Home (which I’ve had early access to for a while). It highlights both the challenges and opportunities of this new conversational modality, and it surprises in equal measure with how far we’ve come and how far we still have to go. For instance, my 5-year-old daughter walked into our living room the other day and proclaimed, “Ok Google, play some Lady Gaga”, then started dancing to the music that immediately began playing. Think about that: she would never have been able to accomplish that task with a traditional desktop/mobile app, nor would I have been able to help her as quickly as she was able to help herself. She didn’t have to be unnaturally precise (e.g. select a particular track or album), and it was an enormously empowering interaction with technology overall. It feels like “how things should work”.

I’ve had countless similar “wow moments” asking Google questions about the world, my own upcoming flights and schedule, or streamlining tasks like playing music or showing some recent photos I took on our TV to the grandparents. But for all the magic Google can deliver already, this is still very clearly early days. The dialogs are often too fragile and require too much custom crafting by too many Google engineers in a way that clearly won’t scale to the ambitions of the medium (and the team). There’s not yet much deep learning or true language understanding under the hood. And only recently has there even been a focused effort to build The Assistant, instead of just “voice enabling” individual products here and there. The industry as a whole is still only starting to figure out how “chatbots” and “conversational commerce” and so on should work.

Which is why it seems like an ideal time to get involved–we can tell “there’s a there there”, but we also still need many foundational innovations to realize that potential.

On a personal level, this change also represents an opportunity to get my career “back on track” after a wonderful decade+ diversion into the emerging world of social networking. I actually came to Stanford in the late nineties to study Natural Language Processing with the aim of empowering ordinary users with the superpower of Artificial Intelligence, and even published several papers and helped build Stanford’s first Java NLP libraries while earning my BS and MS. I originally joined Plaxo, an early pioneer of social networking, to build an NLP engine that automatically parsed contact info out of emails (e.g. “Oh, that’s my old address. I’ve since moved to 123 Fake St.”), but eight years later, I was still there, serving by then as its CTO and trying to open up the social web that had sprung up around us as the new way to stay connected. That in turn led me to join Google in 2010 as part of the founding team that created Google+, and I’ve been at Google ever since. But now I’ll actually be working on NLP again, and I have a feeling my years advocating for user-centric identity and data portability across services will come in handy too!

I’m uncomfortably excited to be starting this new chapter of my career. If you think about all the exhilarating potential surrounding AI these days, realize that a surface like Google Assistant is where you’re most likely to see it show up. One of the senior-most engineers on the team remarked to me that, “I’m sure a full Turing Test will be part of our standard testing procedure before long,” and I think he was only half joking. If you’re building similar things in this space, or have ideas about what we should prioritize building in the Google Assistant or how you’d like it to integrate with your service, please let me know. I’m ready to learn!

When Dreams Become Real

So much has happened since I first wrote about my “side project” to help Dan Ambrosi apply DeepDream to his multi-hundred megapixel HDR landscapes. Here’s a rapid-fire rundown. Be sure to click on the photos below to see a series of (smart!) albums from each event.

Close encounter

The first chance to view our creations at near “life size” came on March 31st from Calit2 at UC San Diego, where Dan was invited to present his work on their room-sized, 66-million-pixel VROOM tiled wall display. Calit2 (which my father founded in 2000) has a long history of collaboration between science, technology, and the digital arts, so everyone felt right at home, and it provided us with the first visceral validation that seeing this work displayed with sufficiently high scale and resolution was indeed a transformative experience. Here’s a video I shot of Dan controlling the wall, and here’s Calit2’s news release from the visit. It left us energized and determined to see these works printed at scale in the real world. Turns out we wouldn’t have to wait long…

The great jailbreak

The very next day, Dan installed a series of large-format printed Dreamscapes at the massive GPU Technology Conference, which is held annually by NVIDIA in the San Jose Convention Center. We’d shown NVIDIA our work earlier, and they loved how it showcased the power of what you could compute with GPUs and CUDA, so they agreed to purchase three 8-foot-high by 16-foot-wide (!) Dreamscapes (printed on tension-mounted fabric backlit by LEDs inside a free-standing aluminum frame) for the main conference hallway. Dan, Chris, and I also gave a well-attended talk (video) on the art and tech behind the Dreamscapes, and Dan gave a couple additional interviews. The pieces were a huge hit, and it was so fun to see everyone enjoying them, taking selfies in front of them, standing back to take in the whole picture and then walking right up close to see the surprising details, and grabbing their friends and colleagues with “you gotta see this!” We’re so grateful to NVIDIA for their support, which allowed us to finally unshackle these works from their digital prison and experience them with the full freedom and resolution that only reality provides (for now).

Bringing the tour back home

Ever since I started working on Dreamscapes, I’d been sharing my progress on Google’s internal G+ and asking for feedback. After Dan published his work online and photos started flowing in from Calit2 and GTC, demand grew to present a tech talk and exhibit Dan’s work inside the ‘plex. NVIDIA generously agreed to loan us their Dreamscapes from GTC for a week, so on May 20th we set them up in one of Google’s largest auditoriums, and the following Monday Dan delivered an hour-long tech talk (video) followed by a reception. This was the first time I’d heard Dan go deep on the art history and iteration of technique that drove him to be able to produce these compelling giant landscapes, so I learned a lot and it sparked a lot of discussion among Googlers. The pieces remained up that whole week, and as word spread, there was a constant flow of people sneaking in to check them out and share in the unique experience.

Entering the fine art world

When it rains, it pours. During the time we were showing our work at NVIDIA and Google, Dan was approached by several fine art galleries about exhibiting his work, and he ended up creating installations in contemporary spaces in both Miami, FL and Steamboat Springs, CO (more photos). He gave talks at both places, and the enthusiasm from fellow artists and the community was enormously validating. He’ll also be showcasing his work at the upcoming 9e2 art + science event in Seattle this October.

A parting gift

After all of this excitement, I couldn’t help wishing I could take home a physical memento of our adventure. Obviously 8′ x 16′ is way too large to display inside a normal home, but Dan also produces his works in 4′ x 8′ wall-mounted light boxes printed using Duratrans, which is even crisper than fabric (though it can’t be printed as large). My favorite scene of his is the Point Montara Lighthouse, shot just a few minutes away from my house, which became something of a signature image for us. I celebrated my 35th birthday this year, and my parents decided to commemorate the occasion by purchasing that piece for me. I couldn’t imagine a better present, and not a day goes by that I don’t pass by and stare at it with a big smile on my face. 🙂

Onward

As you can see, it’s been a busy time in Dreamscape land. Yet remarkably, Dan has simultaneously undertaken an end-to-end upgrade of his workflow, from the camera (Sony RX1 → Sony RX1R II, which with its 42.4 megapixel full frame 35mm sensor nearly doubles the effective resolution he can capture, enabling him to shoot scenes that call for a narrower panoramic sweep) to the software (Photomatix → Aurora HDR, which provides better color balance, especially with blue skies; PTGui → Autopano Pro, which stitches more accurately resulting in fewer artifacts; and even upgrading to the latest version of Photoshop CC, which can finally handle images larger than 30,000 pixels on a side). He’s currently in the process of “re-mastering” many of his previously captured images with this new suite of tools, as well as shooting new scenes and exploring new DeepDream settings to run on them.

Oh, and he’s also started collaborating with an additional AI / deep-learning image-oriented software team on what may turn into “Dreamscapes 2nd Edition” soon. I can’t wait to see what dreams may come…

Home Sous Vide: What the Books Don’t Tell You

After years of passive enchantment with sous-vide cooking, my wife bought me an Anova immersion circulator for my birthday so I could finally try it at home. Despite reading the book everyone recommends, I was still left with a bunch of newbie questions that didn’t have immediately obvious answers. I forged ahead with a mix of phoning a friend, asking the Internet, and just giving it my best shot, reciting the mantra that worst case the food will just be inedible. In the end, things worked out fine–though there’s much more experimenting to be done!

For the benefit of other first-timers, here’s some collected Q&A based on my experiences so far. I hope this will help remove a roadblock or two and help convince you to “take the plunge” (see what I did there?).

Q: How should I prepare the water bath? How warm should it be?

Turns out it doesn’t really matter, since the circulator can heat it up pretty quickly on its own. I just filled the pot with warm water from the sink, which turned out to be about 110°F. In my case, I was cooking at 142° and the circulator brought it up within several minutes. No need to heat the water on the stove or use a kettle first (remember, you’re almost always cooking at temperatures well below boiling).

Q: How high should I fill the pot with water?

Technically, it just needs to comfortably cover the food you’re cooking, but it’s nice to give it some buffer. The Anova has a min and max line and you need the water to be in between. One thing I hadn’t thought of is how much water can evaporate over a long cooking time (more on this below), so that’s another reason not to cut it too close. The circulator moves the water pretty gently, so there’s no danger of it sloshing over the top of the pot or bubbling over like with boiling water.

Q: If my meat is frozen, do I need to fully thaw it before starting the sous vide process?

Not if you’re doing a longer cook. For instance, I did 24-hour beef short ribs at 142°, and even though the cubes of meat were 2-3″ thick, there’s plenty of heat and time to fully thaw the meat and bring it fully up to temperature. That’s one of the beautiful things about sous vide–since it’s low and slow, you don’t have the same worries about unevenly heating the meat like you would on a grill. And most meat defrosting techniques involve placing it in warm water, which is exactly what you’re doing. I just pre-defrosted the meat in warm water for ~20 mins and then put it in the bags and had no trouble. This is handy since otherwise it can take a long time to defrost meat, which means you need to plan days ahead to cook it.

Q: Many books/sites say you can use Ziploc freezer bags to seal the food. What about Ziploc storage bags?

I had a minor panic attack when I realized that all of our gallon-sized Ziploc bags at home were “storage” not “freezer”. I obviously didn’t want to risk melting the plastic or rupturing the bag during cooking. But according to the Internet, both are plenty well qualified for the task, being made of heat resistant food-grade polyethylene. If the food isn’t sharp and pointy, you’re not going to poke a hole in the bag during cooking, and the water is not going to be anywhere near the melting point of the plastic. In the end, I decided to brave the storage bag and it seemed to work just fine. I’d probably still prefer a freezer bag if I had one handy (just to be paranoid), but I don’t think it’s required. Just remember in either case to get all the air out of the bag using the water displacement method, which you do right in the sous vide pot.

Q: How do you keep the bag from getting sucked into the circulator during cooking?

When I first started the cooking process, I just naively put my Ziploc bag into the pot with the circulator clamped onto the side. Nothing in the book or the instruction manual said to do otherwise! But inevitably over time the circulator would suck the bag towards it, ultimately impeding the water circulation and making some weird noises. Definitely didn’t seem like it was working as intended. Turns out most people clip the bag to the side of the pot with a chip clip or clothespin. That fixed the problem. I have no idea why this isn’t specifically mentioned in any of the basic instructional material!

Before

After

Q: Why does the circulator sometimes make a metallic grinding noise?

The instruction booklet said it means the case isn’t tight enough, but I couldn’t obviously tighten it or loosen it. It definitely sounded wrong though, so I tried a few more times and eventually got the right amount of twisting and pulling to remove the case and then reattach it with a satisfying snap at the end, after which point the sound went away. It didn’t seem to affect the cooking process to have that noise, but it was definitely somewhat worrying and annoying, so I’m glad I finally fixed it.

Q: If you set the cooking timer, will it turn the circulator off when the timer finishes?

No. The Anova timer is just FYI and does not control anything. It’s confusing that they don’t tell you that, but it means you don’t need to worry about the exact cook time, which is part of the point of sous vide (it’s hard to overdo it, since the water is the same temperature you want the food to end up at, vs. an oven or grill, which is much hotter). For instance, I’d planned for my short ribs to be done around the time I would get home from work, but my schedule is not always super predictable, and I didn’t want the circulator to shut off and let the water cool down if I came home later than expected. Fear not, it’ll just keep going until you stop it.

Q: Is it safe to cover the entire top of the pot with cling wrap to reduce water loss from evaporation?

A recipe I read suggested that I do this, but I wasn’t sure if that would cause lack of airflow or pressure to build up or something. Fear not, it didn’t seem to cause any problems, and it significantly reduced water loss from evaporation, which is especially important for longer cooking times. I let it cook overnight uncovered and it lost an inch or two of water in the process, so if I’d let it continue all day, it might well have gotten down below the min line by dinner time. The only hard part was keeping the cling wrap from sagging down into the water (don’t want to cook it!), but by using some longer pieces, pulling them tight, and wrapping the extra around the pot handles and edge, leaving room for the circulator on one side and the clip on the other side, it was stable and effective. Definitely recommended.

Q: What happens if the power goes out in the middle of cooking, esp. while you’re away at work?

You’re in trouble. 🙂 This happened to me, and even though it was only a brief outage, the Anova frustratingly does not restart itself once the power comes back on! So the water just slowly cools, leaving your meat under-cooked and potentially in the danger zone for bacterial growth. This is what I came home to, and at first I thought the meal was ruined. But after some research and talking to some scientists in the family, I felt a bit better. It usually takes at least four hours in the danger zone for bacteria to grow, and the meat had been cooking above that zone for many hours already, so most bacteria should have been killed by then. When the power went out, the water didn’t immediately cool, so it only spent some of the post-outage time in the danger zone. And most preparations call for searing the outside of the meat before serving, which re-kills anything that might have started growing on the surface. In the end, after much soul searching, I decided to crank up the circulator for a bit and then pull out the meat and sear it. The meat was not as tender as I was hoping (possibly because the cooking time had been cut short) but it was definitely edible and, as I write this the next morning, I have no indications that my stomach is any worse for wear. The advice from the Anova folks seems to be to buy a UPS battery backup and plug the circulator into that.

My first home sous vide: 24-hour beef short ribs

Q: Can you reuse the cooking water for a subsequent sous vide run?

The consensus on the Internet seems to be yes, provided the water is not obviously funky or contaminated. It’s possible some bugs could grow in it between cooking cycles, but you’re keeping the food sealed in plastic and turning the water back up to a temperature that will kill anything in there, so you should be fine. Hey, it’s drought-tolerant cooking! After my scare with the power outage, I wanted to “get back on the horse” and do another run for the next dinner. I could have poured out all the water and started over, but it seemed wasteful, provided I wasn’t risking my health. I’ll post an update if this turns out to be a mistake, but I don’t think it will. 🙂

Conclusion

The promise of sous vide is “set it and forget it”, which as an engineer I love. It’s precision cooking that’s not fussy on technique. And it can produce some pretty stunning, unique results. But the danger is that by cooking food at lower heat for a longer time, you’re at greater risk of growing bacteria if you’re not careful. So most of the first-timer questions above ultimately boiled down to how to minimize that risk without giving up entirely. That, plus the stuff I’m sure you get used to after you’ve been doing it for a while, but that isn’t obvious the first go-around and that the books don’t tell you one way or the other. So I hope this addendum finds and helps someone. In the meantime, I’ll keep at it. And if you’re a home sous vide cook and have any tips to share, or if you found any of this useful (or confusing, or provocative), please let me know!

Dreaming big


I’ve recently been working on a side project with my good friend Chris Lamb to scale up Google’s Deep Dream neural net visualization code to operate on giant (multi-hundred megapixel) images without crashing or taking an eternity. We recently got it working, and our digital artist friend (and fellow Plaxo alum!) Dan Ambrosi has since created some stunning work that’s honestly exceeded all of our expectations going in. I thought it would be useful to summarize why we did this and how we managed to get it working.

Even if you don’t care about the technical bits, I hope you’ll enjoy the fruits of our labor. 🙂

The ‘danorama’ back story

Dan’s been experimenting for the past several years with computational techniques to create giant 2D-stitched HDR panoramas that, in his words, “better convey the feeling of a place and the way we really see it.” He collects a cubic array of high-resolution photos (multiple views wide by multiple views high by several exposures deep). He then uses three different software packages to produce a single seamless monster image (typically 15-25k pixels wide): Photomatix to blend the multiple exposures, PTGui to stitch together the individual views, and Photoshop to crop and sweeten the final image. The results are (IMO) quite compelling, especially when viewed backlit and “life size” at scales of 8’ wide and beyond (as you can do e.g. in the lobby of the Coastside Adult Day Health Center in Half Moon Bay, CA).

“I’d like to pick your brain about a little something…”

When Google first released its deep dream software and corresponding sample images, everyone went crazy. Mostly, the articles focused on how trippy (and often disturbing) the images it produced were, but Dan saw an opportunity to use it as a fourth tool in his existing computational pipeline–one that could potentially create captivating impressionistic details when viewed up close without distorting the overall gestalt of the landscape when viewed at a distance. After trying and failing to use the code (or any of the DIY sites set up to run the code on uploaded images) on his giant panoramas (each image usually around 250MB), he pinged me to ask if I might be able to get it working.

I had no particular familiarity with this code or scaling up graphics code in general, but it sounded like an interesting challenge, and when I asked around inside Google, people on the brain team suggested that, in theory, it should be possible. I asked Chris if he was interested in tackling this challenge with me (both because we’d been looking for a side project to hack on together and because of his expertise in CUDA, which the open source code could take advantage of to run the neural nets on NVIDIA GPUs), and we decided to give it a shot. We picked AWS EC2 as the target platform since it was an easy and inexpensive way to get a linux box with GPUs (sadly, no such instance types are currently offered by Google Compute Engine) that we could hand off to Dan if/when we got it working. Dan provided us with a sample giant panorama image, and off we set.

“We’re gonna need a bigger boat…”

Sure enough, while we could successfully dream on small images, as soon as we tried anything big, lots of bad things started happening. First, the image was too large to fit in the GPU’s directly attached memory, so it crashed. The neural nets are also trained to work on fixed-size 224×224 pixel images, so they had to downscale the images to fit, resulting in loss of detail. The solution to both problems (as suggested to me by the deep dream authors) was to iteratively select small sub-tiles of the image and dream on them separately before merging them back into the final image. By randomly picking the tile offsets each time and iterating for long enough, the whole image gets affected without obvious seams, yet each individual dreaming run is manageable.
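To make that concrete, here’s a minimal sketch of the random-offset tiling idea (illustrative only, not our actual code; dream_tile() stands in for a hypothetical per-tile dreaming function that returns a same-shaped array):

import numpy as np

def dream_tiled(image, dream_tile, tile=512, passes=10):
    """Dream on fixed-size tiles, shifting the grid randomly each pass."""
    h, w = image.shape[:2]
    for _ in range(passes):
        # Random grid offset for this pass, so tile seams never land in the same place twice.
        oy, ox = np.random.randint(tile, size=2)
        shifted = np.roll(np.roll(image, oy, axis=0), ox, axis=1)
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                # Each crop is small enough to fit in GPU memory on its own.
                shifted[y:y + tile, x:x + tile] = dream_tile(shifted[y:y + tile, x:x + tile])
        # Undo the shift before the next pass.
        image = np.roll(np.roll(shifted, -oy, axis=0), -ox, axis=1)
    return image

Run enough passes and every pixel gets dreamed on repeatedly, while the GPU only ever sees one small tile at a time.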

Once we got that working, we thought we were home free, but we still couldn’t use the full size panoramas. The GPUs were fine now, but the computer itself would run out of RAM and crash. We thought this was odd since, as mentioned above, even the largest images were only around 250MB. But of course that’s compressed JPEG, and the standard Python Imaging Library (PIL) that’s used in this code first inflates the image into an uncompressed 2D array where each pixel is represented by 3×32 bits (one per color channel), so that the same image ends up needing 3.5GB (!) of RAM to represent. And then that giant image is copied several more times by the internal code, meaning even our beefiest instances were getting exhausted.
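If that memory blow-up sounds surprising, the back-of-the-envelope arithmetic is simple (illustrative numbers, assuming a roughly 290-megapixel panorama):

pixels = 290e6                               # ~290 megapixels
bytes_per_pixel = 3 * 4                      # 3 color channels x 32-bit floats
print(pixels * bytes_per_pixel / 1e9, "GB")  # ~3.5 GB, before any internal copies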

So we set about carefully profiling the memory usage of the code (and the libraries it uses like NumPy) and looking for opportunities to avoid any copying. We found the memory_profiler module especially helpful, as you can annotate any suspicious methods with @profile and then run python -m memory_profiler your_code.py to get a line-by-line dump of incremental memory allocation. We found lots of places where a bit of rejiggering could save a copy here or there, and eventually got it manageable enough to run reliably on EC2’s g2.8xlarge instances. There’s still more work we could do here (e.g. rewriting numpy.roll to operate in-place instead of copying), but we were satisfied that we could now get the large images to finish dreaming without crashing.
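For reference, here’s roughly what that workflow looks like (the function below is a made-up stand-in for the real code, just to show the decorator and the kinds of hidden copies it catches):

from memory_profiler import profile
import numpy as np

@profile
def make_copies(image):
    shifted = np.roll(image, 10, axis=0)  # np.roll allocates a full copy
    scaled = shifted * 1.5                # so does whole-array arithmetic
    return scaled

if __name__ == "__main__":
    make_copies(np.zeros((4000, 4000, 3), dtype=np.float32))

Running python -m memory_profiler your_code.py then prints a line-by-line report showing how much memory each statement allocated.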

BTW, in case you had any doubts, running this code on NVIDIA GPUs is literally about 10x faster than CPU-only. You have to make sure caffe is compiled to take advantage of GPUs and tell it explicitly to use one during execution, but trust me, it’s well worth it.
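For anyone trying this themselves, the switch in pycaffe is just a couple of lines (assuming caffe was compiled with CUDA support; device 0 here is just an example):

import caffe

caffe.set_device(0)   # which GPU to use
caffe.set_mode_gpu()  # otherwise caffe quietly runs on the CPU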

Solving the “last mile” problem

With our proof-of-concept in hand, our final task was to package up this code in such a way that Dan could use it on his own. There are lots of tweakable parameters in the deep dream code (including which layer of the deep neural net you use to dream, how many iterations you run, how much you scale the image up and down in the process, and so on), and we knew Dan would have to experiment for a while to figure out what achieved the desired artistic effect. We started by building a simple django web UI to upload images, select one for dreaming, set the parameters, and download the result. The Material Design Lite library made it easy to produce a reasonably polished-looking UI without spending much time on it. But given how long the full images took to produce (often 8-10 hours, executing a total of 70-90 quadrillion (!) floating point operations in the process), we knew we’d like to include a progress bar and enable Dan to kick off multiple jobs in parallel.
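To give a flavor of the web side, here’s a hedged, minimal sketch of the kind of django form and view that can sit behind a “set parameters and start dreaming” page (the field names, template, and module path are hypothetical, not our actual code):

from django import forms
from django.http import HttpResponseRedirect
from django.shortcuts import render

from dreams.tasks import dream_image  # hypothetical celery task (sketched below)

class DreamForm(forms.Form):
    layer = forms.CharField(initial="inception_4c/output")  # which net layer to dream on
    iterations = forms.IntegerField(initial=10)
    octaves = forms.IntegerField(initial=4)

def start_dream(request, image_id):
    form = DreamForm(request.POST or None)
    if form.is_valid():
        # Hand the long-running job off to the task queue and return immediately.
        dream_image.delay(image_id, form.cleaned_data)
        return HttpResponseRedirect("/jobs/")
    return render(request, "start_dream.html", {"form": form})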

Chris took the lead here and set up celery to queue up dispatching and controlling asynchronous dreaming jobs routed to different GPUs. He also figured out how to multiply together all the various sub-steps of the algorithm to give an overall percentage complete. Once we started up the instance and the various servers, Dan could control the entire process on his own. We weren’t sure how robust it would be, but we handed it off to him and hoped for the best.
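Here’s a similarly hedged sketch of what that celery task can look like (the broker URL and the run_one_dream_step() helper are hypothetical placeholders):

from celery import Celery

app = Celery("dreams", broker="redis://localhost:6379/0")

@app.task(bind=True)
def dream_image(self, image_id, params):
    total_steps = params["octaves"] * params["iterations"]
    for step in range(total_steps):
        run_one_dream_step(image_id, params, step)  # hypothetical per-step helper
        # Multiply the sub-steps together into an overall percentage so the
        # web UI can poll this state and draw a progress bar.
        self.update_state(state="PROGRESS",
                          meta={"percent": 100.0 * (step + 1) / total_steps})
    return image_id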

“You guys can’t believe the progress I’m making”

Once we handed off the running EC2 instance to Dan, we didn’t hear much for a while. But it turned out that was because he was literally spending all day and night playing with the tools and honing his process. He started on a Wednesday night, and by that Saturday night he messaged us to say, “You guys can’t believe the progress I’m making. I can hardly believe it myself. Everything is working beautifully. If things continue the way they are, by Monday morning I’m going to completely amaze you.” Given that we’d barely gotten the system working at all, and that we still really didn’t know whether it could produce truly interesting output or not, this was certainly a pleasant surprise. When we probed a bit further, we could feel how excited and energized he was (his exact words were, “I’m busting, Jerry, I’m busting!”). It was certainly gratifying given the many late nights we’d spent getting to this point. But we still didn’t really know what to expect.

The following Monday, Dan unveiled a brand new gallery featuring a baker’s dozen of his biggest panoramic landscapes redone with our tool using a full range of parameter settings varying from abstract/impressionistic to literal/animalistic. He fittingly titled the collection “Dreamscapes”. For each image, he shows a zoomed-out version of the whole landscape that, at first glance, appears totally normal (keep in mind the actual images are 10-15x larger in each dimension!). But then he shows a series of detail shots that look totally surreal. His idea is that these should be hung like giant paintings 8-16’ on a side. As you walk up to the image, you start noticing the surprising detail, much as you might examine the paint, brush strokes, and fine details on a giant painting. It’s still hard for me to believe that the details can be so wild and yet so invisible at even a modest distance. But as Dan says in his intro to the gallery, “we are all actively participating in a shared waking dream. Science shows us that our limited senses perceive a tiny fraction of the phenomena that comprise our world.” Indeed!

From dream to reality

While Dan is still constantly experimenting and tweaking his approach, the next obvious step is to print several of these works at full size to experience their true scale and detail. Since posting his gallery, he’s received interest from companies, conferences, art galleries, and individuals, so I’m hopeful we’ll soon be able to see our work “unleashed” in the physical world. With all the current excitement and anxiety around AI and what it means for society, his work seems to be striking a chord.

Of course the confluence of art and science has always played an important role in helping us come to terms with the world and how we’re affecting it. When I started trying to hack on this project in my (copious!) spare time, I didn’t realize what lay ahead. But I find myself feeling an unusual sense of excitement and gratitude at having helped empower an artistic voice to be part of that conversation. So I guess my take-away here is to encourage you to (1) not be afraid to try and change or evolve open source software to meet a new need it wasn’t originally designed for, and (2) don’t underestimate the value in supporting creative as well as functional advances.

Dream on!

Has Tesla’s personalization crossed the line?

One of the cool things about owning a Tesla is that when you have to take it in for service, they often give you a loaner Tesla to take home. It feels like nothing’s changed–until you realize that all of your personalized settings have effectively been reset. In the past, this wasn’t a big deal, but I just took my car in for 36 hours and had an unexpectedly jarring experience. I thought I’d share it because it surprised and fascinated me how pervasive and important this personalization has become.

It may be tempting to dismiss this all as “first world problems”, and obviously the car still did go from A to B just fine, but I was struck by how much I’d unknowingly come to depend on many forms of personalization Tesla offers. None of the following examples are hypothetical or contrived–they each took me by surprise and made me go “hmm”:

  1. I couldn’t open my garage door when I got home (I don’t normally carry a house key since I come and go by car, but luckily the garage has a keypad backup!)
  2. My seat position, steering mode (stiffness), etc. were different enough to frustrate my muscle memory
  3. The loaner hadn’t enabled “walk away door locks”, so I inadvertently left the car unlocked for several hours
  4. I had to re-pair my phone via bluetooth before I could resume listening to my current Audible audiobook (never as smooth an experience as it should be, esp. when adding a duplicate “Tesla Model S” endpoint!)
  5. My home/work/favorite map destinations were missing, meaning I had to re-find and enter my daughter’s preschool address (to get traffic-aware driving directions/time)
  6. I was surprised to find my windshield and windows fogged up in the morning, until I remembered that my car normally remembers when I tend to leave in the morning and automatically warms itself up beforehand (“smart preconditioning”)
  7. Similarly, my car normally notifies me proactively if there’s a faster route to home/work than usual (based on current traffic conditions), but not this time
  8. My phone buzzed with a notification from the Tesla app, but it was about my own car back at the service center (they’d stopped charging it), not the loaner I was actually charging at work
  9. I can normally find where I parked my car, see how much charge it has, and even unlock and start it using the Android app, but none of those things work with the loaner car
  10. My daughter loves listening to some of the kids music stations on Slacker radio (each car comes with a pre-registered account), but the loaner car didn’t have any account attached, meaning I couldn’t play her that station (and even if I could, it wouldn’t remember which songs we’d ranked up/down before)

By the time I swapped back, I’d manually corrected many of these differences, but of course that creates its own problem–this loaner car was now full of my personal data (home, work, daughter’s school, paired to my phone, etc.) with no obvious way of wiping the data (I presume they do something like a factory reset before handing out the car to the next customer, but it was never discussed with me).

Now, any loaner car situation would entail some of the problems above, but most cars aren’t as personalized to start with, and the fact that I was still driving “the same car” made it all the more surprising when the car didn’t behave like it used to. More to the point, Tesla’s persistent internet connection and over-the-air software updates put it in a unique position to actually solve many or all of the above problems. I normally think of cars as fairly fungible, but clearly these are not normal cars.

“You can’t go this far and then not go further!” –Homer Simpson

To me, Tesla’s personalization has clearly crossed the line from a minor convenience to an essential part of the ownership experience. The solution then, it seems to me, is to take Homer’s advice and sync all your personalized settings to the cloud, thus enabling you to “log in” to a car and transfer your settings before you start driving. Then you could also “log out” when you’re done and reset the car. Tesla already has all the necessary building blocks to build this–the car is persistently connected to the Internet, each driver has a “My Tesla” account that knows the VIN(s) of their car(s), and the Tesla mobile app lets you log in with your My Tesla account and view/control the state of your car.

They even dipped their toe in the water with their recent calendar integration, which uses bluetooth and your My Tesla app to show your upcoming calendar appointments on the Tesla console (with one-tap driving directions to the destinations). It’s an effective but oddly orchestrated integration, suggesting that they were aware of the potential to “do it all in the cloud” but chose to stop short (perhaps for a mix of technical and privacy reasons?).

Tesla also currently has support for multiple “driver profiles” per vehicle, which control not only the seat positions but also favorite radio/streaming stations, preferred steering mode, A/C and temperature, and more. It’s not much of a leap to suggest that you should be able to enter a new Tesla, pair your phone (perhaps with a quick tap on an NFC terminal built into the center console?) and choose to “download your driver profile”, bringing with it all of the personalized settings mentioned above. This would be useful not only for loaner cars but also when upgrading your own car or even borrowing a friend’s car. Since not all cars have identical hardware configurations, this “import” step would allow each car to flexibly pull down the applicable settings while ignoring the rest. Such a model would also be a lot simpler than attempting a full two-way sync.
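As a purely hypothetical sketch (none of these setting names or interfaces are real Tesla APIs), the selective import could be as simple as filtering your cloud-synced profile down to whatever the target car actually supports:

MY_PROFILE = {
    "seat_position": {"height": 4, "recline": 12},
    "steering_mode": "sport",
    "walk_away_door_locks": True,
    "nav_favorites": ["home", "work", "preschool"],
    "smart_preconditioning": True,
}

def import_profile(profile, supported_settings):
    # Pull down only the settings this particular car's hardware supports,
    # ignoring the rest (much simpler than attempting a full two-way sync).
    return {key: value for key, value in profile.items() if key in supported_settings}

loaner_settings = import_profile(MY_PROFILE, {"seat_position", "steering_mode", "nav_favorites"})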

As a bonus, this might solve a silly but common frustration / feature request I’ve heard from multiple Tesla owners: the ability to select your driver profile from the mobile app before entering the car. Anyone with a significantly shorter spouse often has to embark on a series of contortions to reach the center console when the seat has been left all the way forward. 🙂 I presume this isn’t currently possible because the driver profiles are not synced to the cloud, but once they were…

To end on a personal note, one of the things I love most about using and working on cutting-edge technology is the feeling of how “unfinished” it is. Swimming in uncharted waters reminds me that the future has yet to be written, and that it’s up to all of us to figure out where to head next. I didn’t realize that my car had gotten sufficiently personalized to need a cloud-based auth system, but it now seems obvious in hindsight. What other needs and opportunities will we discover as the lines between digital technology and the physical world continue to blur? Only one way to find out…
