X.ai has raised $44 million to tackle a really big small problem: scheduling emails. You might have encountered one of their AI-powered assistants without even knowing it. When emailing to set up a meeting, a user simply copies firstname.lastname@example.org or email@example.com to take over the logistics. These bots process natural language in order to set up a meeting the way a human assistant would — but for much less money.
On this episode, Dennis Mortensen, the CEO and Co-Founder of x.ai, talks about what a future powered by artificially intelligent assistants could look like.
Andrew Weinreich: On several occasions, I’ve had conversations with computer algorithms. My guess is that many of you have as well, although you might not have realized it at the time.
It starts out the way it always does: you email someone you want to meet, and they respond, copying their assistant, Amy, to set up the date. Everything goes exactly as expected.
In another context, I might have thanked the assistant for his or her help. But for this conversation it wasn’t necessary, because the assistant named Amy wasn’t a real person. She was a bot, powered by artificial intelligence. I knew that because it was revealed in her signature.
X.ai is the company that’s developed Amy, and her male counterpart, Andrew. They’ve raised $44 million to develop AI that will take scheduling out of human hands. X.ai Co-Founder and CEO Dennis Mortensen explains how it works:
Dennis Mortensen: It works very similar to that of you having hired a person to sit in your front office and manage your calendar. Meaning that when somebody emails me and says, “Hey, Dennis. Do you got time to meet up for coffee?” I can simply, just within my inbox, reply back and say, “Hey, you know what? I’m up for coffee. I have CCed in Amy at x.ai and she can help put something on my calendar in the next couple of weeks for half an hour at 200 Broadway.” Then I click send. And then, I click archive. Because this is not my job anymore. And Amy will now understand what I asked her to do, remove me from the dialogue, reach out to one or more participants, negotiate a time at my office for half an hour, and upon conclusion, send out an invite.
Andrew Weinreich: Dennis envisions a world where for every basic task like scheduling, we’ll call on an AI assistant that can summon up other, more specialized AI assistants, to do those tasks for us. But what makes scheduling so complicated that it takes millions of dollars in funding to build this technology?
Dennis Mortensen: Because people don’t say, “Amy, comma, can you set up a new meeting?” No, they say things like, “Hey, Amy, Lars and I need to do the hokey pokey next week, can you make it happen?” Hokey pokey? As in, who is this guy? What is he talking about? Well, we still need an algorithm to understand that.
Andrew Weinreich: And to train an algorithm to understand and take action on this takes a lot of human-powered work.
Dennis Mortensen: If you asked me where a very large amount of our funds went, it will be into labeling. As in, we have today about a hundred people full time that does nothing but labeling.
Andrew Weinreich: Here’s how it’s done.
Dennis Mortensen: You could imagine really that I gave you a piece of paper and four highlighters. And I said, “I need you to highlight all of the intents with the blue highlighter. And I need you to add a note what intent that was? Was that a cancel meeting, reschedule meeting, running late, add participant, make Andrew optional, and so on and so forth. I need you to use the yellow highlighter to mark any dates and times. I need you to use the pink highlighter to highlight any location. Did he say 200 Broadway? Did he say New York? Did he say Chelsea?” …That type of labeling we obviously just do in a digital form. And the more you do that, the more we can get to build a corpus for where we have an understanding of how do people ask to set up a new meeting?
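As an illustration of the highlighter exercise described above, here is a minimal sketch of what one labeled message might look like in digital form. This is not x.ai's actual tooling: the label names, the `Span` type, the `highlight` helper, and the sample message are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical label set, one per "highlighter" mentioned in the description.
LABELS = {"intent", "datetime", "location"}


@dataclass(frozen=True)
class Span:
    start: int       # index of the first highlighted character
    end: int         # index one past the last highlighted character
    label: str       # which "highlighter" was used
    note: str = ""   # extra note, e.g. the kind of intent


def highlight(text: str, phrase: str, label: str, note: str = "") -> Span:
    """Mark the first occurrence of `phrase` in `text` with `label`."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"phrase not found: {phrase!r}")
    return Span(start, start + len(phrase), label, note)


# Hypothetical sample message, annotated the way a human labeler might.
message = "Can you set up a meeting next Tuesday at 200 Broadway?"
annotations = [
    highlight(message, "set up a meeting", "intent", note="new_meeting"),
    highlight(message, "next Tuesday", "datetime"),
    highlight(message, "200 Broadway", "location"),
]

for span in annotations:
    print(span.label, "->", message[span.start:span.end])
```

Many such annotated messages, collected over time, would form the kind of corpus Dennis describes.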
Andrew Weinreich: I’m Andrew Weinreich.
Jeremy Levy: And I’m Jeremy Levy. You’re listening to Deciding by Data, the podcast that brings you into the C-Suite to uncover how data powers successful businesses. Today on the show, you’ll hear from Dennis Mortensen, the Danish serial entrepreneur who hopes the next time you schedule a meeting, you say, “not my problem.”
Andrew Weinreich: Dennis, thanks for joining us.
Dennis Mortensen: Thanks much for having me.
Andrew Weinreich: Would love it if we could start with a little bit of background on you. And then, talk about your company. Maybe you can tell us what you’ve been doing, prior to x.ai. And then, lead us into the inspiration behind the company.
Dennis Mortensen: Sure, I will kind of spare you the four-hour seminar on just how fantastic I am. I’ll leave you the number for my mom, and she can do that version. But, the short version is one for where we’ve now spent a good 23 years trying to extract value from data. So, our venture before this was a predictive analytics venture for media. And the venture before that was an enterprise web analytics company. And the venture before that was a [inaudible] analysis company, so there’s certainly some fondness of data, and if anything I could do the next eight hours on how to extract value from data.
Andrew Weinreich: Where are you from originally?
Dennis Mortensen: Oh, the funny accent? I’m Danish. Or as all my American friends will call me the socialist. So, there’s another two hours we can talk about.
Andrew Weinreich: And the first company was started in Denmark or…?
Dennis Mortensen: So, we did our first venture out of Denmark. We did the next one out of Denmark, then we did one out of Budapest. I spent four years there. So, if you ever want to talk about sales skills, try to persuade your wife and kids to move to Eastern Europe. Before it joined the European Union by the way.
Andrew Weinreich: Wow.
Dennis Mortensen: And the next one was in New York. And we are in New York today as well. So, I’m a fan — I’m so much a fan that I became a citizen. What is it now? A year and a half ago, just before you guys took a turn to the right, and — nothing wrong with that — but, I’m certainly very most impressed with just the willingness to participate in any form of entrepreneurship. So back home, certainly when I grew up, so, I’m 45, sadly, because I’m getting older. If you told people that my career choice is one of entrepreneurship, it really sounded like, “Oh, so you weren’t able to find a really half decent job, so your backup plan is try to make some money for yourself,” which was not the story. The story was one for where I think I have an idea so good, I can go out and create a company around it. Anybody here, which I speak to, if you say the word entrepreneurship, that is kind of on par with you saying, “I’m a fucking rock star,” as in I play the guitar but I don’t play the guitar but I do tech. And that’s I think is something I certainly appreciate, because I’ve turned entrepreneurship into a lifelong career. As in, I have no backup plan. As in, this is exactly what I want to do. And we’ve done it now five times over.
Andrew Weinreich: Yeah. Good for you. Good for you. Tell us a little about the inspiration behind x.ai.
Dennis Mortensen: The true story, and it’s going to sound like one of those eBay-like made up stories. But, the true story is that, in our past venture, post-exit having sold it, what happens is you end up with a little bit extra time on your hand as you do the integration. And one day, I did this kind of odd, sad thing of going back into my calendar to see exactly how many meetings did I do the year prior to having sold the company, just for the fun of it. And what I ended up seeing was, that I did 1,019 meetings in the year 2012. What was even sadder? I saw that I did 672 reschedules, as in, I set up all those bloody meetings myself. No assistant. No help. Just me in my underwear at home, at 11 PM. And at that moment, I thought, “You know what? That is a task I’ve done for now 20 some odd years. Am I supposed to do that for the next 20 years?” As in, that doesn’t sound like a version of the future I wanted to live in. And if that wasn’t a version of the future I wanted to live in, what might that look like? And when you speak to people, we all had the same idea of what the future might look like. Either you win in the corporate lottery and you become SVP of whatever, and you get a human assistant, and you don’t have to do this chore anymore, or tech somehow solves it. And you get some agent in place who could do it for you. And it just felt like there might be an opening in late 2013, to go build that agent. So, that was the catalyst. And at that moment, it certainly looked like there might be an opening for somebody to go out and build this intelligent agent, that could do this particular task.
Andrew Weinreich: Was that the moment where you said this is an opportunity to use artificial intelligence, or were you focused on: Is it possible that we can use a human agent to solve this problem?
Dennis Mortensen: So, as an entrepreneur, I’m actually less focused on the technology. I think any entrepreneur, and I’m obviously biased on my own processes, should latch onto the pain. Because the pain, if true and honest, never changes. The technology might actually change over time. Say you hate your commute. Then we can try to solve that for decades by giving you a car, probably even for half a century. But then, at some point, that particular technology actually doesn’t really work anymore. And then you want something else. But, the pain is still true. Which is that, you hate your commute.
And this is the same. Which is that, if I asked you: “Do you like setting up meetings?” And you’ll tell me “No Dennis, I f***ing hate it.” “Oh, if you hate that, how can we then figure out how to remove it?” And I think that’s what we latched onto. But, removing it is one for where the pain is so obvious, that a ton of people have tried to remove it before. And at least to me, it looks like they all walked down the same avenue, which is what they did when they implemented some extension or some plugin, or you Tungle.me, or you Doodle me, or you do whatever. And all good honest people, smart people who try to solve it. But, I’m not looking to have the pain alleviated a little bit. I want it gone. And it certainly looked like the only way to completely remove it was to hand over that job to somebody else. But, handing over to somebody else, if that’s a human, that’s just a very expensive product. As in you’re going to pay 50K a year to have a human sit in your front office and do this for you. So, it can’t be a human. It needs to be something else. Some sort of machinery.
Andrew Weinreich: Before we drill down into how x.ai works, I’m curious whether you tried to quantify this problem. I read a blog post, I think by your co-founder, saying that 87 million Americans spend, I think it was, five hours a week scheduling. How dramatic or profound did you find this problem? Did you think that if you could change scheduling, you could increase productivity of people, and therefore, literally impact our gross productivity? Or did you think about this more as a corporate expenditure challenge or opportunity? How did you think about this problem and quantify it?
Dennis Mortensen: You’re certainly right that any entrepreneur, whether just in your kind of challenge of pitching it to yourself, or pitching it to your investors or co-founders, or team members would have to come up with some sizing of market. As in, am I selling this for me and my mom? Or is it something a little bit bigger out there? The funny thing about this particular venture, and any one of our past ventures have had some sort of a damn slide in our pitch deck. As in, “Dennis, tell me exactly what is the total addressable market for this thing you’re trying to solve?” For this venture, funny enough, that slide didn’t even come into question. It wasn’t even part of our deck. As in everybody just knew inherently that this is massive, as in we could do a study right now, go down to Broadway and ask the first 20 knowledge workers; one, do you touch a computer in one way shape or form this week? They’ll say yes. Do you do a meeting? They will then say yes. Do you like setting up? They will then say no. And that just holds true in any form of survey you would do.
And what we found is, when we started to kind of look into the size of it is that, it was massive. Massive to the kind of, if somebody solved it, forget about whether we solve it, if somebody comes along and solves it, there’s a real visible impact, on if not just domestic, but global productivity growth. As in this is something which isn’t a blip. We’re talking hours and hours on, at least in the US about short of 90 million US knowledge workers. That’s certainly something that was strong when we had to go pitch this for funding that people immediately saw that if now is the time, the market is massive. Just like you haven’t read a single story, probably in any publication where somebody suggests, the market for self driving cars, we’re not sure how big it is. We just all assume that it is massive to the extent for we probably don’t even have to talk about it. We have to talk about the technology and is it even kind of feasible at this moment in time, or should we wait kind of a couple of decades? This was very much the same.
Jeremy Levy: So, the term AI is really popular these days. And I find it’s really helpful, could you help us understand, what does AI even mean?
Dennis Mortensen: That’s a very good question. And given that, there’s people out there taking their degrees in AI, we should probably have some sort of fixed explanation for exactly what it is. I like the idea though of AI being some mechanism, that’s got the ability to do a job or a single task autonomously. Meaning, that there’s something here which if I don’t do it, you would do it, or somebody else would do it. And it becomes AI when you can kind of hand over that task or job to some piece of machinery, whether that be setting up a meeting for me, driving the car for me, finding the three faces in that picture uploaded to Facebook. But, doing a job that a human would otherwise have to do.
Jeremy Levy: I mean, there are so many examples of where technology makes decisions on our behalf. Where does the sort of line go between, like, a simple if statement that sort of branches to do one thing versus another, and actual intelligence? Where is that threshold? Where is that line?
Dennis Mortensen: So, I certainly like the artificial part of artificial intelligence. Given that, we are not even entirely sure what true intelligence is all about. As in, we don’t yet have some paper that came out a decade ago that concluded that human intelligence is this. And now, it’s just a matter of time before we can figure out how to replicate that. As in, we just don’t know. So perhaps we should focus on the artificial part. As in, there’s some decision-making here in the machine that we do understand, that can do a task and a job that if it didn’t do it, you would have to do it. So, that is something which I keep, at least personally, latching onto. And I actually don’t mind the simplicity in some of the technology. As in, for me, that can still be artificial intelligence. And we still have this, in particular with artificial intelligence, conundrum for where once we solve something, and it used to be sexy yesterday, upon having solved it, it’s just not sexy tomorrow. And then, we forget how interesting that problem was.
Andrew Weinreich: Let me just drill down on Jeremy’s question. Because we had, for example, online dating services. That’s a machine we had, instead of an in-person matchmaker, that’s a machine that answered a query and that did a person’s job. But, we wouldn’t call that artificial intelligence. So, when Jeremy says, “Where’s the line?”, there seems to be a higher level of intelligence than simply a machine doing a person’s job and so I’d love it if we could push you a little further, and say how would you define- We’re seeing it, sort of like blockchain, right? We see every company, if you want to be a sexier company today, no matter what you’re doing, you’re saying “I’m in blockchain.” And we have that same dynamic with AI. And it’s not clear to me what the definition is you’re offering of AI that’s distinct from machines handling human tasks, that they’ve been doing for decades and decades and decades.
Dennis Mortensen: Let’s unpack it a little bit more. So, I certainly think there’s some value in making the distinction between the conversational UI and the agents that exist within that conversational UI. So, we now have this moment in time for where a new UI is about to arrive. So, I took my CS degree on the command line, and that was the UI. The only access to compute was, if you could figure out the syntax on the actual command line. Then, we got the graphical user interface and we can kind of semi-democratize access to compute. And reasonable people within other verticals could get access to it. So if you’re a CPA, you can get access to a spreadsheet, and you can now use compute. And then perhaps the mobile user interface is a UI paradigm in its own right.
But, there’s this moment now for where people are moving into this conversational UI, for where, at least in my opinion, we will fully democratize access to compute. As in anybody who can speak even a funny Danish-English sentence like me will get access to compute. Well, that’s an interface. That is just you wanting to speak to your computers instead of clicking a dropdown, or a checkbox, or a radio button. But, that to me is not intelligence. That is just a new UI.
Now, when you get the ability to speak to your computer, what happens is that, you will be less focused on tasks and more focused on objectives. As in, I want something done. I actually don’t care how you do it. I just want something done. And those agents within this new UI, I think are AI agents. They don’t have to be. They can also just be humans. I’ll give you a good example here for where I think this new UI paradigm really starts to kind of make sense.
So, one of my favorite examples is this: I’ll have to go to San Francisco on the 25th of January. And as I go there, I’ll stay at the Hilton. It will be 11 PM when I arrive, and I want a Diet Coke. My first line of thought, is not one for where I want to use the existing kind of UIs. As in “Oh, Diet Coke, let me go to the AppStore, let me find the Hilton app, let me install the Hilton app on my iPhone, let me set up a new account if I don’t have a Hilton account, let me log in with my new credentials. Upon logging in, let me find the store, let me add Diet Coke to basket, let me click checkout.” As in, that just doesn’t work, which is why the phone and the conversational UI still works. I pick it up. I click room service. I say, “Hey, can I get a Diet Coke for 1920, please?” That though, is a human agent on the other end of that phone call. But perhaps, it’s just an Alexa in the room, or a big-ass number on the front door that I can text to reach the downstairs lobby and say, “Hey guys, can you bring up a Diet Coke?”
When they receive that message, that’s where I think AI will start to come into play. As in, it might just be too expensive to have a human agent read and interpret that message and figure out what action to take, perhaps in most of the cases, in most lobbies, in most hotels, it’s simple, ‘can you get me a Diet Coke’ requests, and a machine should just take that over. And that’s where I think it will become somewhat easy to just make the distinction between humans and AIs, where they will read the request, and they will take the action.
The sophistication of some of those agents will be super simple, and we’ll be embarrassed to call them AIs. I’ll give you one of my favorite bots right now, outside of x.ai, which is Citibank, and we can have a laugh about that. But they used to call me, “Hey, Dennis. I can see that you spent three dollars, but it’s in Singapore. Did you buy that bagel?” “Uh, yeah. First of all, I think I’m paying AT&T three dollars just for you to call me right now. So, why are you trying to confirm this?” But, that was the mechanism which they put in place, they had a human agent call me to verify a specific buy outside of my normal pattern. Now, they just text me. Reply one back if true. Reply two back if not true. Damn, that’s an awesome bot. As in, I hated that kind of, ‘I think Citibank is calling me.’ I didn’t have all the numbers for Citibank. It was just kind of one of those calls where it could be them. I think I have to pick it up.
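To show how simple such a bot can be, here is a minimal sketch of the reply-one-or-two mechanism. It is an illustration only, not Citibank's system: the function name and the returned action strings are hypothetical, and what the bank actually does after each reply is an assumption.

```python
def handle_reply(reply: str) -> str:
    """Interpret a customer's text reply to a purchase-verification message."""
    answer = reply.strip()
    if answer == "1":
        # Customer confirms the purchase was theirs.
        return "purchase_confirmed"
    if answer == "2":
        # Customer denies the purchase; the follow-up action is assumed.
        return "flag_for_review"
    # Anything else is not understood, so the bot asks again.
    return "ask_again"


print(handle_reply("1"))
print(handle_reply(" 2 "))
print(handle_reply("yes that was me"))
```

The whole "understanding" step here is two string comparisons, which is the kind of agent one might be embarrassed to call an AI even though it takes over a job a human used to do.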
Jeremy Levy: Help us understand what are the other components that fall into sort of this AI bucket? So, one is obviously the interface. You talked then a little bit about sort of the actionability around it. What else is required to make an AI around scheduling meetings? Help us understand the other components involved.
Dennis Mortensen: Sure. So, we certainly like to see it as a kind of three-pronged challenge, which consists of one, the NLU, which is the Natural Language Understanding, or your ability to read the message and understand it in full. Say that, we’ve set up a meeting for today, and you email Amy and say, “Hey, Amy. I’m going to be running five minutes late.” That happens all the time, I’m sure you can imagine that. But, if I understand it, which is a big ‘if’. That is super hard, given that language is not a solved science, and the only hope you have is to pick a vertical so thin, that you might be able to solve language within that kind of vertical. But, that’s the first challenge, the read challenge. Now, we make the assumption that you did understand it in full.
Jeremy Levy: Is all of that automated today? Is the entire understanding, from a natural language perspective, automated?
Dennis Mortensen: I’ll tell you about how we went about actually solving it. So, that’s the first challenge. You need to understand it. But, let’s say I do understand it. You’re running five minutes late. Not as in, I was able to pick up the temporal data, and I was able to pick up the intent or the location, or whatever. As in I truly understood it. Then, I need some sort of reasoning engine. As in, you’re running five minutes late. What does that mean? What do I do with that? You and me might just think that’s common sense. You probably do nothing. As in, five minutes, all good, stay tuned. I’ll just wait if you arrive a little bit later. But, there is no such thing as common sense. You have to kind of program for that. So we’ve had to, and anybody building any agent has to, kind of build some sort of reasoning engine within their very confined universe.
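The point that common sense has to be programmed can be sketched as follows. This is a toy illustration, not x.ai's reasoning engine: the `Understood` type, the intent name, the action strings, and especially the ten-minute threshold are hypothetical assumptions. The transcript only says that five minutes late probably calls for no action.

```python
from dataclasses import dataclass


@dataclass
class Understood:
    """Assumed output of the language-understanding step."""
    intent: str            # e.g. "running_late"
    minutes_late: int = 0


# Hypothetical cutoff; the transcript gives no actual number beyond
# "five minutes" being fine.
NOTIFY_THRESHOLD_MINUTES = 10


def decide(msg: Understood) -> str:
    """Map an understood message to an action within a scheduling-only universe."""
    if msg.intent == "running_late":
        if msg.minutes_late <= NOTIFY_THRESHOLD_MINUTES:
            return "do_nothing"
        return "notify_participants"
    # Anything outside the scheduling universe is explicitly not handled.
    return "out_of_scope"


print(decide(Understood("running_late", minutes_late=5)))
print(decide(Understood("running_late", minutes_late=30)))
print(decide(Understood("football_opinion")))
```

Every rule a person would apply without thinking has to appear somewhere as an explicit branch like these, which is why the universe is kept so narrow.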
Jeremy Levy: Meaning, in the context of scheduling, we can make certain assumptions about what people mean when they say five minutes late.
Dennis Mortensen: So, we have no aspiration, or are not as naive to believe that we can build a human-like agent here. What we can build perhaps, if we’re good, is some agent that can exist within a meeting scheduling universe. That means, if you email Amy and say, “Hey, you know what? I actually think Chelsea is going to win the Premier League this year.” She will have no idea. They will, but she will have no idea. But, if you do something within her universe, she’ll understand everything about it.
Andrew Weinreich: When you say that you have no aspiration of expanding beyond scheduling, I assume you mean in the next couple years. Because I’m guessing that at some point, this idea of the challenge of language and establishing context, or understanding context regardless of the topic is a solvable problem.
Dennis Mortensen: I’m not really a huge fan of people trying to do seven things. Especially if you’re a startup, you can barely do one thing. So, being kind of half-assed at seven things doesn’t really get you far.
Andrew Weinreich: I got it for you, but we hear from Elon Musk that we have to be afraid of artificial intelligence. So I’m asking you, are you saying that the challenges associated with language and understanding context, are such that, we’ll never solve this problem? It’s a 20-year problem? When I say we, I’m talking about understanding language, notwithstanding that you are operating in one vertical or there is not the context for knowing what vertical you’re in.
Dennis Mortensen: I am certainly not convinced that we will get some sort of press release from the Googles of the world in five years that will say, “Hey, tada, we’ve solved it. We now have an agent for where any question you have, we’ll have an answer. Any job which you can imagine, we’ll be able to do that job.” As in, I cannot, even in my wildest dreams get to that end state. What I do think will happen though, is that you’ll see thousands of these highly verticalized, highly specialized agents that can do one job, but do that one job really, really well.
Jeremy Levy: So, does that mean I have to have an agent for every specific nuanced task that I want to get accomplished?
Dennis Mortensen: Given that I don’t believe in the single agent scenario, that this oracle will arrive one day. As in I can’t play out that scenario. I could be wrong, but it doesn’t ring true to me. What I think will happen is, which you allude to here, but not in the kind of setting for where, “Hey, Dennis. I don’t want to hire 22 single individual agents and try to figure out how to work them.” I do think you’ll end up doing just that though. As in, I think you’ve done it already. So, pick up your iPhone and look at it. What you did is that, at least on average, people have a little bit short of 100 apps on their phone. That is you picking, really, just within the existing UI paradigm, about 100 apps that you need to go about your day. I need this to search, this to email, this to calendar-
Andrew Weinreich: But we’ll look back, you know, 10 years from now, we’ll look back, and we’ll say, “That was the craziest experience in the world, right?” That in order for me to figure out the ferry that I was going to take here, I had to look at the ferry app. In order for me to look at the calendar, or my schedule, I have to look at the calendar app. So, I think we’re trying to understand your perspective. Is there a router in the future? Is that Siri? Is that Google Home? What does the dynamic of the future look like? This doesn’t strike me as sustainable, that I have 100 agents, all that develop this level of AI, and that perfect the intelligence, but, it’s a clumsy experience.
Dennis Mortensen: See, I think it’s going to play out exactly as you suggested, which is that we’ll end up with one core enabler AI, or you can call it a horizontal AI like Siri, Alexa, Cortana, and so on and so forth. And it will be your job to figure it out, which specialized agents do I want to employ, which my enabler AI, will then have to manage on my behalf. So that, when you say, “Hey, Siri. Can you get Dennis and I together the first week of February when I’m back in Manhattan, please?” Siri needs to understand that, that is a job for which I don’t have the skills to solve it. But, what I do know is that, we got somebody on payroll who knows how to solve that. Let me package the request, send it over to Amy. She can work it over the next day and a half. Once she’s solved it, she should then package it, and go back to Siri and say, “Hey, remember that thing you gave me on Monday? I now solved it. And you can go back to your boss and say so.” And that I find really interesting. I find it interesting not only because I think it provides a very similar setting to the kind of app economy, which we have today, that will allow startups like ours to exist. Because we are highly specialized in doing this kind of one thing and doing it really, really well. But what will change then is that, your responsibility as an employee, or just as a good knowledge worker is one for where you would have to pick who is my foreman, or my enabler AI? And which agents do I want to hire? How do I want to train them, and how do I make sure that they become better and better? Which suggests, at least in my version of the future here, some setting for where we all become managers. As in, you will have to learn how to manage agents.
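The enabler-and-specialists arrangement described above can be sketched roughly as follows. This is an assumption-laden toy: the `EnablerAI` class, the `hire` and `handle` methods, the keyword matching, and the stand-in `scheduling_agent` are all hypothetical and do not reflect how Siri, Alexa, or x.ai actually work.

```python
from typing import Callable, Dict, Optional

# A specialized agent is modeled as a function from a request to a result.
Agent = Callable[[str], str]


def scheduling_agent(request: str) -> str:
    # Stand-in for a specialized scheduler; the real negotiation is not modeled.
    return f"scheduled: {request}"


class EnablerAI:
    """Horizontal assistant that forwards jobs it cannot do itself."""

    def __init__(self) -> None:
        self.agents: Dict[str, Agent] = {}

    def hire(self, skill: str, agent: Agent) -> None:
        """Put a specialized agent 'on payroll' for one skill."""
        self.agents[skill] = agent

    def classify(self, request: str) -> Optional[str]:
        # Naive keyword matching, purely for illustration.
        lowered = request.lower()
        if any(word in lowered for word in ("meet", "together", "schedule")):
            return "scheduling"
        return None

    def handle(self, request: str) -> str:
        skill = self.classify(request)
        if skill is None or skill not in self.agents:
            return "no agent on payroll for this request"
        # Package the request and hand it to the specialist.
        return self.agents[skill](request)


assistant = EnablerAI()
assistant.hire("scheduling", scheduling_agent)
print(assistant.handle("Can you get Dennis and I together the first week of February?"))
print(assistant.handle("Order me a pizza"))
```

In this picture, the person's job is the `hire` step: choosing which specialists the enabler is allowed to delegate to.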
Andrew Weinreich: But how does that work? I mean, you’ve raised I think $60 million, is that right?
Dennis Mortensen: About 45.
Andrew Weinreich: $45 million. Most of the apps on my phone have not raised $45 million. In fact, I would say few have. And the level of sophistication that’s necessary to bring an AI application to dominate a vertical, calendaring in this case, seems extraordinary. So, I’m trying to figure out how that plays out. Siri is hitting the APIs of these different functional components. And how many companies will there be like you that are capable and properly capitalized to play a role in this emerging world?
Dennis Mortensen: If the App Store is any predictor for where today, you’ll see about two and a half million some odd apps in the Apple App Store, and a similar amount in Google’s store. I find it hard to believe-
Andrew Weinreich: Most of those were built with $20,000.
Dennis Mortensen: Correct. And I think the same will be true here. And some were built with hundreds of millions of dollars. As in, some particular games have hundreds of people working, years on end trying to bring that to market. And I think you will see the same type of power law applied to these agents. Where some will be extremely sophisticated. Some will be middle of the road. And some will just be two guys hacking away over the weekend on some free API, some predictors from AWS, and good for them. It will be so specialized, that it really only works for 800 people at some high school in Denmark. And they’ll be fine for that particular agent, it only really solves the, when you have a half hour free in your schedule in that high school in Copenhagen.
Jeremy Levy: You mentioned AWS for a second. What happened when the company was founded that allowed x.ai to happen now? How has the sort of AI technological ecosystem evolved, such that the opportunity for x.ai is now? And what’s going to enable what you just described, the App Store of individual specific agents, and make that easier in the future?
Dennis Mortensen: And by the way, you’re seeing this App Store happening right now. So, if you go to the Alexa ecosystem, what you see is that they have what they call a Skill Store, which is very similar to the narrative I just provided here. So, it’s not that I’m a single individual who’s super crazy. As in, Amazon thinks my version of the future is going to play out as well. But, in regards to the entrepreneurs and their access, I think you might have to, at least, in a simplistic way, look at AIs in two ways. AIs that live in some high-accuracy setting and AIs that live in some low-accuracy setting.
And it’s not a matter of you always want to be in a high-accuracy setting. As in, there’s plenty of applications where you don’t need high-accuracy. If you go engineer a self-driving car, you can’t have a footnote saying, “Hey, by the way, for every 1000 miles, we’re going to hit a pedestrian.” As in, even though that will be fantastic software — if you can drive a thousand miles in Manhattan and only hit one pedestrian, I’m pretty amazed, as in, that’s awesome. It’s probably not going to be commercially viable. So, that needs to be extremely accurate. We happen to be in the high-accuracy space as well for where, if I didn’t turn up today for this particular meeting, that’ll be just not cool. Not cool to the extent for where, we probably couldn’t exist as a commercial viable product in market.
But, there’s plenty of other applications where you don’t need that. Right now, take a picture of the four of us in here. Upload that to Facebook. And they might just only recognize two of us. That’s 50 percent accuracy. That’s low, as in, that’s hitting a pedestrian for every 100 meters. But, it doesn’t matter. It’s one of those where like, “Oh, nice. Thanks, Facebook.” You might even just go click the other two faces. You might not- as in, that’s a low accuracy thing for where you’re actually fine with that. And you need to probably pick one of those two, or understand which one you land in with the product you’re trying to bring to market. I think there’ll be plenty of low-accuracy products for where, you and me tonight could spin up a set of instances on AWS, collect a little bit of data, use their set of prepackaged services-
Jeremy Levy: Is that the enabler though? The fact that we can sort of spin up, compute power sort of at a whim? Is there software that is available now publicly that wasn’t previously available? What are the building blocks that have allowed x.ai to exist today? Whereas we had conversational interfaces on IM, you know, 12 years ago, 15 years ago, 20 years ago, I think.
Dennis Mortensen: Taking a few steps back, I think what people perhaps don’t fully appreciate and what we kind of left out of this kind of half-technical discussion is one for where the very first challenge is one for where, if you want to go attack a vertical, you need to be able to walk up to the white board and describe that universe in full. Not part of it. Not most of it. So, if you don’t describe it in full, how can you ever create an agent that can navigate that space?
Andrew Weinreich: So, give us another vertical. Give us another small example of a universe described briefly in full, other than calendaring.
Dennis Mortensen: Business travel for direct flights, something for where, I do what, 20 trips a year, I want a fair price but I’m not price sensitive and I really just want to go from here to San Francisco, and I need to be there for my meeting at 6 o’clock. End of request. Who’s the agent? Me. That means I go to Expedia. I go do all the searches. I pick JetBlue, or United, or whatever. I don’t have any loyalty. And then, I find the best time. I then go book it. But really, there was a time when we had human agents, called travel agents. And I could say, “Hey, can you get me to San Francisco next Thursday by 3 PM so I can be at my meeting at 4 for less than $800, please.” And then they would just make it happen. Or doing my receipts. There was a time for where, you wouldn’t be sitting in your desk kind of half crying, figuring out how to get $5.40 reimbursed from Pret, now, you might take little pictures, but you’re still doing it. There was a time some people just take all the receipts, give it to somebody in the office. They would sort through it. So, there’s plenty of those. But, you need to be able to map out that universe in full. As in, what exactly do I mean when I say meeting scheduling? The good thing about my particular vertical is that, we are almost in instant agreement about what that means. When we say business travel, direct flights, we are certainly quite close. When we talk about receipts being reimbursed, we might have different opinions. Equally good opinions about what it is, but different opinions. But, you need to be able to map that out.
Andrew Weinreich: But we shouldn’t have different opinions, right? The IRS prescribes, I mean [Dennis laughs] — presumably, you should be able to incorporate some standard for that.
Dennis Mortensen: But, we might have different opinions about where the pain is embedded within this particular task. There are certain parts of it that I don’t care about. There are certain parts that I really care about. And the first thing, once you’ve mapped that out, then you need to figure out, now that I know the universe that my agent needs to exist in, what data can I then collect so I can create some model, which can kind of replicate decisions in this space? Where does that data come from? How are you going to collect it? And how are you going to label it? And those two kind of baby steps turn out to be perhaps, really the barrier to entry for anybody who wants to do anything in the AI space. And they sound so unsexy that we barely want to talk about it. But, that is really where I think you win or lose. Can you map out your space and can you collect some data that represents all agents navigating around in this space? We spent, just to give you some sort of scope here, four years with the majority of the team, by the way, just labeling data. It is so unsexy that I want to cry.
Andrew Weinreich: What does that mean, labeling data?
Dennis Mortensen: Labeling data means that you shoot Amy an email saying, “Hey, you know what? I’m going to be running five minutes late.” That’s just text, unstructured data, that doesn’t mean anything. For that to have any value, we need to label it. As in, you could label it for an intent called ‘running late.’ It could be labeled for ‘reschedule.’ If you said, “You know what? I don’t think I’m going to make it. Perhaps we can do early next week.” Then I need to label for the reschedule intent. But I also need to kind of label for some temporal data because you said early next week. And all of that data needs to be labeled, so that becomes understanding so that hopefully in the not too distant future, you can make a decision that a human would otherwise have made by reading this text.
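To make the labeling step concrete, here is a minimal sketch of what labeled versions of those two example emails might look like. The class names, the label names (`running_late`, `reschedule`, `temporal`), and the character-offset scheme are assumptions for illustration only; the interview does not describe x.ai's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """A labeled stretch of text, e.g. a temporal expression (hypothetical schema)."""
    label: str
    text: str
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


@dataclass
class LabeledEmail:
    """Raw email text plus the labels a human annotator attached to it."""
    text: str
    intents: list[str]
    spans: list[Span] = field(default_factory=list)


def label_span(text: str, phrase: str, label: str) -> Span:
    """Locate `phrase` in `text` and wrap it as a labeled span."""
    start = text.index(phrase)
    return Span(label, phrase, start, start + len(phrase))


# Example 1: a single intent, no further structure labeled.
late = LabeledEmail(
    text="Hey, you know what? I'm going to be running five minutes late.",
    intents=["running_late"],
)

# Example 2: an intent plus the temporal phrase the sender used.
msg = "You know what? I don't think I'm going to make it. Perhaps we can do early next week."
reschedule = LabeledEmail(
    text=msg,
    intents=["reschedule"],
    spans=[label_span(msg, "early next week", "temporal")],
)
```

The raw text alone carries no structure; the intent and span annotations are what a model could later be trained against.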
Andrew Weinreich: Are you labeling that data relative to me? Or you’re labeling that data to try to understand an entire new dictionary of-?
Dennis Mortensen: Entire new dictionary. Good question. The reason that we are not labeling it distinctly to you is that, for meetings, you might, if you are as aggressive as me, do a thousand meetings a year.
Andrew Weinreich: Well, I actually meant more, if I tell you I’m five minutes late, what I really mean is I’m 10.
Dennis Mortensen: Yes.
Andrew Weinreich: Otherwise I wouldn’t write you, right?
Dennis Mortensen: Yes.
Andrew Weinreich: I would just show up I’m five minutes late. So, I’m trying to understand when we talk about AI and calendaring, it would seem to me that the words people use mean something different relative to the person who’s using them.
Dennis Mortensen: So, the way we thought about this is certainly one for where we think the data is just too sparse if we go down to the individual. As in, you’re not running late so often, that I would really have a large data set on you running late. You might be running late 40 times over the next year. I’m just picking a random number here. That’s 40 data points. There’s not much modeling you’re going to do in 40 data points. So, we have to do it network-wide. However, what we are very likely to do is when we move this to not a new language, but a new culture, because we might say this should exist in Japanese or Italian, it’s not so much the fact that we need to exist within a new language, it’s that we now exist within a new culture for where there’s a different understanding of running late in Italy or Japan, I’m most sure. Well I can certainly, I’ll give you a funny story here, which is true and it’s a feature that we used not to have. So, where I’m from, Northern Europe, we don’t double or triple confirm meetings, which is very American. So, this idea that we set up a meeting for March 10th. Then three weeks prior, “Hey, Dennis. Just checking in. I’ll see you on March 10th.” Then the day before, “Hey, Dennis. I’ll see you tomorrow at 1:00.”
Andrew Weinreich: I’m assuming you’re going to stand me up. That’s the American culture.
Dennis Mortensen: Here’s the funny thing. For every one of those I get I’m thinking, “Yeah, I know. It’s on my f***ing calendar.” And the day before, “I know. This is the third time you emailed me on this meeting.” As in, you and me could set up a meeting for December 10th this year. I wouldn’t call you. I wouldn’t email. I wouldn’t do nothing. I’d just turn up at your office on December 10th as we’ve agreed. That’s part of my culture. We though have engineered it — and I’m the only Danish dude around the office, so it doesn’t matter what my culture is. So, we’ve had to just live in this environment where people do double and triple confirm. So, Amy does the same now. As in, she will, if we set up the meeting way back, go out and do that very American thing for where, “Hey, guys. I know we set this up some time ago. Just confirming everything is set for tomorrow. I assume it’s all good. If not, do let me know.” And that is something where we’ve had to engineer part of the reasoning engine to adapt to the culture, which kind of brings us back to where you started, that we can’t do it on the individual. We probably can do it on the culture.
Jeremy Levy: Are those the biggest challenges that you have left for x.ai?
Dennis Mortensen: I would say coming back to full circle where you and I started, I named two of the three challenges. I said there’s the NLU challenge, for where we need to understand what is being asked of us. We need a reasoning engine to take the action. And lastly, we need the NLG engine, some natural language generation where I can write back because the reasoning engine will give me sort of a computational outcome which I need to turn into language so that my human constituents can understand what’s going on. The real challenge honestly, is not on the natural language generation end. That we’ve solved in full. We’ll continue to perfect it. The reasoning engine, the intelligence that Amy has today is at such a level for where she can operate in a fully autonomous setting, doesn’t need any kind of guidance from humans. On the understanding end, we are still and might even forever fight that. We’re just willing today, if we don’t understand, to send out kind of like a Siri-like message, which is when Siri doesn’t understand, she’ll do the whole, “I don’t have the answer, but here’s 10 pages from the internet that you might want to look at,” which is-
Jeremy Levy: Worthless.
Dennis Mortensen: Yeah, not good. We have something similar which is that, “I didn’t really understand that. Could you please clarify?” I’d obviously, rather be in a setting where I never had to send out that email. But given that we are in this kind of fully kind of mechanical setting, for where we need the machine to just do the task on its own, that is where we keep fighting. And really what we keep fighting here is human ambiguity. And I keep telling this anybody who’s willing to listen, if humans were just honest and not lying all the time, the whole thing would just be so much easier. But they are not honest. They lie all the time. They’re f***ing crazy. But we can’t change that. We have to live within that universe where they do exist.
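The three stages described here (understanding, reasoning, generation) and the clarification fallback can be sketched roughly as follows. This is a toy illustration, not x.ai's implementation: the keyword matching stands in for a trained model, and the function names, intent names, action names, and the 0.8 confidence threshold are all assumptions.

```python
from dataclasses import dataclass

CLARIFY = "I didn't really understand that. Could you please clarify?"


@dataclass
class Understanding:
    intent: str
    confidence: float  # 0.0 to 1.0


def understand(email: str) -> Understanding:
    """Toy NLU: keyword matching standing in for a trained model."""
    text = email.lower()
    if "running" in text and "late" in text:
        return Understanding("running_late", 0.95)
    if "next week" in text or "reschedule" in text:
        return Understanding("reschedule", 0.90)
    return Understanding("unknown", 0.10)


def reason(u: Understanding) -> str:
    """Toy reasoning engine: map an understood intent to an action name."""
    actions = {
        "running_late": "notify_participants",
        "reschedule": "propose_new_times",
    }
    return actions[u.intent]


def generate(action: str) -> str:
    """Toy NLG: turn the chosen action into a sentence for the humans involved."""
    replies = {
        "notify_participants": "Heads up: the host is running a few minutes late.",
        "propose_new_times": "Let's find a new time. Here are a few options.",
    }
    return replies[action]


def handle(email: str, threshold: float = 0.8) -> str:
    """Run the pipeline, falling back to a clarification request when unsure."""
    u = understand(email)
    if u.confidence < threshold:
        return CLARIFY
    return generate(reason(u))
```

With these toy rules, `handle("I'm running five minutes late")` produces the notification sentence, while an email the understanding step cannot place falls through to the clarification request.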
I’ll give you just a simple example here. Say, we’ve been talking and we’ve been super busy throughout the day. So, tonight at 1:00 AM you email me saying, “Dennis, I think I figured it out. How about we get together first thing tomorrow morning and hash things out?” Happens all the time, right? It’s not true though. It’s not what you want. At 1:00 AM, you don’t want tomorrow. You want today. It’s just that humans use the word tomorrow all the way up to the point where they go to bed, not midnight. Now, the machine has to do two things. One, be super confident at high accuracy, that I picked all the data up as it were. Two, equally confident and say, “I know what you said, but it’s not what you mean. So, I’m going to change it for you.” That is dangerous territory to be in. And it’s not this one here. It could be “I won’t make the meeting tomorrow,” but you have two meetings set up. Which one do I cancel? What negative ramifications are going to come along with me cancelling the wrong meeting? As in, people might be traveling in for this meeting. And that is just really difficult. We’re working it though, but really difficult. And to answer your question, human ambiguity and the fact that people are crazy. But we have to live with that. And all agents have to kind of live with the fact that that is just how humans are.
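The 1:00 AM "tomorrow" case can be sketched as a small date-resolution rule. This is a simplified illustration under stated assumptions: the function name and the 4 AM rollover hour are hypothetical, time zones are ignored, and a real system would also need the confidence check described above before overriding what the sender literally wrote.

```python
from datetime import date, datetime, timedelta


def resolve_tomorrow(sent_at: datetime, day_rollover_hour: int = 4) -> date:
    """Return the calendar date the sender most likely means by "tomorrow".

    Messages sent after midnight but before `day_rollover_hour` are treated
    as still belonging to the previous day, so "tomorrow" resolves to the
    date the message was sent. The rollover hour is an assumed stand-in for
    "the point where people go to bed".
    """
    if sent_at.hour < day_rollover_hour:
        return sent_at.date()
    return sent_at.date() + timedelta(days=1)


# An email sent at 1:00 AM on March 10 saying "first thing tomorrow morning"
# resolves to March 10, not March 11.
late_night = resolve_tomorrow(datetime(2017, 3, 10, 1, 0))

# The same phrase sent at 11:00 PM on March 9 also resolves to March 10.
evening = resolve_tomorrow(datetime(2017, 3, 9, 23, 0))
```

Both messages land on the same morning, which is the behavior the sender wants even though only one of them matches the literal calendar meaning of "tomorrow".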
Jeremy Levy: Maybe humans need to learn a way to interact with agents.
Dennis Mortensen: I hear you, and I got mixed emotions on that. On bad days for where I get punched in the face on Amy having made some sort of mistake, I want to go back and say, “How about you man up a little bit, and really, just speak like a machine.”
Jeremy Levy: Be nice to your AI.
Dennis Mortensen: Yeah. Be nice to your AI, right? And then, on other days, I have this idea that I don’t think it’s really my job to change my mom, who really just wants for this agent to go deliver some bagels for tomorrow morning. “Hey, why didn’t I get those bagels? Why are they coming kind of the day after where I don’t need any bagels. I’m not even home now.” I got mixed emotions, so I go round in circles here. I would hope though that we as an industry end up in a setting for where we don’t have to change humans too much.
Andrew Weinreich: Are you integrated into Alexa through the Skills?
Dennis Mortensen: We are not. The two dimensions we are going to expand on is one into other languages, so, we can attack more markets, and into other communication channels, such as Alexa. First one we’re going to do though is Slack. Just because Alexa is sexy and it’s a version of the future, kind of Star Trek levels, but it’s not really where you set up meetings. So, what we’ve seen is that almost all of your external meetings happened over email, almost all of your internal meetings could happen over a platform like Slack.
Andrew Weinreich: But Slack is still very much a corporate product.
Dennis Mortensen: Correct.
Andrew Weinreich: You know it’s funny. With my Alexa, I asked my Alexa in the morning what do I have today, because I’m integrated with my Google Calendar. And I love the idea of scheduling meetings from my Alexa. But, you don’t see that as- and I loved your vision that Alexa would be this gatekeeper for all these requests. How far off are we from that? I mean I could imagine, I’m going through the examples you gave: “Alexa, can you schedule these meetings?” “Alexa, I’ve got a trip coming up.” In fact, we wouldn’t even want to ask Alexa. Alexa just knew I had a trip coming up and routed the question. How far are we from that vision?
Dennis Mortensen: I don’t think we are very far away from that vision. So, we’ve certainly tried to engineer our agent in such a way that it is not an English email agent, but is an agent that can schedule meetings that happens to exist on email and only be able to speak English today.
Andrew Weinreich: But your vision of Alexa routing, how far are we from-?
Dennis Mortensen: That is so close that I might even suggest that it’s happening as we speak on some very simple agents. And that means when I want to reorder my Seamless Thai setting, I use Alexa. But that’s because I’m willing to go through something which is a little bit clunky, just for the fun of it. But it’s not so clunky that it is not useful. So, I use Alexa for where I open up the Seamless Skill and I reordered the same set of Thai food with me and [inaudible] gets once every two weeks. And that works quite well. And that kind of brings me one baby step into your version of a future.
Andrew Weinreich: It’s your version of your future, but x.ai, Alexa, am I one year away? Two years away?
Dennis Mortensen: In the not too distant future. So, we have grand aspirations of you being able to wake up Amy and/or Andrew on any communication channel. As in, if you talk about meetings, the next thing you should think about is I need to wake up Amy because this is not my job. And if you get an email, you reply back, CC her in. And if you get a Slack message, you ‘at’ her and she makes sure that she takes it over. If you get a text message from one of your daughters, you add in her number. And if you’re in an office or at home, you should be able just to wake her up on Alexa as well. The way we prioritize those is really just by how many meeting requests or moments do you end up talking about meetings in that particular channel. We’ve seen that, if we ask our customers what they ask about the most is: one, Slack; two, text messages.
Andrew Weinreich: Tell us quickly just about the limitations of a solution like Calendly and then tell us whether there are others like a company like Fin that have a contextless approach to taking on all verticals at the same time.
Dennis Mortensen: So, there’s a whole pool of solutions like Calendly out there and we’ve seen them before and Calendly is just a nicer version of Tungle.me, really. And there’s nothing wrong with that, and in many scenarios for where everything is normal, they might even work just fine. I just don’t like the idea that for some meetings, I can use this solution. For others, I cannot. I like the idea that I reached a moment in my life for where setting up meetings is no longer a job I do. And that means that many of these kind of fixed web interface type settings break if things are out of the ordinary. How do you tell a web interface that you’re running late? How do you tell a web interface that I’m going to bring my colleague? How do you tell the web interface that I want to extend it half an hour? How do you tell a web interface that I think we need to change the location? How do you do really all those things that are just very human? But it doesn’t mean that it does not work. It works when things are normal and within the confinements of what a web interface can do. I just don’t think that is what I need. I don’t need to alleviate the pain. I need it to disappear. And suddenly, our bet is one for where we reached that moment for where an agent can be built for where it can disappear in total.
Andrew Weinreich: Just to close, tell us about whether there’s this opposing view to the one you articulated before, that a company like Fin, or maybe there are others that have sort of a contextless approach we can solve all types of problems?
Dennis Mortensen: Again, I don’t think there’s anything wrong with that. I’m just more a fan of finding certain verticals for where at least, there’s this possibility for where you can solve it in full. Smart people have taken the approach that it might not be solvable. As in, we’ll attack multiple verticals and they’re not machine solvable. We’ll apply machine learning and make it more efficient. And then, we’ll have humans kind of sit back, and verify, and participate, but that’s really just outsourcing V2. And again, you can create a $20 million business on that. And you should do so. And we’ve seen that many times over. And I think it can create a great service. But, what you see is whenever you apply humans into the equation, it becomes a luxury. As in, as far as I believe, Fin charges $60 an hour, and they’ll take 10, 15 minutes to set up a meeting. And all of a sudden, it can’t be for everybody. It can be for people for where, I’m willing to pay that and my invoice now is $700, and you’re okay with that.
But I really like this idea that me giving you an email, if we end up working together, you won’t see that as a luxury. That seems like “Why wouldn’t you?” As in, I’m going to get an email, a calendar, a laptop, a keycard to the office. As in, that’s just necessities to do a reasonably decent job. I want Amy and Andrew and this whole idea of setting up meetings — it’s not a luxury. That’s just a given. As in, why would I hire somebody at a six figure salary and then say, “Hey, by the way, I want you to do email pingpong at the office trying to set up meetings with our customers, or leads, or prospects, or whatever that might be, candidates, and so on and so forth.” No, I don’t. If you’re a recruiter, I want you to speak to our candidates. If you’re a sales person, I want you to speak to our leads. If you are an account manager, I want you to speak to our customers. The setting up the meetings doesn’t add any value.
Andrew Weinreich: Dennis, before we break, any numbers you can give us about the number of people using x.ai?
Dennis Mortensen: Those aren’t public just yet but I can certainly tell you that we are setting up hundreds of thousands of meetings. But the funny thing is, even as proud as I am of hundreds of thousands of meetings as in real people are meeting as we speak because Amy and Andrew kind of set up the meeting, it is still us having pretty much zero percent penetration. So, there’s about 10 billion formal meetings being set up in the US alone every year. Meaning that if I do a million meetings, I’m at zero percent. As in, that is awesome, isn’t it? There’s just so much to attack here.
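The "zero percent" remark follows from simple arithmetic on the two figures quoted above. The variable names below are illustrative, and both numbers are the interview's round figures rather than measured data.

```python
# Figures as quoted in the interview (round numbers, not measured data).
meetings_scheduled = 1_000_000
us_formal_meetings_per_year = 10_000_000_000

# Share of the market expressed as a percentage.
share_percent = 100 * meetings_scheduled / us_formal_meetings_per_year

print(f"{share_percent:.2f}%")  # prints 0.01%
```

A million meetings is one hundredth of one percent of the quoted total, which rounds to zero at whole-percent precision.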
Andrew Weinreich: Dennis, thank you.
Jeremy Levy: Thank you so much. Really appreciate it.
Dennis Mortensen: Cheers.