Dashboard Mechanisms for Online Marketplaces

>>Hi, welcome everyone. It's my great pleasure to welcome back Jason Hartline to Microsoft Research. Jason was part of Microsoft Research a long time ago in the Silicon Valley lab, and then he went on to Northwestern to be a faculty member there. He is an associate professor there for exactly two more days.>>[inaudible].>>Yeah. So this is the last talk as an associate professor. He's going to be a full professor starting September 1st. So Jason gave an earlier version of this talk here. I think it's been a couple of years. But this new and improved version is much better. So take it away, Jason.>>Thank you. So this is joint work with Aleck Johnsen, who's a PhD student at Northwestern, Denis Nekipelov, who's an econometrician at the University of Virginia, and Onno Zoeter, who's the head of Data Science at Booking.com. The four of us were together for one amazing summer in
Amsterdam at Booking.com for a couple of months thinking
about how to improve online marketplaces
with special attention to the one that
Booking.com was running. Three years later we actually
understood what’s going on. So that’s what I’m going to
be telling you about today. So this paper is called dashboard mechanisms for
online marketplaces. So I’m interested in
online marketplaces and you probably have a bunch in
mind that you’re thinking about, and you’re thinking
about the right kind. What I mean by dashboards is things like this plot I have behind this title slide, where maybe one of the hotels on Booking.com can see, if they increased the commission that they're willing to pay Booking.com, how much their visibility on the Booking.com website would change. So this gives an estimate of what might happen. Of course, Booking.com can't know the future for sure, but they can make a prediction that could help the hotels know how to optimize their strategies on the platform. So the talk is about, if you're designing an online marketplace and you're providing dashboards to the participants to better optimize in the marketplace, what's the interplay between these? From a theory point of view, how do we design these things to work well together? Cool. So one [inaudible]
online marketplaces is the following scenario, which describes many of them. There are short-lived users. They get matched with strategic long-lived agents. I guess people in this room might have thought about ad auctions a lot. So the long-lived agents are the advertisers, and the short-lived users are people showing up to do a search; they maybe see some ads and click on them, maybe they don't. Then they go away and they don't come back until later when they have to do another search. But the advertisers are there the whole time, and they're bidding and changing their bids, etc., and they're in for the long haul. There's going to be some
matching algorithm, which is going to prioritize
the agents but eventually the users are going to select
which agents to match with. So a key point here is that actually you don’t get to
design the exact matching, you get to prioritize, and then something
is going to happen. You don’t necessarily know
what’s going to happen because part of what’s happening
comes from the user behavior. In this talk, I’m going to
care mostly about the agents. I’m going to put a big box around
the users and the algorithm, and call that the algorithm. But what that means is the
algorithm is not something I know, it’s going to be something
that’s happening endogenously as people show up on the site. Some of my favorite
examples of these are ad auctions, Booking.com, eBay, etc. On eBay, it's the sellers that are long-lived agents, and the buyers who show up are the users. Cool.>>One question. So does eBay have some commission that sellers can pay more of to eBay?>>So when you're thinking about how you design these systems, typically, you want to understand how the agents can tune all the knobs they have at their disposal. So on Booking.com, I talked about commission, but also on Booking.com they can change the price that they're selling their hotel room for, and the same with eBay. On eBay they actually can't change their commission, but they can change their price, and so price is their knob. So it's just a question of which knob you're talking about. So I want to briefly contrast designing marketplaces with the algorithm design we usually do. So in algorithm design,
you’re given inputs, which may be the agents
values for stuff. You want to determine a good output. For instance, which
agents to prioritize and my goal is good welfare
given the input. So I’m going to for this talk assume that our engineers have
built a great matching algorithm. We’ve got a good job of doing
this and we know how to optimize welfare if just algorithmically. The change that I want to make is consider the fact that we
actually have these agents strategizing and they’re
going to be tuning their parameters to try to get the best outcome from
the marketplace. The goal is that we are going
to achieve good welfare in equilibrium when all the agents
are tuning the parameters. So there’s been a lot of
really excellent theory in understanding the
relationship between mechanism design and
algorithm design. In fact, there are reductions. Probably the most famous one is the VCG (Vickrey-Clarke-Groves) mechanism, which is a generalization of the second-price auction, which many people have heard of. There are other versions of these that people have studied but I'm not going to talk about. The main thing I want to say about these approaches is they carefully construct payments so that everyone's incentivized to tell their true values to the algorithm, and then you can just run the algorithm on these true values. With these carefully constructed payments, everything works well in dominant strategies. The main obstacle I want to point out is that I don't know of any online marketplaces that have truth-telling as an equilibrium or dominant strategy. So given that none of these
marketplaces do this, this whole approach is
not on the table for us. Unfortunately, if you look at the auction theory of how to
design auctions and marketplaces, the auction theory doesn’t say much about how to
design mechanisms that don’t have truth-telling
as equilibrium strategy. So I’ve been for many years really interested in how to
actually do this. It’s not easy but this paper is one of several papers in the
space attempting to do this. My main goal actually is to get
back essentially this result. But with mechanisms that don’t have truth-telling as
an equilibrium strategy, they have some other
strategies as equilibrium. So I would like non-truthful
mechanisms that are in some sense going to be equivalent to the sequential
truthful mechanisms. I’ve added the term sequential here because I’m imagining
of an online marketplace. So every user shows up, that’s a new running
of the mechanism. So we’re going to be running
this over many users, over many days, months,
years, whatever. So my goal is to get
back this result, but in the realistic scenario that we have non-truthful mechanisms
that we’re talking about. One of the most canonical families of non-truthful mechanisms is
Winner-pays-bid Mechanisms. The agents show up, they bid. We run an allocation algorithm
to map the bids to who wins, and then the winners are
going to pay their bids. So people who've thought about ad exchanges know ad exchanges are basically a winner-pays-bid model. eBay for the sellers is a winner-pays-bid model: you can post, say, a Buy It Now price and you get paid that when a user shows up and wants to buy at that price. The thing that you get to do here in the mechanism is choose step 2. Step 1, they show up and bid. Step 3, the winners pay their bids. Those are fixed. The question is: what algorithms in step 2 lead to good outcomes in equilibrium? One common rule you might think of, imagining a single-item auction, is just highest-bid-wins. Or if you have a more
complicated scenario, find the feasible set of
agents that maximizes your objective in terms of bids. All right. So just treat
bids like they were values and optimize that instead. So I’ve written down a
couple examples here. So for example: one item, two agents, winner-pays-bid highest-bid-wins mechanism. You can imagine values of 101 and 100, and you can imagine equilibrium bids; this is going to be an approximate Nash equilibrium. The 101 guy bids 100 and one cent, and the other guy bids 100. This guy can't really improve his bid very much. This is an approximate equilibrium and this is great, because the welfare we got was 101, the 101 guy won, and the optimal welfare is 101, and so we're happy. So this could be, for example, selling exclusive ad
space on some banner ad. Imagine though that
we’re selling maybe ad space that could be
exclusive or it could be subdivided into
multiple ad spaces, and so maybe I can do one bidder or I can have actually three bidders
doing a shared ad space. Imagine I have two exclusive agents and their values are 101 and 100, and then I've got three shared agents with values 99, 98, and 97, and imagine maybe I was running this as just an exclusive space before, and then I get my engineers to allow us to do multiple ads in the same space. So you can imagine this being the equilibrium, because we were doing the same thing as before and that was an equilibrium. Now notice that for any of these lower-value bidders, for them to get shown at all, they have to bid over 100, which they don't want to do. Now, together they might want to move to each bid over, say, 33; there would also be an equilibrium at those bids, but this is also an equilibrium, and this equilibrium is actually pretty bad. We got a welfare of 101, and the optimal welfare is almost three times that.
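As a quick check of the arithmetic in this example, here is a minimal sketch (hypothetical code, not from the talk) comparing the two outcomes under an allocation rule that picks the feasible set with the highest total bid:

```python
# Two feasible allocations: one exclusive ad, or three shared ads.
exclusive_values = [101, 100]
shared_values = [99, 98, 97]

bad_welfare = max(exclusive_values)   # 101: the exclusive bidder wins
opt_welfare = sum(shared_values)      # 294: the three shared bidders win

print(bad_welfare, opt_welfare, opt_welfare / bad_welfare)  # 101 294 ~2.91
```

No single shared bidder can fix this unilaterally: to flip the outcome, the shared bids would have to sum to more than 100, so a lone deviator would have to bid above their own value.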
So what I'm illustrating here is that not thinking carefully about the algorithm that maps bids to winners could result in mechanisms that have some really bad equilibria. There's a great paper by Dutting
and Kesselheim that shows that essentially in worst-case, these equilibria are
going to get this value, and they can be really bad. So the focus of my talk today is going to be how to design winner-pays-bid mechanisms, and I have two obstacles to overcome. The first obstacle is that if you’re a bidder in a
winner-pays-bid mechanism, it’s not obvious how you should bid. You have to really understand
what’s going on to be able to optimize your bid and
that’s complicated, and so that’s one challenge. Another challenge is as I just
showed you on the previous slide, the equilibria can be quite bad. So I’d like to overcome both
these challenges at once and the solution that I’m going to be proposing is using
bidding dashboards. So a bidding dashboard you can think of as just a price-quantity curve. If I give you a price-quantity curve, you can pick with your bid a point on that curve that you would hope to get. That's going to make the
bidding problem a lot easier, and the main question that I want to be thinking about is how should I redesign my mechanism understanding that the bidders are seeing these dashboards and
responding to them, and could I maybe improve
mechanisms based on that to get rid of say
the bad equilibria? So the questions I’m going to be answering during this talk
are: what dashboard do I want to publish to the bidders and what’s the mechanism I
should run that’s going to be aware of the dashboard is running that can help make sure
that all the equilibria are good? Cool. So I’m going to tell
you the second answer first. So here’s the basic
idea for dashboards. I’m going to say it informally, it’ll get more formal later. So the dashboard mechanism that I’m going to be
using is the following. I'm given a dashboard, and given the algorithm that the engineers came up with, which maps values to who should win. I'm going to publish the dashboard. I'm going to ask for bids. I'm going to assume that these bids are best responses to the dashboard. If the bids are best responses to the dashboard, I can invert the bids to determine what values were generating those best responses. Then I'm just going to execute the allocation algorithm on those values as if they were the true values. So I'm going to run the engineering solution on the values, and of course it's a winner-pays-bid mechanism, so whoever wins in the algorithm gets charged their bid afterwards. One thing that you should
note here is that so far, the dashboard is not even
connected to the algorithm. I say for any dashboard I
publish, I’m going to do this. So the key thing that it’s going to be the major endeavor of a talk is to say for what dashboards is
this mechanism a good mechanism? So if you’re an economist,
you’re thinking wow, this is going to be incredibly
horrible because I just published a dashboard and then we do something completely different from what
the dashboard we published was, and how could that possibly work out? So as I said, what I want to do is ask
the question in reverse. For what dashboard does this
mechanism actually a good mechanism? I want to point out that if the
agents do follow the dashboard, then the allocation we
produce is the right one. Because we correctly
infer the values and then we run the algorithm
on the correct values, and so we’re always allocating
the way the engineers wanted. The problem is, is that if we
publish the wrong dashboard, then the payments are going to incentivize people to
want to follow the dashboard and all things could explode in our
faces if we do this uncarefully. So to give an overview of the results we’re going
to be talking about today, I want to first recall the goal. The goal would be to
get back the results we had for truthful mechanisms but with these non-truthful
winner-pays-bid mechanisms. So I’d like to try to get sequential non-truthful
mechanisms that are approximately the
truthful mechanisms, and as we saw a second
ago with our fact, if people follow the dashboard, we’re producing the
right outcome allocation but we may be getting
the wrong payments. So the question we have now is how can we make sure that we
get the right payments? So the results are going
to bear the flavor of, in this situation, the payments
are basically going to be right. So to overview the results, I’m going to go through a series of constructions of dashboards and get more and more
complicated as we go. They’re going to satisfy stronger
and stronger properties. So the first result is going to be in a static environment with static values, imagining the engineer's algorithm is not changing and the values that the agents have are also not changing. So in a static environment, where we're playing the same stage game every time, following the dashboard converges to an equilibrium. So I'm going to give you a dashboard such that following the dashboard converges to an equilibrium. I'm then going to look at dynamic environments where, taking one agent's point of view, either the algorithm is changing or the inputs to the algorithm from the other agents are changing. But that agent's value
is not changing. So one agent has a value of say $10, it’s not changing over time, but the world is changing underneath it and things
are changing that way. So in this environment, I’m going to modify
my previous dashboard to make a stronger one such
that following the dashboard, the average payments you make
converge to the correct payments. Then the final thing I'm going to do is look at dynamic environments with dynamic values. So now, I'm imagining your value
could be changing every time. Maybe these users have cookies, and for different cookies, you have a different value for the users. In that environment, again, I'm going to make my dashboard a little bit more complicated to handle this more complicated environment. With that more complicated dashboard, I'm going to get approximate strategic equivalence with the sequential truthful mechanism. So we had a really nice theory on the very first slide saying, hey, you could just do these sequential [inaudible], for example. What I'm saying is, if you're required to do winner-pays-bid, like many marketplaces are, we're essentially going to get all the same equilibria as those. For example, the follow-the-dashboard equilibrium is going to correspond to the tell-the-truth equilibrium in a sequential truthful mechanism. Yes, please.>>I'm a bit lost. Can you give
a concrete example to keep in mind, like what is the dashboard and what it contains, and stuff like that? Like a concrete example.>>Again, I have to construct the dashboard. So why don't we start out with the engineer's solution. If you want, you can think of one of my favorite allocation rules, proportional values. I have two agents with values, and the engineers have decided what should happen is that Player 1 should win with probability v1 divided by v1 plus v2, as in the sketch below.
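Here is a minimal sketch of that rule (hypothetical code, not from the talk). Note that, fixing the other agent's value, agent 1's allocation rule is a smooth, invertible function of their own value:

```python
def proportional_alloc(values):
    # Each agent wins with probability proportional to their value.
    total = sum(values)
    return [v / total for v in values]

def agent1_rule(v1, v2):
    # Agent 1's allocation rule with agent 2's value fixed: x(v) = v / (v + v2).
    return v1 / (v1 + v2)

def agent1_rule_inverse(x, v2):
    # Unique inverse: the value that wins with probability x (for 0 <= x < 1).
    return x * v2 / (1.0 - x)
```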
because it’s nice and smooth, which means if I were to write out the allocation
rule and value space, I’ll get a smooth continuous
function which I can then think about converting
into a dashboard, which I give the person which
is the smooth function, and I want the function
dashboards to be smooth because I want to invert
them to get the values, and if it’s a smooth dashboard, then there’s a unique inverse to get the values and this is a setting
you can have in your mind.>>But the problem is the [inaudible]>>So setting one item. It could be R-space, it
could be something else and there are many agents,
so that’s the problem.>>Yeah.>>With the user
submit bids which may not be the true values to that goal.>>Yeah.>>That’s correct.>>So what is the dashboard
in this scenario?>>Yes Cisco, it’s like if you
bit this mat, [inaudible].>>So here’s the trick
of that we’re doing. So if we were to run an auction, there does exist a function. That is, what would have happened for any counterfactual
bid you can put in. What we’re going to
do is try to predict that beforehand, and
that’s the dashboard.>>But it also depends
on other people.>>Exactly. So the thing we predict might be wrong. If the thing that we predict is wrong, then the bid you put in isn't really an appropriate bid to put in. So all of this endeavor
is as I had in a previous slide saying if
you follow the dashboard, we correctly allocate because
we infer your value correctly. So we know what the values are, so we actually run the algorithm
on the correct values. But the payments could
be wrong because you bid according to
the dashboard and not according to the actual
thing that was happening. So what seems wrong is that those things might not be connected. So the thing that we're going to be doing is trying to construct dashboards that connect them in a way where things work out.>>So you want to make this invertible, and finally the truthfulness comes out.>>What if you're not [inaudible].>>You ask them to invert; these true values create some monotone function, so you just move on. So now it's like, if somebody follows the dashboard, it's as if they bid truthfully. So you know the values, you can find the allocation.>>The dashboard was just a
function that’s telling if you bid this much
probability, you’ll get it.>>Yeah.>>Yeah.>>So if they follow the dashboard, which means they do what we
think is best for themselves, so then from this best response bid, you can infer what the value is. So it’s like they’re telling
you what the value is. So then you can allocate
according to the value, but you’re restricted
in what can charge, because you can only
charge them to bid. But the correct charges
something else.>>There’s some unique
correct charge.>>So there is a discrepancy
between what [inaudible].>>Without telling them how
much you’ll charge them, how will they even
participate in the bid?>>So actually the point is we do tell them what we’re
going to charge them.>>Which is not the bid.>>No, it is the bid. So just remember, in a winner-pays-bid mechanism, that means if you bid five dollars and you win, you pay five dollars. Now, what I've told you is, here's the curve, and if you bid five dollars, you're going to win with this probability. Given that curve, bidding five dollars was good for you. But it could be that curve was wrong, and in real life there's a different curve, and then bidding five dollars wouldn't have been what you wanted to do with that new curve. So that would cause a disconnect in the incentives, and if that persisted long term, no one would want to follow the dashboard. So what we need to ensure is that even if there is a disconnect this round, somehow that gets corrected long-term in the whole system we're constructing. Does that make sense?>>Yeah, it makes sense.>>Cool. So here's an outline
of what we’re going to do. So I’m going to talk about single-agent mechanisms because single-agent mechanisms
are lot easier. Again, there was a question about what happens when other
people also bidding in it things change and that goes
away with single-agent mechanisms. So we understand some of the
geometry of incentives in a single agent case and when we go on to construct
and analyze dashboards, it’s going to be essentially based
on our single-agent analysis. I probably won’t have time for it, but if you want, you can check the paper for what I call single-called
dashboard mechanisms. Typically, in mechanism design, we assume that we can actually
run the algorithm on any input we want and so I could do the
counterfactual of seeing what would the algorithm
do if I changed the value. But if I’m really running one
of these online marketplaces, I can only see what would happen if I actually show these ads to the
users and see who clicks on what. So that’s the single-call model where you can only see the outcome
of the algorithm by running it. The dashboards I'm going to talk about will extend to that model, but I won't have time to talk about that today, and then we'll wrap up after that. So we'll focus on two and three. So looking at
single-agent mechanisms, understanding the geometry of these. So imagine I've got a single agent with some value v. A truthful mechanism for one agent is specified by an allocation rule x, which is the probability the agent wins as a function of the value, and a payment rule p, which is the expected payment they're going to make as a function of their value. A winner-pays-bid mechanism, on the other hand, and I'll put tildes over functions that take in bids, specifies an x tilde and a p tilde that define a bid allocation rule and a bid payment rule, mapping bids to probabilities of winning and bids to payments. Of course, if it's winner-pays-bid, then your payment is your bid times your probability of winning. So p tilde of b is equal to b times x tilde of b. I'll denote your bid strategy in this auction by b, and it's going to map your value to the
actual bid you make and you’ll notice here that I’ve
denoted in italic blue functions whereas realizations
of those functions I’ll denote in just regular font. So the revelation principle is a common workhorse in mechanism
design and it says the following. I’m going to use this
to talk about mapping between what happens in
winner-pays-bid mechanisms and what happens in
truthful mechanisms. If b is the optimal strategy for some winner-pays-bid mechanism (x tilde, p tilde), then (x, p) is a truthful mechanism, where x is x tilde applied to b and p is p tilde applied to b. Meaning, if I have a non-truthful mechanism where people play strategies b in it, and I put a box around the whole thing and say, input something into this thing, where I simulate this strategy and then put that into the thing, then the right thing to input to that is just your value. So that would be a truthful mechanism.
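In symbols (my rendering, in the notation above): if b is the optimal bid strategy for the winner-pays-bid mechanism (x tilde, p tilde), then

```latex
x(v) = \tilde{x}(b(v)), \qquad
p(v) = \tilde{p}(b(v)) = b(v)\,\tilde{x}(b(v))
```

defines a truthful mechanism (x, p).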
So this is going to be useful because of the following
characterization of what you can do with truthful mechanisms. Truthful mechanisms, the x and p, have to satisfy these two properties: x has to be monotonically non-decreasing, and p is defined as a specific function of x.
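Written out, for a single agent this is Myerson's characterization: x must be monotone non-decreasing, and the payment identity is

```latex
p(v) = v\,x(v) - \int_0^v x(z)\,dz + p(0).
```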
This is useful because, again, we're talking about non-truthful mechanisms. Once I put the b into the x tilde, I have to get out something that satisfies these properties as well, and so this also says something about the non-truthful mechanisms. Just to unpack this
formula a little bit, I like to do it in pictures. Your payment is equal to your value times the probability of winning, which is the surplus you generate, which I can depict by plotting the allocation rule; value times probability of winning is the area of this rectangle; minus the integral below the curve, which is that area. So the payment is the difference between those two, which is the area above the curve. The way we think about this is as follows: by allocating to an agent with value v with probability x(v), this is the surplus we generate, and we share that surplus, with the agent getting this utility and the mechanism designer getting this profit. Okay. Restating that,
again, on the slide. So what I now want to do is the following: suppose I have some desired single-agent allocation rule in value space that I got from the engineers, and I want to make a winner-pays-bid mechanism that has this outcome
when people strategize. So I just want to do my whole
thing but just for one agent. Okay. It’s actually
pretty easy. Okay.>>Can you just explain
this theorem again? I didn’t completely follow
it. Myerson’s theorem?>>Yeah.>>What is this that you’re clipping?>>Sorry, what?>>What’s the area under the curve? Why is it that you’re clipping?>>To me, I see that in hindsight knowing this is the
payment and this is a surplus, this is what the designer gets. So what’s left of the surplus. So remember what happens is the agent gets value
times its probability. So this is sort of how
much society is happier. But the agent pays that amount. So what are they left
with? This amount. Yeah, you wouldn’t know
there was the utility until you saw this theorem saying, "This part is the payment." Does that help, Samir?>>To some extent. Carry on.>>So this theorem says that a mechanism (x, p) for a single agent is truthful, meaning I want to report my value. It requires the allocation rule be monotone: the higher the value I report...>>The probability goes up.>>The probability doesn't go down.>>Okay.>>Then it says the payments actually have to satisfy a very specific property. This property is essentially keeping the first-order condition satisfied everywhere. But in pictures, it's this: the payment is that function. The integral is the area above the curve. Cool. So again, using the
revelation principle, I want to basically
use this theorem to think about what happens for
winner-pays-bid mechanisms. The key idea is, well, in winner-pays-bid mechanisms, it has to be that my
payment satisfies this when I put my strategy
into the allocation rule. But we know in a winner-pays-bid mechanism, you pay your bid when you win. So p, in a winner-pays-bid mechanism, is b times x of b. So that lets us solve for what b is. Okay. So that's the first thing I'm going to do. The bid must satisfy: bid is equal to p divided by x, because b times x is equal to p. Then I just plug in the formula and I get this. Okay. It turns out, this function is a nice monotone function when p(0), the payment of a person with value zero, is less than or equal to zero. So for now, let's just assume that's zero. So it's a nice monotone function. Well, that means I can construct the bid
allocation rule by saying, “If I get this bid, let’s invert it to get the value, and put it into the allocation rule.” Okay. So then what
we’re going to say is, if I just publish this bid
allocation rule and I ask for bids. Well, they’re going to optimize and they’re going to
bid this function. If they bid that function, then when I invert it
and put it into x, I actually implement the x of
the i I wanted to implement. Cool. So for a single-agent, things are pretty straightforward. So what I want to do is do this thing basically
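As a minimal sketch of this single-agent construction (hypothetical code, not from the talk; the allocation rule x is an assumption), the bid function is b(v) = (1/x(v)) times the integral of x from 0 to v, and tabulating (b(v), x(v)) pairs traces out the bid allocation rule, since b is monotone:

```python
import numpy as np

def bid_function(x, v, grid=1000):
    # b(v) = (1/x(v)) * integral_0^v x(z) dz, assuming p(0) = 0.
    zs = np.linspace(0.0, v, grid)
    return np.trapz([x(z) for z in zs], zs) / x(v)

def dashboard_curve(x, v_max, grid=50):
    # Since b is monotone, the pairs (b(v), x(v)) trace out the bid
    # allocation rule: x tilde = x composed with the inverse of b.
    vs = np.linspace(1e-6, v_max, grid)
    return [(bid_function(x, v), x(v)) for v in vs]

# Example: proportional allocation against a fixed opponent value of 10.
x = lambda v: v / (v + 10.0)
curve = dashboard_curve(x, v_max=20.0)
```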
for multiple agents. Here’s the challenge. The challenge is, if you
think of my allocation rule, my x, taking all of the
values that the agents have. Notice that the function
for just agent i, if I fix the other agent’s values, is going to be some function. But if I change these values, there’d be a different function. So how can I know what
the x that agent i has until I know the values
of the other agents? So basically, you’re
going to get stuck here in this inversion because
you don’t know the values. So it’s going to be hard to move
forward with multiple agents. So the main thing that we’re going
to do is, we’re saying, “No, we’re not going to look
for like fixed points of this would multiple agents”, which is a difficult thing
to try to think about. Instead, we’re going to look at the factoring sequentially
and we’re going to publish a similar
respondent dashboard and then just use something
completely different, and hope we can fix it later. Cool. So I did my single-agent
winner-pays-bid mechanism design. Now, with that geometry in mind, let’s take that back to the
dashboard problem and show you how we can design multi-agent
dashboard mechanisms. The main idea is really simple, probably obvious: why don't we look at the historical allocation rule and publish that as a dashboard? Hopefully, the past is a good prediction of the present. Okay. So given that, I'm going to infer the values as best responses to this historical dashboard. In a static environment, well, things are static, so the future is like the past, and so my guess was right. So everything converges to Nash equilibrium. Again, this is a main-ideas slide. So we'll see this in
more detail in a second. The next idea is going to be, well, if we're running a winner-pays-bid mechanism, there are going to be lots of rounds where the agent doesn't win at all. The allocation rules could be crazy in these rounds. Updating dashboards based on what happens in rounds where you don't allocate is going to potentially lead to funny outcomes. So we just update the dashboard when agents are allocated: when an agent is allocated, we update that agent's dashboard. That's going to lead to much more consistency in how we're thinking about things. So this is going to be actually a good idea. Then we're going to get the second main result from this, which is that payments converge to the truthful payments for static-value agents. So one agent with the same value, even if the environment's changing. This is going to hold if we only update the dashboard when the agent is allocated. Then the last idea, which is going to help us handle
the situation when values in the environment are
changing altogether is that, “Hey, look, we know
what allocation we ran, we know what dashboard we gave them, and we know how much it was incorrect by. So let's just keep track of how much we've overcharged or undercharged people and make sure that in the future, that's paid back." How do we pay that back? We're going to basically add it to the p0, the thing we had on the previous slide: the person with value zero, what did they pay? The idea is, essentially, add to that. It's actually not possible to just do that, which is why I put it in quotes and put an approximate equal, but you can essentially do almost the same thing. Cool. Okay. This, first of all, is going to give us the approximate strategic equivalence to a sequential truthful mechanism, because I'm basically saying, "I'm going to make sure that if your payments are
off in one stage, in some later stage,
you’re paying it back.” So if I average over all the stages, the payments are right on average. One of the things I think
is really cool about combining result three with result four is that here we see if your value doesn’t change,
everything is fine. So combined with this result, you actually only have to do funny rebalancing when
your value changes. So that’s the main
ideas in one slide. I now want to go
through more slowly and present each result and talk
about it more formally. Any comments or questions? All right. So here is the dynamic
model that I have in mind, that I haven’t formally
specified yet, but let’s go and specify it. So we’re going to run iterate
mechanism over T stages. I’ll index the stages by s. I’ll
have n agents in each stage, and I’ll denote the
valuation profile by indexing by agents
and also the stages. I have a stochastic
allocation algorithm which also could
change in every stage, which maps a profile of values to a profile of allocation probabilities. What's the chance
that each agent wins? The agents are going to
have linear utility, so your utility is
the sum of how much you got allocated in the auction minus the payments you’ve
made in the mechanism. I'm going to restrict the payment format to winner-pays-bid. So this is the full dynamic model. I'll sometimes talk about the static model. In this case, the values are the same in every round and the mechanism we're running is the same in every stage; the algorithm is the same. Again, the goal is a sequential winner-pays-bid mechanism that tries to implement the desired allocation of the engineers. I'm going to frequently be looking at just a single stage, and
within a single stage, I’m going to be looking at just
what happens for one agent. So when it’s going to be obvious, I’m going to drop the
superscript and subscript and just talk about what happens
to one agent in that stage. Cool. So formally
what is a Dashboard? A Dashboard is a profile
of bid allocation rules. For every agent, it tells that
agent as a function of their bid, what’s their probability of winning. So I’m going to denote those
by y’s and I’m going to distinguish dashboard quantities from the actual mechanism
I run as follows. So the mechanism has an
allocation rule in value space, and a payment rule in value space. It's got a bid allocation
rule which I can solve for, like we did with a
single agent analysis, and it also has the bid strategy I would take in that
bid allocation rule. So Dashboards have the
equivalent quantities, just for a different allocation rule y. I'm basically going to advance the letters: the payments are q, I'll put a tilde on y for the bid allocation rule, and then the bid strategy in the Dashboard is going to be c. So importantly, c is the bid strategy for y tilde and b is the bid strategy for x tilde. Cool. So here's my
dashboard mechanism. So the Dashboard mechanism for a given Dashboard and an algorithm is going to
publish the Dashboard, it's going to solicit bids, it's going to invert the bids by using the bid function for the Dashboard, which is c, to get inferred values, and then we're just going to run the engineer's algorithm on the inferred values to get some allocation. The payments are, of course, winner-pays-bid, so p_i is equal to b_i times x_i. Cool. So that's, formally, with the notation, everything we've been saying so far.
So the first dashboard I want to talk about is what I call the last-stage dashboard. It's really simple. In the last stage of the algorithm, we inferred some values. So we're in stage s, with some inferred values v-hat from stage s minus 1. The last stage's allocation rule in terms of all the values is this, but I want to project it onto just player i's value, fixing the other agents' values at what we observed. Then I let y, the dashboard for round s, be the allocation rule for round s minus 1, the realized thing that we did, and then I'm going to convert it into bid space by putting tildes on it and publish that Dashboard. So here's the static analysis. So I promised you a static analysis, which says if the
values aren’t changing and the allocation
rule is not changing, that we converge to
a Nash Equilibrium. So here’s the deal, in the first round, first stage, we’re going
to publish a Dashboard. We have no idea what’s
going to happen. So we publish some arbitrary monotone function, the bidders bid some way, we invert to get the values, we then run them, and we see what would have happened for these values. So assuming bidders follow the Dashboard, we know the values, and now in this round, we know what the allocation rule is for those values. In particular, if in the next round everyone follows the Dashboard, we publish a new Dashboard, the last-round Dashboard, which is now the correct dashboard for the true values. If everyone follows the Dashboard again, that's actually an equilibrium, because we infer for the other agents the same values we had in the last round, which means the allocation rule for agent i is the allocation rule we published in the Dashboard. So the Dashboard and the allocation are the same now, and responding to the Dashboard is now a Nash Equilibrium. So in two rounds we're at a Nash Equilibrium. If we run this forever, the amount you're off is just your
first stage incorrect payment, and so that vanishes on average
with the number of rounds. That was my easy result. So here is a slightly better
result but it’s also pretty easy. So instead of looking
at the last stage, I’m going to for each agent make their Dashboard be the Dashboard from the allocation rule of
the last stage that they won. So I didn’t write
complicated math for that. I thought that idea was
pretty straightforward. So here’s my second result. For the Last-winning-stage
Dashboard in a dynamic environment, then the average outcome following
the Dashboard for an agent with a static value converges to the average outcome of a
sequential truthful mechanism, meaning the payments are correct, because we already knew the
allocation was going to be right. So this is saying that the
payments are the right payments. So here’s the proof.
It’s very simple. Fix the value of the
agent and consider two successive stages
in which the agent won. Stage s_{k-1} is the last stage I won before, and I just won in this stage, s_k. Notice that the bid in stage s_k is the correct bid for stage s_{k-1}, because the Dashboard I published for stage s_k is equal to the allocation rule for s_{k-1}. So c of s_k is equal to b of s_{k-1}. This is the correct bid for round s_{k-1}, and this is the bid I make in round s_k, and because it's the Dashboard for that round, it's the correct bid for that round. Now, it's basically straightforward: look, I just sum over all the winning stages, and I see that I'm basically off by one in all of them. So I just charged you the right amount a little bit late. Look at the difference over these winning stages and you see that the total difference is the Dashboard bid for the first round, which was for a made-up dashboard, so of course it's wrong, minus the correct bid for the last round, for which I haven't published that Dashboard yet, so I haven't collected that correct bid yet. These are the differences of two bids in the auction. The bids are always less than the value. So the difference in the two bids is at most the value; the absolute value is at most the value. So I get this result, which means if I average over T rounds, the average per-round difference in the payment vanishes with the number of rounds.
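In symbols, the telescoping sum over the winning stages s_1 < ... < s_K looks like this (my reconstruction, using c^{s_k} = b^{s_{k-1}}):

```latex
\sum_{k=1}^{K}\big(c^{s_k}(v) - b^{s_k}(v)\big)
= \sum_{k=1}^{K}\big(b^{s_{k-1}}(v) - b^{s_k}(v)\big)
= b^{s_0}(v) - b^{s_K}(v),
```

and since bids lie between 0 and the value, the total payment error is at most v, so the per-round average is at most v/T.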
I’m going to be talking about, which is this payment
rebalancing idea. The idea is, if your values are
changing from round to round, things could be just incorrect, and I could never correct for them naturally using my
previous dashboard, so I would not be able
to correct for that.>>You already presented
two dashboards, so what’s the difference? Because both have [inaudible] I understand the difference
in the dashboards, but why is one more?>>The theorem is a stronger theorem. Each dashboard is going to come with a stronger theorem
about it working well. Okay. So the first dashboard
just uses the last round. That round says, “If everything
is static, it works well.” But that dashboard doesn’t work if things are changing
for other agents. This dashboard works as long as
one agent’s value is static, and it works for that agent. Okay. But I want a dashboard that works even if
that agent’s values are changing. Okay. So good. So here’s the thing. When the last two rounds don’t
have the same allocation rule, the dashboard is wrong. But actually, we know what
the allocation rules are, so we know exactly how
the dashboard is wrong, and so we can keep track of in fact, exactly how much the person
overpaid or underpaid. The idea is to keep track
of this in a balance, and have future dashboards include essentially a lump sum payment
to reduce the balance. So again, let’s fix
the stage, and drop i, and say we have inferred value v, and we implement allocation rule x. Then they're going to make some actual bid by following the dashboard, c, and we really wish they'd bid b instead; that would be the correct bid. So they overpaid by whether or not they won times the difference in these two bids. So the approach is, I'm going to add this payment residual, how much they overpaid, to some balance I'll call L. Then I'm going to adjust their dashboard by putting some of this balance into the q0 term, the payment that you make if your value is zero, when I construct the dashboard that I offer you. So this basically shifts the whole dashboard, so that you have to pay more if you owe me money, and you pay less if I owe you money. Actually, that's a
little bit of a lie because in my slide when I talked about having
q0 not equal to zero, I said it can’t be positive, and so we have to actually
do some more work here. It turns out that’s not
too difficult to do, but for the purpose of this
talk, I’m not going to do it. So let’s just assume we can
just throw it into a q0. Remember q is the payment
according to the dashboard; it's the payment identity for y. So here's the rebalancing dashboard that I'm going to give you. This construction is going to take an original dashboard that I wanted to use, and it's going to add some rebalancing to it to correct mispayments from previous rounds. So if you had a dashboard y you were using, and a rebalancing rate, which is how aggressive I want to be with rebalancing the balance, then I'll construct a new dashboard by taking the old payment identity for the original dashboard y, adding a lump sum q0 to it, and constructing my new payment that way. Then the bid function is going to be exactly that, but divided by y, because remember, we get from payments to bids by dividing by the allocation probability. One of the reasons why we include this Eta factor, for not asking for the whole balance at once, is because we're dividing by the allocation probability. If this is small, then we're making this a really big number, which could mean we actually overshoot. If they owed us money, now all of a sudden we owe them a lot more money. We don't want to have the balance exploding, so we're going to try to rebalance less of it, and you need to choose an Eta that's always going to be less than y, so that this number, Eta divided by y, is a number less than one. So that's how you choose the rate at which you try to rebalance.
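Here is a minimal sketch of the rebalancing construction for one agent (hypothetical code, not from the paper; it glosses over the detail that q0 can't actually be positive):

```python
import numpy as np

def rebalanced_bid_fn(y, balance, eta, grid=1000):
    # Payment identity for dashboard y plus a lump sum q0 = eta * balance:
    #   q(v) = eta*balance + v*y(v) - integral_0^v y(z) dz,  c(v) = q(v) / y(v).
    def c(v):
        zs = np.linspace(0.0, v, grid)
        q = eta * balance + v * y(v) - np.trapz([y(z) for z in zs], zs)
        return q / y(v)
    return c

def update_balance(balance, eta, won, actual_bid, correct_bid):
    # The lump sum pays down roughly eta * balance; this round's residual,
    # (actual bid - correct bid) when the agent wins, is added back in.
    return balance - eta * balance + won * (actual_bid - correct_bid)
```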
The analysis is actually pretty simple. There are three lemmas that you have to prove to understand this. You need to prove that the error in payment in every round is at most the value, and that's because it's the difference of two bids, and the difference of two bids is at most the value. So the error is between v and minus v. Then you can show that if you choose Eta correctly, less than y, that number you saw never overshoots. Then you can prove that the amount you rebalance is between L times Eta and L, meaning that you either do the full thing or you do at least L times Eta. Then you can say, if I create a balance in some round, how much of that balance is left at the end of time? Well, if you're shrinking it by a one-minus-Eta factor every time, then if you have k steps to the end of time, that's how much is left. Then you just sum this over all rounds and it telescopes, and you get that at stage t, the balance you have from all previous rounds is at most v divided by Eta, that rebalancing rate.
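In symbols, the bound on the outstanding balance is (my reconstruction from those three lemmas): each round contributes at most v, and at least an Eta fraction is paid down every round, so

```latex
|L^t| \;\le\; \sum_{s < t} v\,(1-\eta)^{\,t-s} \;\le\; \frac{v}{\eta}.
```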
>>Why don't we just choose Eta equal to one?>>Because if I choose Eta equal to one, I can overshoot the balance, because the y appears in the denominator. So if y is 0.01, then I'm basically asking you for L times 100 here in my additive term. So you owed me say $10, and I just made you pay me $1,000, and now I owe you. Now, I want to avoid that oscillation in the payments. So that's why I do this, carefully choosing the units so we don't overshoot.>>So you also said Eta
should be less than y of v? Why don't you just fix the rebalancing rate between zero and y of v, in the first line? [inaudible].>>So in practice, actually, what you could just do is set Eta equal to y0, the allocation probability of the zero value. So in practice, that's what you would do. There's a trade-off in which Eta you choose: this is how fast it works, and this is how much balance I have leftover. So then the dynamic analysis says that our dashboard is approximately strategically equivalent to the sequential truthful mechanism, with an error equal to an upper bound on the values divided by Eta times t. So it's
vanishing in t again: the average per-stage error.>>Isn't y equal to zero?>>No. So you need to choose y0 not to be zero, because if y0 were zero, this term is going to blow up in a not very good way. But here's the thing: y0 is your dashboard. So it just says, if you choose a dashboard, do not have y0 equal to zero.>>What is the order of this choice? You choose Eta first and then y0, or y0 first and then Eta?>>That's the same choice. You want a lower bound on y0; that's what we call Eta, and so it's the same choice.>>Again, can't I choose y0 equal to one?>>If you choose y0 equal to one, you're saying the lowest type is allocated with probability one.>>One minus Eta?>>Then your Dashboard's way wrong. So what you'll see is there's a trade-off between publishing wrong Dashboards at the low end versus choosing an Eta that's big.>>Right.>>All right?>>Right.>>And I like choosing the right Dashboard, because then I don't have to do any rebalancing if the person has a static value.>>Right.>>The point is, there are some trade-offs. Cool. That's my last main result. We're essentially out of time. I want to jump to conclusions. So as I predicted, I'm skipping discussion of
Single-call Dashboards. So online markets can
have bad equilibria. We need to design them carefully. The sequential Dashboard
mechanism can implement any outcome of any sequential
truthful mechanism. I showed you here,
winner-pays-bid implementations. You can also do all-pay
implementations and I think pretty much anything that you had in practice that you
wanted to implement, you could probably do in
the Dashboard framework, but I haven’t talked to
you carefully about that. Again, one thing that’s nice is we only have to do this complicated
rebalancing when the values change and so essentially, we’re just publishing
the right Dashboard. So some extensions that would
be great topics of future work. So I’m focusing on implementing
welfare maximization, whatever the engineer’s
algorithm was. You might care about some
other objective like revenue. You might worry about the engineer's algorithm not being monotone, which would mean there don't even exist payments that make it correct, so everything breaks. I'm talking about single-dimensional agents with linear utility; you might have non-linear utility, and that's going to be an interesting challenge area here. I had talked about a single-call implementation; part of that is going to be instrumenting the mechanism to learn about what it's doing counterfactually. That is costly: it changes what the algorithm does, and so you might want to do frugal instrumentation in the single-call implementation of the Dashboard mechanism. Or if you have your favorite
mechanism in question, try to put it into the Dashboard
framework and see if it works. I think in theory land, we know very little about how to design good
non-truthful mechanisms. It’s obviously important because 99 percent of mechanisms in
real life are non-truthful. So I think we need to develop
a lot more of that theory. Another paper I have on this topic is with a former Northwestern student who's now a professor at Oberlin, Sam Taggart. Thanks for your attention.>>Any more questions?>>So are all these Dashboards in use at Booking.com, or?>>They use dashboards at Booking.com, thanks so much.>>They use it for the hotels, right?>>So they had Dashboards like
the ones I’m talking about. I can’t really talk about what they
do in their actual algorithms. I do think that it's definitely the case that much of what I presented happened well after we left the project, so you can infer from that what you want. I think that this approach seems quite practical, and I presented sort of worst-case theorems. So in practice, it probably actually is far nicer than my worst-case theorems suggest. So I think exploring the parameter space and seeing how to make these work better in practice might be a really interesting thing to do in a more applied context.>>Can you learn
these Dashboards?>>Yes. So one other advantage of the rebalancing approach is it
means if the Dashboard is wrong, we’ll fix it in the long run. So that means if you have
learning algorithms to figure out what a good
Dashboard is, it’s fine. If they make mistakes, it’s fine. We’ll fix it in the long run. So that’s sort of a nice advantage
of the rebalancing approach. So it actually is really
compatible with learning.>>Yes. So there's one purpose of the Dashboard here that you are using, which is just inverting the bids, and you mentioned the long-run payments are correct. But there is another purpose of showing this Dashboard, which is to give a good idea of what's actually going to happen.>>Yes.>>What you're saying is, with learning algorithms, you give them a good algorithm, then you give them a good idea, and these two are working for a common good combined.>>Yes. That's nice.>>What about attrition
from marketplaces? It seems like if someone gets a bad price, they're more likely to leave, and so the population of bidders you have left is different.>>Yes. So dropping out is a special case of the
model of changing values. It’s a value changing to zero. So everything works in the Dashboard mechanism with
values that might drop out. But I think one of the main issues with this approach is, as Nikhil suggested, if you make a prediction and your prediction is wrong, that has some kind of impact on the behavior of people, even if you eventually correct for it. I think it would be interesting to understand whether you can do better at predicting, or whether, for example, we can make predictions that we can guarantee, so that we don't have to correct the payments later because we guaranteed the prediction we made.>>Thanks.>>Thanks.>>Cool. Thank you all.
