This problem has bugged me once in a while, but recently I had my first insight into it, so I started working out the ideas, which you can see below. Most of the work so far has been conceptual.
This seems like it would be a basic question of multi-agent systems / mechanism design research, but I was unable to find it out there.
--------------------------------------------------------------------------
Suppose I have a decision I need to make, which depends on the uncertain outcome of a yet-unobserved binary event (the outcome will be either "yes" or "no"), which I know nothing about.
Fortunately, I have access to an agent who, through his knowledge and reasoning, has the best guess of anyone out there about this event (in fact, as far as we're concerned, he's the only person who knows anything).
This agent believes that the outcome will be "yes" with probability p.
THE QUESTION IS: Assuming the agent is risk-neutral, how can I make him tell me, *honestly*, what p is?
(This may sound trivial, but it's not. The apparently trivial solution of a linear reward function (i.e., given a stated guess q, pay q if "yes" and 1 - q if "no") fails: the agent maximizes his expected reward by stating either q = 0 or q = 1, whichever is closer to his believed estimate.)
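To make the failure concrete, here is a small sketch (my own illustration, using the i/o notation the post introduces below) of the linear rule's expected reward and the agent's best response:

```python
def linear_expected_reward(i, o):
    """Expected reward for stating o under the linear rule, given belief i."""
    return i * o + (1 - i) * (1 - o)

def best_response(i, steps=1000):
    """Brute-force the stated o that maximizes the agent's expected reward."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda o: linear_expected_reward(i, o))

# The expected reward is linear in o, so the optimum sits at an endpoint:
# an agent who believes i = 0.7 does best by exaggerating all the way.
print(best_response(0.7))  # -> 1.0
print(best_response(0.3))  # -> 0.0
```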
To tackle this, I need a reward function covering both outcomes and all possible agent outputs o (the term "guess" would be ambiguous here). We could create a reward function for each outcome*, r_yes and r_no, and combine these two functions into one in the following way:
Given the agent's believed estimate i, the expected reward for him (assuming risk-neutrality) is
r_i(o) = i * r_yes(o) + (1 - i) * r_no(o) , for all o in [0,1]
This is treated as being itself a reward function.
(_* this was the insight that triggered all of this.)
Remember that our goal is for the agent's stated estimate, o (for "output"), to be the same as his believed estimate, i. Also, since i is unique, o should be unique as well (after all, the idea is for o to communicate i exactly). Thus r_i must have a unique maximum at i.
I shall call this "the constraint of honesty". My motivation is to find a function satisfying this constraint.
Now, we can reframe this reward function as a function of two variables,
r(i,o) = i * r_yes(o) + (1 - i) * r_no(o) , for all i, o in [0,1]
Keep in mind: although this is a two-parameter function, the agent only controls o (if he is reasonable, that is).
So we need to design r_yes and r_no in such a way as to meet the constraint of honesty, namely:
* For all (i, o) in [0,1]^2, if i != o then r(i,i) > r(i,o),
that is, i * r_yes(i) + (1 - i) * r_no(i) > i * r_yes(o) + (1 - i) * r_no(o).
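For concreteness, here is a quick numerical check that at least one pair satisfying this strict inequality exists. The pair below is my own addition, not from the post: it is the classic quadratic rule, r_yes(o) = 1 - (1 - o)^2 and r_no(o) = 1 - o^2.

```python
def r_yes(o): return 1 - (1 - o) ** 2   # reward paid if the event is "yes"
def r_no(o):  return 1 - o ** 2         # reward paid if the event is "no"

def r(i, o):
    """Agent's expected reward for stating o when his belief is i."""
    return i * r_yes(o) + (1 - i) * r_no(o)

# Honesty constraint: r(i, i) > r(i, o) whenever i != o.
# (Analytically, r(i, i) - r(i, o) = (i - o)^2, so the inequality is strict.)
grid = [k / 20 for k in range(21)]
ok = all(r(i, i) > r(i, o) for i in grid for o in grid if i != o)
print(ok)  # -> True
```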
This still seems to leave us free to choose among many functions (although I haven't been able to construct and prove one yet; I'm hopeful about r_yes(x) = (2x - 1) / x, with r_no symmetrical to it).
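Since that candidate is unproven, a brute-force probe (mine, and only suggestive at grid resolution) can ask where the agent's best response actually lands for r_yes(x) = (2x - 1)/x and its mirror:

```python
def r_yes(o): return (2 * o - 1) / o          # the candidate
def r_no(o):  return (1 - 2 * o) / (1 - o)    # its mirror around o = 0.5

def best_response(i, steps=1000):
    """Grid-search the stated o maximizing i*r_yes(o) + (1-i)*r_no(o)."""
    grid = [k / steps for k in range(1, steps)]  # avoid o = 0 and o = 1
    return max(grid, key=lambda o: i * r_yes(o) + (1 - i) * r_no(o))

# Honesty would require the maximizer to land at i itself.
print(best_response(0.5))   # -> 0.5, as required
print(best_response(0.25))  # -> 0.366, not 0.25, so this candidate seems
                            #    to violate honesty away from i = 0.5
```

(Setting the derivative of the expected reward to zero gives a maximizer of sqrt(i) / (sqrt(i) + sqrt(1 - i)), which equals i only at i = 0.5.)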
It may be interesting to add further constraints:
* symmetry around o = 0.5: r_yes(o) = r_no(1 - o)
* monotonicity of r_yes (although this may be entailed by honesty, or by symmetry + honesty)
* constant expected reward: r(i1, i1) = r(i2, i2) for all i1, i2
* normalization: r_yes(1) = 1 (beware: r_no(1) could tend to -infty)
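As a sanity check on how these constraints interact, one can test a concrete pair against them on a grid. Here I use the quadratic rule r_yes(o) = 1 - (1 - o)^2, r_no(o) = 1 - o^2 (my example, not the post's):

```python
def r_yes(o): return 1 - (1 - o) ** 2
def r_no(o):  return 1 - o ** 2

def r(i, o): return i * r_yes(o) + (1 - i) * r_no(o)

grid = [k / 20 for k in range(21)]

# Symmetry around o = 0.5: r_yes(o) == r_no(1 - o).
symmetric = all(abs(r_yes(o) - r_no(1 - o)) < 1e-12 for o in grid)

# Monotonicity of r_yes along the grid.
monotone = all(r_yes(a) < r_yes(b) for a, b in zip(grid, grid[1:]))

# Normalization: r_yes(1) == 1.
normalized = r_yes(1) == 1

# Constant expected reward: here r(i, i) = 1 - i + i^2, which varies with i.
constant = all(abs(r(i, i) - r(0, 0)) < 1e-12 for i in grid)

print(symmetric, monotone, normalized, constant)  # -> True True True False
```

So symmetry, monotonicity, and normalization can coexist with honesty, at least for this pair, but constant expected reward fails here.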
It would be great if honesty were compatible with all of these constraints; otherwise, I wouldn't know how to judge among them. I would probably need a generalization of the problem in order to come up with new criteria, but I should probably solve the specific case first.
--------------------------------------------------------------------------
Now, following Joe Halpern's suggestion, I'm looking at the standard mechanisms to see if they solve my problem. I'd bet at 99% that somebody has already solved it, though I wish it were otherwise (hmmm... so what's my incentive for honesty here?)
--------------------------------------------------------------------------
(no subject)
Date: 2003-04-09 08:54 am (UTC)

The solution to your problem is a "proper scoring rule". The first such rule was developed over 50 years ago.
I cite and summarize a little in my papers:
http://hanson.gmu.edu/mktscore.pdf or .ps
http://hanson.gmu.edu/elicit.pdf or .ps