types in Statistics
Oct. 3rd, 2008 04:21 pm
My Stats homework has a question of the form: "Given this joint distribution over X and Y, compute E(E(X|Y))." This notation is extremely confusing. Can you tell what this means?
Let's see:
E(X|Y=y) is a real-valued function of y.
Thus, E(X|Y) is a random variable.
Therefore, E(E(X|Y)) is a real number.
Type-checking passes.
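To make the types concrete, here is a tiny worked example of my own (not the homework's distribution), checked by brute force in Haskell; all the names are just for illustration. Take P(X=0,Y=0) = P(X=1,Y=1) = 1/2. Then E(X|Y=0) = 0 and E(X|Y=1) = 1, so E(X|Y) is a random variable taking the values 0 and 1 with probability 1/2 each, and E(E(X|Y)) = 1/2, which is also E(X).

-- Brute-force check of E(E(X|Y)) on a toy joint distribution (illustrative only).
joint :: [((Double, Double), Double)]          -- ((x, y), probability)
joint = [((0, 0), 0.5), ((1, 1), 0.5)]

eXgivenY :: Double -> Double                   -- y |-> E(X | Y = y)
eXgivenY y = sum [ p * x | ((x, y'), p) <- joint, y' == y ]
           / sum [ p     | ((_, y'), p) <- joint, y' == y ]

eOfEXgivenY :: Double                          -- E(E(X|Y)): average eXgivenY over Y
eOfEXgivenY = sum [ p * eXgivenY y | ((_, y), p) <- joint ]

main :: IO ()
main = print eOfEXgivenY                       -- prints 0.5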
Given how unclear the notation is, I decided to do something about it, using the formal(ish?) language that I designed yesterday:
Type specifications:
Type RandomVariable alias RV;
Type JointRandomVariable alias JointRV has X:RV; Y:RV;

function expectedValue : RV -> Real;
//function: given X, it computes E(X)

function expectedValue : (a -> RV) -> (a -> Real);
//mathematical abstraction: we are now allowing the user to specify the RV
//in terms of unknown variables.
//does this subsume the first specification of 'expectedValue'?

function given : (XY:JointRV * X:RV * Y:RV) -> y:Real -> X:RV;
//in this syntax "X:RV" is simply a more human-readable version of "RV"
//given a joint RV, the "output" RV, and the "conditioning" RV, this function returns
//the function that, given a value of y, returns the RV X.

function applyFunctionToRV : (f:(Real -> Real) * Y:RV) -> RV;
//given a function from Reals to Reals, and a RV over the Reals, returns a RV that
//can be sampled by sampling from Y, and then applying f.
//the part of the resulting RV's domain with positive probability will be a subset
//of range of f.
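For comparison, here is a minimal sketch of roughly the same signatures in Haskell. This is my rendering, not part of the original language: 'Real' becomes Double, makeJointIndependentRV is added because the derivation below uses it, and every body is left as undefined since only the types matter here.

module RVSketch where

data RV        -- a random variable over the Reals; representation left abstract
data JointRV   -- a joint random variable with components X and Y

-- E(X): the expectation of a random variable.
expectedValue :: RV -> Double
expectedValue = undefined

-- The second, "mathematically abstracted" expectedValue from the spec above.
expectedValueF :: (a -> RV) -> (a -> Double)
expectedValueF = undefined

-- given(XY, X, Y): the function that, given a value y, returns the RV (X | Y = y).
given :: JointRV -> RV -> RV -> (Double -> RV)
given = undefined

-- Pair two independent RVs into a joint RV (used in the shell session below).
makeJointIndependentRV :: RV -> RV -> JointRV
makeJointIndependentRV = undefined

-- applyFunctionToRV(f, Y): the RV obtained by sampling Y and then applying f.
applyFunctionToRV :: (Double -> Double) -> RV -> RV
applyFunctionToRV = undefined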
Full type derivation, in a fictitious shell:
>> let X : RV;
X : RandomVariable (no value)
>> let Y : RV;
Y : RandomVariable (no value)
>> XY = makeJointIndependentRV(X,Y)
XY : JointRandomVariable (no value)
>> XgivenYisy = given(XY,X,Y) /* Note: since I didn't pass the last argument, this
returns a function */
XgivenYisy : Real -> RandomVariable (no value) //namely, y:Real -> X:RV
>> innerE = expectedValue(XgivenYisy) //'expectedValue' is polymorphic: this is the 2nd specification
innerE : Real -> Real (no value) //namely, y:Real -> meanX:Real
>> innerE_RV = applyFunctionToRV(innerE, Y)
innerE_RV : RandomVariable (no value)
>> outerE = expectedValue(innerE_RV)
outerE : Real (no value)
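(As a sanity check, the same type-chasing goes through today in GHCi against the hypothetical Haskell signatures sketched above: ':type' never evaluates anything, so every value can stay undefined.)

ghci> let x = undefined :: RV
ghci> let y = undefined :: RV
ghci> let xy = makeJointIndependentRV x y
ghci> :type given xy x y
given xy x y :: Double -> RV
ghci> :type expectedValueF (given xy x y)
expectedValueF (given xy x y) :: Double -> Double
ghci> :type expectedValue (applyFunctionToRV (expectedValueF (given xy x y)) y)
expectedValue (applyFunctionToRV (expectedValueF (given xy x y)) y) :: Double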
Note that I'm passing value-less arguments to functions, and they don't complain.
If I now specify the distributions (i.e. give values to the RandomVariable objects), and implement the methods used above ('makeJointIndependentRV', 'expectedValue', 'applyFunctionToRV'), it should compute outerE:
>> X = Gaussian(0,1)
X : RandomVariable
>> Y = Gaussian(0,1)
Y : RandomVariable
>> outerE
outerE : Real
outerE = 0
Note that it remembers that outerE = expectedValue(innerE_RV), rather than outerE = 0. So whenever you give new values to X and Y and ask for outerE, you will get an updated value. I call this "spreadsheet logic".
We could have a lazy strategy and do all the intermediate computations (except for type inference) at the very last step, as the value is requested (i.e. query triggers update). This whole thing is reminding me of Excel (except that the latter is eager, i.e. change triggers update).
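A minimal sketch of this "keep the expression, not the number" idea, again against the hypothetical Haskell signatures above: outerE is stored as a function of the inputs, so querying it with new distributions recomputes everything on demand.

-- outerE as a function of the inputs rather than a cached number (sketch only).
outerEOf :: RV -> RV -> Double
outerEOf x y =
  let xy       = makeJointIndependentRV x y
      innerE   = expectedValueF (given xy x y)   -- y |-> E(X | Y = y)
      innerERV = applyFunctionToRV innerE y      -- E(X|Y), a random variable again
  in  expectedValue innerERV                     -- E(E(X|Y))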
Note 1: Coming up with 'applyFunctionToRV' was hard, because statisticians have no such concept explicitly; it's implicit. Working this out and writing it up just cost me the last hour.
Note 2: I never used X|Y ('XGivenY') in my very thorough derivation. This is because, despite the traditional notation, X|Y is not a meaningful unit. Remember: X|Y=y is a random variable (after you pass a Real value for y), and thus E(X|Y=y) is a Real number (after you pass a Real value for y). "E(X|Y)" is the RV that results from making y random.
Note 3: we could go inside RandomVariable, and specify it as being of the type
Real[0,1] -> Real (i.e. the RV is defined by the inverse of its cdf). But since we'd have encapsulation, you'd be free to change this type specification later, by typing it just once. I almost want to call it "type implementation", but it doesn't run.
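Here is a small, self-contained toy of that representation in Haskell (my own illustration, separate from the sketches above): an RV is its inverse CDF, sampling feeds a uniform draw through it, and a crude Monte Carlo average stands in for expectedValue.

import System.Random (randomRIO)

-- A random variable as the inverse of its CDF, i.e. a function [0,1] -> Real.
newtype RV = RV { quantile :: Double -> Double }

-- Sample by pushing a uniform [0,1] draw through the inverse CDF.
sample :: RV -> IO Double
sample (RV q) = q <$> randomRIO (0, 1)

-- applyFunctionToRV: sampling this RV means sampling Y and applying f.
-- (f . q is a correct sampler for f(Y), though it is only the quantile
-- function of f(Y) when f is monotone.)
applyFunctionToRV :: (Double -> Double) -> RV -> RV
applyFunctionToRV f (RV q) = RV (f . q)

-- A crude Monte Carlo stand-in for expectedValue.
expectedValue :: RV -> IO Double
expectedValue rv = do
  xs <- mapM (const (sample rv)) [1 .. 10000 :: Int]
  pure (sum xs / fromIntegral (length xs))

-- Example: Uniform(0,1) is just the identity quantile function.
uniform01 :: RV
uniform01 = RV id

main :: IO ()
main = expectedValue (applyFunctionToRV (^ 2) uniform01) >>= print   -- roughly 1/3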
(no subject)
Date: 2008-10-04 12:14 am (UTC)
(no subject)
Date: 2008-10-04 12:43 am (UTC)
(Starting from "Full derivation")
Highlights:
<< Note that I'm passing value-less arguments to functions, and they don't complain. >>
are there programming languages like this?
<< Now that I've specified Y, it can compute e. Note that all the intermediate computations (except for type inference) are done at the very last step. This whole thing is reminding me of Excel. >>
Are you aware of any shell-based languages like this?
(no subject)
Date: 2008-10-06 07:24 am (UTC)
http://github.com/darius/halp/ is an Emacs interface I've written recently for conventional programming languages; it just reevaluates everything from scratch, though. I'm told Mathematica notebooks have a similar feel.
(no subject)
Date: 2008-10-04 03:53 am (UTC)
A measure-theoretic approach to probability is helpful here. There is a sample space Ω and a measure P representing the probability of events (measurable sets of samples).
Random variables are functions from Ω to another space of outcomes R: for example, if X is a random variable representing the color of the next car you see and ω is a complete state of the universe, X(ω) might be "blue".
Expectation is then a linear functional taking a random variable to an element of R, and the "given" operator is the restriction operator.
E(X|Y=y) denotes the result of the expectation functional acting on the restriction of the random variable X to the subspace of Ω where the random variable Y is equal to y:
E(X|Y=y) = ∫_{Y⁻¹(y)} X dP
and you can curry it to a random variable E(X|Y):
E(X|Y)(y) = ∫_{Y⁻¹(y)} X dP
which you can then apply the linear functional to. Haskell people love to do this. Supposing we already have a symbolic algebra/calculus system,
[ugh, I had a lot of fun just now refreshing my memory of Haskell enough to write most of this, but got hung up on how to do `given`. I have a nasty cold and plead temporary stupidity]
OK, I'm overexplaining because I really like this stuff; the real information here is just that the random variable can be considered as a function, and this is how you distinguish between the random variable per se and a particular value it takes on, which is where the confusion in the beginning of the post came from.
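(A self-contained toy of this picture for a finite sample space, my own sketch rather than the Haskell the commenter had in mind: a random variable is literally a function out of Ω, the measure is a list of weights, and "given" restricts to the preimage Y⁻¹(y).)

type Omega   = Int                      -- sample points; finite for simplicity
type Measure = [(Omega, Double)]        -- P as (sample point, probability) pairs
type RVf a   = Omega -> a               -- a random variable valued in a

-- Expectation as a linear functional on real-valued random variables.
expect :: Measure -> RVf Double -> Double
expect p x = sum [ prob * x omega | (omega, prob) <- p ]

-- E(X | Y = y): restrict P to the preimage of y under Y, renormalise, integrate.
condExpect :: Eq b => Measure -> RVf Double -> RVf b -> b -> Double
condExpect p x y v =
  let restricted = [ (omega, prob) | (omega, prob) <- p, y omega == v ]
      total      = sum (map snd restricted)
  in  expect [ (omega, prob / total) | (omega, prob) <- restricted ] x

-- Currying: E(X|Y) is the function v |-> E(X | Y = v); composing it with Y
-- turns it back into a function of the sample point, i.e. a random variable,
-- so 'expect' can be applied again.  Up to rounding,
--   expect p (condExpect p x y . y) == expect p x   -- the tower property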
(no subject)
Date: 2008-10-04 05:18 am (UTC)
Does it let you compute the type of an object without computing its value?
My language would always infer the types of objects, even if not their values. For example, if you call a function that's not implemented, or if some necessary arguments to the function have no value.
A completely orthogonal feature is that it uses "spreadsheet logic" (a lazy version thereof). (Tangentially, this is incompatible with the semantics of memoization: memoized values could be wrong, unless you're always updating. I suppose there are smart ways of updating. Even more tangentially: Acar, Blelloch, Harper - Adaptive Functional Programming).
(no subject)
Date: 2008-10-04 12:07 pm (UTC)
On the type/value question, I'm not sure what you're getting at. Haskell is strongly typed; you can't reference an object without knowing its type, although you can be less than picky about what the type is if you want to.
Remind me one more time and I'll write you the above in Haskell this evening; or give it a try yourself---the language is worth learning and this will drive your understanding of several of its nicer features.
(no subject)
Date: 2008-10-04 05:16 pm (UTC)
That was not my question. I want to pass valueless arguments to the function, and get a valueless output. i.e. only the types are computed.
It would be cool to actually see this in Haskell.
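(For what it's worth, one way this plays out in GHCi, sketched with made-up types: ':type' computes a type without evaluating anything, and 'undefined' can stand in for any missing value.)

ghci> let f = undefined :: Double -> Bool -> String
ghci> let x = undefined :: Double
ghci> :type f x
f x :: Bool -> String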
(no subject)
Date: 2008-10-04 09:01 pm (UTC)
(no subject)
Date: 2008-10-04 11:19 pm (UTC)
Is this right? How is it done?
(no subject)
Date: 2008-10-04 11:29 pm (UTC)
If
f :: a -> b -> c
x :: a
then
(f x) :: b -> c
I don't understand why you want to get an actual object with type but no value, rather than an object that (a) will give you an object if you give it an object and (b) can tell you in advance what type of object you will get.
If you can explain in a different way what you hope to accomplish, perhaps I will understand better.
(no subject)
Date: 2008-10-04 11:40 pm (UTC)
my goal is to keep track of types in algebraic equations, e.g. matrix with m rows, n columns, etc. I'm hoping I can explain this better tomorrow (to you and to myself) (I'm currently pressed for time).
(no subject)
Date: 2008-10-04 11:46 pm (UTC)
(no subject)
Date: 2008-10-06 07:18 am (UTC)
The m-rows n-columns bit is hairier, though it is possible to express that constraint with Haskell's type system.
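A sketch of the kind of thing that is possible for the m-by-n constraint, using type-level naturals from modern GHC (which post-date this thread; the wrapper type and names here are hypothetical, and nothing checks the list lengths against the phantom dimensions):

{-# LANGUAGE DataKinds, KindSignatures #-}
module MatrixDims where

import Data.List (transpose)
import GHC.TypeLits (Nat)

-- The type parameters record the dimensions; a real treatment would hide
-- the constructor behind length-checked smart constructors.
newtype Matrix (m :: Nat) (n :: Nat) = Matrix [[Double]]

-- Multiplication only type-checks when the inner dimensions agree.
matMul :: Matrix m k -> Matrix k n -> Matrix m n
matMul (Matrix a) (Matrix b) =
  Matrix [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]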
(no subject)
Date: 2008-10-04 11:43 pm (UTC)
(no subject)
Date: 2008-10-05 12:36 am (UTC)
(no subject)
Date: 2008-10-05 03:59 am (UTC)
(no subject)
Date: 2008-10-05 04:43 am (UTC)
(no subject)
Date: 2008-10-08 09:14 pm (UTC)
Namely, I'd like to specify the types of functions (like expectedValue : RV -> Real) and atoms (like X : RV), and prove that outerE : Real, like above.
(no subject)
Date: 2008-10-08 09:17 pm (UTC)
(no subject)
Date: 2008-10-09 04:17 pm (UTC)
i'll see if i can code this up after my advisor meeting today.
(no subject)
Date: 2008-10-10 06:36 am (UTC)
(no subject)
Date: 2008-10-10 03:36 pm (UTC)
(no subject)
Date: 2008-10-08 09:20 pm (UTC)
expectedValue : RV -> Real is a subtype of (a -> RV) -> (a -> Real)? i.e. can we plug in an empty type for 'a', and beta-reduce the latter type to the former type?
(no subject)
Date: 2008-10-09 04:21 pm (UTC)
the reason it doesn't make sense is that we'd like to have the property that if M:A and A <= B then M:B. a term of type C -> B doesn't have type B, unless you can prove some kind of strengthening lemma about it (that it doesn't use its argument).
(no subject)
Date: 2008-10-09 05:33 pm (UTC)
The reason I ask is: in math, this pattern is very very very common.
Nobody thinks twice before using RV -> Real as a metonymy for (a -> RV) -> (a -> Real). To illustrate, suppose someone asks:
"If x and y are real numbers, then is x+y a real number?" Literally, what this provides is two variable type declarations (x : Real, y : Real) and a function (type implicitly (Real * Real) -> Real), and the question is whether the beta-reduced expression has type Real. But if you ask them, they'll say they're not referring to the function "lambda(x) x+y", but that they're referring to x+y itself, as if it had an independent existence as a Real number.
In any case, my point is to save work.
It sounds like under current formulations of subtyping, mathematicians would always need to declare 2 polymorphic types for each function:
(1) RV -> Real, and
(2) (a -> RV) -> (a -> Real).
I'd like one declaration to do both.
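In Haskell, at least, a single declaration plus ordinary function composition seems to get there; a sketch of mine (not from the thread), which is presumably also what the "." and "\" mentioned below are about:

data RV                                   -- left abstract, as before

expectedValue :: RV -> Double             -- the only declaration we write
expectedValue = undefined

-- The (a -> RV) -> (a -> Real) version is not a second declaration,
-- just composition:  \g -> expectedValue . g
liftedExpectedValue :: (a -> RV) -> (a -> Double)
liftedExpectedValue g = expectedValue . g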
(no subject)
Date: 2008-10-11 09:52 pm (UTC)
(no subject)
Date: 2008-10-10 09:40 pm (UTC)
twelf's output:
twelf works fine for when all you want is syntactic checking, but i don't think the idea of plugging in a distribution would fly.
(no subject)
Date: 2008-10-10 09:42 pm (UTC)
(also, just realized the [] notation could be confusing. it's just lambda.)
(no subject)
Date: 2008-10-10 10:44 pm (UTC)
that's the cherry on top. :-D
(no subject)
Date: 2008-10-10 10:46 pm (UTC)
(a -> RV) -> (a -> Real) definition is really ever necessary.
(no subject)
Date: 2008-10-11 05:08 am (UTC)
(no subject)
Date: 2008-10-11 08:48 pm (UTC)
do you mean f g a : C ? What are "\" and "."?
(no subject)
Date: 2008-10-11 09:38 pm (UTC)
(no subject)
Date: 2008-10-11 09:39 pm (UTC)
(no subject)
Date: 2008-10-10 10:39 pm (UTC)
(no subject)
Date: 2008-10-04 05:10 am (UTC)