gusl: (Default)
[personal profile] gusl
My Stats homework has a question of the type: "Given this joint distribution over X and Y, compute E(E(X|Y)).". This notation is extremely confusing. Can you tell what this means?

Let's see:
E(X|Y=y) is a real-valued function of y.
Thus, E(X|Y) is a random variable.
Therefore, E(E(X|Y)) is a real number.
Type-checking passes.

Given how unclear the notation is, I decided to do something about it, using the formal(ish?) language that I designed yesterday:

Type specifications:
 Type RandomVariable alias RV;

 Type JointRandomVariable alias JointRV has
   X:RV;
   Y:RV;
 
 function expectedValue : RV -> Real; //function: given X, it computes E(X)

 function expectedValue : (a -> RV) -> (a -> Real); //mathematical abstraction: we are
 // now allowing the user to specify the RV in terms of unknown variables.
 // does this subsume the first specification of 'expectedValue'?

 
 function given : (XY:JointRV * X:RV * Y:RV) -> y:Real -> X:RV; 
 //in this syntax "X:RV" is simply a more human-readable version of "RV"
 //given a joint RV, the "output" RV, and the "conditioning" RV, this function returns
 //the function that, given a value of y, returns the RV X.
 
 function applyFunctionToRV : (f:(Real -> Real) * Y:RV) -> RV;
 //given a function from Reals to Reals, and a RV over the Reals, returns a RV that
 //can be sampled by sampling from Y, and then applying f.
 //the part of the resulting RV's domain with positive probability will be a subset
 //of range of f.




Full type derivation, in a fictitious shell:
>> let X : RV;
 X : RandomVariable (no value)

>> let Y : RV;
 Y : RandomVariable (no value)

>> XY = makeJointIndependentRV(X,Y)
 XY : JointRandomVariable (no value)

>> XgivenYisy = given(XY,X,Y)   /* Note: since I didn't pass the last argument, this 
                                   returns a function */
 XgivenYisy : Real -> RandomVariable (no value)  //namely, y:Real -> X:RV

>> innerE = expectedValue(XgivenYisy) //'expectedValue' is polymorphic: this is the 2nd
 innerE : Real -> Real (no value)   //namely, y:Real -> meanX:Real

>> innerE_RV = applyFunctionToRV(innerE, Y)
 innerE_RV : RandomVariable (no value)

>> outerE = expectedValue(innerE_RV)
  outerE : Real (no value)


Note that I'm passing value-less arguments to functions, and they don't complain.

If I now specify the distributions (i.e. give values to the RandomVariable objects), and implement the methods used above ('makeJointIndependentRV', 'expectedValue', 'applyFunctionToRV'), it should compute e:

>> X = Gaussian(0,1)
 X : RandomVariable

>> Y = Gaussian(0,1)
 Y : RandomVariable

>> outerE
  outerE : Real
  outerE = 0


Note that it remembers that outerE = expectedValue(innerE_RV), rather than outerE = 0. So whenever you give new values X and Y, and ask for outerE, you will get an updated value. I call this "spreadsheet logic".
We could have a lazy strategy and do all the intermediate computations (except for type inference) at the very last step, as the value is requested (i.e. query triggers update). This whole thing is reminding me of Excel (except that the latter is eager, i.e. change triggers update).



Note 1: Coming up with 'applyFunctionToRV' was hard, because Statisticians have no such concept explicitly. It's implicit. Working this out and writing it up just cost me the last hour.

Note 2: I never used X|Y ('XGivenY') in my very thorough derivation. This is because, despite the traditional notation, X|Y is not a meaningful unit. Remember: X|Y=y is a random variable (after you pass a Real value for y), and thus E(X|Y=y) is a Real number (after you pass a Real value for y). "E(X|Y)" is the RV that results from making y random.

Note 3: we could go inside RandomVariable, and specify it as being to the type Real[0,1] -> Real (i.e. the RV is defined by the inverse of its cdf). But since we'd have encapsulation, you'd be free to change this type specification later, by typing it just once. I almost want to call it "type implementation", but it doesn't run.

(no subject)

Date: 2008-10-04 12:14 am (UTC)
From: [identity profile] darius.livejournal.com
Skimming this reminds me of Sussman & Wisdom's treatment of the principle of least action in their book SICM. Functional notation to clarify the ideas that go into the handwavey standard notation.

(no subject)

Date: 2008-10-04 12:43 am (UTC)
From: [identity profile] gustavolacerda.livejournal.com
I've added more stuff.

(Starting from "Full derivation")

Highlights:
<< Note that I'm passing value-less arguments to functions, and they don't complain. >>
are there programming languages like this?

<< Now that I've specified Y, it can compute e. Note that all the intermediate computations (except for type inference) are done at the very last step. This whole thing is reminding me of Excel. >>

Are you aware of any shell-based languages like this?

(no subject)

Date: 2008-10-04 03:53 am (UTC)
From: [identity profile] en-ki.livejournal.com
Yeah, this really seems like a fine opportunity for currying (and lazy evaluation). I'm not clear on why you would invent a new language for it rather than throw it into Lisp-dialect-of-choice or Haskell.

A measure-theoretic approach to probability is helpful here. There is a sample space Ω and a measure P representing the probability of events (measurable sets of samples).

Random variables are functions from Ω to another space of outcomes R: for example, if X is a random variable representing the color of the next car you see and ω is a complete state of the universe, X(ω) might be "blue".

Expectation is then a linear functional taking a random variable to an element of R, and the "given" operator is the restriction operator.

E(X|Y=y) denotes the result of the expectation functional acting on the restriction of the random variable X to the subspace of Ω where the random variable Y is equal to Y:

E(X|Y=y) = ∫Y-1(y) X dP

and you can curry it to a random variable E(X|Y):

E(X|Y)(y) = ∫Y-1(y) dP

which you can then apply the linear functional to. Haskell people love to do this. Supposing we already have a symbolic algebra/calculus system,

[ugh, I had a lot of fun just now refreshing my memory of Haskell enough to write most of this, but got hung up on how to do `given`. I have a nasty cold and plead temporary stupidity]

OK, I'm overexplaining because I really like this stuff; the real information here is just that the random variable can be considered as a function, and this is how you distinguish between the random variable per se and a particular value it takes on, which is where the confusion in the beginning of the post came from.

(no subject)

Date: 2008-10-04 05:10 am (UTC)

(no subject)

Date: 2008-10-04 05:18 am (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Does Haskell let you define types like above?
Does it let you compute the type of an object without computing its value?

My language would always infer the types of objects, even if not their values. For example, if you call a function that's not implemented, or if some necessary arguments to the function have no value.

A completely orthogonal feature is that it uses "spreadsheet logic" (a lazy version thereof). (Tangentially, this is incompatible with the semantics of memoization: memoized values could wrong, unless you're always updating. I suppose there are smart ways of updating. Even more tangentially: Acar, Blelloch, Harper - Adaptive Functional Programming).

(no subject)

Date: 2008-10-04 12:07 pm (UTC)
From: [identity profile] en-ki.livejournal.com
Yes, in Haskell, if you write an expression that calls a function without providing all its arguments, the result is a curried function, not an error.

On the type/value question, I'm not sure what you're getting at. Haskell is strongly typed; you can't reference an object without knowing its type, although you can be less than picky about what the type is if you want to.

Remind me one more time and I'll write you the above in Haskell this evening; or give it a try yourself---the language is worth learning and this will drive your understanding of several of its nicer features.

(no subject)

Date: 2008-10-04 05:16 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
<< Yes, in Haskell, if you write an expression that calls a function without providing all its arguments, the result is a curried function, not an error. >>

That was not my question. I want to pass valueless arguments to the function, and get a valueless output. i.e. only the types are computed.

It would be cool to actually see this in Haskell.

(no subject)

Date: 2008-10-04 09:01 pm (UTC)
From: [identity profile] en-ki.livejournal.com
I guess I'm not understanding what you're looking for, then. The curried function's type signature will communicate all the type information that will ever be available and does not need a value to do it; if that's not what you're looking for, I'm confused.

(no subject)

Date: 2008-10-04 11:19 pm (UTC)
From: (Anonymous)
So if f : a -> b, then we can create an object by currying f, whose type will be b.
Is this right? How is it done?

(no subject)

Date: 2008-10-04 11:29 pm (UTC)
From: [identity profile] en-ki.livejournal.com
Well, no.

If

f :: a -> b -> c
x :: a

then

(f x) :: b -> c

I don't understand why you want to get an actual object with type but no value, rather than an object that (a) will give you an object if you give it an object and (b) can tell you in advance what type of object you will get.

If you can explain in a different way what you hope to accomplish, perhaps I will understand better.

(no subject)

Date: 2008-10-04 11:40 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Can Haskell do this inference for you if f is not implemented, and x is not instantiated?

my goal is to keep track of types in algebraic equations, e.g. matrix with m rows, n columns, etc. I'm hoping I can explain this better tomorrow (to you and to myself) (I'm currently pressed for time).

(no subject)

Date: 2008-10-04 11:43 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
I want to automatically do some simple accounting for types, so that I can catch errors when reasoning abstractly (i.e. before implementing or instantiating anything).

(no subject)

Date: 2008-10-04 11:46 pm (UTC)
From: [identity profile] en-ki.livejournal.com
Hm, not sure. I'm not that much of a Haskell hacker, but Hugs won't seem to let me define a function without also implementing it in the same file. (Of course, if I just want the type signature computed for me, I can do a cheesy trivial implementation and get the type signature of that.)

(no subject)

Date: 2008-10-05 12:36 am (UTC)
From: [personal profile] chrisamaphone
i would just like to interject: TWELF!

(no subject)

Date: 2008-10-05 03:59 am (UTC)
From: [identity profile] gustavolacerda.livejournal.com
does it do what I'm asking?

(no subject)

Date: 2008-10-05 04:43 am (UTC)
From: [personal profile] chrisamaphone
yes, aside from not having reals.

(no subject)

Date: 2008-10-06 07:18 am (UTC)
From: [identity profile] darius.livejournal.com
You can give f a type signature while defining its value as 'error' -- since error is polymorphic the compiler can still do type inference using f's type.

The m-rows n-columns bit is hairier though it is possible to express that constraint with Haskell's type system.

(no subject)

Date: 2008-10-06 07:24 am (UTC)
From: [identity profile] darius.livejournal.com
The closest thing that occurs to me is Vital, which was kind of a freeform spreadsheet using a subset of Haskell. You could indeed edit some values and get that spreadsheet-like reevaluation going on.

http://github.com/darius/halp/ is an Emacs interface I've written recently for conventional programming languages; it just reevaluates everything from scratch, though. I'm told Mathematica notebooks have a similar feel.

(no subject)

Date: 2008-10-08 09:14 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Could you do the code I have here in Twelf? It would be awesome to see that.

Namely, I'd like to specify the types of functions (like expectedValue : RV -> Real) and atoms (like X : RV), and prove that outerE : Real, like above.

(no subject)

Date: 2008-10-08 09:17 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Since Real is not built-in, you could simply define a type Real in 1 line, right? I'm not asking it to model the real numbers: it's like a stub.

(no subject)

Date: 2008-10-08 09:20 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
Tangentially, I wonder if the type expectedValue : RV -> Real is a subtype of (a -> RV) -> (a -> Real).

i.e. can we plug in an empty type for 'a', and beta-reduce the latter type to the former type?

(no subject)

Date: 2008-10-09 04:17 pm (UTC)
From: [personal profile] chrisamaphone
yes, that's true. i'm just saying it might not be good for stats in general because of that (although jcreed tells me there might be a library somewhere).

i'll see if i can code this up after my advisor meeting today.

(no subject)

Date: 2008-10-09 04:21 pm (UTC)
From: [personal profile] chrisamaphone
no, that doesn't make sense in any formulation of subtyping i've seen, although it is a good observation that -> behaves like <=.

the reason it doesn't make sense is that we'd like to have the property that if M:A and A <= B then M:B. a term of type C -> B doesn't have type B, unless you can prove some kind of strengthening lemma about it (that it doesn't use its argument).

(no subject)

Date: 2008-10-09 05:33 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
I see. I guess '<=' means "is a subtype of".

The reason I ask is: in math, this pattern is very very very common.
Nobody thinks twice before using RV -> Real as a metonymy for (a -> RV) -> (a -> Real).

To illustrate, suppose someone asks:
"If x and y are real numbers, then is x+y a real number?" Literally, what this provides is two variable type declarations (x : Real, y : Real) and a function (type implicit (Real * Real) -> Real), and the question is whether the beta-reduced expression has type Real.

But if you ask them, they'll say they're not referring to the function "lambda(x) x+y", but that they're referring to x+y itself, as if it had an independent existence as a Real number.

In any case, my point is to save work.
It sounds like under current formulations of subtyping, mathematicians would always need to declare 2 polymorphic types for each function:
(1) RV -> Real, and
(2) (a -> RV) -> (a -> Real)

I'd like one declaration to do both.

(no subject)

Date: 2008-10-10 06:36 am (UTC)
From: [identity profile] gustavolacerda.livejournal.com
or today? pretty please...

(no subject)

Date: 2008-10-10 03:36 pm (UTC)
From: [personal profile] chrisamaphone
working on it. i'm still not exactly sure what your intent is with the α -> RV, and lf doesn't have polymorphism, so it's not a direct translation.

(no subject)

Date: 2008-10-10 09:40 pm (UTC)
From: [personal profile] chrisamaphone
  1 real : type.
  2 rv : type.
  3 jointrv : type.
  4 
  5 joint : rv -> rv -> jointrv.
  6 given : jointrv -> rv -> rv -> real -> rv.
  7 
  8 ev : rv -> real.
  9 apply : (real -> real) -> rv -> rv.
 10 
 11 rvX : rv.
 12 rvY : rv.
 13 
 14 rvXY = joint rvX rvY.
 15 rvXgivenY = given rvXY rvX rvY.
 16 innerE = [y] ev (rvXgivenY y).
 17 innerE-RV = apply innerE rvY.
 18 outerE = ev innerE-RV.
 19 


twelf's output:
[Opening file gustavostats.elf]
real : type.
rv : type.
jointrv : type.
joint : rv -> rv -> jointrv.
given : jointrv -> rv -> rv -> real -> rv.
ev : rv -> real.
apply : (real -> real) -> rv -> rv.
rvX : rv.
rvY : rv.
rvXY : jointrv = joint rvX rvY.
rvXgivenY : real -> rv = [x:real] given rvXY rvX rvY x.
innerE : real -> real = [y:real] ev (rvXgivenY y).
innerE-RV : rv = apply ([x:real] innerE x) rvY.
outerE : real = ev innerE-RV.
[Closing file gustavostats.elf]
%% OK %%


twelf works fine for when all you want is syntactic checking, but i don't think the idea of plugging in a distribution would fly.

(no subject)

Date: 2008-10-10 09:42 pm (UTC)
From: [personal profile] chrisamaphone
p.s. note that you don't need the second definition of ev because you can just use composition.

(also, just realized the [] notation could be confusing. it's just lambda.)

(no subject)

Date: 2008-10-10 10:39 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
awesome! thank you!

(no subject)

Date: 2008-10-10 10:44 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
<< p.s. note that you don't need the second definition of ev because you can just use composition. >>

that's the cherry on top. :-D

(no subject)

Date: 2008-10-10 10:46 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
I'm now pondering if the (a -> RV) -> (a -> Real) definition is really ever necessary.

(no subject)

Date: 2008-10-11 05:08 am (UTC)
From: [personal profile] chrisamaphone
well, no, it's not. in fact i should have realized this and pointed it out earlier. think types as propositions -- (B -> C) entails (A -> B) -> (A > C) via composition (\f:B->C.\g:A->B.\x:A.f a : C). it's not subtyping -- you don't want to say that a term having some type means it also has the "larger" type, but rather that whenever there exists a term of the first type, you can construct one of the latter.
Edited Date: 2008-10-11 05:09 am (UTC)

(no subject)

Date: 2008-10-11 08:48 pm (UTC)
From: [identity profile] gustavolacerda.livejournal.com
<< (\f:B->C.\g:A->B.\x:A.f a : C) >>

do you mean f g a : C ? What are "\" and "."?

(no subject)

Date: 2008-10-11 09:38 pm (UTC)
From: [personal profile] chrisamaphone
yes i do. \ is λ and '.' is just a binding '.'.

(no subject)

Date: 2008-10-11 09:39 pm (UTC)
From: [personal profile] chrisamaphone
well, f (g a), rather

(no subject)

Date: 2008-10-11 09:52 pm (UTC)
From: (Anonymous)
i think the point about real -> rv vs (a -> real) -> (a -> rv) is just about the fact that (a -> -) is a functor, and its action on arrows (i.e. functions) is just given by composition. as chris pointed out, given an element f : b -> c, the corresponding (a -> b) -> (a -> c) is just [g : a -> b] [x : a] f (g x)

February 2020

S M T W T F S
      1
2345678
9101112131415
16171819202122
23242526272829

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags