### (no subject)

Feb. 9th, 2020 05:47 pm

Wouldn't you like to read my academic posts instead? Stats Computer Science


Gelman, Carlin, Stern, Rubin (2nd Edition)

* p.586-588, details about this.

Hastie, Tibshirani, Friedman

* p.156: consistency of MLE.

* p.466: solution to the proof question.

* p. 672: in non-parametric test for the mean, KL is used to define the most favorable distribution in H0.

* p. 56: asymptotics of estimation under mis-specified model.

* p. 62: viewing MLE as an M-estimator.

Manhattan: 8.3333c/kWh

Brooklyn: 15.8377c/kWh

Delivery charges

Manhattan: 10.1321c/kWh

Brooklyn: 10.1789c/kWh

SBC/RPS charges

Manhattan: 0.3082c/kWh

Brooklyn: 0.3911c/kWh

This means that in Manhattan I used to pay 18.7736c/kWh, whereas in Brooklyn I'm paying 26.4077c/kWh.

On top of that, in Manhattan, my two-person household was using 5.12 kWh per day; in Brooklyn, my 3-person household has been using 9.87 kWh per day.
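The comparison above can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the figures quoted in this post. The label of the first pair of charges did not survive, so `first_charge` is a placeholder name of my own, and the per-person split assumes usage divides evenly across the household.

```python
# Per-kWh charges in cents, copied from the figures above.
# NOTE: "first_charge" is a placeholder name; the original label for
# that pair of figures is missing from the post.
RATES = {
    "Manhattan": {"first_charge": 8.3333, "delivery": 10.1321, "sbc_rps": 0.3082},
    "Brooklyn": {"first_charge": 15.8377, "delivery": 10.1789, "sbc_rps": 0.3911},
}
DAILY_KWH = {"Manhattan": 5.12, "Brooklyn": 9.87}
HOUSEHOLD_SIZE = {"Manhattan": 2, "Brooklyn": 3}


def total_rate(borough):
    """Sum of the listed per-kWh charges, in cents per kWh."""
    return sum(RATES[borough].values())


def daily_cost_cents(borough):
    """Cost of one day's electricity, in cents."""
    return total_rate(borough) * DAILY_KWH[borough]


def daily_kwh_per_person(borough):
    """Daily usage per person, assuming an even split (an assumption)."""
    return DAILY_KWH[borough] / HOUSEHOLD_SIZE[borough]


for borough in RATES:
    print(
        f"{borough}: {total_rate(borough):.4f} c/kWh, "
        f"${daily_cost_cents(borough) / 100:.2f}/day, "
        f"{daily_kwh_per_person(borough):.2f} kWh/person/day"
    )
```

Under these numbers the per-kWh rate is roughly 1.4 times higher in Brooklyn, while the daily bill is roughly 2.7 times higher, because usage per person went up as well (2.56 vs. 3.29 kWh per person per day).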

On the positive side, Con Edison has adjusted their ridiculous "estimated" charges, and the adjusted bill looks almost reasonable.

== desert weather ==

* camelbak: BOUGHT

* goggles: BOUGHT

* dust mask: BOUGHT

== camping ==

* tent: BORROWED

* rebar: BOUGHT

* reflective material for cooling: BOUGHT

== sleeping ==

* ear plugs (gel): BOUGHT

* sleep mask: ORDERED

* sleeping bag: ORDERED

* self-inflating mattress (3" thick): ORDERED

* blanket: ALREADY HAVE

== ride arrangements ==

I need to ride with Victoria, since we are splitting a Will Call ticket.

We arranged a ride in a sedan transporting 4 Burners and our stuff, which I think is very tight.

There might not be room for most of my stuff, so I will try to find other people who can transport my bicycle.

== lights ==

* headlamp: ORDERED

* reflective tape: ORDERED

* blinky reflective vest: ORDERED

* solar-powered light: BOUGHT

* bicycle lights: NEED BATTERIES

== electricity ==

* solar-powered phone charger: THANKS,GOOGLE

* batteries:

== survival ==

* dish, mug, cutlery: BOUGHT

* water:

* food:

* clothing for cold:

== MOOP ==

* trash bags:

== other ==

* duct tape: BOUGHT

I went to AirBnb to look for rooms near Barra. I found a studio here, for $150/night, "Enjoy Nature Studio in Lagoon RIO!":

The conference is near the Sheraton, which is a 20-minute walk from the center of the pink circle, and I'm otherwise not very picky, so I didn't mind the Strict cancellation policy. I accept the charge of $750 for 5 nights + $82 in AirBnb fees.

Once I book, I get the address: Av. Armando Lombardi 370. I tell my friend in Rio, who walks by and tells me that the address given is a gas station! I call the host, and her daughter explains to me that the studio is on an island: Ilha Primeira, which falls completely outside of the pink circle (the triangular island just North of the pink circle), and that this address is where the boat picks you up. I understand why they did this: AirBnb requires a street address, and Ilha Primeira doesn't have streets.

The place sounds very beautiful, but it would be extremely inconvenient to have to depend on boats all the time. After I make a few inquiries about the boat service, the host advises me to cancel. I agree, pending her reassurance that she will give me a full refund. I call AirBnb twice, and learn that their resolution procedure is essentially: "you guys find an agreement", and they tell me that I'm going to lose the fees ($82) regardless. They heard my complaint about the misleading address, and seemed to agree with me, but didn't want to take any action on it.

Anyway, I accept losing the $82, so just before midnight, I cancel... Today I spoke to the host again, and she told me that she will refund me all the money they give her, but that this is only $727. So it sounds like AirBnb is charging her $23, on top of the $82 from me! Unfortunately, this means that my total loss would be $105... which crosses my threshold for picking a fight. Maybe this means that I need to call AirBnb again and threaten them with reversing the charges. I have a debit card, but my bank reassured me that they will give me the benefit of the doubt in such a dispute... but first I have to wait until the transaction posts.

For the future, I should probably use credit cards more often: my understanding is that they are better when it comes to dispute resolution.

I do worry about burning my bridges with AirBnb, but this is a matter of principle. Hopefully they wouldn't do anything to my San Francisco booking if I reverse charges on the Rio booking.

So here are the basic calculations I do, before I start working on my tax forms. This is not tax advice. This is not legal advice.

* TA Wages, see W2 form

* Stipend: look through bank account or MyColumbia, and add all the checks issued in 2012.

* Interest income: Chase sent me a form ("in lieu of 1099-INT"), informing me that I made $4.58 in interest, of which $0.00 was withheld. However, Chase charged an "Agent Admin Fee" of $4.58, which means that I'm going to pay tax on money that I never saw... So let's be thankful that savings accounts have such crappy interest!

* TA Wages, see W2 form

* Stipend: look through checks at MyColumbia. Check whether they withheld anything. In my case, they didn't.

* My scholarship exactly cancels out tuition+fees. This means I don't need to look at the 1098-T, even though the university is obligated to send it to me. I think this form concerns the university's taxes wrt me, not my taxes.

* Unlike most international students, I should not receive a 1042-S (it's only for Non-Resident Aliens)

Since no money is being withheld from my stipend checks, I expect to owe money to the IRS, on the order of a few thousand per year.

My stipend is taxable

One of the key principles expounded in this book is known as the "conditionality principle": given your model, if you can find a statistic that is ancillary (i.e. its distribution does not depend on the parameter of interest), then your likelihood function should be conditional on it.

Now, if the minimal sufficient statistic is complete (as is the case in any full-rank exponential family), Basu's theorem tells us that any ancillary statistic will be independent of it, i.e. there is a clean separation between sufficient and ancillary. But in curved exponential families, it can happen that there is no maximal ancillary statistic, i.e. you may have multiple choices of ancillary statistic, but combining them yields a statistic that is no longer ancillary. This is a bit troubling to me, because it breaks the nice idea of a bijection between model and likelihood function.

Given a choice between two ancillaries, C&H advise selecting the one whose Conditional Fisher Information has the greater variance. It's not immediately obvious why one should do this, but I think this can be understood as the Conditional Fisher Information giving us a lens into the conditional likelihood function. For example, if the Conditional Fisher Information has 0 variance, it may be because the ancillary statistic doesn't add any information (as is the case when the minimal sufficient statistic is complete). However, it still seems plausible to me that the Conditional Fisher Information can be constant (independent of the ancillary statistic) even while the likelihood function is sensitive to it.

C&H also hint at a notion of partial sufficiency/efficiency and how to measure it: just compute a Conditional Fisher Information, conditioning on the proposed statistic.

(Since Fisher Information is an expectation, Conditional Fisher Information is the expectation under a conditional distribution; since the quantity on the LHS is a function of the sufficient statistic, conditioning on the sufficient statistic will not change anything, whereas conditioning on something insufficient can have the effect of making the log-likelihood smoother, and the Fisher Information smaller.) Conditioning on an ancillary, however, doesn't simply make the log-likelihood sharper: the average of the Conditional Fisher Information is just the Fisher Information.

[the last paragraph is probably wrong; please comment]
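One claim in that paragraph can be checked directly: the averaging identity for an ancillary. Here is a sketch of the argument under the usual regularity conditions. The notation is my own, not C&H's: $X$ is the data with density $f(x;\theta)$, $A = a(X)$ is ancillary with marginal density $g(a)$ free of $\theta$, and $I_A(\theta)$ denotes the Conditional Fisher Information given $A$.

```latex
% Factor the density through the ancillary; g does not depend on theta.
f(x;\theta) = f(x \mid a;\theta)\, g(a)
\quad\Longrightarrow\quad
\frac{\partial}{\partial\theta} \log f(x;\theta)
  = \frac{\partial}{\partial\theta} \log f(x \mid a;\theta).

% Conditional Fisher Information: a function of A (and theta).
I_A(\theta)
  = \mathbb{E}_\theta\!\left[
      \left( \frac{\partial}{\partial\theta} \log f(X \mid A;\theta) \right)^{2}
      \,\middle|\, A \right].

% Tower property: averaging over A recovers the Fisher Information.
I(\theta)
  = \mathbb{E}_\theta\!\left[
      \left( \frac{\partial}{\partial\theta} \log f(X;\theta) \right)^{2} \right]
  = \mathbb{E}\!\left[ I_A(\theta) \right].
```

So conditioning on an ancillary redistributes the information across values of $A$ without changing its average; the variance of $I_A(\theta)$ measures how uneven that redistribution is.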

Of course, I much prefer having a real blog, with space to develop my thoughts and a permanent record. But there is something to be said for Twitter, which is hard to put my finger on. Twitter gives a feeling of immediacy, that you're speaking to the whole world. Although real blogs are just as public and just as immediate as Twitter, (a) blog interaction suffers from interoperability issues, especially commenting: there are many competing standards, and unfortunately OpenID didn't take off (though maybe Facebook Connect will), whereas Twitter is unified and simple; and (b) somehow, the set of people who read you on real blogs feels more limited.

I have a little problem now: I have "followed" way too many people on Twitter, so I need to split my friends into groups, i.e. reading lists.

Researcher: The End of Spam Is Closer Than You Think, July 2012

The End of Spam?, Jan 2011

I wish we had a standard language for naming specific sensations. I would like to convey precisely the twinge on my lower back, which might be a pinched nerve, but might just be soreness. If my teacher could feel what I feel, he would know what it was, but instead his judgement has to rely on my imperfect attempts at describing it.

When it comes to bodily sensations, we don't know how much subjectivity there is. Psychologists (psychophysicists) can often quantify the subjectivity of senses (say color), because even when words fail, they can do experiments to test whether subjects are able to detect tiny differences in stimuli (perhaps defining a metric on perceptual space, or more!), and then quantify how much people differ in this ability, in different regions of stimulus space. But when it comes to your body, it is much harder to stimulate a sensation to a precision worthy of being called "reproducible". And then there's habituation (which is also a problem for scientists trying to study smell).

Right now you could start a philosophical food fight by bringing up the label-switching problem (namely that, just because you and your teacher are in verbal agreement doesn't mean that your experiences agree), but I just want to be practical here: how can we develop a shared vocabulary that would allow me to better convey my sensation to my teacher, so that he may make a better guess about what is wrong with my back? Are there existing human cultures in which people can easily convey their bodily sensations to each other?

I think that the biggest obstacle here is establishing joint attention. It is easy to teach the names of visual stimuli to a seeing person. But when it comes to coining words to describe types of pain in the back, this becomes like two blind people trying to come up with words for categorizing shapes (they can experience shapes by touch, but without joint attention, i.e. let's say they are not allowed to pass shapes to each other).

---

Why are "the arts" traditionally visual and/or auditory? Because out of all our senses, vision and hearing are the only senses whose stimulus-response mapping is reliable enough. With the other senses, there is too much variation within and across people to have any control over their experience (which also explains why we have so few olfactory words/concepts). Smell and taste have very little spatiotemporal resolution. Touch may actually be a good candidate.

The event felt like a musicians' party full of old timers, people who had opened for Bob Dylan, and written reviews for Rolling Stone. The shows were projected live (in black&white) onto the red-brick building across the street, for a very nice effect. The bar staff were very chill, and told me that since they didn't serve food, I could bring outside food(!!!).

Insight #1, as one might imagine, is that human creativity is now worth more than it used to be, since most of the analysis is now automated.

So I've been moving my videos to my external G-Drive. But this 750GB drive, which also serves as a backup for my machine, is nearly full. This means that I have two problems:

* buy more space to store my videos.

* buy more space to have a backup of these videos.

I could buy 2x 2TB drives, but this is likely to be expensive.

Any suggestions?

See here.

Today he told me that his "style" doesn't have a name, but that his teacher was Allan Bateman.

Finally, I asked him to name some materialistic schools/styles of yoga. He told me:

* Krishnamurthi (empiricist philosophy FTW)

* Strala

* Katonah

(beware, the marketing may be mystical, but that's just marketing)

I might go to a Katonah lesson next week.

---

I seem to be making steady progress (e.g. I can now touch my toes after just a few minutes of stretching). But I'm still a long way from where I want to be. There's a lot of work ahead for strengthening my upper body (abdomen, chest, arms).

As of next week, I'm planning to do two lessons per week.

Key concepts:

*

*

*

*

Directed graphical models perhaps provide something like a more concrete mechanism, allowing us to simulate the effects of interventions and propagate them downstream. But as far as real applications are concerned, papers in this tradition tend to make assumptions less explicit, and tend to mislead practitioners into thinking that the required assumptions are satisfied. (See Dawid - "Beware of the DAG")

---

UPDATE: Cosma Shalizi writes:

<< You've read Pearl's Statistics Surveys paper, right? I think the critique of the potential outcomes framework there, in section 4, is very strong. (Look at the stuff on ignorability, especially.) As for propensity matching, when the set of covariates you're using to calculate propensities doesn't meet the back door criterion, well, you get results like this. >>

Taxation happens when you have transactions between

Income tax on services encourages do-it-yourself and informal transactions: if you get your child to do it, no one is going to come into your house and audit your child. And the bigger your house, the more taxation you can avoid. Analogously, corporate income taxes encourage mergers & acquisitions: by bringing your supplier inside the organization, you no longer have to "buy" their product, since it is now made in-house! (Kinda like erecting an eruv). This may explain why corporations, unlike people, are taxed on

It would seem natural to want to acquire your most important supplier. (Kinda like signing up for a "best friend plan" on your mobile service). But are acquisitions just a simple way to dodge taxes? For one thing, now your big organization has to run a business that may be outside of its expertise, the newly-acquired business can end up insulated from market forces, and gradually lose its competitiveness, yadda yadda yadda. Now, this analysis conflates ownership with management. Of course, it is possible to

The common justification for stopping mergers (and breaking up monopolies) is that they would make it impossible(?) to enforce rules against price-fixing.

I would like to see an empirical study of mergers. If we have a scenario in which a supplier has a single customer, is there any reason

---

Does a Major Company like Walmart have to Pay Sales Tax when Making Major Purchases From Another Business?

<< Businesses pay sales tax on items they purchase for their own use.

They don't pay sales tax on transactions in which they obtain products for resale in the store. This applies to all businesses of all sizes. Its also universal across state lines. >>

This means that merging along the production chain will not save on taxes. However, Walmart could save on taxes by buying a company that produces flooring or security cameras. Similarly, software companies could save on sales taxes by buying a coffee company.

---

* 5 tricks corporations use to avoid paying taxes

* When supply chains merge: 5 mistakes to avoid

* One Big Mutual Fund, or, The Ownership Society, by Cosma Shalizi

<< ... Ambitiously, Miller tries to explain why hierarchical corporations exist at all, why they take some of the forms they do, and how, in part, their form relates to their performance. ... >>

* MLE: for regular parameter, asymptotically normal, with rate 1/sqrt(n).

* MLE: for truncation parameter, asymptotically exponential, with rate 1/n or worse.

* If a family has both types of parameters, we cannot(?) use the Fisher Information to find the asymptotic variance of the regular one. But can't we plug in the true value of the truncation one, and use the asymptotics of the regular subfamily?

* Consistency is guaranteed if n/p -> infinity, plus a few other conditions ("for all theta, the density is bounded" should suffice)

* UMVUE: when is it asymptotically equivalent to the MLE?

* Sample quantiles

* Estimating Equations, a.k.a. no closed form for the MLE (e.g. Beta, Gamma, GLMs). Van der Vaart proves consistency (5.10), normality (5.19).

* Why the two formulas are equivalent.

* Delta Method (for R^n -> R^m functions, we can easily generalize it using Jacobians!)

* Why knowing the nuisance parameter decreases the asymptotic variance of the parameter of interest

* Why location-scale families of *symmetric* distrs have a diagonal information matrix.

* Cramér-Rao Inequality, about *unbiased* estimators, comes from Cauchy-Schwarz. Not asymptotic: holds for every n! Equality is attained when we have "linear dependence". In other words, I think this means that an unbiased estimator U will be efficient iff it can be written as: U = a*MLE + c.

* Compare: variance bounds for unbiased estimators vs other estimators.

* What if calculating the Fisher information is intractable?

* ATTENTION: is this for a single observation or for the whole sample?

* How the asymptotic normality comes from one-step Newton-Raphson.

* Why asymptotics of likelihood ratio is Chi-Square.

* Delta Method, and how if the first derivative is zero, we get slower convergence to a Chi-Squared.

* Edgeworth Expansions

* Simple vs Simple: Neyman-Pearson.

* Simple vs Composite: compute MLE of the alternative.

* Composite vs Composite: MLE of the null (a.k.a. least favorable distribution)

* UMP: Monotone Likelihood Ratio on the *sufficient* statistic implies that I{T>c} is UMP.

* UMPU: Power function has slope 0. Is it a mixture of two UMPs?

* LMP: maximize the derivative of the power function at the boundary.

* Asymptotic power under contiguous alternatives: projections, non-Central Chi-Squared (I might need more practice with basic power calculations first!)

* Do there exist simple conditions for existence or non-existence?? For location families, UMVUEs for the location parameter should always exist: U = MLE + constant.

* If U is unbiased for 0 and T is UMVUE, then Cov(U,T) = 0.

* Studentized Intervals

* Bootstrap intervals

* Many options for the Fisher information: Fisher Information at MLE, observed Fisher Information, etc. The observed Fisher Information may be biased. If the bias is positive, the resulting coverage probability will be below 1-alpha (but maybe the coverage probability converges to 1-alpha). If the bias is negative, the intervals will be conservative.

* Variance-stabilizing transformations. Do we get better intervals this way? These intervals will be asymmetrical.

* What if the first derivative of g is near zero at the MLE?

* Is S^2 always independent of beta-hat? Why?

* Why is the F-test equivalent to the T-test, and to the Likelihood Ratio Test?

* Review matrix calculus

* Simultaneous Confidence Intervals (studentized maximum modulus, studentized range distributions)

* Complete sufficient statistics for non-parametric families (e.g. all distrs, symmetric distrs, mean-zero distrs, etc)

* Kernel Regression

* Kernel Density Estimation (work out the bias!)

* U-statistics, and using projections to obtain asymptotic normality

* Review exam problem on Metropolis-Hastings

* Bayes Risk: may be minimized by posterior mean, median, mode, depending on the loss function

* Distributions: pdfs, cdfs, means, variances

* Relations between distributions: conjugacy, convolutions, scaling

* Law of Total Covariance

* Joint distribution between minimum and maximum order statistics.

* Inequalities: Markov, Chebyshev, Jensen

* Dominated Convergence / Monotone Convergence: swap limit and integral.

* (1 + x/n)^n -> e^x

* \sum_k x^k / k! = e^x

* \sum_{k=0}^n p^k = (1 - p^{n+1}) / (1 - p)

* \sum_{k=0}^n k p^k = p (1 - (n+1) p^n + n p^{n+1}) / (1 - p)^2
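The limit and series identities at the end of this list are easy to sanity-check numerically. The sketch below is mine, not from any of the textbooks: the function names are made up for illustration, and the closed form used for \sum_{k=0}^n k p^k is the standard one, p (1 - (n+1) p^n + n p^{n+1}) / (1 - p)^2, valid for p != 1.

```python
import math


def geometric_sum(p, n):
    """Brute-force sum of p^k for k = 0..n."""
    return sum(p**k for k in range(n + 1))


def geometric_closed(p, n):
    """Closed form (1 - p^(n+1)) / (1 - p), for p != 1."""
    return (1 - p ** (n + 1)) / (1 - p)


def weighted_sum(p, n):
    """Brute-force sum of k * p^k for k = 0..n."""
    return sum(k * p**k for k in range(n + 1))


def weighted_closed(p, n):
    """Closed form p (1 - (n+1) p^n + n p^(n+1)) / (1 - p)^2, for p != 1."""
    return p * (1 - (n + 1) * p**n + n * p ** (n + 1)) / (1 - p) ** 2


def exp_series(x, terms=30):
    """Partial sum of x^k / k! for k = 0..terms-1."""
    return sum(x**k / math.factorial(k) for k in range(terms))


# Compare brute force against the closed forms on a small grid.
for p in (0.3, 0.9, 1.7, -0.5):
    for n in (0, 1, 5, 20):
        assert math.isclose(geometric_sum(p, n), geometric_closed(p, n),
                            rel_tol=1e-9, abs_tol=1e-12)
        assert math.isclose(weighted_sum(p, n), weighted_closed(p, n),
                            rel_tol=1e-9, abs_tol=1e-12)

# (1 + x/n)^n approaches e^x as n grows; the series converges much faster.
x, n = 2.0, 10**6
assert math.isclose((1 + x / n) ** n, math.exp(x), rel_tol=1e-4)
assert math.isclose(exp_series(x), math.exp(x), rel_tol=1e-12)
print("all identities check out on the test grid")
```

A grid check like this is no proof, but it does catch sign and off-by-one errors in the exponents, which is the usual failure mode when writing these from memory.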

- *(no subject)*
- lit review: Kullback-Leibler divergence in Statistics PhD textbooks
- R <-> Julia dictionary
- my electricity costs twice as much in Brooklyn
- my Burning Man checklist
- dispute with AirBnb
- my taxes
- summer sublet in NYC
- conditional inference; why completeness matters
- getting back on Twitter?
- the end of spam?
- yoga and intersubjectivity: let's invent a language
- Greg Garing
- intelligence augmentation and chess
- video storage
- eye movements and cognitive state
- yoga
- causal inference
- economic organization and taxation
- Things to put on my cheat sheet / Things to think about
