Topic: CSE 291 Winter 2004 Assignment 3 questions
Views: 1797, Unique: 798 
Subscribers: 1
All messages    << 12-27  1-11 of 27        
Jay  1
02-11-2004 12:07 AM ET (US)
In the LRT, -2*Log(L(H0)/L(H1)) = 2*Sum_i(o_i * Log(o_i/e_i)).
How should we handle cases where o_i equals zero?
Andrew  2
02-11-2004 12:13 AM ET (US)
In the derivation of the Pearson X^2 statistic, we used the Taylor expansion of ln(1+x), and we are left with

Sum_{i=1..n} [ d^2/e - d^3/e^2 ]

but the Pearson X^2 statistic is

Sum_{i=1..n} [ d^2/e ]

My question is where did the d^3/e^2 term go?

Since it is sometimes positive and sometimes negative, did we assume (or hope) that it would cancel itself out?
Jay  3
02-11-2004 12:22 AM ET (US)
I guess we just ignore it, since we assume d << e. That's why using X^2 to approximate the LRT statistic only holds when d << e.
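To see numerically how much dropping the cubic term matters, here is a minimal Python sketch. It is only an illustration under stated assumptions: the counts are made up, and the function names (g_statistic, pearson_x2, cubic_term) are hypothetical, not from the course notes. It computes the LRT statistic 2*Sum(o*ln(o/e)), the Pearson statistic Sum(d^2/e), and the term Sum(d^3/e^2) discussed in this thread, once with d << e and once with d comparable to e.

```python
import math

def g_statistic(observed, expected):
    """LRT statistic 2 * sum(o * ln(o / e)); cells with o == 0 contribute 0."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

def pearson_x2(observed, expected):
    """Pearson statistic sum(d^2 / e), with d = o - e."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def cubic_term(observed, expected):
    """The term sum(d^3 / e^2) that the thread asks about."""
    return sum((o - e) ** 3 / e ** 2 for o, e in zip(observed, expected))

# Made-up counts; both observed vectors sum to the same n as `expected`.
expected = [100.0, 100.0, 100.0]
small_dev = [103.0, 98.0, 99.0]   # d << e
large_dev = [10.0, 145.0, 145.0]  # d comparable to e

for name, obs in [("small deviations", small_dev),
                  ("large deviations", large_dev)]:
    print(name,
          "G =", g_statistic(obs, expected),
          "X^2 =", pearson_x2(obs, expected),
          "cubic =", cubic_term(obs, expected))
```

With the small deviations the two statistics nearly coincide and the cubic term is tiny compared with X^2. With the large deviations they differ substantially, which is consistent with the approximation only being reasonable when d << e.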
Nuno  4
02-13-2004 08:55 PM ET (US)
I don't quite understand question 2d). Under H_0 we have a predetermined positioning of the centers c1 & c2 that gives us the smallest r(x)'s - part a).

But under H_1 (any distribution for X) we can place the centers anywhere, including the optimal positions that minimize r(X).

Therefore how can r(X) ever be larger under H_1 than under H_0?

Maybe I'm missing something?
Jay  5
02-14-2004 02:35 AM ET (US)
Don't just think about one or two Gaussians. Think more broadly.
Anjum Gupta  6
02-15-2004 11:34 AM ET (US)
Edited by author 02-15-2004 12:42 PM
I have a few questions --
i) How do we deal with the cases when O_i is zero (the observed number of events is zero)? This is the same question asked by Jay.

ii) In general, I am a bit confused about the relationship between the chi-squared test of independence statistic and the Pearson chi-squared statistic. Is there a one-line clarification? Also, am I right in thinking that if you use a normal approximation for a binomial distribution, then the LRT statistic boils down to the Pearson chi-squared statistic?

iii) In Prof. Weber's notes, I am not clear how expression (3) is derived. How is it that 2*O_i + e_i is replaced by 2n + n?

Thanks.
Jay  7
02-15-2004 02:28 PM ET (US)
Anjum:

I think the tests in the two articles are different.

In the note, we are testing if given samples are from a binomial random process with certain p.

In the thesis, we are testing if two samples are from two independent binomial random processes.
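
One way to make this distinction concrete is the Python sketch below. It is an interpretation, not the exact setup of either article: it assumes the note's test compares one sample against a fixed, known p, and it assumes the thesis's test compares two binomial samples under the null hypothesis that they share one unknown p (estimated by pooling). The function names are hypothetical. Both tests reuse the same statistic, 2*Sum(o*ln(o/e)), and differ only in how the expected counts are obtained.

```python
import math

def g2(observed, expected):
    """2 * sum(o * ln(o / e)); cells with o == 0 contribute 0."""
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

def fit_to_known_p(k, n, p):
    """One sample: k successes in n trials, tested against a fixed p."""
    return g2([k, n - k], [n * p, n * (1 - p)])

def same_p_in_two_samples(k1, n1, k2, n2):
    """Two samples: H0 says both share one unknown p, estimated by pooling."""
    p = (k1 + k2) / (n1 + n2)
    observed = [k1, n1 - k1, k2, n2 - k2]
    expected = [n1 * p, n1 * (1 - p), n2 * p, n2 * (1 - p)]
    return g2(observed, expected)

# Made-up counts for illustration only.
print(fit_to_known_p(60, 100, 0.5))            # one sample vs. p = 0.5
print(same_p_in_two_samples(10, 100, 40, 100)) # two samples, pooled p
```

In the first function the null hypothesis fixes p in advance. In the second, p is estimated from the data, so the expected counts depend on both samples.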
Nuno  8
02-15-2004 09:24 PM ET (US)
Jay:

Thanks for your reply but the problem was that I was not understanding the question itself. Re-reading it now it makes more sense if I interpret it as "Find a non-Gaussian distribution _hypothesis_ ...". I was only looking for a non-Gaussian distribution of the sample points.

Anjum:

i) We saw in class that you should ignore the terms 0*log(0).

ii) Dunning makes a good point of clearly separating Pearson's chi-square statistic from any other chi-square statistic. In short, Pearson's formulation is an approximation to the LRT score when the binomial can be approximated by a normal distribution and you're testing for independent binomials vs a generic multinomial. I do recommend chapters 4 and 5 of Dunning's thesis - very clear and readable.

Nuno
Charles Elkan  9
02-15-2004 09:30 PM ET (US)
Answer to /m4: We can calculate r(x) for any distribution, with centers that depend on that distribution.

The question is: find a non-Gaussian distribution that appears even more spread-out than a Gaussian with same mean and variance.
Charles Elkan  10
02-15-2004 09:39 PM ET (US)
Answers to /m6:

i) When O_i is zero, assume that 0*log 0 is zero. We discussed this question in class on Thursday.

ii) Tests of independence are really tests whether observed data were generated by a particular multinomial.

Note that different null hypotheses give different tests, but they all have the same test statistic.

Because the tests we use are LRT tests, they involve the chi-squared distribution, so they are called chi-squared tests.

The Pearson test is derived from the LRT test by taking the Taylor approximation of the log function.

Dunning says that the Pearson test can be derived from the LRT test by using a Gaussian to approximate a binomial. However, he does not explain this in detail.
 
iii) SUM (2*O_i + e_i) = 3n because n = SUM e_i and also n = SUM O_i.
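
Answers i) and iii) can be checked with a short Python sketch. This is only an illustration: the counts are made up and the helper names (xlogy, lrt_statistic) are hypothetical. The helper applies the convention 0*log(0) = 0 explicitly, so a cell with no observed events contributes nothing to the statistic instead of raising a math error.

```python
import math

def xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def lrt_statistic(observed, expected):
    """2 * sum(O_i * ln(O_i / e_i)); assumes every e_i > 0."""
    return 2.0 * sum(xlogy(o, o / e) for o, e in zip(observed, expected))

# Made-up counts; the first cell has zero observed events.
observed = [0, 7, 13]
expected = [2.0, 8.0, 10.0]
n = sum(observed)

print(lrt_statistic(observed, expected))

# Check of iii): SUM (2*O_i + e_i) = 3n when SUM O_i = SUM e_i = n.
total = sum(2 * o + e for o, e in zip(observed, expected))
print(total, 3 * n)
```

The zero cell is skipped before the logarithm is evaluated, which is the same as treating its term as zero.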
Michael  11
02-16-2004 12:31 AM ET (US)
I'm having a little trouble being sure I understand the charts from the thesis. Any clarification is appreciated:

1. Charts 5.4 and 5.5 merely show the relative significance of Pearson's chi-squared statistic and the LRT statistic under justified and unjustified normal-approximation scenarios, respectively. These charts do not necessarily show how well either statistic follows a chi-squared distribution.

2. Chart 5.6 shows how well the LRT statistic follows the theoretical chi-squared distribution. It does not show how well Pearson's statistic follows the chi-squared distribution.

Is my understanding correct? I just want to make sure that I am not missing something or reading too much out of the examples. Thanks.