| Who | When |
Messages | |
|
|
|
| |
Messages 42-40 deleted by topic administrator between 06-25-2008 02:30 AM and 07-22-2006 09:30 AM |
| |
Messages 37-34 deleted by topic administrator between 07-21-2006 09:00 AM and 07-22-2006 09:30 AM |
Charles Elkan
|
33
|
 |
|
02-13-2005 02:54 PM ET (US)
|
|
/m29 answer: Remember, you can maximize f(x) by setting its derivative to zero only when the optimal x is in the interior of the allowed range for x.
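To illustrate this point, here is a minimal Python sketch that is not part of the original thread. It takes the density p(x; r) = x / (r sqrt(r^2 - x^2)) quoted in /m29 as given and uses made-up observations; the names `log_likelihood`, `xs` and `grid` are hypothetical. On a grid of r values above max(x_i), the log-likelihood only decreases, so there is no interior point where its derivative is zero.

```python
import math

def log_likelihood(r, xs):
    """Log-likelihood of radius r under p(x; r) = x / (r * sqrt(r^2 - x^2)).

    Only defined for r > max(xs); the density is the one quoted in /m29.
    """
    return sum(math.log(x) - math.log(r) - 0.5 * math.log(r * r - x * x) for x in xs)

# Hypothetical observed cross-section radii (illustrative values only).
xs = [0.31, 0.72, 0.55, 0.90, 0.84]

# Grid of candidate radii strictly above the largest observation.
grid = [max(xs) + 0.01 * k for k in range(1, 101)]
values = [log_likelihood(r, xs) for r in grid]

# True when every step to a larger r lowers the log-likelihood.
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
print(is_decreasing)
```

In a case like this the maximizer has to be looked for at the edge of the allowed range rather than by solving "derivative = 0".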
|
Charles Elkan
|
32
|
 |
|
02-13-2005 02:51 PM ET (US)
|
|
/m31 answer: Your question is perhaps the part of the problem that is conceptually the least clear. I think the answer is that the point at which each organelle is sliced is uniformly distributed, hence the observed radius of the disk obtained from the organelle is not uniformly distributed.
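A quick simulation can make this concrete. The Python sketch below is an illustration only, assuming a sphere of radius r = 1 and a cutting plane whose distance from the center is uniform on [0, r]; all names and numbers are hypothetical. It checks how often the observed disk radius exceeds r * sqrt(3) / 2.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

r = 1.0          # true organelle radius (illustrative value)
n = 100_000      # number of simulated slices

# Distance of the cutting plane from the sphere's center, uniform on [0, r].
offsets = [random.uniform(0.0, r) for _ in range(n)]

# Radius of the disk seen in each slice.
observed = [math.sqrt(r * r - x * x) for x in offsets]

# If the observed radius were uniform on [0, r], about 13.4% of the disks
# would have radius above r * sqrt(3) / 2. Under uniform slicing, that
# happens exactly when the offset is below r / 2, i.e. about half the time.
threshold = r * math.sqrt(3) / 2
fraction_large = sum(z > threshold for z in observed) / n
print(round(fraction_large, 3))
```

Large observed radii are therefore much more common than a uniform distribution on the radius would predict.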
|
| Banu Dost
|
31
|
 |
|
02-13-2005 02:27 PM ET (US)
|
|
For problem 4, which variable is uniformly distributed: the distance of the cross-section, or the radius of the sphere itself? How can we decide this? The two cases give similar, but not identical, pdfs.
|
| Stephen Krotosky
|
30
|
 |
|
02-13-2005 02:25 PM ET (US)
|
|
/m29 sorry that should read: s(x_i,r) = -1/r - r/(r^2-x_i^2)
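For reference, the corrected expression is what one gets by differentiating the log of the density stated in /m29 with respect to r. A short derivation, taking that density as given (it is the poster's and is not confirmed elsewhere in the thread), with n the number of observations:

```latex
\begin{aligned}
\log p(x_i; r) &= \log x_i - \log r - \tfrac{1}{2}\log\left(r^2 - x_i^2\right), \\
s(x_i, r) = \frac{\partial}{\partial r}\log p(x_i; r) &= -\frac{1}{r} - \frac{r}{r^2 - x_i^2}, \\
s(x, r) = \sum_{i=1}^{n} s(x_i, r) &= -\frac{n}{r} - r\sum_{i=1}^{n}\frac{1}{r^2 - x_i^2}.
\end{aligned}
```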
|
| Stephen Krotosky
|
29
|
 |
|
02-13-2005 02:24 PM ET (US)
|
|
Thanks, I had figured that out.
Now I'm having trouble finding the MLE, because I can't break up the sum for the total score function.
I know that each x_i ~ p*(x_i,r) = x_i / ( r sqrt(r^2-x_i^2) )
s(x_i,r) = -1/r - r/(r^2-x_i)^2
I need to find s(x,r) = \sum s(x_i,r)
When I do that, I can't seem to separate it into something solvable. Does anyone have any tips on manipulating the sum so that the score function is expressed with the sum only over the x_i's?
Thanks,
Stephen
|
| Jan Schellenberger
|
28
|
 |
|
02-13-2005 05:34 AM ET (US)
|
|
/m27 There is a trick. You can use the CDF of the uniform (P(z<Z)) to correspond to some CDF of the radii (P(r>R)). Once you have that, the PDF is the derivative of the CDF.
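Written out, the trick might look as follows. This is a sketch under the setup of /m27: the cut offset x is uniform on [-r, r], the observed radius is Z = sqrt(r^2 - x^2), and lowercase z denotes a value of Z.

```latex
\begin{aligned}
P(Z \le z) &= P\left(|x| \ge \sqrt{r^2 - z^2}\right) = 1 - \frac{\sqrt{r^2 - z^2}}{r}, \qquad 0 \le z \le r, \\
f_Z(z) &= \frac{d}{dz}\,P(Z \le z) = \frac{z}{r\sqrt{r^2 - z^2}}.
\end{aligned}
```

The first equality uses the fact that |x| is uniform on [0, r], so P(|x| >= a) = 1 - a/r. The resulting density has the same form as the one quoted in /m29.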
|
| Stephen Krotosky
|
27
|
 |
|
02-13-2005 01:55 AM ET (US)
|
|
For problem 4,
I've figured out what the distribution looks like, and I believe I can define it mathematically, but am having trouble doing so. Basically what I would like to do is transform a uniform distribution to the desired one. Here is my logic.
If we look at a cross section of an organelle of radius r, the cross section can be thought of as cutting the organelle at a point x that is generated uniformly from -r..r. This corresponds to the front and back of the sphere relative to the cutting plane.
If we know where the sphere is cut, we can compute the observed radius, Z = sqrt(r^2 - x^2). My question is: how can I use the function for Z and the fact that x is uniform to find the pdf (likelihood function) for Z?
I know this is possible but am having difficulty actually doing it. Thanks.
|
Charles Elkan
|
26
|
 |
|
02-12-2005 03:47 PM ET (US)
|
|
/m25 answer: Yes, you may assume that the number of cross-sections n is fixed.
|
| Hyun Min Kang
|
25
|
 |
|
02-12-2005 12:13 PM ET (US)
|
|
/m21 /m24 Thanks. I understand what you mean. One more question. Can we assume that the number of cross-sections is always n? If the organelles are distributed randomly, the number of cross-sections will actually vary. So, the distribution should also involve the probability distribution of n. But since there is no description of the density of the organelles, I cannot compute the exact distribution. (Also, the problem becomes much more complicated.)
|
Charles Elkan
|
24
|
 |
|
02-12-2005 01:07 AM ET (US)
|
|
|
Charles Elkan
|
23
|
 |
|
02-12-2005 01:02 AM ET (US)
|
|
|
Charles Elkan
|
22
|
 |
|
02-12-2005 12:56 AM ET (US)
|
|
|
| Ryan Kelley
|
21
|
 |
|
02-11-2005 11:40 PM ET (US)
|
|
/m20 Just as a random variable has a distribution, so does any function of that variable. Since an estimator is just a function of the data, it will have some distribution as well. In your example, if x1,...,xn are iid Gaussian(\mu,\sigma^2), then avg(x1,...,xn) will follow a Gaussian distribution with mean \mu and variance \frac{\sigma^2}{n}.
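The claim about the sample mean can be checked by simulation. The Python sketch below is illustrative only: the values of `mu`, `sigma`, `n` and `replicates` are arbitrary assumptions. It draws many data sets, computes the estimate for each, and compares the spread of the estimates with \sigma^2/n.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 2.0, 3.0   # illustrative parameters
n = 25                 # sample size behind each estimate
replicates = 20_000    # number of independent data sets

# One estimate (the sample mean) per simulated data set.
estimates = [
    statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(replicates)
]

mean_of_estimates = statistics.fmean(estimates)
var_of_estimates = statistics.pvariance(estimates)

# Compare the empirical variance of the estimates with sigma^2 / n.
print(mean_of_estimates, var_of_estimates, sigma**2 / n)
```

The list `estimates` is an empirical picture of "the distribution of the estimate": each entry is what the estimator would have returned on a different data set.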
|
| Hyun Min Kang
|
20
|
 |
|
02-11-2005 09:44 PM ET (US)
|
|
Edited by author 02-11-2005 09:44 PM
In problem 4, I don't get the last sentence. What did you mean by "distribution of estimate"? For example, if avg(x1,..,xn) is the MLE of \mu in Gaussian, what is the distribution of the estimate?
|
| Hyun Min Kang
|
19
|
 |
|
02-11-2005 12:37 PM ET (US)
|
|
/m17 Thanks for the information. I found the Dirichlet there, but the Zipf distribution is not stated clearly, and there is no power law distribution. Where can I find this information?
|
| Hyun Min Kang
|
18
|
 |
|
02-11-2005 12:33 PM ET (US)
|
|
In part 3(c), "Explain carefully whether or not the theorem relies on the exponential family being described using its natural parameters." I don't get that part. What does it mean, and what am I supposed to explain?
|
| Banu Dost
|
17
|
 |
|
02-11-2005 05:19 AM ET (US)
|
|
from wikipedia.com Banu
|
| Hyun Min Kang
|
16
|
 |
|
02-11-2005 03:03 AM ET (US)
|
|
Where can I get the formal definition of Dirichlet distributions, power law distributions, and Zipf distributions?
|
Charles Elkan
|
15
|
 |
|
02-10-2005 11:55 AM ET (US)
|
|
/m10, /m11 answer: Yes, the binomial assumes N is large. You may assume this; I should have mentioned it in the problem statement. No need to do the more difficult hypergeometric calculations.
|
Charles Elkan
|
14
|
 |
|
02-10-2005 11:52 AM ET (US)
|
|
/m10 answer: I think the reasoning with the binomial is correct, and the hypergeometric is not needed for this problem. Be careful with this claim: "if the actual N is less than the claimed N, we would expect to see more tagged animals." I'm not saying it's false (or true) just that it requires careful thought to be sure.
|
Charles Elkan
|
13
|
 |
|
02-10-2005 11:45 AM ET (US)
|
|
/m7 answer: I discussed this with Ryan, and I think he is correct. This does not mean that the Gaussian-based answer is incorrect, just that it may be unreliable in the real world. It points out the need to investigate which distribution(s) model real-world returns well.
|
| Stephen Krotosky
|
12
|
 |
|
02-09-2005 07:50 PM ET (US)
|
|
/m11 Oh I see. Thanks. I didn't think about the fact that I was assuming iid. That makes perfect sense.
|
| Jan Schellenberger
|
11
|
 |
|
02-09-2005 07:47 PM ET (US)
|
|
/m10 That seems reasonable. The hypergeometric distribution is a generalization of the binomial, and if N were large you could use the binomial. However, if N is small then your samples are no longer iid (finding a tagged animal decreases the chance that the next animal observed will be tagged). The hypergeometric distribution takes this into account. -Jan
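The effect of population size can be seen numerically. The Python sketch below is an illustration with made-up numbers: it compares the hypergeometric pmf (sampling without replacement) with the binomial pmf using p = m/N, for a small and a large population with the same tagged fraction. The function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(r, N, m, n):
    """P(r tagged in a sample of n drawn without replacement from N animals, m tagged)."""
    return comb(m, r) * comb(N - m, n - r) / comb(N, n)

def binom_pmf(r, n, p):
    """P(r successes in n independent trials with success probability p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def max_gap(N, m, n):
    """Largest absolute difference between the two pmfs over r = 0..n."""
    p = m / N
    return max(abs(hypergeom_pmf(r, N, m, n) - binom_pmf(r, n, p)) for r in range(n + 1))

# Same tagged fraction (20%) and sample size, two population sizes (illustrative numbers).
gap_small = max_gap(N=50, m=10, n=20)
gap_large = max_gap(N=5000, m=1000, n=20)
print(gap_small, gap_large)
```

With N = 50 the two pmfs differ visibly, while with N = 5000 they nearly coincide, which is the sense in which the binomial "assumes N is large".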
|
| Stephen Krotosky
|
10
|
 |
|
02-09-2005 07:42 PM ET (US)
|
|
Edited by author 02-09-2005 07:42 PM
/m8 /m9: This has confused me. My initial logic was that we would assume that the number of animals found tagged, r, would follow a binomial distribution in our hypothesis test. Assuming that we initially believe that there are N animals, we can assume that p = m/N. We also know that the E[r] = n*p and the var[r] = n*p*(1-p). I was thinking that in simulation we could find the observed r and then see if it exceeds a certain threshold. We would only be concerned with exceeding it, since if the actual N is less than the claimed N, we would expect to see more tagged animals. Is this reasoning correct and if so, is it similar or identical to the hypergeometric reasoning? Thanks
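The threshold idea described here could be sketched as follows in Python. This is only an illustration under the binomial assumption p = m/N with made-up numbers; the names `binom_upper_tail` and `rejection_threshold` and the level `alpha` are hypothetical, and the sketch does not settle whether a one-sided test is the right choice.

```python
from math import comb

def binom_upper_tail(t, n, p):
    """P(R >= t) for R ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t, n + 1))

def rejection_threshold(n, p, alpha):
    """Smallest t with P(R >= t) <= alpha under Binomial(n, p)."""
    for t in range(n + 2):
        # For t = n + 1 the tail is empty (probability 0), so the loop always returns.
        if binom_upper_tail(t, n, p) <= alpha:
            return t

# Illustrative numbers: claimed population size, m tagged animals, sample of n.
N_claimed, m, n = 1000, 100, 50
p0 = m / N_claimed
t = rejection_threshold(n, p0, alpha=0.05)
print(t, binom_upper_tail(t, n, p0))
```

An observed count of tagged animals at or above `t` would then be unusual if the claimed N were correct.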
|
| Jan Schellenberger
|
9
|
 |
|
02-09-2005 06:50 PM ET (US)
|
|
|
| Michael Sanders
|
8
|
 |
|
02-09-2005 02:38 PM ET (US)
|
|
How do you calculate the probability of a hypergeometric for a population N with m tags and a sample size n with r tags, where n,r << N? Using the formula in Casella causes overflow in Matlab
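One common way around this kind of overflow is to work with logarithms of the binomial coefficients, using the log-gamma function, and to exponentiate only at the end. Matlab provides `gammaln` for this; the sketch below shows the same idea in Python with `math.lgamma`. The function names and the numbers are illustrative assumptions, and the code expects 0 <= r <= m and n - r <= N - m.

```python
from math import comb, exp, lgamma

def log_choose(a, b):
    """log of the binomial coefficient C(a, b), via log-gamma."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def hypergeom_logpmf(r, N, m, n):
    """log P(r tagged in a sample of n from a population of N with m tagged)."""
    return log_choose(m, r) + log_choose(N - m, n - r) - log_choose(N, n)

# Sanity check against exact integer arithmetic on a small case.
exact = comb(30, 4) * comb(170, 16) / comb(200, 20)
approx = exp(hypergeom_logpmf(4, N=200, m=30, n=20))

# A large population, where the individual binomial coefficients are
# far too large for double-precision arithmetic.
big = exp(hypergeom_logpmf(12, N=10**7, m=10**5, n=1000))
print(exact, approx, big)
```

Only differences of log-gamma values are ever exponentiated, so the intermediate quantities stay within floating-point range.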
|
| Ryan Kelley
|
7
|
 |
|
02-09-2005 01:44 AM ET (US)
|
|
/m2 I think I am experiencing a similar problem, where the expected gain is undefined if the stock market distribution is Cauchy (even for a large number of trials, the value does not converge). Supposing this is the case, should we just report this as evidence that our answer in the previous case is incorrect?
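The lack of convergence can be reproduced in a few lines. The Python sketch below is an illustration with arbitrary sizes, not the assignment's setup: it compares how much sample means vary across repeated experiments for standard Cauchy and standard Gaussian draws. The helper names `cauchy` and `iqr` are hypothetical.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

def cauchy():
    """One standard Cauchy draw via the inverse CDF, tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (random.random() - 0.5))

def iqr(values):
    """Interquartile range of a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

trials = 500        # draws averaged in each sample mean
replicates = 400    # number of sample means per distribution

cauchy_means = [statistics.fmean([cauchy() for _ in range(trials)]) for _ in range(replicates)]
gauss_means = [statistics.fmean([random.gauss(0.0, 1.0) for _ in range(trials)]) for _ in range(replicates)]

# Spread of the sample means: shrinks like 1/sqrt(trials) for the Gaussian,
# stays near the spread of a single draw (IQR = 2) for the Cauchy.
print(iqr(cauchy_means), iqr(gauss_means))
```

Averaging more Cauchy draws does not narrow the distribution of the average, which is why the simulated mean keeps jumping around.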
|
Charles Elkan
|
6
|
 |
|
02-08-2005 02:25 PM ET (US)
|
|
|
| samory
|
5
|
 |
|
02-08-2005 03:03 AM ET (US)
|
|
OK, never mind ... r is the radius of the spheres, and the xi's are radii of circles cut out of those spheres.
Samory.
|
| samory
|
4
|
 |
|
02-08-2005 02:46 AM ET (US)
|
|
Hi, About Question 4:
I don't understand how the organelles have "equal" radius r if we could at the same time observe n different radii. Could you please clarify?
Thanks,
Samory
|
Charles Elkan
|
3
|
 |
|
02-05-2005 01:04 PM ET (US)
|
|
/m2 answer: What you write is thoughtful, but I will need some more time to understand it. (1) answer: Yes, simulate with Cauchys and see if the answer based on Gaussians is still useful. (2) answer: To keep the simulation simple, I suggest simulating just one year at a time. While the sum of k Gaussians is itself a Gaussian, this fact is not true for most other distributions. So, just generate each simulation one year at a time.
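A year-by-year simulation along these lines might look like the Python sketch below. It is an illustration only: the values of `d` (median yearly return), `g` (spread) and the horizon are made up, the function names are hypothetical, and the cash-investment comparison from the assignment is left out.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

def cauchy_return(d, g):
    """One yearly return: Cauchy with median d and scale g, via the inverse CDF."""
    return d + g * math.tan(math.pi * (random.random() - 0.5))

def simulate_total(years, d, g):
    """Total return over several years, generated one year at a time."""
    total = 0.0
    for _ in range(years):
        total += cauchy_return(d, g)
    return total

# Illustrative values only: median yearly return d, spread g, horizon in years.
d, g, years = 0.08, 0.10, 5
totals = [simulate_total(years, d, g) for _ in range(4000)]

# The mean of Cauchy totals is unstable, so summarize with the median instead.
print(statistics.median(totals))
```

Generating each year separately avoids having to know the distribution of the multi-year sum in closed form.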
|
| Stephen Krotosky
|
2
|
 |
|
02-04-2005 06:59 PM ET (US)
|
|
I have some questions on problem 1.
1) When you say repeat part c, do you mean that we simulate using Cauchy and compare to our theoretical answers for Gaussian, since Cauchy integrals blow up?
2) To generate Cauchy distributed samples, I am executing the following code:
temp = rand(samples,1)*pi - pi/2;
xcauchy = tan(temp).*g.*sqrt(n) + d*n;
where g is the "spread" of the Cauchy distribution, and d is the median return value. I multiply these by sqrt(n) and n respectively to account for the change in distribution after n years. I've plotted these compared with Gaussians and am able to get pdfs that look similar in terms of "mean" and "variance".
However, I'm unsure how to actually use this to get valid results. In the Gaussian case, I can simply generate a Gaussian, find the ones that are lower than the cash investment and replace those values and find the new mean as shown:
xnormal = randn(samples,1).*s.*sqrt(n) + d*n;
I = find(xnormal < n*c);
xnormal(I) = n*c;
avg_ret_normal(n) = mean(xnormal) + (N-n)*d;
However, if I do that for the Cauchy, the mean blows up. If I try to take the median instead of the mean, this would just give me a value close to n*d, since the median wouldn't change if I just truncate the tail and replace it with cash investment values.
One solution I just thought of trying would be to find a new median associated with just the values > n*c, but I'm not sure if this is valid, or if it would be a valid comparison to the Gaussian case.
Any thoughts would be appreciated.
|
Charles Elkan
|
1
|
 |
|
02-01-2005 03:06 PM ET (US)
|
|
Please ask questions here about the third assignment for "Statistical Learning".
|