CS 362 Discussion: Problem Set #1 (5.1, 5.3)

Thursday, January 19, 2012


...

9 comments:

Theodore Schnepper (January 22, 2012 at 6:37 PM)
Looking at 5.1, I notice we're aiming for a running time of O(log n), which implies a divide-and-conquer strategy that recurses on one half of the problem. At the same time, we have two databases to consider, and we may well have to do this twice.

This was my initial assumption, anyway. Finding the median across two databases of values, each sorted individually, makes for an interesting problem. In thinking it over, I've come across some test cases that complicate things.

But first, let's lay down some facts. We know there are 2n numbers in total across the two databases, and that each database contains n numbers. We also know that all of the numbers are unique. Based on the definition of the median, our goal is to find the two middle numbers across the two databases and average them to determine the median.

Initially, the thought is that there is one number in each database that you can narrow down to. So if you start with the n/2 value in both databases, you merely need to compare which is smaller and which is larger, and move up or down in each database accordingly. This keeps the problem balanced, so that you always keep n numbers below the potential median values and n numbers above. This seems like a pretty good start; however, there is a problem with it.

The problem does NOT guarantee that the two numbers will be in separate databases. This means the two numbers that make up the median could very well be in the same database, so assuming otherwise is very dangerous.

Here's an example:
1,2,3,4,5,6
0,7,8,9,10,11

In that case, the correct median values are 5 and 6, both from the first database, for an actual median of 5.5.
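
For what it's worth, here is a minimal sketch of one way around that assumption. It is not the approach described above; it swaps in a binary search over how many of the k smallest values come from the first database, so it never needs to assume which database a median value lives in. The names `kth_smallest` and `median_of_two` are made up for illustration, plain list indexing stands in for a database query, and the median is taken as the average of the nth and (n+1)th smallest values, as in this comment.

```python
def kth_smallest(a, b, k):
    """Return the k-th smallest (1-indexed) value in the union of sorted lists a and b.

    Indexing a[i] or b[j] stands in for one database query (an assumption).
    """
    # i = number of the k smallest values taken from a; the other k - i come from b.
    lo, hi = max(0, k - len(b)), min(k, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        j = k - i
        if a[i] < b[j - 1]:
            # a[i] is smaller than a value already counted from b,
            # so a must contribute more than i values.
            lo = i + 1
        else:
            hi = i
    i, j = lo, k - lo
    # The k-th smallest is the larger of the last value taken from each list.
    candidates = []
    if i > 0:
        candidates.append(a[i - 1])
    if j > 0:
        candidates.append(b[j - 1])
    return max(candidates)


def median_of_two(a, b):
    """Average of the n-th and (n+1)-th smallest of the 2n values."""
    n = len(a)
    return (kth_smallest(a, b, n) + kth_smallest(a, b, n + 1)) / 2


print(median_of_two([1, 2, 3, 4, 5, 6], [0, 7, 8, 9, 10, 11]))
```

Each call to `kth_smallest` halves the search range on every step, so the number of lookups stays logarithmic in n, and running it twice only doubles that.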

Another tough case will be the border cases, where the median values occur at the edge of a database. However, at this point, because of the first example, I'm somewhat stuck on an approach. Everything I keep coming up with seems to suffer from the same assumption. There has to be some way to treat both databases as one and work from there, but because they are sorted individually I can't seem to see it.

Does anyone have thoughts on this problem so far that might be more insightful?

Mark Montoya (January 23, 2012 at 1:14 PM)
Considering how tricky it gets when your recursion has narrowed it down to 2 or 3 elements per list, I think it's reasonable to manually (naively) find the median of the combined 4 to 6 elements, since this would be a constant-time addition to the algorithm. I believe it makes the base case of the recursion much simpler to deal with.
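
A minimal sketch of that kind of base case, assuming the recursion hands back two short sorted lists of equal length and that the median is the average of the two middle values, as in the first comment. The name `small_case_median` is made up for illustration.

```python
def small_case_median(a, b):
    """Brute-force median of two short lists by merging and sorting them.

    Constant time as long as the lists are bounded in size (say 2 or 3 each).
    """
    merged = sorted(a + b)
    half = len(merged) // 2
    # With an even number of values, average the two middle ones.
    return (merged[half - 1] + merged[half]) / 2


print(small_case_median([1, 4], [2, 3]))
```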

Anonymous (January 29, 2012 at 5:54 PM)
I had a question pertaining to 5.3:

The question asks: "Is there a set of more than n/2 of the bank cards that are all equivalent to one another?"

Is the question asking whether more than n/2 of the cards could all belong to the same bank account?

Thinking out loud:
It seems that if n = 20, then a single bank account could have 11 or more cards. That seems strange to me. Oh! Credit fraud, perhaps?

Thanks for any feedback :)
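
One way to pin down that reading is a naive sketch like the one below. It assumes the only thing we can do with two cards is ask whether they are equivalent; `same_account` is a hypothetical stand-in for that equivalence tester, and `has_majority_class` is a made-up name. It uses on the order of n^2 tests, so it only illustrates what is being asked and is not meant as an efficient solution.

```python
def has_majority_class(cards, same_account):
    """Return True if more than n/2 of the cards are all equivalent to one another."""
    n = len(cards)
    for card in cards:
        # Count the cards equivalent to this one (the card itself included).
        matches = sum(1 for other in cards if same_account(card, other))
        if matches > n / 2:
            return True
    return False


# Toy data: each card is represented by its account label (an assumption for the demo).
cards = ["A"] * 11 + ["B"] * 5 + ["C"] * 4
print(has_majority_class(cards, lambda x, y: x == y))
```

With n = 20, the answer is yes only when some account holds 11 or more of the cards, which matches the reading above.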