亞馬遜編碼面試題--最接近原點的K點 (Amazon Coding Interview Question - K Closest Points to the Origin)

Subtitles section Play video

Here's a coding interview problem from Amazon:
We have an array of tuples, or an array of arrays of two items each.
This represents points on a two-dimensional space
So for example this first item, (-2, 4) represents
this point right here because the x-coordinate of this point is -2 and the y-coordinate of this point is
4 and the problem is this:
given these points,
how would you find the K closest points to the origin? And of course the origin is at (0,0).
So for example, if the given K is 2, how would you find the two closest points
to the origin out of these points? The two closest points here are (-1,0) and (0,-2)
so you should be able to print these points and try solving this problem in a more general way
To practice pause the video right here
And think about how you would solve the problem and then come back to the video
the first natural step for solving this problem is to find the distance to each point from the origin and
of course we can use the Pythagorean theorem to do that so to find the distance to for example at this point, (-2,4)
We can just find the value of square root of
minus 2 squared plus 4 squared which
is equal to
square root of 4 plus 16 which is square root of 20 so that's
the distance from the origin to (-2,4) and we can go through the same procedure for
every other point to find the distance for every other point as well, and after that you can just
construct a new array with objects
containing the coordinates of each point as well as the distance to each point from the origin and
this new array might look like this
I'm calling this new array points_with_d as in points with distance and the first object of this
new array will correspond to the first item in the original array
with the coordinates (-2,4) and with this is square root of 20 of course,
and we're going to have the same type of object for every other point in the original points so the length of points_with_d
the new array will be the same as the length of the original array points,
and you can use dictionaries or hash tables to store this information
Instead of using objects as well
Once you have this array all you need to do is, you just need to find the K objects or the K points
with the shortest distance
So at this point this problem is actually equivalent to the problem of finding the K smallest
elements in a given array of numbers
for example if the given array is this array of integers, and if the given K is 3
We want to be able to find the three smallest items out of this array
Which are 1 2, and there are right here now
There are a few different approaches for solving this problem
The simplest approach is the one in which you sort this array in an ascending order so that the smallest items
come first and out of the sorted array take the K first items,
and that approach would take Big O of n log n in time if you use for example quick sort or merge sort
another approach is using a variation of something called selection sort and
I'm not going to go into detail here in this video
But this approach would take Big O of nk in time
Now my personal favourite approach is the one in which we use a max-heap and this one would take
Big O of (K + (n - k) log k) in time. Let me explain
how it works. First of all let's quickly go over what a heap is. In a sentence
It is an efficient way to keep track of the largest value in the given collection of numbers
Suppose as an example
You're given these three numbers
4, 1, 5. If you construct a max-heap with these three numbers, you can sort of visualise it as
a bag or a box of these three numbers,
and however many numbers you have in the max-heap
Retrieving the largest number out of it is very efficient. In fact, asking the heap what the largest number is
Without modifying the heap, so just examining what the number is without retrieving the number itself
will be done in Big O of 1 in time. Now, there are a few other operations
we're going to use on a heap for this particular problem. The first one is create
creating a heap, a max-heap, with K items
takes Big O of K in time the second operation is replace.
So for example, if we want to replace the current largest number in
the given heap, this one with a new number three,
then this heap will be rearranged so that the current largest number will be four, and they will have one and three
below it. This operation takes Big O of log K,
where K is the number of items in the heap, and the third operation is print all.
This prints all the items in the given heap in the order
that's not necessarily sorted. So for example, this operation on this particular heap
Might print 4, 1, 3, and this operation takes Big O of K in time.
Now, how can we use a max-heap and those operations on it to find the K smallest items in a given array of numbers?
Let's consider this array of numbers again and
Let's say we want to find the three smallest items.
Then the first step will be to construct a max-heap with the first K items, or with the first three items.
After that, for each item in the remainder of the array, ask ourselves, is this number
smaller than the current largest number in the heap? If it is smaller then replace it
with the current largest number.
So when we are examining this number right here, 2, the heap will look like this,
And the current largest number has become 4, and when we're examining this number 3,
4 will be replaced with 3.
We're going to keep going like that, and when we're examining this number, 10,
because this is not smaller than the current largest number in the max-heap,
we'll just ignore that number, and at the end of this process just print all items from this heap,
and we're done. We have the K smallest items from the given array. Now let's quickly recap our entire strategy
Using pseudocode we're going to define our function closest,
which takes the given points, and K, for finding the K closest points to the origin.
And from this function, we're not going to return anything,
but we're going to print the K closest points. And in this function, first of all using points, create
points_with_d for points with distance, an array of objects
containing the coordinates, as well as the distance to each point from the origin.
After that, using points_with_d
the array of objects, create a max-heap with the first
K items, and use the distance as the key when we create a max-heap.
Let's just call this max heap MH. After that,
run a for loop for each point in the remainder of the array,
and if this point's distance is less than the current max of MH,
then replace MH's current max with p, and once we're done with this for loop, then just print all points in
MH, and we're done. Now what about time complexity? Let's think about it line by line.
The first line here, using points creating points_with_d
takes Big O of n because we go through this array only once,
and creating a max-heap with K
items, as we discussed earlier, takes Big O of K in time, and here we're running n-k loops.
In each loop, replacing MH's current point with p takes Big O of log K,
and so the runtime for this entire for loop will be Big O of n-k times
log K in the worst case scenario.
After that, printing all points in MH will take Big O of K in time.
So adding them all up, Big O of n plus Big O of K
plus Big O of n - K times log K + Big O of K
We're going to have Big O of n
plus n - K times log K. And that would be the runtime for this entire solution.
Ok, thanks a lot for watching this video. If you liked this video I would also recommend
my course on Udemy, 11 Essential Coding Interview Questions, in which I cover
11 of the most essential coding interview questions to master with coding exercises in Python and Java.
In case you're interested in taking the course. I put a discount code below in the description. Alright,
I'll see you in the next video.