welcome back today we’re going to take a look at a number of interesting applications of symbol tables in the binary search tree data structure to address problems with processing geometric data so let’s take a look at it the idea is that we’re going to be talking about geometric objects not simple keys like strings and numbers so here’s an example so say your geometric objects are points in the plane and you specify a rectangle that’s as oriented with the horizontal vertical axes and you might want to ask which points are inside the rectangle or how many points are inside the rectangle or maybe what you’re processing is rectangles you have a set of rectangles and we want to know which of these rectangles intersect or how many rectangle intersections are there these are interesting problems that have lots and lots of applications from computer-aided design to games and movies and also in abstractions such as databases in other situations where you might have multiple keys or multiple dimensions and it’s a very interesting extension of the ideas that we’ve looked at for symbol tables for all sorts of familiar applications and surprisingly binary search trees and the Associated algorithms that we’ve looked at are going to provide very efficient solutions to a number of important problems in this area and really have enabled new developments and new technology in all of these kinds of applications so to get started we’re gonna look at a simple problem called one-dimensional range search and it really forms the basis for what we’re gonna do it’s a little bit of an extension of the ordered symbol table API that we gave before and we’re gonna have the operations range search in range count so you know one dimensional just means we have one key so we’ll insert a key value pair as before and what we want to do is be able to search for a key and the value associated with it we’re going to be able to delete but then we want these operations range search and range count so find all the keys that are between two given keys or give how many keys are there between two given keys so for this example at right we have insert a number of keys and and we’re just showing them in sorted order but then you might say well how many keys are there that are between G and K in this case there’s just two and then client might say well what are those keys you want to be able to return them and this is a very common operation say in databases you want to return how many taxpayers have salaries between 1 million and 10 million and then which ones are there and so forth so range searching is a very important fundamental operation now in geometric interpretation we just thinks of the keys as points on a line and so the key values well you are just specified as points on a line we might convert the letters to numbers or we might keys might be numbers and then what we’re looking for is to find or count the points in a given interval in one dimension so how we’re going to implement that well this is a basic problem that is very similar to our symbol table problem we might consider keeping the things in an unordered array just put them in an array and then while insertion this is fast we just add it to the end of the array we might have to use resizing to make the array grow but this is unattractive because for large numbers of keys in order to count the keys that fall within a given range you have to go through all the keys and test whether they’re in the range or not and to return them the same way so take linear time for large number of keys if you keep the things in order like in a binary search situation then to insert in order to keep it in order in an array you might need to move the larger ones over one position and so forth or elementary implementation of binary search when we did symbol tables I did this so the insertion time might be linear but then you can use binary search to look for the two end points that’s only going to take time proportional to log in and then from that you could figure out how

many keys there are or return them all between the index the lowest one in the range index to the highest one in their range so those elementary implementations are not acceptable for large numbers of keys because they have the linear time operations so what we really want is to have time proportional to log in for insert and fan for counting for range search of course we have to touch every key that we return so the running time is going to be proportional to the number of keys that match but anyway those are reasonable goals and they’re easy to achieve so so for example what about one-dimensional range counting well what we’re going to do is just keep the keys in a binary search tree and we looked at the implementation of the rank function for binary search trees where for every key we can compute how many keys are there that are strictly less than that key so in this case the rank of E is 2 and H is 3 and so forth so in a binary search tree those rank numbers go in increasing order as we do an inorder traversal and that’s easy to compute e to keep that rank tree as a field or keep a field which has the size of the tree and it’s easy to compute the rank from that so how many keys between say e and s 1 2 3 4 5 it’s AK really just the difference between the ranks plus one if the high entry in the range query is in the table and not plus whenever it so there’s the same number of keys between E and a cesare between E and t5 between F and T there’s only four so that’s a a really you know 1d range count is a very easy computation to perform in in log time with a binary search tree the number of nodes examined when we do a search is the length of the search path to lo plus the length search path to high to find their ranks and that’s going to be time proportional to log n so and range search well we just do a recursive search and to find all the keys between low and high you look in the left subtree if any of them could fall in the range you look at the current node and you look in the right subtree if any of them could fall in the range and it’s easy to tell whether any of them could fall in the range by just checking whether their range overlaps the root or not so if we’re looking for all the keys between F and T then we have to look at both sub trees of the root s but we don’t have to look at the left subtree of e because all of those are less than e and therefore are less than F so we don’t have to have to look there but otherwise it’s a simple modification of recursive tree search to find all the keys and it’s easy to see that the running time of that is going to be proportional to the number of keys returned plus login so that’s one dimensional range search using binary search trees