Hello. Am I talking into the microphone? Can you hear me in the back? Is that better? Okay, good.

So, I'm a compiler and debugger engineer, and for the past couple of years I've been working on the LLDB debugger. I'm going to be talking about the experiences that we've had embedding Python in the interactive debugger. I'm going to start by talking a little bit about LLDB and about debuggers. Then I'm going to describe how we put Python in LLDB, both from the user experience and then some of the implementation details. I'm going to go over some of the problems that we had trying to make this work and how we solved them. Then hopefully I'm going to have time to go over an example of actually using the Python scripting in the debugger to solve a real debugging problem, and finally, hopefully, we'll have time for a few questions.

I should probably mention a couple of things. I tend to talk very quickly, especially when I'm nervous, so if I start talking way too fast, raise your hands and I'll try to slow down. The other thing is that I was a complete Python newbie when I started, and I still feel like a Python newbie, so there are probably better ways of doing what we've done, and I'll be very happy to hear about that later on. But if you're wondering "why did we do it that way?", it's because I didn't know what I was doing when I started.

All right. So, what is LLDB? LLDB is part of the LLVM project. LLVM is an open-source compiler and tools technology project. It was started at the University of Illinois. It's actually a library of compiler tools and technologies, so people can link to it, and a lot of research projects are using the LLVM compiler for doing compiler research. There are also a lot of companies contributing source to the LLVM compiler project, including Apple and Google and ARM and a bunch of other companies. LLDB is LLVM's debugger. LLDB is actually more than just a
debugger; it's actually a debugger library. So again, it's something that you can link to and get various debugger services out of, including backtracing, symbolication, disassembly and so on. Some of the things that might use LLDB include front-end debuggers: you can write a front end, link to the LLDB library, and get all of the debugger technology that you want. You can also have automated testing tools, disassemblers, execution analysis tools, crash reporting (for the backtracing and symbolication), or performance analysis tools. All of these different types of tools might be interested in linking to the LLDB library and using the various services that we provide. In this talk I'm going to be describing the debugger front ends, and in particular the LLDB command-line debugger and the LLDB GUI debugger, which is part of an IDE.

I'm going to start by going over a few bits of basic debugger terminology. You may already know all of this, but in case you don't, let's go over it. When you have executable files on your disk, like hello world or python.exe, these are what we refer to as targets. When you go to execute your target, you load it into memory, and the thing running in memory is called a process. You execute your process in memory, your process finishes executing, it's pulled out of memory, and you still have your targets on disk. So: targets and processes.

Breakpoints are places where you can pause execution to examine the state of your program and try to figure out what's going on. Breakpoints can be conditional, so you can say "I only want to stop at a particular breakpoint if a particular variable has a particular value, or some other condition is true." You can also attach code to breakpoints and have this code executed each time the breakpoint is hit.

Every time you call a function, you have a frame that's associated with that function. The frame contains parameters
for the function, its local variables, and often the return address for the function. Processes and threads maintain call stacks of the currently called functions: when you call a function, its frame gets pushed onto the stack, and when it finishes executing, the frame gets popped off the stack. So that's the end of Debugger 101, your introduction to debuggers.

Now, about LLDB. LLDB itself is written in C++. It's a multi-threaded program. It's got an object-oriented design, so the main parts of the debugger are all objects, and the objects are nested within each other. LLDB is currently in its beta release, so there are still a few bumps in the program, but if you want to try it out, you can. As I think I said earlier, it's an open-source project: you can download it, build it, contribute to it, whatever.

The objects in LLDB: the main top-level object is the debugger object. Inside the debugger you can have a target; you can even have multiple targets, since it's been designed to handle multiple targets within a single debugger. Within your target you can have
a running process, and your process can have one or more threads in it. Within your debugger you also have your interpreters, because users have to be able to enter commands and you have to interpret the commands. So you have your command interpreter; that's the thing that handles your basic debugger commands, like setting your file, setting your breakpoints, running your program, whatever. And in LLDB we have also embedded a script interpreter, and this is where the Python comes in: the script interpreter handles all the Python commands the user might want to give the debugger.

In addition to the command-line debugger, LLDB also has a GUI debugger, and this makes things a little more interesting. You have a single running GUI debugger process that may want to launch multiple targets. It wants to launch each target in a separate window, and it wants each target executing in its separate window to have a separate debugger session, but this is all running under one overarching process. Therefore you have multiple simultaneous debuggers, and each one wants its own script interpreter, so you really need to have multiple simultaneous script interpreters, or something like that. You need the ability to switch smoothly between the windows as the user clicks from one to another, and at the same time you need to maintain complete isolation between your various sessions, so you have thread safety and no deadlocking, and things work the way users expect them to.

So the debugger in LLDB works roughly like this. You have the LLDB process; in the big LLDB process you have one single running embedded Python interpreter; and then you have a debugger object with the command interpreter and the script interpreter inside it. But you can actually have multiple debuggers, each with its own command interpreter and script interpreter. And Python, as you probably know, doesn't really allow you to have multiple script
interpreters in one process, so we have to simulate that. We actually have what I call interpreter sessions inside each debugger, each session simulating being its own separate script interpreter, and that's one of the things I'm going to explain a little later in this talk.

So you've got your multiple debuggers, and each one has its own interpreter session. Users can enter Python commands in their different interpreter sessions; they might define a variable and then say "I want to look at the variable," and they could do that in multiple sessions at the same time. You really want to be careful that the values don't stomp on each other. If you've set count to 24 in the left-hand window, then when you go to print it in the left-hand window you want to get 24; you don't want to get the value from the other window. So we have to maintain very strict separation between the Python definitions used in different interpreter sessions in this single process, so that users get the behavior they expect.

At this point I'm going to show you a little bit of LLDB running, to give you an idea of some of the things you can do with scripts in LLDB, if I can type. So we start up LLDB and I'll load hello world. This is a C program. I'll show you it running: not very interesting. I have another program; it counts to ten. Again, I can show it to you running: not very interesting. I can set a breakpoint on the function that prints the numbers, if I could actually type. Okay. Then we run, and it stops each time it goes to print a number.

Now at this point I'm going to attach a Python breakpoint script command to this breakpoint. So I'm going to say "breakpoint command add" with a script; it's going to be a Python script, and I'm going to add it to the first breakpoint. I'm going to start counting how many times we hit the breakpoint, so I'm going to use a variable, breakpoint count; I'm going to
increment it every time we hit the breakpoint, and then I'm going to print out what it is. So, hit this breakpoint again... I can't type when I'm nervous, sorry. Okay, and now we're done with our script. Great, I mistyped it. I appreciate your patience, I'm sorry. We'll just print the breakpoint itself. Okay, done. Now we're done.

Now I can continue executing, and, oh, we didn't seem to print out the breakpoint number. Well, let's ask Python what breakpoint count is, and it tells us it doesn't know what it is. So let's initialize it, and then maybe it'll know what it is. We can initialize it here directly from our command-line prompt. Now we say, what is
breakpoint count? The script command is telling it we want to execute some Python. Then we continue executing, and now if you look up at the LLDB prompt, it actually printed out a 1, and the next time it printed out a 2. Can you see that? Yeah? Okay. And now I can say I want to drop into the script interpreter. This is like having Python at your Unix prompt, but you're inside LLDB. I can say "print breakpoint count"; it says it's three. I can say that's not interesting, let's change it, so I'm going to set breakpoint count equal to, say, 28, get out of the script interpreter, and continue executing, and now it's at 29, 30, 31. So we actually have Python accessible from all these different places inside LLDB. That's a real fast demo of using it.

Now I'm going to get back to the main talk and talk a little more about the implementation and the uses of Python in LLDB. The first question, obviously, is why we wanted Python scripting in a debugger at all. One of the answers is that you can use it to set really useful conditional breakpoints, because Python allows you to get hold of your debugger objects and then query information out of them. So you can do things like say "I want to set a breakpoint, and I only want to stop at that breakpoint when it's called by a particular function." If you have a library routine that you want to set your breakpoint in, and it's called all over the place, but you really only want to stop in that library function when it's called from a couple of places in your code, you can now say "only stop here when the calling function has a particular name," or when the calling function's arguments have particular values. If you're doing multi-threaded debugging, you can also say "I only want to stop at this breakpoint if it's being hit by a particular thread." You can even record the thread ID in a Python variable and then say "I only want to stop at this
breakpoint when it's being hit by the same thread that hit it last time."

In addition to setting conditional breakpoints using Python (oops, got ahead of myself), you can use Python to help traverse your dynamic data structures. If you have, I don't know, a huge binary tree or a linked list or a heap or something like that, and there are thousands of nodes, but there seems to be a problem in one of your data nodes and you're trying to find that node, you can now write yourself a Python script that's going to traverse your data structure and find the node. It can report the path to the node, it can record all kinds of information about your data structure, and it can take you right to where the problem is. You can also use Python to record things like register values: when you hit your breakpoints, you can record your register values and your local variables, and you can use the Python file facilities to write them out to a file.

Yes?

[Audience member:] I'm curious about the support for programmatic things in the other Python debuggers, mainly pdb, the original Python debugger, and maybe any extended ones. I haven't seen scripting before, so I'm really interested in this.

[Speaker:] Honestly, I don't know anything about the Python debuggers. As I said at the beginning of the talk, I'm a real Python newbie. I'm sorry.

So anyway, you can record this information to a file, and you can do it each time the breakpoint is hit, so you can start building up traces of information to help you with your debugging. You can do this across multiple runs of the program: you can stop execution, go back, and record it again. And this brings up the fact that you can also use your Python scripts to help you do automated testing and QA with your programs. If you have one of these nasty bugs that only shows up once in a while and isn't predictable, you can write a Python script that will run your program over and over and over again until it actually hits the problem, and it can
collect data about your program every time it runs, so you have all these logs and records you can compare when you actually find the problem, to see what was going wrong. These are just some of the reasons why we thought it would be really nice to have scripting inside a regular debugger. There are lots more ways it could be used; this is just the tip of the iceberg.

So, talking a little bit about what we've actually got. As you've all figured out already, Python is accessible from inside the LLDB debugger. We've also made it so that you can access all of the LLDB debugger functionality directly from Python at the Unix prompt, if you want. Basically, when LLDB is compiled and built, it generates a Python module, so you can run Python at the Unix prompt, import your LLDB module, and then start doing regular debugging: you can create a debugger, you can create a target, you can set breakpoints, you can run your program. You're off and running, doing debugging straight from the Python prompt. These functions that I've highlighted in red are all LLDB API functions; they come with the API, which I'm going to be talking a little more about throughout the talk. If you want to see some good examples of how to do this kind of thing, the LLDB regression test suite is actually written in this style, running Python straight from the Unix prompt, importing the module, and doing the tests. So that's a great place to see examples of how to do this.

Now back to the main part of the talk, which is actually going
to be the Python inside the LLDB debugger. LLDB contains a full and complete Python interpreter, and it can be accessed in several different ways. There's the one-line script command: if you have just a little bit of Python that you want executed, but you want to stay in your main LLDB debugger context and not leave the prompt, you can just say "Python, go execute this little bit of code for me," and it'll do it and bring you back the answer. Here, for example, I've asked it to convert decimal to hex. You can also drop into the full interactive interpreter. This is just like typing "python" at the Unix prompt, but it has already loaded the LLDB module for you, so you've got all the API functions, and we've done a little bit of extra setup to make your debugging tasks easier to accomplish. And then you can also run Python in your breakpoint commands, as I showed you earlier.

We've added some things to LLDB to make the Python really useful. One of them is, as I said at the beginning of the talk, that LLDB is really a debugger library, and as a library it has a full and complete API that allows all the different tools that want to use it to call its functions and to create, access, and manipulate debugger objects and state. The entire API is fully accessible and callable from Python; that's in the Python LLDB module. We've also preloaded several LLDB objects into Python variables for you. Any time you stop your execution and want to get into Python, your current target, current process, current thread, and current frame are all in these Python variables (shown in blue) for you. So you've got these variables there, and you can use them to start calling the API functions and doing your debugging tasks. And finally, as I demonstrated in the little demo, we've set it up so you have a single persistent Python dictionary for your
entire debugger session (that's within the single debugger object), so that you can get in and out of your interactive interpreter multiple times, and things you define the first time you get into it are available the last time you get into it. You can define a function in the interactive interpreter, initialize it with a script command, and call it in your breakpoint command functions, and they're all accessing the same persistent dictionary for your entire debugger session.

There are two main parts to the implementation of Python in LLDB. One of them was just getting Python into it, so that we had the interactive interpreter, the script commands, and the breakpoint commands. The other part was getting the API into a Python module, so you could actually call it and do some useful debugging tasks. I'm going to talk first about how we implemented the Python interactive interpreter in LLDB.

To implement the actual interactive interpreter, with the prompts and everything, we wrote our own interactive interpreter module in Python. We inherited from the InteractiveConsole class, which is in the code module. Our interactive Python interpreter takes a dictionary as a parameter, and then all the Python code that gets executed in the interactive interpreter and in the one-line script commands uses that dictionary as its main context, so all the definitions go into it or are looked up in it. Then, in LLDB itself, we use the Python C API functions to initialize the Python interpreter and to initialize some thread capabilities. We actually had to turn off Python's signal handlers, because by default Python installs its own signal handlers, which is good in many cases, but if you're a debugger you need to handle the signals yourself. You need to get the signal that says you've hit a breakpoint; you need to get the signal that says the user wants to
interrupt the running process. So we had to turn off Python's signal handlers, and then we import our own interactive interpreter module and call the appropriate interactive interpreter function. For handling the one-line script command it's pretty much the same mechanism. We use the same module; we just have a different method in the module that we call, and it's simpler: we don't have to do the looping, we don't have to collect the input, and if they give us an incomplete line, we just tell them they've given us an incomplete line.

A real quick overview of how the dictionary handling works. When we create a new debugger, we get a new debugger object. The new debugger object creates a new command interpreter, which creates a new script interpreter session. The script interpreter session automatically generates the name of the dictionary that's going to be used for that session in Python. Then we tell the Python interpreter to generate a new, empty dictionary for us, and in the Python global dictionary we insert the dictionary name for our session and point it at the new dictionary that Python created. So that's how we get our dictionaries, and if we have multiple debuggers, of course, we're going to have multiple dictionaries, one for each session.

If we go to invoke a one-line script command, the first thing we do is look up the right dictionary name in our session. We find the dictionary in Python, then we find the code for running our one-line script interpreter that we've imported from our module, and we get the input that the user said they wanted to
use. Now we take this, and we have to wrap the input that the user gave us, together with our dictionary, up into an argument tuple. We use the embedded C API function calls to do this, and then we call PyObject_CallObject, another C API function, to actually execute the code on the argument tuple.

If we're calling the interactive interpreter, it's a similar idea but a little bit different. Again, we look up the dictionary name first. Then we create a string that's going to call our interactive interpreter: it names the method in our module that runs the interactive interpreter for us, and we fill in the name of the dictionary that it's supposed to run with. Then we call the C API function PyRun_SimpleString on this string we've created, and it executes our method call using the right dictionary.

Now, we actually had to do a little bit more. We had to do some I/O setup and some I/O cleanup, and the reason for this is that Python really, really, really wants to use stdin and stdout for all of its I/O. When we're using the graphical user interface and we've got, say, three different windows with three different script interpreter sessions, we want to be able to say "this I/O goes to this window, and this I/O goes to this window, and this I/O goes to this window." So we had to redirect sys.stdin and sys.stdout before we could call the interactive interpreter, and then we had to reset them after we called the interactive interpreter. We also do some funny things with termios to deal with Control-D, but that's another matter.

Breakpoint script commands work a little bit differently, so I'll talk about them for a bit. When you go to add a breakpoint script command in LLDB, it prompts you for your Python, you enter your Python script, and it looks like you've just entered normal Python. Behind the scenes, LLDB is going to take what you wrote and wrap it up in a Python
function and give it some obscure name, so that hopefully the user won't accidentally recreate this name. We also pass in two more of these Python variables that we load with LLDB objects: in particular, you're going to get the frame and the breakpoint location object for wherever the breakpoint was hit. When the breakpoint is hit, LLDB loads those objects into Python variables and calls the Python function with them.

This means there are two things you need to remember when you write a breakpoint script command in LLDB. One of them is that you are going to have the frame and the breakpoint location variables, which you can use for calling API functions and doing things in your script if you want. The other is that if your script command uses any variable that was defined outside of the script, you have to remember to tell Python that it's global. Otherwise Python is going to treat it as a local and you're going to get unexpected behavior.

There are two stages to the breakpoint script commands: a stage where you create the command, and a stage where you execute the command. When you go to create the breakpoint script command, first we collect the script text from the user. Then we have to prepend the function definition line and indent all the code an extra four spaces. We put all this into a big long text string, with newlines in the appropriate places, and call the C API function to compile this string into a Python function for us. Once we've got the compiled function, we tell Python to evaluate it immediately. What this does is, if the user has given us a valid function definition, it stuffs the function definition straight into the global dictionary for us. And if it's not a valid function definition, then we know immediately that the user has done something wrong, and we can go right back to the user and say "you've given us
a bad script; we can't use this." But if they've given us a good script, then in addition to putting it in the dictionary, we also create the call for the function and attach that to the breakpoint.

Now, when the breakpoint gets hit and you actually want to call the function, again we have to look up the dictionary in the script session. We find the dictionary in Python, then we have to find the breakpoint function to call for the breakpoint. The next thing we have to do is get our frame object and our breakpoint location object and stuff them into Python variables so that we can pass them to the function. We create an argument tuple with the frame, the breakpoint location, and the dictionary, and then again we use the C API function to call the code on the argument tuple. So that's how we implemented the interactive interpreter and the breakpoint command scripts in LLDB.

Now I'm going to talk very quickly about how we made the LLDB Python module, and it was really very, very easy. The one-word answer is: we used SWIG. For those of you who don't know, SWIG is the Simplified Wrapper and Interface Generator. It's an open-source tool that parses C and C++ interfaces and generates glue code that allows Python and other scripting languages to call into the C and C++ code. So we take our header files and our SWIG input file, which tells it what header files to use and some type definitions and so on, and we
pass it to SWIG. It generates our C++ file with all the wrapper glue code, and it also generates a .py file to actually call the C++ code after we've compiled it. We then take our LLDB sources and the C++ file that SWIG generated and run them through the compiler, and that gives us our LLDB shared object. The shared object together with the .py file is our module, which we can then import into Python to access LLDB. So that's pretty much how we implemented Python in LLDB.

Now I'm going to talk about some of the problems that we ran into and how we solved them. There were three main problems I'm going to talk about. One of them was trying to pass pointers and C++ objects back and forth between Python and LLDB; then how we maintained a single dictionary across the entire debugger session; and finally how we dealt with the fact that we have multiple sessions, with seemingly multiple script interpreters, under a single embedded Python interpreter.

So, passing pointers and C++ objects to Python. The API operates on debugger objects: it operates on targets and processes and threads and frames and so on, and LLDB stores these things as C++ objects or pointers. Now, embedded Python only really passes scalar data types back and forth between C++ programs and Python, so we had a real problem trying to figure out how to get the LLDB objects over into Python so that the API could be called on them. To illustrate the problem: here we have a process object. The process class contains several methods, including GetNumThreads, and our current process is running, so there's a pointer to this process object. We're over in the script interpreter and we want to be able to call GetNumThreads. So what do we do? Well, if you try calling GetNumThreads directly, Python says, correctly, that the name GetNumThreads is not defined, because it's not at the global level; it's a class method. So the next thing we might try is to say, well, let's tell Python it's in the
LLDB module, it's in the SBProcess class, and it's the GetNumThreads method. Again Python complains; in particular it says GetNumThreads has to be called with a process instance, and you can't call it as an unbound method. So that means we really have to figure out how to get a process object into a Python variable so that we can call the method we want.

Now, the Python C API provides a couple of methods for converting types from C to Python, but basically it converts strings, it converts integers, it converts other numbers, and it converts these things called Py_ssize_t and PyObject. If you want to convert anything else, the programmer has to write the conversion, and this is very complicated. I've read the books on the types and the way you set it all up, and it's very ugly, and not something that we, as newbies, really wanted to try. It's very difficult, it's very easy to make mistakes, and we didn't want to have to do this for all of the types that we have in LLDB, because that's a vast number of types.

So we thought about this for a while and came up with a few key insights. The first is that integers are very easy to pass back and forth between C and Python; there are lots of different ways you can pass integers back and forth. The second insight was that we have control over the API; we can write whatever API functions we want. And it's very easy, once you have an object, to get to another object: if we have a target, we can get to a process; if we have a process, we can get to a frame. So the real problem is just that we have to manage somehow to get a single object across into Python, and then we'll be off and running. We decided to use a combination of API functions and integers to do just that. We decided the single object we were going to focus on getting over there was the debugger object, because again, that's the top-level object, and from there you really can get to everything. To start out, we
decided to attach a unique integer to every debugger; basically, give it an ID. Then we pass the appropriate debugger ID to Python: whenever you want to get into Python, we set up this lldb.debugger_unique_id object with the ID of whatever debugger is trying to access Python. Then we created a static API function: we added a new method to our debugger class, FindDebuggerWithID. The important part is that this is static, and because it's static, you don't have to have an object to call it; you can just call it directly. And so that's what we do.

When you type "script" in LLDB to get into the script interpreter, the first thing it does is find the debugger ID and set the debugger ID variable correctly for you. Then you drop into the script interpreter. (Here I put it into a shorter variable because I don't have enough room on my screen.) Then you can use the new FindDebuggerWithID function to get your debugger object and put it into a Python variable, and now we're off and running. We have a debugger object; from there we can get a target; from the target we can get the process; from the process we can call Get
NumThreads, and it actually works. So we have succeeded, more or less.

The next problem that we had to face was using a single dictionary across the entire debugger session. There were two reasons why we really wanted to have this dictionary for the entire debugger session. One was the idea of having persistent and reusable definitions, so that you could define something in one visit to the interactive interpreter and then call it from the one-line script commands, and have it callable from the breakpoint commands, and not have the definitions disappear between each of the pieces. The other, as I said, is that with the GUI we had to have multiple script interpreter sessions, so we really had to have a way of keeping independent and non-interfering definitions. Again, this is a reminder of the general setup and what it looks like; you probably already remember this.

With the interactive interpreter and the one-line script commands it was pretty easy. As I said, we wrote our own interactive interpreter module that takes a dictionary as a parameter, so all of the interactive code is executed in terms of this dictionary. All the definitions go into this session dictionary, and all the code is executed in the context of the session dictionary. And because the session dictionary lives in the Python global dictionary, which persists across the entire debugger session, all the definitions automatically also live across the entire debugger session, no matter how many times you drop in and out of the interactive interpreter or the script interpreter or whatever.

Now, the breakpoint commands were a real problem, and the reason is that there was no encapsulating run environment. We had no way of using the session dictionary as a global dictionary for the breakpoint commands. Because we had to wrap up the frame object and the breakpoint location object as parameters and pass them into the breakpoint command script function, we really didn't
have the option of using the interactive interpreter code to execute it, so the breakpoint script functions are actually called from the global Python environment. Therefore, what we actually did was decide to modify the global environment, carefully. This really works because of the way we set up the interactive interpreter and the script interpreter: the user has no way of directly putting definitions into the global environment.

So remember this: if you enter this kind of script as your script command, LLDB will automatically create a Python definition for you with these parameters. It actually throws in a third parameter, which is the session dictionary for whichever debugger session is calling the breakpoint script command. It also adds some code before the user’s code that does some dictionary setup magic that I’m going to talk about, and it adds some code after the user’s code that does some dictionary cleanup magic.

The dictionary setup magic works like this. Going into the user’s script, we have our global dictionary with its values and we have our session dictionary with its values. We make two lists recording what the keys are in both of the dictionaries; this means that before the user’s code executes, we know what was supposed to be in each dictionary. Then we extend the global dictionary: we actually copy all the definitions from the debugger session dictionary into the global dictionary. At this point we’re ready to execute the user’s breakpoint script command in the context of the global dictionary.

After the user’s code executes, we have to do our cleanup, and the cleanup goes roughly like this. Not surprisingly, the first thing we do is take all the values for the user’s definitions and copy them back into the session dictionary. So in this case a couple of things have changed, and we move all the appropriate user definitions back into the debugger session dictionary and pull them out of the global dictionary. Then the keys
themselves go away, because those are actually local variables that we set up in the breakpoint script command. And so now we’re happy, we’re done; we’ve got all our dictionaries back the way they ought to be.

The third problem that we had to solve was the fact that we had to simulate having multiple script interpreters, while we really had one underlying script interpreter running in Python. Again, just as a reminder, I went over this before: the GUI needs to be able to launch multiple executables in separate windows, each with its own debugger session. So you need to be able to simulate having multiple script interpreters with their own dictionaries, switch between them, keep the isolation between them complete, and have thread safety and no deadlocking.

One thing we looked at was using the C API function Py_NewInterpreter, but there were some problems with that. The first one is that it doesn’t fully load and initialize modules into all the new interpreters. This means we could load the lldb module into the first interpreter and it would be fine, but when we load the lldb module into the second interpreter, it’s not fully loaded, it’s not fully initialized, it’s not fully there, and the same goes for the rest of them. According to the documentation, some extensions may not work properly. It does not fully isolate files and I/O, so you might end up doing something in one window and having the result show up in another window. And finally, the interpreters can insert objects into each other’s namespaces, so your dictionaries are not guaranteed to be isolated. So this approach was rejected.

The next thing that we looked at was possibly relying on the GIL, the global interpreter lock. With the global interpreter lock you get some functions that ensure your state and allow you to set it and release it and deal with threads, and it does serialize calls into the interpreter and prevent deadlock, so that’s good. However, it could release the lock too soon. The GIL decides, okay, this thing has had the lock for a little bit of time, I’m going to let it go and have something else have the lock. So if the user was in the interactive interpreter over in this window, and this other window hits a breakpoint script command, the GIL might suddenly decide, okay, you’ve had the lock long enough, I’m going to let the other one have the lock, and the breakpoint script command can start executing while the user is still trying to do stuff in their interactive interpreter session, and that could lead to problems. And again, it does not guarantee non-interference between the separate session dictionaries. So the GIL seemed to be insufficient for our purposes.

So we decided we were going to have to write our own locking mechanism. We used basically a combination of mutexes and predicates. Every session has its own input and output pseudo-terminals, again because they have to write to the different windows; they have their own session dictionary; and they have a boolean indicating whether or not they’re active. When a particular debugger thread indicates it wants to access Python, we first ask: does it have the Python lock? If it does not have the Python lock, we ask: can it get the Python lock? If it can get the Python lock, or it already has the lock, then we do our own context setup: we set up the dictionary, we set up the I/O, we set up the convenience variables. Then we execute our Python and release the lock. If we could not get the lock, then we print an error message saying you can’t do that right now because the interpreter is locked, you’re using it in another window. Then, depending on what the user is trying to do, we either go back and try to get the lock again, or we just return and leave it at that.
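The locking flow just described can be sketched in a few lines of Python. This is only an illustration, not LLDB's implementation, which the talk says is built from mutexes and predicates inside the debugger itself. The `Session` class, its attributes, and the error text are hypothetical names. The sketch models only the session dictionary, and it tracks lock ownership per session rather than per debugger thread.

```python
import threading


class Session:
    """Hypothetical model of one debugger session (one GUI window)."""

    _lock = threading.Lock()   # guards the single underlying interpreter
    _owner = None              # session that currently holds the lock, if any

    def __init__(self, name):
        self.name = name
        self.session_dict = {}  # this session's private definitions

    def run(self, source):
        """Run source in this session's dictionary, or refuse if another session is busy."""
        already_held = Session._owner is self
        if not already_held:
            # Non-blocking attempt: report an error rather than wait and risk deadlock.
            if not Session._lock.acquire(blocking=False):
                return "error: the interpreter is in use in another window"
            Session._owner = self
        try:
            # Context setup (I/O, convenience variables) would happen here; only the
            # dictionary is modeled. The lock is held until the code has finished.
            exec(source, self.session_dict)
            return "ok"
        finally:
            if not already_held:
                Session._owner = None
                Session._lock.release()
```

A caller that gets the error string back can either retry or give up, which corresponds to the two outcomes mentioned in the talk. Because the lock is released only after `exec` returns, a second session can never start running in the middle of the first session's code.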
And again, once you release the lock, you return. So this bit up here guarantees that we have thread safety and no deadlocking, because we’ve got one lock that only one thread at a time can get. And this bit down here guarantees that we have the correct dictionary and no early release, because once you get the lock, you actually go through and finish executing whatever you were going to execute before you let go of the lock.

So how did we do? Well, we have seemingly multiple script interpreters and dictionaries. We have the ability to switch smoothly between them. We have complete isolation between our sessions and our session dictionaries. We have thread safety and no deadlocking. We do not have true concurrency; Python just does not allow that. So we are still serializing accesses to the single underlying script interpreter, and until somebody can convince Guido to change things, that’s the way it’s going to be, unfortunately.

So that’s basically the implementation in LLDB and the problems that we ran into. Now, for the next part of the talk, I’m actually going to show an example of using scripting in LLDB to solve a real debugging problem, or a kind of debugging problem. For this example we’re going to have a C program that reads in an input text file and stores all the words in a binary search tree, and then the program can be asked whether a word is in the tree, and it’ll say yes or no. Now, since this is a debugging talk, obviously there’s a bug in the program, otherwise it wouldn’t be very interesting. So we’re reading in the play Romeo and Juliet and we’re looking for various words that should be in the play, and it’s finding some of them, but the word Romeo, which I am dead sure is somewhere in that play, is not being found. There are several reasons why it might not be found. One of them is that the word never got inserted into the dictionary. Another possible reason is that it got inserted, but it’s in the tree in some unexpected location, so that the binary search
algorithm just isn’t finding it. So the first thing we need to figure out is: is the word in our tree or not? Well, if it were a tiny tree, we could look at all the nodes by hand and try to find the word, but of course Romeo and Juliet has thousands and thousands of words. The tree is going to be enormous; trying to look for it by hand is just not practical. Luckily, we can write a Python script to do this for us, and that’s exactly what we’re going to do.

So the plan is: we’re going to write a recursive depth-first search function in Python. We’re going to stick it in this tree_utils file, because you don’t want to write long things at the interactive prompt, or you make typing mistakes. We’re going to attach to our running program using the LLDB debugger, then drop into our interactive interpreter and call the depth-first search function on our binary search tree. If the depth-first search function finds the word, it’s actually going to return a string representing the path from the root of the tree to the node where it found it.

So this is roughly what the depth-first search function looks like. It takes three parameters. The first one is our binary search tree, or a node in our tree, because it’s a recursive function. The second one is the word that we’re searching for, and the third one is a string representing the path from the root of the tree to our current node. Now, these functions in red are all part of the LLDB API; those are all API functions. The first bit of code up there at the top is just getting the fields out of the node and putting them into Python variables, and then, starting with the if statement, that’s the main body of the depth-first search function. I assume you probably know depth-first search, so you just say: have we found the word? No? Should we go left? Go left. Should we go right? Go right. Standard depth-first search.

So, seeing what it looks like when we actually use it: we’ve attached to our running program, we drop into our interactive script interpreter, and we import the file that contains our depth-first search function.
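The slide with the search function is not part of the transcript, so here is a minimal sketch of what such a function might look like, following the description above. `GetChildMemberWithName`, `GetSummary`, and `GetValueAsUnsigned` are real `lldb.SBValue` methods. The field names `word`, `left`, and `right`, the `"L"`/`"R"` path encoding, and the `FakeValue` stand-in (included only so the sketch runs without LLDB) are assumptions.

```python
class FakeValue:
    """Hypothetical stand-in for an lldb.SBValue that points at a tree node."""

    def __init__(self, word=None, left=None, right=None):
        self.word, self.left, self.right = word, left, right

    def GetChildMemberWithName(self, name):
        if name == "word":
            return self
        child = self.left if name == "left" else self.right
        return child if child is not None else FakeValue()  # models a NULL pointer

    def GetSummary(self):
        return '"%s"' % self.word  # LLDB summarizes a char* as a quoted string

    def GetValueAsUnsigned(self):
        return 0 if self.word is None else 1  # 0 stands for NULL


def dfs(node, word, path):
    """Return the path of "L"/"R" steps from node to word, or None if absent."""
    # Pull the fields out of the node first, then decide which way to go.
    node_word = node.GetChildMemberWithName("word").GetSummary().strip('"')
    if node_word == word:
        return path
    if word < node_word:
        child, step = node.GetChildMemberWithName("left"), "L"
    else:
        child, step = node.GetChildMemberWithName("right"), "R"
    if child.GetValueAsUnsigned() == 0:  # NULL child: the word is not on this path
        return None
    return dfs(child, word, path + step)
```

Inside LLDB the first argument would be a real `SBValue` obtained from the current frame rather than a `FakeValue`.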

And now we have to take our binary search tree and put it into a Python variable so that we can pass it to the depth-first search function. Here we’re actually using one of those convenience variables that I talked about at the beginning of the talk. Remember, I said that LLDB preloads certain Python variables with bits of your state that you’re going to need: the target and the process and the frame and the thread. So we’re going to use our frame. We’re going to ask the frame to find a variable, using the API function, and the variable we’re going to look for is the one called dictionary, because that’s the name of our binary search tree. So it’s going to find the variable named dictionary in our current frame and put it into the Python variable named root. Then we initialize our current path to the empty string, because we’re starting at the root itself, and we call the depth-first search function. We’re going to pass in the binary search tree, the string Romeo, and our current path string, and if it finds the node, it’s going to return the string for the path to the node in the Python variable named path. Then we print path to see what it is, and sure enough, it found our node, and the path to the node is left, left, right, right, left.

So we’re halfway there. We know that the node is in the tree; we found it. The next question is: why didn’t our search algorithm find it, and how are we going to figure out where the problem is? Well, the answer is we’re going to use breakpoint script commands. The idea goes like this. We know this is where our word is, and we know that a binary search algorithm has two decision points: the point where it decides to go right and the point where it decides to go left. So we’re going to set breakpoints at each of these decision points and attach a breakpoint script command to each decision point. The breakpoint script command is going to compare the decision with what the path says it should do at that point. As long as
the decision agrees with the path, we’re just going to keep executing, but as soon as the decision differs from what the path says we should do, we’re going to stop executing and say we found our problem.

So this is what the breakpoint script command looks like at the decision to go right. Again, remember, LLDB wraps it up in a function and passes in the frame and the breakpoint location for you. We’re going to use our global path variable that was returned by the depth-first search function; this tells us the path that we should be following. We’re going to compare the decision to go right with what the front of the path says. If the path agrees with the decision, then we’re going to strip the first character off the path and resume execution. We resume execution by first using the frame variable, which we know is there because LLDB gives it to us in our breakpoint commands; we use our API functions to get the thread, and then get the process, and then tell the process to continue execution. So from the user’s point of view, if the path agrees with the decision, then we don’t even see this breakpoint; it just keeps executing. But if the decision disagrees with the path, then we’re going to stay stopped at this breakpoint and print out an error message for the user saying we found the problem.

So, looking at this in an execution: we attach a breakpoint command at the decision to go right, and we’ve already seen what that looks like. We attach a breakpoint command at the decision to go left; it looks just like the one to go right, except it says left. Then we continue executing. It executes for a little while, and then it stops and prints out this error message, so we actually seem to have found the problem. Now let’s look at the tree and see what we’ve got. At the current node the word is dramatis. The word we’re searching for is still Romeo; that’s good. Now, the tree is sorted alphabetically, and Romeo is greater than dramatis alphabetically,
so Romeo should be to the right of dramatis, and it says we were trying to go right, but the path says we should go left. That seems a little bit odd. So we’re going to ask Python again: print us the path from the current node to the word Romeo that you found. It says the path is left, left, right, right, left, and for those of you who are observant, you will notice that this means we actually have a problem in the very first node of our tree, because the path hasn’t changed yet. So we say, okay, let’s find out what we actually have at left, left, right, right, left from here. We look at that, and it’s the word Romeo. But look at this: we have an uppercase R versus a lowercase r. Our program contains a case conversion problem, and that is the bug.

So basically that’s the end of my example of how you might use Python scripting to solve a real debugging problem if you’re trying to find a problem in your program.

In summary, I’ve shown you, hopefully, that embedding Python into a debugger gives users a great deal of power and the ability to do a lot of really cool, useful stuff that they couldn’t do without a Python scripting ability in their debugger. I’ve shown you how we solved a couple of problems, including passing pointers and objects to and fro between C and C++ and Python, and also how we managed to maintain the fiction of having multiple separate Python interpreters with really a single underlying interpreter running. If you want more information, you can always go to the LLDB website. There’s a full project description there, and you can download the code and look at it. If you want to look at the API, you could look at the header files; that’s the best place to examine it. There’s also the scripting example that I went over at the end; there’s a link to the example on the web page, with more explanations and descriptions if you want. And there’s a link to the developers’ mailing list. So at this point I think I’m ready for questions and comments. Yes?
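For reference, the breakpoint script command from the example can be sketched roughly as follows. The actual slide is not in the transcript, so this is a reconstruction from the description. `GetThread`, `GetProcess`, and `Continue` are real LLDB API calls on `SBFrame`, `SBThread`, and `SBProcess`. The function name `check_decision`, its `decision` parameter, the message text, and the `Fake*` stand-ins (there only so the sketch runs without LLDB) are assumptions.

```python
path = "LLRRL"  # path to the word, as found earlier by the depth-first search


class FakeProcess:
    """Hypothetical stand-in for lldb.SBProcess."""

    def __init__(self):
        self.continues = 0

    def Continue(self):
        self.continues += 1


class FakeThread:
    """Hypothetical stand-in for lldb.SBThread."""

    def __init__(self, process):
        self._process = process

    def GetProcess(self):
        return self._process


class FakeFrame:
    """Hypothetical stand-in for lldb.SBFrame."""

    def __init__(self, thread):
        self._thread = thread

    def GetThread(self):
        return self._thread


def check_decision(frame, bp_loc, decision):
    """Breakpoint command body for one decision point; decision is "L" or "R"."""
    global path
    if path[:1] == decision:
        path = path[1:]                            # this step matched; drop it
        frame.GetThread().GetProcess().Continue()  # resume, so the stop is invisible
        return True
    # Stay stopped and tell the user where the search diverged from the path.
    print("Here is the problem: going %s, but the path says %s" % (decision, path[:1]))
    return False
```

In the talk there is one such command per breakpoint, one hard-coded for left and one for right; the single `decision` parameter here just avoids writing the body twice.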

You did a lot of work to get Python to fulfill your requirements. So besides getting the nice food here and this nice conference, why did you pick Python? Why did you not pick, for example, JavaScript, Lua, or Tcl, something that can do multiple interpreters, thread safety, and such things?

Well, a couple of reasons. One of them is, when we first started, we didn’t realize that the multiple script interpreters was one of the requirements, and so we were doing this thing in Python. Python’s an easy-to-use language, easy to learn, and a lot of people use Python, so we thought Python would be a good thing for a lot of programmers to be able to do their scripting in. And then when we were told, oh, and by the way, you’ve got to make it work with these multiple sessions, we said, lovely, now what do we do? So that’s kind of it: requirements gathering was not perfect. Yes?

Just a quick question about LLDB: can it be used easily to debug programs that haven’t been compiled with LLVM?

It should work on any compiled program, yeah. Well, it needs DWARF debug information to work well. Any other questions?

Okay, currently it runs best on Mac OS X. I didn’t say, but most of this development actually happened at Apple. But there are also some developers who have been working on a port to Linux, and I believe the Linux port is either fully functional or almost fully functional. So it’s Linux and Mac OS X now, and we’d like to get it onto Windows and some others, but that’s what we’ve got at the moment.

Okay, so which languages can I compile with this compiler?

LLVM mostly does C languages, so it’s C, C++, Objective-C, and Objective-C++ at the moment.

Okay, and why should I use this compiler instead of GCC and the binutils toolchain?

Okay, a couple of reasons. The LLVM compiler actually does a better job of optimizing than GCC. It’s a much more modern compiler, so it’s got a much better, more coherent design, it’s got more advanced
optimizations, it runs more efficiently. You know, there are a lot of good reasons for using LLVM, and LLDB actually takes a lot of advantage of LLVM when you have both LLDB and LLVM on your system. For example, LLDB uses the LLVM parser to parse all the C expressions in the debugger, so we didn’t try to write our own C and C++ parser. We actually use some of the abilities in the LLVM compiler to do some stuff in LLDB. I didn’t show it to you, but there’s an expression command in here, where you can actually write a C or C++ expression and have it JITted and executed in your code, so you can do all kinds of very useful stuff with that.

Okay, and for which architectures can I use this compiler? Is it only for Intel CPUs, or can I use it for ARM or some other?

Oh, it definitely works for ARM and Intel. I’m less sure of all the back ends that the compiler works for; I’ve been working largely on the debugger side, but I believe it works across a wide variety of back ends. Go to the website and check, I’m sorry, but it works on an awful lot of back ends. LLVM has been around a lot longer than LLDB, and so it’s a much more mature product. As I said, LLDB is still in the beta release version.

Okay, thanks. Actually, how much speed do I lose using the debugger, and does it rely on CPU extensions for debugging on each supported platform?

I’m not sure I understand what you mean by losing speed.

Compared to if I did not attach to the process: how much slower does it actually get when it’s checking for breakpoints?

You can’t really compare running the program in a debugger versus running the program without the debugger, because the debugger stops the program, and so you’ve got it stopped.

Yes, but in a function where I actually have no breakpoints?

You shouldn’t have any slowdown at all then.

Okay. Does LLDB contain any support for
debugging interpreted programs?

Not at the moment, unfortunately.

Because it would be interesting if I could attach to a running Python program and debug it using the sort of pdb-ish things.

I completely agree, but it’s not there yet. If you want to write it, you know, it’s open source, please feel free. I’d love to see it. Anything else?

No, we weren’t really aware of anything besides SWIG at the time, and I repeat, we were newbies and did what we could figure out.

You talked about having difficulties getting a Python-usable object into the script. I was wondering why you didn’t have the same trouble with values that were returned from API functions that you could then call.

Largely because the API functions went through SWIG, so SWIG took care of all the heavy lifting in terms of figuring out how to do the type conversions for us. We just trusted that SWIG would convert it properly and it would end up in a Python variable for us.

Okay, just one more question. Did you compare this compiler to the Intel compiler, you know, the C compiler, and what is the difference in speed?

I’m sure the comparisons have been done, and I don’t know the answers, I’m sorry. If you’re really interested, you can look it up on the web; I’m sure it’s there.

Anyone? Okay, no more questions. Okay, so thank you.
