Okay, so, segmentation. Conceptually, this is the sort of capability segmentation is supposed to achieve. You’re supposed to be able to have some memory space, and it’s broken up into a bunch of different segments going every which way. Each of the segment registers, of which you have six, will select from many different segment descriptors. So there’s a bunch of these structures which describe different ways that you can chunk up memory, and six of them could potentially be active at one time. For each of those it would be something like this: the CS register, the code segment register, would say, okay, here’s the base address of my code segment, and here’s the limit. You take the base plus the limit and you get some chunk of memory, and we just say that’s all code.

Built into segmentation are already these security notions: code segments are executable and readable but not writable. So segmentation already implements this sort of notion of non-executable data and non-writable code. As I just said, if you have something which is the data segment, some segment register will select one of these descriptors, and the descriptor will say here’s where data starts, here’s where data ends, and built into this is the notion that all data segments are not executable. So you could do a jump over to some data segment, but it’s not going to execute code there.

Yes, it would be some other thing other than the base and limit. The structures are going to look sort of different than that, so it’s not strictly just access; there’s a few different fields in there. But yes, the chunk of the descriptor other than the base and the limit is going to describe whether it’s code, whether it’s data, whether it’s some other thing, and what ring it’s in.

So basically there’s this notion of linear address space. We’re going to be using these terms like linear
address space and linear addresses quite a bit throughout. The thing you need to know about the linear address space is that eventually you can think of it like the virtual memory address space. For now, a linear address maps one-to-one to a physical address, so until we talk about paging, a linear address is just a physical address. Segmentation is therefore taking chunks of physical memory and assigning them permissions: you’re code, you’re data, you’re ring 0, you’re ring 3.

To locate a byte in some segment, you use what’s called a logical address. This is going to be a key point in this class: in reality, no matter what you’ve learned about virtual memory addresses before, all memory access by the hardware uses these logical addresses. A logical address is also called a far pointer; far pointer is just another name for it. A logical address consists of a segment selector and an offset into the segment which is being selected. So if you’re specifying a logical address, you say I want that segment and I want this offset into that segment. The segment selector is 16 bits and the offset is 32 bits; we’ll say that again later when we get pictures. So a logical address is a 48-bit address, and all hardware access is actually using this 16-bit segment selector plus 32-bit offset.

Also, since we were talking about linear addresses, the physical address range: it is defined by whatever you can actually go talk to out in RAM on the memory bus. That’s your physical address space. We talked about that with CPUID; we said it can tell you what your physical address range is, meaning what addresses can actually be talked to in real RAM chips. But for a normal 32-bit system you think of it as being 2^32, that’s four gigabytes, the maximum address range in a normal 32-bit system. Later on we’ll see
physical address extensions, which kick that up to 36 bits, but not all at once. All right, so the linear address space, which we’re saying right now is just physical addresses, is a flat 32-bit space, but when we only have physical addresses, that 32 bits is reduced to however much physical RAM you actually have.

As I said before, when paging is disabled, which as far as we care right now it is, linear addresses map one-to-one to physical addresses. So right now we have some notion of logical addresses mapping to linear addresses, and all memory access is actually through logical addresses. We said logical addresses are a segment selector and a 32-bit offset into that segment. The segment selector, which is held in one of those segment registers, is selecting from some big table of those descriptors. So here it’s explicitly calling out that there’s a descriptor table: I select into that, I pull out a base address from that, and I add that to my 32-bit offset.

Going back to this earlier picture, for instance, there’s some table of these segment descriptors right here in the middle. It’s just a big array saying here’s one segment, here’s another segment, and it specifies how they look in memory. The segment registers store segment selectors, which say here’s some index into the table and that’s the segment descriptor I want. And when you’re trying to access memory, you’re not just giving a segment selector, you’re not just saying I want to access memory in this segment; you’re saying I want to access memory in that segment at some offset into the segment.

This picture is critical to understand, and we’ll be coming back to it again later to reemphasize this. First you’ve got to pick which segment you want to deal with, then you’ve got to say how much of an offset it’s going to be into that segment. That gives you a linear address. Right now we say linear addresses are just physical addresses, so that gives you the actual address in RAM where your bytes are. So in the big picture of things, this is how it looks. Right now this part is the wool pulled over your eyes; this is where paging would be. But since we have no paging right now,
this logical address up here, a.k.a. a far pointer, where you have a segment selector and an offset, works like this: take the segment selector and pick something out of a table. Now we’re adding in these table names. There’s a table called the global descriptor table, and there’s another one. So you say I want to pick this description of a segment out of this table, and then I want to add this offset into that segment. If we selected this segment descriptor right here and it says, hey, my segment starts at 0x123000, and then you say I want offset 0x123, you take base 0x123000 plus the 32-bit offset 0x123 and you get 0x123123. That’s the actual linear address which you’re trying to access. Since there is no paging, the hardware then goes out to physical memory and tries to access a dword or a byte or whatever in physical RAM at address 0x123123.

All right, so now we have to understand how those segment selectors work, because they start the process off for us. They say I want this 32-bit offset to be in the context of some segment. So which segment? Where is it? How do I find it? That’s handled by the segment selector. A segment selector is a 16-bit value held in a 16-bit segment register. In this picture we said we’ve got a bunch of segment registers and they’re just pointing at some descriptor. Each of these registers is just 16 bits: CS, SS, DS, and so on. We’ll describe the conventions that go along with them, the way the hardware uses those. So you’ve got up to six of these 16-bit registers, and each of them potentially holds a segment selector, which is the 16-bit data structure that we saw on this page.

So now we’re talking about the 16-bit data structure which is held inside one of the segment registers. It’s a fairly simple data structure; it’s only got three fields. The first field is a two-bit field called RPL, or requested privilege level. A two-bit field, meaning that
it can hold values 0 through 3. You might think that this privilege level thing may have something to do with ring 0, but right now just put that on your mental stack. The least significant two bits in a segment selector are saying something about the requested privilege level.

Then we have one bit, the table indicator bit. This specifies which of two different tables I select from when I want to go find a descriptor. One of them is called the GDT, the global descriptor table, and this is something which is used in every process, in the kernel, everywhere. But in some cases the OS may want to have segments which are specific to a given process, and in that case what it uses is something called the LDT, the local descriptor table. So if you want there to be some chunk of this process’s memory which is marked as, say, non-executable data, then you use the LDT for that process, and the OS will swap in and out different LDTs when it swaps between different processes.

You could duplicate things between the two of them. You could certainly have index zero in the global descriptor table look exactly the same as index zero in the local descriptor table, but there would be no point to it, because you can always access the GDT entries from anything. If you have something that you would put in both, just put it in the GDT and use it that way. You only want the stuff in the LDT to be whatever is specific to that process, which isn’t necessarily accessible by any other process. The GDT is global in the sense that everyone sees the same thing. With the LDT, the OS has the option of swapping in and out different LDTs for different processes so that they can see memory segments differently from one another. It turns out not to use that option, at least on Windows, but it has the option.

So the table indicator just says, if my CS register right now is pointing at some segment, first I go to this table indicator and I ask, is this an index into the GDT or the LDT? Then I take these top 13 bits and I say that’s my index into the GDT or LDT. Pick your table based on the table indicator, then pick your offset into that table. It’s an array of these descriptors: one descriptor, one descriptor, one descriptor. So index zero is at offset zero, and index one is at eight bytes offset, because each descriptor is eight bytes.

Yes, I think Linux uses the LDT. I’m not 100% certain on that. I’ve seen reference to the fact that there are some virtualization detection things which look at the LDT information, and that it’s different inside of a VM than outside on Linux, but that definitely is not the case on Windows. So I’m assuming this
virtualization detection stuff is on Linux, but I’ve never confirmed it. We will talk at the very end of the slides about a reference on virtualization detection that did say, hey, look, the LDT is different inside a VM.

The index is 13 bits, so yes, each table can hold up to about 8,000 possible descriptors. So you can have 8,000 for everyone, and then each process can have another 8,000 specific to it. Well, maybe less than that, because it turns out the LDT is actually described in the GDT, so in order to find the LDT you have to go into the GDT. So a little less than sixteen thousand in total. Yeah, that’s potentially limiting.

All right, so if we dig down into each of those segment registers themselves: first of all, again, segment registers are just holding one of those 16-bit segment selectors, and the hardware is always using whatever CS is pointing at or SS is pointing at. It’s saying, I want to find the CS stuff, so I go to the table indicator, that says GDT, and then I want that index.

So, the code segment selector. The important thing here is that the hardware always uses the code segment. Whatever segment selector you’ve got in CS right now, the hardware is going to walk these tables and cache the information. If you’re trying to execute code at some logical address, the hardware is always implicitly using CS. So if you don’t say anything else, the hardware will implicitly use the CS segment selector. If you just think that you’re jumping to address 0x123123, in reality the hardware is saying, okay, the code that’s running right now, what’s your CS set to? Your CS is selecting this segment, so if you want to jump to 0x123123, I will add that to the base of whatever is in CS. So CS is always implicitly being used by the hardware to say this is the base of your segment for code access.

I think we’re going to see this a little later. I
don’t think we really had a notion of it in the first class, but the processor differentiates between code access and data access. When you’re jumping around and your EIP is changing, meaning the address of the next instruction is changing, that’s a code access. The hardware knows EIP is about where do I need to pull from memory to get code instructions, so it treats that one way, dealing with CS. For data access, when you’re just saying move data from here to there, it’s then implicitly using SS. It’s called the stack segment, and yes, DS is called the data segment, but when you dig down into it, all data access uses SS if you do not specify your own segment that you want to deal in.

Just to quickly diverge, we saw this before with rep movs and rep stos. If you look at the actual instructions, they are specifying a specific segment. One of them says always move to the destination at ES: something. That’s a logical address. It’s saying take whatever ES is, treat that as my segment selector, and take [EDI], treat that as a 32-bit offset into that segment. So in that case there’s an instruction where you’ll explicitly see it, and disassemblers will write it that way, showing you that this always uses certain segments. It turns out rep movs actually uses DS and ES, so it’s technically copying between two different segments, but in reality the OS sets it up so that they’re exactly the same.

Can the segments overlap, so you can treat code as data? Absolutely, yes. The segments can overlap; in fact they can completely overlap, and in that way code can be data and data can be code. But I don’t want to spoil the surprise for later. All right, so segments can overlap; you can have the same base address. I just said that in reality the OS makes DS and ES the same, so that when you’re rep moving it’s really just functionally moving within one space. DS and ES are both the exact same, completely overlapped memory segments. But the point is you could envision scenarios where you have completely separate, non-overlapping segments and you want to copy out of one segment into the
other segment. So you need to set DS to whatever and set ES to whatever and then copy between them, if you’re trying to actually enforce protections.

But back to the point about SS. In reality, although it’s called SS, the hardware is using SS, the stack segment, for all data access. As I said, if the hardware is looking at EIP for new instructions, it’s actually using CS; all code access is implicitly using CS if you don’t tell it otherwise. All data access implicitly uses SS if you don’t tell it otherwise.

Sorry, say again? So, if I say that everything is implicitly using SS, it turns out that all of your data and all of your stack and everything else is actually somewhere within that stack segment. We’ll get to how it all works out later, but it definitely does work out. It’ll all become clear at the very end when I spoil everything. But for now, just think that an OS wanting to set things up so that this is code and this is data can say, my code is based at zero and goes to two gigs, and my data is based at two gigs and goes to four gigs. You can really have those completely separate, and any time you access code, the access needs to be in there.

All right, so DS stands for data segment, ES stands for extra segment, and then there’s FS and GS, which don’t stand for anything; they’re just some extra ones. In reality you can think of DS through GS as just general-purpose segment registers for whatever you want. CS and SS are used by the hardware for code operations and data operations. For the rest of them you’re free to select whatever segment you want, and then maybe you want to jump to DS: something, maybe you want to jump between segments, call between segments, access data in different segments, etc. It all depends on what the operating system is actually trying to achieve. But the
only key point I want you to take away from this slide, and this was new to me when I dug into really understanding this stuff, is that the hardware is really using CS for all of those times where you see instructions. If you’re not saying override CS and use my segment, it’s actually using CS for all code jumps, returns, and calls. And if you’re just saying move from memory to a register or something like that, and you don’t say move GS: address to register EAX, it’s just using wherever SS is and an offset into that stack segment.

All right, so one point I would make here is about how the hardware actually uses it. There’s what’s called a visible part, which is the segment selector; that’s 16 bits. But, and this will be a recurring theme, hardware doesn’t want to keep walking tables all the time and always have to go out and find stuff. So in reality, when you specify some code segment, when you put a segment selector in CS, the hardware will walk the table once. It will go to the offset in whatever table you asked it to, but then it actually caches the information in what’s called the hidden part. It caches that entire descriptor out of that table: it takes the table information and pulls it all into this hidden part. You can’t access the hidden part, but the hardware is using it; you can only access the visible part. So that’s just one point: once you’ve accessed something in the code segment once, the hardware forevermore is really just consulting this cached copy. We’ll see another example of that sort of let’s not walk the tables, let’s just cache it, later on.

All right, so this again is just what I was saying. In reality, even if you say move to or from [ESP], or even if you’re pushing or popping, it’s always SS:ESP if you don’t specify otherwise. And this is what I was trying to say; we could go look at an actual disassembly example to confirm it as well. When you look at the actual instruction manual for rep movs, it says move (E)CX doublewords from DS, that’s specifying a segment, colon [ESI], so it’s going to memory at that location. So it’s saying that’s a logical address.
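To make the visible part versus hidden part idea concrete, here is a minimal C sketch. It is only a model under stated assumptions, not how any real CPU or OS is written: the `descriptor` struct is simplified to just a base and a limit (real descriptors are packed 8-byte structures with more fields), the table indicator bit is ignored so everything comes from a single table, no limit or privilege checks are done, and the names `segment_register`, `load_segment_register`, `logical_to_linear`, and `table_walks` are made up for illustration.

```c
#include <stdint.h>

/* Simplified descriptor: real ones are packed 8-byte structures with
 * type, privilege, and flag fields in addition to base and limit. */
struct descriptor {
    uint32_t base;
    uint32_t limit;
};

/* Model of one segment register (hypothetical layout). */
struct segment_register {
    uint16_t selector;          /* visible part: the 16-bit selector */
    struct descriptor hidden;   /* hidden part: cached copy of the descriptor */
};

/* Counts how many times the descriptor table was consulted. */
static unsigned table_walks;

/* Loading a selector is the only time the table is walked. The table
 * indicator bit (bit 2) is ignored in this sketch; the caller must pass
 * a table with at least (selector >> 3) + 1 entries. */
static void load_segment_register(struct segment_register *reg,
                                  uint16_t selector,
                                  const struct descriptor *table)
{
    reg->selector = selector;
    reg->hidden = table[selector >> 3];   /* top 13 bits are the index */
    table_walks++;
}

/* Every later access uses only the cached base, never the table. */
static uint32_t logical_to_linear(const struct segment_register *reg,
                                  uint32_t offset)
{
    return reg->hidden.base + offset;
}
```

Loading selector 0x08 (index 1) from a table whose entry 1 has base 0x123000 and then translating offset 0x123 gives linear address 0x123123, and `table_walks` stays at 1 no matter how many translations follow. Changing the table entry afterwards has no effect in this model until the selector is loaded again.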
So it’s using a logical address: the 16-bit segment selector is stored in the DS register, the 32-bit offset is stored in ESI, and it’s going to memory at that location. This is a rare example where they explicitly call out, hey, this is copying between two different segments. So if you don’t want them to be different, set them to be the same thing; if you do want them to be different, make sure you change the segment registers.

And again, if you want to override things, you can. In inline assembly it would usually just look like tacking on a segment register, colon, 32-bit offset. The reverse engineers will be familiar with this one: if you know there’s some data structure stored at the base of FS, as is the case in Windows, you can just say FS:[0] and get at whatever that data structure is.

Let me just cut to the chase here. In Windows, by convention, Windows always exports a big data structure to every different process; they each get this one. I think it’s called the thread environment block, the TEB. When Windows lets a process run, it makes sure to always set up the segmentation information so that the FS segment selector is selecting a segment whose base points at this data structure. So when you access FS:[0], you’re accessing the first element of the data structure; when you’re accessing FS:[32], you’re accessing 32 bytes in.

When you’re reverse engineering on Windows you’ll see this frequently. You’ll see code accessing the thread environment block, and maybe it’ll walk some number of bytes in, get a pointer, pull out the PEB, the process environment block, and then start accessing that, etc. We didn’t dwell on this, but in the Life of Binaries class, in the virus code, in order to find some data structures in
memory I used the thread environment block. I was just doing something like FS:[0] and I walked the data structures. So if you go back and look at the virus code, you’ll now be able to see this little thing in there. At that time I thanked Corey, because he pointed out this was actually in an exploit tutorial talking about different types of shellcode. Knowing that this data structure, no matter what process you’re in, is always based at FS, because Windows just does that by convention, is useful for viruses and exploits. From there, whatever data is in that data structure, some of it’s useful to me, some of it’s not. But there is something useful to me, such as a linked list saying here’s where one module is loaded in memory, here’s where kernel32.dll is loaded, here’s where ntdll and other DLLs are loaded. So you can find the list of loaded modules in a user space process by going through FS.

We do have a question in the chat from Dave. He’s asking whether the opcode would be different for a rep movs referencing another segment. In reality you cannot access another segment with rep movs. That’s why it’s saying in the manual you’re copying from DS to ES; you don’t have the option in that case to override it. That’s actually why it’s specified in the manual: whether you like it or not, this is always moving from DS to ES. Now, you can change what’s in ES and DS, and that’s how you can change where it’s coming from, but the opcode for it will not change, because there’s no notion of variability here. This opcode is always the literal bytes F3 A5.

Well, that’s a good question. I can’t remember; let me think. It’s really just a permission issue here. I’m trying to remember whether or not user space code could change FS. I believe it can, but certainly no legitimate program would change its own. The point is that data structure is exported by the kernel for user space’s convenience, so you’d really just be screwing yourself. So that certainly could be the case. Or maybe, if you ever get exploited, you want to trick the exploit code which is going to come in and assume it’s all correct; maybe that would work. But I have a feeling other behind-the-scenes libraries would actually die, because if you dig into some libraries you’re going to see them accessing FS as well.
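The point about rep movs, that the instruction bytes stay the same and only the contents of DS and ES decide where the copy really goes, can be modeled in a few lines of C. This is a sketch under assumptions, not a description of real hardware: `memory`, `MEM_SIZE`, and `rep_movsd_model` are hypothetical names, the segment bases are passed in directly instead of being looked up through selectors, the direction flag is assumed clear so ESI and EDI count upward, and the only bounds check is against the toy array rather than a segment limit.

```c
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64u

/* Toy stand-in for physical memory (hypothetical, for illustration only). */
static uint8_t memory[MEM_SIZE];

/*
 * Model of "rep movsd": copy `ecx` doublewords from DS:[ESI] to ES:[EDI].
 * ds_base and es_base stand for the base addresses of the segments
 * currently selected by DS and ES. Returns the number of doublewords
 * copied; stops early if an access would leave the toy memory array.
 */
static uint32_t rep_movsd_model(uint32_t ds_base, uint32_t esi,
                                uint32_t es_base, uint32_t edi,
                                uint32_t ecx)
{
    uint32_t copied = 0;

    while (ecx > 0) {
        uint32_t src = ds_base + esi;   /* linear address of the source */
        uint32_t dst = es_base + edi;   /* linear address of the destination */

        if (src > MEM_SIZE - 4u || dst > MEM_SIZE - 4u)
            break;

        memmove(&memory[dst], &memory[src], 4);
        esi += 4;
        edi += 4;
        ecx--;
        copied++;
    }
    return copied;
}
```

With `ds_base` and `es_base` equal, as in the flat setup described above, this is an ordinary copy within one space. With different bases, the identical call copies between two separate regions, which is the scenario where the two segments really are distinct.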
Okay, so that’s what I wanted to say about that. That’s one place where there’s a convention: Windows is always exporting a data structure at FS. And this is one of the points that Kristina Jones had brought up in the previous class, having dug into it because I had had that one question: I thought I had seen at some point that the stack cookie on Linux was being stored at some offset from GS. So I wondered, is GS on Linux being used as a sort of data structure export thing? And yes, it does appear to be the case. What was found, which I need to get into the slides, was a reference saying that yes, Linux likewise was doing something like that. I think they called it thread-local storage. It’s the same idea: some thread information which maybe points to some process information.

So basically each of them has a different segment which by convention the kernel makes available to user space to store some thread information. And so when they’re switching between different processes, they need to be updating FS or GS so that it points at that process’s specific data structure, so that the base address equals the right place, so that base plus zero gets you the linear address, which gets you to the physical address that has the data structure.

We’re actually going to learn a little bit more about that right now by taking some measurements of the segment registers, looking at the literal values in them, and then comparing them in user space versus kernel space.

And then Dave was asking, how about in other instructions, would the opcode change? If you’re hard coding and overriding a thing, like I said, you can access DS or you can access FS. If you’re hard coding it, the opcode doesn’t change. But there’s something at the very end, which we
may or may not have time to get to, called prefixes, instruction prefixes. It turns out if you hard code this as FS: whatever, there’s just an FS prefix that gets tacked on at the beginning of the instruction, so that as the CPU decodes it, it says, okay, I’ve got this prefix, please interpret the next instruction as referring to FS, for instance. So the opcode doesn’t change, but a prefix is tacked on at the beginning of the opcode.

All right, so for this lab we’re basically going to run something which takes each of these registers, CS, SS, DS, ES, FS, GS, stores them into variables, and then prints them out. We’re going to do it from user space and kernel space to see what, if anything, is different between them.

To do that, I don’t know if you have this on your desktop from when you downloaded; I guess not. If you don’t have it on your desktop, you need to search for DebugView and download that. DebugView lets us see kernel debug output without having to actually be in a kernel debugger, and this lets us at least put that off for a while. So you need to go get DebugView; it should be the first link, the Microsoft one. Download it and drag the exe out to your desktop. You should have a little DebugView with a little picture of a magnifying glass.

In DebugView, once you open it, you need to agree to the EULA, and then the key point is right here where it has Capture Kernel. It’s a little gear with a red slash through it. You need to click that so that there’s no longer a red slash through it, so that you are capturing kernel messages. And then where it says Capture Win32, it has the little Windows icon; you want to click on that so there is a red slash through it. So we want kernel messages; we don’t want user space Win32 messages.

Once you’ve done that, going back into Visual Studio, we want to change our startup project to the user space segment registers one. Right click on that and set it as the startup project. Then we’ll just take a glance at the code here. Look at user space segment registers, and you can see I really just have some inline assembly which takes the CS register and moves it to a local variable called ICS
etc. Then I just go through, and the only thing here that I did was, instead of just printing out the literal value, which is less useful to you, interpret it as a segment selector: take the bottom two bits and say that’s your requested privilege level, take the next bit and say that’s your table indicator, and if it’s zero say that it’s the GDT and if it’s one say that it’s the LDT, and then take the top 13 bits and say that’s the index. That’s what the selector print does. So, pretty easy. Set a breakpoint on the return, run it, and then pull up the window again.

So what it’s saying is, the literal value for CS is 0x1B, but in reality we take those bottom two bits and interpret them: the requested privilege level is 3. We take the table indicator, which tells us that it’s the GDT, so the table indicator was zero. And then the index, the top 13 bits, is 3. It looks like everybody here has a requested privilege level of 3. CS is set to a different thing than SS. DS and ES are set to the same thing, which makes sense given what we were saying: even if rep movs or rep stos and things like that are using these explicitly, the OS probably should set them to be the same. FS is set to something different. And GS is set to a literal value of zero, which cannot be interpreted, because it turns out that GDT entry 0 is never considered legitimate. If you have a literal zero, you have index 0, and that’s not valid. A literal zero means table indicator 0, which means the GDT, and index zero anyway.
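The interpretation that the selector print code performs can be sketched in C as follows. This is not the class’s actual source; `selector_fields`, `decode_selector`, and `descriptor_byte_offset` are names invented here. It just splits a 16-bit selector into the three fields described above: RPL in bits 0 and 1, the table indicator in bit 2, and the index in bits 3 through 15.

```c
#include <stdint.h>

/* The three fields of a 16-bit segment selector (hypothetical struct). */
struct selector_fields {
    unsigned rpl;    /* requested privilege level, 0 through 3 */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned index;  /* index into the chosen descriptor table */
};

/* Split a raw selector value into RPL, table indicator, and index. */
static struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields f;

    f.rpl   = selector & 0x3u;          /* bottom two bits */
    f.ti    = (selector >> 2) & 0x1u;   /* next bit */
    f.index = (unsigned)selector >> 3;  /* top 13 bits */
    return f;
}

/* Byte offset of the selected descriptor within its table,
 * since each descriptor is 8 bytes. */
static uint32_t descriptor_byte_offset(uint16_t selector)
{
    return ((uint32_t)selector >> 3) * 8u;
}
```

For the value 0x1B seen for CS in user space, this yields RPL 3, table indicator 0 (GDT), and index 3, and the descriptor sits 24 bytes into the table. For a literal zero, all three fields come out zero, which is the GDT index 0 case that is never valid.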

So GS is not actually being used by Windows in user space. Let’s see about kernel space now. I hope this is going to work; I didn’t test it this way. All right, so you need a command prompt. I usually test this inside the VM; there should be no difference. Assuming you have the intermediate x86 folder on your desktop, you want to change directory to desktop, intermediate x86, intermediate x86 code, and then the segment registers directory. Once you get there, go ahead and run load.bat. All right, it looks like an error, or, you know, actually I think it always says that. Yes, okay. It looks like an error, but in reality it succeeded, because if you go over to DebugView you should see something like this. This is the exact same sort of code, with that segment selector interpretation that was done in user space.

So what do we see here? We see that the requested privilege level is 0 for CS and SS, but for DS and ES it looks like it’s still 3 for whatever reason. FS has it set to 0, and GS is still not used. So this is looking at our segment registers and just interpreting what they are. If we want to make some inferences based on that, I basically took that and made it into a table in your slides so that we can compare. Oh yeah, right, I did have pictures in the slides. These pictures in the slides, though, are in the context of running inside the VM.

All right, so when we look at the difference between the two things, what we see is that user space looks like it has everything set to an RPL of 3, so maybe that’s kind of implying that RPL has something to do with user space, while kernel space has a few of them set to zero. Hold on a second. All right, so the difference is RPL is three for pretty much everything in
space our PL is 0 for every s MPs and ES and kernel space the indices are different so there’s no overlap in user space code versus data segment so code segment is index 0 or 1 in the kernel and X 2 for data and then in X 3 is code and user space index or is own data in user space yeah GS is always in valid looks like FS is different between user space in kernel space again different indices different RPLS but the DSM es don’t seem to be changed between the display certain space and yeah as I said subject to an old friend you know look through this on Wynne 2011 Vista I don’t know if it’s gonna be the same I would expect it would be except on 64-bit systems even different all right so the inferences we get from this is that it looks like CS SS and FS are definitely different per user space for this kernel space and it’s just saying user space for this kernel space the RPL field seems to pour late very strongly with whether you are in user space with no station right except for that es es it seemed to be the same between either ah and that was the other point it doesn’t change the S or yes when moving between and GS is not usually this is just some inferences I’m not trying to imply anything amazing here yet we’ll talk about big now and some minutes later but this is some initial things just by dumping me things you can get a sense of whether it’s doing the same thing within space

or something different, where the privilege levels do or don’t matter. That’s what it is.

So going back to the picture we had before about how logical addresses are translated to linear addresses, just to reiterate: we know logical addresses are 48-bit addresses that have a 16-bit chunk that says “I want to select this segment” and a 32-bit chunk that says “I want this offset in that segment.” And because some people may be wondering: if you have a 32-bit offset which is greater than the limit of a segment, say your segment is five bytes long and you specify an offset of six, the hardware actually enforces that. It says, “look, I see that you’re accessing outside the segment bounds,” and it raises a general protection fault, which the processor handles with interrupts, which we’ll learn about later. So yes, the hardware is enforcing whether or not you’re overstepping the bounds of these segments. The hardware is the one that’s actually saying “give me an address,” and you give it a logical address, whether it’s explicitly logical because you’ve specified the segment, or implicitly logical because it assumes CS for code accesses and SS for stack accesses.

So the question is what enforces the requested privilege level, and that’s hardware as well. This isn’t the end-all be-all for what specifies ring 0 versus ring 3, so we’re not quite there yet, but as we said, x86 privilege rings are hardware enforced. So the hardware is checking these sorts of bits, like the RPL, as well as some others we’ll learn about in a little bit. It’s checking those and saying, “oh look, I see you’re trying to access some code segment that is ring 0, and it looks like you’re currently ring 3,” and the hardware says no. So the hardware has built into it these checks on the current privilege level and the requested privilege level.

Right, so with our 48-bit logical addresses, you select your segment out of a table. The table can be the GDT or the LDT. We said with the GDT everybody sees the same thing; with the LDT, the kernel, at its leisure and at its option, may swap out different LDTs for different processes as it sees fit. And whenever you’re accessing this logical address, eventually it makes its way to the linear address, which for now we said is always the physical address. Any questions on what we’ve covered so far? We haven’t seen the segment descriptor yet; we’ve just seen how you select a segment with the segment selector. Any questions on anything thus far?

So continuing on. Sorry, let’s see. Well, we were going to dig into it in a second, but from what you’ve seen thus far, when you’re talking about base addresses versus offsets, the base is actually 32 bits. We haven’t gotten to the descriptors yet, but it is 16 bits for the limit and 32 for the base. OK, so this is what we’re going to go into next: what do those descriptors, what do those structures actually look like? But the two key fields are just “it starts here” and “it goes to there.”

All right, so looking at our two different tables, there’s the question of how the hardware finds the GDT and the LDT, because we said the OS is actually setting up these tables. So how does the hardware find them? Now we know the relation: the segment selector has the table indicator bit, which is saying this index is either into the GDT or into the LDT. So TI equals zero, you’re selecting from the GDT; TI equals one, you’re selecting from the LDT. Every entry in these tables is 8 bytes large; that’s why you see 0, 8, 16. Each entry is 8 bytes, and that 8 bytes is your segment descriptor, which we’re going to see in a little bit. For now we want to focus on how you find it. So there is a specific register called the GDTR, the global descriptor table register, which gives the base of the table and the limit of the table. It says “my GDT starts here in memory and it goes to there.” And then the LDTR turns out to be, well, we’ll get there, but what this picture is trying to imply is that the LDT does not work the same way.
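
To make the table lookup concrete, here is a minimal sketch in Python of the arithmetic described so far: the TI bit picks the table, the index is scaled by the 8-byte entry size, and the result is checked against the table limit. The function name and the tables dictionary are made up for illustration, the example base and limit values are arbitrary, and the limit is treated as the offset of the last valid byte in the table (size minus one), which is how the Intel manuals define it.

```python
DESCRIPTOR_SIZE = 8  # every GDT/LDT entry is 8 bytes


def descriptor_address(selector, tables):
    """Return the linear address of the descriptor a selector refers to.

    `tables` maps "GDT"/"LDT" to a (base, limit) pair. How the LDT's base
    and limit are obtained is a separate step and is simply assumed here.
    """
    table_name = "LDT" if selector & 0x4 else "GDT"   # table indicator bit
    index = (selector & 0xFFFF) >> 3                  # top 13 bits
    base, limit = tables[table_name]
    offset = index * DESCRIPTOR_SIZE                  # entries at 0, 8, 16, ...
    # Stand-in for the hardware bounds check: all 8 bytes of the entry
    # must lie within the table, otherwise real hardware raises a fault.
    if offset + DESCRIPTOR_SIZE - 1 > limit:
        raise ValueError("selector index is outside the table limit")
    return base + offset


# Hypothetical GDT at 0x80000000 with room for 128 entries (limit 0x3FF).
example_tables = {"GDT": (0x80000000, 0x3FF)}
print(hex(descriptor_address(0x1B, example_tables)))
```

With selector 0x1B (index 3 in the GDT) the descriptor sits 24 bytes past the table base. An index beyond the limit raises an error here, standing in for the fault the hardware would generate.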

Unlike the GDTR, the LDTR does not just have a base and a limit that says “my LDT starts here and goes to there.” In reality the LDTR just has a segment selector, and that segment selector selects something from the GDT, and that GDT descriptor has a type of “I’m an LDT.” So it actually points at some other chunk of memory, and that segment of memory described in the GDT is actually an LDT segment. So it specifies a base and it specifies a limit, but it’s also saying “my type is LDT.” So if you’re trying to access a per-process LDT, the kernel would go around and change that segment selector. Let’s say it has five LDTs for five processes. For process one, it may have segment selector ten for the LDT, and that says if you want to access the LDT, the hardware goes to index ten in the GDT, from there it finds the LDT, and from there it goes to the offset into it. So it’s a couple of levels of indirection when you’re dealing with the LDT. We’ll come back to that in a second.

So the GDTR for now is the much simpler case. There’s just a 48-bit register. Don’t confuse this with logical addresses or anything like that; this is just a data structure with two fields. The bottom 16 bits is just how large the table is in terms of bytes. As we said before, the GDT can only have 8192 entries, and 2 to the 16 bytes divided by 8 bytes per entry is 8192. Basically it’s saying here’s how large this GDT is, and theoretically it could be less; you could specify less than the maximum size, but the point is, as you can see, it goes up to the maximum size of 8,000 entries. And then the upper 32 of the 48 bits are just the 32-bit base address. In 64-bit mode this would be a larger register, with a larger address. So when the hardware needs to access the GDT, it consults the GDT register. Same thing with the OS: when the OS wants to modify the GDT in memory, it checks the GDT register, breaks it up into two chunks, and says here’s the base address and here’s the limit. And the OS doesn’t really care about the limit that much; it’s the hardware that enforces it.

So if you specify some... oh yeah, you could. Let’s say that this limit for this GDT... OK, I’ll go over to the board. [unclear] So the GDTR says the base is right here, and there’s some GDT, and then it says... Hold on a second, the video hasn’t followed me. OK, thank you. All right, so the GDTR has a 16-bit chunk of limit and a 32-bit chunk of base. It says here’s the linear address where my GDT starts, and here’s the limit in bytes. If you wanted, it could be only 64 bytes large, and then you would only have eight of these entries in it. So you could have a small GDT, for instance, if you don’t want to use all of the memory. The point of the limit, therefore, is that when someone’s specifying a logical address, that logical address says “I want this index right here.” But what if you have a small GDT and they say “I want index 100 in your GDT,” but the limit says “no, I’m only eight entries large”? So this is again for the hardware, so that it can enforce that if someone’s coming at it with a logical address that’s outside of the bounds, outside of the limit, it raises a general protection fault. So I was starting to say no one cares about the limit, but yes, the hardware does care. And yes, you can specify it; if this were not maximum size, which I believe it typically is, and someone had a logical address which selected outside of it... Well, it would certainly have to be a minimum size, and the minimum size would be

at least two things, for a code and a data segment. Let’s say, simplest possible case, everyone uses the same code and data segment. You could have only that, and they could completely cover all of memory, so that’s all you would really need. In a more complicated case, where you have FS pointing at different data structures and everything, depending on how your OS wants to do that, the minimum size is something different. You don’t need maximum size in order for everything to know where to find stuff; it’s just a question of what your OS actually wants to do. This type of stuff is typically the kind of thing that only the OS deals with; they don’t want third-party software messing with it. Third-party software could mess with it, but it’s not something where they have any API documents saying “if you want to register a new GDT entry, do this and that.” They do what they’re going to do for their purposes; it’s really not something anyone else is meant to be messing with. I think it does cause problems, as we found when we found the attack, which I’ll maybe talk about later.

All right, so this is pretty much it. There are two new instructions here. LGDT, for load: it loads six bytes of memory into the GDTR, two bytes for the 16-bit part and four bytes for the 32-bit part. It takes something out of memory and loads that into the register. And SGDT, for store, says take that register and dump it out to six bytes of memory. So if you want to look at the base and the limit, you use the store; you store it out and read it. It turns out the load is ring 0 only, so only the kernel can be changing the base address for the GDT and stuff like that. You don’t want user space code pulling the GDT out from underneath you. The store, however, is unprivileged, and anyone can do that. So we’ll get to a point later: when we deal with a lot of these registers, when we talk about segmentation and then interrupts and stuff like that, there’s a funny abnormality in Intel’s specification, in that the only one who would ever have any cause to change these things is the kernel, but for some reason reading the register out was made available to user space. And that’s the basis for a couple of different virtualization detection techniques: they read out the register and say “oh hey, look, it doesn’t look like a normal Windows register value,” because the virtualization system ended up changing it.

All right, so that’s all I want to say about the GDTR for right now. The point is it’s just how the OS or the hardware finds the GDT. We know the hardware needs to find it when you’re specifying a logical address: the hardware needs to break down the logical address, take the table indicator, take the index, walk to this table, find the descriptor, read whatever data is in the descriptor, we don’t know what yet, and figure out whether you’re trying to access outside the bounds of the segment that descriptor describes.

So then with the LDT: if someone were to make use of the LDT, and after this class you can go write your own OS and say “I want to use the LDT,” the LDT register is in reality only a 16-bit segment selector, and all it does is select a segment, which must always be selected from the GDT, because you can’t find the LDT by selecting something in the LDT. The point is, how you find the LDT is you take this segment selector and break it down; it always must have a table indicator of GDT, and then whatever this index is, that’s an index in the GDT, and that specific descriptor in the GDT should have itself marked as “hey, I’m an LDT,” and it should say the LDT has a base address of this. And as it says right here, the segment descriptor is going to get loaded and cached in an invisible area. Again, when the hardware is trying to access the LDT, it walks one time to the GDT, finds the entry, pulls out the base and the limit for the LDT, and caches that, so that forevermore, when you try to access something with a logical address that points into the LDT, it just knows: here’s the base, how many steps do I need to make into that table to find the next descriptor.

All right, and again two instructions. LLDT: take and go ahead and load up that new segment selector. So the point here is if the OS wants to have a different LDT for different processes, each time before it swaps into the context of a new process it would say “OK, I’m going

to load my LDTR to now be index 11, to now be index 12, to now be index 13 in the GDT,” so that each of those indices points at a different base address and stuff like that, so that they’re not overlapping between processes. And again, store: SLDT. So LLDT is privileged; only ring 0 can set LDTR values. Reading it out is unprivileged, and like I’m saying, this is the basis for one entire virtualization detection mechanism.

All right, so now what do the segment descriptors actually look like? We’ve been referring to tables of segment descriptors, but what’s actually contained therein? We know they’ve got a base and we know they’ve got a limit, because they’re specifying some memory range which is a segment. Well, what else? All right, so how they’re actually broken up: this is a 64-bit data structure, and the first 16 bits here are the first 16 bits of the segment limit. But it turns out the segment limit is not actually... you know, actually I think I’ve been misdescribing this throughout. The table limit, that’s 16 bits. To back up: when we first saw this simplified view, I said this is a 32-bit base and a 16-bit limit. In reality this is a 20-bit limit, and you’ll see why in a second. A 20-bit limit is required in order to access all of memory if you’re accessing memory in chunks of 4 KB: 2 to the 20 times 2 to the 12, because 4 kilobytes is 2 to the 12; add the exponents, 20 plus 12, and you get 2 to the 32. So you can access all 32 bits of memory if you have a 20-bit limit and you’re saying “my limit is not actually in bytes, my limit is in 4-kilobyte chunks.” So that was my bad on describing that before, but when we get into the actual details we can see that the first 16 bits of the limit are here and the second 4 bits are right here. And then there’s this other field, granularity, G, right there. Granularity says: is this limit in bytes, or is this limit in 4-kilobyte chunks? So in that way you could specify, like, “hey, I only want something to be four kilobytes big.” Twenty bits... yeah, I thought it was a megabyte. I seem to remember something about how DOS could only access a megabyte of memory at a time or something, because of these 20 bits, because it couldn’t use this 4-kilobyte granularity. Anyway, it’s a 20-bit limit, which says how big a chunk of memory this segment is.

And then we’ve got the base address, which is all chopped up all over the place: 16 bits of it here, 8 bits of it there, 8 bits of it there, but the hardware puts it all together and says “here’s my 32-bit base address.” This is the linear address which is the base of that segment. And then we’ve got a bunch of different fields. All right, so we saw the base, 32 bits; we saw the limit, that’s 20 bits, which is required to access all of memory. Granularity, as I said: if it’s 0, it says treat this limit as a number of bytes; if it’s 1, it says treat the limit as a number of 4-kilobyte chunks.

All right, D/B, default operation size. This is an interesting thing I’ll talk about on the next slide, because in the intro class I’d given you a lying simplification, but this is what controls whether instructions are treated as 16-bit instructions or 32-bit instructions, and whether memory accesses are computed as 16-bit by default or 32-bit by default.

All right, and then descriptor privilege level. Here’s another two-bit field having the name “privilege level” in it. The descriptor privilege level, to cut to the chase, is saying whether this descriptor describes a chunk of memory which is going to be ring 0 or ring 3. We already saw the requested privilege level in the segment selector; we now have a descriptor privilege level. So we’re almost at the point where we can really understand whether stuff is executing in ring 0 or ring 3. Not quite there yet, but most of the way. Here you can see a segment descriptor specifies this chunk of memory, and it can specify “oh yeah, that’s a ring 0 chunk of memory.” So if it’s a code segment, maybe that’s a ring 0 chunk of code.

All right, so the one thing I wanted to clarify, about the lie that I told in the intro class: we had seen, sort of at the very end, quickly, that when you go into

the manual and you look at the actual opcodes, there can be ambiguity, in that an add instruction can be either of two things for the same opcode. The processor is reading in byte 0x05 and it knows “aha, an add immediate is coming up,” and then the question is: in this syntax it’s saying immediate word, and words are 2 bytes in Intel’s terminology. So if the processor sees 0x05, it should expect 2 bytes after that and it should add those two bytes to the AX register. Well, we have the exact same opcode byte, 0x05, and in this other form you get 0x05 and instead expect to take 4 bytes and add them to the EAX register. So how does the CPU actually know what’s going on here when it sees this ambiguity? How does it know what to do? The answer is: based on the D/B field of the segment from which it’s executing. So if you’ve got a code segment and this D/B field is set to 0, the processor knows when it’s reading in code from that segment it should treat those opcodes as 16-bit instructions, with 16-bit words and everything else. And if it sees that that chunk of memory is set as D/B 1, it should interpret everything as 32-bit. So this is the way that it actually decides whether it’s executing 32-bit or 16-bit code. Obviously if you’re in something like real mode, you’d think it’s hard-coded, locked down, “this is 16-bit.” But actually I shouldn’t say that; I haven’t had enough experience with real mode that I can really say you can’t access EAX, and I have a feeling you probably can.

And later on, if we make it to the end, there’s a type of instruction prefix which you can use to override this. So if you’re in 32-bit mode and you want to use the 16-bit instruction, you tack a little thing onto the front of the instruction opcode sequence. So instead of 0x05 there would be some prefix, then 0x05, and then the CPU would say “aha, I need to read 2 bytes rather than four.” So you can override this, but by default, whatever your segment says, that’s whether you’re executing 32-bit or 16-bit for code and data accesses. Most protected mode OSes, of course, are going to be executing in 32-bit; that’s why we always just assume 32-bit.

All right, a couple of last things. There’s an L flag, which says “is this a 64-bit segment?” We don’t care about that for this class. There’s an S flag, saying whether this is a system segment or a code or data segment. So there are two classes of segments: there’s system, which has a bunch of different special descriptors such as the LDT, and then there’s code or data. And for code or data there’s a variety of them: whether they’re currently set to read-only or read/write, or different options like that. So there are four bits specifying a type, and in reality you kind of have five bits of different things. S can be 0 or 1; for S equal to 0 you then have four bits specifying all the different types of system segments, and for S equal to 1 you have four bits specifying all the different types of code or data. We’ll see that in just a second.

Finally, there’s the present flag, which is there in case, for whatever reason, the operating system wants to swap segments in and out and say “sorry, this one’s not available now, but we’ll get it back later.” The OS can set the present flag to zero, and then if someone specifies a logical address which selects that segment, the hardware will automatically check this present flag and say “aha, you tried to specify a segment which isn’t really here right now,” and throw a hardware fault, a segment-not-present exception.

All right, so this is sort of the breakdown. We don’t really care about all of this, but this is just to give you a notion. When the system bit is set to one, that’s saying this is a code or data segment and not system, which I know is counterintuitive. When it’s set to one, it’s code or data, and then you consult the type to understand what type of code or what type of data it is. For all of those with the most significant bit equal to zero, it’s always data; for all of those with the most significant bit equal to one, it’s always code. And within that there’s read-only, and read-only accessed, saying someone’s actually touched this; there’s read/write; but then there are interesting things like read-only expand-down and read/write expand-down. For normal segments, base plus limit tells you what’s in bounds for this segment, so the hardware will check base plus limit, and if you’re not within that range you get a fault. For expand-down segments it’s base minus limit, and then if you’re not within that

range, you’re out of bounds. One place you might use that is with stacks: we said stacks grow towards lower addresses, so maybe you want to set the base address at the top of the stack, the stack’s highest memory address, and then say “I have an expand-down segment,” and that sort of makes sense with stacks. They don’t actually do it that way. There was an exploit; I don’t know the exact nature of it, but I’ll give a link to it later. The point is, it’s hard to say something is an exploit when you’re already in the kernel and you can already change stuff. It was an error condition where they could mess with the operating system by setting expand-down segments. I think it was actually called an exploit in the sense of an access control bypass. So it’s more like there was an expand-down segment, and whereas the operating system would normally not give you access to this memory, with the expand-down segment it was within bounds. So that was one case where someone knowing this in detail was able to say “aha, what if I do that? How will the operating system respond?” And obviously it responded with an error, because it wasn’t expecting it; it had never seen it.

All right, anyways, then for code we have execute-only and execute/read. So first of all, we can see right now that segments already have the ability to hardware-enforce this: if you’ve got a data segment, your CS segment selector should never point at a segment which is described as a read-only or read/write thing that’s not executable. Your CS must always point at some segment which has its type as code, and then it can be execute-only or execute/read, but not execute/write, so you may not write to your code.

And then there are these other notions of conforming and nonconforming. Again it’s counterintuitive, but nonconforming is the norm. For nonconforming segments, you may not jump in your code from ring 0 to 3. OK, so for normal stuff you may not jump from ring 3 to ring 0, which makes sense: user space code should not be able to jump into the kernel and start executing. And conforming is the backwards of that: you may not jump from 0 to 3, but you can jump from 3 to 0. But it’s still sort of secure and it all works out, because even though your ring 3 code can jump into this ring 0 segment and execute, the ring 3 code has not had its current privilege level changed to ring 0. It’s still ring 3 code; it just happens to be executing in the code segment of the kernel. But it can’t execute instructions which can only be done by ring 0. For instance, if it jumps into that code and says “hey, I want to load up the GDTR,” it’s not going to work.

Anyways, it was in Greg Hoglund’s Phrack article on rootkits in 1999; he said “hey, we can set these things backwards and that could be a backdoor: user space code can execute kernel code.” And yes, that’s true, but one, it can’t access privileged instructions, and two, in reality, in modern operating systems the privilege stuff is actually enforced more by paging, which we’ll learn about later. It turns out the paging permissions are such that your ring 3 code can’t actually access any of the ring 0 code. So I don’t know if that maybe made sense at some point in NT or Windows 2000, but I don’t care to go back and check.

Yes? Is expand-down used in relation to local variables? No, not really. We did say before that we have the notion of a stack, and that stacks expand down, and the stack has local variables and stuff like that. But when we get to the surprise ending here, you’ll see that no, using a data segment which expands down is not actually going to have any relation to the variables. So I’m going to keep leading you along until we finally get to the surprise ending. I’m not answering the question now, but I’ll hopefully come back to it. [unclear] Actually, I need to go set up lunch [unclear].

So here’s the system segment stuff. The only thing we’ve actually seen that makes any sense to us thus far is

this one right here. As we said, a descriptor may say “hey, I am of type LDT,” and if so, it’s in the GDT, you can index into it, and the LDT register would point at that segment. Then if you have a logical address whose segment selector has a table indicator of one, the hardware will say “aha, I know that I need to access the LDT,” and on the first access it will cache the information about the base of the LDT and the limit. So that’s it for now. Anyone have any questions thus far? We’re going to get back to the lab here in a second and dig into this and look at these things in WinDbg. Does anyone have any questions about what we’ve now learned about segment selectors, or sorry, segment descriptors? The selectors select these data structures, and they describe chunks of memory. Anyone have any questions on this?
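
As a recap of the descriptor fields covered above, here is a minimal Python sketch that pulls an 8-byte descriptor apart. It is not code from the class: the function name and the returned field names are made up for illustration, and the byte positions follow the standard 32-bit descriptor layout from the Intel manuals (limit bits 0 to 15, base bits 0 to 23, the type/S/DPL/P byte, then limit bits 16 to 19 with the flags, then base bits 24 to 31). One detail beyond what was said in the lecture: with G set, the scaled limit is treated as having its low 12 bits all ones, which is how a 20-bit limit of 0xFFFFF covers the whole 4 GB.

```python
import struct


def decode_descriptor(raw):
    """Decode an 8-byte segment descriptor (little-endian, as stored in memory)."""
    if len(raw) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    limit_low, base_low, base_mid, access, flags, base_high = struct.unpack("<HHBBBB", raw)

    limit = limit_low | ((flags & 0x0F) << 16)              # 16 + 4 = 20-bit limit
    base = base_low | (base_mid << 16) | (base_high << 24)  # 16 + 8 + 8 = 32-bit base
    granularity = (flags >> 7) & 1                          # G: 0 = bytes, 1 = 4 KB units
    effective_limit = (limit << 12) | 0xFFF if granularity else limit

    seg_type = access & 0x0F                                # four type bits
    s_flag = (access >> 4) & 1                              # S: 1 = code/data, 0 = system
    if s_flag == 0:
        kind = "system"
    else:
        kind = "code" if seg_type & 0x8 else "data"         # top type bit picks code vs data

    return {
        "base": base,
        "limit": limit,
        "effective_limit": effective_limit,
        "granularity": granularity,
        "default_size": 32 if (flags >> 6) & 1 else 16,     # D/B flag
        "long_mode": (flags >> 5) & 1,                      # L flag
        "dpl": (access >> 5) & 0x3,                         # descriptor privilege level
        "present": (access >> 7) & 1,                       # P flag
        "kind": kind,
        "type": seg_type,
    }


# A commonly seen flat ring 0 code descriptor: base 0, limit 0xFFFFF, G = 1, D/B = 1.
flat_code = bytes.fromhex("ffff0000009acf00")
print(decode_descriptor(flat_code))
```

The example descriptor decodes to a present, ring 0, 32-bit code segment starting at base 0 whose effective limit is 0xFFFFFFFF, which matches the 2 to the 20 times 2 to the 12 arithmetic from earlier.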
