Does an Operating System inject its own machine code when you open a program?

Question Detail: 

I'm studying CPUs, and I know how a CPU reads a program from memory and executes its instructions. I also understand that an OS separates programs into processes and alternates between them so fast that you think they're running at the same time, when in fact each program runs alone on the CPU. But if the OS is also a bunch of code running on the CPU, how can it manage the processes?

I've been thinking, and the only explanation I could come up with is this: when the OS loads a program from external memory into RAM, it adds its own instructions in the middle of the original program's instructions, so that when the program is executed, it can call the OS and do some things. I believe there's an instruction that the OS adds to the program that allows the CPU to return to the OS code after some time. I also believe that when the OS loads a program, it checks whether there are any prohibited instructions (ones that would jump to forbidden addresses in memory) and eliminates them.

Am I thinking right? I'm not a CS student but a math student. If possible, I would like a good book about this, because I have not found one that explains how the OS can manage a process if the OS is also a bunch of code running on the CPU and can't run at the same time as the program. The books only say that the OS can manage things, but not how.

Asked By : Revering Sumoda
Best Answer from Computer Science Stack Exchange

Question Source : http://cs.stackexchange.com/questions/28200

Answered By : David Richerby

No. The operating system does not mess around with the program's code by injecting new code into it. That would have a number of disadvantages.

  1. It would be time-consuming, as the OS would have to scan through the entire executable making its changes. Normally, parts of the executable are only loaded as needed. Also, inserting is expensive, as you have to move a load of stuff out of the way.

  2. Because of the undecidability of the halting problem, it's impossible to know where to insert your "Jump back to the OS" instructions. For example, if the code includes something like while (true) {i++;}, you definitely need to insert a hook inside that loop but the condition on the loop (true, here) could be arbitrarily complicated so you can't decide how long it loops for. On the other hand, it would be very inefficient to insert hooks into every loop: for example, jumping back out to the OS during for (i=0; i<3; i++) {j=j+i;} would slow down the process a lot. And, for the same reason, you can't detect short loops to leave them alone.

  3. Because of the undecidability of the halting problem, it's impossible to know if the code injections changed the meaning of the program. For example, suppose you use function pointers in your C program. Injecting new code would move the locations of the functions so, when you called one through the pointer, you'd jump to the wrong place. If the programmer was sick enough to use computed jumps, those would fail, too.

  4. It would play merry hell with any anti-virus system, since it would change virus code too, and muck up all your checksums.
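
To make the third point concrete, here is a minimal sketch in C of how one inserted instruction can silently change what a program does. Everything in it is made up for illustration: the toy instruction set (enum op, struct insn), the interpreter run, and the naive inject_hook are hypothetical names, not any real CPU or OS interface. The jump target plays the role of a function pointer: it is just a number that was correct before the code moved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy instruction set, invented purely for this illustration. */
enum op { OP_INC, OP_JMP, OP_HOOK, OP_HALT };

struct insn {
    enum op op;
    int target; /* absolute index to jump to; only used by OP_JMP */
};

/* Interpret `code` and return the accumulator at OP_HALT,
   or -1 if the step budget runs out or execution falls off the end. */
int run(const struct insn *code, size_t len, int max_steps) {
    int acc = 0;
    size_t pc = 0;
    for (int step = 0; step < max_steps && pc < len; step++) {
        switch (code[pc].op) {
        case OP_INC:  acc++; pc++; break;
        case OP_HOOK: pc++; break; /* stand-in for "jump back to the OS" */
        case OP_JMP:  pc = (size_t)code[pc].target; break;
        case OP_HALT: return acc;
        }
    }
    return -1;
}

/* Naive injection: shift everything from `at` onwards up by one slot and
   drop a hook in the gap. Jump targets are deliberately left untouched.
   The caller must provide room for one more instruction. */
size_t inject_hook(struct insn *code, size_t len, size_t at) {
    assert(at <= len);
    memmove(&code[at + 1], &code[at], (len - at) * sizeof code[0]);
    code[at] = (struct insn){ OP_HOOK, 0 };
    return len + 1;
}

/* The original program: the jump skips the INC at index 2. */
static const struct insn program[] = {
    { OP_INC, 0 }, { OP_JMP, 3 }, { OP_INC, 0 }, { OP_INC, 0 }, { OP_HALT, 0 },
};
enum { PROGRAM_LEN = sizeof program / sizeof program[0] };

int run_original(void) {
    return run(program, PROGRAM_LEN, 100);
}

int run_with_injected_hook(void) {
    struct insn patched[PROGRAM_LEN + 1];
    memcpy(patched, program, sizeof program);
    size_t len = inject_hook(patched, PROGRAM_LEN, 0);
    return run(patched, len, 100);
}
```

In this sketch run_original() returns 2 and run_with_injected_hook() returns 3: the hook itself does nothing, but the jump now lands on the instruction it was supposed to skip. In the toy, the target is easy to spot and could be patched. In real machine code, an address held in a register or a data word looks like any other number, which is why the fix-up can't be done reliably.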

You could get around the halting-problem problem by simulating the code and inserting hooks in any loop that executes more than a certain fixed number of times. However, that would require extremely expensive simulation of the whole program before it was allowed to execute.

Actually, if you wanted to inject code, the compiler would be the natural place to do it. That way, you'd only have to do it once, but it still wouldn't work, for the second and third reasons given above. (And somebody could write a compiler that didn't play along.)

There are three main ways that the OS regains control from processes.

  1. In co-operative (or non-preemptive) systems, there's a yield function that a process can call to give control back to the OS. Of course, if that's your only mechanism, you're reliant on the processes behaving nicely and a process that doesn't yield will hog the CPU until it terminates.

  2. To avoid that problem, a timer interrupt is used. CPUs allow the OS to register callbacks (interrupt handlers) for all the different types of interrupts that the CPU implements. The OS uses this mechanism to register a callback for a timer interrupt that fires periodically. Each time it fires, the CPU suspends whatever process is running and jumps to the OS's callback, which allows the OS to execute its own code.

  3. Every time a process tries to read from a file or interact with the hardware in any other way, it's asking the OS to do work for it. When the OS is asked to do something by a process, it can decide to put that process on hold and start running a different one. This might sound a bit Machiavellian but it's the right thing to do: disk I/O is slow so you may as well let process B run while process A is waiting for the spinning lumps of metal to move to the right place. Network I/O is even slower. Keyboard I/O is glacial because people are not gigahertz beings.
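
The timer mechanism in the second point can be imitated in user space. The following is only an analogy, assuming a POSIX system: it uses a signal (SIGALRM, armed with setitimer) in place of a hardware timer interrupt, and a signal handler in place of the OS's interrupt callback. The names on_timer, preempted and spin_until_preempted are made up for this sketch. The thing to notice is that the loop contains no yield call and no injected hook, yet control is still taken away from it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700 /* ask for sigaction() and setitimer() declarations */
#endif

#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t preempted = 0;

/* Plays the role of the OS's timer callback. */
static void on_timer(int signo) {
    (void)signo;
    preempted = 1;
}

/* Spin in a tight loop until the one-shot timer fires.
   Returns the number of iterations completed, or 0 if setup failed. */
unsigned long spin_until_preempted(long interval_ms) {
    /* A zero interval would disarm the timer and the loop would never end. */
    assert(interval_ms > 0);

    /* "Register the callback" for the timer event. */
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return 0;

    /* Arm a one-shot timer; it_interval stays zero, so it does not repeat. */
    struct itimerval tv;
    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_sec = interval_ms / 1000;
    tv.it_value.tv_usec = (interval_ms % 1000) * 1000;

    preempted = 0;
    if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
        return 0;

    /* The "process": it never yields and knows nothing about the timer. */
    volatile unsigned long iterations = 0;
    while (!preempted)
        iterations++;
    return iterations;
}
```

Calling spin_until_preempted(20) returns after roughly 20 milliseconds with however many iterations the loop managed. A real kernel goes one step further than this sketch: its handler saves the interrupted process's registers and may resume a different process instead of returning to the same loop.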
