Project 5: real-time batch operating system simulator

Due Friday, 11/9/2018, 11:59:59 PM

Background on operating system scheduling

An operating system (OS) is a program that manages all running programs (aka processes) on a computer. The OS decides which process gets to execute, and for how long. This is called process scheduling.

A batch operating system is one which runs one whole program, then runs another whole program, and so on. Batch operation is an older, simpler approach to scheduling in an operating system. A non-batch operating system is one like Windows or Linux, where multiple processes all seem to run at the same time (in reality, each process runs for a short slice of time before the OS switches to another).

An operating system is real-time if it makes guarantees about when processes will be executed (e.g. when a process will start, or when it will finish).

Operating systems use different methods for scheduling which processes are run, and when. One simple method for a batch operating system is first-come-first-served, in which the processes are ordered by their start times and run in that order, with each process run to completion. Another popular method is shortest-first, where processes are ordered by the amount of time they require (from least to greatest), and run in that order.
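To make the difference between the two orderings concrete, here is a minimal sketch. The Job struct and both function names are made up for this illustration; they are not part of the provided project code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical job record for illustration only (not the project's Process class).
struct Job {
    int id;        // position of the job in the input
    int start;     // tick at which the job is submitted
    int required;  // ticks the job needs to run to completion
};

// First-come-first-served: order by start time.
// A stable sort keeps the input order when two start times are equal.
std::vector<int> fcfsOrder(std::vector<Job> jobs) {
    std::stable_sort(jobs.begin(), jobs.end(),
                     [](const Job& a, const Job& b) { return a.start < b.start; });
    std::vector<int> ids;
    for (const Job& j : jobs) ids.push_back(j.id);
    return ids;
}

// Shortest-first: order by required time, least to greatest.
std::vector<int> shortestFirstOrder(std::vector<Job> jobs) {
    std::stable_sort(jobs.begin(), jobs.end(),
                     [](const Job& a, const Job& b) { return a.required < b.required; });
    std::vector<int> ids;
    for (const Job& j : jobs) ids.push_back(j.id);
    return ids;
}
```

For three jobs that start at ticks 0, 1, and 2 and require 8, 3, and 5 ticks, first-come-first-served runs them as ids 0, 1, 2, while shortest-first runs them as ids 1, 2, 0.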

For a real-time operating system, it is often important when a process will finish. For example, if we know that a prediction program for tomorrow's weather will take 12 hours to run, then it would be useless to start it any later than noon today because the answer would be irrelevant. Therefore, processes could be ordered by their deadlines, and the first process run is the one which must finish earliest, thereby guaranteeing it finishes on time. For example, if the OS time is measured in "ticks", and there are three processes that must finish at 25 ticks, 80 ticks, and 15 ticks (respectively), the first process run would be the one that must finish at 15 ticks. Note that 25, 80, and 15 are not the amounts of time the processes take to run; they are the actual times by which the corresponding processes must finish.
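The deadline example above can be sketched with the standard library's priority queue. This is only an illustration of earliest-deadline-first ordering; the function name is invented here, and the project itself requires your own ArrayHeap rather than std::priority_queue.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Returns the deadlines in the order an earliest-deadline-first scheduler
// would pick them. (Illustrative helper; not part of the provided code.)
std::vector<int> earliestDeadlineOrder(const std::vector<int>& deadlines) {
    // std::greater turns the default max-heap into a min-heap,
    // so top() is always the smallest (earliest) deadline.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap(
        deadlines.begin(), deadlines.end());

    std::vector<int> order;
    while (!heap.empty()) {
        order.push_back(heap.top());
        heap.pop();
    }
    return order;
}
```

Called with the deadlines 25, 80, and 15, this yields 15, 25, 80, so the process that must finish at 15 ticks is chosen first.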

Data Structures Involved

Write a program that uses a priority queue (heap) to simulate a real-time batch operating system, as described above. The simulator should start new processes running when they are ready, and update the clock when they have finished. The next process to run is always the one that has the earliest deadline (breaking ties with rules given below).

Sometimes a process may have a finishing time that is impossible to achieve. For example, if the clock is currently at 30 ticks, the next process to run must finish by 50 ticks, and it requires 40 ticks to run, then the process cannot finish on time (30 + 40 = 70, which is later than 50). In such a case, the process is simply skipped.
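The feasibility test amounts to one comparison. A minimal sketch follows; the function name is an assumption for illustration. It treats finishing exactly at the deadline as on time, which matches the sample simulation later in this document, where a process started at tick 15 with deadline 20 and required time 5 is run.

```cpp
// True when a process started at `clock` would finish no later than `deadline`.
// (Hypothetical helper; not part of the provided code.)
bool canFinishOnTime(int clock, int deadline, int required) {
    return clock + required <= deadline;
}
```

With the numbers from the example, canFinishOnTime(30, 50, 40) is false, so that process would be skipped.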

There are two abstract data types for this project: ArrayHeap and Process. Each Process has several attributes:

The ArrayHeap is used to store Process objects, ordered by deadline. In the case where two Process objects have the same deadline, they should be ordered by amount of time required to run (ordered least to greatest). If they are still the same, then they should be ordered by their process id (ordered least to greatest).
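One way to express this three-level ordering is a lexicographic comparison. The sketch below uses a stand-in struct whose name and fields are assumptions; the real Process class is declared in the provided .h files and may look different.

```cpp
#include <tuple>

// Hypothetical stand-in for the project's Process class.
struct ProcessSketch {
    int id;        // process id
    int deadline;  // tick by which the process must finish
    int required;  // ticks the process needs to run
};

// a < b when a should leave the heap before b:
// earlier deadline first, then shorter required time, then smaller id.
bool operator<(const ProcessSketch& a, const ProcessSketch& b) {
    return std::tie(a.deadline, a.required, a.id) <
           std::tie(b.deadline, b.required, b.id);
}
```

std::tie builds tuples of references, and tuple comparison checks the fields left to right, so later fields only matter when all earlier ones are equal.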

Simulation Details

At the beginning of the simulation, the system clock starts at 0 ticks. At any time, the simulator should have in the heap all processes that have already started (i.e. their start time is less than or equal to the system clock). However, processes that haven't started yet shouldn't be on the heap! The simulator then attempts to run a process, in the order discussed above. If a process is next to be run but cannot finish in time, it is skipped. This continues until there are no processes left to run. The system clock increments by the required run time of a process when that process is run, or by 1 tick when any process is skipped.
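The loop described above might be sketched as follows. This is not the required solution: it uses std::priority_queue in place of your ArrayHeap, the Proc and Summary structs are invented for the illustration, and it assumes that when no process is ready the clock moves forward to the next start time.

```cpp
#include <cstddef>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical record; field names are assumptions, not the provided Process class.
struct Proc {
    int id, start, deadline, required;
};

// Comparator for std::priority_queue so that top() has the earliest deadline,
// then the shortest required time, then the smallest id.
struct LaterFirst {
    bool operator()(const Proc& a, const Proc& b) const {
        return std::tie(a.deadline, a.required, a.id) >
               std::tie(b.deadline, b.required, b.id);
    }
};

struct Summary {
    int finalClock, run, skipped;
};

// `procs` must already be ordered by start time, as the input guarantees.
Summary simulate(const std::vector<Proc>& procs) {
    std::priority_queue<Proc, std::vector<Proc>, LaterFirst> heap;
    Summary s{0, 0, 0};
    std::size_t next = 0;  // index of the first process not yet on the heap

    while (next < procs.size() || !heap.empty()) {
        // Assumption: with nothing ready, the clock moves to the next start time.
        if (heap.empty() && procs[next].start > s.finalClock)
            s.finalClock = procs[next].start;

        // Admit every process whose start time has been reached.
        while (next < procs.size() && procs[next].start <= s.finalClock)
            heap.push(procs[next++]);

        Proc p = heap.top();
        heap.pop();
        if (s.finalClock + p.required <= p.deadline) {
            s.finalClock += p.required;  // run to completion
            ++s.run;
        } else {
            s.finalClock += 1;  // a skipped process costs one tick
            ++s.skipped;
        }
    }
    return s;
}
```

A real driver would also print the per-process output lines inside the loop; they are left out here to keep the control flow visible.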

The first line of input is the number of processes n, where 1 ≤ n ≤ 10,000. The remaining n lines each describe one process. Each line describing a process has three integers s, d, and r and a text description i. They are as follows:

The processes are guaranteed to come in order of submission time.

Sample simulation

Here is a demonstration of the simulator. The sample input is:

5
10 20 5 hello there
11 20 5 how are you
12 20 5 i am fine
13 20 5 i am glad to hear that
14 30 5 goodbye

Note that a single space separates the required-time field from the information field. Skip this space when reading the input; it is not part of the information field. Any spaces after it do belong to the information.
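Reading one process line while skipping exactly one separating space could look like the sketch below. The InputLine struct and both function names are assumptions made for this illustration, not part of the provided code.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for the fields of one input line.
struct InputLine {
    int start, deadline, required;
    std::string info;
};

// Reads three integers, discards one separating space if present,
// then takes the rest of the line (inner spaces included) as the information.
bool readProcessLine(std::istream& in, InputLine& out) {
    if (!(in >> out.start >> out.deadline >> out.required)) return false;
    if (in.peek() == ' ') in.get();  // skip only the single separator
    std::getline(in, out.info);
    return true;
}

// Convenience wrapper for parsing a line that is already in a string.
bool parseProcessLine(const std::string& line, InputLine& out) {
    std::istringstream in(line);
    return readProcessLine(in, out);
}
```

Using operator>> for the integers and std::getline for the remainder keeps the spaces inside the information text, which a plain `in >> info` would lose.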

Here is a description of the simulator as it processes this input:

The sample output is:

running process id 0 at 10
hello there
running process id 1 at 15
how are you
skipping process id 2 at 20
skipping process id 3 at 21
running process id 4 at 22
goodbye
final clock is                 27
number of processes run is     3
number of processes skipped is 2

If a process is run, its output has two lines:

If a process is skipped, it has just one line of output: the simulator reports that the process is being skipped, giving the process's id and the system clock at the time it is skipped.

After simulation of all processes completes, the simulator reports the final system clock, the number of processes run, and the number of processes skipped.

Some additional notes

Here are some tips that should help you along the way. Read these after you have read the rest of the document.

Sample input and output

The inputs will always be ordered by start time, so you can read them in one at a time, without sorting by start time. They should be read in the given order. Some processes may have the same starting time.

Here are several sample inputs and outputs.

Sample executables

When you design test cases, you can judge your output against the output from my correct solution.

As in previous projects, if you give a command-line argument to these executables, they will print extra information about how they are running.

Provided code

You must use the .h files provided here. As before, put your code for the ArrayHeap in the student file, not in the prof file. You may add code to the prof file for debugging purposes, but that file will not be submitted.

Remember that when using templates, all of the code you write goes in the .h file. So you will turn in 3 files for this project: arrayheap-student-proj5.h, process-proj5.cpp, and your driver (a .cpp file). Your driver should #include arrayheap-student-proj5.h, and it should #include the corresponding prof file (which it already does).

Structuring the project

Since this is a large project, it helps to have a plan of attack. The following milestones should be turned in via the upload site.

Step Finish by Milestone
1. Tuesday, October 30 (by noon) WRITE and thoroughly TEST the Process class, and the following methods for the ArrayHeap: default constructor, copy constructor, destructor, getMin, getNumItems, bubbleUp, and bubbleDown.
2. Tuesday, November 6 (by noon) WRITE and thoroughly TEST the insert, removeMin, doubleCapacity, and operator= methods for the ArrayHeap.
3. Thursday, November 8 (by noon) WRITE and thoroughly TEST the main driver. Write and solve by hand several test case inputs and outputs. Check your driver against the inputs and outputs you have developed. Finish early so that you have time to solve any remaining bugs.

Writing a test driver for a data structure means writing a small, self-contained program that tests the different methods of the data structure and verifies that they are correct. For each milestone you should develop and turn in a driver that illustrates testing your code.
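As a rough idea of what such a driver checks, here is a sketch built around a tiny stand-in heap. HeapSketch is hypothetical: it borrows some method names from the milestones, but the real ArrayHeap interface comes from the provided .h files, manages its own array, and may differ in signatures and behavior.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for ArrayHeap, backed by std::vector for brevity.
template <typename T>
class HeapSketch {
public:
    void insert(const T& item) {
        data.push_back(item);
        bubbleUp(data.size() - 1);
    }
    const T& getMin() const { return data.front(); }  // requires a non-empty heap
    void removeMin() {                                // requires a non-empty heap
        data.front() = data.back();
        data.pop_back();
        if (!data.empty()) bubbleDown(0);
    }
    std::size_t getNumItems() const { return data.size(); }

private:
    std::vector<T> data;

    // Move the item at index i up until its parent is not larger.
    void bubbleUp(std::size_t i) {
        while (i > 0) {
            std::size_t parent = (i - 1) / 2;
            if (!(data[i] < data[parent])) break;
            std::swap(data[i], data[parent]);
            i = parent;
        }
    }

    // Move the item at index i down until neither child is smaller.
    void bubbleDown(std::size_t i) {
        for (;;) {
            std::size_t left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < data.size() && data[left] < data[smallest]) smallest = left;
            if (right < data.size() && data[right] < data[smallest]) smallest = right;
            if (smallest == i) break;
            std::swap(data[i], data[smallest]);
            i = smallest;
        }
    }
};

// One test: insert out of order (with a duplicate), then check that the
// items come back out in sorted order and the heap ends up empty.
bool heapReturnsSortedOrder() {
    HeapSketch<int> heap;
    const int input[] = {25, 80, 15, 15, 3};
    for (int x : input) heap.insert(x);
    if (heap.getNumItems() != 5) return false;

    const int expected[] = {3, 15, 15, 25, 80};
    for (int want : expected) {
        if (heap.getMin() != want) return false;
        heap.removeMin();
    }
    return heap.getNumItems() == 0;
}
```

A driver's main function would call checks like heapReturnsSortedOrder() and print which ones pass or fail. A thorough driver would also cover the copy constructor, operator=, and growth past the initial capacity.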

Final notes

Remember when writing this program to adhere to the coding style guidelines. No credit will be given for a solution which does not pass all the hidden tests I create, or does not pass in the allowed time. For more detailed instructions, read the project submission guidelines.

Copyright © 2018 Greg Hamerly.
Computer Science Department
Baylor University
