AVERAGE SEEK TIME OF A COMPUTER DISK

RESEARCH PROBLEM FOR UNDERGRADUATE STUDENTS WHICH SPANS HARDWARE ISSUES, MATHEMATICAL METHODS AND PROGRAMMING

JAN PLAZA

Computer Science Department, SUNY/Plattsburgh
101 Broad Street, Plattsburgh, NY 12901, USA
(518) 564 2781 (phone), (518) 564 3010 (fax)
http://faculty.plattsburgh.edu/jan.plaza/
jan.plaza @ plattsburgh.edu

COMMENTS WELCOME!

Abstract.
We present an example of analysis of average seek time for a computer disk drive. The example is unique in the following ways: the issue involved is of practical importance; the final version of this analysis which is done without any simplifying (unrealistic) assumptions is still accessible to undergraduate students; the analysis involves most basic and crucial mathematical concepts: operations on polynomials, solving systems of linear equations, mathematical induction, probability understood as frequency of events and organization of a mathematical proof. The paper also shows how the problem and the solution can evolve together.

Undergraduate students who have been solving textbook problems which depend on use of a single mathematical method will see, perhaps for the first time, how several methods can be combined together to solve a real world problem. It is generally understood that small but realistic research problems which touch upon a variety of issues are among most valuable components of college courses, but finding such problems which are also accessible to undergraduate students is a challenge to educators.

Analysis of the average seek time similar to ours would be a routine for a computer engineer but it has not been used in computer science education. A wider use of our example could help in rising understanding and appreciation of math among computer science and engineering majors. This paper is addressed to students and instructors in undergraduate courses on computer architecture, discrete math and programming courses. Ideas presented in this paper have been successfully used at SUNY/Plattsburgh.

COMPUTER DISK ACCESS TIME

A surface of a computer disk contains a number of concentric circles on which the information is recorded. These circles are called tracks. Every tracks is divided into the same number of segments; these segments are called blocks or sectors. Angular measures of all blocks are the same Although blocks on tracks near the edge of the disk are longer than those on tracks near the center of the disk, every block stores the same amount of information. The disk drive is equipped with an arm containing a read-write head which moves in a straight line between the edge of the disk and its center. When the head reaches the desired track, it stops; the disk continuously rotates underneath and the head can read or write a block of information on the track over which it is positioned. In response to requests from the CPU, the I/O processor and disk controller transfer the entire block to an I/O buffer in RAM. For more information see chapter 3 of [Folk,Zoelick].

The time needed to fulfill a request for a block transfer is called access time. Access time can be split into the following components.

Moving the read-write head from its current position to a position over the track where the requested block is located. The time spent on this operation is called a seek time. Of course, the seek time varies depending on the initial position of the read-write head.
Waiting until the beginning of the requested block comes under the head. The time spent on this is called a latency or rotational delay. Of course, depending on the situation, the waiting time can be as short as 0 or as long as needed for one (almost) full revolution of the disk.
Scanning the desired block. The time spent on this operation is called a transfer time. As all blocks have the same angular measure, block transfer time is constant -- it does not depend on the position of the block, track or other circumstances.

Disk manufacturers often specify the average-case or worst-case access time for their models. We will consider the best-case as well. Let us think what is represented by these terms.

	Best-case	Worst-case	Average-case
Seek Time	0	Time needed to move form the first to the last track	? (This will be investigated in the rest of this paper)
Latency	0	Time needed for one full revolution of the disk	Time needed for half of the revolution of the disk
Transfer Time	Time needed for a block to pass under the head	Time needed for a block to pass under the head	Time needed for a block to pass under the head
Total Access Time	Sum of the above	Sum of the above	Sum of the above

WHAT DOES "AVERAGE" MEAN?

It is obvious that the average-case latency is the average of the best-case latency and worst-case latency. But what does average seek time really mean? We need to develop understanding of this issue and we will be proceeding in stages before we can formulate and solve a real life version of this problem. Our initial formulations of the problem will contain simplifying (unrealistic) assumptions, but these will be removed one by one in consecutive versions.

To get to the essence of this issue, instead of considering the time we will consider the distance the read-write head needs to move. The distance between the tracks will be measured in multiples of the distance between two consecutive tracks; so, the distance between tracks i and j is |i-j| units. Assume that there are n tracks on our disk.

The case of worst (largest) distance occurs when the head needs to move from track 1 to track n, i.e. the distance of n-1 units. The best case is when no movement is necessary, i.e. the distance of 0 units. What is the average case?

Problem 1. Is the average distance traveled by the head equal to the average of the best- and worst-case distances?

To visualize the distances involved consider a (toy) disk with just five tracks. The following chart shows the distances the read-write-head must travel to move form a position above track i to a position above track j for a all pairs <i,j>. The distance between two consecutive tracks is used as a unit.

Distances		To track:
Distances		1	2	3	4	5
From track:	1	0	1	2	3	4
	2	1	0	1	2	3
	3	2	1	0	1	2
	4	3	2	1	0	1
	5	4	3	2	1	0

The worst-case distance is 4; the best-case distance is 0; their average is 2. The total distance in the 25 cases listed in the table above is 40. Assuming that each of the 25 cases is equally likely, the average distance is 40/25=1.6 . This shows that the answer to the question in Problem 1 is "no".

AVERAGE SEEK DISTANCE ASSUMING UNIFORM DISTRIBUTION OF REQUESTS

Now let us try to calculate the average distance traveled by the read-write head. Let us also specify precisely the assumptions concerning likelihood of different requests for transfer.

Problem 2. Consider a disk with n tracks per surface. Assume that the requests to move the read-write head from track i to track j are equally likely for all pairs <i,j> where i,j £ n. What is the average distance AD(n) the read-write head travels in a single move? (Assume that tracks are equally spaced and give answer as a multiple of the distance between two consecutive tracks.)

We already answered this question for n=5. To consider the question in full generality we need to deal with the following table.

Distances		To track:
Distances		1	2	3	...	n-1	n
From track:	1	0	1	2	...	n-2	n-1
	2	1	0	1	...	n-3	n-2
	3	2	1	0	...	n-4	n-3
	...	...	...	...	...	...	...
	n-1	n-2	n-3	n-4	...	0	1
	n	n-1	n-2	n-3	...	1	0

We denote the total distance traveled by the read-write head in n^2 cases listed in the table above by TD(n) and the average distance by AD(n).

So, TD(n) is the sum of the n^2 numbers in the table above and AD(n) = TD(n) / n^2.

But how to represent TD(n) by a concise mathematical expression? Well, let's first see how the function TD changes when we pass from n to n+1, for instance, from 5 to 6.

Distances		To track:
Distances		1	2	3	4	5	6
From track:	1	0	1	2	3	4	5
	2	1	0	1	2	3	4
	3	2	1	0	1	2	3
	4	3	2	1	0	1	2
	5	4	3	2	1	0	1
	6	5	4	3	2	1	0

Notice that TD(6) = TD(5) + 2(1+2+3+4+5).

In general, TD(n+1) = TD(n) + 2(1+2+...+n) = TD(n) + n(n+1).

(We have used the fact that 1+2+..+n = n(n+1)/2.) So, now we have a formula which allows to reduce the problem of calculating T(n+1) to the problem of calculating T(n). Such a formula is called a recursive formula or a recurrence. However, notice that this formula does not define function TD uniquely until we add a base case: TD(1) = 0. Now, the complete definition of TD is

(1)	TD(1) = 0
(2)	TD(n+1) = TD(n) + n(n+1)

We would be interested in knowing a formula which defines TD explicitly in terms of n, without referring to TD. Finding such formulas is the subject of the theory of recurrence equations, but as we do not know this theory, we need to manage by our own means. The nicest functions to work with are polynomials -- perhaps TD is a polynomial -- let's explore. What could be the degree of such a polynomial? Well, equation (2) above involves terms of degree two, so perhaps TD can be represented by a polynomial of a degree not bigger than three. We are going to assume that

(3)	TD(n) = an^3 + bn^2 + cn + d

and we will try to find values of coefficients a, b, c, and d. (If this fails, we will be back where we were.) Using the recurrences (1) and (2) above we will set a system of linear equations in which a,b,c,d will be treated as variables. Let us express the left-hand side of (2) using (3):

TD(n+1) = a(n+1)^3 + b(n+1)^2 + c(n+1) + d =

a(n^3 + 3n^2 + 3^n + 1) + b(n^2+2n+1) + c(n+1) + d =

an^3 + 3an^2 + 3an + a + bn^2 + 2bn + b + cn + c + d =

an^3 + (3a+b)n^2 +(3a + 2b + c)n + (a + b + c + d)

Now let us express the right-hand side of (2) using (3):

TD(n) + n(n+1) = an^3 + bn^2 + cn + d + n(n+1) =

an^3 + (b+1)n^2 + (c+1)n + d

Do you remember from your calculus class a theorem which says that if for all values of the argument two polynomials have the same values, then the polynomials must have the same coefficients? (Would you be able to prove this theorem?) By this theorem we obtain the following equations.

(4i)	3a + b = b + 1
(4ii)	3a + 2b + c = c + 1
(4iii)	a + b + c + d = d

So, this is a system of 3 equations with 4 variables -- we need one more equation. The fourth equation will be obtained from (1) and (3):

0 = TD(1) = a* 1^3 + b* 1^2 + c* 1 + d = a + b + c + d

i.e.

(4iv)

a + b + c + d = 0

Solve equations (4i) - (4iv). You should obtain a = 1 ¤ 3, b = 0, c = -1 ¤ 3, d = 0.
So, TD(n) = (n^3 - n) ¤ 3. Or, is it so, really? Have we proved that? No! Remember assumption (3)? We have proved just a conditional statement:

If TD(n) is a polynomial of degree not bigger than three then this polynomial must be (n^3 - n) ¤ 3.

So, (n^3 - n) ¤ 3 is a candidate for the expression which could represent TD(n) but there is no certainty that this expression is correct. We leave it to the Reader to verify that the function TD(n) defined as (n^3 - n) ¤ 3 satisfies equations (1) and (2). (The Reader could try to prove a more general and more challenging result: if there exists a polynomial p(n) of degree k such that for every n, f(n+1) = f(n) + p(n) then f(n), treated as a function over integers, is a polynomial of degree k+1.) Once a proof has been completed we know that

(5)	TD(n) = (n^3 - n) ¤ 3 = (n-1)n(n+1) ¤ 3

So, the average distance traveled by the read-write head on a disk with n tracks under assumptions of Problem 2 is

(6)	AD(n) = TD(n) ¤ n^2 = (n-1)(n+1) ¤ 3n = (n^2 - 1) / 3n = (n - 1/n) / 3

For big values of n, the following approximate formula will be valid.

(7)	AD(n) » n ¤ 3

This suggests that it is 3, not 2, that is the magic number of computer disk average seek time!

ALTERNATIVE CALCULATION

The Reader could approach Problem 2 in a different way. The lower triangle in the table with distances traveled by the read-write head on a disk with n tracks contains n-1 ones, n-2 twos, n-3 threes, etc. So

TD(n) = 2* [ (n-1)*1 + (n-2)*2 + ... +1*(n-1) ] =

2 * [ (n*1 + n*2 + ... + n*(n-1)) - (1*1 + 2*2 + ... + (n-1)*(n-1)) ] =

2 * [ n*(1 + 2 + ... + (n-1)) - (1*1 + 2*2 + ... + (n-1)*(n-1)) ]

The formula for 1 + 2 + ... + k is well known. On the other hand, hardly anybody knows or remembers the formula for 1*1 + 2*2 + ... + k*k. If we knew this formula we could finish the calculation of TD(n) and then obtain AD(n). The Reader can find this formula using steps analogous to those in the previous section: assume that 1*1 + 2*2 + ... + k*k can be represented by a polynomial of degree not bigger than 3, set a system of linear equations and solve them to find the coefficients of a candidate polynomial, then prove by mathematical induction that the polynomial is correct. We leave this as an exercise.

The sum of numbers in the lower triangle can be also obtained by adding the numbers in its every row and then adding obtained sums together. As the sum of numbers in a row of the lower triangle can be represented as a binomial coefficient, obtaining the final result can be an exercise in manipulating binomial coefficients.

AVERAGE SEEK DISTANCE ASSUMING NON-UNIFORM DISTRIBUTION OF REQUESTS

In some situations one cannot assume (as we did in the previous sections) that all the moves of the read-write head from track i to track j are equally likely. If the computer is used by a single user who has just one file open, the read-write head is more likely to stay on the same track or to move to the next track than to move anywhere else. Let us formulate a more general version of our problem which will cover also this case.

Problem 3. Consider a disk with n tracks per surface. Assume that the probability that the next request for data transfer involves the same track is p0. Assume that the probability that the next request requires the head to move from track k to track k+1 (where k<n) is p1. What is the average distance AD(n,p0,p1) the read-write head travels in a single move? (Assume that tracks are equally spaced and give answer as a multiple of the distance between two consecutive tracks.)

DISTANCES		To track:
DISTANCES		1	2	3	...	n-1	n
From track:	1	0	1	2	...	n-2	n-1
	2	1	0	1	...	n-3	n-2
	3	2	1	0	...	n-4	n-3
	...	...	...	...	...	...	...
	n-1	n-2	n-3	n-4	...	0	1
	n	n-1	n-2	n-3	...	1	0

Requests represented on the main diagonal (zeros) occur with probability p0; requests represented just above the main diagonal (ones) occur with probability p1; the remaining requests occur with probability 1-p0-p1. How to handle these assumptions of probability? Let us think about frequency of events. Let R be a number of requests for data transfers; think that R is big. According to the assumptions of the problem among our R requests there will be roughly

R*p0 requests for data from the current track,
R*p1 requests for data from the next track,
R*(1-p0-p1) requests for data from a yet another track.

The average length of a move is

0 -- for requests involving the same track,
1 -- for requests involving the next track,
( TD(n) - (n-1) ) ¤ ( n^2 - n - (n-1) ) -- for other requests

where TD(n) has been calculated in the previous problem. (Look at the table above and think why!) So, the total length of R moves is

R* p0*0 +
R* p1*1 +
R* (1-p0-p1) * ( TD(n) - (n-1) ) ¤ ( n^2 - n - (n-1) )

And the average length of a move can be obtained by dividing this expression by R:

p0* 0 +
p1* 1 +
(1-p0-p1) * ( TD(n) - (n-1) ) ¤ ( n^2 - n - (n-1) )

After substituting here the expression (5) for TD(n) and some simplifications, the Reader should obtain:

(8)	AD(n,p0,p1) = p1 + (1-p0-p1)* (n^2 + n -3) ¤ (3(n-1))

The Reader should not only carry out calculations leading to this formula, but also check consistency of this formula with the formula (6) for AD(n) -- Problem 1 is a particular case of the Problem 2 with p0 = n ¤ n^2 = 1 ¤ n and p1 = (n-1) ¤ n^2, so substituting these values in formula (8) should produce a formula equivalent to (6).

Finally, let us also derive from (8) a simplified, approximate formula which can be used for big values of n.

(9)	AD(n,p0,p1) » (1-p0-p1) * n ¤ 3

AVERAGE SEEK TIME -- REAL WORLD VERSION

In the real world, customers of disk manufacturers do not ask about the average distance traveled by the read-write head, they ask about the average seek time. Transition from the calculation of distance to the calculation of time is not immediate -- the seek time is not proportional to the distance traveled by the head. In every move, a certain startup time needs to elapse before the mechanism can overcome inertia and only after that the head moves at a constant speed. With the addition of this detail we obtain a real world version of our problem.

Problem 4. Consider a disk with n tracks per surface. Assume that the probability that the next request for data transfer involves the same track is p0. Assume that the probability that the next request requires the head to move from track k to track k+1 (where k<n) is p1. Assume that the time needed to move the read-write head from track i to j (where i¹j) is t0 + |i-j|*t1 where t0 and t1 are constants, (t0 is much bigger than t1). What is the average seek time AT(n,p0,p1,t0,t1) ?

We will continue the reasoning from the previous version of the problem. Recall that R stands for the number of requests for data transfers. The total distance the head moves in these R requests is R*AD(n,p0,p1). In R*(1-p0) of these requests the head does not stay on the current track and the startup time must be accounted for. The total time spend on head movements in these R requests can be split into R*(1-p0) startups, which take R*(1-p0)*t0 time, and the time spent on moving at the constant speed over the total distance: R*AD(n,p0,p1)*t1. So, the total time used for head movements in these R requests is

R*(1-p0)*t0 + R*AD(n,p0,p1)*t1

To obtain the average seek time we need to divide this by R:

AT(n,p0,p1,t0,t1) = (1-p0)*t0 + AD(n,p0,p1)*t1

Now, it remains to substitute the expression (8) for AD(n,p0,p1)

(10)	AT(n,p0,p1,t0,t1) = (1-p0)t0 + p1t1 + (1-p0-p1)t1(n^2 + n -3) ¤ (3(n-1))

The reader should notice that the previous problem was a particular case of the current one, with t0=0 and t1=1, and substituting these values in formula (10) should give a formula equivalent to (8).

Finally, let us also derive from (10) a simplified, approximate formula which can be used for big values of n.

(11)	AT(n,p0,p1,t0,t1) » (1-p0)t0 + (1-p0-p1)t1*n ¤ 3

In this approximate formula we left the constant term (1-p0)*t0 because t0 is much bigger than t1 causing that this term can substantially affect the resulting value if n is big but not huge. (In real computer disks n is of the order of 100 or 1000 and t0 can be 50 times bigger than t1.)

USES IN A FIRST PROGRAMMING COURSE

The problems presented in this paper can be solved by means of computational experiments instead of mathematical calculations and they are appropriate for first programming course. They can be presented as a series of programming exercises corresponding to problems 2, 3 and 4 above. We will refer to them as Exercises 2', 3' and 4'. The instructor is expected to make an introduction and explain the relation of these programming exercises to the average seek time of a computer disk, including a discussion of our Problem 1. Exercise 2' involves nested for-loops; Exercises 3' and 4' involve, in addition, a random number generator. Exercises 2' and 3' are introductory and can be given during a lab session while Exercise 4' is the ultimate goal, and if there was not enough time in the lab it can be finished by students on their own.

USES IN HIGHER-LEVEL COURSES

Solutions to problems 2, 3 and 4 presented above assume the First-Come-First-Served (FCFS) algorithm for scheduling disk arm movements: the requests for block transfers are served in the order they were received. There is a more efficient disk arm scheduling algorithm, known as LOOK (a variant of SCAN or ELEVATOR algorithm): the head keeps moving in one direction until it visits all tracks for which there are requests in the current pool, then it changes direction and continues in the same manner, and so on. For more information on related algorithms see chapter 13 of [Silberschatz,Galvin].

We can assume that placements of requests form a Poisson process, i.e. that the time between two requests has an exponential distribution. Theoretical analysis of the average seek time under the LOOK algorithm belongs to queuing theory and could be included in a graduate course on probability. A solution through a discrete simulation is accessible to undergraduate students -- this is a excellent problem for an advanced undergraduate course on object-oriented programming. Such a programming simulation should report not only average seek time but also the average waiting time for a completion of the request. For discussion of results of similar simulations see [Teorey,Pinkerton].

Let us also mention that section 5.4.9 of [Knuth-3] and section 3.2.1 of [Wiederhold] contain other calculations related to the problems of this paper -- this material as well as papers listed in [Silberschatz,Galvin] can be used as basis for individual research of students.

CONCLUSION

We presented a topic which spans issues of hardware, mathematical methods and programming. It can be developed in a series of increasingly difficult problems. The final version is a real world problem, without simplifying (unrealistic) assumptions, but it is still accessible to undergraduate students. Our experience at SUNY/Plattsburgh shows that problems with such characteristics attract special interest of students and are especially effective in increasing understanding and appreciation of the subject.

ACKNOWLEDGEMENTS

Programming problems were prepared by Prof. S. Denenberg and used in his course at SUNY/Plattsburgh. The author is grateful to him and to Prof. L. Fairchild for their comments.

REFERENCES

[Folk,Zoelick] Folk, M. and Zoelick, B., File Structures, second edition, Addison Wesley, 1992.

[Knuth-3] Knuth, D., The Art of Computer Programming, Volume 3, Sorting and Searching, Addison Wesley, 1973.

[Silberschatz,Galvin] Silberschatz A., Galvin, P.B., Operating System Concepts, fifth edition, Addison Wesley, 1998.

[Teorey,Pinkerton] Teorey, Pinkerton, 1972.

[Wiederhold] Wiederhold, G., File Organization for Database Design, McGraw-Hill, 1987.