Study questions, Theory Area
MS Comprehensive In CS


Back to MS Comps home page

T 400

     Consider the problem of storing the maximum number  of  files  on  two
disks  (where  no  file occupies more than one disk).  Prove it is NP-hard.
Note:  there will be a part of the proof specific to the problem,  but  you
will also need to discuss what it means for a problem to be NP-hard.

T 401

     Merge sort is an algorithm that sorts an array of keys by breaking the
array  in  half,  recursively sorting each half, and merging the two sorted
halves.


     1.  (25%) write pseudo-code for merge sort.  Be sure to  include  code
         for the merging of the two sorted halves.

     2.  Analyze the best  (25%),  worst  (25%),  and  average  (25%)  case
         performance  of  merge sort (in terms of the number of comparisons
         of keys).  Be as rigorous as possible.

T 402

     A) What is a "correspondence system" and what is Post's Correspondence
Problem?

     B) Given a context-free grammar G,  and  a  string  w  over  the  same
alphabet, is it decidable whether there is at least one derivation of w in G?

     C) Given two context free grammars G1 and G2, is it decidable  whether
the  language  they  describe,  L(G1)  and  L(G2), have no strings in common
(i.e., they are disjoint)?

     Please show your reasoning.

T 403

     Determine whether it is possible to build machines which  will  always
answer  (decide)  each  of the following problems.  Explain why or why not.
If a problem is undecidable, show if you can that it is equivalent  to  the
halting problem.

     A) Given a number y and a primitive recursive  function  f(x1,...,xk),
is there a value of (x1,...,xk) such that f(x1,...,xk) = y?

     B) Given a number x and a one-place primitive recursive function f(x),
is there any number y such that f(x) = y?

     C) Given a Turing Machine T and an input string w, does T ever write a
non-blank symbol on the tape?

     D) Given two Turing Machines T1 and  T2,  do  they  accept  complement
languages?   (For  each  possible input string either T1 halts or T2 halts,
but they do not both halt.)

T 404

A)  Show that the function m(x,y) isprimitive 
    recursive, where m(x,y) computes the difference
    y - x when y > x, and zero otherwise.

    Define this function rigorously, using only the fundamental
    concepts of the primitive recursive functions.



B)  If f(x,y) is a primitive recursive function, show that g(x,y)
    is also primitive recursive, where:

             g(x,y) = f(f(f(...f(f(x,y),y-1),...2),1),0)

    In other words:

         g(x,0) = f(x,0)
         g(x,1) = f(f(x,1),0)
         g(x,2) = f(f(f(x,2),1),0)
         etc.

C)  The following definition of function Z(x,y) is not primitive
    recursive, however it turns out the Z(x,y) is the same as a 
    familiar primitive recursive predicate.  Name that predicate.

          Z(0,y)    = 1
          Z(x',0)   = 0       {x' is the successor of x}
          Z(x',y')  = Z(x,y)

D)  The following function W(x,y) may or may not be equivalent to
    a familiar primitive recursive function.  Decide whether it 
    is or isn't, and explain why.

         W(x,x)  = x
         W(x,y') = W(x,y)'

T 405

For each of the following languages over the alphabet {0,1,2}
state (10% apiece) and prove (15% apiece) whether or not it
is regular and whether or not it is context free:

     L1 = { w | the base 3 interpretation of w is divisible by 2}.

     L2 = { w | w has an equal number of 0's, 1's and 2's}.

     L3 = { w | w has an even number of 0's and 1's or an even
            number of 0's and 2's or an even number of 1's and 2's}.

     L4 = L1 - L3

T 406

Removed

T 407

Topic :  Simulation

1.  What is trace-driven simulation?  State advantages and disadvantages.

2.  What is self-driven simulation?  Satae advantages and disadvantages.

3.  What types of techniques do you know to generate input for  self-driven
simulation?

4.  Explain synchronous time-control using a poisson arrival  process  with
Fx=1-e**at as interarrival time.  a is the average arrival rate.

5.  Explain asynchronous time control for a single server  queue  with  job
types 1,...,r

     (a) self-driven, with interarrival time distribution Fx(x)

     and service time distribution Fs(s)

     (b) trace-driven

     In either case, how do you represent the event list?  How many

     events do you need to keep in your list?  On what does that

     depend?

T 408

1.  You are considering using a design with 3 identical processors
    and a common queue. Their average service rate is 1/s where 
    s=2min=1/30 hr. Three job streams arrive independently with 
    respective average arrival rates 1[1]=20, 1[2]=15, 1[3]=10.

    (a)  Compute the expected response time  E[Rc]  for  this  system  with
         three  servers  connected  to  a  common  queue  when arrivals are
         poisson and service distributions are  negative  exponential  with
         the  parameters  given  above.   Use E[Nc], the expected number of
         customers in the system for the Common queue  (C)  and  p(0),  the
         probability  of the system being idle, and Little's Law to compute
         E[Rc].

    (b)  Jobs of  type  3  (those  with  arrival  rate  1[3]=10)  are  very
         important.   You are considering dedicating one processor to them.
         Will that improve their average response time?  Also  compute  the
         average response time of the other two job streams which share the
         remaining 2 processors via a common queue.


T 409

     A) Define Goedelization, and  explain  its  purpose  in  computability
theory.

     B) Show a  scheme  for  Goedelizing  strings.   Assuming  an  alphabet
consisting  of  the  symbols  {a,b,c},  compute the Godel number under your
scheme for the string "cba".

     C) Show a scheme for Goedelizing Turing Machines.

T 410

Let L(n,k,m) = { w | the base n interpretation of w = k (mod m)}.  L(n,k,m)
is a regular language for all integers n >= 2, m >= 1, and 0 <= k < m.

     A.   Explain  this  fact  by  describing  a  general   procedure   for
constructing a deterministic finite automaton that recognizes L(n,k,m).

     B.  Demonstrate your procedure with the construction  of  a  DFA  that
recognizes L(3,1,5).

T 411

A.  If asked whether or not a language is regular and whether or not it  is
context-free, there are only three possibilities (rather than four).  Which
combination of answers is impossible?  Explain why.

B.  For each of these languages, prove whether it is regular and whether it
is context-free.


     L1 = { w | w contains more a's than b's AND more b's than c's }

     L2 = { w | w contains more a's than b's OR more b's than c's }

     L3 = { w | w = some c's followed by some b's followed by some a's }

T 412


     Let R and S be infinite languages over a finite alphabet.

     1) Assume R is decidable.  Does it have  any  subsets  which  are  NOT
decidable?

     2) Assume R is decidable and S is acceptable.  Is  their  intersection
acceptable?

     Prove your answers.  Note on equivalent jargon:

     "Decidable" is the same as "recursive"

     "Acceptable" is the same as "recursively enumerable"

T 413

     1)  What are the primitive recursive functions?

     2)  How  are  the  mu-recursive  functions   (or   general   recursive
         functions) different from the primitive recursive functions?


     3)  Show that the set of primitive recursive functions  is  countable,
         and briefly sketch some way to actually enumerate the set.

T 414

Suppose I produce a new computational model M, which I claim to be equal 
in power to Turing Machines. Without knowing any of the details of M, answer
the following questions:

1)  Should it be possible to write a simulator in M, i. e. an M-program which 
    is capable of simulating the operation of all other M-programs? Be sure 
    to explain your reasons.

2)  How would you approach a proof (or disproof) of my assertion that model M
    is as powerful as, but not more powerful than, the Turing Machine model?

3)  Describe an undecidable set of M-programs.

4)  Does there exist a one-to-one mapping of M-programs to a set of strings?
    Why or why not?

T 415


     State the halting problem for Turing Machines, and prove  that  it  is
undecidable.

     You may use any other well-known model of computation if you wish.

T 416

     1) Show that the mu recursive functions comprise a countably  infinite
set.

     2) Find some way to associate a unique  natural  number  i  with  each
definition of a mu recursive function.

T 417


     Please state whether the following problems about Turing Machines  are
decidable  or undecidable.  If the problem is undecidable, show a reduction
from the halting problem.  If decidable, show how.

     1) Given a Turing Machine M, an input tape w, and a number k, does the
machine's  head  ever  move  more  than  k tape positions from its starting
point?

     2) Given a Turing Machine M and a blank input tape, does  the  machine
ever reach a halted state?

T 418

Removed

T 419

Part A (50%)
Consider the attribute severity of software defects and
the following empirical relation system R=(R1,R2,R3,R4,R5):
R1: is critical defect
R2: is major defect
R3: is minor defect
R4: is cosmetic defect
R5: is more severe than
  Our understanding of relation R5 is that:

all critical defects are more severe than all major, minor, and
cosmetic defects,

all major defects are more severe than all minor and cosmetic 
defects,

all minor defects are more severe than all cosmetic defects.

a.  Determine the scale type for this relation system R.
Explain your answer.
b. Add new relations to the relation system R so that the
scale type for the severity is ratio.  Explain your
answer.

Part B: (50%)
a.  Compare McCabe Complexity and Lines of Code.  What are
the advantages and disadvantages of each of these measures?


b.  Consider the attribute "length" of programs.  Show that the 
"Number of Lines of Code" is not an absolute scale measure
of program length.

T 420

a. (50%)
One method for evaluating software reliability is to use
Mill's seeding approach.  In this method some faults are
seeded in the program, and the reliability is assessed
based on how many of these seeeded faults are detected
during testing.  Develop a simple reliablity model
based on this approach.  Define your parameters, and give
a formula for estimating the reliability and the number
of faults remaining in the system.  What are the drawbacks
and limitations of this seeding model?  What are the
assumptions about the seeded faults?

b. (50%)
Supppose that you want to find if there is a coorelation
between (1) complexity and reliability, and (2) size
and reliability.  What data will you collect during and
after termination of a project?  How can this data be
obtained?  How do you determine whether there exists
correlation between these attributes?

T 421

Please state whether the following are true or false.
Please give a reason or a proof, or show a counterexample.

a)  Every regular language is also context-free.

b)  A subset of a context-free language must be context-free.

c)  The intersection of two regular languages may not be regular.

d)  The intersection of a regular language and a context-free 
    language is sometimes context-free but not regular

e)  The complement of a context-free language must be context-free.

           *     *     *  *
f)  (r + s)  =  r  ( rs )

T 422

Please state whether the following are true or false.
Please give a reason or a proof, or show a counterexample.

a)  Not every regular language is also context-free.

b)  A subset of a regular language might not be context-free

c)  The intersection of two context-free languages must be
    context-free.

d)  The union of a regular language and context-free language 
    might not be context-free.

                   k             
e)  The language {a  | where k is not a perfect square} is regular.

T 423

a)  Describe the Attribute Oriented Induction Learning Technique.

b)  What is the role of the "concept hierarchy" in the  Attribute 
Oriented  Induction Learning Technique?  

c) Illustrate how bias can be introduced because of the choice of 
concept hierarchy.

d)  How can the designer control the number of facts produced  by 
this technique? 

e)  Explain the difference between a "characteristic rule" and  a 
"classification rule"?

T 424

     1.  Define an absolute approximation algorithm for a problem class P.


     2.  Define a polynomial time approximation scheme for a problem  class
P.


     3.  Give an example of a problem which has no  absolute  approximation
time   algorithm   (assuming   P<>NP)  but  does  have  a  polynomial  time
approximation scheme.

4.  Prove part 3.

T 425

1.  Carefully define NP-hard, NP-complete.

2.  Explain chromatic number for a graph.

3.  Show (assuming Cook's  Theorem)  that  the  chromatic  number  decision
problem is NP-complete.

4.  Show that the halting problem is NP-hard, and explain  why  it  is  not
NP-complete.

T 426

     There is a proof technique used to show the solution given by a greedy
method algorithm is optimal.  Describe the proof technique.

     State a greedy method algorithm to find an  optimal  solution  to  the
problem  of  storing  the maximum number of files on a disk and prove it is
optimal.

T 427

a)   Describe the Ji/Carlson Algorithm for Relation-To-NER Schema 
Conversion.

b)  Illustrate an application of the algorithm.

c)  Describe the heuristics used by this algorithm in the construction 
of an NER schema

T 428

 A. Define for a program P, precondition pre(x) and post
    condition post(x,z)

    1. P is incorrect with respect to pre(x) and post(x,z)
    2. P is totally correct with respect to pre(x) and post(x,z)
    3. P is partially correct with respect to pre(x) and post(x,z)

 B. Given the following program segment, complete the assignment
    such that the program segment will meet its specification.
    Explain how you derive  your solution and formally prove
    that you are right.

                           Y2      X2
     { Y2 > 0  AND  Y3 * Y1    = X1    }

     if even(Y2) then (Y1,Y2) <--- (Y1*Y1,? )
                 else (Y2, Y3) <--- (Y2-1,? )

              Y2     X2
                                                               


     { Y3 * Y1   = X1    }

     NOTE : If you prefer to use Floyd method then first
            represent the program segment in a flowchart.


T 429


Weakest Precondition Semantics

A.  Define each of the following rules and give a graphical representation.

     a)  Law of the Excluded Miracle

     b)  Distributivity of Conjunction

     c)  Law of Monotonicity

     d)  Distributivity of Disjunction.

B.  For commands S1, S2 and a post condition Q

     define WP(S1;S2, q)

Show that your definition satisfies the four rules.


T 430


     A cookie jar contains some (not zero) black cookies and white cookies.
The following process is to be repeated as long as possible.

     Randomly select two cookies from the jar.  If they are the same color,
throw them out, but put another black cookie in (enough extra black cookies
are available to do this).  If they are different colors, place  the  white
one back into the jar and throw the black one away.

     a)  Describe the above algorithm using a pseudo code (you may wish  to
         use  a  nondeterministic construct to express the random selection
         of cookies).

     b)  Give Hoare's verification rules for all  the  language  constructs
         you use in your algorithm (including the nondeterministic one).

     c)  What can you say about the color of the last cookie  left  in  the
         jar  with respect to the initial number of white cookies and black
         cookies?

     d)  Use Hoare's verification rules given in part B to  formally  prove
         your solution.


 

T 431


     Given the following formal specification:

     Input specification:  X>0 and N>0

     Output specification:  Z=X**N (Z is equal to X to the power of N)

     a.  Formally define what it means  for  a  program  P  to  be  totally
correct with respect to this specification.

     b.  List all the properties of  the  power  function  and  suggest  an
invariant  to direct the design of a program P totally correct with respect
to the specification.  Make the complexity of your program 0(log N).



     c.  Prove that your program is totally correct  with  respect  to  the
specification.

You may use either pseudocode and Hoare method,  or  use  a  flowchart  and
flyoed method.


T 432


The predicate transformer WP

"If we are to define a programming notation using the concept of  WP,  then
we  had  better  be  sure  that WP is well-behaved.  By this I mean that we
should be able to  define  reasonable,  implementable  commands  using  WP.
Furthermore, it would be nice if unimplementable commands would be rejected
from consideration" [David Gries "The Science of Programming"].


     1.  Define WP with respect to a program P and a post condition Q.

     2.  List the four properties of WP.

     3.  Define WP(S1;S2;Q).

     4.  Show that your definition satisfies the four properties  given  in
         b.

     5.  Step by step compute the following WP.
         WP (a:=b; b:=c; c:=d; d:=a, a<b<c<d)
         Simplify your result.



T 433


A. Give Hoare verification rules for:
     Assignment command: y:=g(x,y)

     Conditional commands: if B then S
                           if B then S1 else s2

      While command: while B do s

      Concatenation command: S1;S2

      Consequence rules: Weakening the post condition
                         Strengthening the precondition.

For each rule denote your precondition by P and your
postcondition by Q.

B. Invent a verification rule for the following CASE command

CASE
B1 ==> S1
B2 ==> S2
       :
Bn ==> Sn
END CASE

Where the semantics is:
IF B1 THEN S1 ELSE
   IF B2 THEN S2 ELSE
      ......
                       ELSE

      IF Bn THEN Sn ELSE NULL

C. Use your rule to prove

{ x=0 }

CASE
x=1 ==> y:=1
x=0 ==> y:=0
x=0 ==> y:=2
END CASE

{ y=0 }


T 434


A.  For a command S and a predicate R give the definition of WP(S, R)

B.  Give the four properties of any WP definition which enable us to define
reasonable,  implementable  commands  and  reject  from  consideration some
unimplementable commands by using the weakest precondition semantics.

C.  Give the definition of WP (DO, R)

     where IF is syntactically defined as:

                do
                                                               

                    [] B1 => S1
                    [] B2 => S2
                         .
                         .
                    [] Bn => Sn
                od

D.  Show that this definition satisfies the four rules given in section B.

E.  With respect to your definition, what is WP( do od, x=5)

F.  Is the following true:  ( WP (S, R) or WP (S, -R) ) = TRUE.  prove your
statement or give an example.

G.  For a command S and a predicate R give the definition of WLP( S, R)  in
terms of WP(S, R).


T 435


A.  Write a small program  to  compute  integer  division.   (You  can  use
pseudocode   and  use  the  Hoare  method  or,  you  can  use  a  flowchart
representation and use the Floyd method).

B.  Prove that  your  program  is  totally  correct  with  respect  to  the
following specifications:

PRECONDITION (x1, x2): x1>=0 & x2 >0
POSTCONDITION (x1, x2, z1, z2): x1=z1*x2+z2 & 0 <=z2 <x2


T 436


     Consider the following program for parts a and b, where gcd(x,  y)  is
the greatest common divisor of x and y.


     {x1 > 0 and x2 > 0}

     (y1, y2) := (x1, x2);

     {gcd(x1, x2) = gcd(y1, y2) and y1 > 0 and y2 > 0}

     do

     y1 > y2 -> y1 := y1 - y2

     or

     y1 < y2 -> y2 := y2 - y1

     od;

     {gcd(x1, x2) = gcd(y1, y2) and y1 > 0 and y2 > 0 and y1 = y2}

     z := y2

     {z = gcd(x1, x2)}

     a.  By using the Hoare Method, prove the partial  correctness  of  the
given gcd program.

     b.  Provide a bound function t for the do..od statement in  the  given
gcd program and prove its validity by applying the checklist.

T 437

PART A (50%):

Prove or disprove whether an arithmetic mean and a geometric mean are allowable operations on measures of:
a. interval scale
b. ratio scale
c. ordinal scale

PART B (50%):

Given the following data collected in a hypothetical software company:

Module #

Size (LOC)

Effort (days)

# of defects found

Module cohesion

1

407

21

5

4

2

247

17

1

1

3

230

16

9

7

4

55

3

8

5

5

445

21

2

2

6

96

5

6

4

Use regression analysis to develop two prediction systems that can be
used with a significant level of confidence. In the first prediction
system a module size should be used to estimate an effort. In the
second prediction system a module cohesion should be used to predict
the number of defects in the module. Note that the following is a list
of possible types of module cohesion and their values:

7	Coincidental cohesion
5	Temporal cohesion
4	Procedural cohesion
3	Sequential cohesion
2	Communicational cohesion
1	Data or Informational cohesion

T438

Part A (40%):
The most commonly used software quality measure in industry is the 
number of faults per thousand lines of product source code. Compare 
the usefulness of this measure for developers and users. List some 
possible problems with this measure.

Part B (60%):
Suppose you have overall responsibility for a number of ongoing 
software projects (some of which are being beta-tested). There are 
wide quality variations among the projects, and you have available the 
following measures for each project:

*	mean time failure
*	mean time to repair reported defects
*	total number of user-reported failures
*	total number of defects found during system testing
*	total number of changes made during development
*	maximum number of changes made during development
*	maximum cyclomatic number
*	average number of function points produced per month of 
		programmer effort.

Discuss the relative merits of these measures for purposes of 
comparison. Are there any other measures (that are relatively 
straightforward to collect) that might help you.

T 439

a. (20%) Explain the idea behind Albrecht's function points measure. 
b. (20%) List the main applications of function points.
c. (20%) Compare function points with lines-of-code measure.
d. (20%) What do functions points measure and what are the major drawbacks?
e. (20%) Give three examples of how you might use the function-point 
	 measure in quality control or assurance (stating in each case why it 
	 might be preferable to some simpler alternatives).

T 440

Part A (70%):
When faults are found and fixed in a program, we observe reliability 
growth. Specifically, we see a sequence of times between successive 
failures t1, t2, ..., tn which tend to increase in magnitude. Explain 
carefully the underlying failure process, identifying the sources of 
uncertainty that require us to adopt a probabilistic approach when we 
wish to predict the future failure behavior of the program.

Part B (30%):
Suppose you can remove 50% of all faults resident in an operational 
software system. What corresponding improvements would you expect 
in the reliability of the system?


T 441

What are multirate strictly nonblocking network, multirate wide-sense nonblocking network and multirate rearrangebly nonblocking network?

Consider the three-stage Clos network C(n, m, r).

Show that if each connection has weight in the interval [b, 1], then C(n, m, r) is strictly nonblocking if and only if

m >= 2 floor(1/b ) (n-1)+1

Show that if m = 3n-1, then C(n, m, r) is rearrangably nonblocking.

T 442

Consider an SS/TDMA (satellite-switched time-division multiple access) system with M uplink beams, N downlink beams, and K (1<= K <= M, N) transponders.

Given any trafic matrix D, let T be the sum of all entries in D, and C be the maximum of the line sums in D. Show that any transmission schedule length for D is at least

max(ceil( T/K ), C).

Give an algorithm to find an optimum transmission schedule for any traffic matrix D.