Author: migo (Mikhael Goikhman)
Date: 2003-Nov-25 19:15:40

Here is a question I was asked on an interview of one half-hardware,
hard-software company. The original question was not as formal as this.
The solution is pretty straightforward if you think/draw a bit.

1) Given this device,
         _______
  a1 -->|       |
        |       |--> max(a1, a2)
  a2 -->|_______|

use several such devices to compose a more complex device:
         _________________________
        |                         |
  a1 -->|                         |
        |                         |
  a2 -->|                         |
        |                         |--> max(a1, a2, a3, a4)
  a3 -->|                         |
        |                         |
  a4 -->|                         |
        |_________________________|

Think about using the minimal number of sub-devices. Try also to optimize
the execution time of this super-device.

2) Given the device you just created for max(a1, a2, a3, a4) use several
such devices (either receiving 2 inputs or 4 inputs) to build the device
receiving 8 inputs, i.e. max(a1, a2, a3, a4, a5, a6, a7, a8).

3) Generalize the solution for any 2**N. What is the execution time of a
super-device and how many sub-devices are needed to compose a super-device.


Author: bandsmer (Michael Bandsmer) Date: 2003-Nov-27 20:14:56 To find the max of 2^N inputs, you can use 2^N-1 two-input max operations, operating in time proportional to N. Just arrange the max operations in a tree. E.g. the solution for the 2^N = 4 is shown below. To go to 2^N=8, just use two copies of the 4-solution in parallel, and add another two-input max to take the maximum of the 2 halves. The extension to general 2^N is straightforward. a1 \ > max(a1,a2) / \ a2 \ > max(a1,a2,a3,a4) / a3 \ / > max(a3,a4) / a4
Author: migo (Mikhael Goikhman) Date: 2003-Nov-27 23:16:29 Hmm, did I write "max" function? I am not sure why I did this. :) The actual question was about sort(). So, please replace the schemes with the following, the rest is the same. _______ a1 -->| |--> s1 = max(a1, a2) | | a2 -->|_______|--> s2 = min(a1, a2) use several such devices to compose a more complex device: _________________________ | | a1 -->| |--> s1 = max(a1, a2, a3, a4) | | a2 -->| |--> s2 | | a3 -->| |--> s3 | | a4 -->| |--> s4 = min(a1, a2, a3, a4) |_________________________| The interesting thing is to extend this to the case of 8 inputs when you have a time pressure, i.e. should think/draw/speak in real time. :)
Author: tomc (Tom Carmichael) Date: 2003-Dec-02 20:19:29 Sounds like a merge sort to me. For 4 inputs, you can do this: (i# indicates an intermediate solution) blackbox(a1,a2) -> i1, i2 blackbox(a3,a4) -> i3, i4 blackbox(i1,i3)-> s1, i5 blackbox(i2,i4)-> i6, s4 blackbox(i5,i6)-> s2,s3 For 8, it's a bit more complicated, but it's still effectively 2 large boxes (to sort into 2 lists of 4) followed by a series of merges. If the first '4' box gives i1-i4 and the second i5-i8, then you have these comparisons: i1,i5 -> s1 & i9 i2,i6 -> i10 & i11 i3,i7 -> i12, i13 i4,i8 -> i14, s8 i9, i10 -> s2, i15 i11,i12 -> i16, i17 i13,i14 -> i18, s7 i15,i16 -> s3, i19 i17,i18 -> i20, s6 i19,i20 -> s4,s5 And so on.
Author: migo (Mikhael Goikhman) Date: 2003-Dec-03 13:54:08 tomc: yes, merge sort is the way to go for optimal execution. I initially tried bubble sort just to see how it works, it is not optimal. However, your solution for 8 inputs/outputs is not the best one. :) Hint: the solution should be nice and not complicated. Here is the solution for 4 inputs/outputs in ascii art: _______ _______ a1 -->| |--------->| |---------------------> s1 | D1 | | D1 | _______ a2 -->|_______|---. ,--->|_______|--------->| |--> s2 _______ X _______ | D1 | a3 -->| |---' `--->| |--------->|_______|--> s3 | D1 | | D1 | a4 -->|_______|--------->|_______|---------------------> s4
Author: tomc (Tom Carmichael) Date: 2003-Dec-03 14:56:38 My implementation for the merge was intentionally done not using the derived 4-box function. You can do so, sticking with the 'recursive' model, but I think you'll find that is, to some degree, less optimal. If you wish to use a big box (derived 4-box) in the solution, you can once you've reduced to 4 inputs. I would argue that's less efficient however, not more, since the reduced solution uses some pre-sorted known properties to skip the first 'step' of the 4 box solution. A simlar type of optimization may be possible on the front end of my solution, I didn't look in detail at that piece.
Author: migo (Mikhael Goikhman) Date: 2003-Dec-03 16:52:23 Let's call the device with 2^N inputs/outputs "DN". Tom: now that I think about it, you are right, we may save both time and the number of D1 devices if we use your solution. I think we may even start with D1 and not to use D2 at all, but I don't have a time now to verify whether we will end up with your solution (20 D1 used) or it is possible to improve it. Anyway, it is better than 25 D1 used in the original solution. However the original solution is the best if you count any subdevice as one and not just the smallest. It is always possible to use only 5 subdevices to implement DN. Here is the solution for D3, it pretty much mimics the solution for D2. _______ _______ a1 -->| |--------->| |---------------------> s1 | | | | a2 -->| |--------->| |---------------------> s2 | D2 | | D2 | _______ a3 -->| |--. ,-->| |--------->| |--> s3 | | \ / | | | | a4 -->|_______|--. X ,-->|_______|--------->| |--> s4 _______ X X _______ | D2 | a5 -->| |--' X '-->| |--------->| |--> s5 | | / \ | | | | a6 -->| |--' `-->| |--------->|_______|--> s6 | D2 | | D2 | a7 -->| |--------->| |---------------------> s7 | | | | a8 -->|_______|--------->|_______|---------------------> s8