- \documentclass[a4paper,%
- 11pt,%
- DIV14,
- headsepline,%
- headings=normal,
- ]{scrartcl}
- %\usepackage{ngerman}
- \usepackage[utf8]{inputenc}
- \usepackage[T1]{fontenc}
- \usepackage{textcomp}
- % stuff for getting the bold monospace font to work
- \usepackage{bold-extra}
- \usepackage[T1]{fontenc}
- %\usepackage[scaled=0.85]{beramono}
- % for matlab code
- % bw = blackwhite - optimized for print, otherwise source is colored
- %\usepackage[framed,numbered,bw]{mcode}
- % for other code
- %\usepackage{listings}
- %\usepackage[ansinew]{inputenc}
- \usepackage{color}
- \usepackage{hyperref}
- \usepackage{graphicx}
- \usepackage{listings}
- \usepackage[hang, font = footnotesize]{caption}
- %\usepackage{german}
- \usepackage{algorithm}
- \usepackage{algpseudocode}
- \usepackage{enumitem}
- \usepackage{amssymb}
- \usepackage{amsmath}
- \usepackage{mathrsfs}
- \usepackage{mathalpha}
- \usepackage{wrapfig}
- \renewcommand{\thesubsection}{Ex~\arabic{section}.\arabic{subsection}}
- \newcommand{\solution}{\textbf{Solution} }
- \lstset{basicstyle=\footnotesize\ttfamily,breaklines=true}
- \lstset{morekeywords={atomic, int, bool, return, if, for, while, do, return, switch, case, class}, morecomment=[l]//, morecomment=[s]{/*}{*/}}
- \begin{document}
- \hrule height 1px
- \vspace*{1ex}
- \begin{minipage}[t]{.45\linewidth}
- \strut\vspace*{-\baselineskip}\newline
- \includegraphics[height=.9cm]{res/Inf-Logo_black_en.pdf}
- \includegraphics[height=.9cm]{res/par-logo.pdf}
- \end{minipage}
- \hfill
- \begin{minipage}[t]{.5\linewidth}
- \flushright{
- Research Group for Parallel Computing\\%
- Faculty of Informatics\\%
- TU Wien}
- \end{minipage}
- \vspace*{1ex}
- \hrule
- \vspace*{2ex}
- \begin{center}
- {\LARGE\textbf{Advanced Multiprocessor Programming}}\\
- {\large{}%
- Summer term 2023 \\
- Theory exercise 2 of 2 \\
- \vspace*{2ex}
- Norbert Tremurici (11907086)
- %\semestershort\\
- %\assignmentname{0}}
- }
- \end{center}
- \hfill
- \begin{minipage}[b]{1\linewidth}
- \hfill
- \begin{tabular}{@{}ll@{}}
- Issue date: & 2023-04-23\\
- Due date: & \color{red}{2023-05-08 (23:59)}
- \end{tabular}
- \end{minipage}
- \vspace*{3ex}
- \hrule
- \vspace*{1ex}
- \begin{flushright}
- \textbf{\underline{Total points: 42}}
- \end{flushright}
- %===================================
- \section{Consistency Criteria} \label{sec:consistency}
- %\subsection{(1 point)} \label{ex:21}
- %Explain why quiescent consistency is compositional.
- \subsection{(2 points)} \label{ex:37_new}
- Give an example of a sequentially-consistent execution that is not safe.
- \solution Sequential consistency requires that a history $H$ is equivalent to a legal sequential history that preserves program order; that is, for every thread $A$, the subhistory $H|A$ appears in program order.
- While a safe register may return any value from its domain when a read overlaps a write, a read that overlaps no write must return the most recently written value.
- Thus we can construct an example:
- $$
- H = write_B(x = false) \to write_A(x = true) \to read_B(x == false) \to read_A(x == true)
- $$
- Note that $H|B = write_B(x = false) \to read_B(x == false)$ and $H|A = write_A(x = true) \to read_A(x == true)$ both preserve program order, so $H$ is sequentially consistent; but because $B$'s non-overlapping read returns $false$ even though $true$ has been written strictly before it, the object is not safe.
- \subsection{(2 points)} \label{ex:22}
- Consider a \emph{memory object} that encompasses two register components.
- We know that if both registers are quiescently consistent, then so is the memory.
- Does the converse hold?
- If the memory is quiescently consistent, are the individual registers quiescently consistent?
- Outline a proof, or give a counterexample.
- \solution It does not automatically hold that the individual registers are quiescently consistent.
- Consider a history $H$ in which event $Y_1 = write(y = true)$ strictly precedes $Y_2 = read(y == false)$, while $X = read(x == false)$ overlaps both of them.
- In this history, there is no period of quiescence for the memory object, because beginning from the start of the first event to the end of the last event a method call is pending.
- If we unroll the history into the objects' subhistories, we see that they are not quiescently consistent:
- $H|x = read(x == false)$ and $H|y = write(y = true) \to read(y == false)$.
- In subhistory $H|y$ there is a period of quiescence between the write and the read, so the read would have to return the previously written value $true$; since it returns $false$, the register is not quiescently consistent, even though the memory object is.
- %\subsection{(4 points)} \label{ex:23}
- %Give an example of an execution that is quiescently consistent but not sequentially consistent, and another one that is sequentially consistent, but not quiescently consistent.
- %For each of your examples provide a short proof sketch and an example execution to both show the impossibility regarding one consistency criterion and the compliance regarding the other one.
- \subsection{(4 points)} \label{ex:original_1}
- Suppose $x,y,z$ are registers with initial values $0$.
- Is the history in Figure \ref{fig:cons_crit_hist1} quiescently consistent, sequentially consistent, linearizable?
- For each consistency criterion, either provide a consistent execution or a proof sketch.
- Does there exist an initial valuation for the registers $x,y,z$ such that your above result w.r.t. linearizability changes? (Meaning if you found the history to be linearizable, can you find initial values for $x,y,z$ s.t. it no longer is linearizable; conversely if you found the history to be not linearizable, can you find initial values s.t. it becomes linearizable?)
- If so, provide the initial values and either a proof sketch or an example execution.
- \begin{figure}[H]
- \centering \includegraphics[width=16cm]{res/Consistency Criteria Ex.1 monochrome.png}
- \caption{History with five threads and three registers} \label{fig:cons_crit_hist1}
- \end{figure}
- \solution The history is not linearizable and not quiescently consistent, but sequentially consistent.
- We can easily see that it cannot be quiescently consistent if we consider the single period of quiescence in the history.
- The first block ends with event $z.read(4)$ and the second block begins with $x.write(1)$, with a period of quiescence in between.
- As the history becomes quiescent, the execution so far has to be equivalent to some sequential execution.
- We can see that event $z.read(4)$ is impossible, as $z$ starts with initial value $0$.
- We would need to move the event $z.write(4)$ before this read, which is not allowed across the period of quiescence.
- It follows that the history cannot be linearizable either; indeed, under linearizability already the first reads are impossible, because registers $x, y, z$ have initial values $0$.
- However, if we allow initial values $x = 1$, $y = 3$ and $z = 4$, we can find a linearization of the history.
- Figure \ref{fig:ex2-1-4-sol1} shows the unrolled sequential history, in which events have been re-ordered while program order remains undisturbed, proving sequential consistency.
- Figure \ref{fig:ex2-1-4-sol2} shows a valid linearization if the initial values are changed.
- \begin{figure}[H]
- \centering \includegraphics[width=16cm]{res/ex2-1-4-sequentialization.png}
- \caption{Unrolled sequential history} \label{fig:ex2-1-4-sol1}
- \end{figure}
- \begin{figure}[H]
- \centering \includegraphics[width=16cm]{res/ex2-1-4-linearization.png}
- \caption{Linearized history} \label{fig:ex2-1-4-sol2}
- \end{figure}
- \subsection{(4 points)} \label{ex:original_2}
- Suppose $x,y,z$ are stacks and $w$ is a FIFO queue.
- Initially the stacks and the queue are all empty.
- Is the history in Figure \ref{fig:cons_crit_hist2} quiescently consistent, sequentially consistent, linearizable?
- For each consistency criterion, either provide a consistent execution or a proof sketch.
- \begin{figure}[H]
- \centering \includegraphics[width=16cm]{res/Consistency Criteria Ex.2 monochrome.png}
- \caption{History with seven threads, three stacks, and one queue} \label{fig:cons_crit_hist2}
- \end{figure}
- \solution The history is neither sequentially consistent, nor quiescently consistent, nor linearizable.
- For a sequentially consistent execution, we cannot change the thread-local order of events.
- If we consider threads $B$ and $E$ operating on object $w$, we can see that $H|B|w = w.deq(3) \to w.deq(4)$, but $H|E|w = w.enq(4) \to w.enq(3)$.
- Values $3$ and $4$ are enqueued and dequeued only by threads $B$ and $E$.
- Because $w$ is a FIFO queue, whatever is enqueued first must be dequeued first.
- From $H|E|w$ we see that, no matter how the history is sequentialized, $4$ is enqueued before $3$; from $H|B|w$ we see that $3$ is dequeued before $4$.
- This contradicts the FIFO behaviour of $w$, so the history cannot even be sequentially consistent.
- There is no period of quiescence, so for this example sequential consistency and quiescent consistency are identical and the history is not quiescently consistent.
- By reasoning similar to why the history cannot be sequentially consistent, the history cannot be linearizable.
- \subsection{(2 points)} \label{ex:27}
- The \texttt{AtomicInteger} class (in \textbf{java.util.concurrent.atomic} package) is a container for an integer value.
- One of its methods is
- \texttt{\textbf{boolean} compareAndSet(\textbf{int} expect, \textbf{int} update)}.
- This method compares the object's current value to \texttt{expect}.
- If the values are equal, then it atomically replaces the object's value with \texttt{update} and returns \texttt{true}.
- Otherwise it leaves the object's value unchanged and returns \texttt{false}.
- This class also provides
- \texttt{\textbf{int} get()}
- which returns the object's actual value.
- Consider the FIFO queue implementation shown in Figure \ref{fig:non_lin_queue}.
- It stores its items in an array \texttt{items}, which for simplicity we will assume has unbounded size.
- It has two \texttt{AtomicInteger} fields: \texttt{head} is the index of the next slot from which to remove an item and \texttt{tail} is the index of the next slot in which to place an item.
- Give an example execution showing that this implementation is \emph{not} linearizable and provide a short explanation as to why.
- \begin{figure}[H]
- \centering \includegraphics[width=10cm]{res/Non-linearizable queue implementation.png}
- \caption{Non-linearizable queue implementation} \label{fig:non_lin_queue}
- \end{figure}
- \solution We can construct an example if we look at what precisely causes a problem in this implementation.
- After \texttt{tail} is incremented atomically, a thread can nap or take a vacation until it decides to execute line 10, storing the enqueued value into the slot the thread has reserved for itself.
- Consider history $H$ where $enq(2) \to deq(\text{EmptyException})$, with event $enq(1)$ happening concurrently to $enq(2)$ and $deq(\text{EmptyException})$.
- This example is illustrated in Figure \ref{fig:ex2-1-5}.
- \begin{figure}[H]
- \centering \includegraphics[width=10cm]{res/ex2-1-5-counterexample.png}
- \caption{Non-linearizable queue execution} \label{fig:ex2-1-5}
- \end{figure}
- The execution is valid if we consider that $enq(1)$ reserves slot $i$ and $enq(2)$ slot $i + 1$.
- If there were no previous elements, \texttt{head} holds the value $i$ and the dequeue operation attempts to read \texttt{items[slot]} at slot $i$.
- If $enq(1)$ decides to nap or take a vacation, \texttt{items[slot]} at slot $i$ is still null, so the dequeue operation throws an \texttt{EmptyException}.
- As the figure shows, there are three candidates for a possible linearization, but the first two, where $enq(1)$ linearizes before $deq(\text{EmptyException})$, are not valid for a FIFO queue: the queue would be non-empty, so the dequeue could not throw.
- The last candidate, where $enq(1)$ linearizes after the dequeue, also fails, but for a different reason.
- In this case, $deq(\text{EmptyException})$ strictly follows the completed $enq(2)$ in real time, so the queue cannot be empty when the dequeue takes effect.
- Thus this queue is not linearizable.
- \subsection{(4 points)} \label{ex:32}
- This exercise examines a queue implementation that can be seen in Figure \ref{fig:lin_queue}, whose \texttt{enq()} method does not have a linearization point.
- \begin{figure}[H]
- \centering \includegraphics[width=14cm]{res/Linearizable queue implementation.png}
- \caption{Queue implementation} \label{fig:lin_queue}
- \end{figure}
- The queue stores its items in an \texttt{items} array, which for simplicity we will assume is unbounded.
- The \texttt{tail} field is an \texttt{AtomicInteger}, initially zero.
- The \texttt{enq()} method reserves a slot by incrementing \texttt{tail} and then stores the item at that location.
- Note that these two steps are not atomic: there is an interval after \texttt{tail} has been incremented but before the item has been stored in the array.
- The \texttt{deq()} method reads the value of \texttt{tail}, and then traverses the array in ascending order from slot zero to the tail.
- For each slot, it swaps \emph{null} with the current contents, returning the first non-\emph{null} item it finds.
- If all slots are \emph{null}, the procedure is restarted.
- Give an example execution showing that the linearization point for \texttt{enq()} cannot occur at line 15. (\emph{hint: give an execution, where two \texttt{enq()} calls are not linearized in the order they execute line 15})
- Give another example execution showing that the linearization point for \texttt{enq()} cannot occur at line 16.
- Since these are the only two memory accesses in \texttt{enq()}, we must conclude that \texttt{enq()} has no single linearization point.
- Does this mean \texttt{enq()} is not linearizable?
- \solution We consider two examples.
- To see why the linearization point cannot occur at line 15, consider the following history where $enq(2) \to deq(2)$ with $enq(1)$ happening concurrently to $enq(2)$ and $deq(2)$, as is the situation in Figure \ref{fig:ex2-1-6-a}.
- Line numbers have been annotated to highlight how the code might be executed.
- If the linearization point were at line 15, $enq(1)$ would linearize before $enq(2)$ and a dequeue operation would have to return 1; yet $deq(2)$ is a valid outcome, because the slot reserved by $enq(1)$ may still hold \emph{null} when the dequeuer scans it, so value 1 is skipped and value 2 is returned.
- \begin{figure}[H]
- \centering \includegraphics[width=14cm]{res/ex2-1-6-a.png}
- \caption{First counterexample} \label{fig:ex2-1-6-a}
- \end{figure}
- To see why the linearization point cannot occur at line 16, consider the same history, but with $deq(2)$ strictly after both enqueue operations.
- This situation is displayed in Figure \ref{fig:ex2-1-6-b}.
- Event $enq(2)$ might execute line 15 after event $enq(1)$ does, but line 16 before event $enq(1)$.
- If line 16 were the linearization point, $enq(1)$ would have to linearize after $enq(2)$, so the FIFO queue would have to return 2 first and then 1.
- But because $enq(1)$ executes line 15 first, it reserves the earlier slot, so the order of elements in the array is 1 and then 2.
- Because the dequeue operation happens strictly after both enqueue operations complete, it cannot skip the value 1 and must return it first.
- \begin{figure}[H]
- \centering \includegraphics[width=14cm]{res/ex2-1-6-b.png}
- \caption{Second counterexample} \label{fig:ex2-1-6-b}
- \end{figure}
- Oddly enough, these results do not mean that \texttt{enq()} is not linearizable.
- Both examples are linearizable if we remove the restriction that a linearization point has to be placed at a specific line.
- In general, it cannot be said that if there is no single line on which a method linearizes, the method is not linearizable.
- %\emph{[2 bonus points] Come up with a single execution consisting of at most two \texttt{enq()} and one \texttt{deq()} operation, where the linearization points of the two \texttt{enq()} calls are different (not on the same code line).
- %Provide a proof sketch why the linearization points of these two \texttt{enq()} calls cannot occur on the same line in your example.}
- \section{Consensus} \label{sec:consensus}
- \subsection{(6 points)} \label{ex:52}
- Show that with sufficiently many $n$-thread binary consensus objects and atomic registers one can implement $n$-thread consensus over $n$ values.
- \solution We construct an algorithm to solve $n$-thread consensus over $n$ values.
- Initially we allocate two arrays of length $n$, \texttt{proposed} and \texttt{participating}.
- Every thread starts by writing its proposed value into the \texttt{proposed} array and then \texttt{true} to the \texttt{participating} array.
- Now we can start a series of $\lceil \log_2 n \rceil$ rounds, using a fresh $n$-thread binary consensus object in each round.
- Every thread considers the binary representation of its proposed value and proposes the $i$-th bit on round $i$.
- If the consensus value matches the bit of the proposed value, the thread continues this way.
- If at any round a different value is reached by way of consensus, then the thread scans the array \texttt{proposed}, filters out any threads that are not participating by reading \texttt{participating}, and tries to find a value whose binary representation matches the series of bits reached by consensus in the rounds so far.
- Once it finds such a value, it continues proposing this value instead.
- The end result must be that each thread assumes the same final value, because each thread acquires the resulting bits of the binary consensus object from each round.
- The algorithm is wait-free, because it terminates in a finite number of steps regardless of how the other threads are scheduled.
- It remains to show that the final value is a value that was proposed by a thread.
- There are two cases, either the thread wins consensus in every round or it does not.
- If it does, then it will have managed to reach its own proposed value as the final value, making it a valid value.
- If it does not, then it will have looked into the \texttt{proposed} array to find a matching value.
- There must have been such a matching value, because for every round $i$ there must have been a thread that proposed the $i$-th bit on round $i$, otherwise that bit would not have won binary consensus on round $i$.
- So there must always be a matched value.
- This matched value must also have been proposed by another thread, because only values proposed by threads that have written to \texttt{participating} are considered.
- Thus, this algorithm solves $n$-thread consensus over $n$ values.
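- The rounds described above can be summarized in pseudocode (a sketch in the style of the listings in this document, not a definitive implementation; the helpers \texttt{bit()} and \texttt{matchingProposal()} are names assumed for this sketch):
- \begin{lstlisting}[label=list:bin_to_n, caption={Sketch: n-valued consensus from binary consensus}]
- // shared: int proposed[n]; bool participating[n];
- //         BinaryConsensus C[0 .. ceil(log2(n)) - 1];
- int decide(int v)
- {
-     proposed[ThreadID] = v;
-     participating[ThreadID] = true;
-     int candidate = v;
-     int decided[ceil(log2(n))];                      // thread-local
-     for (int i = 0; i < ceil(log2(n)); i++)
-     {
-         decided[i] = C[i].decide(bit(candidate, i)); // bit(v, i) = (v >> i) & 1
-         if (decided[i] != bit(candidate, i))
-             // adopt a participating proposal whose bits 0..i equal
-             // decided[0..i]; such a proposal must exist (see the
-             // argument above)
-             candidate = matchingProposal(decided, i);
-     }
-     return candidate;
- }
- \end{lstlisting}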
- There is an interesting intuition behind why this works for those who are familiar with bus arbitration protocols (as is used for arbitration of the CAN bus protocol).
- In the bus protocol, every participant who is trying to place a value on the bus attempts to place its value by first trying to place its identifier on the bus.
- The bus is configured in a way that makes one of the two logical values 0 and 1 dominant and the other recessive, and the identifiers are assigned such that devices with a higher priority present dominant bits and win arbitration.
- Our case is almost identical, although instead of an identifier the arbitration happens for a proposed value.
- \subsection{(6 points)} \label{ex:53}
- The \texttt{Stack} class provides two methods: \texttt{push(x)} pushes a value onto the top of the stack and \texttt{pop()} removes and returns the most recently pushed value.
- Prove that the \texttt{Stack} class has consensus number \emph{exactly(!)} two.
- \solution This can be proven analogously to the queue example.
- We first consider why the consensus number must be at least two and we will then consider why the consensus number cannot be higher than two.
- If we initialize the stack with $stack.push(LOSE) \to stack.push(WIN)$, we can solve consensus for at least two threads.
- Each thread first announces its input by writing it to \texttt{proposed[ThreadID]} and then calls $stack.pop()$; if the value returned is $WIN$, it decides its own value, \texttt{proposed[ThreadID]}.
- If the value returned is $LOSE$, it decides the \emph{other} thread's value, \texttt{proposed[1 - ThreadID]}.
- Both threads decide on the same proposed value in a finite number of steps.
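- This 2-thread protocol can be sketched as follows (pseudocode in the style of the other listings; \texttt{ThreadID} $\in \{0, 1\}$):
- \begin{lstlisting}[label=list:stack_cons, caption={Sketch: 2-thread consensus from a stack}]
- // shared: int proposed[2];
- //         Stack stack;  // pre-initialized: stack.push(LOSE); stack.push(WIN);
- int decide(int v)
- {
-     proposed[ThreadID] = v;             // announce own input first
-     if (stack.pop() == WIN)
-         return proposed[ThreadID];      // popped WIN: decide own value
-     else
-         return proposed[1 - ThreadID];  // other thread popped WIN first
- }
- \end{lstlisting}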
- Now we consider why the stack's consensus number can be at most two.
- Consider three threads A, B and C.
- We know that there must exist a critical state where the next move of a single thread decides the outcome of the consensus protocol.
- Without loss of generality, we assume that thread A's next move brings the protocol into a 0-valent state, while thread B's next move brings the protocol into a 1-valent state.
- Their next moves must be one of the two possible operations on the same stack.
- Now we consider all possible cases.
- If threads A and B are about to pop, then two elements will be removed from the stack.
- Either A moved first, bringing the protocol into a 0-valent state, or B moved first, bringing the protocol into a 1-valent state.
- Because the end result looks the same to C, no matter what it attempts to do, C cannot decide whether 0 or 1 is correct.
- If threads A and B are about to push, then two elements will be pushed onto the stack.
- Their push operations alone cannot decide the protocol, because a push returns nothing, so A and B gain no new information from pushing.
- It follows that threads A and B must pop at some point.
- We can now bring about the same situation as before and C cannot decide whether 0 or 1 is correct.
- The last two cases are those where one thread pushes a value and the other thread pops a value.
- By a similar argument, if A and B move before C and one of them pushes immediately before the other pops, the pushed value is removed again and the stack returns to its previous state, where it is once again indistinguishable for C whether 0 or 1 is correct.
- Thus, a stack with these two operations can have a consensus number that is at most two.
- Combining our two results, we learn that a stack with these two operations has consensus number exactly two.
- \subsection{(2 points)} \label{ex:54}
- Suppose we augment the FIFO \texttt{Queue} class with a \texttt{peek()} method that returns but does not remove the first element in the queue.
- Show that the augmented queue has infinite consensus number.
- \solution To show that this extended FIFO queue has an infinite consensus number, we construct an algorithm that solves $\infty$-thread consensus.
- Let each thread enqueue its proposed value and then call \texttt{peek()}.
- Each thread uses the value that is returned by \texttt{peek()}.
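- This protocol can be sketched in a few lines of pseudocode (in the style of the other listings):
- \begin{lstlisting}[label=list:peek_cons, caption={Sketch: consensus from a queue with peek()}]
- // shared: Queue q;  // initially empty, supports peek()
- int decide(int v)
- {
-     q.enq(v);         // publish own proposal
-     return q.peek();  // every thread decides the first enqueued value
- }
- \end{lstlisting}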
- To prove that this algorithm solves $\infty$-thread consensus, we distinguish between two cases.
- Either the thread was first to enqueue a value or it was not.
- If it was first, then the thread will peek its own value and return it.
- If it was not, then the thread will peek not its own value but the value of another thread that must have enqueued before it.
- Because every thread always peeks the first value, every thread reaches the same value.
- The enqueue and peek operations return in a finite number of steps and do not depend on any other thread, so this algorithm is wait-free.
- The value returned by \texttt{peek()} must have been proposed by some thread, making it valid.
- Because every thread peeks only after enqueueing, the queue can never be empty at that point.
- Thus this extended FIFO queue has consensus number $\infty$.
- \subsection{(2 points)} \label{ex:55}
- Consider three threads, $A$, $B$, $C$, each of which has a MRSW register, $X_A$, $X_B$, $X_C$, that it alone can write and the others can read.
- In addition each pair shares a \texttt{RMWRegister} register that provides only a \texttt{compareAndSet()} method: $A$ and $B$ share $R_{AB}$, $B$ and $C$ share $R_{BC}$, $A$ and $C$ share $R_{AC}$.
- Only the threads that share a register can call that register's \texttt{compareAndSet()} method -- which is the only way to interact with said register.
- Your mission: either give a consensus protocol and prove that it works or sketch an impossibility proof.
- \solution There cannot exist a consensus protocol for three threads with the given primitives and resources.
- To prove this, we consider what operations are available and why none of them can solve consensus.
- We already know that the MRSW registers cannot be used to solve 3-thread consensus, because they have consensus number 1.
- Thus the threads have to use the shared RMW registers to solve consensus.
- Suppose without loss of generality that A must make a critical move before B and B before C.
- The only operation available is CAS, so the RMW registers can only be used to determine if the CAS operation succeeded or not (has returned $true$ or $false$).
- The only way in which CAS can succeed is if the threads know which value to expect.
- This value cannot be random, because then there would only be a small probability of CAS succeeding for any thread.
- We assume that this value is initialized beforehand to a value that is known and common to all threads, for example $-1$.
- If a thread tries to expect any other value, CAS can never succeed and the thread won't learn any new information.
- Thus every thread tries to expect $-1$.
- Because the value each thread tries to set has to be distinct, we can assume that they attempt to write their thread IDs, because that is their only differentiating property.
- The CAS operation returning true informs a thread of its success, which means that it was first to attempt CAS on the shared register.
- If A succeeds on $R_{AC}$ and B succeeds on $R_{BC}$ before C moves, then $R_{AC}$ will hold A's ID and $R_{BC}$ will hold B's ID.
- A and B know that they succeeded against C, and they can use the remaining shared register $R_{AB}$ to determine that A was first and decide on the proposed value of A.
- But there is no way for C to determine what value to use.
- If B were to succeed before A instead, the situation would look identical to C, so C cannot in general decide on the same value as A and B.
- Thread C could try to see if any of the MRSW registers was updated, but that would violate the wait-free property that is required of our consensus protocol.
- For any other order where one thread succeeds twice on its CAS operation, the end result is the same.
- There is another case in which no thread succeeds twice on its CAS operation, but this implies that each thread succeeds on only one shared RMW register.
- This case is even more hopeless, because there is no way to determine which proposed value to use as there is no distinguishing feature between threads.
- \subsection{(6 points)} \label{ex:56}
- Consider the situation described in \ref{ex:55} except that $A$, $B$, $C$ can apply a \texttt{doubleCompareAndSet()} operation to both of their shared registers at once.
- Regarding the semantics and signature of this new operation consider two variations \footnote{If any of the \texttt{doubleCompareAndSet()} variations is called by a thread for at least one register, which it does not have access to, we assume the method throws an \texttt{IllegalAccess} exception with no further side effects.}:
- \begin{enumerate} [label=\alph*)]
- \item Suppose the signature is
- \texttt{\textbf{bool} doubleCompareAndSet(RMWRegister shReg1, RMWRegister shReg2, \textbf{int} expect, \textbf{int} update)}.\\ \label{it:ex56_a}
- For a call of \texttt{doubleCompareAndSet(X, Y, exp, up)}, atomically the following is done: it checks if the value in \texttt{X} and \texttt{Y} is equal to \texttt{exp}.
- If both are equal to \texttt{exp}, \texttt{X} and \texttt{Y} are updated with \texttt{up} and \texttt{true} is returned.
- Otherwise no update is conducted and it returns \texttt{false}.
- The equivalent description in code can be seen in Listing \ref{list:DCAS_1}.
- \item Suppose the signature is
- \texttt{TwoBool doubleCompareAndSet(RMWRegister shReg1, RMWRegister shReg2, \textbf{int} expect, \textbf{int} update)},\\ \label{it:ex56_b}
- where \texttt{TwoBool} is a simple class consisting of two (publicly accessible) \texttt{\textbf{bool}} values.
- \texttt{\textbf{class} TwoBool \{ \textbf{public bool} val1; \textbf{public bool} val2; \}} \\
- Consider the semantics given by Listing \ref{list:DCAS_2}.
- \end{enumerate}
- \begin{lstlisting} [numbers=left, numbersep=5pt, tabsize=4, label=list:DCAS_1, caption=doubleCompareAndSet() - variation 1]
- bool doubleCompareAndSet(RMWRegister shReg1, RMWRegister shReg2, int expect, int update)
- {
-     bool retVal = false;
-     atomic
-     {
-         if(shReg1 == expect && shReg2 == expect)
-         {
-             shReg1 = update;
-             shReg2 = update;
-             retVal = true;
-         }
-     }
-     return retVal;
- }
- \end{lstlisting}
- \begin{lstlisting} [numbers=left, numbersep=5pt, tabsize=4, label=list:DCAS_2, caption=doubleCompareAndSet() - variation 2]
- TwoBool doubleCompareAndSet(RMWRegister shReg1, RMWRegister shReg2, int expect, int update)
- {
-     TwoBool retVal;
-     atomic
-     {
-         retVal.val1 = shReg1.compareAndSet(expect, update); // the 'usual' CAS operation on a
-         retVal.val2 = shReg2.compareAndSet(expect, update); // single register
-     }
-     return retVal;
- }
- \end{lstlisting}
- For both semantics of the \texttt{doubleCompareAndSet()} operation either provide a consensus protocol and prove that it works or sketch an impossibility proof.
- (Note that threads can still use their shared register's normal \texttt{compareAndSet()} method!)
- \solution \textbf{(a)} With this new double-CAS operation, we can solve 3-thread consensus.
- For the consensus object, let the shared RMW registers be initialized to a predetermined value, e.g. $R_{AB} = R_{BC} = R_{AC} = -1$.
- Let every thread try to double-CAS on its shared RMW registers after writing its proposed value to its MRSW register.
- If it succeeds, it uses its own proposed value.
- If it does not, then it uses CAS on its shared RMW registers, expecting the default value $-1$ (and, so that the probe never disturbs the register, also writing $-1$ as the update value).
- The success of this CAS operation tells the thread whether the other thread sharing this RMW register has succeeded with its double-CAS (a CAS returning $false$ implies that some thread changed the value, which by construction is the partner sharing this RMW register).
- After determining which thread succeeded with its double-CAS, it reads the appropriate MRSW register to determine the consensus value.
- This algorithm solves 3-thread consensus.
- Every thread acquires a value that was proposed, because it either uses its own proposed value or the value of a thread proposed before it.
- The first thread to succeed with double-CAS on its shared registers will determine the proposed value and because of the atomicity of double-CAS, any other thread will be prevented from succeeding with its own double-CAS operation.
- Every thread will determine correctly which thread succeeded with double-CAS and use its value, which must have already been written before it executed double-CAS.
- The sequence of operations finishes in a finite number of steps and no thread depends on the result of another thread, so this protocol is wait-free.
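- The protocol can be sketched in pseudocode (thread A's view, in the style of the other listings); as an assumption of this sketch, a losing thread probes with $-1$ as both the expected and the update value, so a probe never changes a register's contents:
- \begin{lstlisting}[label=list:dcas_cons, caption={Sketch: 3-thread consensus from variation 1 (thread A)}]
- // shared: R_AB = R_BC = R_AC = -1 initially; MRSW registers X_A, X_B, X_C
- // Thread A's code; B runs it on (R_AB, R_BC) and probes R_AB
- // (changed => A won, otherwise C won); C runs it on (R_AC, R_BC)
- // and probes R_AC.
- int decide(int v)
- {
-     X_A = v;                                       // announce input
-     if (doubleCompareAndSet(R_AB, R_AC, -1, ID_A))
-         return v;                                  // A's double-CAS won
-     // A lost; probe with expect = update = -1, which never alters a register
-     if (!R_AB.compareAndSet(-1, -1))
-         return X_B;                                // R_AB was overwritten: B won
-     return X_C;                                    // otherwise C won
- }
- \end{lstlisting}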
- \solution \textbf{(b)} With this new double-CAS operation, we cannot solve 3-thread consensus.
- The algorithm provided in (a) cannot be used because for this double-CAS operation, it is not guaranteed that a thread will either succeed on both shared registers or fail on both registers.
- As we have seen in \ref{ex:55}, to solve 3-thread consensus the operation of each thread cannot be a read/write to a MRSW register or a CAS on one of its shared RMW registers.
- It follows that every thread uses double-CAS.
- By similar reasoning as before, all threads must expect a common value; we assume this value to be $-1$.
- The first thread to use double-CAS will succeed on both of its shared registers.
- The next thread to use double-CAS therefore has to succeed on one shared register but fail on the other.
- This partial success of the double-CAS operation removes an important piece of information by removing the initial value $-1$ from the last shared register.
- The thread that comes last is in the same situation as in \ref{ex:55}: it cannot determine which thread used double-CAS first, because it can only use CAS on the shared registers and it will not find $-1$ in any of them.
- It follows that this double-CAS operation cannot be used to solve 3-thread consensus.
- \subsection{(2 points)} \label{ex:59}
- The \texttt{SetAgree} class, like the \texttt{Consensus} class provides a \texttt{decide()} method whose call returns a value that was the input of some thread's \texttt{decide()} call.
- However unlike the \texttt{Consensus} class, the values returned by \texttt{decide()} calls are not required to agree.
- Instead these calls may return no more than $k$ distinct values.
- (When $k$ is 1, \texttt{SetAgree} is the same as consensus.)
- What is the consensus number of the \texttt{SetAgree} class when $k > 1$?
- Sketch a proof!
- \solution To see whether this could work or not, we can restrict ourselves to a simple case, namely $n$-thread binary consensus with $k = 2$.
- The domain of this problem consists only of two values $true$ and $false$, so an $n$-thread binary consensus protocol can only decide on one of these two values.
- Because $k = 2$, any combination of these two values is permitted, so the \texttt{SetAgree} object places no constraint at all on what each thread decides.
- It follows that such a class cannot solve $n$-thread binary consensus for any $n > 1$.
- Because \texttt{SetAgree} cannot be used to solve $n$-thread binary consensus with $k = 2$, it cannot be used to solve consensus with any $k > 2$.
- The consensus number of such a class is therefore 1 (that is to say, it cannot solve consensus among two or more threads).
- \end{document}
|