SC concurrency

R20: concurrency

Concurrency is handling tasks in the overlapping period (multitask on a single-core computer) while parallelism is when tasks run at the same time (multi-processor)[1].

two models for concurrent programming

  1. message passing: modules share the same memory space
  2. shared memory: modules communicate through a communication channel

basic concepts

Process

A process is an instance of a running program that is isolated from other processes on the same machine. In particular, it has its own private section of the machine’s memory.

  • The process abstraction is a virtual computer.

Thread

A thread is a locus of control inside a running program. Think of it as a place in the program that is being run, plus the stack of method calls that led to that place (so the thread can go back up the stack when it reaches return statements).

  • the thread abstraction represents a virtual processor

Time slicing

  • the processor switches between threads when there are more threads than processors

starting a thread with an anonymous class

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
new Thread(new Runnable() {
public void run() {
System.out.println("Hello from a thread!");
}
}).start();

//named class
/** A comparison function that imposes a total ordering on some objects.
* ... */
public interface Comparator<T> {
/** Compares its two arguments for order.
* ...
* @return a negative integer, zero, or a positive integer if the first
* argument is less than, equal to, or greater than the second */
public int compare(T o1, T o2);
}

/** Orders Strings by length (shorter first) and then lexicographically. */
public class StringLengthComparator implements Comparator<String> {
@Override public int compare(String s1, String s2) {
if (s1.length() == s2.length()) {
return s1.compareTo(s2);
}
return s1.length() - s2.length();
}
}

SortedSet<String> strings = new TreeSet<>();
strings.addAll(List.of("yolanda", "zach", "alice", "bob"));
// strings is { "alice", "bob", "yolanda", "zach" }
With a Comparator:

// uses StringLengthComparator declared above
Comparator<String> compareByLength = new StringLengthComparator();
SortedSet<String> strings = new TreeSet<>(compareByLength);
strings.addAll(List.of("yolanda", "zach", "alice", "bob"));
// strings is { "bob", "zach", "alice", "yolanda" }

// uses StringLengthComparator declared above
SortedSet<String> strings = new TreeSet<>(new StringLengthComparator());
strings.addAll(List.of("yolanda", "zach", "alice", "bob"));
// strings is { "bob", "zach", "alice", "yolanda" }


// no StringLengthComparator class!
SortedSet<String> strings = new TreeSet<>(new Comparator<String>() {
@Override public int compare(String s1, String s2) {
if (s1.length() == s2.length()) {
return s1.compareTo(s2);
}
return s1.length() - s2.length();
}
});
strings.addAll(List.of("yolanda", "zach", "alice", "bob"));
// strings is { "bob", "zach", "alice", "yolanda" }

advantage: one-off implementations

Disadvantage: a named class is good for needing more than once.

Never call run() on a thread, or on a runnable that you created for a thread, which uses the same thread.

Instead, make a new Thread() with an instance of your Runnable, and call start() on the thread to start it.

junit: test fails if an exception is thrown and not caught by the test code itself.

Shared memory example

Race condition: the correctness of the program depends on the relative timing of events in concurrent computations A and B. “A is in a race with B.”

It can’t be told what the atomic operations(the individual steps of the computation) will be.

System.exit(): main() may return while its threads are still running. Use exit to force the process to exit.

Message passing example

Interleaving of the messages sent to the bank account.

concurrency is hard to debug

问题可能存在于多个程序,os?网络?cpu?

问题很难再现,想想吧,相同的问题位置,出错的地方确不同或者不会发生。

并且print-debug失效,因为它们相比于运行的程序的操作要慢100倍~1000倍。当使用print时,足够的时间改变操作的时间。

Nondeterministic behavior causation?

R21&22: thread safety

what is threadsafe?

if it behaves correctly when used from multiple threads, regardless of how those threads are executed, and without demanding additional coordination from the calling code.

  • “Behaves correctly” means satisfying its specification and preserving its rep invariant.
  • “Regardless of how threads are executed” means threads might be on multiple processors or timesliced on the same processor.
  • And “without additional coordination” means that the data type can’t put preconditions on its caller related to timing, like “you can’t call get() while set() is in progress.”

how to make a safety argument?

how to make code safe in concurrent programming?

Don’t share data between threads (by confinement, immutability, threadsafe data types).

The difficult one is how to implement a threadsafe type.

Strategy1: confinement

idea: Avoid races on reassignable references and mutable data by keeping those confined to a single thread.

Shared mutable state is the root cause of a race condition.

don’t use global variables: static variables are accessible to all threads therefore can be changed. While local variables are confined to single thread.

Strategy2: immutability

keep the data immutable and variables unreassignable.

Threadsafe immutability:

Strategy3:using threadsafe data types

Store shared data in existing threadsafe data types.

threadsafe collections:

wrapper methods that make collections threadsafe while still mutable

Notice:

  • don’t circumvent the wrapper
  • Iterators are still not threadsafe
    • Solution: collections’ lock
  • Atomic operations aren’t enough to prevent races

Strategy4: synchronization

The correctness of a concurrrent program should not depend on accidents of timing.

Problem: bugs in concurrent programming

steps to develop the datatype:

  1. Specify: define operations

  2. test

  3. rep

    a. Implement a simple, brute-force rep first

    b. write down the rep invariant and abstraction function, and implement checkRep()

  4. Synchronize

  5. Iterate

R23: queues & message-passing

主要用java讲了个queue&message-passing的例子。内容:

  1. 与shared memory system对比,message-passing使两个thread间有独立性,出错的概率更小。
  2. blocking queues pattern,其实和6.004中sophomore的例子相似,加上了一些java语法的细节
  3. 实例: 放在冰箱里有一些饮料,一些人需要拿饮料。(代码在本地文件夹中)

相当于6.004的sychronization另外的一个例子。

producer-consumer design pattern

problem: how to implement message passing with in single process?

Design: producer-consumer design pattern

Problem: how to stop the process?

Approach1: a poison pill: a special message on the queue that signals the consumer of that message to end its work.

Approach2: change the type of elements on the requests queue to an ADT:

1
FridgeRequest = DrinkRequest(n:int) + StopRequest

with operations:

1
2
drinksRequested : FridgeRequest → int
shouldStop : FridgeRequest → boolean

and when we want to stop the fridge, we enqueue a FridgeRequest where shouldStop returns true.

Approach3:signal a thread that it should stop working by calling that thread’s interrupt() method

race conditions

prevent deadlock:

  1. design the system so that there is no possibility of a cycle — so that if A is waiting for B, it cannot be that B was already (or will start) waiting for A.

  2. timeouts:If a module has been blocked for too long (maybe 100 milliseconds? or 10 seconds? how to decide?), then you stop blocking and throw an exception.

    problem: what do you do when that exception is thrown?

R24: sockets & networking

compared to producer-consumer pattern, client-server pattern abstracts the communication over the network using the sockets.

Client/server design pattern

In this pattern there are two kinds of processes: clients and servers. A client initiates the communication by connecting to a server. The client sends requests to the server, and the server sends replies back. Finally, the client disconnects. A server might handle connections from many clients concurrently, and clients might also connect to multiple servers.

sockets and streams

Basic concepts related to network communication, and to input/output. (I/O refers to communication into and out of a process.)

  • IP addresses

  • Hostnames

  • Port numbers

> Network sockets

A socket represents one end of the connection between client and server.

  • listening socket:used by a server process to wait for connections from remote clients.
  • connected socket:send and receive messages to and from the process on the other end of the connection.

Physical-socket analogy

  • 用户和服务器的连接过程就像将USB线插入接口一样,没插入前的接口是listening socket,插入接口的线和接口是connected socket。正如一条线有两个终端,连接的socket也有两个终端分别在用户端和服务器端。

  • 细节上的区别:在实际的socket连接中,当用户请求被listening socket接收,服务器重新产生一个新的connected socket来管理连接,而保留着listening socker继续接受用户请求。

> streams

  • buffers
  • Byte streams
  • character streams
  • Blocking

> wire protocols

A protocol is a set of messages that can be exchanged by two communicating parties.

A wire protocol in particular is a set of messages represented as byte sequences, like hello world and bye (assuming we’ve agreed on a way to encode those characters into bytes).

  • Telnet client

  • http

  • smtp

Design a wire protocol

use same strategy as designing ADT

Using network sockets in Java

  • client
  • server
  • multithreaded server
  • Closing streams and sockets with try-with-resources statement
1
2
3
4
5
6
7
8
9
10
11
try (
// preamble: declare variables initialized to objects that need closing after use
) {
// body: runs with those variables in scope
} catch(...) {
// catch clauses: optional, handles exceptions thrown by the preamble or body
} finally {
// finally clause: optional, runs after the body and any catch clause
}
// no matter how the try statement exits, it automatically calls
// close() on all variables declared in the preamble

testing

Example


  1. https://stackoverflow.com/questions/1050222/what-is-the-difference-between-concurrency-and-parallelism ↩︎