Is goroutine scheduling in golang preemptive?

📅 Posted on 2023-08-30
📁 Programming
🏷️ golang ● schedule

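Operating system scheduling falls into two categories: preemptive and non-preemptive (also called cooperative). They differ mainly in when a task (a process or thread) gets switched out and in who holds control over scheduling.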

Preemptive scheduling

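In preemptive scheduling, the operating system has the final say: it can interrupt the currently running task at any moment and force control over to another task. In other words, a running task may be preempted by a higher-priority task even though it has not voluntarily given up control. This ensures that high-priority tasks get a timely response, but it can also cause low-priority tasks to be interrupted frequently, which hurts the system's stability and predictability.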

Non-preemptive scheduling

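In non-preemptive scheduling, a task switch happens only when the running task voluntarily gives up control (for example, by waiting for an I/O operation to complete or by going to sleep) or when the task ends. The operating system cannot forcibly interrupt the task that is currently running. This ensures that low-priority tasks are not interrupted frequently, but it can also leave high-priority tasks without a timely response, especially when some task is stuck in an infinite loop or computes for a long time without ever yielding.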

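So the difference between preemptive and non-preemptive scheduling comes down to whether the scheduler can take the CPU away from a running process. Priorities are the typical case: when the operating system supports priorities, a higher-priority process can interrupt a running lower-priority one, so scheduling is preemptive. Without any forced switching, a process gives up the CPU only on its own (for example, when it blocks), so scheduling is non-preemptive. Switching a process out because its time slice has expired is also forced, so it counts as preemption too. Modern operating systems all provide nice values, so they are basically all preemptive.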

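Preemptive and cooperative scheduling are operating system concepts, but goroutine scheduling in golang is a task scheduling problem as well. So is it preemptive or cooperative?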

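Goroutines are user-space threads, so we might take it for granted that go's goroutine scheduling is non-preemptive: in the GPM model a goroutine gives up its executing thread voluntarily from user space (it yields when it finishes or when it blocks), and goroutines have no notion of priority.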
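To make "voluntarily giving up the thread" concrete, here is a minimal sketch of cooperative yielding. It relies on `runtime.Gosched`, the standard library call that yields the processor; the function name `runCooperatively` and the worker/step counts are assumptions made for this illustration only.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runCooperatively (hypothetical name) starts `workers` goroutines. Each one
// performs `steps` units of work and calls runtime.Gosched after every unit,
// handing the processor back to the scheduler of its own accord.
// It returns how many units each worker completed.
func runCooperatively(workers, steps int) []int {
	done := make([]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < steps; i++ {
				done[id]++        // each worker writes only its own slot
				runtime.Gosched() // voluntary yield: let other goroutines run
			}
		}(w)
	}
	wg.Wait() // all writes to done happen before Wait returns
	return done
}

func main() {
	fmt.Println(runCooperatively(3, 5))
}
```

If this were the only way goroutines ever left the CPU, a goroutine that never reached such a yield point could hold its thread indefinitely.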

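But is that really the case? Not quite. Since go 1.14, the go scheduler has supported non-cooperative (asynchronous) preemption: a goroutine is preempted after it has run for a certain time slice. In go 1.19.1 that slice is 10ms (see forcePreemptNS in the runtime source). Even if a goroutine keeps the CPU busy with pure computation, once roughly 10ms have passed it is still switched out and put back on a run queue.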

// forcePreemptNS is the time slice given to a G before it is
// preempted.
const forcePreemptNS = 10 * 1000 * 1000 // 10ms

func retake(now int64) uint32 {
	n := 0
	// Prevent allp slice changes. This lock will be completely
	// uncontended unless we're already stopping the world.
	lock(&allpLock)
	// We can't use a range loop over allp because we may
	// temporarily drop the allpLock. Hence, we need to re-fetch
	// allp each time around the loop.
	for i := 0; i < len(allp); i++ {
		_p_ := allp[i]
		if _p_ == nil {
			// This can happen if procresize has grown
			// allp but not yet created new Ps.
			continue
		}
		pd := &_p_.sysmontick
		s := _p_.status
		sysretake := false
		if s == _Prunning || s == _Psyscall {
			// Preempt G if it's running for too long.
			t := int64(_p_.schedtick)
			if int64(pd.schedtick) != t {
				pd.schedtick = uint32(t)
				pd.schedwhen = now
			} else if pd.schedwhen+forcePreemptNS <= now {
				preemptone(_p_)
				// In case of syscall, preemptone() doesn't
				// work, because there is no M wired to P.
				sysretake = true
			}
		}
		if s == _Psyscall {
			// Retake P from syscall if it's there for more than 1 sysmon tick (at least 20us).
			t := int64(_p_.syscalltick)
			if !sysretake && int64(pd.syscalltick) != t {
				pd.syscalltick = uint32(t)
				pd.syscallwhen = now
				continue
			}
			// On the one hand we don't want to retake Ps if there is no other work to do,
			// but on the other hand we want to retake them eventually
			// because they can prevent the sysmon thread from deep sleep.
			if runqempty(_p_) && atomic.Load(&sched.nmspinning)+atomic.Load(&sched.npidle) > 0 && pd.syscallwhen+10*1000*1000 > now {
				continue
			}
			// Drop allpLock so we can take sched.lock.
			unlock(&allpLock)
			// Need to decrement number of idle locked M's
			// (pretending that one more is running) before the CAS.
			// Otherwise the M from which we retake can exit the syscall,
			// increment nmidle and report deadlock.
			incidlelocked(-1)
			if atomic.Cas(&_p_.status, s, _Pidle) {
				if trace.enabled {
					traceGoSysBlock(_p_)
					traceProcStop(_p_)
				}
				n++
				_p_.syscalltick++
				handoffp(_p_)
			}
			incidlelocked(1)
			lock(&allpLock)
		}
	}
	unlock(&allpLock)
	return uint32(n)
}
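The time-slice check in the listing above can be hard to follow among the locking and syscall handling, so here is a simplified model of just that part. It is a sketch, not the runtime's code: the types `pState` and `observer` and the method `shouldPreempt` are hypothetical stand-ins for `p`, `sysmontick`, and the `schedtick`/`schedwhen` branch of `retake`. Only the constant value and the comparison are taken from the listing.

```go
package main

import "fmt"

// Same value as forcePreemptNS in the listing: 10ms in nanoseconds.
const forcePreemptNS = 10 * 1000 * 1000

// pState (hypothetical) keeps only the field of a P that the check reads.
// In this model, schedtick changes whenever the P schedules a goroutine.
type pState struct {
	schedtick uint32
}

// observer (hypothetical) plays the role of sysmontick: what the monitor
// saw on its previous visit to this P.
type observer struct {
	schedtick uint32 // last tick value observed
	schedwhen int64  // time (ns) at which that tick was first observed
}

// shouldPreempt mirrors the branch in retake: if the tick changed since the
// last visit, record it and do nothing; if it has stayed the same for at
// least forcePreemptNS, the same goroutine has been running too long.
func (o *observer) shouldPreempt(p *pState, now int64) bool {
	if o.schedtick != p.schedtick {
		o.schedtick = p.schedtick
		o.schedwhen = now
		return false
	}
	return o.schedwhen+forcePreemptNS <= now
}

func main() {
	const ms = 1000 * 1000
	p := &pState{schedtick: 1}
	o := &observer{}
	for _, now := range []int64{0, 5 * ms, 10 * ms} {
		fmt.Printf("t=%dms preempt=%v\n", now/ms, o.shouldPreempt(p, now))
	}
}
```

Because the monitor only compares snapshots, the 10ms is a lower bound measured from the monitor's first sighting of the current tick, not an exact deadline for the goroutine.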