Python asyncio Event Loop Internals: A Deep CPython

If you have ever used Python’s asyncio library, you have interacted with one of the most sophisticated pieces of machinery in the standard library. But what is actually happening underneath the surface? How does the event loop know when a socket is ready to read? How does it decide which coroutine to run next? This post answers those questions by walking directly through the CPython source code, so you can see every mechanism for yourself.

What Is the Event Loop, Really?

The asyncio event loop is the central dispatcher. It runs in a single OS thread and alternates between two activities: checking for I/O readiness using a platform selector (select, epoll, or kqueue) and executing ready callbacks. Everything else — coroutines, futures, tasks — is built on top of those two primitives.

The main class is BaseEventLoop, defined in Lib/asyncio/base_events.py. On Unix, the concrete implementation is SelectorEventLoop, which builds on BaseSelectorEventLoop (Lib/asyncio/selector_events.py) and delegates I/O polling to the selectors module. On Windows, the default ProactorEventLoop uses I/O completion ports (IOCP) instead of a readiness selector.
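To check which concrete classes you get on your own machine, a small inspection script is enough. This is only a sketch: _selector is a private attribute, so treat its presence and type as an implementation detail that may change between CPython versions.

```python
import asyncio
from asyncio import base_events

loop = asyncio.new_event_loop()
try:
    loop_class = type(loop)
    # Every stock loop implementation derives from BaseEventLoop
    is_base_loop = isinstance(loop, base_events.BaseEventLoop)
    # Private attribute, read defensively: an implementation detail
    selector = getattr(loop, "_selector", None)
    selector_name = type(selector).__name__ if selector is not None else None
    print(loop_class.__name__, is_base_loop, selector_name)
finally:
    loop.close()
```

On a typical Linux build this reports a selector-based loop backed by an epoll selector, but the exact names depend on platform and Python version.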

The Core Run Loop: _run_once()

The heartbeat of asyncio is the _run_once() method. Each pass through the loop's main cycle calls it exactly once. Here is what it does in sequence:

  1. It computes a timeout for the I/O poll. If there are callbacks already in _ready (or stop() has been requested), the timeout is zero, because the poll should return immediately. Otherwise, the timeout is the time until the earliest timer at the top of the _scheduled heap, or None (block indefinitely) if no timers are pending.
  2. It calls self._selector.select(timeout). This is a blocking OS call that returns a list of (key, events) pairs describing file descriptors that are ready for reading or writing.
  3. It passes that list to _process_events(), which appends the Handle registered for each ready file descriptor to _ready. The handle typically wraps _read_ready() or _write_ready() on a transport object.
  4. It moves every timer that is now due from the _scheduled heap into _ready, using heapq.heappop(). These are callbacks wrapped in TimerHandle objects.
  5. Finally, it runs the callbacks that are in _ready at that moment by calling handle._run(), which invokes the underlying Python callable. Callbacks scheduled while these run wait for the next iteration.

    # Simplified from base_events.py; cancelled-timer cleanup, the timeout
    # cap, and debug logging are omitted. Needs "import heapq" at module level.
    def _run_once(self):
        # Compute I/O timeout
        timeout = None
        if self._ready or self._stopping:
            timeout = 0
        elif self._scheduled:
            timeout = max(0, self._scheduled[0]._when - self.time())
        # Poll for I/O; _process_events() appends ready handles to _ready
        event_list = self._selector.select(timeout)
        self._process_events(event_list)
        # Move due timers to _ready
        end_time = self.time() + self._clock_resolution
        while self._scheduled:
            handle = self._scheduled[0]
            if handle._when >= end_time:
                break
            handle = heapq.heappop(self._scheduled)
            handle._scheduled = False
            self._ready.append(handle)
        # Run the callbacks that are ready right now
        ntodo = len(self._ready)
        for i in range(ntodo):
            handle = self._ready.popleft()
            if handle._cancelled:
                continue
            handle._run()
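To watch the same cycle run outside asyncio, here is a toy loop that keeps only the timer heap and the ready deque. ToyLoop and its methods are hypothetical names invented for this sketch, and time.sleep() stands in for the selector poll, so there is no real I/O here.

```python
import collections
import heapq
import time


class ToyLoop:
    """Hypothetical miniature loop: timers and a ready queue, no I/O."""

    def __init__(self):
        self._ready = collections.deque()
        self._scheduled = []  # heap of (when, seq, callback, args)
        self._seq = 0         # tie-breaker so callbacks are never compared

    def call_soon(self, callback, *args):
        self._ready.append((callback, args))

    def call_later(self, delay, callback, *args):
        self._seq += 1
        when = time.monotonic() + delay
        heapq.heappush(self._scheduled, (when, self._seq, callback, args))

    def run_once(self):
        # Stand-in for select(timeout): sleep until the next timer is due,
        # but only when nothing is ready to run
        if not self._ready and self._scheduled:
            time.sleep(max(0.0, self._scheduled[0][0] - time.monotonic()))
        # Move due timers onto the ready queue
        now = time.monotonic()
        while self._scheduled and self._scheduled[0][0] <= now:
            _, _, callback, args = heapq.heappop(self._scheduled)
            self._ready.append((callback, args))
        # Run only the callbacks that were ready at this point
        for _ in range(len(self._ready)):
            callback, args = self._ready.popleft()
            callback(*args)

    def run_until_idle(self):
        while self._ready or self._scheduled:
            self.run_once()


order = []
toy = ToyLoop()
toy.call_later(0.02, order.append, "timer-20ms")
toy.call_later(0.01, order.append, "timer-10ms")
toy.call_soon(order.append, "soon")
toy.run_until_idle()
print(order)  # ['soon', 'timer-10ms', 'timer-20ms']
```

The ready callback runs first even though it was registered last, and the timers fire in deadline order rather than registration order.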

How Tasks Drive Coroutines

Coroutines do not run by themselves. They need a Task to drive them. Task is defined in Lib/asyncio/tasks.py and it wraps a coroutine object. When you call asyncio.create_task(coro()), a Task is created and immediately schedules its __step() method as a callback via loop.call_soon().

__step() calls coro.send(None) (or coro.throw() when an exception has to be delivered). This resumes the coroutine up to its next suspension point. In the common case the coroutine yields a Future it is awaiting, and the Task registers its __wakeup() method on that Future via future.add_done_callback(). When the Future resolves, __wakeup() calls __step() again, resuming the coroutine from where it left off. The one other case is a bare yield, which is what asyncio.sleep(0) produces: the Task then simply reschedules __step() with call_soon().

This is the entire coroutine machinery: Tasks, Futures, and the event loop form a trampoline. No threads, no magic — just Python callbacks and send().
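The trampoline fits in a few dozen lines if you strip out cancellation, exceptions, and contexts. The sketch below is not asyncio's code: ToyFuture, ToyTask, and the module-level ready deque are hypothetical stand-ins, and done callbacks take no arguments here, whereas asyncio passes the future to them.

```python
import collections

ready = collections.deque()  # stand-in for the loop's _ready queue


class ToyFuture:
    def __init__(self):
        self._result = None
        self._done = False
        self._callbacks = []

    def add_done_callback(self, fn):
        self._callbacks.append(fn)

    def set_result(self, result):
        self._result = result
        self._done = True
        for fn in self._callbacks:
            ready.append(fn)  # schedule the callback, never call it inline

    def __await__(self):
        if not self._done:
            yield self        # suspend and hand this future to the task
        return self._result


class ToyTask:
    def __init__(self, coro):
        self._coro = coro
        self.result = None
        self.done = False
        ready.append(self.step)  # plays the role of loop.call_soon(__step)

    def step(self):
        try:
            fut = self._coro.send(None)  # run to the next suspension point
        except StopIteration as exc:
            self.result = exc.value      # coroutine returned
            self.done = True
        else:
            fut.add_done_callback(self.step)  # resume when fut resolves


async def add_one(fut):
    value = await fut
    return value + 1


fut = ToyFuture()
task = ToyTask(add_one(fut))
ready.append(lambda: fut.set_result(41))
while ready:                 # the "event loop"
    ready.popleft()()
print(task.result)  # 42
```

The task runs twice: once up to the await, and once more after set_result() puts its step() back on the queue.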

→ Related: How Python coroutines actually work — tracing __next__ and send() through the source (Blog 02)

I/O Integration: Transports and Protocols

When you open a TCP connection with asyncio.open_connection(), the loop creates a Transport (e.g., _SelectorSocketTransport) and registers the socket’s file descriptor with the selector for read events. The transport’s _read_ready() is registered as the callback.

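When data arrives on the socket, the selector reports the fd as readable, _process_events() queues the registered handle, and _run_once() then runs _read_ready(). That method calls socket.recv() and forwards the data to the Protocol via protocol.data_received(). With the low-level API (loop.create_connection()), the Protocol is your application code; open_connection() supplies a StreamReaderProtocol that feeds a StreamReader for you. The Transport is the I/O adapter. This separation is the asyncio design philosophy.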
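Here is a minimal sketch of the Transport/Protocol split using a local socket pair instead of a real TCP connection. CollectProtocol and demo() are hypothetical names made up for this example; loop.create_connection() with the sock argument is the real API that attaches a transport to an already-connected socket.

```python
import asyncio
import socket


class CollectProtocol(asyncio.Protocol):
    """Hypothetical protocol that records what the transport delivers."""

    def __init__(self, done):
        self.chunks = []
        self.done = done

    def data_received(self, data):
        # Called by the transport after it has read from the socket
        self.chunks.append(data)

    def connection_lost(self, exc):
        if not self.done.done():
            self.done.set_result(b"".join(self.chunks))


async def demo():
    loop = asyncio.get_running_loop()
    ours, theirs = socket.socketpair()
    ours.setblocking(False)
    done = loop.create_future()
    transport, protocol = await loop.create_connection(
        lambda: CollectProtocol(done), sock=ours)
    theirs.sendall(b"hello")
    theirs.close()  # EOF on our side leads to connection_lost()
    try:
        return await asyncio.wait_for(done, timeout=5)
    finally:
        transport.close()


received = asyncio.run(demo())
print(received)  # b'hello'
```

The protocol never touches the socket. It only sees bytes and lifecycle events, which is what lets the same protocol class run over different transports.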

The Handle and TimerHandle Classes

Every callback is wrapped in a Handle (Lib/asyncio/events.py). Handle stores the callable, its arguments, the loop, a contextvars context, and a cancelled flag. Calling handle.cancel() sets that flag and drops the references to the callable and its arguments; it does not remove the handle from any queue. The event loop checks the flag before calling _run() and skips cancelled handles.

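For time-delayed callbacks (created via call_later() or call_at()), a TimerHandle is used. TimerHandle extends Handle with a _when timestamp. These are stored in the _scheduled list, which is maintained as a min-heap keyed on _when, so the earliest deadline is always at index 0 even though the list as a whole is not fully sorted.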
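Both handle types are visible through the public scheduling API. The sketch below uses only documented calls (call_soon(), call_later(), cancel(), cancelled(), when()); the demo() coroutine and the variable names are illustrative.

```python
import asyncio


async def demo():
    loop = asyncio.get_running_loop()
    fired = []
    soon = loop.call_soon(fired.append, "soon")           # plain Handle
    late = loop.call_later(0.01, fired.append, "late")    # TimerHandle
    dropped = loop.call_later(0.01, fired.append, "dropped")
    dropped.cancel()  # flagged as cancelled; the loop will skip it
    kinds = (type(soon).__name__, type(late).__name__)
    delay = late.when() - loop.time()  # seconds until the deadline
    await asyncio.sleep(0.05)          # let the loop run the callbacks
    return fired, kinds, dropped.cancelled(), delay


fired, kinds, was_cancelled, delay = asyncio.run(demo())
print(fired, kinds, was_cancelled)
# ['soon', 'late'] ('Handle', 'TimerHandle') True
```

The cancelled timer never fires, yet nothing had to search a queue to remove it. That is the trade-off behind the flag-based design.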

The Selector Backend

The selectors module (Lib/selectors.py) wraps OS-level I/O multiplexing. On Linux, DefaultSelector maps to EpollSelector, which calls epoll_ctl to register file descriptors and epoll_wait to poll. On macOS, it uses KqueueSelector. The return value of select() is a list of (key, events) tuples. Each key is a SelectorKey holding the file object, its fd, the events it was registered for, and the callback data attached at registration time; events is a bitmask of the events that are actually ready.

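Understanding this backend is crucial for diagnosing asyncio performance. select() rescans every registered fd on each call and is usually capped by FD_SETSIZE (commonly 1024), while epoll reports only the fds that are ready. That is why asyncio applications on Linux can keep very large numbers of mostly idle connections open without the poll itself becoming the bottleneck.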
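You can drive the selectors module directly to see the same (key, events) shape that asyncio consumes. This is a minimal sketch using a local socket pair; on_readable is a hypothetical callback stored in the data slot, which is the same slot asyncio uses for its reader and writer handles.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
reader, writer = socket.socketpair()
reader.setblocking(False)


def on_readable(sock):
    # Hypothetical callback: drain whatever is available
    return sock.recv(1024)


# data can be any object; here it carries the callback to invoke
sel.register(reader, selectors.EVENT_READ, data=on_readable)

idle = sel.select(timeout=0)  # nothing written yet, so no events
writer.send(b"ping")

results = []
for key, events in sel.select(timeout=1):
    if events & selectors.EVENT_READ:
        results.append(key.data(key.fileobj))

sel.unregister(reader)
sel.close()
reader.close()
writer.close()
print(idle, results)  # [] [b'ping']
```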

Running the Loop: run_forever() and run_until_complete()

run_forever() is essentially a while True loop that calls _run_once() repeatedly and breaks once stop() has set the loop's _stopping flag. run_until_complete(future) wraps your coroutine in a Task, adds a done callback that calls stop(), then calls run_forever(). When the Task finishes, stop() is called, the loop exits after the current iteration, and the Task's result is returned.

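This is why you cannot run two event loops in the same thread simultaneously: run_forever() blocks the calling thread, and asyncio raises RuntimeError if you try to start a loop while one is already running there. For nested event loops (e.g., in Jupyter notebooks), the nest_asyncio package patches this behavior.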
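The relationship between the two entry points can be rebuilt from public methods. The helper below is a rough sketch, not the real implementation: run_until_complete_sketch is a hypothetical name, and the real method also handles futures, loop state checks, and cleanup that are left out here. The second coroutine shows the RuntimeError you get when re-entering a running loop.

```python
import asyncio


def run_until_complete_sketch(loop, coro):
    # Hypothetical helper that mirrors the idea, not the real code
    task = loop.create_task(coro)
    task.add_done_callback(lambda fut: loop.stop())
    loop.run_forever()  # returns after stop() takes effect
    return task.result()


async def answer():
    await asyncio.sleep(0)
    return 42


async def try_nested(loop):
    inner = asyncio.sleep(0)
    try:
        loop.run_until_complete(inner)  # the loop is already running
    except RuntimeError as exc:
        return str(exc)
    finally:
        inner.close()  # avoid a "never awaited" warning


loop = asyncio.new_event_loop()
try:
    result = run_until_complete_sketch(loop, answer())
    nested_error = loop.run_until_complete(try_nested(loop))
finally:
    loop.close()
print(result, "|", nested_error)
```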

→ Related: Twisted reactor internals — how selectreactor schedules I/O callbacks (Blog 06)

Conclusion

The asyncio event loop is surprisingly transparent once you read the source. It is a scheduler built on OS I/O polling, a priority queue for timed callbacks, and a deque for ready callbacks. Tasks drive coroutines using send() and Future callbacks. If you want to master async Python, reading Lib/asyncio/base_events.py and Lib/asyncio/tasks.py is the most direct path.

Start with _run_once(). Follow the call graph. Every mystery in asyncio traces back to those few hundred lines.

