Tracing down Bio in Block Subsystems
This post answers two questions:
- Which layers are involved in an I/O request, such as writing something to a file on a local machine?
- How does an md device (or any other block device, such as the null block driver) receive its data?
Bottom halves
- Bottom halves perform interrupt-related work that was not performed by the interrupt handler (top half)
- Run with all interrupts enabled
- Deferring work means not now
- A work queue is a simple interface for deferring work to a generic kernel thread
runqueue & waitqueue
Interface
- Queuing work to a workqueue[1]:
  queue_work
  queue_work_on
  queue_delayed_work
  queue_delayed_work_on
- Scheduling work to the workqueue
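To make the queuing interface more concrete, here is a minimal userspace model of the pattern behind queue_work. It is not kernel code: `work_item`, `model_wq`, `model_queue_work`, `model_run_pending`, and `my_dev` are hypothetical names made up for this sketch. What it tries to mirror is that queuing returns immediately, the handler runs later in another context, and the handler recovers its driver state from the embedded work item in the style of container_of.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-in for a work item. */
struct work_item {
    void (*func)(struct work_item *work);
    struct work_item *next;
    int pending;                 /* 1 while the item sits on a queue */
};

/* Hypothetical stand-in for a workqueue: a plain FIFO list. */
struct model_wq {
    struct work_item *head;
    struct work_item *tail;
};

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Queue a work item. Returns 1 if queued, 0 if it was already pending. */
static int model_queue_work(struct model_wq *wq, struct work_item *work)
{
    if (work->pending)
        return 0;
    work->pending = 1;
    work->next = NULL;
    if (wq->tail)
        wq->tail->next = work;
    else
        wq->head = work;
    wq->tail = work;
    return 1;
}

/* Stand-in for the worker thread: run everything queued, in FIFO order. */
static int model_run_pending(struct model_wq *wq)
{
    int ran = 0;

    while (wq->head) {
        struct work_item *work = wq->head;

        wq->head = work->next;
        if (!wq->head)
            wq->tail = NULL;
        work->pending = 0;       /* cleared before the handler runs */
        assert(work->func != NULL);
        work->func(work);
        ran++;
    }
    return ran;
}

/* Hypothetical driver state with an embedded work item. */
struct my_dev {
    int completed_ios;
    struct work_item work;
};

/* The deferred part of the job: runs later, outside the "interrupt handler". */
static void my_dev_bottom_half(struct work_item *work)
{
    struct my_dev *dev = model_container_of(work, struct my_dev, work);

    dev->completed_ios++;
}
```

In this model, a "top half" would call `model_queue_work(&wq, &dev.work)` and return at once; `dev.completed_ios` only changes when `model_run_pending(&wq)` runs later. Queuing an item that is still pending is a no-op, which is why the function returns 0 in that case.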
Block drivers
- No need to create a separate kernel thread when using workqueues
- A wait queue waits in a loop until the condition is met: https://stackoverflow.com/questions/11184581/why-does-wait-queue-implementation-wait-on-a-loop-until-condition-is-met
- wake_up makes the sleeping tasks runnable again; if a woken task should run on another CPU, the scheduler may send that CPU a reschedule interrupt
- wake_up_interruptible wakes up only the processes that are in interruptible sleeps
- BIOs can be split, merged, and chained. This happens in the scheduling layer.
- The null_blk driver is a bit different from the others. It has two ways of receiving commands: bio-based and request-based.
- Device drivers are normally request-based. BIOs are already split/merged in the block layer (scheduling) and grouped into a request, which is sent to the device driver. The driver should not touch the BIOs inside a request/command. The job of a device driver is to translate a request into the corresponding device command.
- In-flight BIOs counted in the device driver do not include the BIOs inside the requests.
- Linux runs I/O in an asynchronous context.
- Flow control in device drivers may not be a good idea. Many places in the block layer already do (or could do) that, such as the scheduling layer, where requests are regulated.
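The point about wait queues looping until the condition holds can be illustrated with a userspace analogue. This is a sketch using POSIX threads, not the kernel wait queue API: `model_waitq`, `model_wait_event`, `model_wake_up`, and `model_demo` are hypothetical names. The assumption is that a condition variable behaves enough like a wait queue for this purpose: a wakeup is only a hint, so the sleeper re-checks the condition every time it wakes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: a condition variable plays the role of the wait queue. */
struct model_waitq {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            data_ready;  /* the condition the sleeper waits for */
    int             wakeups;     /* how many times the sleeper woke up */
};

/* Sleeper: re-check the condition after every wakeup, in the spirit of wait_event(). */
static void model_wait_event(struct model_waitq *wq)
{
    pthread_mutex_lock(&wq->lock);
    while (!wq->data_ready) {
        /* May return without data_ready being set, hence the loop. */
        pthread_cond_wait(&wq->cond, &wq->lock);
        wq->wakeups++;
    }
    pthread_mutex_unlock(&wq->lock);
}

/* Waker: make the condition true first, then wake the sleeper. */
static void model_wake_up(struct model_waitq *wq)
{
    pthread_mutex_lock(&wq->lock);
    wq->data_ready = true;
    pthread_cond_signal(&wq->cond);
    pthread_mutex_unlock(&wq->lock);
}

static void *waker_thread(void *arg)
{
    model_wake_up(arg);
    return NULL;
}

/* Runs one sleeper/waker pair; returns true once the condition was observed. */
static bool model_demo(void)
{
    struct model_waitq wq = {
        .lock = PTHREAD_MUTEX_INITIALIZER,
        .cond = PTHREAD_COND_INITIALIZER,
        .data_ready = false,
        .wakeups = 0,
    };
    pthread_t waker;

    if (pthread_create(&waker, NULL, waker_thread, &wq) != 0)
        return false;
    model_wait_event(&wq);
    pthread_join(waker, NULL);
    assert(wq.wakeups >= 0);
    return wq.data_ready;
}
```

If the waker happens to run before the sleeper, the loop body never executes and the sleeper returns without blocking; if the sleeper is woken for any other reason, it goes back to sleep. Both cases are handled only because the condition is tested in a loop rather than once.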
Block IO
v6.3-rc2
High level: application -> file system -> block layer
bio -> bio_vec/bi_sector -> memory page
struct bio (include/linux/blk_types.h)
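The relationship bio -> bio_vec/bi_sector -> memory page can be sketched with a simplified userspace model. The structures below are hypothetical stand-ins, not the kernel definitions; the comments name the kernel fields they are meant to resemble (bi_iter.bi_sector, bi_vcnt, bi_io_vec, bv_page, bv_len, bv_offset). The helper functions are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SECTOR_SIZE 512u   /* the block layer counts in 512-byte sectors */

/* Simplified stand-in for struct bio_vec: one contiguous chunk inside a page. */
struct model_bio_vec {
    void    *page;     /* stand-in for bv_page */
    uint32_t len;      /* bytes in this segment, like bv_len */
    uint32_t offset;   /* byte offset inside the page, like bv_offset */
};

/* Simplified stand-in for struct bio: where on disk, plus which memory. */
struct model_bio {
    uint64_t              sector;  /* first sector on the device, like bi_iter.bi_sector */
    size_t                vcnt;    /* number of segments, like bi_vcnt */
    struct model_bio_vec *vecs;    /* segment array, like bi_io_vec */
};

/* Total payload of the bio: the sum of all segment lengths. */
static uint64_t model_bio_bytes(const struct model_bio *bio)
{
    uint64_t total = 0;

    for (size_t i = 0; i < bio->vcnt; i++)
        total += bio->vecs[i].len;
    return total;
}

/* First sector after the bio on the device. */
static uint64_t model_bio_end_sector(const struct model_bio *bio)
{
    uint64_t bytes = model_bio_bytes(bio);

    assert(bytes % MODEL_SECTOR_SIZE == 0);  /* block I/O is sector-granular */
    return bio->sector + bytes / MODEL_SECTOR_SIZE;
}
```

For example, a bio starting at sector 2048 with one 4096-byte segment and one 1024-byte segment carries 5120 bytes, which is 10 sectors, so it ends just before sector 2058. The memory side (pages, offsets, lengths) can be scattered, while the device side is one contiguous sector range.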
gendisk -> request queue/block device -> request
submit_bio() -> submit_bio_noacct() -> submit_bio_noacct_nocheck() -> __submit_bio() / __submit_bio_noacct()
(submit_bio_noacct was named generic_make_request[2] in v5.8 and earlier)
struct request_queue (include/linux/blkdev.h)
bio layer[3] -> request layer -> device driver[4]
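The hand-off from the bio layer to the request layer to the driver can be modeled in a few lines of userspace C. This is a sketch under simplifying assumptions, not the kernel's merge logic: `mini_bio`, `mini_request`, `mini_command`, and the functions below are hypothetical, and only a back merge of exactly adjacent sectors is modeled. The `bio`/`biotail` members are named after the fields of the same name in the kernel's struct request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down bio: just a sector range and a chain pointer. */
struct mini_bio {
    uint64_t sector;
    uint32_t nr_sectors;
    struct mini_bio *next;       /* next bio inside the same request */
};

/* Hypothetical, stripped-down request: one contiguous range made of bios. */
struct mini_request {
    uint64_t sector;
    uint32_t nr_sectors;
    struct mini_bio *bio;        /* first bio in the request */
    struct mini_bio *biotail;    /* last bio in the request */
};

/* Start a new request from a single bio. */
static void mini_request_init(struct mini_request *rq, struct mini_bio *bio)
{
    bio->next = NULL;
    rq->sector = bio->sector;
    rq->nr_sectors = bio->nr_sectors;
    rq->bio = bio;
    rq->biotail = bio;
}

/* Back merge: succeeds only if the bio starts exactly where the request ends. */
static bool mini_back_merge(struct mini_request *rq, struct mini_bio *bio)
{
    assert(rq->biotail != NULL);
    if (rq->sector + rq->nr_sectors != bio->sector)
        return false;
    bio->next = NULL;
    rq->biotail->next = bio;
    rq->biotail = bio;
    rq->nr_sectors += bio->nr_sectors;
    return true;
}

/* What a request-based driver would hand to the hardware. */
struct mini_command {
    uint64_t lba;
    uint32_t nr_blocks;
};

/* Driver side: translate the request as a whole, without walking its bios. */
static struct mini_command mini_translate(const struct mini_request *rq)
{
    struct mini_command cmd = { rq->sector, rq->nr_sectors };

    return cmd;
}
```

With bios covering sectors 100-107 and 108-115, the second one merges into the request started by the first, and the driver sees a single command for 16 sectors starting at 100. A bio at sector 200 is not adjacent, so it would need a request of its own.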
request queue[5]
create/delete a request queue: blk_mq_init_queue[6]
process a request: blk_mq_start_request
device mapper[7]
https://embetronicx.com/tutorials/linux/device-drivers/work-queue-in-linux-own-workqueue/
http://books.gigatux.nl/mirror/kerneldevelopment/0672327201/ch13lev1sec3.html
http://blog.vmsplice.net/2020/04/how-linux-vfs-block-layer-and-device.html
https://linux-kernel-labs.github.io/refs/heads/master/labs/block_device_drivers.html#request-queues-multi-queue-block-layer
https://linux-kernel-labs.github.io/refs/heads/master/labs/block_device_drivers.html#create-and-delete-a-request-queue
https://xuechendi.github.io/2013/11/14/device-mapper-deep-dive