fix(igc): buffer rx packets on irq #691
Clone the selected network device while holding the net manager read lock, then release the lock before invoking the debug dump path. This keeps potentially slow diagnostic dumping outside the shared manager lock and matches the existing interface operation pattern.
Signed-off-by: Yuuki Takano <ytakanoster@gmail.com>
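The locking pattern described in the commit message can be sketched roughly as follows. This is a minimal illustration built on `std::sync::RwLock`; the `Manager` struct, the `Device` trait, and this `debug_dump_interface` signature are assumptions made for the example and are not the actual awkernel types or lock primitives.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for a network device; not the real awkernel trait.
trait Device: Send + Sync {
    fn debug_dump(&self) -> String;
}

struct Dummy(u64);

impl Device for Dummy {
    fn debug_dump(&self) -> String {
        format!("dummy device {}", self.0)
    }
}

// Hypothetical manager that owns the interfaces, keyed by interface id.
struct Manager {
    interfaces: BTreeMap<u64, Arc<dyn Device>>,
}

fn debug_dump_interface(manager: &RwLock<Manager>, id: u64) -> Option<String> {
    // Hold the read lock only long enough to clone the Arc handle.
    let dev = {
        let guard = manager.read().unwrap();
        guard.interfaces.get(&id).cloned()
    }; // the guard is dropped here, releasing the manager lock

    // The potentially slow dump runs without the manager lock held.
    let dev = dev?;
    Some(dev.debug_dump())
}
```

Because only the cloned handle outlives the inner block, a slow dump cannot stall writers waiting on the manager lock.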
Pull request overview
This PR aims to improve the Intel IGC driver RX path by buffering received packets into a software ring queue during IRQ handling, decoupling descriptor consumption from immediate upper-layer reads. It also introduces an on-demand “netdump” facility by adding a NetDevice::debug_dump() hook, a debug_dump_interface() helper, and a corresponding shell command.
Changes:
- Add an RX-side software ring queue (`RingQ<EtherFrameBuf>`) to IGC and drain completed RX descriptors into it from the IRQ path; update `recv()` to pop from this queue first.
- Add a generic `NetDevice::debug_dump()` API + `awkernel_lib::net::debug_dump_interface()` helper, and wire a shell `(netdump <interface_id>)` command to invoke it.
- Refactor `IgcInner::dump()` to return a `String` so callers decide how/when to emit logs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| awkernel_lib/src/net/net_device.rs | Adds NetDevice::debug_dump() default implementation (new API surface). |
| awkernel_lib/src/net.rs | Adds debug_dump_interface() helper that looks up an interface and invokes debug_dump() outside the NET_MANAGER lock. |
| awkernel_drivers/src/pcie/intel/igc.rs | Implements RX buffering via RingQ, calls RX drain from IRQ, updates recv(), and changes dump() to return String. |
| applications/awkernel_shell/src/lib.rs | Registers a new BLisp-exported (netdump ...) command and FFI to call debug_dump_interface(). |
| applications/awkernel_shell/Cargo.toml | Adds num-bigint / num-traits dependencies to parse BLisp Int into u64 for netdump. |
    {
        let mut node = MCSNode::new();
        let mut rx = inner.queue_info.que[que_id].rx.lock(&mut node);
        inner.igc_rx_recv(que_id, &mut rx)?;
Using ? here short-circuits the IMS/EIMS re-enable writes at the end of intr(). Since the ICR read has already ACKed the interrupt at the device, an error here can leave LSC/per-queue interrupts un-rearmed and silently drop subsequent IRQs. Capture the rx/tx results, still perform the rearm writes, and only then propagate the error.
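The suggested control flow can be sketched as below. This is an illustrative model only: `FakeDevice`, `DriverError`, `rx_drain()`, and `rearm()` are hypothetical names standing in for the driver's RX drain and its IMS/EIMS re-enable writes.

```rust
// Hypothetical error type; the real driver has its own.
#[derive(Debug, PartialEq)]
enum DriverError {
    Rx,
}

// Hypothetical device model used only to demonstrate the ordering.
struct FakeDevice {
    rearm_writes: u32, // counts simulated interrupt re-enable writes
    fail_rx: bool,     // forces the RX drain to fail when set
}

impl FakeDevice {
    fn rx_drain(&mut self) -> Result<(), DriverError> {
        if self.fail_rx {
            Err(DriverError::Rx)
        } else {
            Ok(())
        }
    }

    fn rearm(&mut self) {
        self.rearm_writes += 1;
    }

    fn intr(&mut self) -> Result<(), DriverError> {
        // Capture the result instead of returning early with `?`.
        let rx_result = self.rx_drain();

        // Re-enable interrupts unconditionally.
        self.rearm();

        // Propagate any error only after the re-arm has happened.
        rx_result
    }
}
```

With this shape, a failing drain still leaves the interrupt sources re-armed, so later IRQs are not lost.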
    if rx.read_buf.is_none() {
        return Ok(None);
        return Ok(());
    }
The old igc_recv() had a defensive if que_id != 0 { return Ok(None); }, but igc_rx_recv() drops that guard while the PR body explicitly defers multi-queue to PR5. If irqs_to_queues ever yields a non-zero que_id before PR5 lands, this will touch ring state that is not yet wired up. Reinstate the guard (or a debug_assert_eq!(que_id, 0)) until PR5 enables multi-queue.
    Some(u16::from_le(unsafe { desc.wb.upper.vlan }))
    } else {
    None
    while !rx.read_queue.is_full() {
A full memory barrier on every descriptor in the drain loop is more expensive than necessary. The barrier only needs to pair once with the device's DMA write-back; subsequent descriptors are picked up on later iterations or later IRQs. Move membar_sync() out of the loop (or replace the status-word read with an acquire-ordered load).
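The acquire-ordered alternative mentioned above might look like the following sketch. It models descriptors with `std::sync::atomic` types; the `Desc` layout, the `DD` bit value, and the `drain` function are assumptions for illustration. Whether an acquire load is sufficient ordering against device DMA write-back depends on the architecture and platform, so treat this as a shape to evaluate rather than a drop-in replacement for `membar_sync()`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const DD: u32 = 1; // hypothetical "descriptor done" bit

// Hypothetical write-back descriptor: a status word plus a length field.
struct Desc {
    status: AtomicU32,
    len: AtomicU32,
}

// Drains completed descriptors starting at `next`, up to `budget` of them.
// Returns the lengths seen and the updated ring index.
// Assumes `ring` is non-empty.
fn drain(ring: &[Desc], mut next: usize, budget: usize) -> (Vec<u32>, usize) {
    let mut lens = Vec::new();
    while lens.len() < budget {
        // Acquire load: the reads of the descriptor body below cannot be
        // reordered before this status read.
        if ring[next].status.load(Ordering::Acquire) & DD == 0 {
            break;
        }
        lens.push(ring[next].len.load(Ordering::Relaxed));
        // Mark the slot as consumed (a real driver would also refill it).
        ring[next].status.store(0, Ordering::Release);
        next = (next + 1) % ring.len();
    }
    (lens, next)
}
```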
    .igc_recv(que_id)
    .or(Err(net_device::NetDevError::DeviceError))
    {
    let mut node = MCSNode::new();
recv() locks rx, pops, drops the lock, then re-locks it to call igc_rx_recv() and pop again. The two acquisitions are pure overhead and also open a window where intr() can mutate read_queue between them. Collapse into a single critical section: lock once, try pop(), and only call igc_rx_recv() (and pop again) if pop returned None.
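A single-critical-section `recv()` could be shaped like this sketch. It uses `std::sync::Mutex` and `VecDeque` as stand-ins; the `Rx` fields and `drain_hw()` are hypothetical and only model the software queue and the descriptor drain.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

// Hypothetical RX state: a software queue plus a stand-in for the hardware ring.
struct Rx {
    read_queue: VecDeque<Vec<u8>>,
    hw_pending: VecDeque<Vec<u8>>,
}

impl Rx {
    // Stand-in for draining completed descriptors into the software queue.
    fn drain_hw(&mut self) {
        while let Some(frame) = self.hw_pending.pop_front() {
            self.read_queue.push_back(frame);
        }
    }
}

fn recv(rx: &Mutex<Rx>) -> Option<Vec<u8>> {
    // One lock acquisition covers both the fast path and the fallback.
    let mut rx = rx.lock().unwrap();
    if let Some(frame) = rx.read_queue.pop_front() {
        return Some(frame);
    }
    // Queue was empty: drain the ring and try once more under the same lock.
    rx.drain_hw();
    rx.read_queue.pop_front()
}
```

Holding the lock across both steps means the interrupt handler cannot modify the queue between the first pop and the fallback drain.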
    lines.push_str("(task) ; print tasks\r\n");
    lines.push_str("(interrupt) ; print interrupt information\r\n");
    lines.push_str("(ifconfig) ; print network interfaces\r\n");
    lines.push_str("(netdump if); dump device registers\r\n");
The PR description states this was changed from (netdump if) to (netdump id) to avoid the keyword-like if, but the diff still says if. The change appears to have been lost; please apply it (and consider clarifying that the argument is the numeric interface_id).
    rx.slots += 1;
    rx.next_to_check += 1;
    if rx.next_to_check == rx.rx_desc_ring.as_ref().len() {
The PR description says this was changed to increment `dropped_pkts` on push failure, but the diff still uses `let _ = rx.read_queue.push(packet);`. The promised observability is missing: any unexpected push failure remains invisible. Replace it with `if rx.read_queue.push(packet).is_err() { rx.dropped_pkts = rx.dropped_pkts.saturating_add(1); break; }` (the `break` also avoids spinning until the next `is_full()` check).
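The drop-counting pattern requested here can be sketched in isolation as follows. `BoundedQ` is a hypothetical stand-in for `RingQ` (its real API may differ), and `buffer_packets` is an illustrative helper, not a function from the driver.

```rust
// Hypothetical bounded queue standing in for RingQ; push fails when full.
struct BoundedQ<T> {
    items: Vec<T>,
    cap: usize,
}

impl<T> BoundedQ<T> {
    fn new(cap: usize) -> Self {
        Self { items: Vec::new(), cap }
    }

    // Returns the rejected item on failure so the caller can account for it.
    fn push(&mut self, item: T) -> Result<(), T> {
        if self.items.len() >= self.cap {
            Err(item)
        } else {
            self.items.push(item);
            Ok(())
        }
    }
}

struct Rx {
    read_queue: BoundedQ<u32>,
    dropped_pkts: u64,
}

// Buffers packets, counting a push failure and stopping at the first one.
fn buffer_packets(rx: &mut Rx, packets: &[u32]) {
    for &packet in packets {
        if rx.read_queue.push(packet).is_err() {
            // Make the failure visible instead of discarding it with `let _`.
            rx.dropped_pkts = rx.dropped_pkts.saturating_add(1);
            break;
        }
    }
}
```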
This is PR4 in the igc incremental stack (stacked on top of PR3 / igc_observability; PR1–3 are already merged).
PR4 changes (this commit)
Decouples RX packet delivery from immediate upper-layer consumption by
introducing a software ring queue drained during interrupt handling.
Key modifications to `awkernel_drivers/src/pcie/intel/igc.rs`:
- Add `read_queue: RingQ<EtherFrameBuf>` (32-packet capacity) to `Rx`
- Replace `igc_recv()` with `igc_rx_recv(que_id, rx: &mut Rx)`
- `recv()` fast-path pops from `read_queue`; falls back to `igc_rx_recv()`
- `intr()` calls `igc_rx_recv()` on RX IRQ, buffering packets before upper-layer consumption
- `read_queue` in `igc_allocate_queues()`, `igc_setup_rx_desc_ring()`, and `igc_stop()`

`applications/tests/test_network` adjusted: `INTERFACE_ID` → 1, addresses → 192.168.100.x, tests narrowed to `tcp_listen_test` + new `udp_recv_test`.

PR4 review follow-up (also in this commit)
Addressed Copilot review feedback:
- `self.dump()` in `igc_up()`: captures and logs the returned `String` via `log::debug!` (regression introduced during PR3 refactoring when `dump()` was changed to return `String` instead of logging directly)
- Changed `let _ = rx.read_queue.push(packet)` to increment `dropped_pkts` on failure, making any unexpected push failure observable via the existing drop counter (the `while !is_full()` guard makes this path unreachable in practice)
- Removed "`log::debug!`" from the `debug_dump_interface()` doc comment; the log level is device-defined
- Changed `(netdump if)` help text to `(netdump id)` to avoid the keyword-like `if`

Not adopted: kept `warn!` in the default `NetDevice::debug_dump()` (PR3 design decision: operators should see a clear message when calling `netdump` on a device that has not implemented the hook); did not alter the PR description for `debug_dump`/`netdump` additions (those are PR3 changes visible because PR4 is stacked on PR3).

Deferred to later PRs
- `can_send()` remains a reclaiming gate

Testing
- `make clippy`: clean
- `make x86_64 RELEASE=1`: successful
- `make check_x86_64`: passed