DPDK interrupts are user-space interrupts. They are implemented by using the vfio or uio kernel module to pass kernel interrupts up to user space. The interrupt mechanism that DPDK implements is a control interrupt mechanism, used for control operations; for example, the uio interrupt is used to update certain NIC states. The packet receive and transmit path still uses polling to receive packets from the NIC.
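To make the idea of "an interrupt delivered to user space" concrete, here is a minimal sketch in which a file descriptor stands in for the device interrupt. It assumes Linux and uses an eventfd, which is how vfio signals interrupts to user space (with uio, a blocking read() on the /dev/uioX device returns an interrupt count instead). The function name simulate_interrupts is hypothetical and is not part of DPDK.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal `n` interrupts on an eventfd, then consume
 * them the way a user-space handler would.
 * Returns the pending event count that was read, or -1 on error. */
int64_t simulate_interrupts(unsigned int n)
{
    uint64_t pending = 0;
    ssize_t r;
    int fd;

    assert(n > 0);
    fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (fd < 0)
        return -1;

    /* "Kernel side": every write adds 1 to the eventfd counter. */
    for (unsigned int i = 0; i < n; i++) {
        uint64_t one = 1;
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }

    /* "User side": a single read returns the accumulated count
     * and resets the counter to zero. */
    r = read(fd, &pending, sizeof(pending));
    close(fd);
    return r == (ssize_t)sizeof(pending) ? (int64_t)pending : -1;
}
```

Because the interrupt shows up as a readable descriptor, it can be multiplexed with epoll like any other fd, which is what the interrupt thread described in the following sections relies on.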
Interrupt initialization
Interrupt initialization is done mainly in rte_eal_intr_init.
First, the intr_sources list is initialized. The interrupts of all devices are attached to this list, and the interrupt handling thread walks the list to service each device's interrupts.
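The list-of-sources idea can be sketched with the TAILQ macros from <sys/queue.h>. The structures below are simplified, hypothetical stand-ins (intr_source, cb_entry, run_all_callbacks and demo_sources are illustration names, not DPDK's real definitions): each source carries a descriptor and its own list of callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, simplified stand-ins for the internal structures. */
typedef void (*intr_cb_fn)(void *arg);

struct cb_entry {
    TAILQ_ENTRY(cb_entry) next;
    intr_cb_fn fn;
    void *arg;
};
TAILQ_HEAD(cb_list, cb_entry);

struct intr_source {
    TAILQ_ENTRY(intr_source) next;
    int fd;                   /* descriptor that signals the interrupt */
    struct cb_list callbacks; /* handlers to run when fd fires */
};
TAILQ_HEAD(intr_source_list, intr_source);

/* Run every callback of every source; return how many were invoked. */
int run_all_callbacks(struct intr_source_list *sources)
{
    struct intr_source *src;
    struct cb_entry *cb;
    int calls = 0;

    assert(sources != NULL);
    TAILQ_FOREACH(src, sources, next) {
        TAILQ_FOREACH(cb, &src->callbacks, next) {
            cb->fn(cb->arg);
            calls++;
        }
    }
    return calls;
}

/* Example callback: increments the int that arg points to. */
void count_cb(void *arg)
{
    (*(int *)arg)++;
}

/* Build two sources (one with two callbacks, one with none) and walk them. */
int demo_sources(void)
{
    struct intr_source_list sources = TAILQ_HEAD_INITIALIZER(sources);
    struct intr_source a = { .fd = 10 }, b = { .fd = 11 };
    int hits = 0;
    struct cb_entry c1 = { .fn = count_cb, .arg = &hits };
    struct cb_entry c2 = { .fn = count_cb, .arg = &hits };

    TAILQ_INIT(&a.callbacks);
    TAILQ_INIT(&b.callbacks);
    TAILQ_INSERT_TAIL(&a.callbacks, &c1, next);
    TAILQ_INSERT_TAIL(&a.callbacks, &c2, next);
    TAILQ_INSERT_TAIL(&sources, &a, next);
    TAILQ_INSERT_TAIL(&sources, &b, next);

    run_all_callbacks(&sources);
    return hits;
}
```

In this sketch demo_sources() returns 2: both callbacks of the first source run, and the second source contributes nothing because its callback list is empty.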
Next, the intr_pipe pipe is created; it is used for message notification in the epoll model.
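A pipe is a common way to wake a thread that is blocked in epoll_wait(). The following sketch, assuming Linux, shows the pattern in isolation: the read end is registered with epoll, and a one-byte write to the write end makes epoll_wait() return. The function name pipe_wakes_epoll and the 100 ms timeout are choices made for this illustration only.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a write to the pipe wakes epoll_wait()
 * on the read end, 0 if epoll_wait() timed out, -1 on error. */
int pipe_wakes_epoll(int do_write)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event out;
    int fds[2];
    int epfd = -1;
    int ret = -1;
    int n;

    assert(do_write == 0 || do_write == 1);
    if (pipe(fds) < 0)
        return -1;

    epfd = epoll_create1(0);
    if (epfd < 0)
        goto cleanup;

    /* Watch the read end of the pipe. */
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
        goto cleanup;

    /* Another thread would normally do this write to request a wake-up. */
    if (do_write && write(fds[1], "1", 1) != 1)
        goto cleanup;

    n = epoll_wait(epfd, &out, 1, 100); /* 100 ms timeout */
    if (n >= 0)
        ret = (n == 1 && out.data.fd == fds[0]);

cleanup:
    if (epfd >= 0)
        close(epfd);
    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

Calling pipe_wakes_epoll(1) reports a wake-up, while pipe_wakes_epoll(0) times out because nothing was written.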
Finally, the intr_thread thread is created. Its entry point is eal_intr_thread_main(), which monitors the interrupt events of all registered devices and calls the interrupt handler of the corresponding device. The routine uses epoll_create() to create an epoll object, then uses epoll_ctl() to add the fd of the read end of intr_pipe, together with the fd of each entry in the interrupt source list, to the epoll object's interest list. It then uses epoll_wait() to monitor the fds in that interest list.
static __rte_noreturn void *
eal_intr_thread_main(__rte_unused void *arg)
{
    /* host thread, never break out */
    for (;;) {
        /* build up the epoll fd with all descriptors we are to
         * wait on then pass it to the handle_interrupts function
         */
        static struct epoll_event pipe_event = {
            .events = EPOLLIN | EPOLLPRI,
        };
        struct rte_intr_source *src;
        unsigned numfds = 0;

        /* create epoll fd */
        int pfd = epoll_create(1);
        if (pfd < 0)
            rte_panic("Cannot create epoll instance\n");

        pipe_event.data.fd = intr_pipe.readfd;
        /**
         * add pipe fd into wait list, this pipe is used to
         * rebuild the wait list.
         */
        if (epoll_ctl(pfd, EPOLL_CTL_ADD, intr_pipe.readfd,
                      &pipe_event) < 0) {
            rte_panic("Error adding fd to %d epoll_ctl, %s\n",
                      intr_pipe.readfd, strerror(errno));
        }
        numfds++;

        rte_spinlock_lock(&intr_lock);

        TAILQ_FOREACH(src, &intr_sources, next) {
            struct epoll_event ev;

            if (src->callbacks.tqh_first == NULL)
                continue; /* skip those with no callbacks */
            memset(&ev, 0, sizeof(ev));
            ev.events = EPOLLIN | EPOLLPRI | EPOLLRDHUP | EPOLLHUP;
            ev.data.fd = rte_intr_fd_get(src->intr_handle);

            /**
             * add all the uio device file descriptor
             * into wait list.
             */
            if (epoll_ctl(pfd, EPOLL_CTL_ADD,
                          rte_intr_fd_get(src->intr_handle), &ev) < 0) {
                rte_panic("Error adding fd %d epoll_ctl, %s\n",
                          rte_intr_fd_get(src->intr_handle),
                          strerror(errno));
            }
            else
                numfds++;
        }
        rte_spinlock_unlock(&intr_lock);
        /* serve the interrupt */
        eal_intr_handle_interrupts(pfd, numfds);

        /**
         * when we return, we need to rebuild the
         * list of fds to monitor.
         */
        close(pfd);
    }
}
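The listing above hands the epoll fd to eal_intr_handle_interrupts(), whose body is not shown here. As a rough sketch of what such a serve loop can look like, the hypothetical function below waits on the epoll fd, treats the wake-up pipe as a request to return so that the caller can rebuild the wait list, and treats every other ready fd as an interrupt source to drain. It is a simplification written for illustration (serve_until_rebuild and demo_serve are made-up names, and eventfd stands in for a device fd); it is not DPDK's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_EVENTS 8

/* Hypothetical dispatcher: serve ready fds until the wake-up pipe fires.
 * Returns the number of interrupts served, or -1 on error. */
int serve_until_rebuild(int epfd, int pipe_rd)
{
    struct epoll_event evs[MAX_EVENTS];
    int served = 0;

    for (;;) {
        int rebuild = 0;
        int n = epoll_wait(epfd, evs, MAX_EVENTS, -1);

        if (n < 0)
            return -1;
        for (int i = 0; i < n; i++) {
            int fd = evs[i].data.fd;
            uint64_t count;
            char c;

            if (fd == pipe_rd) {
                /* Drain the notification and ask the caller to rebuild. */
                if (read(fd, &c, 1) < 0)
                    return -1;
                rebuild = 1;
            } else if (read(fd, &count, sizeof(count)) ==
                       (ssize_t)sizeof(count)) {
                served++; /* a real handler would run callbacks here */
            }
        }
        if (rebuild)
            return served;
    }
}

/* Demo: one pending "interrupt" plus one rebuild request. */
int demo_serve(void)
{
    struct epoll_event ev = { .events = EPOLLIN };
    uint64_t one = 1;
    int pfds[2], epfd, efd, served = -1;

    if (pipe(pfds) < 0)
        return -1;
    epfd = epoll_create1(0);
    efd = eventfd(0, EFD_NONBLOCK);
    assert(epfd >= 0 && efd >= 0);

    ev.data.fd = pfds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfds[0], &ev) == 0) {
        ev.data.fd = efd;
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0 &&
            write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) &&
            write(pfds[1], "1", 1) == 1)
            served = serve_until_rebuild(epfd, pfds[0]);
    }
    close(efd);
    close(epfd);
    close(pfds[0]);
    close(pfds[1]);
    return served;
}
```

The design point this illustrates is that the set of fds is fixed while the loop is blocked in epoll_wait(), so any change to the source list has to interrupt the wait through the pipe and trigger a rebuild.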
Interrupt registration and unregistration
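An interrupt source and its callback can be registered with rte_intr_callback_register(). Once registration completes, the interrupt control thread monitors that source. Internally, the descriptors of all interrupt sources in the source list are added to the red-black tree that epoll maintains. When an event occurs on a source, epoll_wait() returns and the interrupt thread invokes the callbacks registered for that source.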
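A callback of an interrupt source can be removed with rte_intr_callback_unregister(). Once a source has no callbacks left, the interrupt control thread stops monitoring its interrupt signal. Example code from the mlx4 driver follows: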
int
mlx4_intr_install(struct mlx4_priv *priv)
{
    const struct rte_eth_intr_conf *const intr_conf =
        &ETH_DEV(priv)->data->dev_conf.intr_conf;
    int rc;

    mlx4_intr_uninstall(priv);
    if (intr_conf->lsc | intr_conf->rmv) {
        if (rte_intr_fd_set(priv->intr_handle, priv->ctx->async_fd))
            return -rte_errno;

        rc = rte_intr_callback_register(priv->intr_handle,
                                        (void (*)(void *))
                                        mlx4_interrupt_handler,
                                        priv);
        if (rc < 0) {
            rte_errno = -rc;
            goto error;
        }
    }
    return 0;
error:
    mlx4_intr_uninstall(priv);
    return -rte_errno;
}

int
mlx4_intr_uninstall(struct mlx4_priv *priv)
{
    int err = rte_errno; /* Make sure rte_errno remains unchanged. */

    if (rte_intr_fd_get(priv->intr_handle) != -1) {
        rte_intr_callback_unregister(priv->intr_handle,
                                     (void (*)(void *))
                                     mlx4_interrupt_handler,
                                     priv);
        if (rte_intr_fd_set(priv->intr_handle, -1))
            return -rte_errno;
    }
    rte_eal_alarm_cancel((void (*)(void *))mlx4_link_status_alarm, priv);
    priv->intr_alarm = 0;
    mlx4_rxq_intr_disable(priv);
    rte_errno = err;
    return 0;
}
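To show how registration and unregistration interact with the interrupt thread, here is a small stand-alone model. It is not DPDK code: intr_registry, registry_register, registry_unregister and the fixed-size callback table are all hypothetical, and the model notifies the wake-up pipe on every change, which is a simplification. It assumes the wake-up pipe pattern described earlier, where a one-byte write tells the thread to rebuild its wait list.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_CBS 4

typedef void (*intr_cb_fn)(void *arg);

struct cb_slot {
    intr_cb_fn fn; /* NULL marks a free slot */
    void *arg;
};

struct intr_registry {
    struct cb_slot cbs[MAX_CBS];
    int wake_fd; /* write end of the wake-up pipe */
};

/* Tell the (imagined) interrupt thread to rebuild its wait list. */
static int wake_thread(const struct intr_registry *r)
{
    return write(r->wake_fd, "1", 1) == 1 ? 0 : -1;
}

/* Store the callback in the first free slot, then notify the thread. */
int registry_register(struct intr_registry *r, intr_cb_fn fn, void *arg)
{
    assert(r != NULL && fn != NULL);
    for (int i = 0; i < MAX_CBS; i++) {
        if (r->cbs[i].fn == NULL) {
            r->cbs[i].fn = fn;
            r->cbs[i].arg = arg;
            return wake_thread(r);
        }
    }
    return -1; /* table full */
}

/* Remove matching callbacks; return how many were removed, -1 on error. */
int registry_unregister(struct intr_registry *r, intr_cb_fn fn, void *arg)
{
    int removed = 0;

    assert(r != NULL && fn != NULL);
    for (int i = 0; i < MAX_CBS; i++) {
        if (r->cbs[i].fn == fn && r->cbs[i].arg == arg) {
            r->cbs[i].fn = NULL;
            r->cbs[i].arg = NULL;
            removed++;
        }
    }
    if (removed > 0 && wake_thread(r) < 0)
        return -1;
    return removed;
}

void noop_cb(void *arg)
{
    (void)arg;
}

/* Register two callbacks, remove one, and count the wake-up bytes sent. */
int demo_registry(void)
{
    struct intr_registry r = { .wake_fd = -1 };
    int p[2], x = 0, y = 0;
    char buf[16];
    ssize_t n = -1;

    if (pipe(p) < 0)
        return -1;
    r.wake_fd = p[1];
    if (registry_register(&r, noop_cb, &x) == 0 &&
        registry_register(&r, noop_cb, &y) == 0 &&
        registry_unregister(&r, noop_cb, &x) == 1)
        n = read(p[0], buf, sizeof(buf));
    close(p[0]);
    close(p[1]);
    return (int)n;
}
```

In this model each of the three changes (two registrations and one unregistration) queues one byte on the pipe, so demo_registry() reads back three wake-up notifications.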
The call graph of the related functions is as follows:
