Bugzilla – Full Text Bug Listing |
Summary: | MPTCP - FullMesh: queue_delayed_work is only successful on the first call. | ||
---|---|---|---|
Product: | dce | Reporter: | Richard Withnell <rick_withnell> |
Component: | kernel | Assignee: | Hajime Tazaki <tazaki> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | ns-bugs |
Priority: | P5 | ||
Version: | unspecified | ||
Hardware: | PC | ||
OS: | Linux | ||
Attachments: |
Patch for arch/sim/workqueue.c to fix bug.
Simulation file to replicate bug. |
Created attachment 1996 [details]
Simulation file to replicate bug.
I have attached my simulation file that demonstrates the bug.
Running the simulation file, files-0/var/log/messages shows;
addr4_event_handler created event for 192.168.0.10, code 1 prio 0
addr4_event_handler created event for 10.1.0.10, code 1 prio 0
mptcp_alloc_mpcb: created mpcb with token 0x23b56742
mptcp_add_sock: token 0x23b56742 pi 1, src_addr:192.168.0.10:41546 dst_addr:172.16.1.1:5001, cnt_subflows now 1
mptcp_close: Close of meta_sk with tok 0x23b56742
mptcp_del_sock: Removing subsock tok 0x23b56742 pi:1 state 7 is_meta? 0
After applying the patch I get the following, which matches the behavior outside of NS3;
addr4_event_handler created event for 192.168.0.10, code 1 prio 0
addr4_event_handler created event for 10.1.0.10, code 1 prio 0
mptcp_alloc_mpcb: created mpcb with token 0x23b56742
mptcp_add_sock: token 0x23b56742 pi 1, src_addr:192.168.0.10:41546 dst_addr:172.16.1.1:5001, cnt_subflows now 1
mptcp_add_sock: token 0x23b56742 pi 2, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0, cnt_subflows now 2
mptcp_init4_subsockets: token 0x23b56742 pi 2 src_addr:10.1.0.10:0 dst_addr:172.16.1.1:5001
mptcp_close: Close of meta_sk with tok 0x23b56742
mptcp_del_sock: Removing subsock tok 0x23b56742 pi:2 state 7 is_meta? 0
mptcp_del_sock: Removing subsock tok 0x23b56742 pi:1 state 7 is_meta? 0
Assuming there are no issues with my suggested patch, all instances of test_set_bit should probably be changed to use WORK_STRUCT_PENDING_BIT.
As follows;
diff --git a/arch/sim/workqueue.c b/arch/sim/workqueue.c
index fead47b..d3f8b70 100644
--- a/arch/sim/workqueue.c
+++ b/arch/sim/workqueue.c
@@ -96,7 +96,7 @@ bool queue_work_on(int cpu, struct workqueue_struct *wq,
{
int ret = 0;
- if (!test_and_set_bit (WORK_STRUCT_PENDING, work_data_bits(work))) {
+ if (!test_and_set_bit (WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
list_add_tail (&work->entry, &wq->list);
sim_task_wakeup (workqueue_task (wq));
ret = 1;
@@ -119,7 +119,7 @@ void flush_workqueue(struct workqueue_struct *wq)
bool cancel_work_sync(struct work_struct *work)
{
int retval = 0;
- if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work)))
+ if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)))
{
// work was not yet queued
return 0;
@@ -144,7 +144,7 @@ bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
return queue_work (wq, work);
}
- if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work)))
+ if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work)))
{
sim_assert (!timer_pending (timer));
dwork->wq = wq;
thanks for the report. which version (i.e., branch) of net-next-sim tree are you using ? I noticed this issue and fixed it in the HEAD branch (sim-ns3-dev-branch), but not backported yet.. https://github.com/direct-code-execution/net-next-sim/blob/sim-ns3-dev-branch/arch/lib/workqueue.c I'm currently on sim-ns3-3.14.0-branch. I've fixed this issue on the sim-ns3-3.14.0-branch: no other branches are ported yet. https://github.com/direct-code-execution/net-next-sim/commit/d6ed19f948135e486bc55aaa886f26d0d7ae2b91 In order to track this issue, I've updated the test case for mptcp in ns-3-dce, based on the example provided by Richard. http://code.nsnam.org/ns-3-dce/rev/03b20ee32e51 Thanks, Richard ! |
Created attachment 1995 [details] Patch for arch/sim/workqueue.c to fix bug. I'm trying to add multiple addresses to a nodes net device at arbitrary time values throughout a simulation. In mptcp_fullmesh.c, if mptcp_address_worker() exits, subsequent addresses that are added to a node do not get put on to the work queue, resulting in the second address not being added to the meta socket. To replicate, add an address to a net device at Second(x), then add another address at Second(x + 1). The second address will not cause mptcp_address_worker() to be called. If multiple addresses are added at Second(x), mptcp_address_worker() processes all events before exiting, and they are added successfully. In add_pm_event(): 880 if (!delayed_work_pending(&fm_ns->address_worker)) 881 queue_delayed_work(mptcp_wq, &fm_ns->address_worker, 882 msecs_to_jiffies(500)); Delayed_work_pending() returns the expected result, but queue_delayed_work() returns false. I've fixed this with the following patch; Is there any reason for using WORK_STRUCT_PENDING instead of WORK_STRUCT_PENDING_BIT here? diff --git a/arch/sim/workqueue.c b/arch/sim/workqueue.c index fead47b..47dd75f 100644 --- a/arch/sim/workqueue.c +++ b/arch/sim/workqueue.c @@ -144,7 +144,7 @@ bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq, return queue_work (wq, work); } - if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) + if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { sim_assert (!timer_pending (timer)); dwork->wq = wq;