Bugzilla – Bug 2080
MPTCP - FullMesh: queue_delayed_work is only successful on the first call.
Last modified: 2015-05-25 10:43:39 EDT
Created attachment 1995 [details] Patch for arch/sim/workqueue.c to fix bug. I'm trying to add multiple addresses to a nodes net device at arbitrary time values throughout a simulation. In mptcp_fullmesh.c, if mptcp_address_worker() exits, subsequent addresses that are added to a node do not get put on to the work queue, resulting in the second address not being added to the meta socket. To replicate, add an address to a net device at Second(x), then add another address at Second(x + 1). The second address will not cause mptcp_address_worker() to be called. If multiple addresses are added at Second(x), mptcp_address_worker() processes all events before exiting, and they are added successfully. In add_pm_event(): 880 if (!delayed_work_pending(&fm_ns->address_worker)) 881 queue_delayed_work(mptcp_wq, &fm_ns->address_worker, 882 msecs_to_jiffies(500)); Delayed_work_pending() returns the expected result, but queue_delayed_work() returns false. I've fixed this with the following patch; Is there any reason for using WORK_STRUCT_PENDING instead of WORK_STRUCT_PENDING_BIT here? diff --git a/arch/sim/workqueue.c b/arch/sim/workqueue.c index fead47b..47dd75f 100644 --- a/arch/sim/workqueue.c +++ b/arch/sim/workqueue.c @@ -144,7 +144,7 @@ bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq, return queue_work (wq, work); } - if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) + if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { sim_assert (!timer_pending (timer)); dwork->wq = wq;
Created attachment 1996 [details] Simulation file to replicate bug. I have attached my simulation file that demonstrates the bug. Running the simulation file, files-0/var/log/messages shows; addr4_event_handler created event for 192.168.0.10, code 1 prio 0 addr4_event_handler created event for 10.1.0.10, code 1 prio 0 mptcp_alloc_mpcb: created mpcb with token 0x23b56742 mptcp_add_sock: token 0x23b56742 pi 1, src_addr:192.168.0.10:41546 dst_addr:172.16.1.1:5001, cnt_subflows now 1 mptcp_close: Close of meta_sk with tok 0x23b56742 mptcp_del_sock: Removing subsock tok 0x23b56742 pi:1 state 7 is_meta? 0 After applying the patch I get the following, which matches the behavior outside of NS3; addr4_event_handler created event for 192.168.0.10, code 1 prio 0 addr4_event_handler created event for 10.1.0.10, code 1 prio 0 mptcp_alloc_mpcb: created mpcb with token 0x23b56742 mptcp_add_sock: token 0x23b56742 pi 1, src_addr:192.168.0.10:41546 dst_addr:172.16.1.1:5001, cnt_subflows now 1 mptcp_add_sock: token 0x23b56742 pi 2, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0, cnt_subflows now 2 mptcp_init4_subsockets: token 0x23b56742 pi 2 src_addr:10.1.0.10:0 dst_addr:172.16.1.1:5001 mptcp_close: Close of meta_sk with tok 0x23b56742 mptcp_del_sock: Removing subsock tok 0x23b56742 pi:2 state 7 is_meta? 0 mptcp_del_sock: Removing subsock tok 0x23b56742 pi:1 state 7 is_meta? 0 Assuming there are no issues with my suggested patch, all instances of test_set_bit should probably be changed to use WORK_STRUCT_PENDING_BIT. As follows; diff --git a/arch/sim/workqueue.c b/arch/sim/workqueue.c index fead47b..d3f8b70 100644 --- a/arch/sim/workqueue.c +++ b/arch/sim/workqueue.c @@ -96,7 +96,7 @@ bool queue_work_on(int cpu, struct workqueue_struct *wq, { int ret = 0; - if (!test_and_set_bit (WORK_STRUCT_PENDING, work_data_bits(work))) { + if (!test_and_set_bit (WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { list_add_tail (&work->entry, &wq->list); sim_task_wakeup (workqueue_task (wq)); ret = 1; @@ -119,7 +119,7 @@ void flush_workqueue(struct workqueue_struct *wq) bool cancel_work_sync(struct work_struct *work) { int retval = 0; - if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) + if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { // work was not yet queued return 0; @@ -144,7 +144,7 @@ bool queue_delayed_work_on(int cpu, struct workqueue_struct *wq, return queue_work (wq, work); } - if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) + if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) { sim_assert (!timer_pending (timer)); dwork->wq = wq;
thanks for the report. which version (i.e., branch) of net-next-sim tree are you using ? I noticed this issue and fixed it in the HEAD branch (sim-ns3-dev-branch), but not backported yet.. https://github.com/direct-code-execution/net-next-sim/blob/sim-ns3-dev-branch/arch/lib/workqueue.c
I'm currently on sim-ns3-3.14.0-branch.
I've fixed this issue on the sim-ns3-3.14.0-branch: no other branches are ported yet. https://github.com/direct-code-execution/net-next-sim/commit/d6ed19f948135e486bc55aaa886f26d0d7ae2b91 In order to track this issue, I've updated the test case for mptcp in ns-3-dce, based on the example provided by Richard. http://code.nsnam.org/ns-3-dce/rev/03b20ee32e51 Thanks, Richard !