Bug 842

Summary: ns-3-dev crashes using block ack
Product: ns-3 Reporter: Bruno Ranieri <Yrrsinn>
Component: wifi Assignee: Mirko Banchi <mk.banchi>
Status: RESOLVED FIXED    
Severity: normal CC: katzfiller5, mathieu.lacage, mk.banchi, nicola, ns-bugs, tomh
Priority: P3    
Version: pre-release   
Hardware: All   
OS: All   
Attachments: program reproducing the bug
This patch should resolve all problems

Description Bruno Ranieri 2010-03-11 07:17:14 UTC
Hi,
 
compiling ns-3-dev changeset 6123:4fc850094aeb (Thu Mar 11 13:37:22 2010 +0300) was successful.

But something seems to be wrong with the block ack support. Changing the
transmission data rate causes ns-3 to crash.

The following change in examples/wireless/wifi-blockack.cc changes the data rate:

- onOff.SetAttribute ("OnTime", RandomVariableValue (ConstantVariable (0.01)));
+ onOff.SetAttribute ("OnTime", RandomVariableValue (ConstantVariable (0.1)));


Results in:

$ ./waf --run examples/wireless/wifi-blockack
[...]
assert failed. file=../src/common/buffer.h, line=587, cond="m_current >=
m_dataStart && m_current <= m_dataEnd"
 



Regards,
 Bruno
Comment 1 Nicola Baldo 2010-04-13 05:27:29 UTC
Created attachment 828 [details]
program reproducing the bug
Comment 2 Nicola Baldo 2010-04-13 05:31:04 UTC
confirmed. Below is a backtrace.
My first impression is that the error is in the LLC, so I am assigning this bug to the node-module component.


assert failed. file=../src/common/buffer.cc, line=953, cond="m_current + delta <= m_dataEnd"

Program received signal SIGSEGV, Segmentation fault.
0x008f4e81 in ns3::Buffer::Iterator::Next (this=0xbfffdc08, delta=6) at ../src/common/buffer.cc:953
953	  NS_ASSERT (m_current + delta <= m_dataEnd);
(gdb) back
#0  0x008f4e81 in ns3::Buffer::Iterator::Next (this=0xbfffdc08, delta=6) at ../src/common/buffer.cc:953
#1  0x009d58e3 in ns3::LlcSnapHeader::Deserialize (this=0xbfffdd6c, start=...) at ../src/node/llc-snap-header.cc:84
#2  0x0091eecb in ns3::Packet::RemoveHeader (this=0x80a71f8, header=...) at ../src/common/packet.cc:264
#3  0x00dfa1c6 in ns3::WifiNetDevice::ForwardUp (this=0x809fe90, packet=..., from=..., to=...) at ../src/devices/wifi/wifi-net-device.cc:292
#4  0x00dfdb27 in ns3::MemPtrCallbackImpl<ns3::WifiNetDevice*, void (ns3::WifiNetDevice::*)(ns3::Ptr<ns3::Packet>, ns3::Mac48Address, ns3::Mac48Address), void, ns3::Ptr<ns3::Packet>, ns3::Mac48Address, ns3::Mac48Address, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x80a16a0, a1=..., 
    a2=..., a3=...) at debug/ns3/callback.h:229
#5  0x00dddd24 in ns3::Callback<void, ns3::Ptr<ns3::Packet>, ns3::Mac48Address, ns3::Mac48Address, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x80a008c, a1=..., a2=..., a3=...) at debug/ns3/callback.h:416
#6  0x00e32753 in ns3::QapWifiMac::ForwardUp (this=0x809ffa8, packet=..., from=..., to=...) at ../src/devices/wifi/qap-wifi-mac.cc:369
#7  0x00e35557 in ns3::QapWifiMac::Receive (this=0x809ffa8, packet=..., hdr=0x80a87d4) at ../src/devices/wifi/qap-wifi-mac.cc:607
#8  0x00e39766 in ns3::MemPtrCallbackImpl<ns3::QapWifiMac*, void (ns3::QapWifiMac::*)(ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*), void, ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x809fdf8, a1=..., a2=0x80a87d4)
    at debug/ns3/callback.h:226
#9  0x00d9f407 in ns3::Callback<void, ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x80a00d8, a1=..., a2=0x80a87d4) at debug/ns3/callback.h:413
#10 0x00daac68 in ns3::MacRxMiddle::Receive (this=0x80a00a8, packet=..., hdr=0x80a87d4) at ../src/devices/wifi/mac-rx-middle.cc:298
#11 0x00ddf4ba in ns3::MemPtrCallbackImpl<ns3::MacRxMiddle*, void (ns3::MacRxMiddle::*)(ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*), void, ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x809fe10, a1=..., a2=0x80a87d4)
    at debug/ns3/callback.h:226
#12 0x00d9f407 in ns3::Callback<void, ns3::Ptr<ns3::Packet>, ns3::WifiMacHeader const*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator() (this=0x80a00fc, a1=..., a2=0x80a87d4) at debug/ns3/callback.h:413
#13 0x00d9b589 in ns3::MacLow::RxCompleteBufferedPacketsWithSmallerSequence (this=0x80a00e0, seq=112, originator=..., tid=0 '\000')
    at ../src/devices/wifi/mac-low.cc:1586
#14 0x00d9c474 in ns3::MacLow::SendBlockAckAfterBlockAckRequest (this=0x80a00e0, reqHdr=..., originator=..., duration=..., blockAckReqTxMode=...)
    at ../src/devices/wifi/mac-low.cc:1718
#15 0x00d9db81 in Notify (this=0x80b0068) at debug/ns3/make-event.h:223
#16 0x008a7eec in ns3::EventImpl::Invoke (this=0x80b0068) at ../src/simulator/event-impl.cc:37
#17 0x008c435c in ns3::DefaultSimulatorImpl::ProcessOneEvent (this=0x807b570) at ../src/simulator/default-simulator-impl.cc:128
#18 0x008c4528 in ns3::DefaultSimulatorImpl::Run (this=0x807b570) at ../src/simulator/default-simulator-impl.cc:158
#19 0x008a8a1b in ns3::Simulator::Run () at ../src/simulator/simulator.cc:173
#20 0x080517cc in main (argc=1, argv=0xbfffefc4) at ../scratch/bug842.cc:137
Comment 3 Mathieu Lacage 2010-04-22 14:18:07 UTC
Maybe you are removing a header which is not here: did you consider calling Packet::EnableChecking ?
Comment 4 Mirko Banchi 2010-04-26 08:54:45 UTC
Yes, Mathieu is right. I tried calling Packet::EnableChecking and it exposes the problem. I can also confirm what Nicola wrote: the problem is with LlcSnapHeader. In particular, with that value of the OnTime attribute (0.1), WifiNetDevice::ForwardUp removes an LLC header that does not seem to be there. This does not always happen, only at a particular point of the execution. This is very strange :(
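To make the failure mode concrete, here is a minimal standalone sketch, not ns-3 code. ToyPacket and its header bookkeeping are hypothetical; they only imitate the kind of metadata check that Packet::EnableChecking is meant to turn on, under the assumption that the LLC/SNAP header is 8 bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a packet; this is not the ns-3 Packet class.
class ToyPacket {
public:
  explicit ToyPacket(std::size_t payloadSize) : bytes_(payloadSize, 0) {}

  // Prepend a header of `size` bytes and record its name.
  void AddHeader(const std::string& name, std::size_t size) {
    bytes_.insert(bytes_.begin(), size, 0);
    headers_.push_back({name, size});
  }

  // Remove the outermost header. Instead of silently reading past the end
  // of the buffer, fail loudly when the expected header is not there.
  void RemoveHeader(const std::string& name) {
    if (headers_.empty() || headers_.back().name != name) {
      throw std::logic_error("removing header '" + name + "' that is not present");
    }
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(headers_.back().size);
    bytes_.erase(bytes_.begin(), bytes_.begin() + size);
    headers_.pop_back();
  }

  std::size_t Size() const { return bytes_.size(); }

private:
  struct Record {
    std::string name;
    std::size_t size;
  };
  std::vector<std::uint8_t> bytes_;
  std::vector<Record> headers_;
};
```

Adding an "llc" header of 8 bytes to a 100-byte ToyPacket and removing it once works; removing it a second time throws, which is the situation the assert in the backtrace appears to be reporting.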
Comment 5 Mathieu Lacage 2010-04-26 14:09:35 UTC
Hints:
1) is there a tx codepath which does not insert the llc header ?
2) is there a trace hook which removes it ?
3) is there a rx codepath which could remove the llc header twice ?
Comment 6 Mirko Banchi 2010-05-15 11:57:45 UTC
(In reply to comment #5)
> Hints:
> 1) is there a tx codepath which does not insert the llc header ?
No. The LLC header is added in WifiNetDevice::Send and WifiNetDevice::SendFrom methods which add the header unconditionally.
> 2) is there a trace hook which removes it ?
No.
> 3) is there a rx codepath which could remove the llc header twice ?
No. The LLC header is removed only once in WifiNetDevice::ForwardUp method.

I confess I'm very confused by this bug :) However, I'll investigate.
Comment 7 Mirko Banchi 2010-05-16 10:14:09 UTC
Maybe I found the reason for this bug. I updated ns-3-dev with changes that should fix it. Feedback is welcome so that the bug can be marked as fixed!

Thank you all.
Comment 8 Mathieu Lacage 2010-05-17 02:00:08 UTC
changeset id ?
Comment 9 Mirko Banchi 2010-05-17 03:37:43 UTC
sorry, 645b4e644c12.
Comment 10 Nicola Baldo 2010-05-17 04:46:24 UTC
(In reply to comment #7)
> Maybe i found the reason for this bug. I updated ns-3-dev with changes that
> should fix it. Feedbacks are welcome in order to mark the bug as fixed!

I tried it with the program that reproduces bug 842, I get the following:

assert failed. file=../src/devices/wifi/mac-low.cc, line=1699, cond="duration >= MicroSeconds (0)"
Comment 11 Mirko Banchi 2010-05-17 07:26:58 UTC
(In reply to comment #10)
> > Maybe i found the reason for this bug. I updated ns-3-dev with changes that
> > should fix it. Feedbacks are welcome in order to mark the bug as fixed!
> 
> I tried it with the program that reproduces bug 842, I get the following:
> 
> assert failed. file=../src/devices/wifi/mac-low.cc, line=1699, cond="duration
> >= MicroSeconds (0)"

Hi Nicola, I can't reproduce the problem :(. I tried examples/wireless/wifi-blockack.cc, changing the value of OnOffHelper's "OnTime" attribute from 0.01 to 0.1. The program runs fine. Could you send me your script by e-mail? Thank you.
Comment 12 Nicola Baldo 2010-05-17 08:31:32 UTC
(In reply to comment #11)
> Hi Nicola, i can't reproduce the problem :(. I tried the
> example/wireless/wifi-blockack.cc changing the value of OnOffHelper's attribute
> "OnTime" from 0.01 to 0.1. The program run fine. could you send me by e-mail
> your script? Thank you.

Hi Mirko,

I am using the program that is already attached to this bug:
http://www.nsnam.org/bugzilla/attachment.cgi?id=828

I am using ns-3-dev changeset 6306:283c83f1f7be
Comment 13 Mirko Banchi 2010-05-17 09:14:49 UTC
(In reply to comment #10)
> > Maybe i found the reason for this bug. I updated ns-3-dev with changes that
> > should fix it. Feedbacks are welcome in order to mark the bug as fixed!
> 
> I tried it with the program that reproduces bug 842, I get the following:
> 
> assert failed. file=../src/devices/wifi/mac-low.cc, line=1699, cond="duration
> >= MicroSeconds (0)"

Hi Nicola, I can't reproduce the problem :(. I tried examples/wireless/wifi-blockack.cc, changing the value of OnOffHelper's "OnTime" attribute from 0.01 to 0.1. The program runs fine. Could you send me your script by e-mail? Thank you.
Comment 14 Mirko Banchi 2010-05-17 09:17:57 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > Hi Nicola, i can't reproduce the problem :(. I tried the
> > example/wireless/wifi-blockack.cc changing the value of OnOffHelper's attribute
> > "OnTime" from 0.01 to 0.1. The program run fine. could you send me by e-mail
> > your script? Thank you.
> 
> Hi Mirko,
> 
> I am using the program that is already attached to this bug:
> http://www.nsnam.org/bugzilla/attachment.cgi?id=828
> 
> I am using ns-3-dev changeset 6306:283c83f1f7be

This is very strange. I tried that script with the same version of ns-3-dev and I can't reproduce the bug :( Can anyone else test it?
Comment 15 Mirko Banchi 2010-07-31 08:39:11 UTC
Created attachment 959 [details]
This patch should resolve all problems
Comment 16 Mirko Banchi 2010-07-31 08:42:58 UTC
(In reply to comment #10)
> (In reply to comment #7)
> > Maybe i found the reason for this bug. I updated ns-3-dev with changes that
> > should fix it. Feedbacks are welcome in order to mark the bug as fixed!
> 
> I tried it with the program that reproduces bug 842, I get the following:
> 
> assert failed. file=../src/devices/wifi/mac-low.cc, line=1699, cond="duration
> >= MicroSeconds (0)"

Although I can't reproduce the bug, I think there was a problem with block ack request retransmission... I have attached a patch that should resolve the problem. Nicola or Mathieu, could you please test it? Thank you all.
Comment 17 Nicola Baldo 2010-08-02 11:49:33 UTC
(In reply to comment #16)
> Although i can't reproduce the bug i think there was a problem with block-ack-
> request restransmission...i have attached a patch that should resolve the
> problem. Please Nicola or Mathieu could you test it? Thank you all.

I am not very familiar with the block ack procedure... can you please briefly explain what problem the patch solves?
Comment 18 Mirko Banchi 2010-08-02 13:03:54 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > Although i can't reproduce the bug i think there was a problem with block-ack-
> > request restransmission...i have attached a patch that should resolve the
> > problem. Please Nicola or Mathieu could you test it? Thank you all.
> 
> I am not very confident with the block ack procedure... can you please briefly
> explain what is the problem that is solved by the patch?

Please verify whether the attached program (program reproducing the bug) still crashes after the patch is applied. In particular, with the patch you shouldn't get the failed assert in mac-low; see comment #10 for details. Let me know. Thank you Nicola.
Comment 19 Nicola Baldo 2010-08-02 13:35:35 UTC
Ok I re-tested the "program reproducing the bug". Now it turns out that I can't reproduce the bug any more (ns-3-dev 6474:0894b2a245e9). The program also runs correctly with the patch, if that's of any interest.

Any suggestions on what to do now?
Comment 20 Mirko Banchi 2010-08-02 14:20:29 UTC
(In reply to comment #19)
> Ok I re-tested the "program reproducing the bug". Now it turns out that I can't
> reproduce the bug any more (ns-3-dev 6474:0894b2a245e9). The program also runs
> correctly with the patch, if that's of any interest.
> 
> Any suggestions on what to do now?

This is a good question. We could wait for tests from others, even though I think the bug could be considered closed. Mathieu, Tom? Could you please test the "program reproducing the bug" with and without the patch?
Comment 21 Nicola Baldo 2010-08-03 04:27:40 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > 
> > Any suggestions on what to do now?
> 
> This is a good question. We could wait for test from others even if i think the
> bug could be considered as closed. Mathieu, Tom?Could you please test the
> "program reproducing the bug" with and without the patch?


Mirko, I think we should make a decision based also on the code, not only on the resulting behavior. In this case it should be very simple, because the only change in behavior introduced by your patch is the additional "else if" block that I am reporting below. Can you please explain to us why you think this change is needed?


--- a/src/devices/wifi/edca-txop-n.cc	Mon Aug 02 13:15:36 2010 +0200
+++ b/src/devices/wifi/edca-txop-n.cc	Tue Aug 03 10:21:55 2010 +0200
@@ -378,6 +376,10 @@
       StartAccessIfNeeded ();
       NS_LOG_DEBUG ("tx broadcast");
     }
+  else if (m_currentHdr.GetType() == WIFI_MAC_CTL_BACKREQ)
+    {
+      SendBlockAckRequest (m_currentBar);
+    }
   else
     {
Comment 22 Mirko Banchi 2010-08-03 14:17:27 UTC
 
> Mirko, I think we should take a decision based also on the code, not only on
> resulting behavior. In this case it should be very simple, because the only
> change in behavior introduced by your patch is the additional "else if" block
> that I am reporting below. Can you please explain us why you think this change
> is needed?
> 
> 
> --- a/src/devices/wifi/edca-txop-n.cc    Mon Aug 02 13:15:36 2010 +0200
> +++ b/src/devices/wifi/edca-txop-n.cc    Tue Aug 03 10:21:55 2010 +0200
> @@ -378,6 +376,10 @@
>        StartAccessIfNeeded ();
>        NS_LOG_DEBUG ("tx broadcast");
>      }
> +  else if (m_currentHdr.GetType() == WIFI_MAC_CTL_BACKREQ)
> +    {
> +      SendBlockAckRequest (m_currentBar);
> +    }
>    else
>      {

That change is needed because during retransmission of a BAR (block ack request), MacLowTransmissionParameters must be set correctly so that the sender waits for a block ack and not for a normal ack. So, for me this bug is closed. Within a few days I'll push the change to ns-3-dev. Thank you Nicola.
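The point about waiting for the right kind of response can be sketched as follows. This is a standalone illustration, not the ns-3 code: FrameKind, ExpectedResponse and ResponseFor are hypothetical names, and the data frame is assumed to use the normal ack policy.

```cpp
#include <cassert>

// Hypothetical frame and response kinds; these do not mirror ns-3 enums.
enum class FrameKind { Data, BlockAckRequest, Broadcast };
enum class ExpectedResponse { None, NormalAck, BlockAck };

// Choose the response the sender should wait for. The same choice must be
// made on the first transmission and on every retransmission of the frame.
ExpectedResponse ResponseFor(FrameKind kind) {
  switch (kind) {
    case FrameKind::Broadcast:
      return ExpectedResponse::None;       // broadcast frames are not acknowledged
    case FrameKind::BlockAckRequest:
      return ExpectedResponse::BlockAck;   // a BAR is answered by a block ack
    case FrameKind::Data:
      return ExpectedResponse::NormalAck;  // assumption: normal ack policy
  }
  return ExpectedResponse::None;
}
```

In this sketch a retransmitted block ack request simply goes through ResponseFor again, so it cannot end up waiting for a normal ack.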
Comment 23 Bruno Ranieri 2010-08-03 19:25:55 UTC
I apologize for the long delay. I was busy finishing my master's. In the meantime I think I figured out the problem.

The Block Ack defines a sliding window mechanism with a sending buffer for retransmissions, a receiving buffer for reordering and a cache to keep track of already received MPDUs. This cache is missing in the current implementation.

The bug occurs (at least) in the following situation:
a) the recipient receives MPDUs under Block Ack
b) the recipient receives a Block Ack Request
c) the recipient transmits a Block Ack and forwards the MPDUs up
d) the recipient's Block Ack is lost
e) the recipient receives a Block Ack Request for the same Seq Nr range as in (b)

Now the failure happens:
The state at the recipient has changed in the meantime. The MPDUs are forwarded up to MacTxMiddle and no longer in the buffer in MacLow. Thus the recipient re-requests these MPDUs. If one of these MPDUs is received and forwarded up the assertion is violated, since the simulation only transmits pointers to the instances of the packets and the LLC Header is already removed (during the first transmission). MacTxMiddle can't catch the retransmission since it is designed for the asynchronous transmission with the normal Ack.  

The recipient must keep track of already seen MPDUs in a cache and respond in the Block Ack accordingly. The receiving buffer only stores MPDUs received out of order and forwards them up as soon as all missing MPDUs are present or timed out.
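A rough standalone sketch of such a cache, for illustration only. ToyRecipient and its members are hypothetical and do not correspond to the ns-3 classes. The sketch forwards every buffered MPDU when a Block Ack Request arrives and never prunes the cache, both of which are simplifications.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical recipient-side model; not the ns-3 MacLow implementation.
class ToyRecipient {
public:
  // Accept an MPDU. Returns false for a sequence number that was already
  // seen, in which case the duplicate is dropped instead of being buffered.
  bool ReceiveMpdu(std::uint16_t seq) {
    if (!seen_.insert(seq).second) {
      return false;
    }
    buffer_.insert(seq);
    return true;
  }

  // Handle a Block Ack Request covering `count` sequence numbers starting at
  // `startSeq`. Buffered MPDUs are forwarded up (simplified: all of them),
  // and the returned bitmap is built from the cache, not from the buffer.
  std::vector<bool> HandleBlockAckRequest(std::uint16_t startSeq, std::size_t count) {
    delivered_.insert(delivered_.end(), buffer_.begin(), buffer_.end());
    buffer_.clear();
    std::vector<bool> bitmap(count, false);
    for (std::size_t i = 0; i < count; ++i) {
      const std::uint16_t seq =
          static_cast<std::uint16_t>((startSeq + i) % kSeqModulo);
      bitmap[i] = seen_.count(seq) > 0;
    }
    return bitmap;
  }

  const std::vector<std::uint16_t>& Delivered() const { return delivered_; }

private:
  static constexpr std::size_t kSeqModulo = 4096;  // 12-bit sequence number space
  std::set<std::uint16_t> seen_;          // cache of received sequence numbers
  std::set<std::uint16_t> buffer_;        // MPDUs waiting to be forwarded up
  std::vector<std::uint16_t> delivered_;  // what the upper layer has received
};
```

With this model, a second Block Ack Request for the same range (steps d and e above) yields the same bitmap as the first one, and a retransmitted MPDU is recognized as a duplicate rather than forwarded up again.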
Comment 24 Nicola Baldo 2010-08-04 06:15:14 UTC
Mirko, Bruno, thank you very much for your effort in addressing the problem.


(In reply to comment #23)
> Now the failure happens:
> The state at the recipient has changed in the meantime. The MPDUs are forwarded
> up to MacTxMiddle and no longer in the buffer in MacLow. Thus the recipient
> re-requests these MPDUs. If one of these MPDUs is received and forwarded up the
> assertion is violated, since the simulation only transmits pointers to the
> instances of the packets and the LLC Header is already removed (during the
> first transmission). 

It's not true that "the simulation only transmits pointers to the instances of the packets". In fact, YansWifiChannel always calls Packet::Copy() before forwarding packets to the receivers. Does this change your interpretation of what's happening?
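The difference between handing over a pointer and handing over a copy can be illustrated with a small standalone sketch. ToyFrame, DeliverShared and DeliverCopy are hypothetical names, and an 8-byte header is assumed; nothing here is the ns-3 Packet or channel API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical frame type; not the ns-3 Packet class.
struct ToyFrame {
  std::vector<unsigned char> bytes;
};

// Strip `headerSize` bytes from the front of the frame, in place.
void StripHeader(ToyFrame& frame, std::size_t headerSize) {
  if (headerSize > frame.bytes.size()) {
    throw std::out_of_range("header larger than remaining frame");
  }
  frame.bytes.erase(frame.bytes.begin(),
                    frame.bytes.begin() + static_cast<std::ptrdiff_t>(headerSize));
}

// Channel that hands the receiver the sender's own object.
std::shared_ptr<ToyFrame> DeliverShared(const std::shared_ptr<ToyFrame>& frame) {
  return frame;
}

// Channel that hands the receiver a private copy of the frame.
std::shared_ptr<ToyFrame> DeliverCopy(const std::shared_ptr<ToyFrame>& frame) {
  return std::make_shared<ToyFrame>(*frame);
}
```

When the receiver strips the header from a frame obtained through DeliverCopy, the sender's frame keeps its header, so a later retransmission still carries it. Only with DeliverShared would the first reception damage the frame that is retransmitted.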
Comment 25 Nicola Baldo 2010-08-04 06:42:00 UTC
Mirko, Bruno,

it is clear that you are much more expert on the block ack procedure than I am; besides, I will be unreachable for a couple of weeks starting tomorrow, 5 August. So I propose that if you both (Mirko and Bruno) agree on a patch that solves the problem, then Mirko can push it to ns-3-dev. Note that if this happens after the end of the bug fixing period (currently scheduled for 6 August), the approval of the release managers (Tom and Josh) is required.

Thanks again for your effort in resolving this bug!

Regards,

Nicola
Comment 26 Mirko Banchi 2010-08-05 14:25:00 UTC
(In reply to comment #23)
> I apologize for the long delay. I was busy finishing my master. In the meantime
> think i figured out the problem.
> 
> The Block Ack defines a sliding window mechanism with a sending buffer for
> retransmissions, a receiving buffer for reordering and a cache to keep track of
> already received MPDUs. This cache is missing in the current implementation.
> 
> The bug occurs (at least) in the following situation:
> a the recipient receives MPDUs under Block Ack
> b the recipient receives a Block Ack Request
> c the recipient transmits a Block Ack and forwards the MPDUs up
> d the recipients Block Ack is lost
> e the recipient receives a Block Ack Request for the same Seq Nr range as in
> (b)
> 
> Now the failure happens:
> The state at the recipient has changed in the meantime. The MPDUs are forwarded
> up to MacTxMiddle and no longer in the buffer in MacLow. Thus the recipient
> re-requests these MPDUs. If one of these MPDUs is received and forwarded up the
> assertion is violated, since the simulation only transmits pointers to the
> instances of the packets and the LLC Header is already removed (during the
> first transmission). MacTxMiddle can't catch the retransmission since it is
> designed for the asynchronous transmission with the normal Ack.  
> 
> The recipient must keep track of already seen MPDUs in a cache and respond in
> the Block Ack accordingly. The receiving buffer only stores MPDUs received out
> of order and forwards them up as soon as all missing MPDUs are present or timed
> out.

Hi Bruno, I agree with you about the cache mechanism. It is not currently implemented (a patch is welcome :). However, please note that the missing cache is not what crashes the program: packets that have already been forwarded up are removed from the buffer, so during the second buffer scan those packets are no longer there and the rx callback is not called for them. I do agree, though, that if the first block ack response is lost, wrong information about those packets (already forwarded up) is sent in the block ack. What do you think?
Comment 27 Tom Henderson 2010-08-10 02:24:09 UTC
The crasher was fixed, but the remaining issue pointed out by Bruno has been added to new bug 981.