Bugzilla – Full Text Bug Listing

| Summary: | [patch] fixes for opening stats file with OMNeT++ | | |
|---|---|---|---|
| Product: | ns-3 | Reporter: | Tom Henderson <tomh> |
| Component: | general | Assignee: | Joe Kopena <tjkopena> |
| Status: | RESOLVED FIXED | | |
| Severity: | enhancement | CC: | andras, craigdo, faker.moatamri, mathieu.lacage, ns-bugs, rudolf.hornig, tjkopena |
| Priority: | P3 | | |
| Version: | ns-3-dev | | |
| Hardware: | All | | |
| OS: | All | | |
| Attachments: | patch to fix | | |
The new file format now contains a name and a context (i.e. where the value was collected). The format is:

    scalar context name value

Statistics (count, sum, mean, min, max) can also be collected. The format is:

    statistic context name
    field stattype1 value1
    field stattype2 value2

Example:

    run run-100-1
    attr experiment "wifi-distance-test"
    attr strategy "wifi-default"
    attr measurement "100"
    attr description ""
    attr "author" "tjkopena"
    scalar . measurement "100"
    scalar node[0] wifi-tx-frames 30
    scalar node[1] wifi-rx-frames 30
    scalar node[0] sender-tx-packets 30
    scalar node[1] receiver-rx-packets 30
    statistic node[0] tx-pkt-size
    field count 30
    field sum 1920
    field mean 64
    field min 64
    field max 64
    scalar . delay-count 30
    scalar . delay-total 5889990
    scalar . delay-average 196333
    scalar . delay-max 196333
    scalar . delay-min 196333

I am curious: is there official documentation somewhere for the OMNeT++ file format?

(In reply to comment #2)
Yes, it's an appendix to the OMNeT++ documentation.
http://www.omnetpp.org/doc/omnetpp40/manual/usman.html#sec477
If there's interest, it can be expanded and turned into a standalone specification. There's also an alternative implementation in Java:
http://www.omnetpp.org/omnetpp/doc_details/2194-jresultwriter-09

Some explanation to the patch. The syntax for recording scalar values in the file is this:

    scalar <moduleName> <statisticName> <value>

where moduleName is supposed to denote the component the statistic comes from, like "the 802.11 MAC module of host 12", and the statisticName should be something like "num pks received" or "avg delay". moduleName in OMNeT++ is in a dotted notation ("net.host[12].wifi.mac"); in ns3 it could be a name like "/NodeList/0/ApplicationList/0/$Sender/Tx" -- it doesn't really matter. Global statistics can be recorded under the name of the root object, or using the abbreviation ".".
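As a rough illustration of the scalar syntax just described, here is a minimal Python sketch. It is not part of the patch: the helper name `format_scalar` and the rule of quoting tokens that contain spaces are assumptions made for this example only.

```python
def format_scalar(module_name, statistic_name, value):
    """Build one 'scalar <moduleName> <statisticName> <value>' line.

    Hypothetical helper for illustration; not ns-3 or OMNeT++ API.
    """
    # An empty module name is treated as a global statistic, written as "."
    module = module_name if module_name else "."

    def quote(token):
        # Assumption: tokens containing spaces are wrapped in double quotes
        token = str(token)
        return f'"{token}"' if " " in token else token

    return f"scalar {quote(module)} {quote(statistic_name)} {value}"


print(format_scalar("node[0]", "wifi-tx-frames", 30))
print(format_scalar("", "delay-count", 30))
```

The point of the sketch is only the field order: the component name comes first, then the statistic name, then the value.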
The current ns3 code generates lines like this:

    scalar wifi-tx-frames count 30
    scalar wifi-rx-frames count 30
    scalar sender-tx-packets count 30
    scalar receiver-rx-packets count 30

So the name is written where the moduleName should be, and the statisticName is "count". This should be like this:

    scalar ??? wifi-tx-frames 30
           ^ moduleName should come here

The patch largely addresses this issue. moduleName (i.e. its ns3 equivalent, a hierarchical name) was not available in the original C++ code where the file was written, so we had to change the code to take it there. It became the "context" parameter.

The other change is how statistical summary info (like count, total, avg, min, max, stddev) is recorded. Currently it's like this in the ns3 code:

    scalar tx-pkt-size count 30
    scalar tx-pkt-size total 1920
    scalar tx-pkt-size average 64
    scalar tx-pkt-size max 64
    scalar tx-pkt-size min 64

Here again, moduleName is missing, but if we introduce it, where should count, total, min and max be written? There are two ways. A simple solution is to treat them as independent statistics, like this:

    scalar <moduleName> tx-pkt-size-count 30
    scalar <moduleName> tx-pkt-size-total 1920
    scalar <moduleName> tx-pkt-size-average 64
    scalar <moduleName> tx-pkt-size-max 64
    scalar <moduleName> tx-pkt-size-min 64

The file format offers a better solution, and we went for that:

    statistic <moduleName> tx-pkt-size
    field count 30
    field sum 1920
    field min 64
    field max 64

This is why the StatisticalSummary interface was introduced in the code, MinMaxAvgTotalCalculator was made to implement it, and the OutputStatistic() method was added.

Hope this makes it easier to interpret the changes introduced by the patch.

Hi all,

In some ways there are two parts to this discussion: the context, and the statistical summary. They're somewhat separate, but not totally. I am not very familiar with OMNeT++, and the documentation link above is slightly different from the drafts used when this stat package was originally written.
That said, I don't think the addition of the context is obviously the best move, nor an obviously bad one. For each recorded entry, there are arguably at least four pieces of data: the source, the variable, the statistic, and the value. For example, the source might be node[0], the variable tx-frames-size, the statistic minimum, and the value 128. OMNeT++ gives three places to shove those four elements. In the original code, the source and the variable were presumed to be shoved into the key field of OutputSingleton(), and therefore the module name in OMNeT++ terms. The proposed patch shoves the variable and the statistic into the variable field of OutputSingleton(), and therefore the statistic name in OMNeT++ terms.

In many respects I think the differences are basically a wash. Which one is easier to work with depends entirely on what kind of data you are collecting, and what tools and techniques you are applying to analyze it.

I would say that the primary shortcoming of the statistics package is that it does not deal with this issue more flexibly, i.e., by providing a scheme to arbitrarily tag data. The difficulties to overcome in doing so are developing the appropriate framework to both present a common API and shove that kind of flexible format into disparate formats. For example, it might be easy to accommodate that in a DB, but it seems like more choices would have to be made to work with OMNeT++. In this case, the choice was made as described in the above paragraph.

The reason the original code bundled the elements as described rather than as in the proposed code was in part to avoid the problems with the proposed StatisticalSummary interface. Note that, as implemented, the StatisticalSummary introduces several cases of multiple inheritance. I may have missed something in the past few months, but my previous understanding was that there was a concerted effort in ns-3's design approach to avoid such structures.
More generally, the problem with the StatisticalSummary is that it seems to obviate the rationale for structuring the data output mechanism as it is. The reason for the less-than-straightforward control structure with its hierarchy of data collectors, callbacks, and inverse control is to disconnect the output from the actual statistics being collected. The output format doesn't have to know whether it's a simple average, a maximum, a confidence interval, etc. It just takes what the collector has to give, which the collector in turn identifies for the analysis scripts via the variable field in OutputSingleton().

The proposed StatisticalSummary hardcodes the output formats to a fixed set of statistics. Granted, it includes a scheme to not implement some statistics, somewhat awkwardly signalled via a NaN result rather than a typing mechanism. But it doesn't allow new statistics to be output without modifying all of the output formats and their common interface. That reduces the flexibility of the scheme, and makes most of the structure an unnecessary complication that could be simplified if that sort of hardcoding is acceptable.

The actual patch is well done, fairly thorough, and adds some good documentation and so on, but I think it contains a few design decisions that are not clearly better than what's there, and that degrade some of the features, so hopefully there are better ways to meet the objectives.

Thoughts?

Thx

(In reply to comment #5)
Joe, thanks for your comment. It was quite long and since I'm not sure I got the message, I'll just reflect with some general thoughts here.

Let me begin by stating the obvious: the purpose of this particular ns3 feature is to generate output files that are consistent with those generated by OMNeT++, and can be processed with the OMNeT++ Analysis tool. Now, it is not of particular interest to me what code is used to create those result files.
We have just noticed that the existing code generates result files which are not entirely consistent with OMNeT++-generated files (for whatever reason; my guess is that the file format spec we wrote wasn't clear or detailed enough, and got misunderstood), and we wanted to improve on that. The most constructive way seemed to be submitting a patch. Obviously, files of the same format can be generated in several ways; it really makes no difference to me what code or whose code makes it into ns3, as long as the output is consistent with OMNeT++ files. Feel free to use or not use the patch, change it or rewrite it from scratch -- anything is fine with us, as long as the resulting code writes files similar to those the patch produces.

As for the file format specification: we intend to improve it (clarify etc.), and make it available as a standalone spec (i.e. not part of the OMNeT++ manual). We believe the file format is stable and flexible enough that we won't need to touch it for years, and new stuff (if needed) can be added without breaking backwards compatibility.

PS. A note about the "statistical summary" interface: a similar concept is also part of the Apache Math library, see
http://commons.apache.org/math/api-1.0/org/apache/commons/math/stat/descriptive/StatisticalSummary.html

I noticed this bug got categorized as "Enhancement". I think that is misleading, because the currently generated files have problems, and cannot be loaded into the OMNeT++ result analyser. So this issue is a bug of some sort, and not an enhancement. Just my 2c.

I have been through the patch and I have 2 remarks:
- .project file should never be included in any patch or any repository, this file should always be kept private
- Why in wifi-example-db.sh you changed 'name' to 'variable'? Is there any reason why you did that?
Thanks

(In reply to comment #8)
> I have been through the patch and I have 2 remarks:
> - .project file should never be included in any patch or any repository, this
> file should always be kept private

In general, yes; however, here the .project file is somewhat useful for the end user as well. The OMNeT++ IDE is based on Eclipse, and the IDE can only use the ANF file (part of the patch) if it is within an Eclipse project. Of course the end user could create the project himself/herself as well.

Of course you decide whether .project is useful enough to be checked in. We included it in the patch because it's easier to delete than create.

> - Why in wifi-example-db.sh you changed 'name' to 'variable'? Is there any
> reason why you did that?

This seems to be accidental, we didn't change the database output. Sorry about it.

> In general, yes; however, here the .project file is somewhat useful for the end
> user as well. The OMNeT++ IDE is based on Eclipse, and IDE can only use the ANF
> file (part of the patch) if it is within an Eclipse project. Of course the end
> user could create the project himself/herself as well.
>
> Of course you decide whether .project is useful enough to be checked in. We
> included it in the patch because it's easier to delete than create.

Ok, since you have a good explanation for that, I don't mind leaving the .project file in the patch.

> This seems to be accidental, we didn't change the database output. Sorry about
> it.

Please fix this and push the patch to the repository. If you don't have enough rights to push it into the repository, send me the latest patch and I will apply it.

Thanks
Best regards
Faker Moatamri

Patch applied, changeset: 5502:04acce3f7133
Created attachment 538
patch to fix

Contributed by Andras Varga and Rudolf Hornig:

We found some issues when we opened the generated files in the OMNeT++ analysis tool, and we made some changes in the code to rectify them. Please find attached the patch! This patch is against version 4620:5d448b8564e5.

Changes in the contrib/stats folder:
- We have introduced a new base class called StatisticalSummary for MinMaxAvgTotalCalculator, so it can return the calculated values for DataOutputCallback::OutputStatistic. This change allows the gathering and writing out of min/max/avg/total values for the same attribute. The output variable names are: name-count, name-min, name-max, name-total, name-sqrsum, name-stddev (the last two are unimplemented in the data collector).
- For the DataOutputCallback::OutputSingleton method, we have changed the signature and introduced a CONTEXT parameter, and the former KEY parameter is now used as the variable name. In general, CONTEXT is now used to specify where the statistics are gathered (i.e. it can be "." for global stats or can be like "node[1].wifi" or whatever naming schema is used). The KEY attribute is now used to code the name of the statistic, like "length", "dropcount" etc. When creating a DataCollector, one should specify both the CONTEXT and the KEY. This change was necessary to generate a "more correct" omnet result file, because the unpatched version writes the variable name where the module path (CONTEXT) should be, and module path info was not present at all.
- Of course DataCalculators now have a SetContext method.

In the examples folder:
- A new wifi-example-omnet.sh was added; it runs the same example but generates omnet output. (This file must be made executable in the repo.)
- A minor modification to wifi-example-db.sh was made (the SQL query was modified to reflect the fact that the variable names are now stored in the variable column instead of the name column).
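To make the summary output described above more concrete, here is a minimal Python sketch of writing one statistic block from a set of summary values. It is not the patch's C++ code: the helper name `format_statistic`, the dictionary-based interface, and the choice to skip NaN values (used here to mark unimplemented statistics) are assumptions for illustration.

```python
import math


def format_statistic(context, name, fields):
    """Return the lines of a 'statistic' block for one summary.

    `fields` maps field names (count, sum, mean, ...) to numbers.
    Hypothetical helper for illustration; not the ns-3 API.
    """
    lines = [f"statistic {context} {name}"]
    for field, value in fields.items():
        # Assumption: a NaN value marks a statistic the calculator does
        # not implement, so no field line is written for it.
        if isinstance(value, float) and math.isnan(value):
            continue
        lines.append(f"field {field} {value}")
    return lines


summary = {
    "count": 30,
    "sum": 1920,
    "mean": 64,
    "min": 64,
    "max": 64,
    "stddev": float("nan"),  # treated as unimplemented in this sketch
}
print("\n".join(format_statistic("node[0]", "tx-pkt-size", summary)))
```

With the values from the example file, this yields the statistic header for node[0] followed by one field line per implemented statistic; the stddev entry produces no line.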