Initial support for concatenated archives, and memory fixes by jhol · Pull Request #88 · USCiLab/cereal · GitHub

Initial support for concatenated archives, and memory fixes #88


Open
wants to merge 9 commits into master
Conversation

jhol
Copy link
Contributor
@jhol jhol commented Mar 31, 2014

This branch does the following...

Memory management fixes

Fixes some memory management issues. Currently cereal has a mixture of strings handled with std::string and strings handled with char*. The use of raw-pointers leads to some counter-intuitive behaviour. For example:

if (foo) {
  const std::string s("test");
  a.setNextName(s.c_str());
}

Typically the caller would expect this to work, because setNextName would take a copy of the string, but it doesn't. In this case there will be a memory error after s goes out of scope. The first 5 patches address this issue by replacing some of cereal's raw pointers with std::strings.

Support for concatenated archives

The JSON and XML input archives currently read in the whole of an input stream until they hit the end of the file. In some cases this is undesirable. For example a file may contain more data after an archive, or it may be received via some kind of socket stream, or via a UNIX pipe. In these cases it is incorrect to read to the end of the file. The correct behavior is to read to the end of the XML or JSON document.

This branch addresses the problem, first by adding a test case to demonstrate it: the test writes two archives into a single stream, then checks that both are read out correctly.

Following the test case, I have modified XMLInputArchive to add support.

Currently if two XML archives are written to an output stream the following output will be generated:

<?xml version="1.0" encoding="utf-8"?>
<cereal>
        <value0>Apples

Pears</value0>
</cereal>

<?xml version="1.0" encoding="utf-8"?>
<cereal>
        <value0>Oranges

Lemons</value0>
</cereal>

Here we are outputting one archive with the string Apples\n\nPears and another with the string Oranges\n\nLemons. cereal's XML output will insert an empty new-line after each XML document.

Currently RapidXML does not provide a mechanism by which it can read only up to the end of the document, so my modification reads until it finds an empty line, then attempts to parse the accumulated XML. If the parse fails, this indicates that we have hit an empty line inside the XML document and that we should continue reading. The drawback of this approach is that the document must be re-parsed every time the input stream emits a blank line, which might become problematic in an XML document containing many empty lines.

There is still some work remaining to fix JSONInputArchive similarly so that it passes the test. It may be possible to do this via RapidJSON's internal stream support, or by counting the number of { and } braces until the tally returns to zero.

@AzothAmmo
Copy link
Contributor

We might be moving away from both RapidXML and RapidJSON in the next release (see #82 and #83), which would affect both changes you've proposed here, though I'm not entirely sure how significant the impact would be.

Can you describe the situation in which you ran into issues using the current setNextName implementation?

@jhol
Copy link
Contributor Author
jhol commented Apr 1, 2014

The setNextName issue we encountered occurs when outputting to XML: you create a temporary C-string (e.g. something from std::string::c_str()) that gets deleted before XMLOutputArchive::NodeInfo::getValueName is executed. At that point the name string has been de-allocated by the application, but the ghost data may still be hanging around in memory, which is why this issue is mostly invisible unless you run the application inside efence or valgrind. Also, none of the test suites call this function right now.

The resolution is to copy the string inside setNextName, which is a more conventional pattern IMO. If setNextName isn't going to copy the string, there should be a big warning in the documentation that the user must keep the string alive.

With that change in place for XMLOutputArchive, I applied a similar modification to XMLInputArchive, JSONInputArchive and JSONOutputArchive. Perhaps the change does not need to be applied to the input archives - in which case those patches can be excluded.

This branch is kind-of like a buffet - take from it what you want.

With respect to the concatenation patches, I would be keen to see the test case included at the very least. The patches to XMLInputArchive are also quite low impact, and not unreasonable in the interim, especially if RapidXML is going to be replaced soon.

Finally, 179d839 should be applied straight away. This is a simple fix for a memory bug caused by a temporary string going out of scope.

@AzothAmmo
Copy link
Contributor

I merged 179d839 into develop.

@jhol
Copy link
Contributor Author
jhol commented Apr 3, 2014

@AzothAmmo thanks for pulling that commit. What should we do about the others, though?

Is there somewhere regular contributors meet, such as a mailing list or an IRC channel, so that we can discuss these things?

@AzothAmmo
Copy link
Contributor

We recently made a mailing list: https://groups.google.com/forum/#!forum/cerealcpp. It hasn't been used much yet.

As far as the other issues, I think they should be investigated in tandem with migrating to PugiXML, especially your desire to be able to have more than one archive in a given stream. I think the migration itself will likely not be a large amount of work.

@AzothAmmo AzothAmmo added this to the v2.0.0 milestone May 4, 2014