8000 Structure of future attributes · Issue #419 · cclib/cclib · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Structure of future attributes #419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
berquist opened this issue Aug 18, 2017 · 8 comments
Open

Structure of future attributes #419

berquist opened this issue Aug 18, 2017 · 8 comments

Comments

@berquist
Copy link
Member

This is for discussion of how attributes will be represented in the future. Some relevant starting points are #398 (comment) and #227.

@langner langner added this to the v2.x milestone Aug 20, 2017
@langner
Copy link
Member
langner commented Aug 20, 2017

I think the design constraints are something like this:

  • Attributes are extensible, in order to parse and provide access to variantss that are rare or program-specific, but nonetheless clearly the same attribute. The two examples we've run into so far have been atomic charges (which is currently a dictionary) and transition properties (now being discussed in ORCA TD 8000 DFT and ROCIS Parsing #398).
  • Variants are accessible directly as part of the data structure (ccData), although possibly by calling some method if that makes sense.

It's pretty clear it'll be hard to make this change backwards compatible, so let's plan this for a 2.x branch.

It might make sense to think about the recently added metadata as well - should it remain a dictionary or adhere to this new paradigm (whatever it is)?

@langner
Copy link
Member
langner commented Aug 20, 2017

Some discussion from #227:

I'm opposed to creating multiple attributes for the same "thing". All these representation as, in principle, equivalent.

How about we just go forward with generalizing our attributes into classes? We could do it for this attribute now, and move atomcharges and whatever else would be backwards incompatible in 2.x. This will take > some coding, but seem fairly straightforward. Actually, I see several options:

  • turn attributes into classes (changes API a lot but provides lots of flexibility)
  • make attributes into function calls (changes API, allows us to hide multiplicity)
  • subclass int/float/ndarray/whatever and modify getitem (does not change API, hides multiplicity)

I can try and make a mockup based on top of this PR. I don't like 3, because now is the only chance to change the API. I can't visualize what 2 would look like, is there anything it could do that 1 couldn't? I'm currently in favor of 1.

I think I'm leaning towards 1 as well since it's the most flexible, but I wanted to consider the alternatives.

@ATenderholt any other ideas or suggestions at this point? What do you think about this endeavor in general?

@berquist
Copy link
Member Author

I have started working on this, for something to discuss: https://github.com/berquist/cclib/compare/cjson...berquist:qcjson?expand=1

Each attribute should eventually have its own to_molden(), to_cjson(), to_qcjson(), or whatever specialized format we want conversion to, and then the writers will collect all the available relevant attributes and put them in the right hierarchy; the formatting will be abstracted to the attribute classes. They may also contain information about the hierarchy.

Currently, __call__ is used to make an attribute instance work like we have attributes now, as combined getter/setter. What is currently

        # Static polarizability from **PROPERTIES/.POLARI.
        if line.strip() == "Static polarizabilities (au)":
            if not hasattr(self, 'polarizabilities'):
                self.polarizabilities = []
            polarizability = []
            self.skip_lines(inputfile, ['d', 'b', 'directions', 'b'])
            for _ in range(3):
                line = next(inputfile)
                # Separate possibly unspaced huge negative polarizability tensor
                # element and the left adjacent column from each other.
                line = line.replace('-', ' -')
                polarizability.append(line.split()[1:])
            self.polarizabilities.append(numpy.array(polarizability))

would become

        # Static polarizability from **PROPERTIES/.POLARI.
        if line.strip() == "Static polarizabilities (au)":
            if not hasattr(self, 'polarizabilities'):
                self.polarizabilities = attributes.polarizabilities()
                self.polarizabilities([])
            polarizability = []
            self.skip_lines(inputfile, ['d', 'b', 'directions', 'b'])
            for _ in range(3):
                line = next(inputfile)
                # Separate possibly unspaced huge negative polarizability tensor
                # element and the left adjacent column from each other.
                line = line.replace('-', ' -')
                polarizability.append(line.split()[1:])
            polarizabilities = self.polarizabilities() + [numpy.array(polarizability)]
            self.polarizabilities(polarizabilities)

This is not good, but I'm still learning about private class methods. I also still need to handle type conversion for "compound" types (like lists of arrays).

8000

@berquist
Copy link
Member Author

Desired behavior notes:

  • In order to not require rewriting the current parsers, the current method of setting attributes should still work. For example,
if not hasattr(self, 'polarizabilities'):
    self.polarizabilities = []
polarizability = []
for _ in range(3):
    line = next(inputfile)
    polarizability.append(line.split()[1:])
self.polarizabilities.append(polarizability)

where self.polarizabilities is initially a list of "3 by 3" lists of strings, that then gets converted to a list of NumPy ndarrays of shape [3, 3].

  • Type checking and conversion are moved into the attribute class instances rather than ccData and Logfile.
  • In ccData, when preparing the dictionary of allowed attributes, if each attribute isn't an instance of a class of the same name as the attribute, create it by passing the current attribute to an instance of the class:
assert isinstance(attribute_classes, dict)
assert isinstance(attribute_instances, dict)
for attribute_name, attribute_instance in attribute_instances:
    assert isinstance(attribute_name, str)
    # The keys of attribute_instances should be a subset of the keys of attribute_classes
    assert attribute_name in attribute_classes
    if not isinstance(attribute_instance, attribute_classes[attribute_name])):
        attribute_instances[attribute_name] = attribute_classes[attribute_name](attribute_instance)
    else:
        # The constructor already does type checking/conversion if passed an argument.
        attribute_instance.typecheck()
  • Once the final ccData instance has been created, what should data.polarizabilities give?
    • Default Python behavior gives the attribute directly, which would be the class instance.
    • Backwards compatibility would give the int, double, ndarray, list of ..., ...

@berquist
Copy link
Member Author

I have a general implementation of attribute classes, but still need to modify all the parsers. Here are my current thoughts.

If each attribute is a property on ccData, then getters and setters can be written in order to maintain backwards compatibility where we have syntax like self.atomcoords = ... and ... = self.atomcoords[] rather than having self.atomcoords() = ... and ... = self.atomcoords()[] or similar. This might avoid requiring parser changes to start with.

If we want to do this dynamically, as we do now, there is some trickery that needs to be done according to https://stackoverflow.com/a/1355444/, but I haven't tested it.

@berquist
Copy link
Member Author

Can we have a 2.x branch that backwards incompatible features get merged into?

@berquist
Copy link
Member Author

Poke for #419 (comment).

@langner
Copy link
Member
langner commented Jun 29, 2020

Poke for #419 (comment).

Yep: https://github.com/cclib/cclib/tree/2.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants
0