Testing the calculations in format.c automatically · Issue #9 · ltratt/multitime · GitHub
Testing the calculations in format.c automatically #9
Open
@snim2

Matthew Howell, one of my undergraduate students, has reported what might be a bug in multitime. When timing a very short-running program with a reasonable number of runs (e.g. n=30), the resulting confidence intervals were very large: larger than the mean wall-clock time.

This might not be a bug: the CI may be correct, but without seeing the times of the individual runs it is difficult to tell. On the other hand, there may be an error in the CI calculations, or my code may not be using the t-value and z-value LUTs correctly.
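To illustrate how a correct CI can exceed the mean, here is a small sketch using only the Python standard library. The timings are made up (29 fast runs plus one slow outlier), not Matthew's data, and the t-value is the standard table entry for 29 degrees of freedom rather than anything read from multitime's LUTs.

```python
import math
import statistics

# Hypothetical wall-clock times in seconds: 29 fast runs and one slow outlier.
times = [0.001] * 29 + [0.100]

n = len(times)
mean = statistics.mean(times)
stdev = statistics.stdev(times)  # sample standard deviation (n - 1 divisor)

# Two-sided 95% Student's t critical value for n - 1 = 29 degrees of freedom,
# taken from a standard t table.
T_95_DF29 = 2.045

# Half-width of the confidence interval around the mean.
half_width = T_95_DF29 * stdev / math.sqrt(n)

print(f"mean = {mean:.4f}s, 95% CI = +/- {half_width:.4f}s")
# prints: mean = 0.0043s, 95% CI = +/- 0.0067s
```

With data like this, a single slow run is enough to push the interval past the mean, so the reported behaviour is at least plausible without any bug in the calculation.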

It is difficult to see how to track down this potential bug in a way that lets us reuse the work for regression testing later, without incurring the huge overhead of adding many unit tests now. Here is one (slightly eccentric) idea:

  1. Split the calculations in void format_other(Conf) into smaller units that can be tested separately, with separate functions for calculating means, CIs, etc.
  2. Create a separate repository called python-multitime which uses cffi to expose the functions created in Step 1 to Python.
  3. In this repository (?), write a number of assertions against the functions from Step 1 that can be checked using Python quickcheck. The advantage is that we can compare the standard deviations, confidence intervals, etc. from multitime against those generated by scipy or similar, as a ground truth.
  4. In this repository, create a .travis.yml file which uses python-multitime from Step 2 and Python quickcheck to test the functions from Step 1 automatically.
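As a rough sketch of what the assertions in Step 3 might look like, the example below checks small statistics helpers against a reference on randomly generated inputs. Everything in it is an assumption made for illustration: the helper names (`mean`, `sample_stddev`, `ci_half_width`) are hypothetical pure-Python stand-ins for the C functions that Step 1 would produce, the standard library's `statistics` module stands in for scipy as the ground truth, and a seeded `random.Random` loop stands in for quickcheck. A real version would call the C code through cffi and would also need to check the t-value and z-value lookups, for example against `scipy.stats.t.ppf`.

```python
import math
import random
import statistics


def mean(xs):
    """Arithmetic mean (hypothetical stand-in for a C helper)."""
    return sum(xs) / len(xs)


def sample_stddev(xs):
    """Sample standard deviation with an n - 1 divisor (hypothetical stand-in)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def ci_half_width(xs, critical_value):
    """CI half-width: critical value times the standard error of the mean."""
    return critical_value * sample_stddev(xs) / math.sqrt(len(xs))


def check_against_reference(trials=200, seed=0):
    """Compare the helpers with the statistics module on random timing-like data."""
    rng = random.Random(seed)
    z_95 = 1.96  # fixed critical value; this loop tests the arithmetic, not the LUTs
    for _ in range(trials):
        n = rng.randint(2, 100)
        xs = [rng.uniform(0.0001, 10.0) for _ in range(n)]
        ref_mean = statistics.mean(xs)
        ref_sd = statistics.stdev(xs)
        ref_ci = z_95 * ref_sd / math.sqrt(n)
        assert math.isclose(mean(xs), ref_mean, rel_tol=1e-9)
        assert math.isclose(sample_stddev(xs), ref_sd, rel_tol=1e-9, abs_tol=1e-12)
        assert math.isclose(ci_half_width(xs, z_95), ref_ci, rel_tol=1e-9, abs_tol=1e-12)
    return trials


if __name__ == "__main__":
    print(f"{check_against_reference()} random trials passed")
```

Fixing the seed keeps any failure reproducible, which matters if the same checks are later run automatically as in Step 4.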

This is a bit messy, and it would probably be neater to do everything in C...
