Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AMBER tests could be clearer #65

Closed
DrDomenicoMarson opened this issue Sep 12, 2022 · 19 comments · Fixed by #71
Closed

AMBER tests could be clearer #65

DrDomenicoMarson opened this issue Sep 12, 2022 · 19 comments · Fixed by #71

Comments

@DrDomenicoMarson
Copy link
Contributor

Hello everyone,
I'm testing my changes with the AMBER parser, and I was thinking about the reasoning behind having just one test "test_invalidfiles" to try a bunch of invalid files, all of which have the same name, having every time to check a descr file to know what was wrong. I thought it would be much easier (as pytest returns the name of the file read and the name of the test that failed) if
a) we rename the files with proper descriptive names (not a good idea I think)
b) divide test_invalidfiles into one test for each file/failure case, in this way it's easier to add new tests for possible invalid files, and the pytest output in case of error would be more informative.

I don't know if this suggestion is against some common behavior or paradigm, as it's the first time I'm using pytest and contributing to a public repository.

@xiki-tempula
Copy link
Collaborator

Thanks for opening an issue. Sorry, I don't quite understand the issue, maybe because I'm not familiar with the amber parser. I wonder if you mind providing an example of test_invalidfiles? Thanks.

@DrDomenicoMarson
Copy link
Contributor Author

Right now we have, in test_amber.py in alchemlyb:

@pytest.fixture(
    scope="module", params=[filename for filename in load_invalidfiles()['data'][0]])
def invalid_file(request):
    return request.param

def test_dHdl_invalidfiles(invalid_file):
    assert extract_dHdl(invalid_file, T=298.0) is None

def test_invalidfiles(invalid_file):
    """
    Test the file validation function to ensure the 
    function returning False if the file is invalid
    """
    assert file_validation(invalid_file, extract_mbar=True) is None

with in alchemtest:

def load_invalidfiles():
    """Load the invalid files.

    Returns
    -------
    data : Bunch
        Dictionary-like object, the interesting attributes are:

        - 'data' : the example of invalid data files
        - 'DESCR': the full description of the dataset

    """

    module_path = dirname(__file__)
    data = [glob(join(module_path, 'invalidfiles/*.out.bz2'))]

    with open(join(module_path, 'invalidfiles', 'descr.rst')) as rst_file:
        fdescr = rst_file.read()

    return Bunch(data=data,
                 DESCR=fdescr)

So essentially all "invalid_files" are loaded and checked, but from the errors reported it's not so easy to understand what it was testing and where it went wrong.

Instead of a common test_invalidfiles() we could have a test for each file in the "invalid_files" directory, with the proper name:

file_contains_no_useful_data
file_contains_no_control_data
file_with_Non_constant_temperature
file_with_no_free_energy_section
file_with_no_ATOMIC_section
file_with_no_RESULTS_section

something along this line... I don't know if it creates any problems, or if it breaks some common-practice!

@xiki-tempula
Copy link
Collaborator

I will close this one as it is not relevant now.

@xiki-tempula
Copy link
Collaborator

xiki-tempula commented Sep 25, 2022

If the concern is that

So essentially all "invalid_files" are loaded and checked, but from the errors reported it's not so easy to understand what it was testing and where it went wrong.

Shall we just name the files as contains_no_useful_data.out.bz2 instead of say invalid-case-1.out.bz2

So instead of

src/alchemlyb/tests/parsing/test_amber.py::test_invalidfiles[/home/runner/micromamba-root/envs/test/lib/python3.8/site-packages/alchemtest/amber/invalidfiles/invalid-case-1.out.bz2]

we got

src/alchemlyb/tests/parsing/test_amber.py::test_invalidfiles[/home/runner/micromamba-root/envs/test/lib/python3.8/site-packages/alchemtest/amber/invalidfiles/contains_no_useful_data.out.bz2]

Which should tell people which test behaving wrongly.

If you felt that this is too ugly we could change the format of load_invalidfiles()

Give that

>>> load_bace_example().data['solvated']['decharge'][0]
'alchemtest/amber/bace_CAT-13d~CAT-17a/solvated/decharge/0.00/ti-0.00.out.bz2'

We could have

>>> load_invalidfiles().data['contains_no_useful_data'][0]
'alchemtest/amber/invalidfiles/invalid-case-6.out.bz2'

My concern is that if we have

file_contains_no_useful_data
file_contains_no_control_data
file_with_Non_constant_temperature
file_with_no_free_energy_section
file_with_no_ATOMIC_section
file_with_no_RESULTS_section

In alchemtest, then these tests are not having the some format as the rest of the files in the repo like those load_simplesolvated, load_bace_example.

@DrDomenicoMarson
Copy link
Contributor Author

I think it is perfectly right to have these files in a different format than the other test files present to test amber. The thing is, these files are used to test completely different things. The files loaded by loadsimplesolvated/load_bace_example are entire data sets, created from some simulation, and represent a "real case scenario".
On the other hand, files that are in "invalid_files" (and now also in "files_to_check") are single, uncorrelated files, that each represent a specific issue that would create some problem in the parsing. These files are needed just to increase the coverage of the amber parser, as they force the parser to take all the code branches possible.

Franckly, some of the errors in the files in here are hard if not impossible to find in real-world scenarios, but I think it's fine to have them in here.

In my opinion, it's not good practice to have a single test that checks all invalid files at once, as we have right now, but we should have a test for each possible invalid file. Moreover, for example, the three test files that I would add in an update (issue #69) are needed to test some code branches that are currently not tested, but that are strictly not "invalid", just that would rise warnings to the user.

@xiki-tempula
Copy link
Collaborator

xiki-tempula commented Sep 25, 2022

Frankly, I'm not too happy with the current structure of the alchemtest as well (see #56 that I raised a year ago), as there are two conflicting goals for this repo.

I think currently, the primary goal of the alchemtest is to offer the user some ready-made dataset to test the alchemlyb or other alchemistry software. The secondary goal is then to provide test files for alchemlyb for the developers to use.

To achieve the primary goal, we don't want to confuse the user with too many datasets when one does from alchemtest.amber import , which is the reason that I don't like the approach of having

  • file_contains_no_useful_data
  • file_contains_no_control_data
  • file_with_Non_constant_temperature
  • file_with_no_free_energy_section
  • file_with_no_ATOMIC_section
  • file_with_no_RESULTS_section

under alchemtest.amber.

My thought is to have all the test files in load_invalidfiles or even load_testfiles if the naming is a bit confusing.

Then in the alchemlyb/tests/parsing/test_amber.py files you could have

@pytest.fixture
def file_contains_no_useful_data():
    return load_invalidfiles().data['file_contains_no_useful_data']

@pytest.fixture
def file_contains_no_control_data():
    return load_invalidfiles().data['file_contains_no_control_data']

def test_contains_no_useful_data(file_contains_no_useful_data):
    '''Test a single invalid file'''
    data = file_contains_no_useful_data
    ...


@pytest.mark.parametrize('dataloader',
                         ['file_contains_no_useful_data',
                          'file_contains_no_control_data'])
def test_invalidfiles(dataloader, request):
    data = request.getfixturevalue(dataloader)
    ...

@DrDomenicoMarson
Copy link
Contributor Author

Now I understand, I viewed alchemtest just as a suite of tests for the developer!

So, right now a user can use from alchemtest.amber import to load a dataset for some test, the only function that load files that are useful only for us developers is now load_invalidfiles.

As far as I understand load_invalidfiles is of no use for the regular user. So if we just change the function to pick a single file inside the directory invalid_files (which can be renamed to testfiles) like that:

def load_invalidfiles():
    """Load a single file to be used to run tests on the amber parser functionalities.

    Returns
    -------
    data : Bunch
        Dictionary-like object, the interesting attributes are:

        - 'data' : the requested file
        - 'DESCR': the full description of the file

    """

    module_path = dirname(__file__)
    data = join(module_path, 'invalidfiles', f'{filename}.out.bz2')

    with open(join(module_path, 'invalidfiles', f'{filename}.descr.rst')) as rst_file:
        fdescr = rst_file.read()

    return Bunch(data=data,
                 DESCR=fdescr)

in this way in the test_amber.py file, instead of calling load_invalidfiles to have all the files, we just call when needed the function with the name of the file we want, like

test_something(filename=load_invalidfiles("file_needed")['data']):
   "do some test with filename"

is that something doable? I don't think it will make the situation worse for the common user.

@xiki-tempula
Copy link
Collaborator

xiki-tempula commented Sep 25, 2022

So, right now a user can use from alchemtest.amber import to load a dataset for some test, the only function that load files that are useful only for us developers is now load_invalidfiles.

Yes.

I'm happy with grabbing the test file like

test_something(filename=load_invalidfiles("file_needed")['data']):
   "do some test with filename"

But I struggle to see how would

def load_invalidfiles():
    """Load a single file to be used to run tests on the amber parser functionalities.

    Returns
    -------
    data : Bunch
        Dictionary-like object, the interesting attributes are:

        - 'data' : the requested file
        - 'DESCR': the full description of the file

    """

    module_path = dirname(__file__)
    data = join(module_path, 'invalidfiles', f'{filename}.out.bz2')

    with open(join(module_path, 'invalidfiles', f'{filename}.descr.rst')) as rst_file:
        fdescr = rst_file.read()

    return Bunch(data=data,
                 DESCR=fdescr)

Achieve this?

@DrDomenicoMarson
Copy link
Contributor Author

Sorry, my bad, the function declaration should be

def load_invalidfiles(filename):
.... ....

@xiki-tempula
Copy link
Collaborator

xiki-tempula commented Sep 25, 2022

I think this is a good API call. I'm not sure if the doc would be correctly generated but if the doc is fine, I think this is a reasonable way of calling the test.

@orbeckst
Copy link
Member

orbeckst commented Sep 27, 2022

I fully support better naming the keys of the disparate test files.

I can also get behind renaming load_invalidfiles() to load_testfiles() — we are still on 0.x, so we can make breaking changes. And I doubt this is a particularly well-used function except in alchemlyb. However, in the interest of a smooth transition, I'd want to keep load_invalidfiles() around with a deprecation note and warning for a 0.7 release and then we can remove it for the next one.

As #65 (comment) stated, we want to keep our namespace reasonably clean so I'd like to keep a single accessor for all the test files, i.e., one DESCR etc.

I'd make the argument of load_testfiles(singlefilekey=None) optional and return the whole bunch by default. This maintains the standard behavior (for None) and otherwise pulls out the specific file.

I don't quite see how load_testfiles(singlefilekey=None)['data'] is much better than load_testfiles()[singlefile]['data'], though. If you all think it's better then I won't block it, though.

@xiki-tempula
Copy link
Collaborator

My original proposal in #65 (comment) is that the single test file could be accessed through
load_testfiles().data['file_contains_no_useful_data']

Without reading through this PR, I don't have time right now. I think in #65 (comment), @DrDomenicoMarson is suggesting that the same file could be accessed through load_testfiles('file_contains_no_useful_data').data

I think the advantage of load_testfiles().data['file_contains_no_useful_data'] is that we would have the structure largely unchanged but the disadvantage is that the description for all of the test files has to be in the same description of the bunch object.

Bunch(data={'test_1': 'test_1.xvg',
            'test_2': 'test_2.xvg',}
      DESCR='''test_1: test_1;
               test_2: test_2''')

The way I see load_testfiles('file_contains_no_useful_data').data is that it has the advantage of allowing each test file to have its own description in the bunch file but I'm not sure if this is compatible with RTD.

if key == 'test_1':
    return Bunch(data='test_1.xvg',
          DESCR='''test_1''')
elif key == 'test_2':
    return Bunch(data='test_2.xvg',
          DESCR='''test_2''')

@orbeckst
Copy link
Member

orbeckst commented Sep 27, 2022

load_testfiles().data['file_contains_no_useful_data'] changes the API that we defined in the contribution docs. There we state that we get a list of files in data. EDIT: <-- wrong, it's a dict and the items are lists.

The currently API-compatible way to get single files is load_testfiles()['file_contains_no_useful_data'].data[0].

EDIT: The above is incorrect, should be load_testfiles().data['file_contains_no_useful_data'][0]

@orbeckst
Copy link
Member

Is breaking the API and potentially having to fix the docs build worth the effort of not writing [0]?

@xiki-tempula
Copy link
Collaborator

xiki-tempula commented Sep 27, 2022

load_testfiles().data['file_contains_no_useful_data'] changes the API that we defined in the contribution docs. There we state that we get a list of files in data.

No, we don't.

>>> from alchemtest.amber import load_bace_example
>>> type(load_bace_example().data)
dict

In fact, I think all the XXX.data returns a dictionary?

@orbeckst
Copy link
Member

orbeckst commented Sep 27, 2022

Sorry, you're right.
https://alchemtest.readthedocs.io/en/latest/contributing.html#process says

The accessor function makes the dataset available as a dict under the data key in the Bunch. The data are typically another dict with different parts of a calculation such as Coulomb and VDW parts being different keys in a dictionary. All files that are needed for a single free energy calculation are in a list under the appropriate key. The description text is added the DESCR key.

So accessing a single file would be load_testfiles().data['file_contains_no_useful_data'][0].

@orbeckst
Copy link
Member

#65 (comment)

In my opinion, it's not good practice to have a single test that checks all invalid files at once, as we have right now, but we should have a test for each possible invalid file.

I agree.

Moreover, for example, the three test files that I would add in an update (issue #69) are needed to test some code branches that are currently not tested, but that are strictly not "invalid", just that would rise warnings to the user.

It makes sense to create a new load_testfiles() that includes both the invalid files and the new ones. Deprecate load_invalid() for removal in the release after 0.7 (i.e. 0.8).

@xiki-tempula
Copy link
Collaborator

Is breaking the API and potentially having to fix the docs build worth the effort of not writing [0]?

Sorry for the confusion. When I say load_testfiles().data['file_contains_no_useful_data'], I mean load_testfiles().data['file_contains_no_useful_data'][0]. I omitted [0] for simplicity as it is not the focus of the discussion.

@orbeckst
Copy link
Member

Ok, fully agree with you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants