FIX: Second level contrast computation, and dense/sparse transformation issues by adelavega · Pull Request #48 · poldracklab/fitlins · GitHub

FIX: Second level contrast computation, and dense/sparse transformation issues #48


Merged: 12 commits merged into poldracklab:master on Jun 7, 2018

Conversation

adelavega (Collaborator)

@tyarkoni and I discovered that second-level contrasts were not being computed correctly. As previously implemented, there were two major issues:

  1. The first level was outputting stat maps rather than effect sizes (parameter estimates). Those stat maps were being fed to the second level, when it should really receive the effect sizes.
  2. A "model" was being fit at the second level with all contrasts in the same model, whereas each contrast should be independent. In addition, the required "intercept" was problematic because it was absorbing most of the variance, so identity contrasts looked like [1, -1] instead of [1, 0].

From reading the nistats code/docs, it seems as though the intention is to run a separate model for each contrast and to model the contrast as the "intercept" (i.e., the mean), with no additional regressors in the model. This does what you want: a one-sample t-test against 0 for each first-level contrast, operating on the first-level parameter estimates. Images with a weight of 0 in that contrast must be excluded.
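The resulting second-level computation is, in effect, a voxelwise one-sample t-test on the first-level parameter estimates. A minimal numpy sketch of that statistic (illustrative only; fitlins delegates the actual fit to nistats, and the function name here is hypothetical):

```python
import numpy as np

def one_sample_t(effect_maps):
    """Voxelwise one-sample t-test against 0.

    effect_maps: (n_subjects, n_voxels) array of first-level
    parameter estimates (effect sizes, not stat maps).
    Returns the t statistic per voxel (df = n_subjects - 1).
    """
    n = effect_maps.shape[0]
    mean = effect_maps.mean(axis=0)
    se = effect_maps.std(axis=0, ddof=1) / np.sqrt(n)
    return mean / se
```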

@adelavega (Collaborator Author)

I also fixed a couple of other things:

  • Don't drop NAs when reading in event files (perhaps this should be configurable by the user, but I think it's a better default).
  • Sparse events can be densified by transformations, but they still need to be passed to nistats as "sparse" events so that they get convolved, so we can't rely on sparse vs. dense mode to select the right variables. Now we force sparse mode for all variables and select the HRF_variables from them; for dense variables, we request the non-HRF_variables.
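The selection logic in that last bullet can be sketched as follows (`split_variables` is a hypothetical helper for illustration, not fitlins' actual API):

```python
def split_variables(sparse_vars, hrf_variables):
    """Given the names of all variables returned in forced-sparse
    mode, pick out those listed in the model's HRF_variables (to be
    convolved by nistats) and the remainder (confounds that enter
    the design matrix directly, without convolution)."""
    to_convolve = [v for v in sparse_vars if v in hrf_variables]
    confounds = [v for v in sparse_vars if v not in hrf_variables]
    return to_convolve, confounds
```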

@adelavega adelavega changed the title FIX: Second level contrast computation, and other minor issues. FIX: Second level contrast computation, and dense/sparse transformation issues Jun 1, 2018
fitlins/base.py Outdated
else:
raise ValueError("Unknown input: {}".format(in_name))
# else: ### Comment out to skip errors with periods
# raise ValueError("Unknown input: {}".format(in_name))
Collaborator

What errors with periods are you getting?

Collaborator Author

Uhh, can't remember exactly now, but I think if my contrast names had periods in them, it would throw an error. We can comment this back in; the intent was just to skip that one instead of crashing.

Collaborator

Sure. Just trying to figure out if this is a case we should work around. I suspect we generally do not want to accept arbitrary inputs.

fitlins/base.py Outdated
@@ -145,23 +146,21 @@ def second_level(analysis, block, space, deriv_dir):
job_desc.setdefault('contrasts', []).append(desc)

if op.exists(stat_fname):
print(stat_fname)
Collaborator

This should be dropped before merging.


fmri_glm.fit(data, design_matrix=paradigm)
fmri_glm.fit(i_data,
design_matrix=pd.DataFrame({'intercept': intercept}))
Collaborator

Is it impossible for a non-intercept contrast to occur? My original intent was to leave that possibility open.

Collaborator Author

No. We will have to revisit this later (maybe at the hackathon), as it's a bit confusing how nistats works. But see this note in compute_contrast: basically, at the second level it's rare to do anything besides a simple one-sample t-test. That covers 90+% of use cases.

        VERY IMPORTANT: The 'intercept' corresponds to the second level
        effect after taking confounders in consideration. If there are
        no confounders then this will be equivalent to a simple t test.
        By default we compute the 'intercept' second level contrast.

Collaborator

Don't think of this as an intercept in the standard sense; it would be better if the argument were named weights. You can pass any pattern of weights in as the intercept, and it only acts like a traditional intercept in the special case where we want a one-sample t-test on a single set of maps (i.e., the "identity contrast" case).

Collaborator Author

We just ran through this today, so I think it merits more investigation to see if this is the right way to do it. But at least for 1-vs-0 contrasts it should be OK (which is, again, most use cases, since people compute complex contrasts (e.g., [-1, 1]) at the first level and then just want a one-sample t-test of that contrast across the group).

tyarkoni (Collaborator) commented Jun 1, 2018

If you're asking whether we would ever want to pass additional design matrix columns in besides the weights, the answer is no--at least, not when computing contrasts. We will eventually have to implement handling of second-level models (i.e., the stuff represented in the 'model' section of each block), and that's where we'll potentially have a full design matrix that includes multiple columns (including, e.g., second-level covariates like average performance for each subject). But we can ignore this for now, as it's not all that common a use case.

Collaborator Author

Yep, that's my understanding (that wasn't a question at you, just elaboration before I saw your comment)

block = analysis.blocks[0]

entities = []
session_info = []
contrast_info = []
for paradigm, _, ents in block.get_design_matrix(
block.model['HRF_variables'], mode='sparse'):
block.model['HRF_variables'], mode='sparse', force=True):
Collaborator

What's the force=True doing here, again?

Collaborator

By default, you only get back variables that are of the specified type ('sparse' or 'dense'), and the others are ignored. The force argument controls whether variables of the other type are forced to the target representation (i.e., dense to sparse, or sparse to dense).

Collaborator Author

Sparsifying dense events. This handles the case where an HRF_variable was densified in place, which is necessary for certain operations, such as orthogonalization. If nistats allowed dense events to be convolved, this would not be necessary.

Collaborator

I see. Does this work well? I would imagine any orthogonalization that happens pre-convolution would not make something that can be nicely convolved...

Collaborator

If the orthogonalization is supposed to happen in psychological space, then it should be done pre-convolution.
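For reference, pre-convolution orthogonalization is just residualization in "psychological space": regress one regressor out of another before either is convolved. A minimal numpy sketch (hypothetical helper, not pybids/fitlins code):

```python
import numpy as np

def orthogonalize(y, x):
    """Return the component of y orthogonal to x, i.e. the
    residuals of regressing y on x (no intercept term)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = x.dot(y) / x.dot(x)
    return y - beta * x
```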


_, confounds, _ = block.get_design_matrix(mode='dense',
_, confounds, _ = block.get_design_matrix(dense_vars,
Collaborator

Did something change? I thought mode='dense' only returned dense variables...

Collaborator Author

Yes, but if I densified an HRF_variable it will now be dense, and hence returned. And here I only want "confounds" that will not be convolved.

Collaborator

Got it.


contrasts = block.get_contrasts([contrast['name']
for contrast in block.contrasts],
# Make contrast for each HRF_variable
contrasts = block.get_contrasts([con for con in block.model['HRF_variables']],
Collaborator

I'm pretty sure this will stop performing anything except identity contrasts.

Collaborator Author

Oh, sorry, this was just a temporary hack I put in to get identity contrasts for all my HRF variables.

Collaborator

Yeah, I don't think we want this constraint.

**ents)[0][0].T
# Add test indicator column
contrasts['type'] = [contrast['type']
for contrast in block.contrasts]
contrasts['type'] = ['T' for contrast in contrasts]
Collaborator

See this section of #30 for how I'm planning to handle identity contrasts.

Collaborator Author

Also a temporary hack.

Collaborator

I don't think we need to handle identity contrasts any differently from other contrasts... In theory the changes @adelavega made should handle any kind of contrast. The key to getting identity contrasts working with nistats is to drop all the rows that have weights of 0.
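That zero-dropping step can be sketched as follows (hypothetical helper; in practice nistats would receive the resulting one-column DataFrame as its second-level design matrix):

```python
import numpy as np
import pandas as pd

def second_level_design(weights):
    """Return a boolean mask over the input maps (True = keep) and
    the one-column design matrix holding the nonzero contrast
    weights, which nistats treats as the 'intercept'."""
    w = np.asarray(weights, dtype=float)
    keep = w != 0
    return keep, pd.DataFrame({'intercept': w[keep]})
```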

effigies (Collaborator) left a comment

Sorry for the slow turnaround. I think this is almost ready.

@@ -157,14 +157,14 @@ def _run_interface(self, runtime):
selectors = self.inputs.selectors

analysis = ba.Analysis(model=self.inputs.model, layout=layout)
analysis.setup(**selectors)
analysis.setup(drop_na=False, derivatives='only', **selectors)
Collaborator

I think this is saying that it won't use event files in the original BIDS directory? That's going to break probably everything but Neuroscout-based analyses.

Collaborator Author

Oh, I meant not to push that. On that note, though: since I need to set that to "only", could we make that configurable?

I think this is a good example of a feature that would be useful to allow access to via a Python API, but is probably overkill for the CLI.

Collaborator

Sure, how about adding to the input spec:

derivatives = traits.Enum('both', 'ignore', 'only', desc='...')

And adapt this for the desc.

effigies (Collaborator) commented Jun 6, 2018

Also, does the pybids pin need updating?

adelavega (Collaborator Author)

Not for this specifically, although this is the commit I'm using: ca90e9c99cd0fb7db96a924c351138b714ebecb1

effigies (Collaborator) commented Jun 6, 2018

Okay. Once this is cleaned up, I'll pull it and make sure I can still run some non-NeuroScout models.

effigies (Collaborator) commented Jun 6, 2018

I'm probably doing something wrong here, but nothing seems to be happening:

$ docker run -v /data/bids/Sherlock_Merlin:/data:ro \
    -v /data/out/Sherlock_Merlin/derivatives:/out \
    -v /data/scratch/Sherlock_Merlin:/scratch \
    poldracklab/fitlins:neuroscout /data /out dataset \
    -w /scratch -m /data/derivatives/neuroscout/qW/model.json
180606-20:23:32,562 cli INFO:
         
Running FitLins version 0.0.3+113.g0e0ff0a.dirty:
  * Participant list: None.

180606-20:23:32,636 workflow INFO:
         Workflow fitlins_wf settings: ['check', 'execution', 'logging', 'monitoring']

I'll check with current master.

effigies (Collaborator) commented Jun 6, 2018

Yeah, doesn't work with master, either. I guess I'll need to figure out what I did in the past to make this work. Let me know if you see something obvious with the command.

adelavega (Collaborator Author)

Not sure what's up with that, command looks ok to me.

effigies (Collaborator) commented Jun 7, 2018

Ah, a permissions issue that got squashed by the try/except block.

effigies (Collaborator) commented Jun 7, 2018

On Sherlock_Merlin, I run into this:

unknown input: num_colors_0.07

I guess this is why you commented out that try/except? Are periods valid characters in variables? (@tyarkoni?)

effigies (Collaborator) commented Jun 7, 2018

I guess they must be, since they're added by some transformers. I think the issue is probably the underscores.

effigies (Collaborator) commented Jun 7, 2018

Anyway, I can run this on my datasets, but not yours... How are you making this work?

adelavega (Collaborator Author)

Yes, I ran into the periods issue as well, and because of it I changed the variable name in my db to use an underscore. That's also why I got rid of the try/except more generally: it makes things hard to debug.

adelavega (Collaborator Author)

If that's the only outstanding issue, I think this is good to go, since it's working for me. Unless we also want to talk about allowing periods (I think the issue was with the snake-to-camel conversion?)

@effigies effigies merged commit 5bc2dcc into poldracklab:master Jun 7, 2018
@effigies effigies mentioned this pull request Jul 2, 2018