FIX: Second level contrast computation, and dense/sparse transformation issues #48
Conversation
In addition, I fixed a couple of other things:
fitlins/base.py
Outdated
else:
    raise ValueError("Unknown input: {}".format(in_name))
# else: ### Comment out to skip errors with periods
#     raise ValueError("Unknown input: {}".format(in_name))
What errors with periods are you getting?
Uhh, can't remember exactly now, but I think if my contrast names had periods in them, it would throw an error. We can comment this back in; the intent was just to skip that one instead of crashing.
Sure. Just trying to figure out if this is a case we should work around. I suspect we generally do not want to accept arbitrary inputs.
fitlins/base.py
Outdated
@@ -145,23 +146,21 @@ def second_level(analysis, block, space, deriv_dir):
job_desc.setdefault('contrasts', []).append(desc)

if op.exists(stat_fname):
    print(stat_fname)
This should be dropped before merging.
fmri_glm.fit(data, design_matrix=paradigm)
fmri_glm.fit(i_data,
             design_matrix=pd.DataFrame({'intercept': intercept}))
Is it impossible for a non-intercept contrast to occur? My original intent was to leave that possibility open.
No. We will have to revisit this later (maybe at the Hackathon), as it's a bit confusing how nistats works. But see this note in compute_contrast: basically, at the second level it's rare to do anything besides a simple one-sample t-test. That covers 90+% of use cases.
VERY IMPORTANT: The 'intercept' corresponds to the second level
effect after taking confounders in consideration. If there are
no confounders then this will be equivalent to a simple t test.
By default we compute the 'intercept' second level contrast.
Don't think of this as an intercept in the standard sense; it would be better if the argument were named weights. You can pass any pattern of weights in as the intercept, and it only acts like a traditional intercept in the special case where we want a one-sample t-test on a single set of maps (i.e., the "identity contrast" case).
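To make the "weights, not intercept" point concrete, here is a minimal pandas sketch. The subject filenames and the +1/-1 group coding are invented for illustration, and the nistats calls appear only as comments:

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: one first-level contrast map per subject
# (filenames are made up for illustration).
maps = ['sub-%02d_stat.nii.gz' % i for i in range(1, 7)]

# One-sample t-test: the "intercept" is simply a column of ones.
one_sample = pd.DataFrame({'intercept': np.ones(len(maps))})

# But any weight pattern can go in the same slot, e.g. a two-group
# comparison coded as +1/-1 (first three subjects vs. the last three).
group_diff = pd.DataFrame({'intercept': [1, 1, 1, -1, -1, -1]})

# With nistats (not executed here), either frame would be passed as:
#   model = SecondLevelModel().fit(maps, design_matrix=one_sample)
#   stat_map = model.compute_contrast(second_level_contrast='intercept')
```

Only in the all-ones case does the column behave like a traditional intercept; otherwise it is just a contrast weight vector that happens to live in a column named 'intercept'.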
We just ran through this today, so I think it merits more investigation to see whether this is the right way to do it. But at least for 1-vs-0 contrasts it should be OK (which is, again, most use cases, since people typically compute complex contrasts (e.g. -1, 1) at the first level, then just want a one-sample t-test of that contrast across the group).
If you're asking whether we would ever want to pass additional design matrix columns in besides the weights, the answer is no--at least, not when computing contrasts. We will eventually have to implement handling of second-level models (i.e., the stuff represented in the 'model' section of each block), and that's where we'll potentially have a full design matrix that includes multiple columns (including, e.g., second-level covariates like average performance for each subject). But we can ignore this for now, as it's not all that common a use case.
Yep, that's my understanding (that wasn't a question at you, just elaboration before I saw your comment)
block = analysis.blocks[0]

entities = []
session_info = []
contrast_info = []
for paradigm, _, ents in block.get_design_matrix(
        block.model['HRF_variables'], mode='sparse'):
        block.model['HRF_variables'], mode='sparse', force=True):
What's the force=True doing here, again?
By default, you only get back variables that are of the specified type ('sparse' or 'dense'), and others are ignored. The force argument controls whether variables of the other type are forced to the target representation (i.e., dense to sparse, or sparse to dense).
Sparsifying dense events. This is for the case where an HRF_variable was densified in place, which is necessary for certain operations, such as orthogonalization. If nistats allowed dense events to be convolved, this would not be necessary.
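The mode/force selection described above can be sketched as a toy function. This is an illustration of the behavior being discussed, not pybids internals, and all variable names and representations below are invented:

```python
def select_variables(variables, mode, force=False):
    """Toy stand-in for the mode/force behavior described above -- NOT
    pybids' actual implementation. `variables` maps a variable name to a
    (representation, values) pair, where representation is 'sparse' or
    'dense'."""
    selected = {}
    for name, (representation, values) in variables.items():
        if representation == mode:
            selected[name] = values
        elif force:
            # In pybids the mismatched variable would be converted here
            # (dense -> sparse, or sparse -> dense); the toy just keeps it.
            selected[name] = values
    return selected

# A sparse event variable alongside an HRF_variable that was densified
# in place (e.g. by an orthogonalization transformation):
variables = {
    'stimulus': ('sparse', [('onset', 0.0)]),
    'stimulus_ortho': ('dense', [0.1, 0.9, 0.2]),
}
```

With mode='sparse' alone, the densified variable is silently dropped; force=True is what brings it back (converted) so it can still be convolved downstream.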
I see. Does this work well? I would imagine any orthogonalization that happens pre-convolution would not make something that can be nicely convolved...
If the orthogonalization is supposed to happen in psychological space, then it should be done pre-convolution.
_, confounds, _ = block.get_design_matrix(mode='dense',
_, confounds, _ = block.get_design_matrix(dense_vars,
Did something change? I thought mode='dense' only returned dense variables...
Yes, but if I densified an HRF_variable, it will now be dense and hence returned. And here I only want "confounds" that will not be convolved.
Got it.
The confusion here is that pybids divorces semantics from syntax internally, whereas nistats doesn't. I.e., to pybids, a confound variable that shouldn't be convolved with the HRF can be represented as either sparse or dense. But nistats requires confounds to be passed in as dense, and HRF-convolved variables to be passed in as sparse. I think this is kind of suboptimal, but we're stuck with it for now. So all we're doing here is ensuring that we're retrieving all variables that shouldn't be convolved (in dense format), whether they're internally stored as sparse or dense.
fitlins/interfaces/bids.py
Outdated
contrasts = block.get_contrasts([contrast['name']
                                 for contrast in block.contrasts],
# Make contrast for each HRF_variable
contrasts = block.get_contrasts([con for con in block.model['HRF_variables']],
I'm pretty sure this will stop performing anything except identity contrasts.
Oh, sorry, this was just a temporary hack I put in to get identity contrasts for all my HRF variables.
Yeah, I don't think we want this constraint.
fitlins/interfaces/bids.py
Outdated
**ents)[0][0].T
# Add test indicator column
contrasts['type'] = [contrast['type']
                     for contrast in block.contrasts]
contrasts['type'] = ['T' for contrast in contrasts]
See this section of #30 for how I'm planning to handle identity contrasts.
Also a temporary hack.
I don't think we need to handle identity contrasts any differently from other contrasts... In theory the changes @adelavega made should handle any kind of contrast. The key to getting identity contrasts working with nistats is to drop all the rows that have weights of 0.
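The zero-weight-dropping step amounts to a one-line pandas filter. A minimal sketch, with invented column names standing in for whatever the contrast weights are indexed by:

```python
import pandas as pd

# Hypothetical identity-contrast weights over three design columns
# (names invented for illustration): only 'stimulus' enters the contrast.
weights = pd.Series({'stimulus': 1.0, 'rt': 0.0, 'motion_x': 0.0})

# Drop the zero-weight rows before handing the contrast to nistats.
nonzero = weights[weights != 0]
```

After filtering, only the rows that actually contribute to the contrast remain, which is what nistats expects for an identity contrast.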
Sorry for the slow turnaround. I think this is almost ready.
fitlins/interfaces/bids.py
Outdated
@@ -157,14 +157,14 @@ def _run_interface(self, runtime):
selectors = self.inputs.selectors

analysis = ba.Analysis(model=self.inputs.model, layout=layout)
analysis.setup(**selectors)
analysis.setup(drop_na=False, derivatives='only', **selectors)
I think this is saying that it won't use event files in the original BIDS directory? That's probably going to break everything but Neuroscout-based analyses.
Oh, I meant not to push that. On that note, though, since I need to set that to "only", could we make that configurable?
I think this is a good example of a feature that would be useful to allow access to via a Python API, but is probably overkill for the CLI.
Sure, how about adding to the input spec:
derivatives = traits.Enum('both', 'ignore', 'only', desc='...')
And adapt this for the desc.
Also, does the pybids pin need updating?
Not for this specifically, although this commit is what I'm using
Okay. Once this is cleaned up, I'll pull it and make sure I can still run some non-NeuroScout models.
I'm probably doing something wrong here, but nothing seems to be happening:
I'll check with current master.
Yeah, doesn't work with master, either. I guess I'll need to figure out what I did in the past to make this work. Let me know if you see something obvious with the command.
Not sure what's up with that, command looks ok to me.
Ah, a permissions issue that got squashed by the
On Sherlock_Merlin, I run into this:
I guess this is why you commented out that
I guess they must be, since they're added by some transformers. I think the issue is probably the underscores.
Anyway, I can run this on my datasets, but not yours... How are you making this work?
Yes, I ran into the periods issue as well, and because of it I changed the variable name in my db to an underscore. And that's why I got rid of the
If that's the only outstanding issue, I think this is good to go, since it's working for me. Unless we also want to talk about allowing periods (I think the issue was with the snake_case to CamelCase conversion?)
@tyarkoni and I discovered that second level contrasts were not being computed correctly. As it was previously done, there were two major issues:
From reading the nistats code/docs, it seems as though the intention is to run a separate model for each contrast, and to model the contrast as the "intercept" (i.e. the mean), without additional regressors in the model. This does what you want: a one-sample t-test against 0 for each first-level contrast, operating on the first-level parameter estimates. Images with a weight of 0 in that contrast must be excluded.
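Putting the pieces together, the fixed approach can be sketched as follows. The filenames, weights, and contrast name are invented for illustration, and the nistats calls are shown only as comments (the testable part is the exclusion and design-matrix construction):

```python
import numpy as np
import pandas as pd

# Hypothetical bookkeeping: each subject's first-level map for one
# contrast, plus the weight it contributes (0 = that subject's run
# lacks the contrast and must be excluded).
maps = pd.DataFrame({
    'img': ['sub-01.nii.gz', 'sub-02.nii.gz', 'sub-03.nii.gz', 'sub-04.nii.gz'],
    'weight': [1.0, 0.0, 1.0, 1.0],
})

# One model per contrast: drop zero-weight images, then fit an
# intercept-only model (a one-sample t-test against 0).
included = maps[maps['weight'] != 0]
design = pd.DataFrame({'intercept': np.ones(len(included))})

# With nistats (not executed here):
#   model = SecondLevelModel().fit(list(included['img']), design_matrix=design)
#   stat_map = model.compute_contrast(second_level_contrast='intercept')
```

Repeating this loop once per first-level contrast gives a separate intercept-only second-level model for each, which matches the "separate model per contrast" reading of the nistats docs above.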