8000 Understanding Lookahead:_compute_cfvs() · Issue #23 · lifrordi/DeepStack-Leduc · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

Understanding Lookahead:_compute_cfvs() #23

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Kiv opened this issue May 16, 2018 · 2 comments
Open

Understanding Lookahead:_compute_cfvs() #23

Kiv opened this issue May 16, 2018 · 2 comments
8000

Comments

@Kiv
Copy link
Kiv commented May 16, 2018

I understand in normal CFR how a parent node computes the CFV, but I'm struggling to understand how this is done in the lookahead's vectorized implementation.

Specifically, in Lookahead:_compute_cfvs():

function Lookahead:_compute_cfvs()
  for d=self.depth,2,-1 do
    local gp_layer_terminal_actions_count = self.terminal_actions_count[d-2]
    local ggp_layer_nonallin_bets_count = self.nonallinbets_count[d-3]

    self.cfvs_data[d][{{}, {}, {}, {1}, {}}]:cmul(self.empty_action_mask[d])
    self.cfvs_data[d][{{}, {}, {}, {2}, {}}]:cmul(self.empty_action_mask[d])

    self.placeholder_data[d]:copy(self.cfvs_data[d])

    --player indexing is swapped for cfvs
    self.placeholder_data[d][{{}, {}, {}, self.acting_player[d], {}}]:cmul(self.current_strategy_data[d])

    torch.sum(self.regrets_sum[d], self.placeholder_data[d], 1)

    --use a swap placeholder to change {{1,2,3}, {4,5,6}} into {{1,2}, {3,4}, {5,6}}
    local swap = self.swap_data[d-1]
    swap:copy(self.regrets_sum[d])
    self.cfvs_data[d-1][{{gp_layer_terminal_actions_count+1, -1}, {1, ggp_layer_nonallin_bets_count}, {}, {}, {}}]:copy(swap:transpose(2,3))
  end

end

So I understand we multiply empty (illegal) actions by the mask so their CFV is zero, but I'm lost about what the swap and transpose is doing, or how slicing the parent's cfvs_data copies things to the right place.

Can anyone explain more clearly?

@DWingHKL
Copy link
DWingHKL commented May 27, 2018

I also not understand transpose(2,3) and the comment “use a swap placeholder to change {{1,2,3}, {4,5,6}} into {{1,2}, {3,4}, {5,6}”

@yffbit
Copy link
yffbit commented Jan 14, 2021

There are two things to note:

  1. Every node in d layer is indexed by [action_id, parent_id, gp_id]. When transitioning to d+1 layer, action_id becomes the new parent_id and [parent_id, gp_id] are combined to generate the new gp_id.

    local next_parent_id = action_id - prev_layer_terminal_actions_count
    local next_gp_id = (gp_id - 1) * gp_nonallinbets_count + (parent_id)

    For new gp_id, the code (gp_id - 1) * gp_nonallinbets_count + (parent_id) means that parent_id axis are higher than gp_id axis. That is to say [parent_id, gp_id] should be transposed to [gp_id, parent_id].
    From d layer to d+1 layer,
    [parent_id, gp_id] --(transpose)--> [gp_id, parent_id] --(change view)--> [gp_id*parent_id]
    From d+1 layer to d layer,
    [gp_id] --(change view)--> [gp_id, parent_id] --(transpose)--> [parent_id, gp_id]

  2. cfvs_data[d][{{}, {}, {}, 1, {}}] stores the cfv data for “player 2” in the lookahead tree, and cfvs_data[d][{{}, {}, {}, 2, {}}] stores the cfv data for “player 1”. That is to say the player indexs are swapped. Note that “player 1” is the first player in lookahead tree, and not really refer to player 1 in the real game.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
0