Understanding Lookahead:_compute_cfvs() #23

Kiv · 2018-05-16T11:28:34Z

I understand in normal CFR how a parent node computes the CFV, but I'm struggling to understand how this is done in the lookahead's vectorized implementation.

Specifically, in Lookahead:_compute_cfvs():

function Lookahead:_compute_cfvs()
  for d=self.depth,2,-1 do
    local gp_layer_terminal_actions_count = self.terminal_actions_count[d-2]
    local ggp_layer_nonallin_bets_count = self.nonallinbets_count[d-3]

    self.cfvs_data[d][{{}, {}, {}, {1}, {}}]:cmul(self.empty_action_mask[d])
    self.cfvs_data[d][{{}, {}, {}, {2}, {}}]:cmul(self.empty_action_mask[d])

    self.placeholder_data[d]:copy(self.cfvs_data[d])

    --player indexing is swapped for cfvs
    self.placeholder_data[d][{{}, {}, {}, self.acting_player[d], {}}]:cmul(self.current_strategy_data[d])

    torch.sum(self.regrets_sum[d], self.placeholder_data[d], 1)

    --use a swap placeholder to change {{1,2,3}, {4,5,6}} into {{1,2}, {3,4}, {5,6}}
    local swap = self.swap_data[d-1]
    swap:copy(self.regrets_sum[d])
    self.cfvs_data[d-1][{{gp_layer_terminal_actions_count+1, -1}, {1, ggp_layer_nonallin_bets_count}, {}, {}, {}}]:copy(swap:transpose(2,3))
  end

end

So I understand we multiply empty (illegal) actions by the mask so their CFV is zero, but I'm lost about what the swap and transpose is doing, or how slicing the parent's cfvs_data copies things to the right place.

Can anyone explain more clearly?

The text was updated successfully, but these errors were encountered:

DWingHKL · 2018-05-27T15:48:28Z

I also not understand transpose(2,3) and the comment “use a swap placeholder to change {{1,2,3}, {4,5,6}} into {{1,2}, {3,4}, {5,6}”

yffbit · 2021-01-14T15:13:07Z

There are two things to note:

Every node in d layer is indexed by [action_id, parent_id, gp_id]. When transitioning to d+1 layer, action_id becomes the new parent_id and [parent_id, gp_id] are combined to generate the new gp_id.

DeepStack-Leduc/Source/Lookahead/lookahead_builder.lua

Lines 235 to 236 in da416f9

    
           local next_parent_id = action_id - prev_layer_terminal_actions_count 
        
           local next_gp_id = (gp_id - 1) * gp_nonallinbets_count + (parent_id)

For new gp_id, the code (gp_id - 1) * gp_nonallinbets_count + (parent_id) means that parent_id axis are higher than gp_id axis. That is to say [parent_id, gp_id] should be transposed to [gp_id, parent_id].
From d layer to d+1 layer,
[parent_id, gp_id] --(transpose)--> [gp_id, parent_id] --(change view)--> [gp_id*parent_id]
From d+1 layer to d layer,
[gp_id] --(change view)--> [gp_id, parent_id] --(transpose)--> [parent_id, gp_id]

cfvs_data[d][{{}, {}, {}, 1, {}}] stores the cfv data for “player 2” in the lookahead tree, and cfvs_data[d][{{}, {}, {}, 2, {}}] stores the cfv data for “player 1”. That is to say the player indexs are swapped. Note that “player 1” is the first player in lookahead tree, and not really refer to player 1 in the real game.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Understanding Lookahead:_compute_cfvs() #23

Understanding Lookahead:_compute_cfvs() #23

Understanding Lookahead:_compute_cfvs() #23

Understanding Lookahead:_compute_cfvs() #23

Comments