issues Search Results · repo:sfujim/BCQ language:Python

15 results

I think size with an initial value of 0 can be problematic when sampling with ind = np.random.randint(0, self.size, size=batch_size). What do you think?
  • Yigit-Kuyu
  • Opened 
    on Sep 17, 2024
  • #17
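
To illustrate the concern raised in #17, here is a minimal sketch. It assumes a simplified buffer, and TinyBuffer, unguarded_sample, and the one-dimensional state are hypothetical names made up for this example, not the repository's classes. NumPy's np.random.randint(0, 0, ...) raises a ValueError because the range is empty, so sampling before any transition has been added fails; one possible guard is an explicit check on size.

```python
import numpy as np


def unguarded_sample(size, batch_size):
    # Mirrors the call quoted in the issue; raises ValueError when size == 0.
    return np.random.randint(0, size, size=batch_size)


class TinyBuffer:
    """Hypothetical stand-in for a replay buffer (illustration only)."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.size = 0  # number of stored transitions; the buffer starts empty
        self.ptr = 0   # next write position
        self.state = np.zeros((max_size, 1))

    def add(self, state):
        self.state[self.ptr] = state
        self.ptr = (self.ptr + 1) % self.max_size
        self.size = min(self.size + 1, self.max_size)

    def sample(self, batch_size):
        # Guard: fail with a clear message instead of NumPy's "low >= high".
        if self.size == 0:
            raise RuntimeError("cannot sample from an empty buffer")
        ind = np.random.randint(0, self.size, size=batch_size)
        return self.state[ind]
```

Whether this matters in practice depends on whether sample is ever called before the buffer has been filled or loaded; the guard only makes that failure easier to diagnose.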

Traceback (most recent call last): File "C:\Users\user\Desktop\BCQ-master\BCQ-master\discrete_BCQ\main.py", line 297, in <module> train_BCQ(env, replay_buffer, is_atari, num_actions, state_dim, device, args, ...
  • guest-oo
  • Opened 
    on May 18, 2024
  • #16

Hello, I am currently studying offline reinforcement learning and came across BCQ. It's a great work worth delving into. However, I have some questions regarding the paper that I'd like to clarify and ...
  • awecefil
  • Opened 
    on Dec 14, 2023
  • #15

Hi, I've been reading the original and discretised BCQ papers and wanted to ask if the original BCQ algorithm could be applied to an environment with discrete actions. I'm probably misinterpreting sections ...
  • VinalAsodia
  • 1
  • Opened 
    on May 31, 2023
  • #14

is this? i_loss = F.nll_loss(imt, action.reshape(-1)) thx!
  • zichuan-liu
  • 1
  • Opened 
    on Oct 29, 2022
  • #13

imt = (imt/imt.max(1, keepdim=True)[0] > self.threshold).float() should be imt = (imt/imt.max(1, keepdim=True) > self.threshold).float()
  • tangbotony
  • 1
  • Opened 
    on Dec 15, 2021
  • #12

I am trying to reproduce the results for the continuous environments, but the results are poor. Could you please give more details about the results? For example, what is the result when we run python main.py ...
  • SZH1230456
  • 1
  • Opened 
    on Sep 26, 2021
  • #11

Hi, I think BCQ addresses the extrapolation error very well, and I'm curious about how to plot Figure 1.f as shown in the paper, based on the collected offline batch. Many thanks for your reply.
  • ReinholdM
  • 1
  • Opened 
    on Sep 9, 2021
  • #10

First of all, thanks for sharing the source code. My question is whether the done condition was used incorrectly in the discrete action branch? https://github.com/sfujim/BCQ/blob/4876f7e5afa9eb2981feec5daf67202514477518/discrete_BCQ/discrete_BCQ.py#L137 ...
  • XuJing1022
  • Opened 
    on Aug 26, 2021
  • #9

Following your settings, I can't train a behavioral agent because the DQN model doesn't converge. I want to know whether I made a mistake or there is something wrong in your code. Can you help me?
  • wadx2019
  • Opened 
    on Jul 9, 2021
  • #8