issues Search Results · repo:sfujim/BCQ language:Python
I think size with an initial value of 0 can be problematic when sampling with ind = np.random.randint(0, self.size,
size=batch_size). What do you think?
Yigit-Kuyu
- Opened on Sep 17, 2024
- #17
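
The concern in this issue can be reproduced with a minimal sketch. The ReplayBuffer class below is a hypothetical stand-in, not the repository's implementation: the fields max_size, ptr, and state, and the explicit empty-buffer guard, are assumptions made for illustration. It relies only on the documented behavior that np.random.randint raises ValueError when low >= high, which is what happens when self.size is still 0.

```python
import numpy as np


class ReplayBuffer:
    """Hypothetical minimal buffer for illustration; not the BCQ repository's class."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self.ptr = 0   # next write position
        self.size = 0  # number of stored transitions, starts at 0
        self.state = np.zeros((max_size, 1))

    def add(self, state):
        # Overwrite the oldest entry once the buffer is full.
        self.state[self.ptr] = state
        self.ptr = (self.ptr + 1) % self.max_size
        self.size = min(self.size + 1, self.max_size)

    def sample(self, batch_size):
        # Assumed guard: without it, np.random.randint(0, 0, ...) raises
        # a less descriptive ValueError because low >= high.
        if self.size == 0:
            raise ValueError("cannot sample from an empty replay buffer")
        ind = np.random.randint(0, self.size, size=batch_size)
        return self.state[ind]
```

Under these assumptions, sampling is only a problem if sample is called before any transition has been added; once add has been called at least once, self.size is positive and the index range is valid.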
Traceback (most recent call last): File "C:\Users\user\Desktop\BCQ-master\BCQ-master\discrete_BCQ\main.py", line 297, in
<module> train_BCQ(env, replay_buffer, is_atari, num_actions, state_dim, device, args, ...
guest-oo
- Opened on May 18, 2024
- #16
Hello, I am currently studying offline reinforcement learning and came across BCQ. It's great work and worth delving into.
However, I have some questions regarding the paper that I'd like to clarify and ...
awecefil
- Opened on Dec 14, 2023
- #15
Hi,
I've been reading the original and discretised BCQ papers and wanted to ask if the original BCQ algorithm could be
applied to an environment with discrete actions. I'm probably misinterpreting sections ...
VinalAsodia
- 1
- Opened on May 31, 2023
- #14
imt = (imt/imt.max(1, keepdim=True)[0] > self.threshold).float()
should be
imt = (imt/imt.max(1, keepdim=True) > self.threshold).float()
tangbotony
- 1
- Opened on Dec 15, 2021
- #12
I am trying to reproduce the results for the continuous environments, but the results are poor. Could you please give more
details about the results? For example, what is the result when we run python main.py ...
SZH1230456
- 1
- Opened on Sep 26, 2021
- #11
Hi, I think BCQ addresses the extrapolation error very well. I'm curious about how to plot Figure 1.f as shown in the
paper, based on the collected offline batch. Many thanks for your reply.
ReinholdM
- 1
- Opened on Sep 9, 2021
- #10
First of all, thanks for sharing the source code.
My question is whether the done condition was used incorrectly in the discrete action
branch? https://github.com/sfujim/BCQ/blob/4876f7e5afa9eb2981feec5daf67202514477518/discrete_BCQ/discrete_BCQ.py#L137 ...
XuJing1022
- Opened on Aug 26, 2021
- #9
Following your settings, I can't train a behavioral agent because the DQN model doesn't converge. I want to know whether I
made a mistake or whether there is something wrong in your code. Can you help me?
wadx2019
- Opened on Jul 9, 2021
- #8
