Issues search results · repo:databricks/megablocks language:Python
65 results
I am training a ~520M model, but I have found that the megablocks MoE version uses substantially more memory and takes longer to train than a dense model of corresponding size. I am using a model embedding ...
samuelwheeler
- 1
- Opened 9 days ago
- #166
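A rough way to see why the MoE variant in #166 would need noticeably more memory than a dense model is to compare raw FFN parameter counts: with N experts, the expert weights (and their optimizer state) are replicated roughly N times. The sketch below is plain PyTorch, not megablocks code, and the hidden/FFN sizes and expert count are illustrative assumptions, not values from the issue.

# Illustrative FFN parameter-count comparison (plain PyTorch, not megablocks code).
# hidden/ffn sizes and expert count are assumed values, not taken from the issue.
import torch.nn as nn

hidden, ffn, num_experts = 1024, 4096, 8

dense_ffn = nn.Sequential(nn.Linear(hidden, ffn), nn.GELU(), nn.Linear(ffn, hidden))
moe_ffn = nn.ModuleList(
    [nn.Sequential(nn.Linear(hidden, ffn), nn.GELU(), nn.Linear(ffn, hidden))
     for _ in range(num_experts)]
)

def count(m):
    return sum(p.numel() for p in m.parameters())

print(f"dense FFN params: {count(dense_ffn):,}")                       # ~8.4M
print(f"MoE FFN params ({num_experts} experts): {count(moe_ffn):,}")   # ~67M, plus optimizer state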
Same as https://github.com/databricks/megablocks/issues/159
Solution is to install megablocks without causing a torch reinstall. (In particular, changing to torch==2.0.0 doesn't work, but torch==2.6.0 ...
ad8e
- Opened 15 days ago
- #165
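Whether the fix in #165 worked hinges on which torch build actually ends up installed next to megablocks; a quick sanity check is to print the resolved versions after installation. This is a generic sketch, not a command from the issue, and the version strings in the comments are examples only.

# Sanity-check that installing megablocks did not silently swap out the torch build.
# The versions shown in the comments are examples, not values from the issue.
from importlib.metadata import version

import torch

print("torch (package):", version("torch"))       # e.g. 2.6.0+cu124
print("torch (runtime):", torch.__version__)
print("megablocks:", version("megablocks"))
print("CUDA available:", torch.cuda.is_available())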
[rank5]: Traceback (most recent call last):
[rank5]:   File "/root/Stanford-Megatron-LM/pretrain_gpt.py", line 154, in <module>
[rank5]: pretrain(train_valid_test_datasets_provider, model_provider, ...
rtmadduri
- Opened on Feb 8
- #164
When running the grouped GEMM implementation with expert parallelism, I am faced with the following error:
[rank5]: File /env/lib/python3.11/site-packages/megablocks-0.8.0.dev0-py3.11-linux-x86_64.egg/megablocks/layers/glu.py ...
cassanof
- 2
- Opened on Jan 3
- #163
I have been trying to run some of the exp training code on nvcr.io/nvidia/pytorch:23.09-py3. However, I seem to keep
getting errors regardless of the scripts. After some testing, it seems that even running ...
kevin3567
- Opened on Nov 14, 2024
- #161
Hi there, thanks for the amazing work! I found that expert parallelism is not compatible with the distributed optimizer in the forked version of Megatron-LM here:
https://github.com/stanford-futuredata/Megatron-LM/blob/85f95aef3b648075fe6f291c86714fdcbd9cd1f5/megatron/arguments.py#L352-L356 ...
Spico197
- 1
- Opened on Nov 11, 2024
- #160
Hi there, there seems to be an error with the newer PyTorch (I already relaxed the 2.4.1 constraint in setup.py):
# Wrap this in a try-block with better error message and
# instructions ...
jramapuram
- 5
- Opened on Oct 19, 2024
- #159
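The setup.py comment quoted in #159 asks for the import to be wrapped in a try-block with a better error message; a generic sketch of that pattern is below. The module being guarded (torch) matches the context, but the message text is a placeholder, not megablocks' actual code.

# Generic try-block import guard with an actionable error message.
# The wording of the message is a placeholder, not taken from megablocks.
try:
    import torch  # noqa: F401
except ImportError as e:
    raise ImportError(
        "megablocks requires a working PyTorch installation before it can build. "
        "Install a matching torch build first (see https://pytorch.org), then "
        "re-run the installation."
    ) from e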
I am trying to set up and use megablocks to train MoE models, but I see the following error:
Traceback (most recent call last):
File /n/holyscratch01/dam_lab/brachit/moes/megablocks/third_party/Megatron-LM/pretrain_gpt.py ...
RachitBansal
- 4
- Opened on Oct 11, 2024
- #157
Thanks for your work. I checked the code and found the MLP only has w1 and w2. Does the sparse MLP support a bias?
Thanks a lot! I want to initialize the MLP from our original MLP with bias. Do you have ...
maobenz
- 1
- Opened on Oct 9, 2024
- #156
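For the bias question in #156, one common workaround when the target MLP exposes only bare weight tensors (called w1 and w2 in the snippet) is to copy the weights from an existing nn.Linear-based MLP and accept that the bias terms are dropped. The sketch below is plain PyTorch; the tensor shapes and (ffn, hidden) layout are assumptions for illustration, not megablocks' actual parameter layout.

# Hypothetical weight copy from a biased nn.Linear MLP into bare w1/w2 tensors.
# Shapes and the (ffn, hidden) layout are illustrative assumptions only.
import torch
import torch.nn as nn

hidden, ffn = 1024, 4096
src_fc1 = nn.Linear(hidden, ffn)   # original MLP layer, has a bias
src_fc2 = nn.Linear(ffn, hidden)

w1 = torch.empty(ffn, hidden)      # bias-free target weights
w2 = torch.empty(hidden, ffn)

with torch.no_grad():
    w1.copy_(src_fc1.weight)       # the biases of src_fc1/src_fc2 are simply dropped
    w2.copy_(src_fc2.weight)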
Hello, I have tried to use megablocks on a V100 with pytorch2.4.0+cu121, but get the error "cannot support bf16". If I use
megablocks in fp32, I get the error "group gemm must use bf16". So I changed my environment ...
Guodanding
- 14
- Opened on Oct 8, 2024
- #155
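The two errors in #155 are consistent with the hardware: V100 is compute capability 7.0 and has no bf16 support, while the grouped GEMM path expects bf16 inputs. A quick way to probe the environment before picking a dtype (plain PyTorch calls, not a megablocks API) is:

# Probe whether the current GPU supports bf16 before enabling grouped-GEMM paths.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print("compute capability:", f"{major}.{minor}")            # V100 reports 7.0
    print("bf16 supported:", torch.cuda.is_bf16_supported())    # False on V100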
