issues Search Results · repo:databricks/megablocks language:Python

65 results (52 ms)

I am training a ~520 M model, but I have found that the megablocks moe version uses substantially more memory and takes longer to train than a dense model of corresponding size. I am using a model embedding ...
  • samuelwheeler
  • 1
  • Opened 9 days ago
  • #166

Same as https://github.com/databricks/megablocks/issues/159 ~Solution is to install megablocks without causing a torch reinstall. (In particular, changing to torch =2.0.0 doesn't work, but torch =2.6.0 ...
  • ad8e
  • Opened 15 days ago
  • #165

[rank5]: Traceback (most recent call last): [rank5]: File "/root/Stanford-Megatron-LM/pretrain_gpt.py", line 154, in <module> [rank5]: pretrain(train_valid_test_datasets_provider, model_provider, ...
  • rtmadduri
  • Opened on Feb 8
  • #164

When running the grouped gemm implementation with expert parallelism, I get the following error: [rank5]: File /env/lib/python3.11/site-packages/megablocks-0.8.0.dev0-py3.11-linux-x86_64.egg/megablocks/layers/glu.py ...
  • cassanof
  • 2
  • Opened on Jan 3
  • #163

I have been trying to run some of the exp training code on nvcr.io/nvidia/pytorch:23.09-py3. However, I seem to keep getting errors regardless of the scripts. After some testing, it seems that even running ...
  • kevin3567
  • Opened on Nov 14, 2024
  • #161

Hi there, thanks for the amazing work! I found expert parallel is not compatible with the distributed optimizer in the fork version of Megatron-LM here: https://github.com/stanford-futuredata/Megatron-LM/blob/85f95aef3b648075fe6f291c86714fdcbd9cd1f5/megatron/arguments.py#L352-L356 ...
  • Spico197
  • 1
  • Opened on Nov 11, 2024
  • #160

Hi there, there seems to be an error with the newer pytorch (I already relaxed the 2.4.1 constraint in setup.py): 10 # Wrap this in a try-block with better error message and 11 # instructions ...
  • jramapuram
  • 5
  • Opened on Oct 19, 2024
  • #159

I am trying to set up and use megablocks to train MoE models, but I see the following error: Traceback (most recent call last): File /n/holyscratch01/dam_lab/brachit/moes/megablocks/third_party/Megatron-LM/pretrain_gpt.py ...
  • RachitBansal
  • 4
  • Opened on Oct 11, 2024
  • #157

Thanks for your work. I checked the code and found that the MLP only has w1 and w2. Does the sparse MLP support bias? Thanks a lot! I want to initialize the MLP from our original MLP with bias. Do you have ...
  • maobenz
  • 1
  • Opened on Oct 9, 2024
  • #156

Hello, I have tried to use megablocks on V100 + pytorch2.4.0+cu121, but I get the error "cannot support bf16". If I use megablocks in fp32, I get the error "group gemm must use bf16". So I changed my environment ...
  • Guodanding
  • 14
  • Opened on Oct 8, 2024
  • #155