(Re-)Add content on generative AI #697
Thank you!
Thank you for your pull request 😃
🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}. If you have files that automatically render output (e.g. R Markdown), then you should check for the following:
Rendered Changes
🔍 Inspect the changes: https://github.com/swcarpentry/python-novice-gapminder/compare/md-outputs..md-outputs-PR-697
The following changes were observed in the rendered markdown documents:
What does this mean?
If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible. This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.
⏱️ Updated at 2025-03-28 15:02:55 +0000
Quoting the comment received before #695 was merged. @rowleya wrote:
A difficult topic. Licensing concerns are tricky. It may be worth exploring how such tools are used ethically in educational settings, particularly for programming, and whether in future they could be incorporated into Carpentries lessons. One need not use LLMs; machine learning models and editor add-ons can also be helpful. These might be explored in a separate lesson piloted through the Incubator first, though.
Thanks @bkmgit. Indeed I would like to help get some lessons like you described through the Incubator. E.g. I think we could do a really nice follow-up to DC Image Processing, exploring ML methods for image analysis.
do we want to link to sources on any of this? I can help provide them, but do not want to put them in if the goal is to not link out
more broadly, this might not fit here, but also, part of why learning to write code to do analyses is important is because LLMs do not reliably answer mathematical questions and for data privacy (as a counter to the idea a person might have to upload their data to a chatbot and ask it to do the analysis)
Very nice @tobyhodges! Thanks!
Co-authored-by: David Pérez-Suárez <dps.helio@gmail.com> Co-authored-by: Sarah Brown <brownsarahm@uri.edu>
@brownsarahm wrote
I considered this, but came down on the side of not including sources. If nothing else, just for the sake of keeping the focus on what we want to say in workshops, as opposed to which sources are the best fit for each point. But I could be easily persuaded in the opposite direction! If you or anyone else feels strongly, I can put some links in.
Co-authored-by: Federica Gazzelloni (she/her) <61802414+Fgazzelloni@users.noreply.github.com>
I think you've threaded the needle reasonably well here. I am pretty strongly in the "don't use generative AI" camp, and I appreciate that you've incorporated context about pros, cons, and real world considerations alongside the blanket request to not use AI when you are learning to code because it defeats the purpose.
very nice work @tobyhodges. Appreciate all your efforts to include the community in this change.
These tools sometimes generate plausible but incorrect or misleading information, so (just as with an answer found on the internet) it is essential to verify their accuracy.
You need the knowledge and skills to be able to understand these responses, to judge whether or not they are accurate, and to fix any errors in the code it offers you.

In addition to asking for help, programmers can use generative AI tools to generate code from scratch; extend, improve and reorganise existing code; translate code between programming languages; figure out what terms to use in a search of the internet; and more.
this feels like more of a "sidebar" comment, since it's not about getting help when you're stuck. I wonder if it makes more sense to put it at the end of this section?
I intended this to aid flow into the next paragraph. Roughly, something like:
- This is how the thing we are talking about now is similar to the things we talked about just before
- These are some of the ways in which people use this thing beyond what we already talked about
- But these are some ways in which that could be considered problematic
I see. In my view, the ethical problems apply regardless of whether you're using an LLM to get help when stuck, or using it to write whole programs from scratch (or refactor, translate code, etc). So I don't see why it needs to be right here in that regard.
However, there are drawbacks that you should be aware of.
If you take my suggestion of making the prior paragraph a "sidebar", then this should probably change to something like:
Suggested change:
- However, there are drawbacks that you should be aware of.
+ Additionally, there are drawbacks that you should be aware of.
Thank you for this thoughtful and important addition @tobyhodges (and all the comments from the community have been really excellent as well). @swcarpentry/python-novice-gapminder-maintainers getting a lot of pings on this one 🤣
I'm running a clinic at the next CSDMS Annual Meeting on using LLMs for computational modeling so all of the points raised here are also helpful in thinking about how to refine and tailor that stream of work.
It would be great to include some summaries of the mailing list discussion here as well, or perhaps just a link to that entire thread?
episodes/04-built-in.md
Outdated
The section on generative AI is intended to be concise but Instructors may choose to devote more time to the topic in a workshop.
Depending on your own level of experience and comfort with talking about and using these tools, you could choose to do any of the following:

* Explain how large language models work and are trained, and/or the difference between generative AI, other forms of AI that currently exist, and the concept of artificial general intelligence.
I think it may be important to emphasize that current LLMs do not appear to be "reasoning" in any sense of the word. They are statistical engines trained on enormous corpora of data (internet+) that are very good at producing plausible, grammatically correct sentences that are statistically relevant to the given input (which you explain clearly later on). Would it be useful to be clear that current LLMs are nothing like AGI (which, even if it did exist, should also not be trusted unequivocally)?
Though there is very interesting work being done by Google DeepMind in the area of neurosymbolic AI (AlphaGeometry etc)...
I still find LLMs to be quite useful at search, summarization, explaining new concepts, rubber ducking, generating first drafts of code snippets, text, images, tests, build scaffolding and scripts, Dockerfiles, Apptainer recipes, k8s configuration, etc.
**We recommend that you avoid getting help from generative AI during the workshop** for several reasons:

1. For most problems you will encounter at this stage, help and answers can be found among the first results returned by searching the internet.
I don't necessarily agree with the recommendation that learners not use LLMs during a workshop, but this is a pretty soft position. Perhaps a brief interlude could have the instructor use an LLM live and demonstrate the back-and-forth assessment of the responses provided and techniques for effectively constraining and prompting an LLM. I have found tools like the latest ChatGPT, Perplexity, NotebookLM, GitHub Copilot, v0.dev, etc. to be valuable aids to my work, but have also seen first-hand the effects of blind application of LLM responses without critical assessment. Regardless, people are using and will continue to use these tools, so it may be important to demo some subset of good practices around them.
There was a "cooking" / "recipe" analogy raised on the mailing list that I don't think quite fits either, the main concern in my opinion is LLMs are statistical regurgitation-of-our-past-knowledge black boxes - https://garymarcus.substack.com/p/decoding-and-debunking-hard-forks does a good job of articulating these issues...
For a resources section I think this YouTube video (by Andrej Karpathy - worth following IMO) might be worth including, it is an excellent introduction to understanding what the current crop of LLMs actually do and how to more effectively use them:
Co-authored-by: Daniel McCloy <dan@mccloy.info>
@tobyhodges I also think this is a very considerate addition. I fully agree with the recommendations.
This is a fast-moving technology.
If you are preparing to teach this section and you feel it has become outdated, please open an issue on the lesson repository to let the Maintainers know and/or a pull request to suggest updates and improvements.
Can you elaborate on how this differs from the point already included on line 284 of the diff?
Demonstrate how you recommend that learners use generative AI.
My aim with this guidance to Instructors was to actively encourage different approaches based on the individual's level of expertise and comfort with using the technology.
Co-authored-by: Daniel McCloy <dan@mccloy.info> Co-authored-by: Sarah Brown <brownsarahm@uri.edu>
Thanks everyone for your latest contributions. I really appreciate the feedback. A heads-up that I plan to ask the Maintainers to merge this on Monday next week, so this is a last call for any "dealbreaking" reviews and suggestions!
@swcarpentry/python-novice-gapminder-maintainers I think this is ready to merge now, if you are happy to do so. Thanks for your patience and support 🙌
Thanks all for the stimulating and thoughtful discussion around this PR! 🥂
Auto-generated via `{sandpaper}` Source : 5c93536 Branch : md-outputs Author : GitHub Actions <actions@github.com> Time : 2025-04-01 20:28:21 +0000 Message : markdown source builds Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request #697 from tobyhodges/llm-assistants (Re-)Add content on generative AI
Auto-generated via `{sandpaper}` Source : 9220ce4 Branch : md-outputs Author : GitHub Actions <actions@github.com> Time : 2025-04-02 18:08:30 +0000 Message : markdown source builds Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request swcarpentry#697 from tobyhodges/llm-assistants (Re-)Add content on generative AI
[This repeats the changes made in #695 (then reverted in #696). I am sorry for the confusion! 😅 The pull request will stay open for at least a week, to give community members time to provide feedback and suggest improvements. Thanks for your patience @vahtras and other Maintainers ❤️ ]
This adds a new section to Built-in Functions and Help, titled "Other ways to get help" that discusses searching the internet, StackOverflow, talking to another person, and generative AI chatbots e.g. ChatGPT as possible ways to get more help when faced with errors while coding.
Some notes to guide feedback: