(Re-)Add content on generative AI #697

tobyhodges · 2025-03-14T14:01:26Z

[This repeats the changes made in #695 (then reverted in #696). I am sorry for the confusion! 😅 The pull request will stay open for at least a week, to give community members time to provide feedback and suggest improvements. Thanks for your patience @vahtras and other Maintainers ❤️ ]

This adds a new section to Built-in Functions and Help, titled "Other ways to get help", that discusses searching the internet, StackOverflow, talking to another person, and generative AI chatbots (e.g. ChatGPT) as possible ways to get more help when faced with errors while coding.

Some notes to guide feedback:

* I have tried to keep this as concise as possible, sticking to what I consider to be the most essential information only. But I concede that it is still pretty wordy!
* These changes are guided by conversations within the community over recent months, including but not limited to the community discussion sessions summarised in a couple of recent blog posts (The Ethics of Teaching LLMs in Carpentries Workshops and Essential Knowledge and Misconceptions).

github-actions · 2025-03-14T14:01:40Z

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

🎯 correct output
🖼️ correct figures
❓ new warnings
‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/swcarpentry/python-novice-gapminder/compare/md-outputs..md-outputs-PR-697

The following changes were observed in the rendered markdown documents:

 04-built-in.md | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 md5sum.txt     |  2 +-
 2 files changed, 59 insertions(+), 1 deletion(-)

What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2025-03-28 15:02:55 +0000

tobyhodges · 2025-03-14T14:03:14Z

Quoting the comment received before #695 was merged. @rowleya wrote:

I personally think this is a very reasoned and reasonable approach and I think what you have written here relating both to AI and getting help certainly represents how I do things and also how I think about AI especially when learning. Good job!

bkmgit · 2025-03-14T17:26:19Z

A difficult topic. Licensing concerns are tricky.

It may be worth exploring how such tools are used ethically in educational settings, particularly for programming, and whether in future they could be incorporated into Carpentries lessons. One need not use LLMs; machine learning models and editor add-ons can also be helpful. These might be explored in a separate lesson piloted through the Incubator first, though.

tobyhodges · 2025-03-14T17:42:27Z

Thanks @bkmgit. Indeed I would like to help get some lessons like you described through the Incubator. E.g. I think we could do a really nice follow-up to DC Image Processing, exploring ML methods for image analysis.

brownsarahm

do we want to link to sources on any of this? I can help provide them, but do not want to put them in if the goal is to not link out

more broadly, this might not fit here, but also, part of why learning to write code to do analyses is important is because LLMs do not reliably answer mathematical questions and for data privacy (as a counter to the idea a person might have to upload their data to a chatbot and ask it to do the analysis)

dpshelio

Very nice @tobyhodges! Thanks!

episodes/04-built-in.md

Co-authored-by: David Pérez-Suárez <dps.helio@gmail.com> Co-authored-by: Sarah Brown <brownsarahm@uri.edu>

tobyhodges · 2025-03-18T10:31:27Z

@brownsarahm wrote

do we want to link to sources on any of this?

I considered this, but came down on the side of not including sources. If nothing else, just for the sake of keeping the focus on what we want to say in workshops, as opposed to which sources are the best fit for each point. But I could be easily persuaded in the opposite direction! If you or anyone else feels strongly, I can put some links in.

Co-authored-by: Federica Gazzelloni (she/her) <61802414+Fgazzelloni@users.noreply.github.com>

mrawls · 2025-03-18T17:23:17Z

I think you've threaded the needle reasonably well here. I am pretty strongly in the "don't use generative AI" camp, and I appreciate that you've incorporated context about pros, cons, and real world considerations alongside the blanket request to not use AI when you are learning to code because it defeats the purpose.

drammock

very nice work @tobyhodges. Appreciate all your efforts to include the community in this change.

drammock · 2025-03-18T19:48:02Z

episodes/04-built-in.md

+These tools sometimes generate plausible but incorrect or misleading information, so (just as with an answer found on the internet) it is essential to verify their accuracy.
+You need the knowledge and skills to be able to understand these responses, to judge whether or not they are accurate, and to fix any errors in the code it offers you.
+
+In addition to asking for help, programmers can use generative AI tools to generate code from scratch; extend, improve and reorganise existing code; translate code between programming languages; figure out what terms to use in a search of the internet; and more.


this feels like more of a "sidebar" comment, since it's not about getting help when you're stuck. I wonder if it makes more sense to put it at the end of this section?

tobyhodges · Mar 19, 2025

I intended this to aid flow into the next paragraph. Roughly, something like:

* This is how the thing we are talking about now is similar to the things we talked about just before
* These are some of the ways in which people use this thing beyond what we already talked about
* But these are some ways in which that could be considered problematic

drammock · Mar 19, 2025

I see. In my view, the ethical problems apply regardless of whether you're using an LLM to get help when stuck, or using it to write whole programs from scratch (or refactor, translate code, etc). So I don't see why it needs to be right here in that regard.

drammock · 2025-03-18T19:49:12Z

episodes/04-built-in.md

+You need the knowledge and skills to be able to understand these responses, to judge whether or not they are accurate, and to fix any errors in the code it offers you.
+
+In addition to asking for help, programmers can use generative AI tools to generate code from scratch; extend, improve and reorganise existing code; translate code between programming languages; figure out what terms to use in a search of the internet; and more.
+However, there are drawbacks that you should be aware of.


If you take my suggestion of making the prior paragraph a "sidebar", then this should probably change to something like:

Suggested change

-However, there are drawbacks that you should be aware of.
+Additionally, there are drawbacks that you should be aware of.

alee

Thank you for this thoughtful and important addition @tobyhodges (and all the comments from the community have been really excellent as well). @swcarpentry/python-novice-gapminder-maintainers getting a lot of pings on this one 🤣

I'm running a clinic at the next CSDMS Annual Meeting on using LLMs for computational modeling so all of the points raised here are also helpful in thinking about how to refine and tailor that stream of work.

It would be great to include some summaries of the mailing list discussion here as well, or perhaps just a link to that entire thread?

alee · 2025-03-18T23:31:05Z

episodes/04-built-in.md

+The section on generative AI is intended to be concise but Instructors may choose to devote more time to the topic in a workshop.
+Depending on your own level of experience and comfort with talking about and using these tools, you could choose to do any of the following:
+
+* Explain how large language models work and are trained, and/or the difference between generative AI, other forms of AI that currently exist, and the concept of artificial general intelligence.


I think it may be important to emphasize that current LLMs do not appear to be "reasoning" in any sense of the word. They are statistical engines trained on enormous corpora of data (internet+) that are very good at producing plausible, grammatically correct sentences that are statistically relevant to the given input (which you explain clearly later on). Would it be useful to be clear that current LLMs are nothing like AGI (which, even if it did exist, should also not be trusted unequivocally)?

Though there is very interesting work being done by Google DeepMind in the area of neurosymbolic AI (AlphaGeometry etc.)...

I still find LLMs to be quite useful at search, summarization, explaining new concepts, rubber ducking, generating first drafts of code snippets, text, images, tests, build scaffolding and scripts, Dockerfiles, Apptainer recipes, k8s configuration, etc.

alee · 2025-03-19T00:32:52Z

episodes/04-built-in.md

+
+**We recommend that you avoid getting help from generative AI during the workshop** for several reasons:
+
+1. For most problems you will encounter at this stage, help and answers can be found among the first results returned by searching the internet.


I don't necessarily agree with the recommendation that learners not use LLMs during a workshop, but this is a pretty soft position. Perhaps a brief interlude could have the instructor use an LLM live and demonstrate the back-and-forth assessment of the responses provided and techniques for effectively constraining and prompting an LLM. I have found tools like the latest ChatGPT, Perplexity, NotebookLM, GitHub Copilot, v0.dev, etc. to be valuable aids to my work, but have also seen first-hand the effects of blind application of LLM responses without critical assessment. Regardless, people are using and will continue to use these tools, so it may be important to demo some subset of good practices around them.
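As one illustration of what that kind of assessment could look like in a live demo, here is a minimal sketch. The "chatbot claim" below is hypothetical and invented for this example (it is not taken from any real tool's output), and `check_claims` is just an illustrative helper name. The idea is simply to run the code yourself and compare against what the response asserted, rather than trusting it:

```python
# Hypothetical chatbot claim (invented for this sketch):
# "Python's round() always rounds .5 up to the next integer."
# The dictionary maps each input to the result the claim predicts.
claimed = {0.5: 1, 1.5: 2, 2.5: 3}


def check_claims(claims):
    """Return inputs where the claimed result differs from what Python does."""
    mismatches = {}
    for value, expected in claims.items():
        actual = round(value)  # the built-in function under test
        if actual != expected:
            mismatches[value] = (expected, actual)
    return mismatches


mismatches = check_claims(claimed)
for value, (expected, actual) in mismatches.items():
    print(f"round({value}): claimed {expected}, got {actual}")
```

In Python 3 the built-in `round()` rounds halves to the nearest even integer, so two of the three claimed results do not hold. Running `help(round)` is a natural next step to find out why.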

There was a "cooking" / "recipe" analogy raised on the mailing list that I don't think quite fits either. The main concern in my opinion is that LLMs are statistical regurgitation-of-our-past-knowledge black boxes; https://garymarcus.substack.com/p/decoding-and-debunking-hard-forks does a good job of articulating these issues.

For a resources section I think this YouTube video (by Andrej Karpathy, worth following IMO) might be worth including. It is an excellent introduction to understanding what the current crop of LLMs actually do and how to more effectively use them:

https://www.youtube.com/watch?v=EWvNQjAaOHw

Co-authored-by: Daniel McCloy <dan@mccloy.info>

tobyhodges · 2025-03-19T13:52:37Z

Thanks to @alee for the suggestion to link to the ongoing discussion on the mailing list. For reasons that remain unclear to me, the thread is split into three (so far?) on the TopicBox site, but here they are:

mhagdorn · 2025-03-25T14:32:35Z

@tobyhodges I also think this is a very considerate addition. I fully agree with the recommendations.

brownsarahm · 2025-03-28T13:37:35Z

episodes/04-built-in.md

+
+This is a fast-moving technology. 
+If you are preparing to teach this section and you feel it has become outdated, please open an issue on the lesson repository to let the Maintainers know and/or a pull request to suggest updates and improvements.
+


Suggested change

If you are comfortable, demonstrating how to work with an LLM, especially through the built-in tools, could be a good way to end the workshop, in addition to pointing out other resources.

motivated by @alee's comment

tobyhodges · Mar 28, 2025

Can you elaborate on how this differs from the point already included on line 284 of the diff?

Demonstrate how you recommend that learners use generative AI.

My aim with this guidance to Instructors was to actively encourage different approaches based on the individual's level of expertise and comfort with using the technology.

Co-authored-by: Daniel McCloy <dan@mccloy.info> Co-authored-by: Sarah Brown <brownsarahm@uri.edu>

tobyhodges · 2025-03-28T15:05:40Z

Thanks everyone for your latest contributions. I really appreciate the feedback.

A heads-up that I plan to ask the Maintainers to merge this on Monday next week, so this is a last call for any "dealbreaking" reviews and suggestions!

tobyhodges · 2025-04-01T07:32:10Z

@swcarpentry/python-novice-gapminder-maintainers I think this is ready to merge now, if you are happy to do so. Thanks for your patience and support 🙌

alee

Thanks all for the stimulating and thoughtful discussion around this PR! 🥂

Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request #697 from tobyhodges/llm-assistants (Re-)Add content on generative AI

Auto-generated via `{sandpaper}` Source : 5c93536 Branch : md-outputs Author : GitHub Actions <actions@github.com> Time : 2025-04-01 20:28:21 +0000 Message : markdown source builds Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request #697 from tobyhodges/llm-assistants (Re-)Add content on generative AI

Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request swcarpentry#697 from tobyhodges/llm-assistants (Re-)Add content on generative AI

Auto-generated via `{sandpaper}` Source : 9220ce4 Branch : md-outputs Author : GitHub Actions <actions@github.com> Time : 2025-04-02 18:08:30 +0000 Message : markdown source builds Auto-generated via `{sandpaper}` Source : 0b6744f Branch : main Author : Allen Lee <alee@users.noreply.github.com> Time : 2025-04-01 20:27:24 +0000 Message : Merge pull request swcarpentry#697 from tobyhodges/llm-assistants (Re-)Add content on generative AI

re-add genAI content

ba82693

tobyhodges changed the title ~~re-add genAI content~~ (Re-)Add content on generative AI Mar 14, 2025

github-actions bot pushed a commit that referenced this pull request Mar 14, 2025

differences for PR #697

f6ebfcf

brownsarahm reviewed Mar 14, 2025

dpshelio reviewed Mar 14, 2025

tobyhodges mentioned this pull request Mar 18, 2025

Add content on generative AI #695

Merged

tobyhodges and others added 2 commits March 18, 2025 11:20

formulating -> articulating

06561b9

github-actions bot pushed a commit that referenced this pull request Mar 18, 2025

differences for PR #697

3731c7c

github-actions bot pushed a commit that referenced this pull request Mar 18, 2025

differences for PR #697

8a0946c

github-actions bot pushed a commit that referenced this pull request Mar 18, 2025

differences for PR #697

1e8d46c

drammock reviewed Mar 18, 2025

alee reviewed Mar 19, 2025

tobyhodges and others added 2 commits March 19, 2025 14:25

clarify which developers I am talking about

79cb5de

github-actions bot pushed a commit that referenced this pull request Mar 19, 2025

differences for PR #697

d3f4cb0

github-actions bot pushed a commit that referenced this pull request Mar 19, 2025

differences for PR #697

fd3a9d0

brownsarahm reviewed Mar 28, 2025

brownsarahm mentioned this pull request Mar 28, 2025

LLM use & how we prepare instructors carpentries/instructor-training#1800

Open

tobyhodges and others added 2 commits March 28, 2025 16:00

typo fix

cf0f86e

github-actions bot pushed a commit that referenced this pull request Mar 28, 2025

differences for PR #697

bf3bd13

alee approved these changes Apr 1, 2025

alee merged commit 0b6744f into swcarpentry:main Apr 1, 2025
3 checks passed
