BMI 5/625, Spring 2025
| Component | Weight | Description |
| --- | --- | --- |
| Participation | 20% | Attendance at all sessions and labs; other participation |
| Labs | 40% | Completion of all lab activities |
| Written Assignments (incl. KWLA essay) | 20% | As described |
| Final Project | 20% | As described |
Attendance is required. If you need to miss a class, I will need advance notice, except in cases of emergency.
In addition to attending each class session, I expect all students to actively participate in the discussions. This can be in the form of asking a question, responding to another student’s question (or one from the instructor!), raising an issue, etc.
We expect and require that all submissions be the student’s own, original work. Any and all text, code, figures, etc. that you include from any other source must be properly cited, including quotation and paraphrasing. The Purdue University Online Writing Lab has an excellent set of online resources regarding citation and attribution, as well as a useful resource specifically on avoiding plagiarism. If you are unsure about whether something must be cited, the answer is probably “yes”; when in doubt, please ask.
Note that the School of Medicine has a policy regarding ethical and professional conduct for graduate students that specifically addresses plagiarism (sections 4.b and 4.c). We expect all students to be aware of and familiar with this policy. If you have any questions about this policy, please ask.
On a more personal note: in my experience, students who engage in plagiarism typically do so because they feel that they have no other choice. A deadline is looming, they are overwhelmed by some aspect of the assignment, a personal crisis comes up that keeps them from being able to finish, etc., and they feel like using somebody else’s work, or reusing some of their own work from another class, is the best option available. I can 100% guarantee that this is not the case: you have other options, and choosing plagiarism will not result in a good outcome.
When we catch you, the consequences will depend on the precise circumstances, but will at a minimum involve a score of zero points for the assignment in question, and will often involve a failing grade in the course.
So: don’t wait for me to catch you: ask for help early and often.
Automated code- or text-generation tools such as GitHub’s Copilot or OpenAI’s ChatGPT, and image-generation tools like Midjourney and DALL-E, pose a particular challenge to both students and instructors. As a guiding principle, recall that we expect and require that all submissions be your own, original work, and that part of the point of this class is to develop your own practical abilities. When considering using such a tool, ask yourself: will the tool’s output be something I will be turning in directly? In general, you may use such tools as a source of information (though see the note below), but not to produce output that you intend to turn in, or as a replacement for a traditional cited reference.
Here are examples of appropriate, in-bounds uses of AI text-generation tools:
Here are examples of inappropriate, out-of-bounds uses:
Note: I am using “ChatGPT” here as a generic noun referring to “LLM-based writing/chat tools”; please assume that Claude, DeepSeek, whatever Google’s LLM is called this week, Grok, etc. are all included in this category. If you are in doubt about whether a given tool “counts”, please ask before using.
Out-of-bounds uses of AI tools will be screened for and treated in the same manner as other forms of plagiarism; if you are uncertain about whether your use is in- or out-of-bounds, please ask. And if you think up an interesting or helpful (in-bounds) use case for these technologies, please feel free to share it on the class Sakai forum.
This technology is quite new and is developing rapidly, so there may be situations and use cases that this policy does not address; we are figuring this out together, in real time. 🤘
There are several reasons that I personally do not think that AI-assisted coding is a good idea, especially for people who are earlier in their journey as programmers.
First, and perhaps counter-intuitively, is the matter of efficiency. I have found that an important part of becoming fast and proficient as a programmer is building up “muscle memory”: learning to use your editing environment efficiently, internalizing the structure of whatever libraries you are working with, and so on. Many common bugs and errors result from typos, missing quotation marks or mismatched parentheses, mis-typed function names, and the like, and the longer you work with R, the more you will a) encounter such bugs and b) get used to catching them right away and spotting them when they occur.
By making these kinds of mistakes, and then diagnosing the resulting error message and fixing the issue, you will build up important pattern-matching skills for reading code.
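To make this concrete, here is a minimal, invented illustration of the kind of typo-driven bug described above (the data set and plot here are placeholders of my own, not taken from our course materials):

```r
# A deliberately buggy ggplot2 call: the function name is mis-typed as
# `geom_poin()` instead of `geom_point()`. Un-commenting and running it gives:
#   Error in geom_poin() : could not find function "geom_poin"
library(ggplot2)

# ggplot(mtcars, aes(x = wt, y = mpg)) +
#   geom_poin()

# The corrected version runs as expected:
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()
```

Spotting that error message and immediately knowing what it means is exactly the pattern-matching skill at issue.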
The only way I have ever found to build up this familiarity is to actually write code. If you instead delegate that task to an LLM, those are skills you’ll never build up, and I think they are very important skills to have. Using an LLM to help you code more quickly now will limit how efficient you’ll be later: “short cuts make long roads,” as the saying goes.
The second reason is that of accountability. In your future work, you may be writing R code (or Python code, or whatever) to perform important analyses, potentially in contexts with high stakes, and to present the results of those analyses to people who presumably care a lot about them. In that kind of situation, it is important that you feel personally confident in your results, which requires that you feel confident in the code that generated those results. This, in turn, requires that you actually understand what it is doing and why, at a very detailed and comprehensive level. If you’ve had an LLM write the code for you, and did so never having built up the skills to read and debug code, there are a lot of ways that things can go badly awry.
The third reason is that of intellectual property. Just as the text generated by LLMs often includes verbatim quotes from sources that the model saw during training, the code that it writes also frequently includes other people’s copyrighted material. It is very important to be mindful of this issue when programming in general, especially in a commercial setting (i.e., if somebody is paying you to write code).
My basic principle for using LLMs, as of March 2025, is that I only rely on an LLM’s output in situations where I personally have enough expertise to tell whether it was “correct” or not, without having to do a ton of work. I absolutely think there can be a role for LLM-based support in programming… but only if it is being used on top of a solid foundation, and only in moderation. In other words, we are not “vibe coders” in my class.
Regarding the use of ChatGPT or similar tools as informational resources, it is important to keep in mind that ChatGPT’s output often contains “confabulations”: content that is not “real”, and that the language model has “made up”. In the recent past (as of March 2025), I have personally seen ChatGPT and its cousins invent ggplot functions and arguments that do not exist, and cite articles and books that were never written.
In the context of a coding problem set, this means that you will waste valuable time and energy trying to debug something that was never going to work in the first place. In the context of a paper’s background section or literature review, it means that you will spend a great deal of time attempting to track down non-existent articles and books.
It is best to think of a ChatGPT-generated literature review as being closer to “fan-fiction” than an actual review. Remember, you are responsible for the veracity and accuracy of anything you turn in.
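To make the coding case concrete, here is an invented example of what a confabulated snippet can look like. The `autocolor` argument below is not a real ggplot2 parameter; I made it up for illustration, but it is exactly the kind of plausible-looking detail these tools generate:

```r
# Invented example of a "confabulated" argument: `autocolor` is not part of
# ggplot2. Depending on your ggplot2 version, this typically produces only a
# warning along the lines of "Ignoring unknown parameters: autocolor", and
# the plot quietly renders without the intended effect.
library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(autocolor = TRUE)
```

The code still produces a plot, but it was never going to do what was asked; the only real defense is being able to read and debug it yourself.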
One might ask, “why can’t I just cite ChatGPT’s output like I would any other source?” There are several reasons:
Over the last year, I have encountered multiple students and colleagues who have found resources such as ChatGPT to be helpful tools for managing various aspects of their neurology in an academic setting, e.g. as a way to help them organize their thoughts in writing, or to overcome executive function challenges. As they have been described to me, many of these sorts of uses would ordinarily fall “out of bounds” under the strictest interpretation of this policy. However, if you need an accommodation involving the use of generative AI tools, please do not hesitate to reach out to either me or the Office of Student Access; we have robust and flexible policies on accommodation (see below), and I am very willing to discuss this issue.
Here are a few recent discussions on how to think about generative AI tools that you may find helpful and informative:
When I’m not teaching data visualization, language models are one of my core areas of academic research. If you are interested in digging further into this space, please reach out and I can send you more pointers to things to read.
See the syllabus page on Sakai for a full list of university policies, etc.
I would like to make an additional note regarding accessibility and accommodations. The syllabus link above includes the University’s official language about accessibility and lists the various resources available to you. While comprehensive, the official verbiage is pretty dense; in the past, some students have found its “legalese” off-putting or unclear. I am committed to helping each of you succeed to the best of my ability, and I fully support the University’s Office of Student Access.
If you anticipate needing any kind of accommodation, I encourage you to reach out to the Office of Student Access or to me as early as possible in the term. I will be able to help you more effectively if we begin discussing your needs earlier rather than later. If you have a need that is not covered by the OHSU accessibility and accommodation policies, or if you have questions or concerns about anything along these lines, please do not hesitate to ask me for information or help.