AI Researchers, Ask Yourself These 6 Questions to Strengthen Your Moral Muscles


Welcome to CNET’s new series of guest columns called Alt View, a forum for a diverse array of experts and luminaries to share their insights into the rapidly evolving field of artificial intelligence. For more AI coverage, check out CNET’s AI Atlas.


Of course you have moral principles – but how often do you use them? 

I, Meia, am a professor doing psychology research, and I can tell you that most bad outcomes are caused not by a lack of moral principles, but by them not being activated. I, Max, am a professor doing AI research, and I can tell you that your choices as an AI researcher truly matter, because you’re helping build what will become the most powerful technology ever: AI will gain the potential to bring either unprecedented health, prosperity, liberty, dignity and empowerment, or a race to replace our jobs, our relationships, our decision-making, our power and even our species. 

Hardly a day goes by without the AI community facing moral decisions, on topics ranging from AI companions to surveillance, hacking and military use. Many top AI companies are fighting lawsuits about everything from data centers to AI safety, most prominently in the courtroom drama featuring OpenAI’s Sam Altman and xAI’s Elon Musk. Meanwhile, Anthropic is in a prolonged showdown with the Pentagon. 

So for all you AI researchers out there, here’s a handy checklist to tone up your moral strength.

1. Do you have red lines? 

Is there any action that you find so morally unacceptable that, if the organization you work for takes it, you’ll quit? Or take some other predetermined costly action, say, whistleblowing? Any organizational action that would trigger such a response from you is one of your moral red lines.


For example, Rosa Parks got fined and fired for her civil disobedience against segregation; Vasily Arkhipov was criticized after vetoing a Soviet nuclear strike against the US; and Edward Snowden ended up in exile for whistleblowing on mass surveillance. Many AI researchers have left top AI companies that crossed their red lines, including Daniel Kokotajlo, who risked almost $2 million in equity by quitting OpenAI without signing a nondisparagement agreement. What are your red lines?

2. Have you written them down and shared them? 

Both George Washington and Benjamin Franklin wrote down moral guidelines for themselves, with Franklin grading his own performance weekly. This is a powerful tool for avoiding the boiling frog effect, protecting your red lines against gradual erosion as in the examples at the end of the next section. Sharing them with loved ones or online adds social pressure to stick to them. For each red line, make sure to write down what action you commit to taking if it is crossed. You can click here to list your red lines (we will only share them with your permission).

3. Have you resisted moral disengagement? 

To further strengthen your moral muscles and ensure that your red lines don’t move, it’s helpful to know what failure mechanisms to watch out for. Disengaging your muscles makes you weak – and this applies to your moral muscles as well. So let’s look at moral disengagement mechanisms identified by Albert Bandura, one of the most impactful psychologists of all time. This will help you spot them and fight them when your red lines get pressured by your company, your social circle, the temptation of personal gain or the desire to feel good about yourself. 

Displacement and diffusion of responsibility: You’ll feel better if you or others convince you that you’re not really responsible for the harm: The real decision maker is leadership, investors, the market, geopolitics or history (“this technology is inevitable”). When AI work is distributed across large teams, everyone feels less accountable for the collective outcome. “I’m just a researcher,” or, “I was just doing my job,” are archetypical excuses identified by the influential political theorist Hannah Arendt. The satirical musician Tom Lehrer sums it up in this hilarious song about the rocket scientist who switched allegiance from Nazi Germany to the US: “‘Once the rockets go up, who cares where they come down – that’s not my department,’ says Wernher von Braun.”

For example, an Anthropic researcher reading about how Claude AI may have been implicated in killing over 150 Iranian schoolgirls, in one of the worst US-caused civilian bloodbaths since the Vietnam War, may be tempted to tell themselves that they’re blameless because only management is responsible for selling their tools for military targeting.

Word games: Both Bandura and Arendt highlight how subtle word choices can reframe what’s moral. We are all familiar with military euphemisms such as “servicing a target” for bombing, “collateral damage” for civilian casualties and “enhanced interrogation techniques” for torture, but AI jargon is full of analogous word games, often encouraged by financially interested parties.

The most basic game is “euphemistic labeling”: replace morally vivid language with positive or emotionally flattened terminology. Researchers are not “helping build systems that may displace workers, manipulate users, centralize power or heighten existential risk”; they are doing “capabilities research,” “model improvement” or “benchmark progress.” Training on copyrighted data becomes “freedom to learn.” Unpopular data centers become “AI infrastructure.” Firing or deskilling workers becomes “productivity gains,” and lobbying against accountability becomes “reducing friction.” Please practice using neutral words like “company” instead of “lab” (which sounds cool and innocent) and “AI system” instead of “AI model” (which sounds harmless). Bandura’s point is that euphemism does not merely soften tone; it weakens conscience.

Another word game is blame attribution, where critics become the problem – “doomers,” “Luddites,” “opportunistic politicians,” “ignorant journalists” or anti-tech Europeans. Once opponents are blamed for irrationality or bad faith, the AI researcher feels less obligated to treat criticism as morally serious.

A third word game is soft dehumanization: The unemployed programmer, the individual copyright infringement victim and the chatbot suicide child disappear into categories such as “the labor market,” “creatives” and “edge cases.” The more harms are discussed statistically rather than personally, the less moral pain is triggered.

Selective moral self-exemption: It’s tempting to keep strong moral standards in general but carve out an exception around the domain from which you benefit most: An AI researcher may be passionately ethical about injustice in the abstract, while suspending those same standards when judging their own employer, AI, salary or stock grant.

Advantageous comparison: It’s tempting to compare yourself only to worse actors: “At least I’m not at the most reckless lab.” “At least I’m not working on autonomous weapons.” “At least I care about alignment.” That lets you feel ethical without asking whether your own conduct is acceptable in absolute terms.

Moral justification: For those acknowledging that they’re causing current harm, it’s tempting to justify it as serving a noble mission, say “helping democracy prevail,” “creating universal abundance” or “making sure that safety has a seat at the table” – without seriously questioning whether those lofty goals are credible, or whether there’s another way to accomplish them with less current harm.

These moral disengagement techniques can be very powerful when combined and escalated: Enron executives gradually escalated from minor financial manipulations, justified as necessary for company survival and diffused through leadership directives, to massive fraud like hiding debt. Bernie Madoff started with small return fudges rationalized as client aid, then displaced blame onto markets and dehumanized victims, leading to a $65 billion fraud through incremental moral disengagement. In the Vietnam War, soldiers obediently followed orders in a “just war,” starting with minor transgressions that escalated to massacres like My Lai through diffused responsibility and victim dehumanization.

The frontier AI researcher’s signature Bandurian mantra is, “I’m not a well-paid participant in a harmful race; I am a responsible, realistic, morally serious person helping guide inevitable progress.” But is the race to replace truly inevitable, given polling that finds it wildly unpopular, or is it a Bandurian excuse and a self-fulfilling prophecy?

4. Do you maintain situational awareness? 

Do you actively research whether your red lines are being crossed? This includes investigating the indirect consequences of what your organization does. Hannah Arendt wrote about “the banality of evil,” arguing that the greatest harms are often done not through malice, but by obedient and conscientious technocrats who don’t think about the bigger picture. 


We talked above about taking known harms and using word games to downplay and reframe them as manageable, transitional or outweighed by upside. But there’s also another powerful moral disengagement technique: staying conveniently ignorant by not putting in the effort to know about the harms you’re contributing to in the first place. Ignorance is a bad excuse if you could have found out by looking into it: German chemist Bruno Tesch was convicted and executed in 1946 for supplying Zyklon B gas to Auschwitz-Birkenau despite claiming he didn’t know what it would be used for.

So please ask obvious questions regularly. For example, which, if any, red lines does your organization have? Is it actively lobbying against AI safety legislation that you support? Have you looked it up in the AI safety index? How are its products used? If you work for Google or OpenAI, have you skimmed any of the lawsuits against your company for alleged chatbot-linked suicide? 

Ironically, thanks to modern LLMs, there’s really no excuse for not knowing about things like these, since they’re just a prompt away. For example, you can try this monthly: 

“Please make a list of morally questionable/controversial behavior by [MY COMPANY] in recent years, including a) controversial use of its tools (say for suicide, crime, surveillance or weapons), b) harm allegedly caused by its tools, c) alleged lies or broken promises by the company or its leadership, d) perverse incentives for the company to pursue profit over what truly benefits humanity.”

These are the ChatGPT responses we got for Anthropic, Google, OpenAI, Meta and xAI on March 29, 2026.
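If you want to turn this into a recurring habit, here is a minimal sketch of how you might script the same check using the OpenAI Python SDK. The model name and the COMPANY placeholder are illustrative assumptions, not part of the original prompt; any comparable LLM provider would work just as well.

    # Minimal sketch: send the monthly "situational awareness" prompt to an LLM.
    # Assumes the OpenAI Python SDK and an OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    COMPANY = "Example AI Corp"  # hypothetical placeholder; substitute your employer

    PROMPT = (
        f"Please make a list of morally questionable/controversial behavior by {COMPANY} "
        "in recent years, including a) controversial use of its tools (say for suicide, "
        "crime, surveillance or weapons), b) harm allegedly caused by its tools, "
        "c) alleged lies or broken promises by the company or its leadership, "
        "d) perverse incentives for the company to pursue profit over what truly "
        "benefits humanity."
    )

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; use whichever model you prefer
        messages=[{"role": "user", "content": PROMPT}],
    )
    print(response.choices[0].message.content)

Running something like this once a month (and, per the note in question 6, perhaps not on your own company’s LLM) takes minutes and removes the excuse of convenient ignorance.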

5. Do you make noise internally?

If you learn about something that’s close to one of your red lines, then ask questions internally to find out more. Although there were historical situations where criticizing one’s organization could get one killed, doing so in an AI company today is unlikely to even get you fired – and why would you want to keep working for a company that can’t handle respectful questions about your red lines? Most even have whistleblowing policies that protect you (see page 99 here at the Future of Life Institute website). 

If what you find out is unacceptable but you’re not ready to quit, then make noise internally: Explain why to colleagues and superiors, and push hard for change. Don’t be like one of the engineers who realized that the cold weather could cause catastrophic O-ring failure in the Challenger space shuttle and later regretted not speaking up forcefully. If you’re on the safety team and don’t know people on the lobbying team or those who make launch decisions, make a sincere effort to connect with them and educate them – don’t become a poster child for bystander syndrome.

6. Do you make noise externally? 

Taking a public stance that challenges your own organization can help in many ways, from nudging it to improve voluntarily to catalyzing external forces that pressure it (and its competitors) to improve. This doesn’t mean you need to risk exile like Edward Snowden: There are many recent cases where AI researchers have gotten away with well-argued criticism of their company without any retaliation whatsoever. What consequences would you face if you publicly criticized your organization or revealed harmful or illegal behavior? Most US AI companies have a whistleblower policy (see above); please read yours! In addition, a simple search (though maybe don’t do it with your own company’s LLM) will show you many reputable whistleblower organizations offering help with everything from legal support to financial aid should you get fired or sued.

So having read this, how would you rate your moral muscles? How many moral disengagement techniques did you recognize in yourself, and how strong has your research been on potential harms caused by your company? Please don’t feel disheartened if you scored low despite meaning well. Instead, think of it as going to the gym for the first time, and discovering that you can’t even bench 50 pounds: Muscles need to be used to get strong, and this six-step plan can strengthen your moral muscles in no time – and you’ll start feeling really great looking at yourself in the mirror.
