Large language models (LLMs) have transformed AI and achieved breakthrough performance on a wide range of tasks. In science, the most interesting application of LLMs is for hypothesis formation. A feature of LLMs that results from their probabilistic structure is that the output text is not necessarily a valid inference from the training text. Such outputs are termed hallucinations and are harmful in many applications. In science, however, some hallucinations may be useful: novel hypotheses whose validity can be tested by laboratory experiments. Here we experimentally test the application of LLMs as a source of scientific hypotheses using the domain of breast cancer treatment. We applied the LLM GPT-4 to hypothesize novel synergistic pairs of FDA-approved non-cancer drugs that target the MCF7 breast cancer cell line relative to the non-tumorigenic breast cell line MCF10A. In the first round of laboratory experiments, GPT-4 succeeded in discovering three drug combinations (out of twelve tested) with synergy scores above those of the positive controls. GPT-4 then generated new combinations based on its initial results; this produced three more combinations with positive synergy scores out of the four tested. We conclude that LLMs are a valuable source of scientific hypotheses.
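The two-round workflow described in the abstract (LLM proposes drug pairs, the laboratory measures synergy, and the measured results are fed back into the next prompt) can be summarized schematically. The sketch below is a minimal illustration only, not the authors' pipeline: query_llm, measure_synergy, the prompt text, and the feedback format are hypothetical placeholders standing in for the GPT-4 prompting step and the cell-line synergy assay.

from typing import List, Tuple

def query_llm(prompt: str) -> List[Tuple[str, str]]:
    """Placeholder for a GPT-4 call that returns candidate drug pairs.
    In practice this would call a chat-completion API and parse the reply."""
    raise NotImplementedError

def measure_synergy(pair: Tuple[str, str]) -> float:
    """Placeholder for the laboratory assay: returns a synergy score for the
    pair on MCF7 relative to MCF10A (hypothetical interface)."""
    raise NotImplementedError

def hypothesis_loop(rounds: int, control_score: float) -> List[Tuple[str, str]]:
    """Iteratively ask the LLM for drug pairs, test them in the lab, and
    condition the next round's prompt on the measured results."""
    prompt = ("Suggest pairs of FDA-approved non-cancer drugs likely to act "
              "synergistically against the MCF7 breast cancer cell line "
              "while sparing the non-tumorigenic MCF10A line.")
    hits: List[Tuple[str, str]] = []
    for _ in range(rounds):
        candidates = query_llm(prompt)
        results = [(pair, measure_synergy(pair)) for pair in candidates]
        # Keep pairs whose synergy score beats the positive control.
        hits += [pair for pair, score in results if score > control_score]
        # Append the measured outcomes so the next round builds on them.
        prompt += "\nPrevious results: " + ", ".join(
            f"{a}+{b}: {score:.2f}" for (a, b), score in results)
    return hits

With rounds=2 this mirrors the design reported above: an initial batch of hypotheses, followed by a second batch generated in light of the first round's synergy measurements.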
@article{abdel-rehim2025_2405.12258,
  title   = {Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment},
  author  = {Abbi Abdel-Rehim and Hector Zenil and Oghenejokpeme Orhobor and Marie Fisher and Ross J. Collins and Elizabeth Bourne and Gareth W. Fearnley and Emma Tate and Holly X. Smith and Larisa N. Soldatova and Ross D. King},
  journal = {arXiv preprint arXiv:2405.12258},
  year    = {2025}
}