Cancer AI LLM Prompts Data Set Atezolizumab Trial Data in mCRPC
Introduction
Prostate cancer is the second leading cause of cancer-related deaths in men in the United States. Among its most challenging forms is metastatic castration-resistant prostate cancer (mCRPC)—a disease characterized by progression despite androgen deprivation therapy. With a median survival of under 3 years and growing treatment resistance, mCRPC represents a major unmet need in oncology.
Enter atezolizumab, an anti–PD-L1 immune checkpoint inhibitor that has revolutionized outcomes in several cancers—but has yet to be established in prostate cancer. A groundbreaking Phase I clinical trial explored the safety, response, and immune-biomarker landscape of atezolizumab in heavily pretreated mCRPC patients.
We’ve now transformed this study into a structured, AI-ready dataset, unlocking insights and value for researchers, LLM developers, and pharmaceutical innovators.
Why This Study Matters
This Phase I trial tested atezolizumab monotherapy in 35 patients with mCRPC who had progressed on standard therapies such as enzalutamide and sipuleucel-T. Key takeaways include:
- Safety Profile: Atezolizumab was generally well-tolerated, consistent with its use in other cancers.
- Clinical Activity: Limited efficacy was observed (PSA response rate 8.6%; one confirmed partial response), but immune activation was noted.
- Biomarker Exploration: The study assessed PD-L1 expression (IC/TC), DNA damage repair (DDR) mutations, tumor mutational burden (TMB), and T-cell receptor sequencing.
- Need for Combinations: Given the low response rate, this trial suggests PD-L1 monotherapy may need to be paired with other agents for better outcomes in mCRPC.
What’s in the Dataset?
We’ve curated this study into a prompt-completion format JSON dataset, optimized for use in:
- LLM fine-tuning for biomedical Q&A
- Clinical trial summarization tools
- Drug-response prediction models
- Safety signal detection and AE labeling
- Precision medicine decision support
Key components:
- Trial design, eligibility, and dosing schedule
- Objective response, PSA kinetics, and Kaplan–Meier survival estimates
- Grade 3/4 adverse events and special interest AEs
- Biomarker definitions (PD-L1 IHC, DDR gene sets, WES methods)
- Bioinformatics pipeline annotations (MuTect2, Sequenza, RNA-seq)
Who Should Use This?
This structured dataset is ideal for:
- AI startups building biomedical LLMs
- Pharmaceutical companies developing checkpoint inhibitors
- Researchers validating genomic predictors of ICI response
- Clinical trial platforms seeking high-quality trial summaries
- Medical knowledge engineers training search and Q&A systems
Why Structured Data Matters
Traditional clinical trial reports are unstructured PDFs, difficult for machines to parse. By translating this trial into structured, annotated data, we:
- Improve machine interpretability
- Enable automated reasoning and cross-trial comparison
- Empower next-gen decision tools in drug development and clinical care
Sample dataset in JSON
{ "prompt": "How many patients were enrolled in the initial and expansion cohorts of the atezolizumab mCRPC trial?", "completion": "Fifteen patients were enrolled in the initial cohort from January to December 2015, and twenty patients were enrolled in the expansion cohort from March to September 2016." }, { "prompt": "What was the median follow-up duration for patients in the initial cohort?", "completion": "The median follow-up for the initial cohort was 25.3 months (range, 2.3–28.1 months)." }, { "prompt": "What was the median follow-up duration for the expansion cohort?", "completion": "The median follow-up for the expansion cohort was 11.3 months (range, 1.2–13.2 months)." }, { "prompt": "What was the median follow-up for all patients combined in the atezolizumab trial?", "completion": "The overall median follow-up was 13.0 months (range, 1.2–28.1 months)." }, { "prompt": "How many patients received three or more prior lines of therapy before entering the trial?", "completion": "22 patients (62.9%) had received three or more prior lines of therapy." }, { "prompt": "What proportion of patients had received enzalutamide before enrolling in this study?", "completion": "32 patients (91.4%) had previously received enzalutamide as a therapy for metastatic disease." }, { "prompt": "How many patients had previously been treated with sipuleucel-T?", "completion": "13 patients (37.1%) had received sipuleucel-T before the trial." }, { "prompt": "What was the PD-L1 expression level in most patients enrolled in the study?", "completion": "Most patients (94.3%) had low PD-L1 expression, with IHC immune or tumor cell scores of 0/1, indicating less than 5% PD-L1 expression." }, { "prompt": "What was the median duration of atezolizumab treatment in this trial?", "completion": "The median duration of atezolizumab treatment was 2.1 months, with an upper range of 27.9 months." }, { "prompt": "How many doses of atezolizumab did patients receive on average?", "completion": "Patients received a median of four doses of atezolizumab, ranging from 1 to 41 doses." }, { "prompt": "As of the data cutoff in June 2017, how many patients were still on treatment?", "completion": "Four patients (11.4%) were still receiving atezolizumab at the time of the data cutoff." }, { "prompt": "How many patients remained on study as of June 30, 2017?", "completion": "Fourteen patients (40.0%) remained on the study at the data cutoff date." }, { "prompt": "What proportion of patients experienced treatment-related adverse events (TRAE) in the Atezolizumab mCRPC trial?", "completion": "21 patients (60%) experienced treatment-related adverse events (TRAEs)." }, { "prompt": "Which treatment-related adverse events (any grade) occurred in three or more patients in the trial?", "completion": "Fatigue, nausea, increased alanine aminotransferase, increased aspartate aminotransferase, increased blood alkaline phosphatase, decreased appetite, dry mouth, and pruritus." }, { "prompt": "What percentage of patients experienced grade 3 or 4 treatment-related adverse events in the trial?", "completion": "Four patients (11.4%) experienced grade 3/4 treatment-related adverse events." }, { "prompt": "What were the grade 3/4 TRAEs reported in the study, and how many times did each occur?", "completion": "Each occurred once: hypertension, lethargy, anemia, bone marrow infiltration, hypercalcemia, hypokalemia, hyponatremia, hypophosphatemia, and spinal cord compression." }, { "prompt": "How many patients discontinued treatment due to a treatment-related adverse event?", "completion": "Only one patient discontinued treatment due to a treatment-related adverse event." }, { "prompt": "How many adverse events of special interest (AESIs) were reported in the study?", "completion": "Nine adverse events of special interest (AESIs) were reported." }, { "prompt": "How many AESIs were of grade 3 or 4 severity?", "completion": "Only one AESI was grade 3/4: increased alanine aminotransferase."
Let’s Collaborate
Whether you’re training a domain-specific LLM, building a clinical search engine, or accelerating drug development with AI, this dataset provides a gold-standard reference point.
Contact us at contact@ieearc.com to access a sample full dataset.
Let’s unlock the future of immunotherapy—one dataset at a time.