Cancer AI LLM Prompts Data Set Atezolizumab Trial Data in mCRPC

Introduction

Prostate cancer is the second leading cause of cancer-related deaths in men in the United States. Among its most challenging forms is metastatic castration-resistant prostate cancer (mCRPC)—a disease characterized by progression despite androgen deprivation therapy. With a median survival of under 3 years and growing treatment resistance, mCRPC represents a major unmet need in oncology.

Enter atezolizumab, an anti–PD-L1 immune checkpoint inhibitor that has improved outcomes in several cancers but has yet to be established in prostate cancer. A Phase I clinical trial explored the safety, clinical activity, and immune-biomarker landscape of atezolizumab in heavily pretreated mCRPC patients.

We’ve now transformed this study into a structured, AI-ready dataset, unlocking insights and value for researchers, LLM developers, and pharmaceutical innovators.

Why This Study Matters

This Phase I trial tested atezolizumab monotherapy in 35 patients with mCRPC who had progressed on standard therapies such as enzalutamide and sipuleucel-T. Key takeaways include:

  • Safety Profile: Atezolizumab was generally well-tolerated, consistent with its use in other cancers.
  • Clinical Activity: Limited efficacy was observed (PSA response rate 8.6%; one confirmed partial response), but immune activation was noted.
  • Biomarker Exploration: The study assessed PD-L1 expression (IC/TC), DNA damage repair (DDR) mutations, tumor mutational burden (TMB), and T-cell receptor sequencing.
  • Need for Combinations: Given the low response rate, this trial suggests that anti-PD-L1 monotherapy may need to be paired with other agents for better outcomes in mCRPC.

What’s in the Dataset?

We’ve curated this study into a JSON dataset in prompt-completion format, optimized for use in:

  • LLM fine-tuning for biomedical Q&A
  • Clinical trial summarization tools
  • Drug-response prediction models
  • Safety signal detection and AE labeling
  • Precision medicine decision support

Key components:

  • Trial design, eligibility, and dosing schedule
  • Objective response, PSA kinetics, and Kaplan–Meier survival estimates
  • Grade 3/4 adverse events and special interest AEs
  • Biomarker definitions (PD-L1 IHC, DDR gene sets, WES methods)
  • Bioinformatics pipeline annotations (MuTect2, Sequenza, RNA-seq)
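
As a rough illustration of how prompt-completion records like these might be prepared for fine-tuning, here is a minimal Python sketch. It assumes the dataset is delivered as a top-level JSON array whose objects have exactly two string fields, "prompt" and "completion"; that schema, and the helper names load_records and to_jsonl, are assumptions for this example rather than a documented interface. The two records are copied from the sample shown in this post.

```python
import json

# Two records copied from this post's sample; a real file would hold many more.
RAW = """
[
  {"prompt": "How many patients had previously been treated with sipuleucel-T?",
   "completion": "13 patients (37.1%) had received sipuleucel-T before the trial."},
  {"prompt": "How many patients discontinued treatment due to a treatment-related adverse event?",
   "completion": "Only one patient discontinued treatment due to a treatment-related adverse event."}
]
"""

# Assumed schema: exactly these two non-empty string fields per record.
REQUIRED_KEYS = {"prompt", "completion"}


def load_records(text):
    """Parse a JSON array and raise ValueError on any malformed record."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for i, rec in enumerate(records):
        if not isinstance(rec, dict) or set(rec) != REQUIRED_KEYS:
            raise ValueError(f"record {i}: expected keys {sorted(REQUIRED_KEYS)}")
        for key in REQUIRED_KEYS:
            if not isinstance(rec[key], str) or not rec[key].strip():
                raise ValueError(f"record {i}: '{key}' must be a non-empty string")
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines (one JSON object per line)."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


records = load_records(RAW)
jsonl = to_jsonl(records)
print(jsonl)
```

JSON Lines is used here only because many fine-tuning pipelines accept one record per line; the target format would depend on the tooling you choose.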

Who Should Use This?

This structured dataset is ideal for:

  • AI startups building biomedical LLMs
  • Pharmaceutical companies developing checkpoint inhibitors
  • Researchers validating genomic predictors of ICI response
  • Clinical trial platforms seeking high-quality trial summaries
  • Medical knowledge engineers training search and Q&A systems

Why Structured Data Matters

Traditional clinical trial reports are typically published as unstructured PDFs that are difficult for machines to parse. By translating this trial into structured, annotated data, we:

  • Improve machine interpretability
  • Enable automated reasoning and cross-trial comparison
  • Empower next-gen decision tools in drug development and clinical care
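
One small example of the automated reasoning that structured records make possible is a consistency check between reported patient counts and percentages. The Python sketch below takes four completions from this post's sample records and recomputes each percentage against the 35 enrolled patients. The regular expression and the check helper are illustrative assumptions; they only handle completions that begin with the pattern "N patients (X%)".

```python
import re

ENROLLED = 35  # total enrollment stated in this post

# Completions copied from this post's sample records.
COMPLETIONS = [
    "22 patients (62.9%) had received three or more prior lines of therapy.",
    "32 patients (91.4%) had previously received enzalutamide as a therapy for metastatic disease.",
    "13 patients (37.1%) had received sipuleucel-T before the trial.",
    "21 patients (60%) experienced treatment-related adverse events (TRAEs).",
]

# Matches a leading "<count> patients (<percent>%)".
PATTERN = re.compile(r"^(\d+) patients \((\d+(?:\.\d+)?)%\)")


def check(completion, total=ENROLLED):
    """Return (count, stated_pct, computed_pct, consistent), or None if no match."""
    match = PATTERN.match(completion)
    if match is None:
        return None
    count = int(match.group(1))
    stated = float(match.group(2))
    computed = round(100 * count / total, 1)  # percentage rounded to one decimal
    consistent = abs(computed - stated) < 1e-9
    return count, stated, computed, consistent


results = [check(c) for c in COMPLETIONS]
for completion, result in zip(COMPLETIONS, results):
    print(result, "<-", completion)
```

The same idea extends to cross-trial comparison: once counts and denominators are machine-readable, mismatches can be flagged automatically instead of being found by manual review.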

Sample dataset in JSON

[
  {
    "prompt": "How many patients were enrolled in the initial and expansion cohorts of the atezolizumab mCRPC trial?",
    "completion": "Fifteen patients were enrolled in the initial cohort from January to December 2015, and twenty patients were enrolled in the expansion cohort from March to September 2016."
  },
  {
    "prompt": "What was the median follow-up duration for patients in the initial cohort?",
    "completion": "The median follow-up for the initial cohort was 25.3 months (range, 2.3–28.1 months)."
  },
  {
    "prompt": "What was the median follow-up duration for the expansion cohort?",
    "completion": "The median follow-up for the expansion cohort was 11.3 months (range, 1.2–13.2 months)."
  },
  {
    "prompt": "What was the median follow-up for all patients combined in the atezolizumab trial?",
    "completion": "The overall median follow-up was 13.0 months (range, 1.2–28.1 months)."
  },
  {
    "prompt": "How many patients received three or more prior lines of therapy before entering the trial?",
    "completion": "22 patients (62.9%) had received three or more prior lines of therapy."
  },
  {
    "prompt": "What proportion of patients had received enzalutamide before enrolling in this study?",
    "completion": "32 patients (91.4%) had previously received enzalutamide as a therapy for metastatic disease."
  },
  {
    "prompt": "How many patients had previously been treated with sipuleucel-T?",
    "completion": "13 patients (37.1%) had received sipuleucel-T before the trial."
  },
  {
    "prompt": "What was the PD-L1 expression level in most patients enrolled in the study?",
    "completion": "Most patients (94.3%) had low PD-L1 expression, with IHC immune or tumor cell scores of 0/1, indicating less than 5% PD-L1 expression."
  },
  {
    "prompt": "What was the median duration of atezolizumab treatment in this trial?",
    "completion": "The median duration of atezolizumab treatment was 2.1 months, with an upper range of 27.9 months."
  },
  {
    "prompt": "How many doses of atezolizumab did patients receive on average?",
    "completion": "Patients received a median of four doses of atezolizumab, ranging from 1 to 41 doses."
  },
  {
    "prompt": "As of the data cutoff in June 2017, how many patients were still on treatment?",
    "completion": "Four patients (11.4%) were still receiving atezolizumab at the time of the data cutoff."
  },
  {
    "prompt": "How many patients remained on study as of June 30, 2017?",
    "completion": "Fourteen patients (40.0%) remained on the study at the data cutoff date."
  },
  {
    "prompt": "What proportion of patients experienced treatment-related adverse events (TRAE) in the Atezolizumab mCRPC trial?",
    "completion": "21 patients (60%) experienced treatment-related adverse events (TRAEs)."
  },
  {
    "prompt": "Which treatment-related adverse events (any grade) occurred in three or more patients in the trial?",
    "completion": "Fatigue, nausea, increased alanine aminotransferase, increased aspartate aminotransferase, increased blood alkaline phosphatase, decreased appetite, dry mouth, and pruritus."
  },
  {
    "prompt": "What percentage of patients experienced grade 3 or 4 treatment-related adverse events in the trial?",
    "completion": "Four patients (11.4%) experienced grade 3/4 treatment-related adverse events."
  },
  {
    "prompt": "What were the grade 3/4 TRAEs reported in the study, and how many times did each occur?",
    "completion": "Each occurred once: hypertension, lethargy, anemia, bone marrow infiltration, hypercalcemia, hypokalemia, hyponatremia, hypophosphatemia, and spinal cord compression."
  },
  {
    "prompt": "How many patients discontinued treatment due to a treatment-related adverse event?",
    "completion": "Only one patient discontinued treatment due to a treatment-related adverse event."
  },
  {
    "prompt": "How many adverse events of special interest (AESIs) were reported in the study?",
    "completion": "Nine adverse events of special interest (AESIs) were reported."
  },
  {
    "prompt": "How many AESIs were of grade 3 or 4 severity?",
    "completion": "Only one AESI was grade 3/4: increased alanine aminotransferase."
  }
]
 

 

Let’s Collaborate

Whether you’re training a domain-specific LLM, building a clinical search engine, or accelerating drug development with AI, this dataset provides a gold-standard reference point.

Contact us at contact@ieearc.com to request a sample or the full dataset.

Let’s unlock the future of immunotherapy—one dataset at a time.