Enhancing Clinical Information Extraction: Iterative Refinement and Goal Articulation for Optimizing Large Language Models

admin

The study describes the iterative refinement of a workflow pipeline for extracting complex information from pathology reports. The work produced a “gold-standard” set of annotations built from 152 diverse clinical reports, covering several types of renal cell carcinoma (RCC) as well as other malignancies. Over six refinement cycles using the GPT-4o language model, the error rate fell to 0.99%. Discrepancies were documented systematically and traced to four sources: the inherent complexity of the reports, issues in the extraction specification, difficulties with normalization, and the need to account for medical nuance. In validation against internal data, the pipeline identified key kidney tumor histologies with near-perfect F1 scores. It also outperformed a custom regex tool, particularly where reports used varied terminology or historical naming conventions. Further assessments on other cancer types showed that the workflow is portable and adaptable, supporting its robustness and clinical utility in real-world use. Overall, the iterative refinement process yielded practical insights into improving information extraction from pathology reports.
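For readers unfamiliar with the metric, the F1 score reported for each histology is the harmonic mean of precision and recall, computed by comparing the pipeline's label for each report against the gold-standard annotation. The following is a minimal sketch of that calculation, not the study's evaluation code: the function name `per_label_f1` and the four-report toy dataset are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_label_f1(gold, predicted):
    """Return {label: F1}, given one gold and one predicted label per report."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1  # correct extraction for this label
        else:
            fp[p] += 1  # predicted label was wrongly assigned
            fn[g] += 1  # gold label was missed

    scores = {}
    for label in set(gold) | set(predicted):
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Hypothetical toy data, NOT taken from the study.
gold = ["clear cell RCC", "papillary RCC", "clear cell RCC", "chromophobe RCC"]
predicted = ["clear cell RCC", "papillary RCC", "papillary RCC", "chromophobe RCC"]

scores = per_label_f1(gold, predicted)
for label in sorted(scores):
    print(f"{label}: {scores[label]:.2f}")
```

In this toy case one clear cell report is mislabeled as papillary, so both of those labels lose F1 while chromophobe stays at 1.0. This is why a single normalization error can affect the scores of two histologies at once.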
