This is the Carpentries repository for our funded project "How to Build FAIR Domain-Specific Datasets for fine tuning/training NLP models".

    How to Build FAIR Domain-Specific Datasets for fine tuning/training NLP models

    Summary and Schedule
    1. Introduction to NLP tasks and Fine-Tuning
    2. Identifying and Collecting Domain-Specific Data
    3. Preprocessing Biomedical Text Data
    4. Annotation Strategies for Domain-Specific NLP Tasks
    5. Quality Assurance and Validation of Datasets
    6. Challenges and possible solutions to create datasets
    7. FAIRification of Domain-Specific Datasets

    Introduction to NLP tasks and Fine-Tuning
    Figure 1

    Identifying and Collecting Domain-Specific Data
    Figure 1

    Preprocessing Biomedical Text Data
    Figure 1

    Annotation Strategies for Domain-Specific NLP Tasks
    Figure 1

    Quality Assurance and Validation of Datasets
    Figure 1

    Challenges and possible solutions to create datasets
    Figure 1

    FAIRification of Domain-Specific Datasets
    Figure 1

    This lesson is subject to the Code of Conduct

    Materials licensed under CC-BY 4.0 by the authors

    Template licensed under CC-BY 4.0 by The Carpentries

    Built with sandpaper (0.16.12), pegboard (0.7.9), and varnish (1.0.7.9000)

