foa-checklist-extraction-udm

Slug: foa-checklist-extraction-udm
Version: 0.1.0
Status: experimental
Last fully evaluated: none
Eval state: no validated eval cases
Category: extraction
Domain: research-administration
Manifestations: prompt
Created: 2026-04-30
Updated: 2026-04-30

Tags: foa federal-funding pre-award checklist evaluation-criteria nih hhs doe udm structured-extraction json

Audience: sponsored-programs-staff, pre-award-teams, ingest-pipelines

Manifestations in repo: prompt.md

Extracts a Federal Funding Opportunity Announcement (FOA) into a structured JSON object covering the eight reference sections a federal-grants office uses for FOA review: FOA summary, key dates, funding information, eligibility, application components, evaluation process, program priorities, and special requirements. Compared with its sibling rfa-checklist-extraction-udm, it places stronger emphasis on the evaluation criteria, review process, and submission system fields.

Output contract: schema.json

Contract scope: repo-local, UDM-aligned

Inputs

Full text of an FOA (HHS / NIH / DOE / DOD / agency-issued funding opportunity announcement) — pasted text, attached PDF/DOCX/HTML, or URL.

Outputs

A single JSON object covering 31 schema fields organized around the eight FOA reference sections (see prompt.md for the full breakdown). Headline blocks: evaluation_criteria (with weights), review_stages (sequential), program_goals (table), required_registrations (SAM/UEI/eRA Commons/Grants.gov), required_forms, application_components, page_limits (with what_counts rules).

See schema.json for the authoritative definition.

Contract scope

Repo-local, UDM-aligned. UDM bindings: foa_number → RFA.Opportunity_Number; cfda_number → RFA.CFDA_Number; federal_agency → Organization.Organization_Name; foa_title → RFA.RFA_Title. The structured shape mirrors the deliverable produced by the foa-checklist-extraction Vandalizer workflow in the ui-insight/ProcessMapping process-mapping corpus.
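The four bindings above can be read as a small lookup table. The following is a minimal sketch, not part of the component: the helper name `project_to_udm` and the grouping of values by UDM table are assumptions made for illustration, and the title value is a placeholder.

```python
# Binding targets copied from the contract-scope text above.
UDM_BINDINGS = {
    "foa_number": "RFA.Opportunity_Number",
    "cfda_number": "RFA.CFDA_Number",
    "federal_agency": "Organization.Organization_Name",
    "foa_title": "RFA.RFA_Title",
}


def project_to_udm(extraction: dict) -> dict:
    """Group the UDM-bound scalar values by table; repo-local fields are ignored."""
    rows: dict = {}
    for field, target in UDM_BINDINGS.items():
        table, column = target.split(".", 1)
        rows.setdefault(table, {})[column] = extraction.get(field)
    return rows


# Illustrative input only; the title is a placeholder, not a real FOA title.
example = {
    "foa_number": "PA-24-246",
    "cfda_number": None,
    "federal_agency": "NIH",
    "foa_title": "Placeholder title",
    "total_funding": 10000000,
}
print(project_to_udm(example))
```

Fields outside the four bindings, such as `total_funding` here, stay in the repo-local object and are not projected.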

Relationship to rfa-checklist-extraction-udm

| Concern | rfa-checklist-extraction-udm | foa-checklist-extraction-udm (this) |
| --- | --- | --- |
| Emphasis | NSF/NIH solicitation triage; placement-rule de-duplication | FOA review with strong evaluation-criteria, review-process, and submission-system focus |
| Output sections | 8 (dates, institutions, individuals, award, components, budget, submission, special, notes) | 8 (FOA summary, key dates, funding, eligibility, application components, evaluation process, program priorities, special requirements) |
| Distinct fields | cost_sharing.status enum; important_notes synthesis | evaluation_criteria table with weights; review_stages sequential list; program_goals table |

A single announcement can be extracted through both contracts when the downstream consumer needs both cuts.

Triad integration

  • Evaluation datasets: none yet — planned: NIH PA/PAR/RFA with multi-stage review (LOI → full → panel → council); DOE FOA with explicit point-weighted evaluation criteria; HHS NOFO with required pre-application registrations (SAM, UEI, eRA Commons).
  • Harness notes: canonical manifestation is prompt.md. The companion top-level workflows/foa-checklist-extraction Vandalizer workflow at v0.1.0 implements the contract as six parallel Extraction tasks plus a Consolidation Prompt.

Runtime topology — the Vandalizer workflow

The canonical runtime is the foa-checklist-extraction workflow.

  • Step 1 (parallel Extraction) — six Extraction tasks mirroring the source ProcessMapping workflow one-for-one (FOA Identification & Timeline, Evaluation Criteria, Review Process, Agency Priorities & Goals, Forms & Submission Systems, Formatting Requirements).
  • Step 2 (Consolidation Prompt) — assembles the six fragments into a single schema-conformant object and enforces the cross-field reconciliation between expected_awards, award_range, and total_funding.
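The two-step topology above can be sketched as a fan-out followed by a merge. This is an illustration under stated assumptions, not the Vandalizer implementation: the real tasks and the Consolidation Prompt are model calls, whereas `run_task`, `consolidate`, `run_workflow`, and the assignment of keys to tasks in `STUB_FRAGMENTS` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments: which keys each task owns is an assumption made here.
STUB_FRAGMENTS = {
    "FOA Identification & Timeline": {"foa_number": "DE-FOA-0003117", "critical_dates": []},
    "Evaluation Criteria": {"evaluation_criteria": [], "total_points": None},
    "Review Process": {"review_stages": [], "review_timeline": None},
    "Agency Priorities & Goals": {"program_goals": [], "priority_areas": []},
    "Forms & Submission Systems": {"required_forms": [], "required_registrations": []},
    "Formatting Requirements": {"page_limits": [], "formatting_standards": None},
}


def run_task(task_name: str, foa_text: str) -> dict:
    """Stand-in for one Extraction task; a real task would call a model."""
    return dict(STUB_FRAGMENTS[task_name])


def consolidate(fragments: list[dict]) -> dict:
    """Merge fragments into one object and refuse silently conflicting values."""
    merged: dict = {}
    for index, fragment in enumerate(fragments):
        for key, value in fragment.items():
            if key in merged and merged[key] != value:
                raise ValueError(f"fragment {index} disagrees on {key!r}")
            merged[key] = value
    return merged


def run_workflow(foa_text: str) -> dict:
    # Step 1: run the six tasks in parallel. Step 2: merge their fragments.
    with ThreadPoolExecutor(max_workers=len(STUB_FRAGMENTS)) as pool:
        fragments = list(pool.map(lambda name: run_task(name, foa_text), STUB_FRAGMENTS))
    return consolidate(fragments)


print(sorted(run_workflow("placeholder FOA text")))
```

Raising on conflicting keys is a design choice of this sketch; it makes disagreements between tasks visible instead of letting the last fragment win.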

Manifestations

  • prompt.md — canonical, LLM-agnostic prompt

Evals

See evals/ for reference inputs and known-good outputs.

Provenance

Authored 2026-04-30 against the foa-checklist-extraction (Workflow_ID: WF-FOA-CHECKLIST-EXTRACTION) process-mapping workflow in ui-insight/ProcessMapping at commit b7176b0c913833a205efdb5e4ba00c17ff88af0f.

Contract scope

  • Output format: json_object

  • Contract scope: shared_udm_semantics_repo_local_schema

  • Validation surfaces: json_schema

  • Schema entrypoints: #

  • Notes: Repo-local 31-field FOA reference contract. Sibling of rfa-checklist-extraction-udm with stronger emphasis on evaluation criteria (with weights), multi-stage review process, program goals, and submission-system / forms / page-limit requirements. UDM bindings: foa_number to RFA.Opportunity_Number, cfda_number to RFA.CFDA_Number, federal_agency to Organization.Organization_Name, foa_title to RFA.RFA_Title.

  • Machine-readable catalog entry: component_catalog.json

Triad integration

  • UDM alignment: shared_udm_semantics_repo_local_schema — Scalar FOA-identifying fields resolve to UDM RFA and Organization. The 31-field shape itself is repo-local.

  • Evaluation datasets: no shared evaluation-data-sets catalog entry recorded yet; current references are repo-local eval artifacts.

  • Harness notes: Validate JSON outputs against schema.json. Canonical single-call invocation surface is prompt.md. The companion top-level workflows/foa-checklist-extraction Vandalizer workflow at v0.1.0 implements the contract as six parallel Extraction tasks plus a Consolidation Prompt that assembles the 31-field schema and enforces cross-field consistency (chronological critical_dates; expected_awards * max(award_range) <= total_funding).

  • Related component: rfa-checklist-extraction-udm (sibling_same_source_different_shape) — Same source family (federal funding announcements) with a different output cut. rfa-checklist-extraction-udm emphasizes NSF/NIH solicitation triage with placement-rule de-duplication across the eight checklist sections; foa-checklist-extraction-udm emphasizes FOA review with strong evaluation-criteria, review-process, and submission-system focus across its eight reference sections.
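The harness note above mentions a chronological check on critical_dates. The schema types date_time as a free-form string, so the sketch below assumes that entries beginning with an ISO 8601 date (YYYY-MM-DD) can be compared and that anything else is skipped. The function name `out_of_order` and the sample dates are illustrative, not taken from any FOA.

```python
from datetime import date


def parse_leading_iso_date(value: str):
    """Return a date when the string starts with YYYY-MM-DD, else None."""
    try:
        return date.fromisoformat(value[:10])
    except ValueError:
        return None


def out_of_order(critical_dates: list[dict]) -> list[str]:
    """List milestones whose parseable date is earlier than a preceding entry's."""
    problems = []
    latest = None
    for entry in critical_dates:
        parsed = parse_leading_iso_date(entry["date_time"])
        if parsed is None:
            continue  # free-form text such as "Spring 2027" cannot be ordered here
        if latest is not None and parsed < latest:
            problems.append(entry["milestone"])
        else:
            latest = parsed
    return problems


# Illustrative values only.
dates = [
    {"milestone": "Letter of intent", "date_time": "2026-06-01", "notes": None},
    {"milestone": "Full application", "date_time": "2026-05-15", "notes": None},
    {"milestone": "Anticipated start", "date_time": "Spring 2027", "notes": None},
]
print(out_of_order(dates))  # → ['Full application']
```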

Prompt body

Source: prompt.md.

FOA Checklist Extraction — UDM JSON

Purpose: Extract a Federal Funding Opportunity Announcement (FOA) into a structured JSON object covering the eight reference sections a pre-award team uses when evaluating an FOA: FOA summary, key dates, funding information, eligibility, application components, evaluation process, program priorities, and special requirements.

Expected input: Full text of an FOA (HHS / NIH / DOE / DOD / agency-issued funding opportunity announcement).

Expected output: A single JSON object that validates against schema.json. No prose, no markdown outside the JSON.

Relationship to rfa-checklist-extraction-udm

foa-checklist-extraction-udm and rfa-checklist-extraction-udm are siblings. Both extract a federal funding announcement into a structured pre-award checklist; they differ in emphasis and in the eight checklist sections each produces:

| Concern | rfa-checklist-extraction-udm | foa-checklist-extraction-udm (this) |
| --- | --- | --- |
| Emphasis | NSF/NIH solicitation triage; placement-rule de-duplication | FOA review with strong evaluation-criteria, review-process, and submission-system focus |
| Output sections | 8 (dates, institutions, individuals, award, components, budget, submission, special, notes) | 8 (FOA summary, key dates, funding, eligibility, application components, evaluation process, program priorities, special requirements) |
| Distinct fields | cost_sharing.status enum; important_notes synthesis | evaluation_criteria table with weights; review_stages list; program_goals table |

A single announcement can be extracted through both contracts when the downstream consumer needs both cuts. UDM bindings: foa_number → RFA.Opportunity_Number; cfda_number → RFA.CFDA_Number; federal_agency → Organization.Organization_Name; foa_title → RFA.RFA_Title.


Prompt

You are extracting a Federal Funding Opportunity Announcement (FOA) into the eight pre-award checklist sections a federal-grants office uses for FOA review. Capture the FOA's identity and timeline; the evaluation criteria and scoring methodology; the review process; agency priorities and program goals; the forms and submission systems; and the formatting requirements.

Be 100% accurate. Match the schema's type for each field exactly:

  • Number-typed fields (total_funding) — emit as JSON numbers. $10,000,000 in the document → 10000000 in JSON. No quotes, no currency symbol, no thousand-separators.

  • Integer-typed fields (expected_awards) — emit as JSON integers.

  • Boolean-typed fields (cost_sharing_required) — emit as true/false/null.

  • String-typed fields (award_range, all percent strings, evaluation_criteria[].weight, page limits in page_limits[].limit, etc.) — quote verbatim, preserving %, $, page-counts ("15 pages"), and other document rendering.

When a field is not specified, set it to null or — for arrays/tables — return an empty array. Do not invent values.
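The number rule above ($10,000,000 in the document becomes 10000000 in JSON) can be illustrated with a small normalizer. This is a sketch for downstream checking, not part of the prompt: `money_to_number` is a hypothetical helper, and the handling of scale words such as "million" is an assumption about how some announcements phrase amounts.

```python
import json
import re
from decimal import Decimal

# Scale words are an assumption; extend or remove them to suit the corpus.
SCALE = {"thousand": 10**3, "million": 10**6, "billion": 10**9}
AMOUNT = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?", re.IGNORECASE)


def money_to_number(text):
    """Turn '$10,000,000' into 10000000; return None when no amount is found."""
    if not text:
        return None
    match = AMOUNT.search(text)
    if match is None:
        return None
    value = Decimal(match.group(1).replace(",", ""))
    if match.group(2):
        value *= SCALE[match.group(2).lower()]
    # Emit an int when the amount is whole so the JSON carries no trailing ".0".
    return int(value) if value == value.to_integral_value() else float(value)


print(json.dumps({"total_funding": money_to_number("$10,000,000")}))
# → {"total_funding": 10000000}
```

When no amount is present the helper returns None, which serializes to JSON null, matching the rule that unspecified fields are null and never invented.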

Return a single JSON object that validates against schema.json with these top-level keys:

  • foa_number — funding opportunity number (e.g., "PA-24-246", "DE-FOA-0003117"). String. Required.

  • cfda_number — CFDA number and title (e.g., "93.213 Research and Training in Complementary..."). String or null.

  • federal_agency — one of "NSF", "NIH", "DOD", "DOE", "NASA", "USDA", "EPA", "DOT", "DOC", "ED", or "Other".

  • foa_title — full title of the funding opportunity. String. Required.

  • total_funding — number (decimal) for total available funding, or null.

  • expected_awards — integer count of expected awards, or null.

  • award_range — string (e.g., "$300,000 to $750,000 per award") or null.

  • cost_sharing_required — boolean or null.

  • critical_dates — array of {milestone, date_time, notes} objects covering all critical dates (LOI, full proposal, panel review, anticipated start). Required; may be empty when the announcement is rolling.

  • performance_period — string with expected project-duration range or null.

  • eligible_applicants — flat list of eligible applicant types as stated. Required.

  • evaluation_criteria — array of {criterion_name, weight, description, rating_definitions} objects.

  • total_points — string (e.g., "100 points") or null.

  • scoring_methodology — string describing how scores are calculated, or null.

  • minimum_threshold — string (e.g., "score of 80 minimum for funding consideration") or null.

  • review_stages — array of {stage_name, purpose, outcome} objects.

  • review_personnel — string describing review-panel composition or null.

  • screening_criteria — array of strings naming initial-screening / disqualification factors. Empty when none.

  • review_timeline — string (e.g., "Estimated 90 days from submission to notification") or null.

  • agency_mission — string or null.

  • program_goals — array of {goal, objective, success_indicator} objects.

  • priority_areas — flat list of priority areas in stated order of importance. Empty when none.

  • expected_outcomes — flat list of required deliverables / intended impacts. Empty when none.

  • submission_platform — string (e.g., "Grants.gov", "eRA Commons"). Required.

  • required_registrations — array of {system_name, timeline, prerequisites} objects (SAM, UEI, eRA Commons, Grants.gov).

  • required_forms — array of {form_number, name, purpose, special_instructions} objects.

  • application_components — array of {component_name, required_or_optional, page_limit, format_requirements} objects.

  • document_structure — flat list of required proposal sections in the exact order required. Empty when not specified.

  • page_limits — array of {section, limit, what_counts, consequences} objects.

  • formatting_standards — string (margin/font/spacing) or null.

  • file_requirements — string (acceptable formats, sizes, naming) or null.
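The key list above, together with the schema's required array and additionalProperties: false, lends itself to a quick structural check before full schema validation. The sketch below uses only the standard library; `check_top_level` is a hypothetical helper, and it covers top-level key presence only, not types or nested objects.

```python
# The 12 keys listed under "required" in schema.json.
REQUIRED_KEYS = {
    "foa_number", "federal_agency", "foa_title", "critical_dates",
    "eligible_applicants", "evaluation_criteria", "review_stages",
    "program_goals", "submission_platform", "required_registrations",
    "required_forms", "application_components",
}

# The remaining 19 properties, which may be omitted.
OPTIONAL_KEYS = {
    "cfda_number", "total_funding", "expected_awards", "award_range",
    "cost_sharing_required", "performance_period", "total_points",
    "scoring_methodology", "minimum_threshold", "review_personnel",
    "screening_criteria", "review_timeline", "agency_mission",
    "priority_areas", "expected_outcomes", "document_structure",
    "page_limits", "formatting_standards", "file_requirements",
}


def check_top_level(obj: dict) -> list[str]:
    """Report missing required keys and keys the schema does not define."""
    present = set(obj)
    problems = [f"missing required key: {key}" for key in sorted(REQUIRED_KEYS - present)]
    problems += [
        f"unexpected key: {key}"
        for key in sorted(present - REQUIRED_KEYS - OPTIONAL_KEYS)
    ]
    return problems


print(len(REQUIRED_KEYS | OPTIONAL_KEYS))  # → 31
```

A full validator for JSON Schema draft 2020-12 is still needed for types, enums, and nested rows; this check only catches the cheapest failures early.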

Encoding rules

  1. evaluation_criteria rows quote weights verbatim. "Significance — 25 points" → {criterion_name: "Significance", weight: "25 points", ...}. If the FOA gives percentages, quote the percentage; if it gives points, quote points.

  2. critical_dates covers every gating milestone — LOI, pre-application, full application, anticipated panel review date, expected start date.

  3. review_stages is sequential. Each stage's outcome describes what proceeds to the next stage.

  4. required_registrations covers every prerequisite system (SAM.gov, UEI, eRA Commons, Grants.gov, agency-specific portals). When the FOA does not call any out, return an empty array — do not synthesize the standard four.

  5. page_limits.what_counts is the rule for what counts toward the limit (text only, references included, attachments separate, etc.).

  6. expected_awards * max(award_range) should not exceed total_funding (downstream cross-field check CHK-03).

  7. Do not output any text outside the single JSON object.
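Rule 6 compares a number, an integer, and a verbatim string, so the string has to be parsed before the check can run. The following is a minimal sketch of one way a downstream checker might do that. The names `max_award` and `funding_is_consistent` are hypothetical, and the parser assumes dollar amounts written with a leading `$`.

```python
import re


def max_award(award_range):
    """Largest dollar figure in a string such as '$300,000 to $750,000 per award'."""
    if not award_range:
        return None
    found = re.findall(r"\$\s*(\d[\d,]*(?:\.\d+)?)", award_range)
    amounts = [float(item.replace(",", "")) for item in found]
    return max(amounts) if amounts else None


def funding_is_consistent(obj: dict):
    """True or False when all three inputs are present; None when the check cannot run."""
    ceiling = max_award(obj.get("award_range"))
    awards = obj.get("expected_awards")
    total = obj.get("total_funding")
    if ceiling is None or awards is None or total is None:
        return None
    return awards * ceiling <= total


# Illustrative values: 10 awards at up to $750,000 fit within $10,000,000.
sample = {
    "total_funding": 10000000,
    "expected_awards": 10,
    "award_range": "$300,000 to $750,000 per award",
}
print(funding_is_consistent(sample))  # → True
```

Returning None instead of False when an input is missing keeps "cannot be checked" distinct from "inconsistent", in line with the instruction to leave unspecified fields null.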

Output

A single JSON object. No surrounding markdown.

Output schema

Source: schema.json.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://github.com/AI4RA/prompt-library/components/foa-checklist-extraction-udm/schema.json",
  "title": "FOA Checklist Extraction \u2014 UDM Output",
  "description": "JSON contract for a Federal Funding Opportunity Announcement (FOA) distilled into the eight pre-award checklist sections a federal-grants office uses for FOA review: FOA summary, key dates, funding, eligibility, application components, evaluation process, program priorities, special requirements. Sibling of rfa-checklist-extraction-udm with stronger emphasis on evaluation-criteria, review-process, and submission-system fields.",
  "version": "0.1.0",
  "type": "object",
  "additionalProperties": false,
  "required": [
    "foa_number",
    "federal_agency",
    "foa_title",
    "critical_dates",
    "eligible_applicants",
    "evaluation_criteria",
    "review_stages",
    "program_goals",
    "submission_platform",
    "required_registrations",
    "required_forms",
    "application_components"
  ],
  "properties": {
    "foa_number": {
      "type": "string",
      "minLength": 1,
      "description": "Funding opportunity number. Resolves to UDM RFA.Opportunity_Number."
    },
    "cfda_number": {
      "type": ["string", "null"],
      "description": "CFDA number and title. Resolves to UDM RFA.CFDA_Number."
    },
    "federal_agency": {
      "type": "string",
      "enum": ["NSF", "NIH", "DOD", "DOE", "NASA", "USDA", "EPA", "DOT", "DOC", "ED", "Other"],
      "description": "Federal agency name. Resolves to UDM Organization.Organization_Name."
    },
    "foa_title": {
      "type": "string",
      "minLength": 1,
      "description": "Full title of the funding opportunity. Resolves to UDM RFA.RFA_Title."
    },
    "total_funding": {
      "type": ["number", "null"],
      "description": "Total available funding."
    },
    "expected_awards": {
      "type": ["integer", "null"],
      "minimum": 0,
      "description": "Expected number of awards."
    },
    "award_range": {
      "type": ["string", "null"]
    },
    "cost_sharing_required": {
      "type": ["boolean", "null"]
    },
    "critical_dates": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["milestone", "date_time"],
        "properties": {
          "milestone": { "type": "string", "minLength": 1 },
          "date_time": { "type": "string", "minLength": 1 },
          "notes": { "type": ["string", "null"] }
        }
      },
      "description": "All gating milestones (LOI, pre-application, full application, panel review, anticipated start)."
    },
    "performance_period": {
      "type": ["string", "null"]
    },
    "eligible_applicants": {
      "type": "array",
      "minItems": 1,
      "items": { "type": "string", "minLength": 1 }
    },
    "evaluation_criteria": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["criterion_name", "description"],
        "properties": {
          "criterion_name": { "type": "string", "minLength": 1 },
          "weight": {
            "type": ["string", "null"],
            "description": "Quote points or percentage verbatim."
          },
          "description": { "type": "string", "minLength": 1 },
          "rating_definitions": { "type": ["string", "null"] }
        }
      }
    },
    "total_points": {
      "type": ["string", "null"]
    },
    "scoring_methodology": {
      "type": ["string", "null"]
    },
    "minimum_threshold": {
      "type": ["string", "null"]
    },
    "review_stages": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["stage_name", "purpose"],
        "properties": {
          "stage_name": { "type": "string", "minLength": 1 },
          "purpose": { "type": "string", "minLength": 1 },
          "outcome": {
            "type": ["string", "null"],
            "description": "What proceeds to the next stage."
          }
        }
      }
    },
    "review_personnel": {
      "type": ["string", "null"]
    },
    "screening_criteria": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 }
    },
    "review_timeline": {
      "type": ["string", "null"]
    },
    "agency_mission": {
      "type": ["string", "null"]
    },
    "program_goals": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["goal"],
        "properties": {
          "goal": { "type": "string", "minLength": 1 },
          "objective": { "type": ["string", "null"] },
          "success_indicator": { "type": ["string", "null"] }
        }
      }
    },
    "priority_areas": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 },
      "description": "In stated order of importance."
    },
    "expected_outcomes": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 }
    },
    "submission_platform": {
      "type": "string",
      "minLength": 1
    },
    "required_registrations": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["system_name"],
        "properties": {
          "system_name": { "type": "string", "minLength": 1 },
          "timeline": { "type": ["string", "null"] },
          "prerequisites": { "type": ["string", "null"] }
        }
      }
    },
    "required_forms": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["name"],
        "properties": {
          "form_number": { "type": ["string", "null"] },
          "name": { "type": "string", "minLength": 1 },
          "purpose": { "type": ["string", "null"] },
          "special_instructions": { "type": ["string", "null"] }
        }
      }
    },
    "application_components": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["component_name"],
        "properties": {
          "component_name": { "type": "string", "minLength": 1 },
          "required_or_optional": {
            "type": ["string", "null"],
            "enum": ["required", "optional", null]
          },
          "page_limit": { "type": ["string", "null"] },
          "format_requirements": { "type": ["string", "null"] }
        }
      }
    },
    "document_structure": {
      "type": "array",
      "items": { "type": "string", "minLength": 1 }
    },
    "page_limits": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["section", "limit"],
        "properties": {
          "section": { "type": "string", "minLength": 1 },
          "limit": { "type": "string", "minLength": 1 },
          "what_counts": { "type": ["string", "null"] },
          "consequences": { "type": ["string", "null"] }
        }
      }
    },
    "formatting_standards": {
      "type": ["string", "null"]
    },
    "file_requirements": {
      "type": ["string", "null"]
    }
  }
}

Changelog

Source: CHANGELOG.md.

All notable changes to this component. Versions follow semver.

[0.1.0] — 2026-04-30

  • Initial experimental release.
  • Schema derived from the foa-checklist-extraction v2 Vandalizer workflow in ui-insight/ProcessMapping (six parallel Extraction tasks + Formatting task; 31 source fields).
  • 31-field shape covering the eight FOA reference sections used by federal-grants offices for FOA review.
  • federal_agency enum matches the source Federal_Agency Enum_Values (NSF, NIH, DOD, DOE, NASA, USDA, EPA, DOT, DOC, ED, Other).
  • evaluation_criteria, review_stages, program_goals, required_registrations, required_forms, application_components, page_limits, and critical_dates realized as arrays of typed objects (rather than the source Table fields) so per-row attributes attach to the right entry.
  • Cross-field rule from the source workflow (CFR-01: expected_awards * max(award_range) <= total_funding) is encoded in the prompt's encoding rules and the workflow validation_plan.
  • UDM column bindings preserved: foa_number → RFA.Opportunity_Number; cfda_number → RFA.CFDA_Number; federal_agency → Organization.Organization_Name; foa_title → RFA.RFA_Title.
  • Sibling of rfa-checklist-extraction-udm: same source family, different output cut.
  • No eval cases yet — status experimental until at least one golden extraction is added under evals/cases/.