rfa-checklist-extraction-udm

Slug: rfa-checklist-extraction-udm
Version: 0.1.0
Status: experimental
Last fully evaluated: none
Eval state: no validated eval cases
Category: extraction
Domain: research-administration
Manifestations: prompt
Created: 2026-04-24
Updated: 2026-04-24

Tags: rfa foa nofo grants pre-award checklist udm structured-extraction json

Audience: sponsored-programs-staff, pre-award-teams, ingest-pipelines

Manifestations in repo: prompt.md

Extracts a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured JSON object organized around the eight-section pre-award checklist that a sponsored-programs analyst actually uses when triaging a new opportunity. The shape enforces placement rules — award amount lives in award_information only, detailed financial rules in budget_requirements only — so downstream consolidation does not have to re-adjudicate where a fact belongs.

Output contract: schema.json

Contract scope: repo-local, UDM-aligned

Inputs

Full text of a funding announcement — pasted text, attached PDF/DOCX/HTML, or URL. Optional knowledge-base context from Uniform Guidance (2 CFR 200), NSF PAPPG, or NIH Grants Policy Statement is injected by the runtime workflow but not required by the prompt itself.

Outputs

A single JSON object with:

  • Scalar opportunity metadata: rfa_id, rfa_number, rfa_title, sponsor_name, program_code, announcement_url, opportunity_number, cfda_number
  • Eight structured sections matching the consolidated checklist (eligible_institutions and eligible_individuals together form the single Eligibility section):
      • dates_and_deadlines: array of {item, date_time, notes}
      • eligible_institutions: array of {type, subcategory, examples, compliance_requirements}
      • eligible_individuals: array of {type, criteria, compliance_requirements, conditions}
      • award_information: {award_duration, amount_per_award, number_of_awards, anticipated_award_date}
      • required_components / optional_components: arrays of {name, description, special_requirements}
      • budget_requirements: {funding_limits, cost_sharing: {status, details}, fa_policy, allowable_costs, unallowable_costs, personnel_effort, other_considerations}
      • submission_details: string
      • special_requirements: array of strings
      • important_notes: array of strings

See schema.json for the authoritative definition and prompt.md for the encoding rules (date formats, placement contract, quotation requirements, extraction strategy).
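As a rough illustration of the shape listed above, the following Python sketch builds a minimal object and checks it against the top-level `required` list from schema.json. Every value is an invented placeholder rather than text from a real announcement, and `missing_required` is a hypothetical helper, not part of the repo.

```python
import json

# Top-level keys listed under "required" in schema.json.
REQUIRED_TOP_LEVEL = [
    "rfa_title", "sponsor_name", "dates_and_deadlines",
    "eligible_institutions", "eligible_individuals", "award_information",
    "required_components", "optional_components", "budget_requirements",
    "submission_details", "special_requirements", "important_notes",
]

# Placeholder values only; this is not a real extraction.
minimal_output = {
    "rfa_id": None,
    "rfa_number": None,
    "rfa_title": "Example Program Solicitation",
    "sponsor_name": "National Science Foundation",
    "program_code": None,
    "announcement_url": None,
    "opportunity_number": None,
    "cfda_number": None,
    "dates_and_deadlines": [
        {"item": "Full Proposal Deadline", "date_time": "2026-10-01", "notes": None}
    ],
    "eligible_institutions": [
        {"type": "Institutions of Higher Education", "subcategory": None,
         "examples": None, "compliance_requirements": "SAM.gov registration required"}
    ],
    "eligible_individuals": [
        {"type": "Principal Investigator", "criteria": None,
         "compliance_requirements": None, "conditions": None}
    ],
    "award_information": {
        "award_duration": "3 years",
        "amount_per_award": "$500,000 total costs including indirect",
        "number_of_awards": None,
        "anticipated_award_date": None,
    },
    "required_components": [
        {"name": "Project Description",
         "description": "Not specified in this document",
         "special_requirements": None}
    ],
    "optional_components": [],
    "budget_requirements": {
        "funding_limits": None,
        "cost_sharing": {"status": "Not Specified", "details": None},
        "fa_policy": None,
        "allowable_costs": [],
        "unallowable_costs": [],
        "personnel_effort": None,
        "other_considerations": None,
    },
    "submission_details": None,
    "special_requirements": [],
    "important_notes": [],
}


def missing_required(obj: dict) -> list[str]:
    """Return the required top-level keys that are absent from obj."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in obj]


serialized = json.dumps(minimal_output, indent=2)  # what a runtime would emit
```

This only checks key presence; schema.json remains the authoritative validation surface.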

Contract scope

Repo-local, UDM-aligned. The scalar fields (rfa_id, rfa_number, rfa_title, sponsor_name, cfda_number, ...) follow the conventions set by rfp-extraction-udm 1.0.0 and resolve downstream to UDM entities (RFA, Sponsor_Organization). The structured sections do not duplicate any shared UDM schema; they are repo-local to this component and mirror the eight-section deliverable produced by the rfa-checklist-extraction Vandalizer workflow in the ui-insight/ProcessMapping process-mapping corpus. Selected leaf fields reference UDM columns: cost_sharing → CostShare, fa_policy → IndirectRate, personnel_effort → Effort.
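The three leaf-field bindings can be written down as a small lookup. The sketch below is only an illustration: the mapping comes from this section, but `project_to_udm_columns` is a hypothetical helper, and how a real UDM ingest service consumes these values is not specified here.

```python
# Leaf-field -> UDM column bindings stated in this section.
UDM_COLUMN_BINDINGS = {
    "cost_sharing": "CostShare",
    "fa_policy": "IndirectRate",
    "personnel_effort": "Effort",
}


def project_to_udm_columns(budget_requirements: dict) -> dict:
    """Return {UDM column name: value} for the three bound leaf fields."""
    return {
        column: budget_requirements.get(field)
        for field, column in UDM_COLUMN_BINDINGS.items()
    }
```

Fields without a binding (for example funding_limits) are simply left out of the projection.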

Relationship to rfp-extraction-udm

| Concern | rfp-extraction-udm | rfa-checklist-extraction-udm |
| --- | --- | --- |
| Audience | Ingest pipelines (flat arrays of typed requirements) | Pre-award offices (eight-section checklist) |
| Shape | 18 scalars + nine requirement arrays with a common requirement shape | 8 scalars + eight typed section objects/arrays, each with a distinct field shape |
| De-duplication | Not enforced by shape: every requirement is independent | Enforced by shape: award amount lives only in award_information, financial rules only in budget_requirements |
| Checklist reconstruction | Requires caller to re-group by category | Direct 1:1 map to the eight-section markdown deliverable |
| Typical downstream | UDM ingest service | Consolidation Prompt node rendering a markdown checklist |

The two components are versioned independently. A single announcement can be extracted through both contracts for different consumers.

Triad integration

  • Evaluation datasets: none yet — planned: add an RFA case to real.nsf_awards or a new real.rfa_checklists dataset with expected.json produced from a sponsored-programs-reviewed extraction.
  • Harness notes: canonical manifestation is prompt.md. Validation surface is schema.json. Vendored into runners via harness prompts vendor --source-ref=<sha>; pinned in prompts.lock.json.
  • Shared UDM relationship: aligned, not owning. rfa_id, sponsor_name, and the three UDM-column leaf fields match the naming conventions in AI4RA-UDM but this component does not redefine UDM tables.

Runtime topology — the Vandalizer workflow

The canonical runtime for this component is the rfa-checklist-extraction workflow shipped at the top level of this repo. The single source of truth is workflows/rfa-checklist-extraction/manifest.yaml; the companion .vandalizer.json envelope is generated by scripts/build_vandalizer_workflows.py and committed alongside. The runtime mirrors the source ui-insight/ProcessMapping/workflows/rfa-checklist-extraction/ workflow:

  • Step 1 (parallel Extraction) — seven Extraction tasks. Six mirror the source workflow one-for-one (dates, eligible institutions, eligible individuals, award info, application components, budget). A seventh extract-opportunity-metadata task captures the eight UDM-aligned scalar opportunity fields the schema adds on top of the source workflow. Each task carries an embedded SearchSet whose item titles match this component's schema field names; cost_sharing_status is exposed as the four-value enum.
  • Step 2 (consolidation Prompt) — assembles the seven JSON fragments into the schema-conformant object, including the nested cost_sharing: {status, details} rebuild and important_notes synthesis from cross-section signals.
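The real Step 2 is a Prompt node, so the following is only a sketch of the deterministic part of the reassembly: merging per-task fragments and rebuilding the nested cost_sharing object. It assumes each fragment is a dict with keys that no other task emits; `cost_sharing_status` is the item title mentioned above, while `cost_sharing_details` and the function name `consolidate` are hypothetical.

```python
def consolidate(fragments: list[dict]) -> dict:
    """Merge per-task JSON fragments into one object and rebuild cost_sharing."""
    merged: dict = {}
    for fragment in fragments:
        overlap = merged.keys() & fragment.keys()
        if overlap:
            # Two tasks emitting the same field would break the placement contract.
            raise ValueError(f"fields emitted by more than one task: {sorted(overlap)}")
        merged.update(fragment)

    # Hypothetical flat keys from the budget task; schema.json expects the nested form.
    status = merged.pop("cost_sharing_status", "Not Specified")
    details = merged.pop("cost_sharing_details", None)

    # Copy before editing so the caller's fragment is not mutated.
    budget = dict(merged.get("budget_requirements") or {})
    budget["cost_sharing"] = {"status": status, "details": details}
    merged["budget_requirements"] = budget
    return merged
```

The important_notes synthesis is left out because it depends on model judgment rather than on a mechanical merge.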

Regenerate the workflow JSON whenever this component bumps MINOR or MAJOR (or whenever the workflow manifest changes); CI fails if the committed .vandalizer.json drifts from a fresh build.
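One way such a drift check could work is to compare the committed envelope with a freshly built one after canonical re-serialization. This is an assumption for illustration only: the actual CI job and the interface of scripts/build_vandalizer_workflows.py are not described here, and the real check may well be a plain byte comparison. Both function names are hypothetical.

```python
import json


def canonical(text: str) -> str:
    """Re-serialize JSON with sorted keys so formatting differences do not count as drift."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


def has_drifted(committed_text: str, fresh_text: str) -> bool:
    """True when the committed JSON and the fresh build differ in content."""
    return canonical(committed_text) != canonical(fresh_text)
```

Comparing canonical forms ignores whitespace and key order, which a byte-level comparison would flag.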

Manifestations

  • prompt.md — canonical, LLM-agnostic prompt

Evals

See evals/ for reference inputs and known-good outputs. Initial case pending: NSF-published RFA with a multi-round structure, cost-sharing prohibition, and explicit allowable/unallowable categories to exercise the placement contract.

Provenance

Authored 2026-04-24 against the existing rfa-checklist-extraction process-mapping workflow (v2) in ui-insight/ProcessMapping, which was built from walkthrough transcripts of University of Idaho sponsored-programs staff reviewing NSF and NIH announcements. Created to make that workflow a harness-evaluatable, versioned artifact rather than a runtime-embedded configuration.

Contract scope

  • Output format: json_object

  • Contract scope: shared_udm_semantics_repo_local_schema

  • Validation surfaces: json_schema

  • Schema entrypoints: #

  • Notes: Repo-local eight-section pre-award checklist contract for funding announcements. Scalar opportunity metadata aligns with UDM opportunity semantics; the eight structured sections enforce the pre-award de-duplication placement rules (award amount only in award_information, detailed financial rules only in budget_requirements) at schema level.

  • Machine-readable catalog entry: component_catalog.json

Triad integration

  • UDM alignment: shared_udm_semantics_repo_local_schema — Scalar metadata (rfa_id, rfa_number, sponsor_name, cfda_number) follows rfp-extraction-udm conventions and resolves to UDM RFA and Sponsor_Organization entities. Leaf fields carry UDM column bindings: cost_sharing to CostShare, fa_policy to IndirectRate, personnel_effort to Effort. The eight-section shape itself is repo-local.

  • Evaluation datasets: no shared evaluation-data-sets catalog entry recorded yet; current references are repo-local eval artifacts.

  • Harness notes: Validate JSON outputs against schema.json. Canonical single-call invocation surface is prompt.md. The companion top-level workflows/rfa-checklist-extraction Vandalizer workflow at v0.2.0 implements the same contract as a seven-task parallel Extraction pipeline (six section extractions plus a metadata extraction) followed by a Consolidation Prompt; campaign authors should record both single-call and post-consolidation signals when both are available.

  • Related component: rfp-extraction-udm (sibling_same_source_different_shape) — Same document family (RFP / RFA / FOA / NOFO / BAA / DCL) with a different output shape — rfp-extraction-udm is a nine-array requirement-centric flat contract for ingest pipelines; rfa-checklist-extraction-udm is an eight-section pre-award checklist contract with placement rules enforced by the shape itself.

  • Related component: rfp-extraction (sibling_human_vs_structured_outputs) — rfp-extraction produces the markdown checklist form for the same source family; the consolidator step downstream of rfa-checklist-extraction-udm renders an equivalent markdown checklist from the structured JSON.

Prompt body

Source: prompt.md.

RFA Checklist Extraction — UDM JSON

Purpose: Extract a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured JSON object organized around the eight-section checklist that a pre-award office actually uses when deciding whether and how to submit.

Expected input: Full text of the funding announcement, optionally with Uniform Guidance, NSF PAPPG, or NIH Grants Policy as knowledge-base context.

Expected output: A single JSON object that validates against schema.json. No prose, no markdown outside the JSON.

Relationship to other components

rfp-extraction-udm v1.0.0 covers the same document family (RFP / RFA / FOA / NOFO / BAA / DCL) with a nine-array requirement-centric contract aimed at downstream ingest. This component is a different cut of the same source: an eight-section pre-award checklist contract that mirrors how a sponsored-programs analyst reads the announcement — dates, eligibility, award, components, budget, submission, special, notes — with strict de-duplication rules baked into the shape itself (award amount lives in award_information only; detailed financial rules live in budget_requirements only).

Runtime topology: this component is consumed by the rfa-checklist-extraction Vandalizer workflow, which implements the contract as seven parallel Extraction tasks (six section extractions plus one opportunity-metadata extraction) followed by a single consolidation Prompt that enforces placement rules and renders the 8-section markdown deliverable.


Prompt

You are a research administration analyst extracting a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured checklist for a pre-award office. Your output is a single JSON object conforming to schema.json.

Output contract

Emit exactly one JSON object. No preamble, no closing commentary, no markdown fences. If your runtime requires fenced output, wrap in a single ```json ... ``` block and emit nothing outside it.
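On the calling side, a runtime has to accept both forms described above: a bare object, or a single fenced block. The sketch below is one possible parser, not the harness's actual behavior; `parse_model_output` is a hypothetical name.

```python
import json

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block


def parse_model_output(raw: str) -> dict:
    """Parse a bare JSON object, or one wrapped in a single json fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()
        if len(lines) < 2 or lines[-1].strip() != FENCE:
            raise ValueError("opening fence without a matching closing fence")
        # Drop the opening fence line (with its language tag) and the closing fence.
        text = "\n".join(lines[1:-1])
    obj = json.loads(text)  # raises a ValueError subclass on preamble or trailing text
    if not isinstance(obj, dict):
        raise ValueError("expected exactly one JSON object")
    return obj
```

Any preamble or closing commentary makes `json.loads` fail, so a violation of the output contract surfaces as an error instead of being silently trimmed.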

Every field listed in required of schema.json MUST appear. Arrays with no entries are emitted as [], never null, except eligible_institutions, eligible_individuals, and required_components which are minItems: 1 — if the announcement truly contains nothing for one of these, emit a single entry whose primary field is "Not specified in this document". Optional scalar fields are null when absent.
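The empty-value rules can be sketched as a post-processing step. This is an illustration under assumptions: the helper name is hypothetical, and because schema.json also requires a non-empty description on required_components entries, the placeholder fills both name and description there.

```python
PLACEHOLDER = "Not specified in this document"

# Arrays declared with minItems: 1, mapped to a placeholder entry.
# Filling description for required_components is an assumption of this sketch.
NON_EMPTY_ARRAYS = {
    "eligible_institutions": {"type": PLACEHOLDER},
    "eligible_individuals": {"type": PLACEHOLDER},
    "required_components": {"name": PLACEHOLDER, "description": PLACEHOLDER},
}
# Top-level arrays that may be empty but must never be null.
OTHER_ARRAYS = [
    "dates_and_deadlines", "optional_components",
    "special_requirements", "important_notes",
]


def normalize_empty_arrays(output: dict) -> dict:
    """Return a shallow copy with null or empty arrays replaced per the rules above."""
    result = dict(output)
    for key, placeholder in NON_EMPTY_ARRAYS.items():
        if not result.get(key):
            result[key] = [dict(placeholder)]
    for key in OTHER_ARRAYS:
        if result.get(key) is None:
            result[key] = []
    return result
```

The nested allowable_costs and unallowable_costs arrays inside budget_requirements follow the same "[] rather than null" rule but are not handled by this sketch.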

Scalar rules

  • rfa_id: "<SPONSOR_CODE>-<OPPORTUNITY_NUMBER>" when both are available (e.g., "NSF-26-508", "NIH-PA-24-246", "DOE-DE-FOA-0003117"). Null when no canonical identifier exists.

  • rfa_number — the sponsor's announcement number without agency prefix (e.g., "26-508").

  • rfa_title — full title including any track or component designation.

  • sponsor_name — full name of the lead sponsoring agency (e.g., "National Science Foundation"). Do not emit an abbreviation; downstream resolves this to a Sponsor_Organization_ID. When multiple agencies participate, name only the lead here and record partners in special_requirements.

  • program_code, announcement_url, opportunity_number, cfda_number — emit exactly as stated; multi-value CFDA lists go comma-separated.
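The rfa_id and multi-value CFDA rules above reduce to simple string assembly. A minimal sketch, with hypothetical helper names and the assumption that the sponsor code and number have already been extracted separately:

```python
from typing import Optional


def build_rfa_id(sponsor_code: Optional[str],
                 opportunity_number: Optional[str]) -> Optional[str]:
    """Join sponsor code and opportunity number, or return None if either is missing."""
    code = (sponsor_code or "").strip()
    number = (opportunity_number or "").strip()
    if not code or not number:
        return None
    return f"{code}-{number}"


def format_cfda(numbers: list[str]) -> Optional[str]:
    """Comma-separate multiple CFDA numbers; None when the list is empty."""
    return ", ".join(numbers) if numbers else None
```

For example, `build_rfa_id("NSF", "26-508")` gives the "NSF-26-508" form used in the rule.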

Section rules — the placement contract

This is the most important rule set. A piece of information MUST appear in exactly one section. Violations fail validation.

dates_and_deadlines — every unique date or time-bound event in the announcement, chronologically. Preserve the sponsor's original label, date/time string, and any conditions as notes. For multi-round solicitations include every round's LOI, pre-proposal, and full proposal deadlines.

eligible_institutions — one entry per distinct institution category. Put institutional prerequisites (SAM.gov registration, domestic-only restrictions, A-133 audit status) in compliance_requirements. Do not put PI-level rules here.

eligible_individuals — one entry per distinct PI / key-personnel category. Put citizenship, career-stage, degree, appointment type, and prior-award restrictions in criteria or conditions. Limits like "one nomination per institution" belong to conditions here, not in important_notes.

award_information — the sole location for:

  • award_duration — anticipated duration as stated;

  • amount_per_award — per-award amount including whether the figure is total-costs or direct-costs;

  • number_of_awards — anticipated count;

  • anticipated_award_date.

Even if the announcement mentions matching, indirect costs, or specific caps alongside the award amount, place only the primary amount here and push the detailed rules into budget_requirements.

required_components / optional_components — every proposal component the applicant must (or may) submit. Page limits, font/margin rules, template names, and naming conventions live in special_requirements on the individual component. Submission mechanics (portal, file format, upload method) do not live here — they go in submission_details.

budget_requirements — the sole location for detailed financial rules:

  • funding_limits — program-wide or per-year caps and category-specific limits.

  • cost_sharing.status — one of Required, Voluntary, Prohibited, Not Specified (matches the existing Vandalizer enum). cost_sharing.details holds the type, rate, basis, documentation, source restrictions. Quote the sponsor's language when the status is anything other than Not Specified.

  • fa_policy — indirect-cost rate, base (MTDC / TDC / S&W), excluded categories, documentation.

  • allowable_costs, unallowable_costs — only categories the sponsor explicitly enumerates. Empty array when the announcement simply defers to federal defaults.

  • personnel_effort — PI and key-personnel effort requirements, salary caps, student / postdoc support rules.

  • other_considerations — pre-award costs, program income, budget revisions, required budget forms.

Do NOT restate award amount or duration in this section.

submission_details — a single coherent paragraph covering submission method / portal, technical and file-format requirements, file-naming conventions, and collaboration rules (how PIs and co-PIs are linked, how partner institutions are handled).

special_requirements — unique RFA aspects that do not belong in any other section: conference travel obligations, data-sharing commitments beyond federal defaults, workshop participation, reporting cadence unique to this RFA, mentoring-plan requirements the sponsor calls out explicitly.

important_notes — critical warnings, common pitfalls, or essential context a reviewer would miss (e.g., "Coordinate with OSP at least 6 weeks before submission — requires institutional approval"). Keep this short; do not use it as an overflow bucket.
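A JSON Schema cannot by itself detect that a fact was repeated in two sections, so a caller may want a heuristic check for the most common violation of the placement contract: the award amount restated inside budget_requirements. The sketch below is an assumption-laden illustration (hypothetical function names, dollar-figure matching only) and can produce false positives when a budget cap happens to equal the award amount.

```python
import re

# Dollar figures such as $500,000 or $1.5; trailing punctuation is not captured.
DOLLAR = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")


def _strings(value):
    """Yield every string nested anywhere inside a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for child in value.values():
            yield from _strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from _strings(child)


def restated_award_amounts(output: dict) -> list[str]:
    """Return dollar figures from amount_per_award that also appear in budget_requirements."""
    award_figures = set(DOLLAR.findall(output["award_information"]["amount_per_award"]))
    budget_text = " ".join(_strings(output["budget_requirements"]))
    budget_figures = set(DOLLAR.findall(budget_text))
    return sorted(award_figures & budget_figures)
```

An empty result does not prove the contract holds; it only means no identical dollar figure was found in both places.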

Formatting rules inside values

  • Dates — preserve the sponsor's granularity. ISO YYYY-MM-DD is preferred for the date portion; keep the time component and timezone exactly as published. Do not invent precision.

  • Monetary amounts — preserve the sponsor's format, including $ and commas. Do not convert or round.

  • Percentages and rates — preserve as stated (e.g., "30% MTDC", "Negotiated F&A rate").

  • Quotation — for cost-sharing status other than Not Specified, for unallowable costs, and for salary caps, quote the sponsor's language verbatim in the details rather than paraphrasing.
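Because date_time values keep the sponsor's granularity, chronological ordering of dates_and_deadlines has to cope with mixed strings such as "2026-09-15 5:00 PM ET" and "Rolling". One possible approach, sketched here with a hypothetical helper, sorts on a leading ISO date when present and leaves everything else at the end in its original order, without rewriting any value.

```python
import re

ISO_DATE = re.compile(r"^(\d{4}-\d{2}-\d{2})")


def sort_dates(entries: list[dict]) -> list[dict]:
    """Order entries by leading ISO date; entries without one stay at the end."""
    def key(entry: dict):
        match = ISO_DATE.match(entry["date_time"])
        # (0, date) sorts before (1, ""); ISO dates compare correctly as plain strings.
        return (0, match.group(1)) if match else (1, "")
    return sorted(entries, key=key)  # sorted() is stable, so ties keep input order
```

Entries on the same day are not ordered by time of day here; parsing times and timezones would risk inventing precision the sponsor did not publish.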

Extraction strategy

Scan all of: main body, appendices, footnotes, tables, sidebar callouts, and referenced external guidance (Uniform Guidance 2 CFR 200, NSF PAPPG, NIH GPS) when cited inline.

Search for these terms as you populate each section:

  • Dates: date, deadline, due, submission, open, close, period, schedule, anticipated

  • Eligibility (institutions): eligible, institution, organization, applicant, university, college, domestic

  • Eligibility (individuals): eligible, individual, PI, investigator, personnel, citizen, resident, early-career, appointment

  • Award info: award, amount, duration, number, anticipated, funding ceiling, funding floor

  • Application components: required, optional, component, submit, form, attachment, page limit, format, template

  • Budget: budget, cost, allowable, unallowable, indirect, F&A, cost share, match, effort, salary, cap, limit, program income, pre-award

  • Submission: portal, Research.gov, Grants.gov, upload, naming convention, file format
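The term lists above are written for the model, but a runtime could also use them as a cheap pre-filter that suggests which sections a paragraph is likely to feed. The sketch below is hypothetical: the dictionary copies the terms verbatim, while the key `application_components` (standing in for required_components / optional_components), the function name, and the simple plural-tolerant matching are assumptions.

```python
import re

SECTION_TERMS = {
    "dates_and_deadlines": ["date", "deadline", "due", "submission", "open",
                            "close", "period", "schedule", "anticipated"],
    "eligible_institutions": ["eligible", "institution", "organization",
                              "applicant", "university", "college", "domestic"],
    "eligible_individuals": ["eligible", "individual", "PI", "investigator",
                             "personnel", "citizen", "resident", "early-career",
                             "appointment"],
    "award_information": ["award", "amount", "duration", "number", "anticipated",
                          "funding ceiling", "funding floor"],
    "application_components": ["required", "optional", "component", "submit",
                               "form", "attachment", "page limit", "format",
                               "template"],
    "budget_requirements": ["budget", "cost", "allowable", "unallowable",
                            "indirect", "F&A", "cost share", "match", "effort",
                            "salary", "cap", "limit", "program income",
                            "pre-award"],
    "submission_details": ["portal", "Research.gov", "Grants.gov", "upload",
                           "naming convention", "file format"],
}


def candidate_sections(paragraph: str) -> list[str]:
    """Return section names with at least one term hit, most hits first."""
    scores = {}
    for section, terms in SECTION_TERMS.items():
        hits = sum(
            1 for term in terms
            # Whole-word match, case-insensitive, tolerating a simple plural.
            if re.search(r"\b" + re.escape(term) + r"(?:s|es)?\b", paragraph, re.IGNORECASE)
        )
        if hits:
            scores[section] = hits
    return sorted(scores, key=lambda name: -scores[name])
```

A hit only suggests where to look; final placement is still decided by the placement contract.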

Quality bar

  • 100% accuracy for dates, monetary amounts, eligibility criteria, page limits, and citizenship restrictions.

  • De-duplication — every fact appears in exactly one section per the placement contract.

  • No invention — if a value is not in the document, use null for scalars, [] for empty arrays, or the single-entry "Not specified in this document" pattern for the three non-empty-array fields.

  • No paraphrase for bound values — cost-sharing, F&A, salary caps, and unallowable-cost language should quote the document.


Quality Standards

  • The output validates against schema.json (draft 2020-12).

  • Every minItems: 1 array is non-empty; every other array is [] when empty (never null).

  • cost_sharing.status is always one of the four enum values; details is populated whenever status is Required or Voluntary.

  • Dates preserve the sponsor's original granularity and timezone.

  • Monetary amounts preserve the sponsor's original formatting ($, commas, basis annotation).

  • No fact appears in two sections; consolidation is the caller's responsibility when merging parallel extraction outputs.
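Several of these standards can be checked mechanically before or alongside full schema validation. The following is a minimal sketch using only the standard library; `quality_violations` is a hypothetical helper and covers just the array and cost_sharing rules, not the date, amount, or de-duplication standards.

```python
COST_SHARING_STATUSES = {"Required", "Voluntary", "Prohibited", "Not Specified"}
MIN_ONE_ARRAYS = ["eligible_institutions", "eligible_individuals", "required_components"]
MAY_BE_EMPTY_ARRAYS = ["dates_and_deadlines", "optional_components",
                       "special_requirements", "important_notes"]


def quality_violations(output: dict) -> list[str]:
    """Return human-readable problems; an empty list means these checks passed."""
    problems = []
    for key in MIN_ONE_ARRAYS:
        value = output.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"{key} must be a non-empty array")
    for key in MAY_BE_EMPTY_ARRAYS:
        if not isinstance(output.get(key), list):
            problems.append(f"{key} must be an array ([] when empty, never null)")

    cost_sharing = (output.get("budget_requirements") or {}).get("cost_sharing") or {}
    status = cost_sharing.get("status")
    if status not in COST_SHARING_STATUSES:
        problems.append(f"cost_sharing.status {status!r} is not one of the four enum values")
    elif status in {"Required", "Voluntary"} and not cost_sharing.get("details"):
        problems.append("cost_sharing.details must be populated for Required or Voluntary")
    return problems
```

Returning a list of messages rather than raising on the first failure lets an eval harness record every violation for a single output.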

Output schema

Source: schema.json.

{

  "$schema": "https://json-schema.org/draft/2020-12/schema",

  "$id": "https://github.com/AI4RA/prompt-library/components/rfa-checklist-extraction-udm/schema.json",

  "title": "RFA Checklist Extraction \u2014 UDM Output",

  "description": "JSON contract for a Request-for-Applications (RFA/FOA/NOFO) distilled into the eight sections a pre-award office uses as a working checklist: Dates & Deadlines, Eligibility (institutions + individuals), Award Information, Application Components, Budget Requirements & Policies, Submission Details, Special Requirements, and Important Notes. Scalar fields populate a single RFA record; the structured arrays each drive a checklist section with strict de-duplication rules enforced by the prompt.",

  "version": "0.1.0",

  "type": "object",

  "additionalProperties": false,

  "required": [

    "rfa_title",

    "sponsor_name",

    "dates_and_deadlines",

    "eligible_institutions",

    "eligible_individuals",

    "award_information",

    "required_components",

    "optional_components",

    "budget_requirements",

    "submission_details",

    "special_requirements",

    "important_notes"

  ],

  "properties": {

    "rfa_id": {

      "type": [

        "string",

        "null"

      ],

      "description": "Stable identifier for the announcement. When the sponsor publishes an opportunity number, use it prefixed with the sponsor code (e.g., 'NSF-26-508'). Null when no canonical identifier is available."

    },

    "rfa_number": {

      "type": [

        "string",

        "null"

      ],

      "description": "Official announcement number as published by the sponsor (e.g., '26-508', 'PA-24-246', 'DE-FOA-0003117'). No prefix."

    },

    "rfa_title": {

      "type": "string",

      "minLength": 1,

      "description": "Full title of the funding opportunity, including any track or component designation."

    },

    "sponsor_name": {

      "type": "string",

      "minLength": 1,

      "description": "Full name of the lead sponsoring agency (e.g., 'National Science Foundation'). Resolved downstream to Sponsor_Organization_ID."

    },

    "program_code": {

      "type": [

        "string",

        "null"

      ],

      "description": "Sponsor's internal program or division code (e.g., 'NSF/TIP/ITE'). Null when not specified."

    },

    "announcement_url": {

      "type": [

        "string",

        "null"

      ],

      "format": "uri",

      "description": "Canonical URL of the announcement on the sponsor's site."

    },

    "opportunity_number": {

      "type": [

        "string",

        "null"

      ],

      "description": "Grants.gov or other portal opportunity ID, when distinct from rfa_number."

    },

    "cfda_number": {

      "type": [

        "string",

        "null"

      ],

      "description": "CFDA / Assistance Listing number(s). Comma-separated when multiple (e.g., '47.070, 47.076')."

    },

    "dates_and_deadlines": {

      "type": "array",

      "description": "Populates the DATES & DEADLINES section. Include every unique date or time-bound event in the announcement, chronologically. Source for the Dates_Table field in the process-mapping workflow.",

      "items": {

        "type": "object",

        "additionalProperties": false,

        "required": [

          "item",

          "date_time"

        ],

        "properties": {

          "item": {

            "type": "string",

            "minLength": 1,

            "description": "The event label exactly as stated in the document (e.g., 'Letter of Intent Due', 'Full Proposal Deadline', 'Award Notification')."

          },

          "date_time": {

            "type": "string",

            "minLength": 1,

            "description": "Full date plus any time and timezone as published (e.g., '2026-09-15 5:00 PM ET', '2026-10-01', 'Rolling'). Preserve the original granularity; do not invent precision."

          },

          "notes": {

            "type": [

              "string",

              "null"

            ],

            "description": "Any conditions, round identifiers, or clarifications attached to this date (e.g., 'Track 1 only', 'Optional but encouraged'). Null when no notes."

          }

        }

      }

    },

    "eligible_institutions": {

      "type": "array",

      "description": "Populates the Eligibility > Eligible Institutions subsection. One entry per distinct institution type or category stated by the sponsor. Always non-empty \u2014 use a single entry with a 'Not specified in this document' type rather than emitting [].",

      "minItems": 1,

      "items": {

        "type": "object",

        "additionalProperties": false,

        "required": [

          "type"

        ],

        "properties": {

          "type": {

            "type": "string",

            "minLength": 1,

            "description": "Institution category label as stated (e.g., 'Institutions of Higher Education', 'Nonprofit Organizations Other Than Institutions of Higher Education', 'Small Businesses')."

          },

          "subcategory": {

            "type": [

              "string",

              "null"

            ],

            "description": "More specific classification when given (e.g., 'R1 research universities', 'HBCUs', 'EPSCoR-eligible'). Null when the type has no subcategorization."

          },

          "examples": {

            "type": [

              "string",

              "null"

            ],

            "description": "Sponsor-supplied examples that illustrate the category, as a short comma-separated list. Null when the announcement gives none."

          },

          "compliance_requirements": {

            "type": [

              "string",

              "null"

            ],

            "description": "Registrations, certifications, or institutional prerequisites (e.g., 'SAM.gov registration required', 'Must be domestic'). Null when no institutional compliance terms are attached to this type."

          }

        }

      }

    },

    "eligible_individuals": {

      "type": "array",

      "description": "Populates the Eligibility > Eligible Individuals subsection. One entry per distinct PI / key-personnel category. Include citizenship, career-stage, and prior-award restrictions here \u2014 not in Special Requirements.",

      "minItems": 1,

      "items": {

        "type": "object",

        "additionalProperties": false,

        "required": [

          "type"

        ],

        "properties": {

          "type": {

            "type": "string",

            "minLength": 1,

            "description": "Individual eligibility category (e.g., 'Principal Investigator', 'Early-Career Investigator', 'Co-Investigator')."

          },

          "criteria": {

            "type": [

              "string",

              "null"

            ],

            "description": "Qualifying criteria the individual must meet (degree, career stage, appointment type). Null when the type has no stated criteria."

          },

          "compliance_requirements": {

            "type": [

              "string",

              "null"

            ],

            "description": "Compliance items attached to the individual (e.g., 'ORCID required', 'NSF-approved mentoring plan'). Null when none."

          },

          "conditions": {

            "type": [

              "string",

              "null"

            ],

            "description": "Restrictions, limits, or preferences (e.g., 'One nomination per institution', 'Not eligible if prior RFA-awarded'). Null when none."

          }

        }

      }

    },

    "award_information": {

      "type": "object",

      "description": "Populates the AWARD INFORMATION section. This is the SOLE location for primary award parameters \u2014 the consolidator must not repeat these values in Budget Requirements. Source for Award_Duration, Amount_Per_Award, Number_Of_Awards, Anticipated_Award_Date in the process-mapping workflow.",

      "additionalProperties": false,

      "required": [

        "award_duration",

        "amount_per_award"

      ],

      "properties": {

        "award_duration": {

          "type": "string",

          "minLength": 1,

          "description": "Anticipated award duration as stated (e.g., '3 years', 'Up to 5 years with option for 2-year extension'). Use 'Not specified in the document' only when truly absent."

        },

        "amount_per_award": {

          "type": "string",

          "minLength": 1,

          "description": "Amount per award including whether matching, indirect costs, or specific restrictions are bundled into the figure (e.g., '$500,000 total costs including indirect', '$250,000 direct costs per year')."

        },

        "number_of_awards": {

          "type": [

            "string",

            "null"

          ],

          "description": "Anticipated number of awards as stated (e.g., '10-15', 'Subject to availability of funds'). Null ONLY when not specified."

        },

        "anticipated_award_date": {

          "type": [

            "string",

            "null"

          ],

          "description": "Anticipated award date as stated; ISO date when the sponsor gives one, otherwise free text (e.g., 'Spring 2027'). Null when not specified."

        }

      }

    },

    "required_components": {

      "type": "array",

      "description": "Populates the REQUIRED subsection of APPLICATION COMPONENTS. One entry per distinct required proposal component. Always non-empty.",

      "minItems": 1,

      "items": {

        "type": "object",

        "additionalProperties": false,

        "required": [

          "name",

          "description"

        ],

        "properties": {

          "name": {

            "type": "string",

            "minLength": 1,

            "description": "Component name as stated (e.g., 'Project Description', 'Biographical Sketch', 'Data Management Plan')."

          },

          "description": {

            "type": "string",

            "minLength": 1,

            "description": "What the component must contain, in the sponsor's own terms."

          },

          "special_requirements": {

            "type": [

              "string",

              "null"

            ],

            "description": "Page limits, font/margin rules, template requirements, or naming conventions specific to this component. Null when none."

          }

        }

      }

    },

    "optional_components": {

      "type": "array",

      "description": "Populates the OPTIONAL subsection of APPLICATION COMPONENTS. Same item shape as required_components. Emit [] when none \u2014 the consolidator renders '[] -> single row stating Not specified in this document.' in the checklist.",

      "items": {

        "type": "object",

        "additionalProperties": false,

        "required": [

          "name",

          "description"

        ],

        "properties": {

          "name": {

            "type": "string",

            "minLength": 1

          },

          "description": {

            "type": "string",

            "minLength": 1

          },

          "special_requirements": {

            "type": [

              "string",

              "null"

            ]

          }

        }

      }

    },

    "budget_requirements": {

      "type": "object",

      "description": "Populates the BUDGET REQUIREMENTS & POLICIES section. This is the SOLE location for detailed financial rules \u2014 the consolidator must not restate Award Amount or Duration here.",

      "additionalProperties": false,

      "required": [

        "funding_limits",

        "cost_sharing",

        "fa_policy",

        "allowable_costs",

        "unallowable_costs",

        "personnel_effort"

      ],

      "properties": {

        "funding_limits": {

          "type": [

            "string",

            "null"

          ],

          "description": "Total program funding, per-year caps, and category-specific limits not already covered by amount_per_award. Null when absent."

        },

        "cost_sharing": {

          "type": "object",

          "additionalProperties": false,

          "required": [

            "status"

          ],

          "description": "Cost-sharing / matching rules. UDM_Column: CostShare.",

          "properties": {

            "status": {

              "type": "string",

              "enum": [

                "Required",

                "Voluntary",

                "Prohibited",

                "Not Specified"

              ],

              "description": "Matches the Cost_Sharing enum from the process-mapping workflow. Quote the sponsor's language in details when anything other than 'Not Specified'."

            },

            "details": {

              "type": [

                "string",

                "null"

              ],

              "description": "Type (cash / in-kind / third-party), rate, calculation basis, documentation, and source restrictions. Null when status is 'Prohibited' or 'Not Specified'."

            }

          }

        },

        "fa_policy": {

          "type": [

            "string",

            "null"

          ],

          "description": "F&A/indirect cost policy: rate, base (MTDC / TDC / S&W), excluded categories, and documentation requirements. UDM_Column: IndirectRate. Null when absent."

        },

        "allowable_costs": {

          "type": "array",

          "description": "Categories explicitly called out as allowable with conditions. Empty array when none are explicitly enumerated (rely on default federal rules).",

          "items": {

            "type": "string",

            "minLength": 1

          }

        },

        "unallowable_costs": {

          "type": "array",

          "description": "Categories explicitly called out as unallowable or prohibited. Empty array when none are explicitly enumerated.",

          "items": {

            "type": "string",

            "minLength": 1

          }

        },

        "personnel_effort": {

          "type": [

            "string",

            "null"

          ],

          "description": "PI / key-personnel effort floor or ceiling, salary caps, student / postdoc support rules, and consultant limits. UDM_Column: Effort. Null when absent."

        },

        "other_considerations": {

          "type": [

            "string",

            "null"

          ],

          "description": "Pre-award costs, program income, budget revisions, or sponsor-specific budget forms. Null when no such provisions appear."

        }

      }

    },

    "submission_details": {

      "type": [

        "string",

        "null"

      ],

      "description": "Populates the SUBMISSION DETAILS section. Submission method and portal, technical/file-format requirements, file-naming conventions, and collaboration rules in one coherent paragraph. Null only when the announcement contains no submission guidance at all."

    },

    "special_requirements": {

      "type": "array",

      "description": "Populates the SPECIAL REQUIREMENTS section. Unique RFA aspects that do not belong in any other section (conference travel obligations, data-sharing beyond federal defaults, workshop participation). Empty array when none.",

      "items": {

        "type": "string",

        "minLength": 1

      }

    },

    "important_notes": {

      "type": "array",

      "description": "Populates the IMPORTANT NOTES section. Critical warnings, common pitfalls, or essential context a reviewer would otherwise miss (e.g., 'One nomination per institution \u2014 coordinate with OSP before submission'). Empty array when none.",

      "items": {

        "type": "string",

        "minLength": 1

      }

    }

  }

}

Changelog

Source: CHANGELOG.md.

All notable changes to this component. Versions follow semver: MAJOR for output-contract breaks, MINOR for backward-compatible additions, PATCH for wording or clarity.

[0.1.0] — 2026-04-24

  • Initial experimental release.
  • Schema derived from the rfa-checklist-extraction v2 Vandalizer workflow in ui-insight/ProcessMapping (six parallel extraction tasks + consolidation, 17 source fields).
  • Scalar metadata (rfa_id, rfa_number, rfa_title, sponsor_name, program_code, announcement_url, opportunity_number, cfda_number) aligned with rfp-extraction-udm 1.0.0 conventions.
  • Eight structured sections each given a distinct shape to enforce the de-duplication / placement contract at schema level.
  • cost_sharing.status enum matches the Cost_Sharing Enum_Values from the source workflow (Required, Voluntary, Prohibited, Not Specified).
  • UDM column bindings preserved: cost_sharing → CostShare, fa_policy → IndirectRate, personnel_effort → Effort.
  • No eval cases yet — status experimental until at least one golden extraction is added under evals/cases/.