rfa-checklist-extraction-udm¶
rfa-checklist-extraction-udm 0.1.0 none
Tags: rfa foa nofo grants pre-award checklist udm structured-extraction json
Audience: sponsored-programs-staff, pre-award-teams, ingest-pipelines
Manifestations in repo: prompt.md
Extracts a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured JSON object organized around the eight-section pre-award checklist that a sponsored-programs analyst actually uses when triaging a new opportunity. The shape enforces placement rules — award amount lives in award_information only, detailed financial rules in budget_requirements only — so downstream consolidation does not have to re-adjudicate where a fact belongs.
Output contract: schema.json
Contract scope: repo-local, UDM-aligned
Inputs¶
Full text of a funding announcement — pasted text, attached PDF/DOCX/HTML, or URL. Optional knowledge-base context from Uniform Guidance (2 CFR 200), NSF PAPPG, or NIH Grants Policy Statement is injected by the runtime workflow but not required by the prompt itself.
Outputs¶
A single JSON object with:
- Scalar opportunity metadata: rfa_id, rfa_number, rfa_title, sponsor_name, program_code, announcement_url, opportunity_number, cfda_number
- Eight structured sections matching the consolidated checklist:
  - dates_and_deadlines: array of {item, date_time, notes}
  - eligible_institutions: array of {type, subcategory, examples, compliance_requirements}
  - eligible_individuals: array of {type, criteria, compliance_requirements, conditions}
  - award_information: {award_duration, amount_per_award, number_of_awards, anticipated_award_date}
  - required_components / optional_components: arrays of {name, description, special_requirements}
  - budget_requirements: {funding_limits, cost_sharing: {status, details}, fa_policy, allowable_costs, unallowable_costs, personnel_effort, other_considerations}
  - submission_details: string
  - special_requirements: array of strings
  - important_notes: array of strings
See schema.json for the authoritative definition and prompt.md for the encoding rules (date formats, placement contract, quotation requirements, extraction strategy).
Contract scope¶
Repo-local, UDM-aligned. The scalar fields (rfa_id, rfa_number, rfa_title, sponsor_name, cfda_number, ...) follow the conventions set by rfp-extraction-udm 1.0.0 and resolve downstream to UDM entities (RFA, Sponsor_Organization). The structured sections do not duplicate any shared UDM schema — they are repo-local to this component and mirror the eight-section deliverable produced by the rfa-checklist-extraction Vandalizer workflow in the ui-insight/ProcessMapping process-mapping corpus. Selected leaf fields reference UDM columns: cost_sharing → CostShare, fa_policy → IndirectRate, personnel_effort → Effort.
Relationship to rfp-extraction-udm¶
| Concern | rfp-extraction-udm | rfa-checklist-extraction-udm |
|---|---|---|
| Audience | Ingest pipelines (flat arrays of typed requirements) | Pre-award offices (eight-section checklist) |
| Shape | 18 scalars + nine requirement arrays with a common requirement shape | 8 scalars + eight typed section objects/arrays, each with a distinct field shape |
| De-duplication | Not enforced by shape — every requirement is independent | Enforced by shape — award amount lives only in award_information, financial rules only in budget_requirements |
| Checklist reconstruction | Requires caller to re-group by category | Direct 1:1 map to the eight-section markdown deliverable |
| Typical downstream | UDM ingest service | Consolidation Prompt node rendering a markdown checklist |
The two components are versioned independently. A single announcement can be extracted through both contracts for different consumers.
Triad integration¶
- Evaluation datasets: none yet; planned: add an RFA case to real.nsf_awards or a new real.rfa_checklists dataset with expected.json produced from a sponsored-programs-reviewed extraction.
- Harness notes: canonical manifestation is prompt.md. Validation surface is schema.json. Vendored into runners via harness prompts vendor --source-ref=<sha>; pinned in prompts.lock.json.
- Shared UDM relationship: aligned, not owning. rfa_id, sponsor_name, and the three UDM-column leaf fields match the naming conventions in AI4RA-UDM, but this component does not redefine UDM tables.
Runtime topology — the Vandalizer workflow¶
The canonical runtime for this component is the rfa-checklist-extraction workflow shipped at the top level of this repo. The single source of truth is workflows/rfa-checklist-extraction/manifest.yaml; the companion .vandalizer.json envelope is generated by scripts/build_vandalizer_workflows.py and committed alongside. The runtime mirrors the source ui-insight/ProcessMapping/workflows/rfa-checklist-extraction/ workflow:
- Step 1 (parallel Extraction): seven Extraction tasks. Six mirror the source workflow one-for-one (dates, eligible institutions, eligible individuals, award info, application components, budget). A seventh extract-opportunity-metadata task captures the eight UDM-aligned scalar opportunity fields the schema adds on top of the source workflow. Each task carries an embedded SearchSet whose item titles match this component's schema field names; cost_sharing_status is exposed as the four-value enum.
- Step 2 (consolidation Prompt): assembles the seven JSON fragments into the schema-conformant object, including the nested cost_sharing: {status, details} rebuild and important_notes synthesis from cross-section signals.
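The deterministic part of Step 2 can be sketched in Python. This is only an illustration: the real consolidation step is a Prompt node, not code. The function names, the assumption that each task returns a dict fragment, and the flat cost_sharing_details key are hypothetical; only cost_sharing_status and the nested {status, details} target are named above.

```python
from typing import Any

COST_SHARING_STATUSES = {"Required", "Voluntary", "Prohibited", "Not Specified"}


def rebuild_cost_sharing(budget_fragment: dict[str, Any]) -> dict[str, Any]:
    """Fold flat cost-sharing keys into the nested {status, details} object."""
    budget = dict(budget_fragment)  # copy so the caller's fragment is untouched
    status = budget.pop("cost_sharing_status", "Not Specified")
    details = budget.pop("cost_sharing_details", None)  # hypothetical flat key
    if status not in COST_SHARING_STATUSES:
        raise ValueError(f"unexpected cost_sharing status: {status!r}")
    budget["cost_sharing"] = {"status": status, "details": details}
    return budget


def consolidate(fragments: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge per-task fragments into one object, rejecting duplicate keys."""
    merged: dict[str, Any] = {}
    for fragment in fragments:
        overlap = merged.keys() & fragment.keys()
        if overlap:
            raise ValueError(f"keys emitted by more than one task: {sorted(overlap)}")
        merged.update(fragment)
    if "budget_requirements" in merged:
        merged["budget_requirements"] = rebuild_cost_sharing(merged["budget_requirements"])
    return merged


# Two illustrative fragments (made-up values, not taken from a real announcement).
fragments = [
    {"rfa_title": "Example Program", "sponsor_name": "National Science Foundation"},
    {"budget_requirements": {"cost_sharing_status": "Prohibited", "fa_policy": None}},
]
result = consolidate(fragments)
```

Rejecting overlapping keys during the merge is one way to surface a placement violation early, before the object reaches schema validation.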
Regenerate the workflow JSON whenever this component bumps MINOR or MAJOR (or whenever the workflow manifest changes); CI fails if the committed .vandalizer.json drifts from a fresh build.
Manifestations¶
prompt.md— canonical, LLM-agnostic prompt
Evals¶
See evals/ for reference inputs and known-good outputs. Initial case pending: NSF-published RFA with a multi-round structure, cost-sharing prohibition, and explicit allowable/unallowable categories to exercise the placement contract.
Provenance¶
Authored 2026-04-24 against the existing rfa-checklist-extraction process-mapping workflow (v2) in ui-insight/ProcessMapping, which was built from walkthrough transcripts of University of Idaho sponsored-programs staff reviewing NSF and NIH announcements. Created to make that workflow a harness-evaluatable, versioned artifact rather than a runtime-embedded configuration.
Contract scope¶
- Output format: json_object
- Contract scope: shared_udm_semantics_repo_local_schema
- Validation surfaces: json_schema
- Schema entrypoints: #
- Notes: Repo-local eight-section pre-award checklist contract for funding announcements. Scalar opportunity metadata aligns with UDM opportunity semantics; the eight structured sections enforce the pre-award de-duplication placement rules (award amount only in award_information, detailed financial rules only in budget_requirements) at schema level.
- Machine-readable catalog entry: component_catalog.json
Triad integration¶
- UDM alignment: shared_udm_semantics_repo_local_schema. Scalar metadata (rfa_id, rfa_number, sponsor_name, cfda_number) follows rfp-extraction-udm conventions and resolves to UDM RFA and Sponsor_Organization entities. Leaf fields carry UDM column bindings: cost_sharing to CostShare, fa_policy to IndirectRate, personnel_effort to Effort. The eight-section shape itself is repo-local.
- Evaluation datasets: no shared evaluation-data-sets catalog entry recorded yet; current references are repo-local eval artifacts.
- Harness notes: Validate JSON outputs against schema.json. Canonical single-call invocation surface is prompt.md. The companion top-level workflows/rfa-checklist-extraction Vandalizer workflow at v0.2.0 implements the same contract as a seven-task parallel Extraction pipeline (six section extractions plus a metadata extraction) followed by a Consolidation Prompt; campaign authors should record both single-call and post-consolidation signals when both are available.
- Related component: rfp-extraction-udm (sibling_same_source_different_shape). Same document family (RFP / RFA / FOA / NOFO / BAA / DCL) with a different output shape: rfp-extraction-udm is a nine-array requirement-centric flat contract for ingest pipelines; rfa-checklist-extraction-udm is an eight-section pre-award checklist contract with placement rules enforced by the shape itself.
- Related component: rfp-extraction (sibling_human_vs_structured_outputs). rfp-extraction produces the markdown checklist form for the same source family; the consolidator step downstream of rfa-checklist-extraction-udm renders an equivalent markdown checklist from the structured JSON.
Prompt body¶
Source: prompt.md.
RFA Checklist Extraction — UDM JSON¶
Purpose: Extract a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured JSON object organized around the eight-section checklist that a pre-award office actually uses when deciding whether and how to submit.
Expected input: Full text of the funding announcement, optionally with Uniform Guidance, NSF PAPPG, or NIH Grants Policy as knowledge-base context.
Expected output: A single JSON object that validates against schema.json. No prose, no markdown outside the JSON.
Relationship to other components¶
rfp-extraction-udm v1.0.0 covers the same document family (RFP / RFA / FOA / NOFO / BAA / DCL) with a nine-array requirement-centric contract aimed at downstream ingest. This component is a different cut of the same source: an eight-section pre-award checklist contract that mirrors how a sponsored-programs analyst reads the announcement — dates, eligibility, award, components, budget, submission, special, notes — with strict de-duplication rules baked into the shape itself (award amount lives in award_information only; detailed financial rules live in budget_requirements only).
Runtime topology: this component is consumed by the rfa-checklist-extraction Vandalizer workflow, which implements the contract as seven parallel Extraction tasks (six section extractions plus a metadata extraction) followed by a single consolidation Prompt that enforces placement rules and renders the 8-section markdown deliverable.
Prompt¶
You are a research administration analyst extracting a federal funding announcement (RFA / FOA / NOFO / program solicitation) into a structured checklist for a pre-award office. Your output is a single JSON object conforming to schema.json.
Output contract¶
Emit exactly one JSON object. No preamble, no closing commentary, no markdown fences. If your runtime requires fenced output, wrap in a single ```json ... ``` block and emit nothing outside it.
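A caller that has to accept both the bare and the fenced form could parse responses along these lines. This is a minimal sketch using only the standard library; parse_model_output and FENCE are hypothetical names, and nothing here is part of the component itself.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick fence marker, built programmatically

# Matches a response that is exactly one json-tagged fenced block and nothing else.
_FENCED = re.compile(
    r"\A\s*" + FENCE + r"json\s*\n(.*?)\n\s*" + FENCE + r"\s*\Z",
    re.DOTALL,
)


def parse_model_output(raw: str) -> dict:
    """Return the single JSON object from a bare or singly fenced response."""
    match = _FENCED.match(raw)
    payload = match.group(1) if match else raw.strip()
    obj = json.loads(payload)  # raises ValueError when prose surrounds the JSON
    if not isinstance(obj, dict):
        raise ValueError("expected a single JSON object at the top level")
    return obj


# Illustrative fenced response (made-up title).
raw = FENCE + "json\n" + '{"rfa_title": "Example Program"}' + "\n" + FENCE
parsed = parse_model_output(raw)
```

Anything outside the single object or the single fenced block makes json.loads fail, which mirrors the "emit nothing outside it" rule.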
Every field listed in required of schema.json MUST appear. Arrays with no entries are emitted as [], never null, except eligible_institutions, eligible_individuals, and required_components which are minItems: 1 — if the announcement truly contains nothing for one of these, emit a single entry whose primary field is "Not specified in this document". Optional scalar fields are null when absent.
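The empty-array rules can be expressed as a small post-processing guard. The sketch below is an assumption-laden illustration rather than part of the contract: apply_empty_array_rules is a hypothetical helper, and treating type and name as each array's "primary field" is inferred from the required item keys in schema.json.

```python
from typing import Any

NOT_SPECIFIED = "Not specified in this document"

# Arrays that must never be empty, mapped to the assumed primary field of their items.
NON_EMPTY_ARRAYS = {
    "eligible_institutions": "type",
    "eligible_individuals": "type",
    "required_components": "name",
}
# Top-level arrays that are emitted as [] when they have no entries.
PLAIN_ARRAYS = (
    "dates_and_deadlines",
    "optional_components",
    "special_requirements",
    "important_notes",
)


def apply_empty_array_rules(obj: dict[str, Any]) -> dict[str, Any]:
    """Replace null arrays with [] and fill the three minItems: 1 arrays."""
    out = dict(obj)
    for key in PLAIN_ARRAYS:
        if out.get(key) is None:
            out[key] = []
    for key, primary in NON_EMPTY_ARRAYS.items():
        if not out.get(key):
            entry = {primary: NOT_SPECIFIED}
            if key == "required_components":
                # schema.json also requires a non-empty description per component
                entry["description"] = NOT_SPECIFIED
            out[key] = [entry]
    return out


fixed = apply_empty_array_rules({"important_notes": None, "eligible_institutions": []})
```

A guard like this is a safety net for the caller; the prompt is still expected to emit the correct shape on its own.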
Scalar rules¶
- rfa_id: "<SPONSOR_CODE>-<OPPORTUNITY_NUMBER>" when both are available (e.g., "NSF-26-508", "NIH-PA-24-246", "DOE-DE-FOA-0003117"). Null when no canonical identifier exists.
- rfa_number: the sponsor's announcement number without agency prefix (e.g., "26-508").
- rfa_title: full title including any track or component designation.
- sponsor_name: full name of the lead sponsoring agency (e.g., "National Science Foundation"). Do not emit an abbreviation; downstream resolves this to a Sponsor_Organization_ID. When multiple agencies participate, name only the lead here and record partners in special_requirements.
- program_code, announcement_url, opportunity_number, cfda_number: emit exactly as stated; multi-value CFDA lists go comma-separated.
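The two mechanical scalar rules (rfa_id composition and comma-separated CFDA lists) can be sketched as follows. The helper names build_rfa_id and join_cfda are hypothetical; the inputs reuse the examples given in the rules above and in schema.json.

```python
from typing import Optional


def build_rfa_id(sponsor_code: Optional[str], number: Optional[str]) -> Optional[str]:
    """Join sponsor code and announcement number; None when either part is missing."""
    if not sponsor_code or not number:
        return None
    return f"{sponsor_code.strip()}-{number.strip()}"


def join_cfda(numbers: list[str]) -> Optional[str]:
    """Comma-separate CFDA / Assistance Listing numbers; None when the list is empty."""
    cleaned = [n.strip() for n in numbers if n.strip()]
    return ", ".join(cleaned) if cleaned else None


nsf_id = build_rfa_id("NSF", "26-508")
doe_id = build_rfa_id("DOE", "DE-FOA-0003117")
cfda = join_cfda(["47.070", "47.076"])
```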
Section rules — the placement contract¶
This is the most important rule set. A piece of information MUST appear in exactly one section. Violations fail validation.
dates_and_deadlines — every unique date or time-bound event in the announcement, chronologically. Preserve the sponsor's original label, date/time string, and any conditions as notes. For multi-round solicitations include every round's LOI, pre-proposal, and full proposal deadlines.
eligible_institutions — one entry per distinct institution category. Put institutional prerequisites (SAM.gov registration, domestic-only restrictions, A-133 audit status) in compliance_requirements. Do not put PI-level rules here.
eligible_individuals — one entry per distinct PI / key-personnel category. Put citizenship, career-stage, degree, appointment type, and prior-award restrictions in criteria or conditions. Limits like "one nomination per institution" belong to conditions here, not in important_notes.
award_information — the sole location for:
- award_duration: anticipated duration as stated;
- amount_per_award: per-award amount including whether the figure is total-costs or direct-costs;
- number_of_awards: anticipated count;
- anticipated_award_date.
Even if the announcement mentions matching, indirect costs, or specific caps alongside the award amount, place only the primary amount here and push the detailed rules into budget_requirements.
required_components / optional_components — every proposal component the applicant must (or may) submit. Page limits, font/margin rules, template names, and naming conventions live in special_requirements on the individual component. Submission mechanics (portal, file format, upload method) do not live here — they go in submission_details.
budget_requirements — the sole location for detailed financial rules:
- funding_limits: program-wide or per-year caps and category-specific limits.
- cost_sharing.status: one of Required, Voluntary, Prohibited, Not Specified (matches the existing Vandalizer enum). cost_sharing.details holds the type, rate, basis, documentation, source restrictions. Quote the sponsor's language when the status is anything other than Not Specified.
- fa_policy: indirect-cost rate, base (MTDC / TDC / S&W), excluded categories, documentation.
- allowable_costs, unallowable_costs: only categories the sponsor explicitly enumerates. Empty array when the announcement simply defers to federal defaults.
- personnel_effort: PI and key-personnel effort requirements, salary caps, student / postdoc support rules.
- other_considerations: pre-award costs, program income, budget revisions, required budget forms.
Do NOT restate award amount or duration in this section.
submission_details — a single coherent paragraph covering submission method / portal, technical and file-format requirements, file-naming conventions, and collaboration rules (how PIs and co-PIs are linked, how partner institutions are handled).
special_requirements — unique RFA aspects that do not belong in any other section: conference travel obligations, data-sharing commitments beyond federal defaults, workshop participation, reporting cadence unique to this RFA, mentoring-plan requirements the sponsor calls out explicitly.
important_notes — critical warnings, common pitfalls, or essential context a reviewer would miss (e.g., "Coordinate with OSP at least 6 weeks before submission — requires institutional approval"). Keep this short; do not use it as an overflow bucket.
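A JSON Schema cannot by itself detect a fact repeated across sections, so a caller may want a heuristic check for the most common placement violation, the award amount restated under budget_requirements. The sketch below is a hypothetical helper, not part of the component; it compares dollar figures only and will miss paraphrased restatements.

```python
import re
from typing import Any, Iterator

# Dollar figures such as $500,000 or $1,250.50 (thousands separators optional).
_MONEY = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")


def _strings(value: Any) -> Iterator[str]:
    """Yield every string nested anywhere inside a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from _strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from _strings(item)


def restated_award_amounts(doc: dict[str, Any]) -> list[str]:
    """Return dollar figures from amount_per_award that reappear in budget_requirements."""
    award_text = (doc.get("award_information") or {}).get("amount_per_award") or ""
    award_figures = set(_MONEY.findall(award_text))
    budget_figures: set[str] = set()
    for text in _strings(doc.get("budget_requirements") or {}):
        budget_figures.update(_MONEY.findall(text))
    return sorted(award_figures & budget_figures)


# Illustrative document fragment with made-up figures.
doc = {
    "award_information": {"award_duration": "3 years", "amount_per_award": "$500,000 total costs"},
    "budget_requirements": {
        "funding_limits": "Up to $500,000, of which no more than $50,000 for equipment",
        "allowable_costs": ["Travel up to $5,000"],
    },
}
violations = restated_award_amounts(doc)
```

A non-empty result is a prompt for human review rather than proof of a violation, since a category cap can legitimately equal the award amount.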
Formatting rules inside values¶
- Dates: preserve the sponsor's granularity. ISO YYYY-MM-DD is preferred for the date portion; keep the time component and timezone exactly as published. Do not invent precision.
- Monetary amounts: preserve the sponsor's format, including $ and commas. Do not convert or round.
- Percentages and rates: preserve as stated (e.g., "30% MTDC", "Negotiated F&A rate").
- Quotation: for cost-sharing status other than Not Specified, for unallowable costs, and for salary caps, quote the sponsor's language verbatim in the details rather than paraphrasing.
Extraction strategy¶
Scan all of: main body, appendices, footnotes, tables, sidebar callouts, and referenced external guidance (Uniform Guidance 2 CFR 200, NSF PAPPG, NIH GPS) when cited inline.
Search for these terms as you populate each section:
- Dates: date, deadline, due, submission, open, close, period, schedule, anticipated
- Eligibility (institutions): eligible, institution, organization, applicant, university, college, domestic
- Eligibility (individuals): eligible, individual, PI, investigator, personnel, citizen, resident, early-career, appointment
- Award info: award, amount, duration, number, anticipated, funding ceiling, funding floor
- Application components: required, optional, component, submit, form, attachment, page limit, format, template
- Budget: budget, cost, allowable, unallowable, indirect, F&A, cost share, match, effort, salary, cap, limit, program income, pre-award
- Submission: portal, Research.gov, Grants.gov, upload, naming convention, file format
Quality bar¶
- 100% accuracy for dates, monetary amounts, eligibility criteria, page limits, and citizenship restrictions.
- De-duplication: every fact appears in exactly one section per the placement contract.
- No invention: if a value is not in the document, use null for scalars, [] for empty arrays, or the single-entry "Not specified in this document" pattern for the three non-empty-array fields.
- No paraphrase for bound values: cost-sharing, F&A, salary caps, and unallowable-cost language should quote the document.
Quality Standards¶
- The output validates against schema.json (draft 2020-12).
- Every minItems: 1 array is non-empty; every other array is [] when empty (never null).
- cost_sharing.status is always one of the four enum values; details is populated whenever status is Required or Voluntary.
- Dates preserve the sponsor's original granularity and timezone.
- Monetary amounts preserve the sponsor's original formatting ($, commas, basis annotation).
- No fact appears in two sections; consolidation is the caller's responsibility when merging parallel extraction outputs.
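Several of these standards are mechanical and can be spot-checked without a schema validator. The following standard-library sketch covers required keys, the array rules, and the cost_sharing rule; quality_problems is a hypothetical helper and does not replace full validation against schema.json.

```python
from typing import Any

REQUIRED_KEYS = (
    "rfa_title", "sponsor_name", "dates_and_deadlines", "eligible_institutions",
    "eligible_individuals", "award_information", "required_components",
    "optional_components", "budget_requirements", "submission_details",
    "special_requirements", "important_notes",
)
MIN_ONE = ("eligible_institutions", "eligible_individuals", "required_components")
PLAIN_ARRAYS = (
    "dates_and_deadlines", "optional_components", "special_requirements", "important_notes",
)
STATUSES = ("Required", "Voluntary", "Prohibited", "Not Specified")


def quality_problems(doc: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means these checks passed."""
    problems = [f"missing required field: {k}" for k in REQUIRED_KEYS if k not in doc]
    for key in MIN_ONE:
        if key in doc and not doc[key]:
            problems.append(f"{key} must contain at least one entry")
    for key in PLAIN_ARRAYS:
        if key in doc and doc[key] is None:
            problems.append(f"{key} must be [] rather than null")
    cost_sharing = (doc.get("budget_requirements") or {}).get("cost_sharing") or {}
    status = cost_sharing.get("status")
    if status not in STATUSES:
        problems.append(f"cost_sharing.status is not one of the four enum values: {status!r}")
    elif status in ("Required", "Voluntary") and not cost_sharing.get("details"):
        problems.append("cost_sharing.details must be populated for Required/Voluntary")
    return problems


# Illustrative object with made-up values that passes the checks above.
good = {
    "rfa_title": "Example Program",
    "sponsor_name": "National Science Foundation",
    "dates_and_deadlines": [],
    "eligible_institutions": [{"type": "Institutions of Higher Education"}],
    "eligible_individuals": [{"type": "Principal Investigator"}],
    "award_information": {"award_duration": "3 years", "amount_per_award": "$500,000 total costs"},
    "required_components": [{"name": "Project Description", "description": "Not specified in this document"}],
    "optional_components": [],
    "budget_requirements": {"cost_sharing": {"status": "Prohibited", "details": None}},
    "submission_details": None,
    "special_requirements": [],
    "important_notes": [],
}
bad = dict(good, important_notes=None, budget_requirements={"cost_sharing": {"status": "Required"}})
good_problems = quality_problems(good)
bad_problems = quality_problems(bad)
```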
Output schema¶
Source: schema.json.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://github.com/AI4RA/prompt-library/components/rfa-checklist-extraction-udm/schema.json",
"title": "RFA Checklist Extraction \u2014 UDM Output",
"description": "JSON contract for a Request-for-Applications (RFA/FOA/NOFO) distilled into the eight sections a pre-award office uses as a working checklist: Dates & Deadlines, Eligibility (institutions + individuals), Award Information, Application Components, Budget Requirements & Policies, Submission Details, Special Requirements, and Important Notes. Scalar fields populate a single RFA record; the structured arrays each drive a checklist section with strict de-duplication rules enforced by the prompt.",
"version": "0.1.0",
"type": "object",
"additionalProperties": false,
"required": [
"rfa_title",
"sponsor_name",
"dates_and_deadlines",
"eligible_institutions",
"eligible_individuals",
"award_information",
"required_components",
"optional_components",
"budget_requirements",
"submission_details",
"special_requirements",
"important_notes"
],
"properties": {
"rfa_id": {
"type": [
"string",
"null"
],
"description": "Stable identifier for the announcement. When the sponsor publishes an opportunity number, use it prefixed with the sponsor code (e.g., 'NSF-26-508'). Null when no canonical identifier is available."
},
"rfa_number": {
"type": [
"string",
"null"
],
"description": "Official announcement number as published by the sponsor (e.g., '26-508', 'PA-24-246', 'DE-FOA-0003117'). No prefix."
},
"rfa_title": {
"type": "string",
"minLength": 1,
"description": "Full title of the funding opportunity, including any track or component designation."
},
"sponsor_name": {
"type": "string",
"minLength": 1,
"description": "Full name of the lead sponsoring agency (e.g., 'National Science Foundation'). Resolved downstream to Sponsor_Organization_ID."
},
"program_code": {
"type": [
"string",
"null"
],
"description": "Sponsor's internal program or division code (e.g., 'NSF/TIP/ITE'). Null when not specified."
},
"announcement_url": {
"type": [
"string",
"null"
],
"format": "uri",
"description": "Canonical URL of the announcement on the sponsor's site."
},
"opportunity_number": {
"type": [
"string",
"null"
],
"description": "Grants.gov or other portal opportunity ID, when distinct from rfa_number."
},
"cfda_number": {
"type": [
"string",
"null"
],
"description": "CFDA / Assistance Listing number(s). Comma-separated when multiple (e.g., '47.070, 47.076')."
},
"dates_and_deadlines": {
"type": "array",
"description": "Populates the DATES & DEADLINES section. Include every unique date or time-bound event in the announcement, chronologically. Source for the Dates_Table field in the process-mapping workflow.",
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"item",
"date_time"
],
"properties": {
"item": {
"type": "string",
"minLength": 1,
"description": "The event label exactly as stated in the document (e.g., 'Letter of Intent Due', 'Full Proposal Deadline', 'Award Notification')."
},
"date_time": {
"type": "string",
"minLength": 1,
"description": "Full date plus any time and timezone as published (e.g., '2026-09-15 5:00 PM ET', '2026-10-01', 'Rolling'). Preserve the original granularity; do not invent precision."
},
"notes": {
"type": [
"string",
"null"
],
"description": "Any conditions, round identifiers, or clarifications attached to this date (e.g., 'Track 1 only', 'Optional but encouraged'). Null when no notes."
}
}
}
},
"eligible_institutions": {
"type": "array",
"description": "Populates the Eligibility > Eligible Institutions subsection. One entry per distinct institution type or category stated by the sponsor. Always non-empty \u2014 use a single entry with a 'Not specified in this document' type rather than emitting [].",
"minItems": 1,
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"type"
],
"properties": {
"type": {
"type": "string",
"minLength": 1,
"description": "Institution category label as stated (e.g., 'Institutions of Higher Education', 'Nonprofit Organizations Other Than Institutions of Higher Education', 'Small Businesses')."
},
"subcategory": {
"type": [
"string",
"null"
],
"description": "More specific classification when given (e.g., 'R1 research universities', 'HBCUs', 'EPSCoR-eligible'). Null when the type has no subcategorization."
},
"examples": {
"type": [
"string",
"null"
],
"description": "Sponsor-supplied examples that illustrate the category, as a short comma-separated list. Null when the announcement gives none."
},
"compliance_requirements": {
"type": [
"string",
"null"
],
"description": "Registrations, certifications, or institutional prerequisites (e.g., 'SAM.gov registration required', 'Must be domestic'). Null when no institutional compliance terms are attached to this type."
}
}
}
},
"eligible_individuals": {
"type": "array",
"description": "Populates the Eligibility > Eligible Individuals subsection. One entry per distinct PI / key-personnel category. Include citizenship, career-stage, and prior-award restrictions here \u2014 not in Special Requirements.",
"minItems": 1,
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"type"
],
"properties": {
"type": {
"type": "string",
"minLength": 1,
"description": "Individual eligibility category (e.g., 'Principal Investigator', 'Early-Career Investigator', 'Co-Investigator')."
},
"criteria": {
"type": [
"string",
"null"
],
"description": "Qualifying criteria the individual must meet (degree, career stage, appointment type). Null when the type has no stated criteria."
},
"compliance_requirements": {
"type": [
"string",
"null"
],
"description": "Compliance items attached to the individual (e.g., 'ORCID required', 'NSF-approved mentoring plan'). Null when none."
},
"conditions": {
"type": [
"string",
"null"
],
"description": "Restrictions, limits, or preferences (e.g., 'One nomination per institution', 'Not eligible if prior RFA-awarded'). Null when none."
}
}
}
},
"award_information": {
"type": "object",
"description": "Populates the AWARD INFORMATION section. This is the SOLE location for primary award parameters \u2014 the consolidator must not repeat these values in Budget Requirements. Source for Award_Duration, Amount_Per_Award, Number_Of_Awards, Anticipated_Award_Date in the process-mapping workflow.",
"additionalProperties": false,
"required": [
"award_duration",
"amount_per_award"
],
"properties": {
"award_duration": {
"type": "string",
"minLength": 1,
"description": "Anticipated award duration as stated (e.g., '3 years', 'Up to 5 years with option for 2-year extension'). Use 'Not specified in the document' only when truly absent."
},
"amount_per_award": {
"type": "string",
"minLength": 1,
"description": "Amount per award including whether matching, indirect costs, or specific restrictions are bundled into the figure (e.g., '$500,000 total costs including indirect', '$250,000 direct costs per year')."
},
"number_of_awards": {
"type": [
"string",
"null"
],
"description": "Anticipated number of awards as stated (e.g., '10-15', 'Subject to availability of funds'). Null ONLY when not specified."
},
"anticipated_award_date": {
"type": [
"string",
"null"
],
"description": "Anticipated award date as stated; ISO date when the sponsor gives one, otherwise free text (e.g., 'Spring 2027'). Null when not specified."
}
}
},
"required_components": {
"type": "array",
"description": "Populates the REQUIRED subsection of APPLICATION COMPONENTS. One entry per distinct required proposal component. Always non-empty.",
"minItems": 1,
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"name",
"description"
],
"properties": {
"name": {
"type": "string",
"minLength": 1,
"description": "Component name as stated (e.g., 'Project Description', 'Biographical Sketch', 'Data Management Plan')."
},
"description": {
"type": "string",
"minLength": 1,
"description": "What the component must contain, in the sponsor's own terms."
},
"special_requirements": {
"type": [
"string",
"null"
],
"description": "Page limits, font/margin rules, template requirements, or naming conventions specific to this component. Null when none."
}
}
}
},
"optional_components": {
"type": "array",
"description": "Populates the OPTIONAL subsection of APPLICATION COMPONENTS. Same item shape as required_components. Emit [] when none \u2014 the consolidator renders '[] -> single row stating Not specified in this document.' in the checklist.",
"items": {
"type": "object",
"additionalProperties": false,
"required": [
"name",
"description"
],
"properties": {
"name": {
"type": "string",
"minLength": 1
},
"description": {
"type": "string",
"minLength": 1
},
"special_requirements": {
"type": [
"string",
"null"
]
}
}
}
},
"budget_requirements": {
"type": "object",
"description": "Populates the BUDGET REQUIREMENTS & POLICIES section. This is the SOLE location for detailed financial rules \u2014 the consolidator must not restate Award Amount or Duration here.",
"additionalProperties": false,
"required": [
"funding_limits",
"cost_sharing",
"fa_policy",
"allowable_costs",
"unallowable_costs",
"personnel_effort"
],
"properties": {
"funding_limits": {
"type": [
"string",
"null"
],
"description": "Total program funding, per-year caps, and category-specific limits not already covered by amount_per_award. Null when absent."
},
"cost_sharing": {
"type": "object",
"additionalProperties": false,
"required": [
"status"
],
"description": "Cost-sharing / matching rules. UDM_Column: CostShare.",
"properties": {
"status": {
"type": "string",
"enum": [
"Required",
"Voluntary",
"Prohibited",
"Not Specified"
],
"description": "Matches the Cost_Sharing enum from the process-mapping workflow. Quote the sponsor's language in details when anything other than 'Not Specified'."
},
"details": {
"type": [
"string",
"null"
],
"description": "Type (cash / in-kind / third-party), rate, calculation basis, documentation, and source restrictions. Null when status is 'Prohibited' or 'Not Specified'."
}
}
},
"fa_policy": {
"type": [
"string",
"null"
],
"description": "F&A/indirect cost policy: rate, base (MTDC / TDC / S&W), excluded categories, and documentation requirements. UDM_Column: IndirectRate. Null when absent."
},
"allowable_costs": {
"type": "array",
"description": "Categories explicitly called out as allowable with conditions. Empty array when none are explicitly enumerated (rely on default federal rules).",
"items": {
"type": "string",
"minLength": 1
}
},
"unallowable_costs": {
"type": "array",
"description": "Categories explicitly called out as unallowable or prohibited. Empty array when none are explicitly enumerated.",
"items": {
"type": "string",
"minLength": 1
}
},
"personnel_effort": {
"type": [
"string",
"null"
],
"description": "PI / key-personnel effort floor or ceiling, salary caps, student / postdoc support rules, and consultant limits. UDM_Column: Effort. Null when absent."
},
"other_considerations": {
"type": [
"string",
"null"
],
"description": "Pre-award costs, program income, budget revisions, or sponsor-specific budget forms. Null when no such provisions appear."
}
}
},
"submission_details": {
"type": [
"string",
"null"
],
"description": "Populates the SUBMISSION DETAILS section. Submission method and portal, technical/file-format requirements, file-naming conventions, and collaboration rules in one coherent paragraph. Null only when the announcement contains no submission guidance at all."
},
"special_requirements": {
"type": "array",
"description": "Populates the SPECIAL REQUIREMENTS section. Unique RFA aspects that do not belong in any other section (conference travel obligations, data-sharing beyond federal defaults, workshop participation). Empty array when none.",
"items": {
"type": "string",
"minLength": 1
}
},
"important_notes": {
"type": "array",
"description": "Populates the IMPORTANT NOTES section. Critical warnings, common pitfalls, or essential context a reviewer would otherwise miss (e.g., 'One nomination per institution \u2014 coordinate with OSP before submission'). Empty array when none.",
"items": {
"type": "string",
"minLength": 1
}
}
}
}
Changelog¶
Source: CHANGELOG.md.
All notable changes to this component. Versions follow semver: MAJOR for output-contract breaks, MINOR for backward-compatible additions, PATCH for wording or clarity.
[0.1.0] — 2026-04-24¶
- Initial experimental release.
- Schema derived from the rfa-checklist-extraction v2 Vandalizer workflow in ui-insight/ProcessMapping (six parallel extraction tasks + consolidation, 17 source fields).
- Scalar metadata (rfa_id, rfa_number, rfa_title, sponsor_name, program_code, announcement_url, opportunity_number, cfda_number) aligned with rfp-extraction-udm 1.0.0 conventions.
- Eight structured sections each given a distinct shape to enforce the de-duplication / placement contract at schema level.
- cost_sharing.status enum matches the Cost_Sharing Enum_Values from the source workflow (Required, Voluntary, Prohibited, Not Specified).
- UDM column bindings preserved: cost_sharing → CostShare, fa_policy → IndirectRate, personnel_effort → Effort.
- No eval cases yet; status experimental until at least one golden extraction is added under evals/cases/.