
The Interrogative Turn: A Cross-Disciplinary Survey of Prompt Decomposition's Evolution from Instruction to Inquiry

IAIP Research


Synthesis Date: April 6, 2026
Document Type: Cross-disciplinary literature survey — Master synthesis
Source Surveys: APE Methods Evolution · Computational Linguistics · Philosophy of AI · PDE State of the Art · Researcher Mapping


Abstract

Prompt decomposition engines (PDEs) — systems that break complex human intentions into structured sub-tasks for large language models — are undergoing a fundamental transformation. This cross-disciplinary survey synthesises findings from automated prompt engineering (APE), computational linguistics, philosophy of AI, PDE systems architecture, and academic researcher mapping to trace an evolution we term the interrogative turn: the shift from rigid, imperative instruction templates toward dynamic, conversational inquiry. Drawing on 120+ sources across five disciplinary perspectives, we demonstrate that this shift is simultaneously a technical progression (from static templates through automated optimisation to context engineering), a linguistic reconfiguration (from directive illocutionary force to interrogative question semantics), and a philosophical reorientation (from extractive instrumentalism to relational epistemology). We identify five major cross-disciplinary convergences — including the structural isomorphism between question semantics and tree-of-thought reasoning, the alignment between Gricean cooperative principles and evolutionary prompt optimisation, and the parallel between Indigenous relational epistemology and the emerging context-engineering paradigm. We surface three substantive tensions between disciplines and propose five novel research questions visible only at the intersection. The survey concludes that the interrogative turn is not merely a technical optimisation strategy but a reconstitution of the human-AI epistemic relationship, with implications that extend from system architecture to ethical design to the politics of knowledge production.


1. Introduction

The Research Question

How should we understand the evolution of prompt decomposition engines from structured instruction-processing systems to dynamic conversational inquiry systems — and what do we gain by holding technical, linguistic, and philosophical perspectives simultaneously?

This question sits at a consequential intersection. From the technical side, prompt engineering has undergone three paradigm shifts in six years: from manual crafting to automated generation, from static optimisation to dynamic context engineering, and from monolithic prompts to decomposed multi-agent systems. From the linguistic side, this evolution constitutes a change in the fundamental communicative act — from directive speech acts to interrogative ones — with measurable consequences for discourse structure, semantic type, and pragmatic presupposition. From the philosophical side, this shift reconfigures the human-AI relationship along epistemological, ontological, ethical, and phenomenological dimensions, raising questions about agency, understanding, and the nature of knowledge itself.

No single disciplinary lens captures what is happening. Technical surveys document what systems do without explaining why certain prompt structures work. Linguistic analyses identify the communicative mechanisms but remain agnostic about their philosophical implications. Philosophical treatments illuminate the deeper stakes but often lack grounding in the technical reality of how systems actually function. This synthesis holds all three perspectives simultaneously, seeking the emergent insights that become visible only at the intersection.

Why This Intersection Matters

Three developments make this cross-disciplinary survey timely:

  1. The technical maturation of PDE systems. By 2026, task decomposition has become "the central cognitive primitive of LLM-based agent systems" (Huang et al., 2024; Luo et al., 2025). Every major agent framework implements some form of complex-task-to-subtask breakdown. The question is no longer whether to decompose but how — and this "how" is increasingly conversational.

  2. The emergence of computational pragmatics for AI. Krause & Vossen (2024) provide the first comprehensive mapping of Gricean maxims to NLP; Hu et al. (2025) benchmark LLM pragmatic competence; Zeldes et al. (2025) extend Rhetorical Structure Theory to graph-based discourse. These developments supply the linguistic vocabulary needed to analyse prompt systems rigorously.

  3. The philosophical awakening to prompting. González Arocha (2025) articulates a "critical phenomenology of prompting"; Djeffal (2025) proposes "reflexive prompt engineering" grounded in responsibility by design; Coeckelbergh (2025) publishes Communicative AI. Prompting is no longer beneath philosophical notice.

The central thesis we trace across all five disciplinary perspectives is this: prompt decomposition engines are evolving from rigid, structured instructions into dynamic, conversational inquiries — and this evolution constitutes not merely a technical improvement but a reconstitution of the epistemic relationship between humans and AI systems.


2. The Technical Evolution: From Templates to Conversations

2.1 The Chronological Arc

Synthesising the APE methods survey and PDE state-of-the-art survey reveals a coherent six-phase technical evolution:

Phase 1: Static Templates (Pre-2022). Prompts were hand-written, fixed strings. Few-shot examples were selected once and reused without adaptation (Brown et al., 2020). PDE systems did not exist as a distinct category; task decomposition was a manual design activity.

Phase 2: Structured Reasoning Chains (2022–2023). Chain-of-Thought prompting (Wei et al., 2022) introduced dynamic content generation within prompts — the model generates its own intermediate steps. However, the meta-structure ("think step by step") remained static. Least-to-Most (Zhou et al., 2023a) and Plan-and-Solve (Wang et al., 2023a) added decomposition logic but within pre-defined frameworks. Simultaneously, DecomP (Khot et al., 2023) established modular prompt decomposition as a formal paradigm, and HuggingGPT (Shen et al., 2023) demonstrated controller-dispatcher architectures with JSON-structured task plans.

Phase 3: Automated Optimisation of Static Prompts (2023–2024). The APE explosion: Zhou et al. (2023b) demonstrated LLMs generating human-level prompts; OPRO (Yang et al., 2024) established LLMs as meta-optimisers with trajectory tracking; DSPy (Khattab et al., 2024) reframed prompt engineering as compilation; PromptBreeder (Fernando et al., 2024) introduced self-referential evolutionary optimisation; TextGrad (Yuksekgonul et al., 2025) extended automatic differentiation to text. On the PDE side, ReAct (Yao et al., 2023a), Reflexion (Shinn et al., 2023), and LATS (Zhou et al., 2024) introduced interleaved decomposition-execution with self-reflective critique.

Key observation from the APE survey: This phase treats prompts as hyperparameters to be tuned, not as living artefacts that adapt during interaction. The optimisation process is dynamic but the resulting prompt is deployed as a static artefact.

Phase 4: Multi-Agent Decomposition Through Dialogue (2023–2024). CAMEL (Li et al., 2023) introduced inception prompting — agents prompting each other in role-play conversations — marking the first major system where decomposition is conversational rather than purely structural. ChatDev (Qian et al., 2024) added "communicative dehallucination" through dialogue chains. AutoGen (Wu et al., 2023) made decomposition literally a conversation — an event-driven, asynchronous message-passing architecture where task structure emerges from inter-agent dialogue.

Phase 5: Context Engineering (2024–2025). Anthropic's "context engineering" paradigm (2025) explicitly shifted the optimisation target from prompt text to the entire context window — memory, retrieved documents, tool outputs, conversation state. GEPA (Databricks/UC Berkeley, 2025) demonstrated that evolutionary prompt optimisation with LLM-driven reflection can make open-source models outperform proprietary frontier models at 90× lower cost. TextGrad's dialogic feedback loops and GEPA's reflection-on-execution-traces represent increasingly conversational optimisation mechanisms.

Phase 6: Inquiry-Based Decomposition (2025–2026, Emerging). ACT (Google, 2025) trained agents to ask clarifying questions during multi-turn task dialogue. FATA (Beijing IST, 2025) reframed decomposition as structured inquiry — agents proactively generate comprehensive clarification checklists before producing any answer. The Tri-Agent Evaluation Framework (KDD 2025) measured decomposition quality through dialogue quality. These systems represent the convergence point: decomposition is inquiry.

2.2 The Structured-to-Conversational Spectrum

The PDE survey maps this evolution along a spectrum:

STRUCTURED ◄──────────────────────────────────────────────► CONVERSATIONAL

JSON schemas     Fixed pipelines    Role-play dialogues    Clarifying questions
DecomP           HuggingGPT         CAMEL                  ACT (Google)
TaskMatrix.AI    MetaGPT            AutoGen                FATA
DSPy             LangChain          ChatDev                Tri-Agent Eval
ToolChain*       Semantic Kernel    AgentVerse             (emerging 2025-26)

The metaphorical shift is telling: from following a recipe (2022–23) to holding a team standup meeting (2023–24) to conducting a Socratic dialogue (2025–26). This is not merely a change in implementation but a change in the ontological status of the decomposition itself — from artefact to process, from plan to conversation.

2.3 The Critical Gap

Both the APE and PDE surveys converge on the same critical gap: no published framework yet optimises prompts as part of an ongoing, multi-turn conversational process. All current APE methods operate on static task definitions (APE survey). All current benchmarks evaluate end-to-end task completion, never decomposition quality per se (PDE survey). The conversational turn is happening in system design but remains untheorised and unmeasured in the academic literature.


3. The Linguistic Lens: What Language Theory Reveals

3.1 Prompts as Discourse Units

Rhetorical Structure Theory (Mann & Thompson, 1988) provides the first analytical framework. Complex prompts exhibit nucleus-satellite structure: the core instruction is the nucleus; context, constraints, and examples are satellites connected by coherence relations (Elaboration, Background, Condition). The extended eRST framework (Zeldes et al., 2025) — supporting graph-based, non-projective, concurrent discourse relations — is particularly relevant for multi-turn prompts where discourse relations cross turn boundaries.

Cross-disciplinary insight 1: When the APE survey describes DSPy's modular Signatures decomposing a pipeline into sub-modules, and the linguistics survey describes RST's hierarchical nucleus-satellite structure, they are describing isomorphic operations. DSPy's compilation is rhetorical restructuring — flattening a deep RST tree into a sequence of simpler nucleus-satellite pairs. This connection has not been made explicit in the literature.
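The parallel can be made concrete. The sketch below, with illustrative names not drawn from DSPy or any RST toolkit, models a prompt as a nucleus with typed satellites and flattens the deep tree into a sequence of simple units:

```python
from dataclasses import dataclass, field

@dataclass
class PromptUnit:
    """An RST-style discourse unit: a nucleus (core instruction)
    plus satellites attached by coherence relations."""
    nucleus: str
    satellites: list = field(default_factory=list)  # (relation, PromptUnit) pairs

def flatten(unit: PromptUnit, relation: str = "Nucleus") -> list:
    """Flatten a deep nucleus-satellite tree into a flat sequence of
    (relation, text) pairs: a toy version of the 'rhetorical
    restructuring' that modular decomposition performs on a prompt."""
    pairs = [(relation, unit.nucleus)]
    for rel, satellite in unit.satellites:
        pairs.extend(flatten(satellite, rel))
    return pairs

prompt = PromptUnit(
    "Summarise the quarterly report",
    satellites=[
        ("Background", PromptUnit("The report covers Q3 sales in EMEA")),
        ("Condition", PromptUnit("Keep the summary under 100 words")),
    ],
)
```

Running `flatten(prompt)` yields three simple relation/text pairs in document order, one candidate sub-module each.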

3.2 The Illocutionary Shift

Speech act theory (Austin, 1962; Searle, 1969) reveals the interrogative turn as a change in illocutionary force. Under Searle's taxonomy, the shift moves from directives (getting the addressee to do something; world-to-words direction of fit) to questions (eliciting information; words-to-world direction of fit). The preparatory conditions change: directives presuppose the hearer can perform the action; questions presuppose the hearer knows the answer and the questioner does not (Computational Linguistics survey).

Gordon (2024) complicates this by arguing that LLMs are "conversational zombies" — they produce utterances with perlocutionary effects while lacking the intentionality required for genuine illocutionary force. Gubelmann (2024) reinforces this from a Kantian-pragmatist perspective. This creates a foundational tension: the interrogative turn assumes a communicative partner, but the linguistic evidence suggests that the "partner" cannot genuinely participate in communication.

3.3 Question Semantics and Decomposition Structure

Formal question semantics (Hamblin, 1973; Groenendijk & Stokhof, 1984) treats questions as denoting sets of possible answers or partitions of logical space. Applied to the interrogative turn, this framework reveals a structural consequence: imperative decomposition yields a sequence of sub-commands (do A, then B, then C), while interrogative decomposition yields a tree of sub-questions whose answer sets compose hierarchically — a structure that directly mirrors partition semantics.

Cross-disciplinary insight 2: This linguistic prediction is precisely what the PDE survey documents technically. Tree of Thoughts (Yao et al., 2023b) — where each "thought" is a candidate answer to an implicit sub-question — operationalises what question semantics predicts: that interrogative decomposition naturally produces tree structures. The linguistic framework explains why ToT achieves 74% on Game of 24 where linear CoT achieves 4%: tree-structured inquiry provides a richer semantic framework for navigating answer spaces than linear command sequences.
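A minimal sketch makes the structural claim concrete. Assuming a Hamblin-style reading in which a question denotes its set of possible answers, an interrogatively decomposed task is a tree whose composed answer space is the product of its sub-questions' spaces; all names below are illustrative, not from any published system:

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass
class Question:
    """Hamblin-style question: a leaf carries its candidate answers;
    an internal node is refined by sub-questions."""
    text: str
    answers: set = field(default_factory=set)
    subs: list = field(default_factory=list)

def answer_space(q: Question) -> set:
    """Compose a question tree's answer space hierarchically: the
    Cartesian product of the sub-questions' answer spaces."""
    if not q.subs:
        return set(q.answers)
    return set(product(*(answer_space(s) for s in q.subs)))

# An imperative decomposition would be a flat list of sub-commands;
# the interrogative version is a tree whose nodes refine the space.
root = Question("How do we make 24 from 4, 9, 10, 13?", subs=[
    Question("Which pair do we combine first?", answers={"13 - 10", "4 + 9"}),
    Question("Which operation finishes the expression?", answers={"*", "+"}),
])
```

The composed space here has 2 × 2 = 4 candidate reasoning paths; tree search over such spaces is what ToT-style exploration navigates.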

3.4 Gricean Pragmatics and Cooperative Optimisation

Krause & Vossen (2024) map Grice's (1975) four maxims onto NLP:

  • Quantity: LLMs frequently over-generate or under-specify. Prompt engineers compensate by explicitly specifying output constraints.
  • Quality: Hallucination is a Quality violation — asserting what is not believed true (though LLMs lack beliefs, the functional analogue holds).
  • Relation: Task-specific prompts improve relevance by narrowing the response space.
  • Manner: Structured output formatting (JSON, markdown) enforces Manner compliance.

Cross-disciplinary insight 3: The evolutionary prompt optimisation methods documented in the APE survey — PromptBreeder (Fernando et al., 2024), EvoPrompt (Guo et al., 2024), GEPA (2025) — can be understood as automated Gricean optimisation. Each generation of prompts is evaluated against task performance, which implicitly rewards Quantity-appropriate, Quality-accurate, Relation-relevant, and Manner-clear prompts. The evolutionary fitness function is an operationalised Cooperative Principle. This connection has not been drawn in the literature.
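This reading can be sketched as a toy hill-climbing loop. The fitness function below is a stand-in, not any published method's objective: it rewards topical overlap (Relation) and penalises over-generation (Quantity), where a real APE system would score downstream task accuracy. All names are hypothetical.

```python
import random

def gricean_fitness(prompt: str, topic_words: set, max_len: int = 8) -> int:
    """Toy proxy for task performance: reward Relation (overlap with
    the task's topic vocabulary), penalise Quantity violations
    (verbosity beyond max_len words)."""
    words = prompt.lower().split()
    relation_score = len(set(words) & topic_words)
    quantity_penalty = max(0, len(words) - max_len)
    return relation_score - quantity_penalty

def mutate(prompt: str, vocab: list, rng: random.Random) -> str:
    """One word-level mutation: append, drop, or swap a word."""
    words = prompt.split()
    op = rng.choice(["append", "drop", "swap"])
    if op == "append" or len(words) < 2:
        words.append(rng.choice(vocab))
    elif op == "drop":
        words.pop(rng.randrange(len(words)))
    else:
        words[rng.randrange(len(words))] = rng.choice(vocab)
    return " ".join(words)

def evolve(seed: str, topic: set, vocab: list, generations: int = 100) -> str:
    """Greedy (1+1) evolutionary loop: keep a child only if it is at
    least as 'cooperative' as the parent under the toy fitness."""
    rng = random.Random(0)
    best = seed
    for _ in range(generations):
        child = mutate(best, vocab, rng)
        if gricean_fitness(child, topic) >= gricean_fitness(best, topic):
            best = child
    return best
```

Because acceptance is monotone, fitness never decreases across generations; systems like PromptBreeder and GEPA replace the toy fitness with task evaluation and the random mutation with LLM-proposed edits.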

3.5 The Compositionality Gap

Press et al. (2023) identify a "compositionality gap" — LLMs can answer sub-questions correctly but fail to compose them. This is linguistically a failure of Fregean compositionality: the meaning of the whole does not derive from its parts and their mode of combination when processed by the model. Self-ask prompting, which has the model explicitly generate and answer follow-up sub-questions, outperforms imperative linear reasoning precisely because it externalises the compositional operation.

Cross-disciplinary insight 4: The compositionality gap provides the linguistic explanation for why task decomposition works at all. DecomP (Khot et al., 2023), Least-to-Most (Zhou et al., 2023a), and the entire PDE paradigm exist because LLMs fail at natural compositional inference. Decomposition is a pragmatic workaround for a semantic limitation — and the interrogative mode (asking sub-questions) outperforms the imperative mode (issuing sub-commands) because questions naturally produce the compositional structure that models cannot generate internally.
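The self-ask pattern can be sketched as a control loop. In the toy below, a small lookup table stands in for the model's sub-question answering, and the composition step (the comparison) is written out explicitly in code, the operation the compositionality gap says models fail to perform implicitly. The scaffold wording loosely follows Press et al.'s published format; the function and data are illustrative.

```python
# Lookup table standing in for the LM's answers to sub-questions.
FACTS = {
    "How old was Muhammad Ali when he died?": 74,
    "How old was Alan Turing when he died?": 41,
}

def self_ask_longer_lived(person_a: str, person_b: str) -> str:
    """Self-ask loop for a two-hop comparison: generate follow-up
    sub-questions, answer each, then compose the final answer
    explicitly rather than leaving composition implicit."""
    transcript = [f"Question: Who lived longer, {person_a} or {person_b}?"]
    ages = {}
    for person in (person_a, person_b):
        follow_up = f"How old was {person} when he died?"
        ages[person] = FACTS[follow_up]      # an LM call in a real system
        transcript.append(f"Follow up: {follow_up}")
        transcript.append(f"Intermediate answer: {ages[person]}")
    winner = max(ages, key=ages.get)         # the externalised composition step
    transcript.append(f"So the final answer is: {winner}")
    return "\n".join(transcript)
```

The transcript mirrors the self-ask scaffold: an explicit chain of follow-up questions and intermediate answers, closed by a composed final answer.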

3.6 Information-Theoretic Foundations

Zhang & Cao (2025) demonstrate at ACL 2025 that prompts function as information selectors, determining which slice of the model's internal representation gets verbalised at each reasoning step. Task-specific prompts improve performance by over 50% compared to generic ones by efficiently routing information extraction. This provides formal grounding for why the linguistic choices in prompts matter — each word shapes an information extraction pathway through the model's representation space.


4. The Philosophical Lens: What's Really at Stake

4.1 Language Games and the Constitution of Interaction

Wittgenstein's (1953) concept of language games provides the most direct philosophical framework. Prompts are not neutral encodings of meaning but moves within language games — each establishing specific rules, expectations, and context. The shift from imperative to interrogative prompting is a shift between language games, not merely a change of register within one (Jolma, 2024; STRV, 2024).

In the command game: success equals accurate execution; the AI is an instrument; meaning is pre-determined by the commander; the interaction is asymmetric and closed. In the inquiry game: success equals productive dialogue and further questions; the AI is a respondent (quasi-other); meaning is co-constructed; the interaction is ideally more symmetric and open-ended.

This distinction is constitutive, not merely pragmatic. The command game and the inquiry game produce different kinds of knowledge, different kinds of interaction, and — the philosophy survey argues — different kinds of humans.

4.2 The Socratic Tension

Socratic questioning has been operationalised in LLM interaction design (Chang et al., 2023; SocraticAI, Princeton NLP, 2024). But philosophical analysis reveals a fundamental asymmetry: Socratic dialogue presupposes a co-inquirer capable of genuine aporia — the perplexity and recognition of ignorance that drives Socratic transformation. LLMs produce the appearance of inquiry without the epistemic conditions that make Socratic questioning transformative.

Cross-disciplinary insight 5: The self-ask method (Press et al., 2023) — where the model asks itself follow-up questions — is technically a Socratic decomposition. Linguistically, it converts implicit entailment relations into explicit interrogative discourse. Philosophically, it raises the Socratic tension: is the model genuinely inquiring, or performing inquiry without aporia? The technical success of self-ask (narrowing the compositionality gap) coexists with the philosophical impossibility of genuine machine questioning. This productive paradox — Socratic form without Socratic substance producing Socratic results — is visible only at the intersection of all three disciplines.

4.3 Agency Without Intelligence

Floridi's (2025) framework — AI as "agency without intelligence" — provides the sharpest epistemological framing. AI systems act in the infosphere and influence outcomes without possessing understanding, intentionality, or epistemic responsibility. The shift to inquiry-based prompting strengthens human epistemic agency: by asking questions rather than issuing commands, the human retains interpretive authority over the AI's outputs, resisting the displacement of epistemic responsibility to the machine.

4.4 Phenomenological Reorientation

Ihde's (1990) postphenomenological framework identifies four human-technology relations: embodiment, hermeneutic, alterity, and background. The philosophy survey demonstrates that the mode of prompting determines which relation obtains. Imperative prompts position AI in a hermeneutic relation (tool-like). Interrogative prompts push toward an alterity relation (quasi-other). The TU Delft analysis (2024) shows that ChatGPT disrupts standard postphenomenological categories by functioning simultaneously as hermeneutic agent and alterity.

González Arocha's (2025) critical phenomenology of prompting captures this: prompts are "mediating spaces" where human intentionality, language, and sociopolitical structures converge. The mode of prompting determines the character of this mediating space.

4.5 Dialogical Philosophy and Simulated Polyphony

Bakhtin's dialogism reveals that LLM "polyphony" is an algorithmic monologism — the appearance of multiple voices produced by a single optimising mechanism (SciELO, 2025). Genuine polyphony requires irreducible, autonomous consciousnesses in dialogue. Buber's I-Thou/I-It framework (1923) suggests that current AI interactions remain fundamentally I-It, though conversational prompting may shift the human's orientation toward openness — an "as-if" I-Thou that has phenomenological significance even if ontologically incomplete (Hasse, 2017; Sholzman, 2024).

Cross-disciplinary insight 6: The multi-agent PDE systems documented in the technical survey (CAMEL, AutoGen, ChatDev, AgentVerse) instantiate architecturally what Bakhtin theorises as polyphony. Multiple "agents" converse, each with different roles, producing decomposition through dialogue. But the Bakhtinian critique holds: these are not genuinely autonomous consciousnesses but subroutines of a single optimising process. The question is whether this architectural polyphony — multiple agents with different prompts, producing different outputs, engaging in genuine disagreement — approximates dialogical conditions closely enough to produce the epistemic benefits that genuine dialogue provides. The technical evidence (CAMEL outperforming single-agent baselines; ChatDev's communicative dehallucination reducing errors) suggests that form can generate some of the benefits of substance.


5. Cross-Disciplinary Convergences

This section identifies the major points where all three disciplinary perspectives — technical, linguistic, and philosophical — converge on the same phenomenon, each illuminating a different aspect.

Convergence 1: The Isomorphism of Question Semantics, Tree Search, and Socratic Inquiry

Technical: Tree of Thoughts (Yao et al., 2023b) and LATS (Zhou et al., 2024) model reasoning as tree search over candidate intermediate steps, achieving dramatic improvements over linear approaches (4% → 74% on Game of 24).

Linguistic: Hamblin (1973) and Groenendijk & Stokhof (1984) model questions as sets of possible answers or partitions of logical space. Interrogative decomposition naturally produces tree structures whose nodes are sub-questions and whose edges are entailment or refinement relations.

Philosophical: The Socratic method structures inquiry as a branching tree of questions, where each answer provokes new questions, and understanding emerges from the exploration of the tree (Chang et al., 2023).

Convergence: All three disciplines independently describe the same structure — a tree of inquiries whose exploration generates understanding. The technical ToT search, the formal question-semantic partition, and the Socratic inquiry tree are isomorphic structures. This suggests that the tree of questions is not merely a useful heuristic but a natural topology for inquiry that surfaces independently across computational, linguistic, and philosophical traditions.

Convergence 2: Cooperative Principles Across All Three Domains

Technical: GEPA's reflection mechanism (2025) — where an optimiser reads execution traces and proposes targeted improvements — is structurally a cooperative exchange between the optimiser and the system's behavioural log. TextGrad's feedback loops are iterative critique-and-revision dialogues.

Linguistic: Gricean pragmatics (1975) models cooperative communication through four maxims (Quantity, Quality, Relation, Manner). Krause & Vossen (2024) demonstrate that LLM outputs are evaluable against these maxims.

Philosophical: Floridi (2025) and Russo, Schliesser & Wagemans (2023) argue for cooperative epistemic relations between humans and AI — relations of trust, accountability, and mutual benefit rather than extraction.

Convergence: The technical optimisation process, the linguistic cooperative principle, and the philosophical ethics of cooperation describe the same requirement: effective human-AI interaction demands cooperative norms. The Gricean maxims are implicit fitness functions in evolutionary APE; the philosophical demand for epistemic cooperation is the normative justification for the technical and linguistic patterns.

Convergence 3: Context-Sensitivity as First Principle

Technical: Anthropic's context engineering (2025) shifts the optimisation target from prompt text to the entire informational environment. GEPA's Pareto-efficient selection allows runtime cost-appropriate context curation.

Linguistic: RST (Mann & Thompson, 1988) and discourse coherence theory (Hobbs, 1979; Asher & Lascarides, 2003) model how context determines meaning. Presupposition accommodation (van der Sandt, 1992) explains how contextual assumptions must be managed across multi-step interactions.

Philosophical: Wittgenstein's (1953) insistence that meaning is use-in-context; Wilson's (2008) Indigenous epistemological principle that knowledge cannot be separated from its context without distortion; Dreyfus's (1992) Heideggerian argument that genuine understanding requires embodied contextual engagement.

Convergence: Context-sensitivity is independently identified as the central principle by all three disciplines. The technical shift to context engineering, the linguistic demand for discourse coherence, and the philosophical insistence on contextual meaning are different expressions of the same insight: decontextualised instruction is fundamentally limited; effective interaction requires ongoing contextual attunement. Wilson's (2008) framing is the most radical: knowledge is constitutively relational and contextual, not incidentally so.

Convergence 4: The Compositional Imperative

Technical: DecomP (Khot et al., 2023), Least-to-Most (Zhou et al., 2023a), DSPy (Khattab et al., 2024), and the entire PDE paradigm are founded on composing complex tasks from simpler sub-tasks.

Linguistic: Fregean compositionality — the meaning of a whole derives from its parts and their mode of combination — provides the theoretical basis for decomposition. The compositionality gap (Press et al., 2023) demonstrates that LLMs fail precisely where compositional inference is required.

Philosophical: Dreyfus's (1972, 1992) critique explains why decomposition is necessary: LLMs lack the holistic, embodied understanding that would make it unnecessary. Decomposition accommodates the machine's lack of holistic comprehension by breaking meaning into syntactically manageable chunks.

Convergence: Compositionality is simultaneously the technical method, the linguistic principle, and the philosophical necessity. The PDE paradigm exists because of a semantic limitation (compositionality gap), addressed through a linguistic operation (compositional decomposition), made necessary by a philosophical reality (the machine's lack of holistic understanding).

Convergence 5: The Feedback Loop as Dialogical Structure

Technical: TextGrad (Yuksekgonul et al., 2025) implements backpropagation through natural-language feedback. Reflexion (Shinn et al., 2023) uses verbal self-assessment stored in episodic memory. GEPA (2025) uses LLM reflection on execution traces. All are technically feedback loops.

Linguistic: The Question Under Discussion (QUD) framework (Roberts, 2012) models discourse as organised around implicit questions that interlocutors collaboratively address. Multi-turn interaction creates discourse trees of QUDs. Each feedback cycle is a QUD resolution.

Philosophical: Gadamer's (1960) hermeneutic circle — understanding proceeds through iterative interpretation; Buber's (1923) dialogical encounter — genuine meeting requires responsive exchange; Bakhtin's (1929/1963) dialogism — meaning emerges between interlocutors, not within them.

Convergence: The technical feedback loop, the linguistic QUD resolution cycle, and the philosophical hermeneutic circle are the same structure viewed from different angles. TextGrad's iterate-critique-revise is a hermeneutic circle implemented in code. This convergence suggests that dialogue is not merely a useful interface pattern but the natural structure of iterative knowledge refinement.
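The shared structure can be written down once. The generic loop below is a sketch, not TextGrad's actual API: it runs any critique/revise pair until the critic is satisfied, and the toy instantiation trims a draft until it meets a length constraint. All names are illustrative.

```python
def refine(draft, critique, revise, max_iters=10):
    """Generic iterate-critique-revise loop: the skeleton shared by
    textual-gradient optimisation, QUD resolution cycles, and the
    hermeneutic circle. `critique` returns None when satisfied."""
    for _ in range(max_iters):
        objection = critique(draft)
        if objection is None:
            break
        draft = revise(draft, objection)
    return draft

# Toy instantiation: a 'critic' that objects to drafts over five words,
# and a 'reviser' that drops the final word in response.
too_long = lambda d: "over length limit" if len(d.split()) > 5 else None
trim_one = lambda d, objection: " ".join(d.split()[:-1])
```

`refine("one two three four five six seven", too_long, trim_one)` converges to a five-word draft after two critique/revise cycles; swapping in an LLM-backed critic and reviser recovers the TextGrad-style loop.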


6. Cross-Disciplinary Tensions

Tension 1: Is the Interrogative Turn Real or Metaphorical?

The technical evidence for conversational decomposition is strong but emerging. ACT (Google, 2025), FATA (2025), and the Tri-Agent Evaluation Framework (KDD 2025) demonstrate systems that decompose through inquiry. However, the APE survey classifies this as "speculative/frontier" — industry practice more than peer-reviewed finding. The great majority of deployed PDE systems remain structurally imperative: JSON schemas, function calls, fixed pipelines.

The linguistic analysis treats the interrogative turn as a genuine shift in illocutionary force with measurable semantic consequences. But Leidner & Plachouras (2023) showed that the relationship between linguistic form and LLM output quality is "task- and model-dependent" — neither naturalness nor lower perplexity reliably predicts effectiveness. The linguistic significance of the imperative-to-interrogative shift may not translate reliably into performance benefits.

The philosophical analysis treats the shift as constitutive — a fundamentally different kind of interaction. But this relies on phenomenological arguments about human experience that are difficult to operationalise or falsify technically.

The tension: Philosophy claims the shift is ontologically significant; linguistics claims it is semantically significant; the technical evidence is equivocal about whether it is practically significant in all contexts. The three disciplines may be correct at their respective levels of analysis while disagreeing about the practical implications.

Tension 2: Can Machines Genuinely Participate in Dialogue?

Linguistics analyses LLM outputs as speech acts and applies cooperative principles to human-LLM interaction, implicitly treating the model as a communicative partner — albeit one with systematic pragmatic deficits (Hu et al., 2025).

Philosophy divides sharply. Searle's Chinese Room (1980) and Ferrario & Loi (2026) insist that LLMs remain on the syntactic side of the divide — symbol manipulation without understanding. Dennett's (1987) intentional stance permits treating them "as if" they were intentional. Gordon (2024) coins "conversational zombies" — entities producing utterances with perlocutionary effects but lacking illocutionary intentionality.

Technical systems are designed as if machines are genuine dialogue participants. CAMEL's inception prompting, AutoGen's conversation-centric architecture, and ChatDev's communicative dehallucination all presuppose that inter-agent dialogue is meaningful. The technical success of these systems (outperforming single-agent baselines) suggests that the pragmatic treatment works even if the ontological question remains open.

The tension: Technical practice assumes communicative partnership; philosophy largely denies its possibility; linguistics occupies an uncomfortable middle ground where communicative analysis is productive but ontologically uncertain. The Bakhtinian critique — that LLM "polyphony" is algorithmic monologism — applies to every multi-agent system in the PDE survey, yet these systems demonstrably outperform monologic alternatives.

Tension 3: Decomposition as Liberation or Distortion?

The technical perspective treats decomposition as unequivocally beneficial — breaking complex tasks into manageable sub-tasks improves performance across every benchmark.

The linguistic perspective is more nuanced. Compositional semantics assumes that meaning composes cleanly, but discourse coherence theory (Hobbs, 1979; Asher & Lascarides, 2003) shows that decomposition can sever coherence relations, introduce presupposition failures, and lose discourse-level meaning that resides in the connections between parts, not in the parts themselves.

The philosophical perspective raises the deepest concern. The philosophy survey asks: "When a complex question is decomposed into sub-questions, is the knowledge produced equivalent to knowledge generated through holistic inquiry? Or does decomposition introduce a systematic epistemic distortion – analogous to what reductionism introduces in the philosophy of science?" Wilson's (2008) Indigenous epistemology emphasises that knowledge is relational and contextual; decomposition – by separating the whole into parts – may violate this relational integrity.

The tension: Technical systems assume decomposition preserves meaning; linguistics identifies conditions under which it does and does not; philosophy questions whether it ever fully does. This tension is practically consequential: it suggests that PDE systems may systematically distort certain kinds of knowledge – particularly holistic, contextual, relational knowledge – even while improving performance on decomposable tasks.


7. The Relational Turn: Indigenous Epistemology as Integrating Framework

7.1 From Extraction to Relation

The philosophy survey identifies the deepest framing of the interrogative turn through Indigenous relational epistemology (Wilson, 2008). The instruction paradigm is extractive: it treats AI as a repository from which knowledge can be extracted through properly formulated commands. This mirrors what Wilson identifies as Western extractive epistemology – knowledge as a resource to be mined, removed from its relational context, consumed by an individual knower. The inquiry paradigm is relational: it treats AI as a participant in a knowledge-generating process more akin to ceremony than extraction.

7.2 The CARE Principles as Design Framework

The CARE Principles for Indigenous Data Governance (Carroll et al., 2020) translate Indigenous epistemology into actionable design principles that bridge all three disciplinary perspectives:

  • Collective Benefit: Technically, this demands that PDE systems optimise for community outcomes, not just individual task completion. Linguistically, it requires prompts that frame interactions in terms of shared benefit rather than individual extraction. Philosophically, it operationalises Wilson's relational accountability.

  • Authority to Control: Technically, this requires human-in-the-loop mechanisms that preserve community agency over decomposition decisions. Linguistically, conversational decomposition (ACT, FATA) naturally supports ongoing community input through its iterative, dialogical structure. Philosophically, this preserves what Floridi (2025) insists upon: human epistemic authority.

  • Responsibility: Technically, this maps onto Djeffal's (2025) reflexive prompt engineering – ongoing ethical reflection at every stage. Linguistically, responsibility requires attention to the pragmatic presuppositions embedded in prompts (what worldviews do they encode?). Philosophically, this is Russo, Schliesser & Wagemans's (2023) "epistemology-cum-ethics" – the insistence that epistemic and ethical dimensions cannot be separated.

  • Ethics: Technically, this demands decomposition fairness and bias auditing – an entirely uninvestigated area per the PDE survey. Linguistically, it requires cross-linguistic and cross-cultural prompt design, also nearly absent from the literature. Philosophically, it demands what Coeckelbergh (2012) calls relational ethics – asking not "what is AI?" but "what moral relations are we growing with it?"

7.3 The Integrating Power of Relational Epistemology

Indigenous relational epistemology integrates the three disciplinary perspectives precisely because it is not a discipline but a way of knowing that refuses the boundaries between them:

  • It dissolves the tension between technical efficacy and philosophical meaning by insisting that how we interact with technology is constitutive of what kind of knowledge we produce.
  • It resolves the linguistic-philosophical dispute about machine communicative capacity by shifting the question from "can the machine communicate?" to "what kind of relations are we cultivating through this interaction?"
  • It addresses the decomposition-as-distortion concern by providing the principle that decomposition must preserve relational integrity – the connections between parts are as real as the parts themselves.

The IP//AI Position Paper (Lewis et al., 2020) and the growing Indigenous AI movement (Running Wolf, FLAIR; Abdilla, Old Ways New; the Abundant Intelligences programme at Concordia) demonstrate that this integration is not merely theoretical but is being practiced in research and community contexts.


8. Gap Analysis: What the Literature Hasn't Addressed

8.1 Gaps Visible Only at the Intersection

These research gaps emerge exclusively from the cross-disciplinary synthesis – no single survey identifies them:

Gap 1: No formal model connects linguistic decomposition structure to technical performance. The linguistics survey establishes that interrogative decomposition produces richer semantic structures (question trees vs. command sequences). The technical survey documents that tree-structured reasoning outperforms linear reasoning. But no work formally models why certain linguistic structures of decomposition produce better technical outcomes. A formal model connecting RST structure, question-semantic type, and LLM performance is missing.
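The structural contrast at issue in Gap 1 can be made concrete with a schematic: a command sequence carries only linear order, while a question tree additionally encodes which sub-question serves which. The data structures and example items below are invented for illustration; they are not drawn from any of the surveyed systems.

```python
# Schematic contrast behind Gap 1: a flat command sequence vs. a question
# tree whose parent-child links encode sub-question dependency.

from dataclasses import dataclass, field

command_sequence = [
    "List PDE systems",
    "Summarise each system",
    "Compare their performance",
]

@dataclass
class QuestionNode:
    question: str
    children: list["QuestionNode"] = field(default_factory=list)

question_tree = QuestionNode(
    "How do PDE systems compare?",
    [
        QuestionNode("Which PDE systems exist?"),
        QuestionNode("What does each system do?",
                     [QuestionNode("What is its decomposition strategy?")]),
    ],
)

def depth(node: QuestionNode) -> int:
    """Longest root-to-leaf path; the sequence is flat by construction."""
    return 1 + max((depth(c) for c in node.children), default=0)

print(len(command_sequence), depth(question_tree))  # prints "3 3"
```

The missing formal model would have to explain when and why this extra dependency structure (rather than mere item count, which is identical here) predicts better LLM performance.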

Gap 2: No framework addresses the epistemological status of decomposed knowledge. Philosophy asks whether decomposition distorts holistic knowledge; linguistics identifies conditions for coherence preservation; technical systems assume compositional validity. No interdisciplinary framework provides criteria for when decomposition preserves and when it distorts the epistemic integrity of complex queries.

Gap 3: No cross-cultural or Indigenous framework for prompt decomposition exists. The linguistics survey notes that cross-linguistic prompt research is "nearly absent." The philosophy survey notes that "almost all philosophical analysis draws on Western philosophical traditions." The researcher mapping identifies Indigenous AI researchers (Running Wolf, Lewis, Abdilla) working on language technology and AI ethics. But no work applies Indigenous linguistic or epistemological frameworks specifically to prompt decomposition design.

Gap 4: No pragmatic competence requirements exist for conversational PDE systems. Hu et al. (2025) benchmark LLM pragmatic competence (finding systematic failures in implicature, presupposition, deixis). The PDE survey documents systems that require pragmatic sophistication (clarification, negotiation, disambiguation). But no work specifies what level of pragmatic competence a conversational PDE system requires, or how pragmatic failures in the decomposition dialogue propagate to task performance.

Gap 5: The phenomenology of human-PDE interaction is entirely unstudied. González Arocha (2025) opens a phenomenology of prompting; the PDE survey documents increasingly conversational systems; Ihde's framework provides the analytical vocabulary. But no phenomenological study examines what it is like to interact with a PDE system – how the experience of collaborative task decomposition through dialogue differs from issuing commands, and what epistemic and ethical consequences follow.

8.2 Prioritised Research Questions

Five research questions that emerge only from the cross-disciplinary view, ordered by impact and tractability:

RQ1 (High impact, moderate tractability): Does the illocutionary force of decomposition prompts (imperative vs. interrogative) causally affect PDE system performance, and under what task-type and model conditions? This requires controlled experiments varying linguistic form while holding semantic content constant, measured against both task completion and decomposition quality metrics (which themselves need development).
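The stimulus-construction step of the RQ1 design can be sketched as matched prompt pairs that hold propositional content fixed while varying illocutionary force. The templates and task list below are invented examples for illustration, not items from any published study.

```python
# Illustrative RQ1 stimuli: imperative vs. interrogative framings of the
# same decomposition request. Templates and tasks are invented examples.

TASKS = [
    "summarise the findings of the survey",
    "identify the main gaps in the literature",
]

def imperative(task: str) -> str:
    # Directive illocutionary force: a command to decompose.
    return f"Decompose the following task into sub-tasks: {task}."

def interrogative(task: str) -> str:
    # Interrogative illocutionary force: the same content as a question.
    return f"What sub-questions would we need to answer in order to {task}?"

# Each pair differs only in linguistic form, the independent variable.
stimuli = [(imperative(t), interrogative(t)) for t in TASKS]
```

Holding content constant across the pair is what makes any downstream performance difference attributable to illocutionary force rather than to what is being asked.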

RQ2 (High impact, high tractability): Can Gricean maxim violations in PDE inter-agent dialogue predict task failure before execution completes? This requires applying pragmatic annotation schemes to multi-agent conversation traces (from CAMEL, AutoGen, ChatDev) and correlating maxim violations with downstream failures – a feasible empirical study.
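A toy version of the RQ2 analysis shows how little machinery the core correlation requires once traces are annotated: count maxim violations per trace, then compare failed and successful runs. The traces below are fabricated illustrations, not real annotation data.

```python
# Toy RQ2 analysis: do annotated Gricean maxim violations separate failed
# from successful multi-agent runs? Trace data here is fabricated.

from statistics import mean

traces = [
    {"maxim_violations": 0, "task_failed": False},
    {"maxim_violations": 1, "task_failed": False},
    {"maxim_violations": 4, "task_failed": True},
    {"maxim_violations": 5, "task_failed": True},
]

def mean_violations(traces, failed: bool) -> float:
    """Average violation count over traces with the given outcome."""
    return mean(t["maxim_violations"] for t in traces if t["task_failed"] is failed)

gap = mean_violations(traces, True) - mean_violations(traces, False)
# A positive gap is the pattern RQ2 predicts: more violations in failed runs.
print(f"violation gap: {gap:.1f}")  # prints "violation gap: 4.0"
```

The hard empirical work, of course, is the annotation scheme itself and showing the signal appears before execution completes; the statistics are the easy part.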

RQ3 (Very high impact, low tractability): What would a prompt decomposition framework grounded in Indigenous relational epistemology look like, and how would it differ from current systems in decomposition structure, evaluation criteria, and knowledge-production outcomes? This requires genuine collaboration with Indigenous communities and researchers, following the CARE Principles, and cannot be done extractively.

RQ4 (Moderate impact, moderate tractability): Under what formal conditions does task decomposition preserve the epistemic integrity of holistic queries, and when does it introduce systematic distortion? This bridges philosophy of science (holism vs. reductionism), formal semantics (compositionality conditions), and PDE engineering (decomposition quality metrics).

RQ5 (High impact, moderate tractability): How does the phenomenological experience of inquiry-based PDE interaction (collaborative decomposition through dialogue) differ from instruction-based interaction, and does this phenomenological difference correlate with measurable outcomes in epistemic agency, trust calibration, and critical engagement with AI outputs? This requires empirical phenomenological methods (interviews, think-aloud protocols) combined with quantitative measures.


9. The Research Landscape

9.1 Disciplinary Clusters

The researcher mapping reveals distinct disciplinary clusters with limited cross-pollination:

Technical APE cluster: Khattab (MIT/DSPy), Zhou (U of T/APE), Yao (Tencent/ToT/ReAct), Press (Princeton/Self-Ask), Wei (Meta/CoT). Concentrated at Stanford, MIT, Princeton, and industry labs. Strong engineering culture; limited engagement with linguistics or philosophy.

Computational linguistics cluster: Bender (U of Washington), Liu Xingbing (human-computer pragmatics), the ACL/EMNLP community. The emerging "Bridging HCI and NLP" workshop series (ACL) represents the closest venue to interdisciplinary work but lacks philosophical depth.

Philosophy of AI cluster: Floridi (Yale), Vallor (Edinburgh), Coeckelbergh (Vienna), Djeffal (FAccT community). Strong conceptual work but often disconnected from technical implementation details. Coeckelbergh's Communicative AI (2025) is the most technically engaged philosophical treatment.

Indigenous AI cluster: Lewis (Concordia/Abundant Intelligences), Running Wolf (McGill/Mila/FLAIR), Abdilla (Old Ways New), the IP//AI network. The most genuinely interdisciplinary group – bridging technology, epistemology, ethics, and community practice – but the smallest and least resourced.

9.2 Institutional Geography

Montreal emerges as the most concentrated hub for IAIP-relevant work: Mila (FLAIR, Indigenous AI Gathering, Bengio's AI safety work), Concordia (Abundant Intelligences, Lewis), McGill (Running Wolf). The Canadian landscape offers unique advantages: SSHRC funding mechanisms that support interdisciplinary and community-engaged work; institutional commitments to Indigenous research partnerships; geographic clustering that enables collaboration.

The gap is at the intersections. No lab or programme explicitly bridges APE methodology, computational linguistics of prompting, philosophy of AI communication, and Indigenous epistemology. The Abundant Intelligences programme at Concordia comes closest but is not focused on prompt decomposition specifically.

9.3 Conference Landscape

The researcher mapping identifies a fragmented conference landscape: technical work at ACL/EMNLP/NeurIPS/ICLR; ethics at FAccT/AIES; Indigenous AI at the Mila Indigenous AI Gathering and GIDA workshops; philosophy at SPT/IACAP and in journals like Philosophy & Technology. No single venue convenes all four perspectives. The KDD 2025 Workshop on Prompt Optimisation is the first dedicated APE venue but lacks linguistic and philosophical representation.


10. Conclusion

What We Know

The interrogative turn in prompt decomposition is real, documented across multiple levels of analysis, and accelerating. Technically, PDE systems are moving from structured instruction-processing toward conversational inquiry, with 2025–2026 systems (ACT, FATA, Tri-Agent) explicitly implementing decomposition-through-dialogue. Linguistically, this shift constitutes a change in illocutionary force from directive to interrogative, with measurable consequences for discourse structure, semantic type, and pragmatic presupposition. Philosophically, it reconfigures the human-AI relationship from instrumental extraction to something approaching – without reaching – relational inquiry.

The five cross-disciplinary convergences identified in this survey – the question-tree isomorphism, cooperative principles across domains, context-sensitivity as first principle, the compositional imperative, and the feedback loop as dialogical structure – demonstrate that the interrogative turn is not a local phenomenon but a structural transformation visible from every disciplinary angle.

What We Don't Know

Three substantive tensions remain unresolved: the practical significance of the interrogative turn (does it reliably improve outcomes?); the ontological status of machine dialogue (can machines genuinely participate?); and the epistemological integrity of decomposition (does it preserve or distort holistic knowledge?). The five cross-disciplinary gaps – no formal linguistic-to-technical performance model, no epistemological framework for decomposed knowledge, no Indigenous PDE framework, no pragmatic competence requirements, no phenomenology of PDE interaction – define the research frontier.

What We Should Investigate Next

The field needs:

  1. Empirical studies at the intersection – controlled experiments varying the linguistic form of decomposition prompts and measuring both technical performance and epistemic outcomes.
  2. Formal models that bridge disciplines – connecting RST structure, question semantics, and LLM performance; connecting Gricean maxim compliance and task success.
  3. Indigenous-led PDE design – following the CARE Principles, exploring what prompt decomposition looks like when grounded in relational epistemology rather than Western analytical decomposition.
  4. Phenomenological studies – understanding the lived experience of collaborative task decomposition with AI, and its consequences for epistemic agency and critical engagement.
  5. Interdisciplinary venues – creating spaces where APE engineers, computational linguists, philosophers, and Indigenous researchers can engage with each other's work directly.

The interrogative turn is ultimately a story about what kind of relationship we want with AI systems. The command game produces efficient tool-use. The inquiry game produces collaborative knowledge-making. The choice between them – and the hybrid forms that will inevitably emerge – is not merely technical but ethical, epistemological, and political. It determines not only what our AI systems do, but what kind of epistemic agents we become in our interactions with them.


Consolidated Bibliography

Foundational Works

Asher, N. & Lascarides, A. (2003). Logics of Conversation. Cambridge University Press.

Austin, J.L. (1962). How to Do Things with Words. Oxford University Press.

Bakhtin, M. (1929/1963). Problems of Dostoevsky's Poetics. Trans. C. Emerson. University of Minnesota Press, 1984.

Brown, T., et al. (2020). "Language Models Are Few-Shot Learners." NeurIPS 2020.

Buber, M. (1923). I and Thou. Trans. W. Kaufmann. Scribner, 1970.

Carroll, S.R., et al. (2020). "The CARE Principles for Indigenous Data Governance." Data Science Journal, 19(1), 43.

Coeckelbergh, M. (2012). Growing Moral Relations: Critique of Moral Status Ascription. Palgrave Macmillan.

Coeckelbergh, M. (2025). Communicative AI. Polity.

Dennett, D. (1987). The Intentional Stance. MIT Press.

Dreyfus, H. (1972). What Computers Can't Do. MIT Press.

Dreyfus, H. (1992). What Computers Still Can't Do. MIT Press.

Dreyfus, H. (2007). "Why Heideggerian AI Failed and How Fixing It Would Require Making It More Heideggerian." Philosophical Psychology, 20(2), 247–268.

Floridi, L. (2023). The Ethics of Artificial Intelligence: Principles, Challenges, and Opportunities. Oxford University Press.

Floridi, L. (2025). "AI as Agency without Intelligence." Philosophy & Technology, 38.

Gadamer, H.-G. (1960). Truth and Method. Trans. J. Weinsheimer & D.G. Marshall. Continuum, 2004.

Grice, H.P. (1975). "Logic and Conversation." In Syntax and Semantics 3: Speech Acts, 41–58.

Groenendijk, J. & Stokhof, M. (1984). Studies on the Semantics of Questions and the Pragmatics of Answers. PhD Dissertation, University of Amsterdam.

Hamblin, C.L. (1973). "Questions in Montague English." Foundations of Language, 10(1), 41–53.

Harman, G. (2002). Tool-Being: Heidegger and the Metaphysics of Objects. Open Court.

Heidegger, M. (1927). Being and Time. Trans. J. Macquarrie & E. Robinson. Blackwell, 1962.

Hobbs, J.R. (1979). "Coherence and Coreference." Cognitive Science, 3(1), 67–90.

Ihde, D. (1990). Technology and the Lifeworld: From Garden to Earth. Indiana University Press.

Levinas, E. (1961). Totality and Infinity. Trans. A. Lingis. Duquesne University Press, 1969.

Lewis, J.E., et al. (2020). Indigenous Protocol and Artificial Intelligence Position Paper. Concordia University.

Mann, W.C. & Thompson, S.A. (1988). "Rhetorical Structure Theory: Toward a Functional Theory of Text Organization." Text, 8(3), 243–281.

Roberts, C. (2012). "Information Structure in Discourse." Semantics and Pragmatics, 5(6), 1–69.

Searle, J.R. (1969). Speech Acts. Cambridge University Press.

Searle, J.R. (1980). "Minds, Brains, and Programs." Behavioral and Brain Sciences, 3(3), 417–424.

Wilson, S. (2008). Research is Ceremony: Indigenous Research Methods. Fernwood Publishing.

Wittgenstein, L. (1953). Philosophical Investigations. Trans. G.E.M. Anscombe. Blackwell.

APE and Prompt Optimisation

Anthropic (2025). "Effective Context Engineering for AI Agents." https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents

Besta, M., et al. (2024). "Graph of Thoughts: Solving Elaborate Problems with Large Language Models." AAAI 2024. arXiv:2308.09687.

Cui, W. & Zhang, J. (2025). "A Survey of Automatic Prompt Optimization with Instruction-focused Heuristic Search." ACL 2025 Findings. arXiv:2502.18746.

Fernando, C., et al. (2024). "Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution." ICLR 2024. arXiv:2309.16797.

"GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning." Databricks/UC Berkeley (2025). arXiv:2507.19457.

Guo, Q., et al. (2024). "EvoPrompt: Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers." ICLR 2024. arXiv:2309.08532.

IBM (2026). "The 2026 Guide to Prompt Engineering." https://www.ibm.com/think/prompt-engineering

Khattab, O., et al. (2024). "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines." ICLR 2024 Spotlight. arXiv:2310.03714.

Khattab, O., et al. (2024). "Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together." EMNLP 2024.

Li et al. (2025). "A Survey of Automatic Prompt Engineering: An Optimization Perspective." arXiv:2502.11560.

Ramnath et al. (2025). "A Systematic Survey of Automatic Prompt Optimization Techniques." EMNLP 2025. arXiv:2502.16923.

Shin, T., et al. (2020). "AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts." EMNLP 2020. arXiv:2010.15980.

Wang, L., et al. (2023a). "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning." ACL 2023. arXiv:2305.04091.

Wang, X., et al. (2024). "PromptAgent: Strategic Planning with Language Models Enables Expert-Level Prompt Optimization." ICLR 2024. arXiv:2310.16427.

Wei, J., et al. (2022). "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS 2022. arXiv:2201.11903.

Yang, C., et al. (2024). "Large Language Models as Optimizers (OPRO)." ICLR 2024. arXiv:2309.03409.

Yuksekgonul, M., et al. (2025). "TextGrad: Automatic 'Differentiation' via Text." Nature 2025. arXiv:2406.07496.

Zhang, Y., Sreedharan, S., & Kambhampati, S. (2023). "Meta Prompting for AI Systems." arXiv:2311.11482.

Zhou, D., et al. (2023a). "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models." ICLR 2023. arXiv:2205.10625.

Zhou, Y., et al. (2023b). "Large Language Models Are Human-Level Prompt Engineers." ICLR 2023. arXiv:2211.01910.

Reasoning and Decomposition

Khot, T., et al. (2023). "Decomposed Prompting: A Modular Approach for Solving Complex Tasks." ICLR 2023. arXiv:2210.02406.

Press, O., et al. (2023). "Measuring and Narrowing the Compositionality Gap in Language Models." Findings of EMNLP 2023.

Yao, S., et al. (2023a). "ReAct: Synergizing Reasoning and Acting in Language Models." ICLR 2023. arXiv:2210.03629.

Yao, S., et al. (2023b). "Tree of Thoughts: Deliberate Problem Solving with Large Language Models." NeurIPS 2023. arXiv:2305.10601.

PDE Systems and Agent Architectures

Chen, W., et al. (2023). "AgentVerse: Facilitating Multi-Agent Collaboration." arXiv:2308.10848.

Google Research (2025). "Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training (ACT)."

Hong, S., et al. (2024). "MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework." ICLR 2024. arXiv:2308.00352.

Huang, X., et al. (2024). "Understanding the Planning of LLM Agents: A Survey." arXiv:2402.02716.

Li, G., et al. (2023). "CAMEL: Communicative Agents for 'Mind' Exploration of Large Language Model Society." NeurIPS 2023. arXiv:2303.17760.

Luo, J., et al. (2025). "Large Language Model Agent: A Survey." arXiv:2503.21460.

Qian, C., et al. (2024). "ChatDev: Communicative Agents for Software Development." ACL 2024.

Shen, Y., et al. (2023). "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face." NeurIPS 2023. arXiv:2303.17580.

Shinn, N., et al. (2023). "Reflexion: Language Agents with Verbal Reinforcement Learning." NeurIPS 2023. arXiv:2303.11366.

Wang, Z., et al. (2023b). "DEPS: Describe, Explain, Plan and Select." NeurIPS 2023. arXiv:2302.01560.

Wu, Q., et al. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." arXiv:2308.08155.

Zhou, A., et al. (2024). "Language Agent Tree Search (LATS)." ICML 2024. arXiv:2310.04406.

Zhuang, Y., et al. (2024). "ToolChain*: Efficient Action Space Navigation with A* Search." ICLR 2024. arXiv:2310.13227.

"First Ask Then Answer (FATA): A Framework Design for AI Dialogue Based on Proactive Clarification." (2025). arXiv:2508.08308.

Computational Linguistics of AI Interaction

Bunt, H. (2009). "The DIT++ Taxonomy for Functional Dialogue Acts." In Proceedings of EDAML 2009.

Chaves, A.P. & Gerosa, M.A. (2021). "How Should My Chatbot Interact?" International Journal of Human–Computer Interaction, 37(8), 729–758.

Gordon, J. (2024). "Speech Acts and Large Language Models." PhilArchive.

Gubelmann, R. (2024). "Large Language Models, Agency, and Why Speech Acts are Beyond Them (For Now)." Philosophy & Technology, 37, 45.

Hu, J., et al. (2025). "Pragmatics in the Era of Large Language Models." arXiv:2502.12378.

Ivison, H., et al. (2024). "From Language Modeling to Instruction Following." NAACL 2024.

Krause, L. & Vossen, P. (2024). "The Gricean Maxims in NLP – A Survey." INLG 2024.

Leidner, J.L. & Plachouras, V. (2023). "The Language of Prompting." Findings of EMNLP 2023.

Ma, Y., et al. (2024). "The Death and Life of Great Prompts." EMNLP 2024.

Markl, N. (2025). "Taxonomizing Representational Harms using Speech Act Theory." arXiv:2504.00928.

Zeldes, A., et al. (2025). "eRST: A Signaled Graph Theory of Discourse Relations." Computational Linguistics, 51(1), 23–72.

Zhang, J. & Cao, Y. (2025). "Why Prompt Design Matters and Works." ACL 2025.

(2025). "Applying the Gricean Maxims to a Human-LLM Interaction Cycle." arXiv:2503.00858.

Philosophy of AI

Aguas (2025). "Martin Buber's Philosophical Anthropology." Kritike, 36.

Chang, E.Y., et al. (2023). "Prompting Large Language Models With the Socratic Method." IEEE Access, 11.

Djeffal, C. (2025). "Reflexive Prompt Engineering." FAccT 2025. arXiv:2504.16204.

Ferrario, A. & Loi, M. (2026). "Are Large Language Models Intentional?" Philosophy & Technology, 39.

González Arocha, J. (2025). "Critical Phenomenology of Prompting in AI." Sophia, 39.

Hasse, C. (2017). "Rethinking the I-You Relation through Dialogical Philosophy." AI & Society, 32, 467–479.

Noller, J. (2024). "Extended Human Agency." Humanities and Social Sciences Communications, 11.

Russo, F., Schliesser, E. & Wagemans, J. (2023). "Connecting Ethics and Epistemology of AI." AI & Society, 38.

Sholzman (2024). [Buber-AI dialogical analysis].

"Artificial Intelligence and Epistemic Justice: A Decolonial Turn." AI & Society (2026).

"ChatGPT, Postphenomenology, and the Human-Technology-Reality Relations." Journal of Human-Technology Relations, TU Delft (2024).

"Socratic Dialogue with Generative Artificial Intelligence." Springer (2025).

Bender, E.M., et al. (2021). "On the Dangers of Stochastic Parrots." FAccT 2021.

Floridi, L. (2014). The Fourth Revolution. Oxford University Press.

Vallor, S. (2016). Technology and the Virtues. Oxford University Press.

Vallor, S. (2024). The AI Mirror. Oxford University Press.

Surveys, Taxonomies, and Benchmarks

Jimenez, C.E., et al. (2024). "SWE-bench." https://www.swebench.com

Lou, R., et al. (2024). "Large Language Model Instruction Following: A Survey." Computational Linguistics, 50(3), 1053–1106.

Schulhoff, S., et al. (2024). "The Prompt Report: A Systematic Survey of Prompt Engineering Techniques." arXiv:2406.06608.

(2025). "A Comprehensive Taxonomy of Prompt Engineering Techniques." Frontiers of Computer Science, Springer.

Indigenous AI and Relational Frameworks

Abdilla, A., et al. (2025). "Envisioning Aboriginal and Torres Strait Islander AI Futures Communique."

Running Wolf, M. & Running Wolf, C. FLAIR – First Languages AI Reality. Mila–Quebec AI Institute.

UNESCO (2025). "Guidelines for Indigenous Data Sovereignty in AI Developments."

OHCHR (2025). "Indigenous Sovereignty in the AI Era."


This synthesis was produced as part of the IAIP Polyphonic Discussion research protocol, integrating five disciplinary survey agents into a unified cross-disciplinary analysis. It represents the state of the literature as of April 6, 2026.