- Corpus ID: 263310788
@inproceedings{Wynter2023IdLT,
  title = {"I'd Like to Have an Argument, Please": Argumentative Reasoning in Large Language Models},
  author = {Adrian de Wynter and Tommy Yuan},
  year = {2023},
  url = {https://api.semanticscholar.org/CorpusID:263310788}
}
- Adrian de Wynter, Tommy Yuan
- Published 29 September 2023
- Computer Science, Linguistics
This work evaluates two large language models' (LLMs') ability to perform argumentative reasoning, and finds that, scoring-wise, the LLMs match or surpass the state of the art in argument mining (AM) and argument pair extraction (APE); under certain I/O abstractions, which the authors call symbolic prompting, the LLMs perform well, even beating chain-of-thought.
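The paper's exact prompt format is not shown on this page, but an indices-style I/O abstraction for AM can be sketched roughly as follows: number the input sentences and ask the model to answer with sentence indices rather than by copying text spans. The helper name and task wording below are illustrative assumptions, not the authors' actual prompt.

```python
# Hypothetical sketch of an indices-style ("symbolic") prompt for argument
# mining: sentences are numbered, and the model is asked to answer with
# index triples instead of verbatim spans.

def build_indices_prompt(sentences):
    # Prefix each sentence with its index, e.g. "[0] ...".
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(sentences))
    task = (
        "Identify the argumentative relations between the numbered "
        "sentences. Answer only with index triples, e.g. (0, 2, support)."
    )
    return f"{numbered}\n\n{task}"

prompt = build_indices_prompt([
    "We should adopt renewable energy.",
    "Solar costs have fallen sharply.",
])
print(prompt)
```

Abstractions like this keep the model's output space small and machine-checkable, which is one plausible reason they can outperform free-form generation.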
Figures and Tables from this paper: Figures 1–10, Tables 1–2.
3 Citations
- Shengxin Hong, Liang Xiao, Xin Zhang, Jian-Xing Chen
- 2024
Computer Science, Medicine
ArXiv
This paper presents a multi-agent framework called ArgMed-Agents, which aims to enable LLM-based agents to make explainable clinical decision reasoning through interaction and provides users with decision explanations that increase their confidence.
- 1 citation
- Adrian de Wynter
- 2024
Computer Science
ArXiv
It is found that GPT-4 can play the game to a passable degree: it is able to manipulate doors, combat enemies, and perform pathing, but more complex prompting strategies involving multiple model calls provide better results.
- Federico Castagna, Nadin Kökciyan, I. Sassoon, Simon Parsons, E. Sklar
- 2024
Computer Science
ArXiv
The present survey sifts through the literature to review papers concerning this kind of argumentation-based bot, drawing conclusions about the benefits and drawbacks of this approach in comparison with standard chatbots, while also envisaging possible future development and integration with Transformer-based architectures and state-of-the-art Large Language Models.
35 References
- Abulhair Saparov, He He
- 2023
Computer Science
ICLR
This work presents a new synthetic question-answering dataset called PrOntoQA, where each example is generated from a synthetic world model represented in first-order logic, and shows that LLMs are quite capable of making correct individual deduction steps, and so are generally capable of reasoning, even in fictional contexts.
- 149 citations
- Jianzhu Bao, Jingyi Sun, Qinglin Zhu, Ruifeng Xu
- 2022
Computer Science, Linguistics
ACL
This framework enables these two phases to be jointly trained in a single MRC model, thereby maximizing their mutual benefits and outperforming the state-of-the-art method.
- 9 citations
- J. Opitz, P. Heinisch, Philipp Wiesenbach, P. Cimiano, A. Frank
- 2021
Computer Science
ARGMINING
It is shown that Abstract Meaning Representation (AMR) graphs can be useful for representing arguments, and that novel AMR graph metrics can offer explanations for argument similarity ratings, make argument similarity judgements more interpretable, and may even support argument quality judgements.
- 14 citations
- Albert Webson, Ellie Pavlick
- 2022
Computer Science
NAACL
It is found that models can learn just as fast with many prompts that are intentionally irrelevant or even pathologically misleading as they do with instructively “good” prompts, and instruction-tuned models often produce good predictions with irrelevant and misleading prompts even at zero shots.
- 279 citations
- Ryan Liu, Nihar B. Shah
- 2023
Computer Science, Linguistics
ArXiv
It is thought that LLMs have a promising use as reviewing assistants for specific reviewing tasks, but not for complete evaluations of papers or proposals.
- 28 citations
- Takeshi Kojima, S. Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- 2022
Computer Science
NeurIPS
Experimental results demonstrate that Zero-shot-CoT, using the same single prompt template, significantly outperforms zero-shot LLM performance on diverse benchmark reasoning tasks including arithmetic, symbolic reasoning, and other logical reasoning tasks, without any hand-crafted few-shot examples.
- 2,306 citations
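The Zero-shot-CoT technique described above amounts to appending one fixed trigger phrase to any question; a minimal sketch, where the helper name is ours and the downstream model call is omitted:

```python
# Minimal sketch of Zero-shot-CoT prompting (Kojima et al., 2022): the same
# single trigger phrase is appended to every question to elicit step-by-step
# reasoning, with no hand-crafted few-shot exemplars.

TRIGGER = "Let's think step by step."

def build_zero_shot_cot_prompt(question):
    """Append the fixed reasoning trigger to an arbitrary question."""
    return f"Q: {question}\nA: {TRIGGER}"

prompt = build_zero_shot_cot_prompt(
    "A juggler has 16 balls. Half of them are golf balls. How many golf balls?"
)
print(prompt)
```

The appeal of the method is exactly this task-independence: the same template works across arithmetic, symbolic, and logical benchmarks.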
- Jason Wei, Xuezhi Wang, Denny Zhou
- 2022
Computer Science
NeurIPS
Experiments on three large language models show that chain of thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks.
- 4,312 citations
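Few-shot chain-of-thought prompting, by contrast, prepends worked exemplars whose answers include intermediate reasoning; a minimal sketch, where the exemplar text is illustrative rather than the paper's exact prompt:

```python
# Minimal sketch of few-shot chain-of-thought prompting (Wei et al., 2022):
# each exemplar pairs a question with a worked reasoning chain ending in its
# answer, and the target question is appended last for the model to complete.

EXEMPLARS = [
    (
        "Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
        "How many tennis balls does he have now?",
        "Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
        "5 + 6 = 11. The answer is 11.",
    ),
]

def build_few_shot_cot_prompt(question):
    # Render each exemplar as a completed Q/A pair, then leave the target
    # question's answer open for the model.
    parts = [f"Q: {q}\nA: {a}" for q, a in EXEMPLARS]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

print(build_few_shot_cot_prompt("If I have 3 apples and eat 1, how many remain?"))
```

The worked reasoning in the exemplar is what nudges the model to produce its own chain before the final answer.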
- Adrian de Wynter, Xun Wang, Alex Sokolov, Qilong Gu, Si-Qing Chen
- 2023
Computer Science, Linguistics
Nat. Lang. Process. J.
- 13 citations
- Liying Cheng, Xingxuan Li, Lidong Bing
- 2023
Computer Science
EMNLP
This work regards GPT-4 as a data analyst to perform end-to-end data analysis with databases from a wide range of domains and designs several task-specific evaluation metrics to systematically compare the performance between several professional human data analysts and GPT-4.
- 52 citations
- Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- 2023
Computer Science, Linguistics
WWW
A study of three types of plagiarism among GPT-2-generated texts, in comparison to its training data, together with the plagiarism patterns of fine-tuned LMs with domain-specific corpora extensively used in practice, suggests that the practicality of current LMs in mission-critical writing tasks is questionable.
- 48 citations
...
Figure 7. AM symbolic (indices) prompt with one exemplar and CoT. Refer to Figure 6 for a longer version of the exemplar. This prompt performs step-by-step reasoning on AM by following a templatized generation and…