52% are incorrect or 52% contain inaccuracies?
What does 'incorrect' even mean when referring to an LLM output?
Excerpt from a study on programming expertise showing participants self-reported as mostly proficient programmers but only competent in ChatGPT usage, revealing a skill gap between traditional coding and AI tool proficiency.
Brief post positioning heteromation concept alongside the automation versus augmentation debate, referencing Ekbia's work.
To better understand and address tool use, we need to understand not 'accuracy' but interaction.
Analysis of Metzler et al.'s 'Rethinking Search' paper and its framing of LLMs as dilettantes, reframing the concept to consider how human dilettantes might interact with or as domain experts in search practice.
ChatGPT-4 now seems able to adequately address this specific stacking challenge?
Thinking about what if general-purpose web search were more like searching for directions...
A tweet from @zacharylipton: Prompts are not Lipschitz. There are no “small” changes to prompts...
Thinking about comparing and contrasting 'prompt injection' and the 'Google This Ploy' (Caulfield, 2019).
Thinking about hallucination with Klosterman, Leahu, Munk et al., Rettberg, and Powles & Nissenbaum.
Here is the 'OWASP Top 10 for Large Language Model Applications'. Overreliance is relevant to my research. (I’ve generally used the term “automation bias”, though perhaps a more direct term like overr...
Logan Kilpatrick just announced search in OpenAI's developer docs so I glanced at it and saw the hint to jump to the search bar (their DocSearch-Button-Keys): ⌘ & K. I appreciate such micro interactio...
Collection of tweets describing academic search workflow using Twitter, Semantic Scholar, local whoosh index, and various citation tools while exploring alternatives to Google Scholar.
This is an incomplete listing of a few academic search tools. See also: - my paper w/ Jake Goldenfein (Google Scholar – Platforming the scholarly economy (goldenfein2022platforming)) - my scholar prof...
This is partially about prompt engineering and partially about what a good essay or search does. More than answer a question, perhaps? (this is engaged with in the essay, though not to my liking). Gri...