ChatGPT Loss of Context Affecting Users Study

September 2025 — December 2025

January 2025 — Present

ChatGPT Needs Therapy

An HCI Case Study on Context Loss in Large Language Models by Omar Mnfy


This project investigates a core breakdown in AI-assisted work: when ChatGPT forgets the conversation’s context. Across 12 weeks, I ran pilot studies, think-aloud sessions, contextual inquiries, surveys, and prototype tests to understand why context loss happens, how users cope, and what design features might meaningfully improve long, multi-step interactions.

My full work includes research, low-fi prototyping, user testing, data analysis, and final design recommendations. This portfolio page presents the entire project as a coherent story.

1. Project Overview


Large Language Models excel at single-turn tasks, but students often rely on ChatGPT for multi-step academic workflows that require memory across turns. When ChatGPT drifts or forgets details, users must repair context manually, lose trust, or abandon the task altogether.


My research explores both the experience of context loss and design interventions that could improve reliability:


– Clarifying Questions
– Editable Conversation Memory
– Context Anchors
– A hybrid combined system

2. Research Goals


Across all studies, my goals were to:


  1. Understand how students detect and react to context loss.

  2. Identify emotional, behavioral, and cognitive impacts of drift.

  3. Observe real-time strategies for repairing lost context.

  4. Evaluate user interest in memory-aware features such as visible memory panels, fact-locking, and contextual anchors.

  5. Test multiple prototype concepts to determine what design patterns best support trust, continuity, and usability.

3. Methods Used


This portfolio highlights all research methods I applied and how each contributed unique insights.

• Contextual Inquiry (Think-Aloud Protocol)


Used to observe how users behave as ChatGPT begins drifting. This method let me see where frustration arises, how users rebuild context, and which cues they rely on.

• Pilot Study

To refine the tasks, prompts, and observational structure before running the final sessions.

• Directed Storytelling

Used to collect past experiences of confusion, repair attempts, and coping strategies.

• Survey Research

Designed to measure frequency of context loss, disruption level, coping mechanisms, and interest in proposed features like fact-locking and visible memory.

• Low-Fidelity Paper Prototyping

A physical “paper phone” with interchangeable screens allowed me to test concepts cheaply while capturing honest reactions.

• Usability Testing with SUS and Likert Scales

Used to assess clarity, usability, cognitive load, and trust of the three prototype concepts.
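For context, standard SUS scoring follows a fixed formula: on the canonical 10-item, 1–5 instrument, odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is scaled by 2.5 to a 0–100 range. A minimal sketch (note: the prototype comparisons later on this page report ratings on a 1–7 scale, not this canonical form):

```python
def sus_score(responses):
    """Compute a standard SUS score from 10 ratings, each 1-5.

    Odd items (1st, 3rd, ...) contribute (r - 1); even items
    contribute (5 - r); the total is scaled by 2.5 to 0-100.
    """
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# A fully neutral respondent (all 3s) lands at the midpoint:
print(sus_score([3] * 10))  # 50.0
```

Scores around 68 are conventionally treated as average usability, which makes the scaled 0–100 form easier to interpret than raw Likert sums.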

• Semi-Structured Interviews

Post-task interviews captured emotional reactions and design preferences.

4. Participants

Across all stages, I tested with:

• 7 participants in the contextual study (in-person + virtual)
• 5 participants in the prototype test
• Survey respondents (students who use ChatGPT for academic work)

Participants represented a range of majors (Computer Science, Psychology, Biology, Business, English), ensuring diverse workflows and expectations.

5. Study 1 – Contextual Inquiry

Observing Users When ChatGPT Loses Context

Participants completed three tasks designed to trigger drift:

  1. “Roast me using all previous details”

  2. Identify an album image, then answer follow-up questions

  3. Write a structured three-scene play and check consistency details (ceramic duck, unseen assistant)

Key Observations

• Users attempted to patch context by rewriting bios, adding hint blocks, or resetting threads.
• Participants used creative repair techniques:
 – Creating a “Context Log”
 – Using checklists
 – Forcing the model to echo its assumptions back
• Trust dropped whenever ChatGPT invented details, contradicted itself, or ignored constraints.
• Most participants said a visible memory panel or constraint-locking system would significantly reduce mental load.

6. Study 2 – Survey Research

Measuring Frequency, Disruption, and Desired Features

The survey measured:

• How often ChatGPT loses context
• How disruptive it is
• What repair strategies users employ
• Interest in potential design features
• Preferences for transparency, rule-following, and reasoning visibility

Key Takeaways

• Most students encounter context loss “sometimes” to “often.”
• The most common coping strategy is rephrasing or starting a new chat.
• Students strongly support:
 – Visible memory panel
 – Fact-locking
 – Reasoning or assumption visibility
• Trust increases when ChatGPT follows constraints rigorously and avoids inventing facts.

7. Study 3 – Prototype Design and Testing

Three Concepts to Prevent Context Loss

I prototyped three low-fidelity interface concepts and evaluated them through think-aloud testing and SUS ratings.

Concept A: Clarifying Questions

ChatGPT asks targeted follow-up questions whenever the request is ambiguous.

Concept B: Editable Memory Panel

A small box displays what ChatGPT believes the conversation is about. Users can edit or correct it.

Concept C: Context Anchors

Users can set a persistent “anchor” summarizing the session’s purpose.
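One plausible way to implement this concept (a sketch, not the study prototype) is to pin the user's anchor as a standing message ahead of every request sent to a chat-style LLM API; the message format below follows the common role/content convention:

```python
# Illustrative sketch: a persistent "anchor" survives drift because it
# is re-sent with every request, rather than living only in early turns.

def build_messages(anchor, history, user_turn):
    """anchor: user-set session summary; history: prior turn dicts."""
    messages = [{"role": "system", "content": f"Session anchor: {anchor}"}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_turn})
    return messages

msgs = build_messages(
    anchor="Three-scene play; the ceramic duck must appear in every scene.",
    history=[],
    user_turn="Draft scene 2.",
)
print(msgs[0]["content"])
```

Because the anchor is re-injected on each turn, it cannot be pushed out of a truncated context window the way early conversation details can.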

8. Prototype Findings

Usability Scores (1–7 scale)

Clarifying Questions: 5.8
Editable Memory: 5.8
Context Anchors: 6.0

What Users Said

• Clarifying questions help but can feel repetitive.
• Memory panel provides transparency but can be mentally taxing.
• Anchors feel natural, clean, and low-effort.
• No single solution fits all tasks; preferences vary by discipline.

9. Synthesis and Insights

Across all methods, four major themes emerged.

Insight 1: Users Want Control Over Context

Whether through clarifying questions or a memory panel, users want visibility and agency.

Insight 2: Cognitive Load Must Stay Low

Too many prompts or too much manual editing creates friction, even if it increases accuracy.

Insight 3: Contextual Drift Happens Across Tasks

Creative tasks, academic work, and coding all drift differently, suggesting multiple solutions rather than one.

Insight 4: Transparency Increases Trust

When the system shows assumptions or constraints, users forgive errors more easily.

10. The Hybrid Solution

The Most Promising Design Direction

Participants consistently gravitated toward a combined system:

• A small, unobtrusive context summary (always visible)
• Occasional clarifying questions triggered only by ambiguity
• An optional anchor for long conversations
• “Constraint-locking” or “fact-locking” for high-precision tasks
• A one-click reset to prevent contamination across tasks

This hybrid approach balances transparency, speed, and cognitive effort.
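The pieces above compose naturally in a single request builder. The sketch below is a hypothetical illustration of that composition (the field names and the naive ambiguity trigger are my assumptions, not the tested prototype):

```python
def assemble_prompt(anchor, locked_facts, user_turn):
    """Combine the hybrid elements into one prompt string."""
    parts = []
    if anchor:  # optional anchor for long conversations
        parts.append(f"Anchor: {anchor}")
    if locked_facts:  # fact-locking for high-precision tasks
        parts.append("Locked facts (must not be contradicted):")
        parts.extend(f"- {fact}" for fact in locked_facts)
    parts.append(user_turn)
    return "\n".join(parts)

def needs_clarification(user_turn, ambiguous_words=("it", "that", "this")):
    # Naive trigger: ask a clarifying question only when the turn
    # leans on unresolved references, keeping interruptions rare.
    words = [w.strip(".,!?") for w in user_turn.lower().split()]
    return any(w in words for w in ambiguous_words)
```

The key design choice mirrors the findings: the anchor and locked facts are always present but unobtrusive, while clarifying questions fire only when a heuristic detects ambiguity, keeping cognitive load low.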

11. Final Design Direction

If developed further, I would evolve this into:

• A modular context-management dashboard
• Visual layers showing what the model is using as its memory
• User-editable anchors and assumptions
• Auto-detection of drift with proactive correction prompts
• Modes based on the task type (creative vs precise)
• Error-checking features that cite earlier context before answering

This aligns with both user feedback and the needs uncovered in all studies.

12. Reflection on the Project

This project taught me that complexity reveals itself only when you apply the right methods. A simple complaint (“ChatGPT forgets things”) turned into a multi-layered human-computer interaction issue touching memory, trust, cognitive load, and interface design.

Choosing the Think-Aloud Protocol was crucial. Watching breakdowns happen live revealed behaviors that surveys alone would never show. The prototype phase showed that even lightweight sketches can clarify which ideas resonate.

If I extended this as an independent study, I would build a working mid-fi prototype and test it on longer academic workflows, such as essay writing or code debugging.

13. Acknowledgments

Thank you to all participants from Pitzer College and Pomona College who volunteered their time, shared their frustrations, and pushed this project in meaningful directions.
Gratitude to my classmates, collaborators, and instructors in CS120 Human-Centered Computing for critiques, discussions, and support throughout the project.

" Life's most persistent and urgent question is, 'What are you doing for others?' "

— Martin Luther King Jr.

" Life's most persistent and urgent question is, 'What are you doing for others?' "

— Martin Luther King Jr.

" Life's most persistent and urgent question is, 'What are you doing for others?' "

— Martin Luther King Jr.