ChatGPT-like sessions with highlighting and navigation, focused on simplicity and readability.
Parley is a streamlined LLM chat plugin for Neovim, focused exclusively on providing a clean and efficient interface for conversations with AI assistants. Imagine having the full transcript of a chat session with ChatGPT (or Anthropic, or Gemini) in which you can edit all the questions, and even the answers themselves. I created this as a way to construct research reports and improve my understanding of new topics. It's a researcher's notebook.
- Streamlined Chat Experience
  - Markdown-formatted chat transcripts with syntax highlighting
  - Question/response highlighting with custom colors
  - Navigate chat Q&A exchanges and code blocks using a Telescope outline
  - Easy keybindings for creating and managing chats
- Streaming responses
  - No spinner wheel while waiting for the full answer
  - Response generation can be canceled halfway through
  - Properly working undo (a response can be undone with a single `u`)
- Minimal dependencies (`neovim`, `curl`, `grep`)
  - Zero dependencies on other lua plugins, to minimize the chance of breakage
- ChatGPT-like sessions
  - Just good old Neovim buffers formatted as markdown, with autosave
  - Chat finder - a management pop-up for searching, previewing, deleting and opening chat sessions
- A live document
  - Refresh the answer to any question
  - Insert questions in the middle of the transcript and expand them with the assistant's answers
  - You have the full power of Neovim behind you
  - Reference other local files, for example to get critiques of a file and ask questions about it, essentially adding context
Each chat transcript is really just a markdown file with some additional conventions. Think of them as markdown files with benefits (of Parley).
- There is a header section that contains metadata and can override configuration parameters:
  - Standard metadata like `file: filename.md` (required)
  - Model information like `model: {"model":"gpt-4o","temperature":1.1,"top_p":1}`
  - Provider information like `provider: openai`
  - Configuration overrides like `max_full_exchanges: 20` to customize behavior for this specific chat (controls how many full exchanges to keep before summarizing)
  - Raw mode settings like `raw_mode.show_raw_response: true` to display raw JSON responses
- The user's questions and the assistant's answers take turns.
- A question is a line starting with 💬:, plus all following lines until the next answer.
- An answer is a line starting with 🤖:, plus all following lines until the next question.
- Answers may contain two special lines. These hold state maintained by Parley and are not designed for human consumption; they are grayed out by default. (See the example transcript after this list.)
  - The first is the assistant's reasoning output, prefixed with 🧠:.
  - The second is a summary of the exchange, prefixed with 📝:, in the format "you asked ..., I answered ...".
  - Both lines are kept in the transcript itself for simplicity, so that each transcript file is hermetic.
- Smart memory management:
  - By default, Parley keeps only a certain number of recent exchanges (controlled by `max_full_exchanges`) and summarizes older ones to maintain context within token limits.
  - Exchanges (both question and answer) that include file references (@@filename) are always preserved in full, regardless of their age, ensuring file content context is maintained throughout the conversation.
- File and directory inclusion: a line that starts with @@ followed by a path will automatically load content into the prompt when sending to the LLM. This works in several ways:
  - `@@/path/to/file.txt` - include a single file
  - `@@/path/to/directory/` - include all files in a directory (non-recursive)
  - `@@/path/to/directory/*.lua` - include all matching files in a directory (non-recursive)
  - `@@/path/to/directory/**/` - include all files in a directory and its subdirectories (recursive)
  - `@@/path/to/directory/**/*.lua` - include all matching files in a directory and its subdirectories (recursive)

  All included files are displayed with line numbers for easier reference and navigation, which makes it simpler for the LLM to reference specific code locations in its responses. You can open a referenced file or directory directly by placing the cursor on the line with the @@ syntax and pressing `<C-g>o`; for directories or glob patterns, this opens the file explorer. Use this feature when you want the LLM to help you understand, debug, or improve existing code.
- Markdown code blocks: Parley provides utilities for working with code blocks in markdown:
  - Copy code block to clipboard: place your cursor inside a markdown code block and press `<leader>gy`
  - Save code block to file: position your cursor inside a code block and press `<leader>gs` (if the code block has a `file="filename"` attribute, it will use that name; otherwise, it will prompt for a filename)
  - Execute code block in terminal: with your cursor inside a code block, press `<leader>gx` to run the code in a split terminal window
  - Copy terminal output: after executing a code block with `<leader>gx`, press `<leader>gc` while in the terminal buffer to copy the entire terminal output to the clipboard
  - Copy terminal output from chat: press `<leader>ge` from your chat buffer to copy the output from the last terminal session (useful when the terminal is no longer visible or you've returned to the chat)
  - Compare code block versions: press `<leader>gd` on a code block with a filename attribute (`file="filename"`) to compare it with a previous version of the same file in the chat (press `q` to close the diff view)
  - Repeat last command: press `<leader>g!` to repeat the last set of commands you ran with `<leader>gx` in a new terminal window
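To make these conventions concrete, here is a minimal sketch of a transcript. Everything in it is hypothetical (filename, question, answer, referenced path), and the exact placement of the 🧠: line within an answer may vary; the header fields and prefixes are as described above:

````markdown
- file: fibonacci.md
- provider: openai

💬: Write a Lua function that returns the n-th Fibonacci number.
@@/path/to/notes.txt

🤖:
🧠: The user wants a small, self-contained Lua function.
Here is a simple iterative version:

```lua file="fib.lua"
local function fib(n)
  local a, b = 0, 1
  for _ = 1, n do
    a, b = b, a + b
  end
  return a
end
```

📝: you asked for a Lua Fibonacci function, I answered with an iterative implementation.
````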
With this, any question you ask carries the context of all questions and answers that come before it. When the chat gets too long and chat_memory is enabled, earlier exchanges in the transcript are represented by the concatenation of their summary lines (📝:).

Place the cursor in a question area and press `<C-g>g` to ask the assistant about it. If the question is at the end of the document, it is a new question. Otherwise, a previously asked question is asked again and the previous answer is replaced by the new one. You might want to do this, for example, when you tweak a question as you learn, or when you have updated a referenced file (with the `@@` syntax).
If you see a message saying "Another Parley process is already running", you can either:

- Use `<C-g>s` to stop the current process and then try again
- Add a `!` at the end of the command (`:ParleyChatRespond!`) to force a new response even if a process is running
For more extensive revisions, you can place the cursor on a question and use `<C-g>G` to resubmit all questions from the beginning of the chat up to and including the current question. Each question will be processed in sequence, with responses replacing the existing answers at their correct positions. This is particularly useful when you've edited multiple previous questions and/or referenced files and want to update all previously asked questions.

During the resubmission, a visual indicator will highlight each question as it's being processed, and notifications will display progress. You can stop the resubmission at any time with the stop shortcut (`<C-g>s`). When complete, the cursor will return to your original position.

Because you can update previous questions and even the assistant's answers, the answers to later questions will come out differently, subtly influenced by all those edits. After all, we are dealing with a large-scale statistical machine here.
The 🧠: and 📝: lines are produced via the system prompt. This seems to work fine, but there's no guarantee. If the assistant omits those lines, you can update the question to reinforce it: "remember to reply with 🧠: lines for your reasoning, and 📝: for your summary", or something like that.
The transcript is really just a text document. As long as the 💬:, 🤖:, 🧠:, 📝: pattern is maintained, things will work, and you are free to edit any text in the transcript. For example, you can add headings (`#` and `##`) to group your questions into sections, which show up in the table of contents with `<C-g>t`.

You are also free to bold text as a marker so you can remember things more easily. The whole thing is in markdown format, so you can use `backticks`, [links], or **bold**, each with a different visual effect. I may add a customized highlighter, just to make certain text jump out.
Snippets for your preferred package manager:

```lua
-- lazy.nvim
{
    "xianxu/parley.nvim",
    config = function()
        local conf = {
            -- For customization, refer to Install > Configuration in the Documentation/Readme.
            -- Typically you should override api_keys, e.g. if you are using the Mac Keychain to store API keys.
            -- Use the following to add API keys to the Mac Keychain:
            -- security add-generic-password -a "your_username" -s "OPENAI_API_KEY" -w "your_api_key" -U
            api_keys = {
                openai = { "security", "find-generic-password", "-a", "your_username", "-s", "OPENAI_API_KEY", "-w" },
                anthropic = { "security", "find-generic-password", "-a", "your_username", "-s", "ANTHROPIC_API_KEY", "-w" },
                googleai = { "security", "find-generic-password", "-a", "your_username", "-s", "GOOGLEAI_API_KEY", "-w" },
                ollama = "dummy_secret",
            },
        }
        require("parley").setup(conf)
        -- Setup shortcuts here (see Usage > Shortcuts in the Documentation/Readme)
    end,
}
```
```lua
-- packer.nvim
use({
    "xianxu/parley.nvim",
    config = function()
        local conf = {
            -- For customization, refer to Install > Configuration in the Documentation/Readme.
            -- Typically you should override api_keys, e.g. if you are using the Mac Keychain to store API keys.
            -- Use the following to add API keys to the Mac Keychain:
            -- security add-generic-password -a "your_username" -s "OPENAI_API_KEY" -w "your_api_key" -U
            api_keys = {
                openai = { "security", "find-generic-password", "-a", "your_username", "-s", "OPENAI_API_KEY", "-w" },
                anthropic = { "security", "find-generic-password", "-a", "your_username", "-s", "ANTHROPIC_API_KEY", "-w" },
                googleai = { "security", "find-generic-password", "-a", "your_username", "-s", "GOOGLEAI_API_KEY", "-w" },
                ollama = "dummy_secret",
            },
        }
        require("parley").setup(conf)
        -- Setup shortcuts here (see Usage > Shortcuts in the Documentation/Readme)
    end,
})
```
```vim
" vim-plug
Plug 'xianxu/parley.nvim'
```

Then, in your Lua configuration:

```lua
local conf = {
    -- For customization, refer to Install > Configuration in the Documentation/Readme.
    -- Typically you should override api_keys, e.g. if you are using the Mac Keychain to store API keys.
    -- Use the following to add API keys to the Mac Keychain:
    -- security add-generic-password -a "your_username" -s "OPENAI_API_KEY" -w "your_api_key" -U
    api_keys = {
        openai = { "security", "find-generic-password", "-a", "your_username", "-s", "OPENAI_API_KEY", "-w" },
        anthropic = { "security", "find-generic-password", "-a", "your_username", "-s", "ANTHROPIC_API_KEY", "-w" },
        googleai = { "security", "find-generic-password", "-a", "your_username", "-s", "GOOGLEAI_API_KEY", "-w" },
        ollama = "dummy_secret",
    },
}
require("parley").setup(conf)
-- Setup shortcuts here (see Usage > Shortcuts in the Documentation/Readme)
```
Make sure you have an OpenAI API key. Get one here and use it in the configuration step above. Also consider setting up usage limits so you won't get surprised at the end of the month.
The OpenAI API key can be passed to the plugin in multiple ways:

| Method | Example | Security Level |
| --- | --- | --- |
| hardcoded string | `openai_api_key: "sk-...",` | Low |
| default env var | set the `OPENAI_API_KEY` environment variable in your shell config | Medium |
| custom env var | `openai_api_key = os.getenv("CUSTOM_ENV_NAME"),` | Medium |
| read from file | `openai_api_key = { "cat", "path_to_api_key" },` | Medium-High |
| password manager | `openai_api_key = { "bw", "get", "password", "OAI_API_KEY" },` | High |
If `openai_api_key` is a table, Parley runs it asynchronously to avoid blocking Neovim (password managers can take a second or two).
The following LLM providers are currently supported besides OpenAI:

- Ollama, for local/offline open-source models. The plugin assumes you have the Ollama service up and running with configured models available (the default Ollama agent uses Llama3).
- Anthropic, to access the Claude models, which currently outperform GPT-4 in some benchmarks.
- Google Gemini, with a quite generous free tier but some geo-restrictions (EU).
- Any other "OpenAI chat/completions"-compatible endpoint (Azure, LM Studio, etc.)

Below is an example of the relevant configuration section enabling some of these. The `secret` field has the same capabilities as `openai_api_key` (which is still supported for compatibility).
```lua
providers = {
    openai = {
        endpoint = "https://api.openai.com/v1/chat/completions",
        secret = os.getenv("OPENAI_API_KEY"),
    },
    googleai = {
        endpoint = "https://generativelanguage.googleapis.com/v1beta/models/{{model}}:streamGenerateContent?key={{secret}}",
        secret = os.getenv("GOOGLEAI_API_KEY"),
    },
    anthropic = {
        endpoint = "https://api.anthropic.com/v1/messages",
        secret = os.getenv("ANTHROPIC_API_KEY"),
    },
},
```
Each of these providers has some agents preconfigured. Below is an example of how to disable the predefined ChatGPT3-5 agent and create a custom one. If the `provider` field is missing, OpenAI is assumed for backward compatibility.
```lua
agents = {
    {
        name = "ChatGPT3-5",
        disable = true,
    },
    {
        name = "MyCustomAgent",
        provider = "copilot",
        chat = true,
        command = true,
        model = { model = "gpt-4-turbo" },
        system_prompt = "Answer any query with just: Sure thing..",
    },
},
```
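You can then switch to the new agent with `:ParleyAgent MyCustomAgent` (see the commands below).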
The core plugin only needs `curl` installed to make calls to the OpenAI API, and `grep` for the chat finder, so Linux, BSD and macOS should be covered.

Expose the `OPENAI_API_KEY` environment variable and it should work. Otherwise, copy `lua/parley/config.lua` to `~/.config/nvim/lua/parley/` and update it.

All commands can be configured in `config.lua`.
- Open a fresh chat in the current window: `<C-g>c`
- Open a dialog to search through chats: `<C-g>f`. By default, this only shows chat files from the last 3 months (configurable in `chat_finder_recency.months`). While in the dialog:
  - Press the configured toggle key (default: `<C-a>`) to switch between showing recent chats and all chats
  - The dialog title displays the current filtering state (Recent or All)
  - Chat files are sorted by modification date, newest first
  - Each entry shows filename, topic, and date
- Request a new GPT response for the current chat: `<C-g>g`
- Delete the current chat: `<C-g>d`. By default this requires confirmation, which can be disabled in the config with `chat_confirm_delete = false,`.
- `:ParleyAgent` opens a Telescope picker for selecting an agent. If Telescope is not available, it shows the current agent information. You can also pass an agent name as an argument to switch directly: `:ParleyAgent ChatGPT4o`.
- Open a Telescope picker for selecting an agent (if Telescope is available): `<C-g>a`. If not, it cycles between the available agents based on the current buffer (chat agents if the current buffer is a chat, command agents otherwise). The agent setting is persisted on disk across Neovim instances.
- Stop all currently running responses and jobs: `<C-g>s`
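If you prefer to define the shortcuts yourself, here is a minimal sketch. Only `:ParleyChatRespond` and `:ParleyAgent` are command names confirmed by this README; check `config.lua` for the full list:

```lua
-- A minimal sketch of manual shortcut setup (see Usage > Shortcuts).
-- Only :ParleyChatRespond and :ParleyAgent appear verbatim in this README.
vim.keymap.set("n", "<C-g>g", "<cmd>ParleyChatRespond<cr>", { desc = "Parley: respond to question" })
vim.keymap.set("n", "<C-g>a", "<cmd>ParleyAgent<cr>", { desc = "Parley: pick agent" })
```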
The chat finder feature includes intelligent filtering based on file recency, making it easier to find relevant chats:
```lua
-- Chat finder recency filtering configuration
chat_finder_recency = {
    -- Enable recency filtering by default
    filter_by_default = true,
    -- Default recency period in months
    months = 3,
    -- Use file modification time (true) or creation time (false)
    use_mtime = true,
},

-- Keybinding for toggling between recent and all chats in the finder
chat_finder_mappings = {
    delete = { modes = { "n", "i", "v", "x" }, shortcut = "<C-d>" },
    toggle_all = { modes = { "n", "i", "v", "x" }, shortcut = "<C-a>" },
},
```
- By default, the chat finder only shows files from the last 3 months (configurable)
- Files are sorted newest first to quickly find recent conversations
- A toggle key (default: `<C-a>`) lets you switch between recent files and all files
- The dialog title updates to show whether you're viewing "Recent" or "All" chats
- Each entry displays the file's last modification date for easy reference
This feature helps manage growing collections of chat files and quickly locate relevant conversations without overwhelming the finder with old, rarely used transcripts.
Parley includes a "raw mode" feature for debugging and advanced use cases that makes it easier to directly interact with the LLM provider APIs:
Raw mode can be enabled in your setup or in individual chat headers:
```lua
-- In your config
raw_mode = {
    enable = true, -- Master toggle for raw mode features
    show_raw_response = true, -- Show raw JSON API responses
    parse_raw_request = true, -- Parse user JSON input directly as API requests
},
```

or in a chat file header:

```
- file: mychat.md
- raw_mode.show_raw_response: true
- raw_mode.parse_raw_request: true
```
- Raw Response Mode (`show_raw_response: true`):
  - When enabled, the API's raw JSON response is displayed as a code block
  - This reveals the complete model output, including usage statistics and metadata
  - Useful for debugging and understanding the provider's response format
- Raw Request Mode (`parse_raw_request: true`):
  - Allows you to craft custom JSON requests to send directly to the API
  - Format your request as a JSON code block in your question:

    💬:
    ```json
    {
      "model": "gpt-4o",
      "messages": [
        {"role": "system", "content": "You are a JSON validator."},
        {"role": "user", "content": "Explain the structure of a valid OpenAI request."}
      ],
      "temperature": 0.7
    }
    ```

  - The plugin will extract and use this JSON as the direct payload for the API
  - This overrides normal message processing and allows full control of the request parameters
This feature is particularly useful for:
- Testing and debugging API interactions
- Exploring advanced model capabilities
- Learning API formatting requirements
- Experimenting with different request structures
- Seeing complete token usage statistics
The plugin supports automatic summarization of longer chat histories to maintain context while reducing token usage. This feature is particularly useful for long conversations where earlier parts can be summarized instead of sending the full transcript to the API.
- When chat messages exceed a configured threshold, older exchanges are replaced with a summary
- Summaries are extracted from assistant responses with a specific prefix (default: "📝:")
- This allows the LLM to maintain context of the conversation without the full token cost
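Conceptually, the request sent to the API then looks something like the sketch below. This is illustrative only, not Parley's actual internals; the real message assembly happens inside the plugin:

```lua
-- Illustrative sketch of how a long history might be condensed
-- before being sent to the API (hypothetical shape).
local messages = {
    { role = "system", content = "..." },
    -- Older exchanges collapse into their extracted 📝: summary lines,
    -- paired with the configured omit_user_text placeholder:
    { role = "user", content = "Summarize previous chat" },
    { role = "assistant", content = "📝: you asked about X, I answered Y" },
    -- The most recent max_full_exchanges exchanges are kept verbatim:
    { role = "user", content = "the latest full question" },
}
```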
The chat memory feature can be configured in your setup:
```lua
chat_memory = {
    -- enable summary feature for older messages
    enable = true,
    -- maximum number of full exchanges to keep (a user and assistant pair)
    max_full_exchanges = 3,
    -- prefix for summary lines in assistant responses (used to extract summaries)
    summary_prefix = "📝:",
    -- prefix for reasoning lines in assistant responses
    reasoning_prefix = "🧠:",
    -- text to replace omitted user messages
    omit_user_text = "Summarize previous chat",
},
```
To take advantage of this feature, instruct your LLM in the system prompt to include summaries of the conversation, for example with the lines below (they are already part of the default system prompt; check defaults.lua for details). If the LLM stops following them after a long session, you can add them to your question to refresh its memory.

```
When thinking through complex problems, prefix your reasoning with 🧠: for clarity.
After answering my question, please include a brief summary of our exchange prefixed with 📝:
```

When the chat grows beyond the configured limit, the plugin will automatically replace older messages with the extracted summaries.
Parley is designed to work well with all color schemes while providing clear visual distinction between different elements.
By default, Parley links its highlight groups to common built-in Neovim highlight groups:
- Questions (user messages): linked to `Keyword` - stands out in most themes
- File references (@@filename): linked to `WarningMsg` - clearly visible in all themes
- Thinking/reasoning lines (🧠:): linked to `Comment` - appropriately dimmed in most themes
- Annotations (@...@): linked to `DiffAdd` - typically has a subtle background color
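Linking works like ordinary Neovim highlight links. A rough sketch of the equivalent is shown below; the actual group names used by Parley are not spelled out in this README, so the one here is hypothetical:

```lua
-- Hypothetical group name, for illustration: link questions to Keyword
-- so they inherit the colorscheme's keyword styling.
vim.api.nvim_set_hl(0, "ParleyQuestion", { link = "Keyword" })
```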
You can customize these highlight groups by adding a `highlight` section to your configuration:
```lua
highlight = {
    -- Override with your own highlight settings
    question = { fg = "#ffaf00", italic = true }, -- orange text for questions
    file_reference = { fg = "#ffffff", bg = "#5c2020" }, -- white text on red for file refs
    thinking = { fg = "#777777" }, -- gray text for reasoning lines
    annotation = { bg = "#205c2c", fg = "#ffffff" }, -- white text on green background
},
```
Each field is optional - set only the ones you want to customize and leave the others as `nil`.
Parley includes built-in integration with lualine, allowing you to display the current agent in your statusline when working with chat buffers.
The lualine integration can be configured in your setup:
```lua
lualine = {
    -- enable lualine integration
    enable = true,
    -- which section to add the component to
    section = "lualine_x",
},
```
- When working in a chat buffer, the lualine component will show the current agent
- The component will only appear when you're in a chat buffer
- The integration automatically registers itself with lualine if enabled
To set this up, no additional configuration is needed beyond enabling it in your Parley config.
You can also manually add the Parley component to your lualine configuration if you need more control:
```lua
-- In your lualine setup
require('lualine').setup {
    sections = {
        lualine_x = {
            -- Other components...
            require('parley.lualine').create_component(),
        },
    },
}
```
This is useful if you want to position the component precisely within your statusline configuration.
This plugin was adapted from gp.nvim. I decided to fork it because I wanted a simple transcript tool for talking to LLM providers.