
Allow comment in dictionary #3915

Open
AlightSoulmate wants to merge 17 commits into codespell-project:main from AlightSoulmate:allow-comment-in-dictionary

Conversation

@AlightSoulmate

Fix: #3901

Add validation to the build_dict loop in codespell/codespell_lib/_spellchecker.py to prevent the crashes mentioned in #3901:

    for line in f:
        line = line.strip()
        # Skip blank lines, comment lines, and lines without "->"
        if not line or line.startswith("#") or "->" not in line:
            continue
        [key, data] = line.split("->")

Lines starting with # are treated as comments and ignored.
Empty lines and lines missing -> are also ignored, even though they are not supposed to appear in a dictionary file.
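
As a minimal sketch of the behavior described above, the check can be exercised outside codespell. The function name `parse_dictionary` and the returned dict are assumptions made for illustration; they are not the real `build_dict` signature or return value.

```python
import io


def parse_dictionary(lines):
    """Hypothetical stand-in for the build_dict loop; returns {typo: raw fix string}."""
    entries = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, comment lines, and lines without "->"
        if not line or line.startswith("#") or "->" not in line:
            continue
        key, data = line.split("->")
        entries[key] = data
    return entries


sample = io.StringIO(
    "# 1. This is a comment line\n"
    "tis->this\n"
    "buring->burying, burning, burin, during,\n"
    "\n"
    "servcie-service\n"
)
result = parse_dictionary(sample)
print(result)
```

Only the two well-formed entries survive; the comment, the blank line, and the line without `->` are dropped silently.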

Manual verification:
Created example/sample.py and example/dict.txt to test the change:

# example/sample.py
def test():
    print("clas")
    print("buring")

# example/dict.txt
# 1. This is a comment line
tis->this
opem->open
buring->burying, burning, burin, during,
clas->class, disabled due to name clash in c++
# 2. Below is an empty line

# 3. Below is a line without '->'
servcie-service

Run from the project root:

python -m codespell_lib -D example/dict.txt example/sample.py

No ValueError is raised.

@DimitriPapadopoulos
Collaborator

Isn't this a duplicate of #2617?

@AlightSoulmate
Author

> Isn't this a duplicate of #2617?

I overlooked the previous PR, sorry about that. Since the previous one appears to have stalled, I can help complete it by adding the missing tests if you're open to including this feature.

@DimitriPapadopoulos
Collaborator

DimitriPapadopoulos commented Apr 14, 2026

I must admit I am always afraid new enhancements might impair performance. But then we don't measure performance, and reading the dictionaries shouldn't be the bottleneck in most use cases. Let's give it a try.

Note that this change would disallow typos and fixes including # - but I think we have decided that's OK 😄

To start with, instead of testing "->" not in line, I would recommend:

-            [key, data] = line.split("->")
+            try:
+                key, data = line.split("->")
+            except ValueError:
+                continue
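
A minimal sketch of why the try/except form covers more cases than the `"->" not in line` test: unpacking the result of `split("->")` into two names raises ValueError both when the separator is missing and when it appears more than once. The helper name `split_entry` is hypothetical and not part of codespell.

```python
def split_entry(line):
    """Return (key, data), or None if the line is not exactly one 'key->data' pair."""
    try:
        key, data = line.split("->")
    except ValueError:
        # Raised for zero separators (one part) and for two or more (three+ parts)
        return None
    return key, data


print(split_entry("tis->this"))        # ('tis', 'this')
print(split_entry("servcie-service"))  # None
print(split_entry("a->b->c"))          # None
```

Note that a comment line containing `->`, such as `# a->b`, would still unpack successfully, so a separate comment check remains necessary.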

@DimitriPapadopoulos
Collaborator

Then you could also have a look at #2063/#2068, which are about comments in --ignore-words files.

Collaborator

@DimitriPapadopoulos DimitriPapadopoulos left a comment


Looks good, but shouldn't documentation be updated?

@AlightSoulmate
Author

AlightSoulmate commented Apr 15, 2026

The main goal is to allow comments in dictionary files, including inline comments.

Allowed comments
Pure comments:

#comment
# comment
 # comment
###### comment

Inline comments: the first hash must be preceded by whitespace.

abondon->abandon #comment
abondon->abandon # comment
abondon->abandon ###### comment

Illegal comments

abondon->abandon#comment
abondon->abandon# comment

I prefer to require whitespace before the hash; otherwise a hash that is part of an entry would be interpreted as the start of a comment, as in:

thenumberone->the#one

Is this rule acceptable? As in #2068, I would also recommend not supporting typos or corrections that contain hashes. However, we don't know whether users' custom dictionary files include hashes, so I added some code to handle that case.

Code snippets

for line in f:
    left, pound, _ = line.partition("#")
    # The first hash starts a comment only if it is preceded by whitespace
    # (or begins the line). Otherwise it would be part of `err` or `req`,
    # which is not supported, so the whole line is skipped.
    if pound and left and left[-1] not in (" ", "\t"):
        continue
    line = left.strip()
    # Skip empty lines and pure comment lines
    if not line:
        continue
    try:
        key, data = line.split("->")
    except ValueError:
        # Skip lines that are not exactly one "key->data" pair
        continue

These lines will be skipped:

aban#don->abandon
abandon->aban#don
abondon->abandon#comment

Blank lines are also skipped.
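
To check the rule against the examples listed in this comment, here is a self-contained sketch. The function name `parse_line` and its return convention (a tuple for accepted entries, None for skipped lines) are assumptions for illustration only; the real loop lives inside `build_dict`.

```python
def parse_line(line):
    """Apply the proposed comment rule to one dictionary line.

    Returns (key, data) for an accepted entry, or None if the line is skipped.
    """
    left, pound, _ = line.partition("#")
    # A hash glued to preceding text is not a comment marker: skip the line
    if pound and left and left[-1] not in (" ", "\t"):
        return None
    entry = left.strip()
    # Empty lines and pure comment lines leave nothing to parse
    if not entry:
        return None
    try:
        key, data = entry.split("->")
    except ValueError:
        return None
    return key, data


examples = [
    "abondon->abandon # comment",
    "abondon->abandon ###### comment",
    " # comment",
    "abondon->abandon#comment",
    "thenumberone->the#one",
]
for text in examples:
    print(repr(text), "=>", parse_line(text))
```

The first two lines yield `('abondon', 'abandon')`; the remaining three yield None.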

AlightSoulmate and others added 3 commits April 15, 2026 23:25
Co-authored-by: Dimitri Papadopoulos Orfanos <3234522+DimitriPapadopoulos@users.noreply.github.com>

@AlightSoulmate
Author

One problem is that, without a warning, users are unlikely to notice that lines with malformed inline comments are skipped, so they will never correct them.
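
One possible way to surface such lines, sketched under assumptions: collect the line numbers of entries rejected because a hash is not preceded by whitespace, then report them on stderr. The helper `find_skipped_lines` and the message format are hypothetical; codespell may prefer a different reporting mechanism.

```python
import sys


def find_skipped_lines(lines):
    """Return (line_number, text) for lines rejected because of a glued hash."""
    skipped = []
    for number, line in enumerate(lines, start=1):
        left, pound, _ = line.partition("#")
        # Same condition as the skip rule: hash present, preceded by non-whitespace
        if pound and left and left[-1] not in (" ", "\t"):
            skipped.append((number, line.rstrip("\n")))
    return skipped


dictionary_lines = [
    "tis->this\n",
    "abondon->abandon#comment\n",
    "opem->open # fine\n",
]
for number, text in find_skipped_lines(dictionary_lines):
    print(f"WARNING: dictionary line {number} skipped: {text!r}", file=sys.stderr)
```

Reporting the line number lets users find and fix the offending entry instead of silently losing it.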


Development

Successfully merging this pull request may close these issues.

Allow comments and empty lines in dictionary

2 participants