Cspell: sanitize suggested words for dictionary
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3439240. -->
Reported by: [grimreaper](https://www.drupal.org/user/2388214)
Related to !303
>>>
<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p>Currently, the CSpell job generates an artifact containing a text file that lists the words to add to a project's dictionary file, <code>.cspell-project-words.txt</code>.</p>
<p>However, the word list in this file is taken directly from the CSpell output, so it is:<br>
- unsorted<br>
- cased as found in the code: capitalized, lowercase, uppercase, or anything in between<br>
- duplicated</p>
<h3 id="summary-proposed-resolution">Proposed resolution</h3>
<p>Add a post-processing step that sanitizes this output.</p>
<p>In my project skeleton, I have a small script that does this sanitization: it removes duplicates, lowercases the words, and sorts them.</p>
<p><a href="https://gitlab.com/florenttorregrosa-drupal/docker-drupal-project/-/blob/10.x/scripts/quality/spellcheck/clean-dictionaries.sh?ref_type=heads">https://gitlab.com/florenttorregrosa-drupal/docker-drupal-project/-/blob/10.x/scripts/quality/spellcheck/clean-dictionaries.sh?ref_type=heads</a></p>
<p><code>cat ${DICTIONARY_FILE_PATH} | tr '[:upper:]' '[:lower:]' | LC_ALL=C sort -u -o ${DICTIONARY_FILE_PATH}</code></p>
<p>Ideally, if a project already has a <code>.cspell-project-words.txt</code> file, the job would also provide a file that merges the existing words with the new ones.</p>
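<p>A minimal sketch of how such a merge might look, assuming both word lists are plain text files with one word per line. The function name <code>merge_dictionary</code> and its three arguments are hypothetical and not part of the existing job. It uses <code>awk</code> instead of <code>tr</code> for lowercasing so that a dictionary without a trailing newline does not get its last word glued to the first new word.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of the current job):
#   merge_dictionary EXISTING NEW OUTPUT
# EXISTING: the project's current dictionary (may not exist yet)
# NEW:      the word list reported by the CSpell job
# OUTPUT:   where the merged, sanitized list is written
merge_dictionary() {
  md_existing="$1"
  md_new="$2"
  md_output="$3"

  # Treat a missing project dictionary as an empty one.
  [ -f "$md_existing" ] || md_existing=/dev/null

  # Lowercase every non-empty line from both files, then sort in
  # byte order and drop duplicates, as in the one-liner above.
  awk 'NF { print tolower($0) }' "$md_existing" "$md_new" \
    | LC_ALL=C sort -u -o "$md_output"
}
```

<p>It could then be called as <code>merge_dictionary .cspell-project-words.txt new-words.txt merged-words.txt</code>, where <code>new-words.txt</code> and <code>merged-words.txt</code> are placeholder file names.</p>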
<h3 id="summary-remaining-tasks">Remaining tasks</h3>
<p>- Discuss whether maintainers want this addition: YES<br>
- Provide an MR<br>
- <strong>Goals</strong>:<br>
-- <del>sorted, </del>lowercase<del>, unique</del> list of words provided as an artifact.<br>
-- Merge the newly reported words with any existing word list (provided via a file and/or a variable?)</p>
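<p>For the "variable" option, one possible shape is sketched below. It assumes the existing words arrive as a single string separated by commas or whitespace. Both function names and the input format are assumptions made for illustration; nothing here is decided yet.</p>

```shell
#!/bin/sh
# Hypothetical: print the words held in a string, one lowercase word
# per line. Words may be separated by commas, spaces, or tabs.
words_from_variable() {
  printf '%s\n' "$1" \
    | tr ', \t' '\n\n\n' \
    | awk 'NF { print tolower($0) }'
}

# Hypothetical: merge_with_variable NEW_FILE WORDS OUTPUT
# Combines the words from the string WORDS with the newly reported
# words in NEW_FILE, then sorts in byte order and removes duplicates.
merge_with_variable() {
  {
    words_from_variable "$2"
    awk 'NF { print tolower($0) }' "$1"
  } | LC_ALL=C sort -u -o "$3"
}
```

<p>A call might look like <code>merge_with_variable new-words.txt "$EXISTING_WORDS" merged-words.txt</code>, where <code>EXISTING_WORDS</code> and both file names are placeholders.</p>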